
Tempus is the time-based algorithm in Moogsoft AIOps which clusters alerts into Situations based on the similarity of their timestamps. 

The underlying premise of Tempus is that when things go wrong, they go wrong together. For example, if a core element of your network infrastructure, such as a switch, fails and becomes disconnected, many other interconnected elements are affected and send events at a similar time.

Tempus uses the Jaccard index to calculate how similar the arrival patterns of different Alerts are. It then uses community detection methods to identify which Alerts with similar arrival patterns should be clustered into Situations.
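
The Jaccard index of two sets is the size of their intersection divided by the size of their union. The following is a minimal sketch of that calculation, assuming each Alert's arrival pattern is represented as the set of time-bucket indices in which it received events. The alert names and bucket values are hypothetical, and this is not the Tempus implementation itself.

```python
def jaccard(a: set, b: set) -> float:
    """Jaccard index |A & B| / |A | B|, defined here as 0.0 when both sets are empty."""
    union = a | b
    if not union:
        return 0.0
    return len(a & b) / len(union)


# Hypothetical arrival patterns: bucket indices in which each alert saw events.
switch_down = {3, 4, 10, 11}
link_down = {3, 4, 10, 12}
disk_full = {40, 95}

print(jaccard(switch_down, link_down))  # 3 shared buckets out of 5 -> 0.6
print(jaccard(switch_down, disk_full))  # no shared buckets -> 0.0
```

A value of 1.0 means the two alerts always arrived in the same buckets, and 0.0 means they never did.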

Because it is time-based, Tempus should not be used to detect events relating to the slow or gradual degradation of a service, such as disks filling up or CPU usage increasing.

One advantage of Tempus is that it uses only event timestamps for clustering, so no alert enrichment is required.

Time-based Clustering

AIOps applies Tempus incrementally to Alerts as it ingests them so that it can create Situations in real time.

The diagrams below show how Tempus sorts and then groups Alerts with similar timestamps into Situations:

Raw Alerts from either the AlertBuilder or Alert Rules Engine arrive over a period of time. These are shown as gray dots in the diagram below:

Tempus identifies and sorts which Alerts have similar arrival patterns:

 Alerts with similar arrival patterns are clustered into Situations:

Tempus Configuration

Tempus is configured and tuned using parameters in moog_farmd.conf. The Moolet parameters configure general information about each Sigaliser. The Output parameters control which Moolet's output Tempus processes and which Alerts are included. The Trigger and Sigalising parameters control the Sigaliser execution and duration.

Moolet Parameters

The parameters that relate to the Tempus Moolet are as follows:

run_on_startup

Determines whether Tempus runs when AIOps starts. Running Tempus at startup ensures that it captures every Alert.

Type: Boolean
Default: false

persist_state

Enables Tempus to save its state in High Availability systems so that, if a failover occurs, the second moogfarmd can continue from the same point.

Type: Boolean
Default: false

metric_path_moolet

Determines whether Tempus is factored into the Event Processing metric for Self Monitoring or not.

Type: Boolean
Default: true

description

Describes the Situation produced by the Sigaliser.

Type: String
Default: A Tempus (a.k.a. Sigaliser V2) Situation

The default Tempus moolet parameters are as follows:

    name              : "Tempus",
    classname         : "com.moogsoft.farmd.moolet.tempus.CTempus",
    run_on_startup    : false,
    persist_state     : false,
    metric_path_moolet   : true,
    #process_output_of : "AlertRulesEngine",
    process_output_of : "AlertBuilder",
    description       : "A Tempus (a.k.a. Sigaliser V2) Situation",

name and classname are hardcoded and should not be changed.

Output Parameters

These parameters control the output processed by the Sigaliser:

process_output_of

Instructs Tempus to process the output of either the Alert Builder or the Alert Rules Engine. By default, the Sigaliser connects directly to the Alert Builder. Use the Alert Rules Engine only if you want automations to run before Situation resolution. The Sigaliser can have only one input.

Type: List
One of: AlertBuilder, AlertRulesEngine
Default: AlertBuilder

entropy_threshold

Sets the minimum entropy an Alert must have to be included in the Sigaliser calculation. Any Alert that arrives with entropy below this value is never included. The default is 0.0, which means all Alerts are included.

Type: Float
Default: 0.0
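
The effect of the threshold can be sketched as a simple filter. This is an illustration only: it assumes each Alert is a dictionary with a numeric "entropy" field, and the alert IDs and entropy values are made up.

```python
ENTROPY_THRESHOLD = 0.0  # default value from moog_farmd.conf


def passes_entropy(alert: dict, threshold: float = ENTROPY_THRESHOLD) -> bool:
    """Keep an alert unless its entropy is below the threshold.

    Assumption: the alert carries a numeric "entropy" field.
    """
    return alert["entropy"] >= threshold


# Hypothetical alerts for illustration.
alerts = [
    {"alert_id": 1, "entropy": 0.05},
    {"alert_id": 2, "entropy": 0.42},
    {"alert_id": 3, "entropy": 0.91},
]

kept_default = [a["alert_id"] for a in alerts if passes_entropy(a)]
kept_strict = [a["alert_id"] for a in alerts if passes_entropy(a, 0.3)]

print(kept_default)  # [1, 2, 3] : a threshold of 0.0 keeps every alert
print(kept_strict)   # [2, 3]    : a threshold of 0.3 drops the low-entropy alert
```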

The default output parameters are as follows:

    # process_output_of : "AlertRulesEngine",
    process_output_of : "AlertBuilder",
    description       : "A Tempus (a.k.a. Sigaliser V2) Situation",
 
    # Algorithm
    entropy_threshold : 0.0,

Trigger and Sigalising Window Parameters

The execution and duration of Tempus are controlled by the trigger, window and bucket parameters:

  • The sig_interval trigger determines how often Tempus runs.
  • The window is the total span of time, in seconds, in which Alerts are analyzed each time Tempus runs.
  • Time buckets are small subdivisions of the window (five seconds each by default) in which the Alerts are captured.

sig_interval

Executes the Tempus algorithm after a defined number of seconds. By default, the Sigaliser runs every 120 seconds (two minutes).

Type: Integer
Default: 120

window_size

Determines the length of time of the window in which Alerts are analysed and a Situation develops each time the Sigaliser is run. By default the Sigalising window is 1200 seconds (20 minutes).

Type: Integer
Default: 1200
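
The relationship between sig_interval and window_size can be sketched with a little arithmetic. The example assumes that each window ends at the moment the run is triggered, which the text above does not state explicitly. The function name is hypothetical.

```python
SIG_INTERVAL = 120   # seconds between runs (default)
WINDOW_SIZE = 1200   # seconds analysed per run (default)


def window_for_run(run_time: int, window_size: int = WINDOW_SIZE) -> tuple:
    """Return (start, end) of the window analysed by a run at run_time.

    Assumption: the window ends at the moment the run is triggered.
    """
    return (run_time - window_size, run_time)


# Three consecutive runs, in seconds since an arbitrary starting point.
for run_time in (1200, 1320, 1440):
    print(run_time, window_for_run(run_time))
# 1200 (0, 1200)
# 1320 (120, 1320)
# 1440 (240, 1440)

# Under this assumption, each Alert falls inside this many consecutive windows.
print(WINDOW_SIZE // SIG_INTERVAL)  # 10
```

With the defaults, consecutive windows overlap by 1080 seconds, so a developing Situation is re-evaluated several times as new Alerts arrive.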

bucket_size

Determines the time span, in seconds, of each bucket in which Alerts are captured. By default each bucket is five seconds long, so there are 240 buckets per window.

Type: Integer
Default: 5

Moogsoft does not recommend changing the bucket size. If you do change bucket_size, do so with caution because Tempus is designed to use small bucket sizes.
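
The bucket arithmetic can be sketched as follows. The helper names and the example timestamps are hypothetical. The integer division simply illustrates how a timestamp maps to a bucket under the default sizes.

```python
WINDOW_SIZE = 1200  # seconds (default)
BUCKET_SIZE = 5     # seconds (default)


def bucket_count(window_size: int = WINDOW_SIZE, bucket_size: int = BUCKET_SIZE) -> int:
    """Number of whole buckets in one window."""
    return window_size // bucket_size


def bucket_index(timestamp: int, window_start: int, bucket_size: int = BUCKET_SIZE) -> int:
    """Zero-based index of the bucket that a timestamp falls into."""
    return (timestamp - window_start) // bucket_size


print(bucket_count())  # 240

window_start = 1_700_000_000  # arbitrary epoch second used as the window start
print(bucket_index(window_start + 12, window_start))  # 2
print(bucket_index(window_start + 14, window_start))  # 2 (same bucket as above)
print(bucket_index(window_start + 15, window_start))  # 3 (one second later, next bucket)
```

The last two lines show why small buckets are sensitive to timing: events one second apart can land in different buckets.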

arrival_spread

Sets the acceptable latency or arrival window for each Alert in seconds. This can be used to reduce the impact of multiple Alerts arriving over a short period of time and landing in separate buckets.

Type: Integer
Default: 15
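
One possible reading of arrival_spread is that each event is treated as occupying every bucket that overlaps the interval from its arrival time to arrival time plus arrival_spread. The sketch below illustrates that reading only. How Tempus actually applies the spread is an assumption here, and the function name is hypothetical.

```python
def spread_buckets(offset: int, arrival_spread: int = 15, bucket_size: int = 5) -> set:
    """Buckets overlapped by the interval [offset, offset + arrival_spread).

    Assumption: an event is counted in every bucket its arrival window touches.
    offset is the event time in seconds from the start of the window.
    """
    first = offset // bucket_size
    last = (offset + arrival_spread - 1) // bucket_size
    return set(range(first, last + 1))


# Two events two seconds apart that fall on either side of a bucket boundary.
print(4 // 5, 6 // 5)  # 0 1 : without any spread they land in different buckets

a = spread_buckets(4)
b = spread_buckets(6)
print(sorted(a))      # [0, 1, 2, 3]
print(sorted(b))      # [1, 2, 3, 4]
print(sorted(a & b))  # [1, 2, 3] : with the spread they share three buckets
```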

min_arrival_similarity

Determines how similar Alerts must be to be considered for clustering. This is a useful way to determine what proportion of events two Alerts need to share to have a similar pattern of arrival. By default this is 0.6667, which means Tempus disregards any Alerts with less than two-thirds similarity.

Type: Float
Default: 0.6667
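
The threshold can be illustrated by comparing every pair of arrival patterns and keeping only the pairs whose Jaccard index meets min_arrival_similarity. This is a sketch under assumptions: arrival patterns are modelled as sets of bucket indices, the alert names are hypothetical, and the community detection step that Tempus applies afterwards is not shown.

```python
from itertools import combinations

MIN_ARRIVAL_SIMILARITY = 0.6667  # default value from moog_farmd.conf


def jaccard(a: set, b: set) -> float:
    """Jaccard index |A & B| / |A | B|, defined here as 0.0 when both sets are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def similar_pairs(patterns: dict, threshold: float = MIN_ARRIVAL_SIMILARITY) -> list:
    """Return the pairs of alert names whose similarity meets the threshold."""
    pairs = []
    for (name_a, a), (name_b, b) in combinations(patterns.items(), 2):
        if jaccard(a, b) >= threshold:
            pairs.append((name_a, name_b))
    return pairs


# Hypothetical arrival patterns (bucket indices per alert).
patterns = {
    "switch_down": {3, 4, 10, 11},
    "port_down": {3, 4, 10, 11},  # identical to switch_down: 4/4 = 1.0
    "link_flap": {3, 4, 10},      # 3 of 4 buckets shared with switch_down: 0.75
    "disk_full": {3, 40, 95},     # 1 of 6 buckets shared with switch_down
}

print(similar_pairs(patterns))
# [('switch_down', 'port_down'), ('switch_down', 'link_flap'), ('port_down', 'link_flap')]
```

Raising the threshold toward 1.0 keeps only Alerts with almost identical arrival patterns, while lowering it admits looser matches.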

The default trigger and sigalising window parameters are as follows:

       # Triggers
       sig_interval      : 120,    # seconds => sigalise every 2 minutes

       # Sigalising Window
       window_size       : 1200,   # seconds => 20 minutes
       bucket_size       : 5,      # seconds : Take Care if changing - Tempus is designed to use small bucket sizes
       arrival_spread    : 15,     # seconds : acceptable latency/arrival window for each event

Partitioning

Partitioning is set to 'null' by default. There are two methods to partition data into Situations. The first is 'partition_by' which splits the clusters according to the parameters specified. The second is 'pre_partition', which splits the incoming event stream before clustering. 

Pre-partitioning is recommended because it does not interfere with the results of the clustering algorithms.

partition_by

After clustering has taken place and before you enter merging and resolution, you can split clusters into sub-clusters based on a component of the events. For example, you can use the manager parameter to ensure the Situations only contain events from the same manager. In general, and by default, you should comment out the partition_by parameter.

Partitioning by components is not recommended.

pre_partition

An alternative way of partitioning is to use pre_partition, which allows you to specify a component field (from the list of specified components) around which the event stream is partitioned before clustering occurs. The Alerts in each resulting Situation all share a single value for the chosen component field.
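
The difference between the two methods is the order of operations. The sketch below uses a deliberately simple stand-in clusterer (alerts in adjacent buckets are chained together) rather than the Tempus algorithm, and the alert records and function names are hypothetical. It only shows why partitioning before clustering and partitioning after clustering can produce different groupings.

```python
from collections import defaultdict


def chain_cluster(alerts: list, max_gap: int = 1) -> list:
    """Toy stand-in for clustering: chain alerts whose buckets are at most max_gap apart."""
    clusters = []
    for alert in sorted(alerts, key=lambda a: a["bucket"]):
        if clusters and alert["bucket"] - clusters[-1][-1]["bucket"] <= max_gap:
            clusters[-1].append(alert)
        else:
            clusters.append([alert])
    return clusters


def split_by(alerts: list, field: str) -> list:
    """Group alerts by the value of one field, preserving first-seen order."""
    groups = defaultdict(list)
    for alert in alerts:
        groups[alert[field]].append(alert)
    return list(groups.values())


def pre_partition(alerts: list, field: str) -> list:
    """Split the stream first, then cluster each stream separately."""
    return [c for stream in split_by(alerts, field) for c in chain_cluster(stream)]


def partition_by(alerts: list, field: str) -> list:
    """Cluster the whole stream first, then split each cluster."""
    return [part for c in chain_cluster(alerts) for part in split_by(c, field)]


def ids(clusters: list) -> list:
    return [[a["id"] for a in c] for c in clusters]


alerts = [
    {"id": 1, "manager": "A", "bucket": 1},
    {"id": 2, "manager": "B", "bucket": 2},
    {"id": 3, "manager": "A", "bucket": 3},
]

print(ids(pre_partition(alerts, "manager")))  # [[1], [3], [2]]
print(ids(partition_by(alerts, "manager")))   # [[1, 3], [2]]
```

In this toy case, alerts 1 and 3 end up together under partition_by only because the alert from manager B bridged them during clustering. Pre-partitioning removes that influence, since each stream is clustered on its own.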

Tempus Example 

Tempus appears in moog_farmd.conf as shown below:

                    {
                        # Moolet
                        name              : "Tempus",
                        classname         : "com.moogsoft.farmd.moolet.tempus.CTempus",
                        run_on_startup    : false,
                        persist_state     : false,
                        metric_path_moolet   : true,
                        #process_output_of : "AlertRulesEngine",
                        process_output_of : "AlertBuilder",
                        description       : "A Tempus (a.k.a. Sigaliser V2) Situation",

                        # Algorithm
                        entropy_threshold : 0.0,

                        # Triggers
                        sig_interval      : 120,    # seconds => sigalise every 2 minutes

                        # Sigalising Window
                        window_size       : 1200,   # seconds => 20 minutes
                        bucket_size       : 5,      # seconds : Take Care if changing - Tempus is designed to use small bucket sizes
                        arrival_spread    : 15,     # seconds : acceptable latency/arrival window for each event

                        # How similar must alerts be to be considered for clustering?
                        #min_arrival_similarity : 0.6667,

                        #
                        # Pre-partitioning partitions the events into separate streams BEFORE clustering
                        # based upon the value of the selected component. Sigs generated with partitioning
                        # will each contain a single value for the component.
                        #
                        # Multiple values can be provided in a list in a custom_info field. To make multiple
                        # pre-partitions using the values in the list individually, set "treat_as : list",
                        # e.g. pre_partition   : { name: "custom_info.list_field_name", treat_as: "list" }
                        # The pre_partition value will be treated as a string if treat_as is not specified.
                        #
                        pre_partition   : null,
                        #
                        # We can choose a single attribute as a singleton component, which splits
                        # the discovered sigs so that they only ever contain events with a unique value
                        # of the supplied attribute. In that way we force partition the situations by
                        # that component.
                        #
                        # NOTE: this differs from pre-partitioning in that the clusters are generated from
                        # the whole data set and only then partitioned based upon the selected component.
                        # This configuration parameter will generally result in a different set of Sigs being
                        # generated than would be produced using pre-partitioning. It is possible to configure
                        # both pre_partition and partition_by at the same time. The partition_by parameter will
                        # only have any effect if it is applied to a different component.
                        #
                        partition_by   : null
                    }