Tempus for Moogsoft AIOps is a time-based algorithm which clusters Alerts into Situations based on the similarity of their timestamps.
The underlying premise of Tempus is that when things go wrong, they go wrong together. For example, if a core element of your network infrastructure, such as a switch, fails and becomes disconnected, many other interconnected elements are affected and send events at a similar time.
Tempus uses the Jaccard index to calculate the similarity of different Alerts. Then community detection methods are used to identify which Alerts with similar arrival patterns should be clustered into Situations.
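As an illustration of the similarity measure, the sketch below computes the Jaccard index for two Alerts. It assumes each Alert is represented by the set of time-bucket indices in which its events arrived. That representation, the function name and the sample data are hypothetical and are not taken from the Tempus implementation.

```python
from __future__ import annotations


def jaccard_index(buckets_a: set[int], buckets_b: set[int]) -> float:
    """Return |A intersect B| / |A union B| for two sets of bucket indices."""
    union = buckets_a | buckets_b
    if not union:
        # Two alerts with no events at all share nothing; treat as dissimilar.
        return 0.0
    return len(buckets_a & buckets_b) / len(union)


# Hypothetical alerts: the bucket indices in which each alert's events arrived.
switch_down = {3, 4, 10, 11}
link_lost = {3, 4, 10, 12}
disk_full = {50, 90}

print(jaccard_index(switch_down, link_lost))  # 3 shared of 5 distinct buckets: 0.6
print(jaccard_index(switch_down, disk_full))  # no shared buckets: 0.0
```

A value of 1.0 means the two Alerts always arrived in the same buckets, and 0.0 means they never did.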
As it is time-based, Tempus should not be used to detect events relating to the slow or gradual degradation of a service, such as disks filling up or rising CPU usage.
One advantage of Tempus is that it uses only event timestamps for clustering, so no Alert enrichment is required.
AIOps applies Tempus incrementally to Alerts as it ingests them so that it can create Situations in real-time.
The diagrams below show how Tempus sorts and then groups Alerts with similar timestamps into Situations:
Raw Alerts from either the AlertBuilder or Alert Rules Engine arrive over a period of time. These are shown as gray dots in the diagram below:
Tempus identifies and sorts which Alerts have similar arrival patterns:
Alerts with similar arrival patterns are clustered into Situations:
Tempus can be configured and tuned using parameters in moog_farmd.conf. The Moolet parameters configure general information about each Sigaliser. The Output parameters determine which Moolet's output Tempus processes. The Trigger and Sigalising parameters control the Sigaliser execution and duration.
The parameters which relate to the Tempus Moolet are as follows:
Please note: name and classname are hardcoded and should not be changed.
Determines whether Tempus runs when AIOps starts. When you run Tempus at startup, you ensure you capture every Alert.
Enables Tempus to save its state for High Availability systems so if a failover occurs, the second moogfarmd can continue from the same point.
Describes the Situation produced by the Sigaliser.
A Tempus (a.k.a. Sigaliser V2) Situation
These parameters control the output processed by the Sigaliser:
Informs AIOps to process the output of either the Alert Builder or the Alert Rules Engine. By default, the Sigaliser connects directly to the Alert Builder; the Alert Rules Engine is used only if automations are required before Situation resolution. The Sigaliser can have only one input.
Sets the minimum entropy an Alert must have to be included in the Sigaliser calculation. Any Alert which arrives with entropy below this value will never be included. The default is 0.0, which means all Alerts will be included.
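The effect of the entropy threshold can be sketched as a simple filter. The dictionary layout, the field names and the function name below are assumptions made for illustration; only the default of 0.0 and the rule that Alerts below the threshold are excluded come from the description above.

```python
ENTROPY_THRESHOLD = 0.0  # default from the text: every alert is included


def passes_entropy_filter(alert: dict, threshold: float = ENTROPY_THRESHOLD) -> bool:
    """Keep an alert unless its entropy is below the threshold."""
    return alert["entropy"] >= threshold


# Hypothetical alerts with made-up entropy values.
alerts = [
    {"id": 1, "entropy": 0.05},
    {"id": 2, "entropy": 0.40},
    {"id": 3, "entropy": 0.75},
]

kept_default = [a["id"] for a in alerts if passes_entropy_filter(a)]
kept_strict = [a["id"] for a in alerts if passes_entropy_filter(a, threshold=0.3)]

print(kept_default)  # all three alerts pass at the default of 0.0
print(kept_strict)   # alert 1 is dropped once the threshold is raised to 0.3
```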
Trigger and Sigalising Window Parameters
The execution and duration of Tempus are controlled by the trigger, window and bucket parameters:
- The sig_interval trigger determines when Tempus starts to run.
- The window is the total span of time, in seconds, in which Alerts will be analyzed each time Tempus runs.
- Time buckets are small five-second subdivisions of the window in which the Alerts are captured.
Executes the Tempus algorithm after a defined number of seconds. In the example above, the Sigaliser will run every 120 seconds (two minutes).
Determines the length of time of the window in which Alerts are analysed and a Situation develops each time the Sigaliser is run. By default the Sigalising window is 1200 seconds (20 minutes).
Determines the time span, in seconds, of each bucket in which Alerts are captured. By default each bucket is five seconds long, so there will be 240 buckets per 1200-second window.
Moogsoft do not recommend you change the bucket size. If you do want to change bucket_size, do so with caution because Tempus is designed to use small bucket sizes.
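The relationship between the window and the buckets can be checked with a little arithmetic. The sketch below uses the default values quoted above; the function names, the constant names and the way a timestamp is mapped to a bucket are illustrative assumptions, not the Tempus implementation.

```python
WINDOW_SECONDS = 1200  # default Sigalising window (20 minutes)
BUCKET_SECONDS = 5     # default bucket size


def bucket_count(window_seconds: int, bucket_seconds: int) -> int:
    """Number of whole buckets that fit in one window."""
    return window_seconds // bucket_seconds


def bucket_index(timestamp: float, window_start: float, bucket_seconds: int) -> int:
    """Index of the bucket that a timestamp falls into, counted from the window start."""
    return int((timestamp - window_start) // bucket_seconds)


print(bucket_count(WINDOW_SECONDS, BUCKET_SECONDS))  # 240 buckets per window

# Hypothetical window starting at t=1000 seconds.
print(bucket_index(1000, 1000, BUCKET_SECONDS))  # first bucket, index 0
print(bucket_index(1007, 1000, BUCKET_SECONDS))  # 7 seconds in, index 1
```

Doubling the bucket size to ten seconds would halve the bucket count to 120, which coarsens the arrival pattern that Tempus compares.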
Sets the acceptable latency or arrival window for each Alert in seconds. This can be used to reduce the impact of multiple Alerts arriving over a small amount of time and landing in separate buckets.
Determines how similar Alerts must be to be considered for clustering. This is a useful way to determine what proportion of the events two Alerts need to share to have a similar pattern of arrival. By default this is 0.6667, which means Tempus will disregard any Alerts with less than two-thirds similarity.
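To show how a similarity threshold of this kind prunes candidate pairs, the sketch below keeps only the Alert pairs whose Jaccard index reaches 0.6667. The data layout, the function names and the use of an inclusive comparison are assumptions for illustration. The community detection step that Tempus applies afterwards is not shown.

```python
from __future__ import annotations

from itertools import combinations

MIN_SIMILARITY = 0.6667  # default quoted in the text


def jaccard_index(a: set[int], b: set[int]) -> float:
    """Jaccard index of two sets of bucket indices."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def similar_pairs(arrivals: dict[str, set[int]], threshold: float) -> list[tuple[str, str]]:
    """Return alert pairs whose similarity is at least the threshold (assumed inclusive)."""
    return [
        (x, y)
        for x, y in combinations(sorted(arrivals), 2)
        if jaccard_index(arrivals[x], arrivals[y]) >= threshold
    ]


# Hypothetical arrival patterns: alert name -> bucket indices with events.
arrivals = {
    "A": {1, 2, 3, 4},
    "B": {1, 2, 3, 4, 5},  # shares 4 of 5 buckets with A (0.8)
    "C": {1, 2, 9, 10},    # shares 2 of 6 buckets with A (about 0.33)
}

print(similar_pairs(arrivals, MIN_SIMILARITY))  # only the A-B pair survives
```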
Partitioning is set to 'null' by default. There are two methods to partition data into Situations. The first is 'partition_by' which splits the clusters according to the parameters specified. The second is 'pre_partition', which splits the incoming event stream before clustering.
Please note: Pre-partitioning is recommended as it does not interfere with the results of the clustering algorithms.
After clustering has taken place and before you enter merging and resolution, you can split clusters into sub-clusters based on a component of the events. For example, you can use the manager parameter to ensure the Situations only contain events from the same manager. In general, and by default, you should comment out the
Partitioning by components is not recommended
An alternative way of partitioning is to use pre_partition, which allows you to specify a component field (from the list of specified components) around which the event stream will be partitioned before clustering occurs. The Alerts in the resulting Situations will each contain a single value for the component field chosen.
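The idea of pre-partitioning can be sketched as splitting the incoming stream by one field before any clustering runs, so that each stream is clustered independently. The alert dictionaries, the field values and the function name below are hypothetical; only the use of a single component field such as manager follows the description above.

```python
from __future__ import annotations

from collections import defaultdict


def pre_partition(alerts: list[dict], field: str) -> dict[str, list[dict]]:
    """Split the alert stream into one sub-stream per value of the chosen field."""
    streams: dict[str, list[dict]] = defaultdict(list)
    for alert in alerts:
        streams[alert[field]].append(alert)
    return dict(streams)


# Hypothetical incoming alerts with a 'manager' component field.
alerts = [
    {"id": 1, "manager": "SNMP", "time": 1001},
    {"id": 2, "manager": "Syslog", "time": 1002},
    {"id": 3, "manager": "SNMP", "time": 1003},
]

streams = pre_partition(alerts, "manager")
for manager, stream in streams.items():
    # Each stream would be clustered on its own, so a resulting Situation
    # can only contain alerts that share this manager value.
    print(manager, [a["id"] for a in stream])
```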
Tempus appears in moog_farmd.conf as shown below: