Moogsoft Docs

Tempus

Tempus is the time-based algorithm in Moogsoft AIOps which clusters alerts into Situations based on the similarity of their timestamps.

The underlying premise of Tempus is that when things go wrong, they go wrong together. For example, if a core element of your network infrastructure, such as a switch, fails and disconnects, it affects many other interconnected elements, which all send events at a similar time.

Tempus uses the Jaccard index to calculate the similarity of different alerts. It also uses community detection methods to identify which alerts with similar arrival patterns it should cluster into Situations.
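
As a rough illustration of the similarity measure, the sketch below computes the Jaccard index of two alerts. It assumes each alert is reduced to the set of time-bucket indices in which its events arrived. That representation, the function name jaccard_index and the sample alerts are illustrative assumptions, not Tempus internals.

```python
def jaccard_index(a: set[int], b: set[int]) -> float:
    """Return the size of the intersection divided by the size of the union.

    Two empty sets are treated as having no similarity (0.0).
    """
    union = a | b
    if not union:
        return 0.0
    return len(a & b) / len(union)


# Hypothetical arrival patterns: bucket indices in which each alert saw events.
switch_down = {3, 4, 10}
link_down = {3, 4, 11}
disk_full = {40, 90}

similar = jaccard_index(switch_down, link_down)    # 2 shared buckets out of 4
unrelated = jaccard_index(switch_down, disk_full)  # no shared buckets
```

Alerts whose events fall into mostly the same buckets score close to 1.0, while alerts that never arrive together score 0.0.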

As Tempus is time-based, you should not use it to detect events relating to the slow or gradual degradation of a service, such as disks filling up or rising CPU usage.

Note

One advantage of Tempus is that it uses only event timestamps for clustering, so no alert enrichment is required.

Time-based Clustering

AIOps applies Tempus incrementally to alerts as it ingests them so that it can create Situations in real time.

The diagrams below show how Tempus sorts and then groups alerts with similar timestamps into Situations:

Raw alerts from either the AlertBuilder or Alert Rules Engine arrive over a period of time. These are shown as gray dots in the diagram below:

Tempus identifies and sorts which alerts have similar arrival patterns:

Alerts with similar arrival patterns are clustered into Situations:

Configure Tempus

Tempus is configured and tuned using parameters in moog_farmd.conf. The Moolet parameters configure general information about each Sigaliser. The Output parameters control which Moolet's output Tempus processes. The Trigger and Sigalising parameters control the Sigaliser execution and duration.

Moolet Parameters

The parameters that relate to the Tempus Moolet are as follows:

run_on_startup : Determines whether Tempus runs when Moogsoft AIOps starts. If enabled, Tempus captures all alerts from the moment the system starts, without you having to configure or start it manually.

Type : Boolean
Default : false

persist_state : Enables Tempus to save its state for High Availability systems so if a failover occurs, the second moogfarmd can continue from the same point.

Type : Boolean
Default : false

metric_path_moolet : Determines whether Tempus is factored into the Event Processing metric for Self Monitoring.

Type : Boolean
Default : false

description : Describes the Situation produced by the Sigaliser.

Type : String
Default : A Tempus (a.k.a. Sigaliser V2) Situation

The default Tempus parameters are as follows:

    name              : "Tempus",
    classname         : "com.moogsoft.farmd.moolet.tempus.CTempus",
    run_on_startup    : false,
    persist_state     : false,
    metric_path_moolet : true,
    #process_output_of : "AlertRulesEngine",
    process_output_of : "AlertBuilder",
    description       : "A Tempus (a.k.a. Sigaliser V2) Situation",

Note

name and classname are hardcoded and should not be changed.

Output Parameters

These parameters control the output processed by the Sigaliser:

process_output_of : Defines the source of the alerts that Tempus processes. By default, the Sigaliser connects directly to the Alert Builder. Use the Alert Rules Engine only if you want automations to run before Situation resolution.

Type : List
One of: AlertBuilder, AlertRulesEngine, MaintenanceWindowManager, EmptyMoolet
Default : AlertBuilder

entropy_threshold : Sets the minimum entropy value for an alert to be clustered into a Situation. Tempus does not include any alerts with an entropy value below the threshold in Situations. Set to a value between 0.0 and 1.0. The default of 0.0 means all alerts are processed.

Type : Float
Default : 0.0
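
The effect of entropy_threshold can be pictured as a simple filter applied before clustering. The sketch below is a minimal illustration only. The alert dictionaries, the alert_id and entropy field names and the filter_by_entropy helper are hypothetical and do not reflect how Tempus stores alerts.

```python
def filter_by_entropy(alerts: list[dict], entropy_threshold: float = 0.0) -> list[dict]:
    """Keep alerts whose entropy is at or above the threshold."""
    return [a for a in alerts if a["entropy"] >= entropy_threshold]


# Hypothetical alerts with precomputed entropy values between 0.0 and 1.0.
alerts = [
    {"alert_id": 1, "entropy": 0.05},
    {"alert_id": 2, "entropy": 0.40},
    {"alert_id": 3, "entropy": 0.90},
]

all_alerts = filter_by_entropy(alerts)                            # default 0.0 keeps everything
noisy_removed = filter_by_entropy(alerts, entropy_threshold=0.3)  # drops alert 1
```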

The default output parameters are as follows:

    # process_output_of : "AlertRulesEngine",
    process_output_of : "AlertBuilder",
    description       : "A Tempus (a.k.a. Sigaliser V2) Situation",
 
    # Algorithm
    entropy_threshold : 0.0,

Trigger and Sigalising Window Parameters

The execution and duration of Tempus are controlled by the trigger, window and bucket parameters:

  • The sig_interval trigger determines when Tempus starts to run.
  • The window is the total span of time, in seconds, in which Alerts are analyzed each time Tempus runs.
  • Time buckets are small subdivisions of the window (five seconds by default) in which the Alerts are captured.

sig_interval : Executes the Tempus algorithm after a defined number of seconds. By default, the Sigaliser runs every 120 seconds (two minutes).

Type : Integer
Default : 120

window_size : Determines the length of time of the window in which Alerts are analysed and a Situation develops each time the Sigaliser is run. By default the Sigalising window is 1200 seconds (20 minutes).

Type : Integer
Default : 1200

bucket_size : Determines the time span, in seconds, of each bucket in which Alerts are captured. By default each bucket is five seconds long, so there are 240 buckets per window.

Type : Integer
Default : 5

Warning

Moogsoft does not recommend changing the bucket size. If you do change bucket_size, do so with caution because Tempus is designed to use small bucket sizes.
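
The relationship between the window, the buckets and the trigger is plain arithmetic, sketched below with the default values. The helper names bucket_count, bucket_index and intervals_per_window are hypothetical, and mapping a timestamp to a bucket by integer division is an assumption made for illustration.

```python
def bucket_count(window_size: int = 1200, bucket_size: int = 5) -> int:
    """Number of whole buckets that fit into one Sigalising window."""
    return window_size // bucket_size


def bucket_index(timestamp: int, window_start: int, bucket_size: int = 5) -> int:
    """Index of the bucket that a timestamp (in seconds) falls into."""
    return (timestamp - window_start) // bucket_size


def intervals_per_window(window_size: int = 1200, sig_interval: int = 120) -> int:
    """How many sig_interval periods make up one window."""
    return window_size // sig_interval


buckets = bucket_count()                          # 1200 / 5
third_bucket = bucket_index(1_000_012, 1_000_000) # 12 seconds into the window
intervals = intervals_per_window()                # 1200 / 120
```

With the defaults this gives 240 buckets, and the window is ten sig_interval periods long.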

arrival_spread : Sets the acceptable latency or arrival window for each Alert in seconds. This can be used to reduce the impact of multiple Alerts arriving over a short period of time and landing in separate buckets.

Type : Integer
Default : 15
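
To see why an arrival window helps, consider two related events that arrive eight seconds apart and therefore land in different five-second buckets. The sketch below shows one possible way of spreading each event over the buckets covered by its arrival window. The exact mechanism Tempus uses is not described here, so the forward-looking window and the buckets_hit helper are assumptions for illustration only.

```python
def buckets_hit(timestamp: int, bucket_size: int = 5, arrival_spread: int = 0) -> set[int]:
    """Bucket indices covered by the span [timestamp, timestamp + arrival_spread].

    Assumption: the arrival window extends forward from the event time.
    """
    first = timestamp // bucket_size
    last = (timestamp + arrival_spread) // bucket_size
    return set(range(first, last + 1))


# Without any spread, events at 3s and 11s share no bucket.
no_spread_overlap = buckets_hit(3) & buckets_hit(11)

# With a 15-second spread, the two events overlap in buckets 2 and 3.
spread_overlap = buckets_hit(3, arrival_spread=15) & buckets_hit(11, arrival_spread=15)
```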

min_arrival_similarity : Determines how similar Alerts must be to be considered for clustering. This is a useful way to set what proportion of events two Alerts need to share to have a similar pattern of arrival. By default this is 0.6667, which means Tempus disregards any Alerts with less than two-thirds similarity.

Type : Float
Default : 0.6667
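
The threshold can be read as a cut-off on pairwise similarity. The sketch below lists the pairs of alerts whose arrival patterns meet min_arrival_similarity. It assumes each alert is represented by the set of bucket indices in which its events arrived. The similar_pairs helper and the sample patterns are hypothetical.

```python
from itertools import combinations


def jaccard(a: set[int], b: set[int]) -> float:
    """Shared buckets divided by all buckets seen by either alert."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def similar_pairs(
    patterns: dict[str, set[int]], min_arrival_similarity: float = 0.6667
) -> list[tuple[str, str]]:
    """Return the alert pairs whose similarity is at or above the threshold."""
    return [
        (x, y)
        for x, y in combinations(sorted(patterns), 2)
        if jaccard(patterns[x], patterns[y]) >= min_arrival_similarity
    ]


# Hypothetical arrival patterns keyed by alert name.
patterns = {
    "A": {1, 2, 3},
    "B": {1, 2, 3, 4},  # shares 3 of 4 buckets with A
    "C": {1, 2, 9},     # shares 2 of 4 buckets with A
}

candidates = similar_pairs(patterns)
```

Here only A and B pass the default threshold (0.75). A and C score 0.5, and B and C score 0.4, so neither pair is considered for clustering.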

The default trigger and sigalising window parameters are as follows:

       # Triggers
       sig_interval      : 120,    # seconds => sigalise every 2 minutes

       # Sigalising Window
       window_size       : 1200,   # seconds => 20 minutes
       bucket_size       : 5,      # seconds : Take Care if changing - Tempus is designed to use small bucket sizes
       arrival_spread    : 15,     # seconds : acceptable latency/arrival window for each event

Partitioning

Partitioning is set to 'null' by default. There are two methods to partition data into Situations. The first is 'partition_by' which splits the clusters according to the parameters specified. The second is 'pre_partition', which splits the incoming event stream before clustering.

Note

Pre-partitioning is recommended as it does not interfere with the results of the clustering algorithms.

partition_by : After clustering has taken place and before merging and resolution, you can split clusters into sub-clusters based on a component of the events. For example, you can use the manager parameter to ensure the Situations only contain events from the same manager. In general, and by default, you should comment out the partition_by parameter.

Warning

Partitioning by components is not recommended.

pre_partition : An alternative way of partitioning is to use pre_partition which allows you to specify a component field (from the list of specified components) around which the event stream will be partitioned before clustering occurs. The Alerts in the resulting Situations will each contain a single value for the component field chosen.
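
Conceptually, pre-partitioning splits the event stream into independent streams, one per value of the chosen component field, and each stream is then clustered separately. The sketch below shows only the splitting step. The event dictionaries, the pre_partition_stream helper and the sample field values are hypothetical.

```python
from collections import defaultdict


def pre_partition_stream(events: list[dict], field: str) -> dict[str, list[dict]]:
    """Group events by the value of one component field, preserving arrival order."""
    streams: dict[str, list[dict]] = defaultdict(list)
    for event in events:
        streams[event[field]].append(event)
    return dict(streams)


# Hypothetical events carrying a "manager" component.
events = [
    {"alert_id": 1, "manager": "nagios"},
    {"alert_id": 2, "manager": "zabbix"},
    {"alert_id": 3, "manager": "nagios"},
]

streams = pre_partition_stream(events, "manager")
```

Each resulting stream contains a single value for the chosen field, which is why the Alerts in any Situation built from it share that value.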

Tempus Example

Tempus appears in moog_farmd.conf as shown below:

{
	# Moolet
	name              	: "Tempus",
	classname         	: "com.moogsoft.farmd.moolet.tempus.CTempus",
	run_on_startup    	: false,
	persist_state     	: false,
	metric_path_moolet 	: true,
	#process_output_of 	: "AlertRulesEngine",
	process_output_of	: "AlertBuilder",
	description       	: "A Tempus (a.k.a. Sigaliser V2) Situation",

	# Algorithm
	entropy_threshold 	: 0.0,

	# Triggers
	sig_interval      	: 120,    # seconds => sigalise every 2 minutes

	# Sigalising Window
	window_size       	: 1200,   # seconds => 20 minutes
	bucket_size       	: 5,      # seconds : Take Care if changing - Tempus is designed to use small bucket sizes
	arrival_spread    	: 15,     # seconds : acceptable latency/arrival window for each event

	# How similar must alerts be to be considered for clustering?
	#min_arrival_similarity : 0.6667,

	pre_partition     	: null,
	partition_by      	: null
}