The Data Config > Correlation Engine page in the UI lists the active definitions in your Moogsoft instance. Each definition specifies the following:

• The description to apply to incidents created by this correlation, based on alert data of interest.

• The definition scope — you can apply a correlation to all alerts, or specify an alert filter for matching alerts only.

• The alert fields and tags to consider for correlation, such as source or service.

Note

A definition can have multiple fields and tags. You can also specify a similarity threshold for each element to correlate values that are similar but not identical. See Fields and Tags to correlate below.

• The correlation time window — that is, how long an incident is a candidate for correlation using this definition.

You can also use the Correlations API to create, retrieve, update, and delete correlation definitions.

Before you begin

Do the following:

1. Set up your event ingestions.

2. Examine your alerts to determine if they include the data that you want to use for correlation. If they do not, set up your alert enrichments to add this data.

Correlation Settings

To define a correlation, go to Data Config > Correlation Engine and then click Add Correlation Definition. Each definition has the following settings:

• Correlation Name

• Incident description

The description to use for all incidents that get generated from this correlation definition. These descriptions appear in the Incidents table. You can use macros to generate incident descriptions dynamically based on the member alerts, as described below.

• Scope

You can specify an alert filter to limit the scope of alerts to consider for a specific correlation.

• Fields and Tags to correlate

The set of alert fields to consider for correlation, and the similarity required for a match between an alert and an incident.

• Correlation time period

The window for correlating alerts into the same incident.

Incident description

You can specify incident descriptions and fields dynamically, based on the alert data in each incident. For example, suppose you are defining a correlation based on the Service alert field. You can then specify a label string such as:

Service Incident: cited(service) in classes unique(class,3) for cited(check,2) checks

Given this string, each resulting description lists the most-cited services, up to three unique classes, and the two most-cited checks among the member alerts. For example:

Service Incident: ShoppingCart, Online Store in classes Storage, Compute, Network for Disk, CPU checks

Incident macros

You can use the following macros to generate incident descriptions:

• Count (alert-field) — Return the count of alert-field citations, including duplicates.

• Unique Count (alert-field) — Return the count of unique alert-field citations, excluding duplicates.

• To List (alert-field) — Return a comma-separated string of all elements in a list, including duplicates.

• Unique (alert-field, N) — Return a comma-separated string of N unique elements in a list, excluding duplicates.

• Top (alert-field) — Return the top-cited item.

• Cited (alert-field, N) — Return a list of the top-cited N items. If two or more items have the same number of citations, the items are sorted alphabetically.
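The macro behavior described above can be sketched in Python. This is only an illustration, not the product's implementation: the function names, the first-seen ordering used by unique, and the sample alert values are assumptions made for this example.

```python
from collections import Counter

# Hypothetical stand-ins for the incident macros; each takes the list of
# values that the member alerts hold for one alert field.

def count(values):
    """Count of citations, including duplicates."""
    return len(values)

def unique_count(values):
    """Count of citations, excluding duplicates."""
    return len(set(values))

def to_list(values):
    """Comma-separated string of all values, including duplicates."""
    return ", ".join(values)

def unique(values, n=None):
    """Comma-separated string of up to n distinct values (first-seen order is an assumption)."""
    distinct = list(dict.fromkeys(values))
    return ", ".join(distinct[:n])

def cited(values, n=None):
    """Top-cited n values; ties are sorted alphabetically."""
    counts = Counter(values)
    ranked = sorted(counts, key=lambda v: (-counts[v], v))
    return ", ".join(ranked[:n])

def top(values):
    """The single top-cited value."""
    return cited(values, 1)

# Made-up member-alert data chosen to reproduce the sample description.
services = ["ShoppingCart", "Online Store", "ShoppingCart"]
classes = ["Storage", "Compute", "Storage", "Network", "Database"]
checks = ["Disk", "CPU", "Disk", "CPU", "Disk", "Memory"]

description = (
    f"Service Incident: {cited(services)} in classes {unique(classes, 3)} "
    f"for {cited(checks, 2)} checks"
)
print(description)
# Service Incident: ShoppingCart, Online Store in classes Storage, Compute, Network for Disk, CPU checks
```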

Scope

If the correlation is relevant only to a subset of alerts, you can enter a filter string to consider only alerts of interest.

Fields and Tags to correlate

The set of alert fields and tags to consider for correlation, and the similarity required for a match between an alert and an incident. Two alerts are considered correlated if all the fields and tags in the definition meet the specified degree of similarity. For specific guidance, see Best practices for defining correlations.

The similarity threshold is the required degree of similarity between the same field in a new alert and in an open alert. Moogsoft uses the bag-of-words model and shingling, two natural-language processing methods, to calculate the text similarity between two fields.

The correlation engine calculates the similarity differently depending on the field type, as described in the following sections.

This section describes similarity for the source field and all custom tag fields. The following example illustrates how the correlation engine determines if two fields are similar.

1. A correlation definition specifies service as the one field to correlate, with a similarity threshold of 80%.

2. The correlation engine receives an alert with service = loginver012. An open alert has service = loginver011.

3. To determine if the two fields are similar, the engine does the following:

1. Splits each string into a set of shingles based on the default shingle size, which is 3.

2. Compares each 3-character sequence in the new-alert field with the corresponding sequence in the open-alert field:

log ogi gin inv nve ver er0 r01 011

log ogi gin inv nve ver er0 r01 012

3. Calculates the similarity using the Sørensen–Dice coefficient:

(number_of_identical_shingles * 2) / total_number_of_shingles

In this case, 8 out of the 9 shingles are identical between the two fields. The total number of shingles in both fields is 18:

8 * 2 / 18 = 0.88...

4. Checks the similarity threshold for this field. The calculated similarity of about 89% exceeds the 80% threshold, so the two values are similar.
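The steps above can be reproduced with a short Python sketch. The function names are hypothetical, and counting identical shingles as a multiset intersection is an assumption; the engine's exact matching rule is not specified here.

```python
from collections import Counter

def shingles(text: str, size: int = 3) -> list[str]:
    """Return every overlapping `size`-character substring of `text`."""
    return [text[i:i + size] for i in range(len(text) - size + 1)]

def dice_similarity(a: str, b: str, size: int = 3) -> float:
    """Sørensen–Dice coefficient over the shingles of two strings."""
    sa, sb = shingles(a, size), shingles(b, size)
    total = len(sa) + len(sb)
    if total == 0:
        return 0.0
    # Assumption: a shingle counts as identical once per occurrence in both strings.
    shared = sum((Counter(sa) & Counter(sb)).values())
    return 2 * shared / total

similarity = dice_similarity("loginver012", "loginver011")
print(f"{similarity:.3f}")   # 0.889
print(similarity >= 0.80)    # True: the values meet the 80% threshold
```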

This section describes similarity for location and service fields.

The location field is a JSON object that can include component fields such as city and region. The service field is a JSON array that can include one or more strings. In both cases, the correlation engine converts the object or array into a single string and then applies shingle similarity to the resulting strings.

For example, suppose an alert arrives with "location": { "region" : "us-west-1", "building":"sfhqlc" }. The engine creates a string that includes all characters except spaces, which are removed:

{"region":"us-west-1","building":"sfhqlc"}

The engine converts arrays similarly. Suppose an alert has a list of services: "service": [ "retail", "support" ]. The engine creates the string:

["retail","support"]
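As a rough illustration of this conversion, the following Python sketch serializes an object or array compactly and strips any remaining spaces. The flatten helper is hypothetical; the engine's actual serialization is not documented here.

```python
import json

def flatten(value) -> str:
    """Serialize a JSON-compatible value to a string with all spaces removed."""
    compact = json.dumps(value, separators=(",", ":"))
    # Also drop spaces that appear inside keys or values.
    return compact.replace(" ", "")

print(flatten({"region": "us-west-1", "building": "sfhqlc"}))
# {"region":"us-west-1","building":"sfhqlc"}
print(flatten(["retail", "support"]))
# ["retail","support"]
```

The resulting strings can then be compared with the same shingle-based calculation used for other string fields.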

This section describes similarity calculations for agent, class, description, and manager fields that contain multiple words.

agent, class, description, and manager are string fields that might consist of multiple words separated by spaces. Instead of splitting the string into shingles, the correlation engine splits each string into words with space characters as the delimiters. Then it calculates the similarity between the two strings as described in Shingle similarity.

The following example illustrates how the correlation engine calculates similarity between two multi-word strings.

1. A correlation definition specifies class as the one field to correlate, with a similarity threshold of 80%.

2. Two open alerts have the following classes:

• Alert 1: "class" : "HTTP 5xx% mcrsrvc-login1"

• Alert 2: "class" : "HTTP 5xx% mcrsrvc-dbsrv3"

3. Alert 3 arrives with "class" : "HTTP 5xx% mcrsrvc-login2".

4. Comparing Alert 3 with Alert 1, the engine has 40 tokens total with 19 matches. (In this example, the underscores indicate spaces.)

HTT TTP TP_ P_5 _5x 5xx xx% x%_ %_m mar crs rsv srv vc- c-l -lo log ogi gin in1

HTT TTP TP_ P_5 _5x 5xx xx% x%_ %_m mar crs rsv srv vc- c-l -lo log ogi gin in2

The correlation engine feeds these numbers into the Sørensen–Dice coefficient:

(19 * 2) / 40 = 0.95

These classes meet the 80% similarity threshold. The engine clusters Alert 3 with Alert 1.

5. Comparing Alert 3 with Alert 2, the engine has 40 tokens total with 14 matches.

HTT TTP TP_ P_5 _5x 5xx xx% x%_ %_m mar crs rsv srv vc- c-l -lo log ogi gin in2

HTT TTP TP_ P_5 _5x 5xx xx% x%_ %_m mar crs rsv srv vc- c-d -db dbs bsr srv rv3

The correlation engine feeds these numbers into the Sørensen–Dice coefficient:

(14 * 2) / 40 = 0.7

These classes do not meet the 80% similarity threshold. The engine does not cluster Alert 3 with Alert 2.
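The arithmetic in this example can be checked with a few lines of Python. The sketch takes the match and token counts stated above as given rather than re-tokenizing the strings; the dice helper name is an assumption.

```python
def dice(matches: int, total_tokens: int) -> float:
    """Sørensen–Dice coefficient from a match count and the combined token count."""
    return 2 * matches / total_tokens

THRESHOLD = 0.80  # similarity threshold from the correlation definition

comparisons = [
    ("Alert 3 vs Alert 1", 19, 40),
    ("Alert 3 vs Alert 2", 14, 40),
]
for label, matches, total in comparisons:
    score = dice(matches, total)
    print(label, score, score >= THRESHOLD)
# Alert 3 vs Alert 1 0.95 True
# Alert 3 vs Alert 2 0.7 False
```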

Correlation time period

The correlation time period is the length of time for clustering similar alerts into the same incident, starting from the incident creation time. When the correlation period ends, Moogsoft correlates alerts into a new incident.

The correlation engine auto-extends an incident's correlation period if it adds alerts near the end of the specified period. Auto-extension works like this:

1. The correlation engine maintains an extension time that is 25% of the specified correlation period.

2. If a new matching alert is added to the incident in the last 25% of the specified correlation period, the new correlation period is the alert arrival time plus the extension time.

3. This process can extend the correlation period up to a maximum of 3 times the original correlation period.

Suppose you specify a correlation period of 15 minutes. In this case, the extension time is 3 minutes 45 seconds and the maximum possible correlation window is 45 minutes. The correlation period can extend as follows:

• The timer starts when the engine creates the incident.

• If the engine does not add an alert between 11 minutes 15 seconds and 15 minutes, the correlation period closes at 15 minutes.

• Suppose the engine adds an alert at 12 minutes. The correlation period extends to 12 minutes plus 3 minutes 45 seconds = 15 minutes 45 seconds.

• Suppose another alert gets added at 15 minutes. The correlation period extends to 15 minutes 45 seconds plus 3 minutes 45 seconds = 19 minutes 30 seconds.

The correlation period for this incident can auto-extend up to a maximum of 45 minutes.
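The auto-extension rules can be sketched in Python as follows. This is a minimal model under stated assumptions, not the engine's implementation: the function names are hypothetical, times are seconds since incident creation, and the extension window is taken to be the final extension-time interval before the current end of the period.

```python
def extension_time(period_s: float) -> float:
    """Extension time: 25% of the specified correlation period."""
    return 0.25 * period_s

def max_window(period_s: float) -> float:
    """Upper bound on the correlation period: 3 times the original."""
    return 3 * period_s

def on_alert(current_end_s: float, alert_s: float, period_s: float) -> float:
    """Return the end of the correlation period after a matching alert arrives."""
    ext = extension_time(period_s)
    if alert_s > current_end_s:
        return current_end_s  # the period has already closed
    if alert_s >= current_end_s - ext:
        # Assumption: new end = alert arrival time + extension time, capped at the maximum.
        return min(alert_s + ext, max_window(period_s))
    return current_end_s      # alert arrived too early to trigger an extension

def mmss(seconds: float) -> str:
    """Format seconds as minutes:seconds."""
    minutes, secs = divmod(int(seconds), 60)
    return f"{minutes}:{secs:02d}"

period = 15 * 60                              # 15-minute correlation period
print(mmss(extension_time(period)))           # 3:45
print(mmss(period - extension_time(period)))  # 11:15, start of the extension window
print(mmss(max_window(period)))               # 45:00
print(mmss(on_alert(period, 12 * 60, period)))  # 15:45 after an alert at 12 minutes
```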