Create a new correlation definition
APEX AIOps Incident Management includes a default correlation definition called Similar Sources which correlates similar alerts into incidents. You can see the incidents that form using Similar Sources in the Incidents view. If the incidents created meet your needs, then the Similar Sources correlation definition is sufficient. However, if you require incidents to form using different criteria (which is likely), you must create a custom correlation definition. You may also need to create a new correlation group, depending on your requirements.
Custom definitions can be used in combination with the default Similar Sources definition, or instead of the default. If you have multiple definitions, you can reorder them to change how incidents are created. For more information, see Correlation Engine overview.
NOTE: You can also use the Correlations API to create, retrieve, update, and delete correlation definitions.
Notes
Before you begin, make sure you have configured event ingestion.
If you want to correlate your alerts using additional data not present in the default payloads, set up alert enrichment to add this data.
Create a correlation definition
To create a custom correlation definition, complete the following steps. Use the links in the steps to jump to more information about the different settings.
Navigate to Correlate & Automate > Correlation Engine.
Inside the correlation group where you want to add the definition, click Add Correlation Definition.
Enter a name (required) for the correlation definition in the Name field.
Under Alert Scope:
Define the alerts that are correlated using this definition. You can select:
Consider all alerts for this correlation
OR
Consider only alerts that match a filter
If you select this option, then click inside the Alert Scope box and compose your filter:
Use the guidance provided to compose a filter using suggested terms, operators and values. You can also type your preferred filter, or paste an API filter in the box.
Click Scope Preview and view the matching alerts to verify the filter reflects the correct scope for your correlation.
Edit the filter if necessary, clicking Refresh after making your changes to update the data in the preview list.
Click Apply to exit the preview window and apply the filter.
Under Alert Fields to Match:
Click Add Field and select one or more alert fields that the correlation engine will evaluate to create incidents.
If any field selected under Alert Fields to Match is missing from an alert, or does not contain a value, that alert is not considered for correlation.
In the Similarity Threshold space for each selected field, enter the percentage that field values must match to be considered part of the same incident.
If you include multiple fields, then the values for all the fields in the alerts must match as required by the Similarity Threshold.
Note
If the selected field is a list (such as service), then the similarity threshold defaults to 100 and is not editable. Similarity is not taken into account for lists. During correlation, if any of the list items match for two alerts, then the alerts are considered a match at 100% for that field.
Under Incident Description Labeling:
In this field, build the description that displays for incidents created using this correlation definition. You can use numbers, letters, macros, and substitution syntax.
Under Incident Creation Threshold:
Define the minimum number of similar alerts to receive before creating an incident. You can select:
Immediately create on first alert
Selecting this option means incidents can potentially contain only one alert.
OR
Wait for N alerts before creating the incident
Selecting this option means incidents created by this correlation definition can contain no fewer than this number of alerts.
Under Time Window:
Define the length of time an incident continues to add new alerts.
Click Save.
Correlation definition field details
Refer to the following sections for assistance with completing the correlation definition fields.
Correlation Name
The Correlation Name identifies the correlation definition which resulted in the creation of an incident, so it is a best practice to assign an easily identifiable and meaningful name.
The name displays in the Details tab of the Incidents view.
In the example shown in this section, the name of the correlation definition is "Database servers."
Alert Scope
The Alert Scope defines whether a correlation definition applies to all alerts or only to alerts that match a filter.
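As a rough illustration of the two scope options, the sketch below models a scope as an optional predicate over alerts. The alert dictionaries, the field names (`class`, `service`), and the `in_scope` and `storage_only` helpers are hypothetical and do not reflect the product's actual filter syntax.

```python
from typing import Callable, Optional

Alert = dict
ScopeFilter = Optional[Callable[[Alert], bool]]


def in_scope(alerts: list, scope: ScopeFilter = None) -> list:
    """Return the alerts a definition would consider (hypothetical model)."""
    if scope is None:
        # "Consider all alerts for this correlation"
        return list(alerts)
    # "Consider only alerts that match a filter"
    return [alert for alert in alerts if scope(alert)]


def storage_only(alert: Alert) -> bool:
    """Hypothetical filter: keep alerts whose class is Storage."""
    return alert.get("class") == "Storage"


alerts = [
    {"id": 1, "class": "Storage", "service": ["ShoppingCart"]},
    {"id": 2, "class": "Network", "service": ["Online Store"]},
    {"id": 3, "class": "Storage", "service": ["Online Store"]},
]

all_ids = [a["id"] for a in in_scope(alerts)]
storage_ids = [a["id"] for a in in_scope(alerts, storage_only)]
print(all_ids, storage_ids)
```

Checking the filtered list before saving is the same idea as using Scope Preview to confirm that the filter selects the alerts you expect.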
Alert Fields to Match
The Alert Fields to Match section specifies the fields and tags in alerts that the correlation engine compares to determine if the alerts are part of the same or different incidents.
A definition can have multiple fields and tags. The Similarity Threshold column specifies how similar the field and tag values must be for an alert to be a match for another alert, and can be configured independently for different fields. Alerts with a similarity the same as or above the threshold are included in the same incident. When the similarity of two alerts falls below the Similarity Threshold, those alerts are not correlated into the same incident.
If you do not want to lose these uncorrelated singleton alerts, you can add a "catch-all" that creates incidents from them. For more information, see Create incidents from uncorrelated alerts using correlation group settings.
For detailed information on the methodology the correlation engine uses to determine field similarity, see Understand alert similarity.
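The matching rules described in this section can be sketched as follows. This is a minimal illustration, not the engine's implementation: `difflib.SequenceMatcher` is a stand-in for the product's similarity measure (which is described in Understand alert similarity), and the function names, alert dictionaries, and field names are hypothetical.

```python
from difflib import SequenceMatcher


def field_similarity(a, b) -> float:
    """Percent similarity between two field values (stand-in measure)."""
    if isinstance(a, list) and isinstance(b, list):
        # List fields: any shared item counts as a 100% match.
        return 100.0 if set(a) & set(b) else 0.0
    return 100.0 * SequenceMatcher(None, str(a), str(b)).ratio()


def alerts_match(x: dict, y: dict, thresholds: dict) -> bool:
    """True only if every configured field meets its similarity threshold."""
    for field, threshold in thresholds.items():
        vx, vy = x.get(field), y.get(field)
        # Alerts with a missing or empty field are not considered.
        if vx in (None, "", []) or vy in (None, "", []):
            return False
        if field_similarity(vx, vy) < threshold:
            return False
    return True


# Hypothetical definition: source at 80%, service (a list) fixed at 100%.
thresholds = {"source": 80, "service": 100}
a = {"source": "db-server-01", "service": ["ShoppingCart", "Billing"]}
b = {"source": "db-server-02", "service": ["ShoppingCart"]}
c = {"source": "web-frontend", "service": ["ShoppingCart"]}
d = {"source": "db-server-03"}  # no service field

print(alerts_match(a, b, thresholds))
print(alerts_match(a, c, thresholds))
print(alerts_match(a, d, thresholds))
```

In this sketch, alerts `a` and `b` match because their sources are nearly identical and they share a service; `c` fails the source threshold, and `d` is excluded because it has no service value.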
Incident Description Labeling
This field sets the description used for all incidents generated from this correlation definition. These descriptions appear in the Incidents view.
You can enter plain text for the description (example: Disk issues in AWS Virginia), or a macro, or a combination of text and macros. Macros are dynamic, so the description will update with the information included in the incident.
For example, suppose you are defining a correlation based on the Service alert field. You can then specify a label string such as:
Service Incident: ${cited(service, 3)} in classes ${unique(class, 3)} for ${cited(check, 2)} checks
Given this string, the resulting descriptions include the three most-cited services and the number of times each service is cited by a member alert:
Service Incident: ShoppingCart, Online Store in classes Storage, Compute, Network for Disk, CPU checks
Incident description macros
You can use the following macros to generate incident descriptions:
${count(alert-field)}: Return the count of alert-field citations, including duplicates.
${unique_count(alert-field)}: Return the count of unique alert-field citations, excluding duplicates.
${tolist(alert-field)}: Return a comma-separated string of all elements in a list, including duplicates.
${unique(alert-field, N)}: Return a comma-separated string of N unique elements in a list, excluding duplicates.
${top(alert-field)}: Return the top-cited item.
${cited(alert-field, N)}: Return a list of the top-cited N items. If two or more items have the same number of citations, the items are sorted alphabetically.
Refer to Macro and substitution syntax reference for more.
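To make the macro descriptions concrete, the sketch below reimplements them as plain Python functions over a list of member alerts. It is an approximation under stated assumptions, not the product's macro engine: the alert dictionaries are hypothetical, list-valued fields are assumed to be flattened, `unique` is assumed to keep first-seen order, and `cited` is assumed to rank by descending citation count before applying the alphabetical tie-break.

```python
from collections import Counter


def values(alerts, field):
    """Collect every value of `field` as strings, flattening list fields."""
    out = []
    for alert in alerts:
        v = alert.get(field)
        if v is None:
            continue
        out.extend(str(x) for x in (v if isinstance(v, list) else [v]))
    return out


def count(alerts, field):
    return len(values(alerts, field))


def unique_count(alerts, field):
    return len(set(values(alerts, field)))


def tolist(alerts, field):
    return ", ".join(values(alerts, field))


def unique(alerts, field, n):
    # dict.fromkeys removes duplicates while keeping first-seen order (assumed).
    return ", ".join(list(dict.fromkeys(values(alerts, field)))[:n])


def cited(alerts, field, n):
    counts = Counter(values(alerts, field))
    # Most-cited first (assumed); ties are sorted alphabetically.
    ranked = sorted(counts, key=lambda item: (-counts[item], item))
    return ", ".join(ranked[:n])


def top(alerts, field):
    return cited(alerts, field, 1)


alerts = [
    {"service": ["ShoppingCart"], "class": "Storage", "check": "Disk"},
    {"service": ["ShoppingCart", "Online Store"], "class": "Compute", "check": "CPU"},
    {"service": ["ShoppingCart"], "class": "Network", "check": "Disk"},
]

description = (
    f"Service Incident: {cited(alerts, 'service', 3)} "
    f"in classes {unique(alerts, 'class', 3)} "
    f"for {cited(alerts, 'check', 2)} checks"
)
print(description)
```

With these three sample alerts, the printed description has the same shape as the example incident description shown earlier in this section.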
Incident Creation Threshold
The minimum number of similar alerts required before creating an incident. This option is useful for reducing the total number of incidents by preventing the creation of incidents for one-off or intermittent alerts. The trade-off is that you might get "orphan" alerts that are not included in any incident. To find these alerts, go to the Alerts view, make sure that the Incidents column is included, and then sort on this column.
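The trade-off between the two threshold options can be illustrated with a small sketch. For simplicity it groups alerts by an exact field value rather than by similarity, and the `form_incidents` helper, the alert dictionaries, and the `source` values are all hypothetical.

```python
from collections import defaultdict


def form_incidents(alerts, key_field, min_alerts):
    """Group alerts by an exact field value and create an incident for a
    group only when it holds at least `min_alerts` alerts.
    Returns (incidents, orphan_alert_ids)."""
    groups = defaultdict(list)
    for alert in alerts:
        groups[alert[key_field]].append(alert["id"])
    incidents = {k: ids for k, ids in groups.items() if len(ids) >= min_alerts}
    orphans = [i for k, ids in groups.items() if k not in incidents for i in ids]
    return incidents, orphans


alerts = [
    {"id": 1, "source": "db-01"},
    {"id": 2, "source": "db-01"},
    {"id": 3, "source": "web-01"},
]

# "Immediately create on first alert": single-alert incidents are possible.
immediate, immediate_orphans = form_incidents(alerts, "source", 1)

# "Wait for N alerts" with N = 2: the lone web-01 alert becomes an orphan.
waited, waited_orphans = form_incidents(alerts, "source", 2)

print(immediate, immediate_orphans)
print(waited, waited_orphans)
```

Raising the threshold from 1 to 2 removes the single-alert incident, at the cost of leaving alert 3 outside any incident.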
Time Window
The Correlation Time Window defines how long an incident is a candidate for correlation using this definition. For more information on the Correlation Time Window, see Correlation time window.
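The effect of a time window can be sketched as follows. This is only an illustration under a loud assumption: the window is taken to start at an incident's first alert, which may not be how the engine measures it. Refer to Correlation time window for the actual behavior. The `assign_by_window` helper and the timestamps are hypothetical.

```python
from datetime import datetime, timedelta


def assign_by_window(alert_times, window):
    """Split timestamps of otherwise-matching alerts into incidents.

    Assumption: an incident accepts new alerts until `window` has elapsed
    since its first alert; a later alert starts a new incident.
    """
    incidents = []
    for t in sorted(alert_times):
        if incidents and t - incidents[-1][0] <= window:
            incidents[-1].append(t)
        else:
            incidents.append([t])
    return incidents


start = datetime(2024, 1, 1, 12, 0)
times = [start + timedelta(minutes=m) for m in (0, 5, 14, 20, 22)]

incidents = assign_by_window(times, timedelta(minutes=15))
sizes = [len(incident) for incident in incidents]
print(sizes)
```

Under this assumption, a 15-minute window places the first three alerts in one incident and the two later alerts in a second one.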
Fine-tune correlation definitions
After adding a new correlation definition, you can see if it is working as expected by viewing the resulting Incidents.
You can view the ID of the correlation definition responsible for creating any incident in the Correlation Definition column, in the incident summary information, or on the incident Details tab. Clicking the linked correlation definition name opens the correlation definition which caused alerts to cluster and form the selected incident.