
Best practices for defining similarity in correlation definitions

The key to writing effective correlation definitions for your organization is to decide how you want to organize your alerts into incidents. Ask the following questions:

  • Which data fields do you want to use to correlate your alerts into incidents? For example, you might want to use fields such as:

    • source: Correlate alerts into incidents based on the nodes where the performance-impacting events occurred.

    • service: Correlate alerts into incidents based on the apps or services that generated the alerts.

    • location: Correlate alerts into incidents based on the physical locations where the performance-impacting events occurred.

  • How many alert fields do you want to consider for correlation?

    You can include multiple fields in a definition. The more fields you include, the more incidents you get and the more specific each incident is. The fewer fields you include, the fewer incidents you get and the more general each incident is.

  • How similar do the corresponding values need to be for an alert and an incident to be correlated?
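
The effect of the number of fields can be sketched in a few lines of Python. This is only an illustration, not the product's correlation engine: the `correlate` helper, the sample alerts, and the rule that values must match exactly are all assumptions made for the example.

```python
from collections import defaultdict

def correlate(alerts, fields):
    """Group alerts whose values match exactly on every listed field.

    Hypothetical helper: exact matching is an assumption for this sketch.
    """
    incidents = defaultdict(list)
    for alert in alerts:
        # The correlation key is the tuple of values for the chosen fields.
        key = tuple(alert.get(field) for field in fields)
        incidents[key].append(alert)
    return dict(incidents)

# Made-up sample alerts.
alerts = [
    {"source": "web-1", "service": "checkout", "location": "nyc"},
    {"source": "web-1", "service": "search", "location": "nyc"},
    {"source": "web-2", "service": "checkout", "location": "nyc"},
    {"source": "db-1", "service": "checkout", "location": "hq"},
]

by_location = correlate(alerts, ["location"])
by_location_service = correlate(alerts, ["location", "service"])
by_all_three = correlate(alerts, ["location", "service", "source"])

# Each added field splits the same alerts into more, narrower incidents.
print(len(by_location), len(by_location_service), len(by_all_three))
```

With this sample data, one field yields two broad incidents, two fields yield three, and three fields put every alert in its own incident.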

The following fields can be useful for correlating alerts into incidents:

  • class

  • service

  • location

  • manager: This field can be useful if you want to correlate based on an event generator such as "collector," "docker," or "new-relic." For example, suppose you have multiple EC2 instances running Docker, and you want to correlate all Docker alerts for each EC2 instance into incidents. To do so, define a correlation that includes the "source" field (the EC2 instance) and the "manager" field (in this case, "docker").

  • source: This field works well in organizations where hostnames follow formal naming conventions. Some organizations include tags in hostnames to indicate attributes such as a host's function, owning team or organization, geographic location, or deployment status. In this case, you might want to correlate alerts based on hostname tags such as "*hq," "*nyc," "*dev," "*stage," or "*prod."

    This field does not work well in organizations where hostnames are ephemeral or auto-assigned, unless you want to correlate all alerts from the same source into one incident.

  • description: This field can be useful if your alert descriptions follow a consistent format. Correlation considers alert descriptions only, not incident descriptions.

  • Custom tags: You can also correlate based on custom alert tags. Tags provide a great deal of flexibility for correlation, because you can create a tag for any key-value data you want. The main requirement is that tag values follow a consistent convention and format, so that the correlation engine can compare two values directly.
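
As a rough illustration of why consistent formats matter for the "source" and "description" fields, the sketch below compares values in two ways: wildcard matching on hostname tags, and a text similarity score for descriptions. The helper names, sample values, and the use of Python's standard `fnmatch` and `difflib` modules are assumptions made for this example; the source does not specify how the correlation engine computes similarity.

```python
from difflib import SequenceMatcher
from fnmatch import fnmatchcase

# Hostname tag patterns taken from the examples in the list above.
TAG_PATTERNS = ["*hq", "*nyc", "*dev", "*stage", "*prod"]

def hostname_tag(hostname, patterns=TAG_PATTERNS):
    """Return the first tag pattern the hostname matches, or None.

    Hypothetical helper for this sketch.
    """
    for pattern in patterns:
        if fnmatchcase(hostname, pattern):
            return pattern
    return None

def description_similarity(a, b):
    """Return a similarity ratio between 0.0 and 1.0 for two descriptions.

    difflib is a stand-in here, not the product's similarity measure.
    """
    return SequenceMatcher(None, a, b).ratio()

# Hostnames with a naming convention map to a tag; auto-assigned ones do not.
nyc_tag = hostname_tag("web01-nyc")
prod_tag = hostname_tag("db-prod")
ephemeral_tag = hostname_tag("i-0abc123")

# Descriptions with a consistent format score high; unrelated ones score low.
same_format = description_similarity(
    "CPU usage above 90% on web-1", "CPU usage above 95% on web-2"
)
different_format = description_similarity(
    "CPU usage above 90% on web-1", "Disk full on /var"
)
print(nyc_tag, prod_tag, ephemeral_tag, same_format > different_format)
```

An auto-assigned hostname such as the made-up "i-0abc123" matches no tag pattern, which mirrors the caveat above about ephemeral hostnames.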

The following fields are generally not useful for correlating alerts into incidents:

  • severity: This field is useful for filtering which alerts to correlate, but not for grouping alerts into incidents.

  • status

  • assignee

  • ticket
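
The distinction between filtering on severity and correlating on it can be sketched as follows. This is a minimal illustration under stated assumptions: the `group_by` helper, the severity values, and the sample alerts are hypothetical and do not reflect the product's actual behavior.

```python
from collections import defaultdict

def group_by(alerts, fields):
    """Group alerts by exact match on the listed fields (hypothetical helper)."""
    groups = defaultdict(list)
    for alert in alerts:
        groups[tuple(alert[field] for field in fields)].append(alert)
    return dict(groups)

# Made-up sample alerts; the severity names are assumptions.
alerts = [
    {"source": "web-1", "severity": "critical"},
    {"source": "web-1", "severity": "warning"},
    {"source": "web-1", "severity": "ok"},
    {"source": "db-1", "severity": "critical"},
]

# Severity as a filter: decide which alerts are eligible for correlation.
ACTIONABLE = {"critical", "warning"}
filtered = [a for a in alerts if a["severity"] in ACTIONABLE]

# Correlating on source keeps both web-1 alerts in one incident.
by_source = group_by(filtered, ["source"])

# Adding severity as a correlation field splits web-1 into two incidents,
# even though both alerts describe the same host.
by_source_and_severity = group_by(filtered, ["source", "severity"])

print(len(by_source), len(by_source_and_severity))
```

In this sketch, filtering on severity removes noise before correlation, while correlating on severity fragments alerts that belong to the same problem.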