Customize event deduplication
For many integrations, you can customize deduplication behavior in the integration settings. You can also explicitly define the deduplication_key field in the raw events you send to APEX AIOps Incident Management. The following use cases provide some examples of when you might want to customize your deduplication:
You want to generate alerts for a service that runs in multiple containers. You’re primarily concerned about the overall performance of the service rather than the specifics of the container in which the service runs. In this case, you can specify a deduplication key such as my-service::user-response-time.
You want to deduplicate based on information that is not included in the default fields. For example, you might want to deduplicate all events that have a specific error code. In this case, you could define a deduplication key such as my-source::my-service::my-check::err3059.
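As a rough illustration of the first use case, the sketch below builds raw events from two containers that carry the same explicit deduplication_key. The event shape (a plain dictionary with source, service, and check entries), the container names, and the group_by_key helper are assumptions made for this example, not the product's actual payload schema or deduplication logic. Only the deduplication_key field name and the key value come from the text above.

```python
from collections import defaultdict

# Hypothetical raw events; the dictionary layout is assumed for illustration.
events = [
    {"source": "container-1", "service": "my-service",
     "check": "user-response-time",
     "deduplication_key": "my-service::user-response-time"},
    {"source": "container-2", "service": "my-service",
     "check": "user-response-time",
     "deduplication_key": "my-service::user-response-time"},
]


def group_by_key(raw_events):
    """Group event sources by explicit deduplication key (illustrative helper)."""
    groups = defaultdict(list)
    for event in raw_events:
        groups[event["deduplication_key"]].append(event["source"])
    return dict(groups)


groups = group_by_key(events)
print(groups)
```

Because the key omits the container name, events from both containers land in a single group.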
Do’s and don’ts for custom deduplication
A deduplication key is made up of a subset of event properties. Different types of events require different keys. An ideal deduplication key contains just enough information to identify the context of an event, and nothing that varies between events with the same context.
In most cases, a combination of the following values tends to work well:
Source, such as hostname
Event type or class
Static unique IDs
Error codes
Impacted entities
Do not include fields that might change between events with the same context. For example:
Timestamp -- Every event has a different timestamp. If the deduplication key includes the timestamp, deduplication is effectively disabled.
State changes such as up or down
Event count
Variable unique IDs
Severity
Descriptions with changing content such as metrics
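The effect of including a volatile field can be sketched as follows. The event dictionaries, their field names, the sample values, and the make_key helper are all hypothetical. The point is only to compare a key built from stable values with one that also includes the timestamp.

```python
# Hypothetical events: same context, different timestamps and severities.
events = [
    {"source": "host-1", "class": "disk", "error_code": "err3059",
     "timestamp": 1700000000, "severity": "warning"},
    {"source": "host-1", "class": "disk", "error_code": "err3059",
     "timestamp": 1700000060, "severity": "critical"},
    {"source": "host-1", "class": "disk", "error_code": "err3059",
     "timestamp": 1700000120, "severity": "critical"},
]


def make_key(event, fields):
    """Join the chosen field values with '::' (illustrative helper)."""
    return "::".join(str(event[name]) for name in fields)


stable_fields = ["source", "class", "error_code"]
volatile_fields = stable_fields + ["timestamp"]

stable_keys = {make_key(e, stable_fields) for e in events}
volatile_keys = {make_key(e, volatile_fields) for e in events}

print(len(stable_keys))    # one key: the three events deduplicate
print(len(volatile_keys))  # three keys: no two events share a key
```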
When you define a custom deduplication key, separate the fields with two colons ("::") to prevent ambiguous matches. For example, if you concatenate source "NodeA" and unique ID "1234" without a separator, the result "NodeA1234" is identical to the key for source "NodeA1" and unique ID "234". With the separator, the keys are "NodeA::1234" and "NodeA1::234", which remain distinct.
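The collision described here can be reproduced in a few lines. The two helper functions are hypothetical names introduced for this sketch. They only contrast plain concatenation with "::"-separated concatenation.

```python
def naive_key(*parts: str) -> str:
    """Concatenate parts with no separator (the ambiguous approach)."""
    return "".join(parts)


def dedup_key(*parts: str) -> str:
    """Join parts with '::' so field boundaries stay visible in the key."""
    return "::".join(parts)


# Two different contexts collide when no separator is used.
collides = naive_key("NodeA", "1234") == naive_key("NodeA1", "234")

# With '::' the same inputs produce distinct keys.
distinct = dedup_key("NodeA", "1234") != dedup_key("NodeA1", "234")

print(collides, distinct)
print(dedup_key("NodeA", "1234"))
print(dedup_key("NodeA1", "234"))
```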
Best practices
If two events involve different problems on the same source and service, they should belong to different alerts. For this reason, the check field should be fairly specific. Suppose two events describe database errors on the same source: one is a replication error, and another is a query error. If check = “database” for both events, they will get deduplicated into the same alert. The better choice is to use “database-replication” for the first check and “database-query” for the second.
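A small sketch of why the check value matters. It assumes, for illustration only, that the deduplication key is built from the source, service, and check fields. The event dictionaries, the host and service names, and the key_for helper are hypothetical.

```python
def key_for(event):
    """Build a key from source, service, and check (assumed composition)."""
    return "::".join([event["source"], event["service"], event["check"]])


# Two different problems on the same source and service.
replication_error = {"source": "db-host-1", "service": "orders-db"}
query_error = {"source": "db-host-1", "service": "orders-db"}

# Generic check: both events produce the same key and share one alert.
generic = {key_for({**replication_error, "check": "database"}),
           key_for({**query_error, "check": "database"})}

# Specific checks: each problem gets its own key and its own alert.
specific = {key_for({**replication_error, "check": "database-replication"}),
            key_for({**query_error, "check": "database-query"})}

print(len(generic), len(specific))
```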