Glossary

APEX AIOps Incident Management glossary

Alert

A set of one or more unique events that all relate to a specific performance measure. APEX AIOps Incident Management deduplicates events into alerts. Then it correlates alerts into incidents.

For more information about alerts, see Events, alerts, and incidents.

Anomaly

The first observed data point after a time series metric switches from normal to anomalous performance. The ingestion engine treats each anomaly as a performance-impacting event.
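As a rough illustration of this definition, the sketch below picks out the first data point after each switch from normal to anomalous. The `first_anomalous_points` helper and the fixed threshold are hypothetical and stand in for whatever detector is actually configured; they are not part of the product.

```python
from typing import Callable, Iterable, List, Tuple

Point = Tuple[int, float]  # (timestamp, value)


def first_anomalous_points(
    series: Iterable[Point],
    is_anomalous: Callable[[float], bool],
) -> List[Point]:
    """Return the first data point after each normal-to-anomalous switch.

    `is_anomalous` is a hypothetical stand-in for a real detector.
    """
    anomalies: List[Point] = []
    previously_anomalous = False
    for timestamp, value in series:
        currently_anomalous = is_anomalous(value)
        # Only the transition point counts; later anomalous points
        # in the same run are not reported again.
        if currently_anomalous and not previously_anomalous:
            anomalies.append((timestamp, value))
        previously_anomalous = currently_anomalous
    return anomalies


# Example: CPU utilization with an assumed static threshold of 90%.
cpu_series = [(1, 40.0), (2, 95.0), (3, 97.0), (4, 50.0), (5, 99.0)]
cpu_anomalies = first_anomalous_points(cpu_series, lambda v: v > 90.0)
```

In this example the points at timestamps 2 and 5 are reported, while the point at timestamp 3 is not, because the metric was already anomalous.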

Collector

An installable, Rust-based agent running on a server that does the following:

  • Observes time series metrics, either actively at the source or passively by ingesting a stream.

  • Forwards the raw metrics to APEX AIOps Incident Management. The ingestion engine treats anomalies as performance-impacting events and aggregates them into alerts, which you can view on the Alerts page.

For more information about the APEX AIOps Incident Management Collector, see APEX AIOps Incident Management Collector.

Correlation

The process of finding correlations between alerts, based on similarities between data fields of interest, and clustering correlated alerts into actionable incidents.

For more information about correlation, see Correlate alerts into incidents.
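The idea can be sketched as a simple greedy clustering. This is only an illustration under stated assumptions: the `similarity` and `correlate` functions, the `time` field, the 0.5 threshold, and the 900-second window are all hypothetical, and the actual correlation behavior is whatever you configure in the product.

```python
from typing import Dict, List


def similarity(a: Dict, b: Dict, fields: List[str]) -> float:
    """Fraction of the fields of interest on which two alerts agree."""
    matches = sum(
        1 for f in fields if a.get(f) is not None and a.get(f) == b.get(f)
    )
    return matches / len(fields)


def correlate(
    alerts: List[Dict],
    fields: List[str],
    threshold: float = 0.5,   # assumed minimum similarity
    window_s: int = 900,      # assumed maximum gap in seconds
) -> List[List[Dict]]:
    """Group alerts into incidents by time proximity and field similarity."""
    incidents: List[List[Dict]] = []
    for alert in sorted(alerts, key=lambda a: a["time"]):
        for incident in incidents:
            latest = incident[-1]
            close_in_time = alert["time"] - latest["time"] <= window_s
            if close_in_time and similarity(alert, latest, fields) >= threshold:
                incident.append(alert)
                break
        else:
            # No existing incident is similar enough, so start a new one.
            incidents.append([alert])
    return incidents


sample_alerts = [
    {"time": 0, "service": "db", "host": "h1"},
    {"time": 60, "service": "db", "host": "h2"},
    {"time": 120, "service": "web", "host": "h9"},
]
sample_incidents = correlate(sample_alerts, fields=["service", "host"])
```

Here the two `db` alerts share one of the two fields of interest and arrive close together, so they land in the same incident; the `web` alert shares nothing and starts its own.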

Deduplication

A stage in the ingestion process where the ingestion engine eliminates any event that is identical to a previously seen event. Deduplication eliminates noise and ensures that each ingested event is unique.

For more information about deduplication, see Deduplicate events to reduce noise.

Deduplication key

An auto-generated signature that APEX AIOps Incident Management creates for each new event and uses to determine whether that event is a duplicate. By default, the deduplication key is based on the source, service, and check fields.

For more information about deduplication, see Deduplicate events to reduce noise.
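A minimal sketch of how a key built from the default fields could drive deduplication. The SHA-256 hash, the `dedup_key` and `deduplicate` helpers, and the alert dictionary layout are assumptions made for illustration; the signature algorithm that the product actually uses is not described here.

```python
import hashlib
from typing import Dict, Iterable, List, Tuple

DEFAULT_FIELDS: Tuple[str, ...] = ("source", "service", "check")


def dedup_key(event: Dict, fields: Tuple[str, ...] = DEFAULT_FIELDS) -> str:
    """Build a signature from the chosen fields (hypothetical scheme)."""
    # A unit-separator character keeps ("ab", "c") distinct from ("a", "bc").
    joined = "\x1f".join(str(event.get(f, "")) for f in fields)
    return hashlib.sha256(joined.encode("utf-8")).hexdigest()


def deduplicate(events: Iterable[Dict]) -> List[Dict]:
    """Fold events with the same key into a single alert record."""
    alerts: Dict[str, Dict] = {}
    for event in events:
        key = dedup_key(event)
        alert = alerts.setdefault(
            key, {"key": key, "event_count": 0, "last_event": None}
        )
        alert["event_count"] += 1
        alert["last_event"] = event
    return list(alerts.values())


sample_events = [
    {"source": "host1", "service": "db", "check": "cpu", "description": "CPU at 95%"},
    {"source": "host1", "service": "db", "check": "cpu", "description": "CPU at 97%"},
    {"source": "host2", "service": "db", "check": "cpu", "description": "CPU at 91%"},
]
sample_alerts_by_key = deduplicate(sample_events)
```

The first two events differ only in their description, which is not part of the key, so they collapse into one alert; the third event comes from a different source and produces a second alert.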

Detector

The algorithm that a Managed Object uses to detect anomalies in a metric. Every metric observed by a Managed Object has an associated detector.

For more information about detectors, see Anomaly detectors.

Enrichment

The process of augmenting or normalizing newly ingested events with information from your environment. Enrichment is useful when you want to customize how APEX AIOps Incident Management correlates alerts and clusters them into incidents. You can also enrich your alerts to make the resulting incidents more informative and readable.

For more information about enrichment, see Enrich events with additional data.

Event

A data object that describes an occurrence of operational interest. An event might be based on an event notification from an external tool or a metric anomaly from a collector or Amazon CloudWatch. Events form the initial raw data for APEX AIOps Incident Management, which then deduplicates these events to form alerts.

For more information about events, see Events.

Incident

A cluster of alerts that all relate to the same actionable issue. APEX AIOps Incident Management clusters alerts based on the similarity of their time stamps and data fields. On the Settings > Correlation Engine page, you can define the correlation behavior that makes sense for your organization.

For more information about incidents, see Incidents, alerts, and metrics.

Managed object

A set of collector policies for observing metrics from a specific data source, such as Linux OS, AWS, Docker, or Logstash. Each Managed Object defines the set of metrics to observe and the configuration settings for each metric.

Metric

A set of data points, each with its own timestamp, that measures a specific aspect of performance such as response time or utilization. Collectors can monitor performance on remote servers, detect performance anomalies locally at the source, and send anomalies and raw metrics directly to APEX AIOps Incident Management.

For more information, see Collector concepts.

Severity

Each anomaly, alert, and incident has an associated severity that indicates the degree of difference between the observed performance and normal performance. The severity generally indicates how urgently the performance issue requires corrective action. APEX AIOps Incident Management uses the following severity levels: Critical, Major, Minor, Warning, Unknown, and Clear.

For more information about severity, see Severity.

Superseded

An incident that has been merged into, and replaced by, another incident.

Time series

An ordered sequence of data values recorded over a period of time. An example of a time series in APEX AIOps Incident Management is a metric.