
Concept explainer: What is APEX AIOps Incident Management?

This video explains some of the benefits and key concepts of APEX AIOps Incident Management, such as noise reduction, correlation, and intelligent metrics analysis.

*Please note: Moogsoft is now part of Dell's IT Operations solution, APEX AIOps, and has been renamed APEX AIOps Incident Management. The UI in this video may differ slightly, but the content covered is still relevant.

What Is APEX AIOps Incident Management?

APEX AIOps Incident Management is a solution that combines the best of observability and AIOps correlation, and goes above and beyond a traditional monitoring tool.

Noise reduction and correlation functionality let you focus on the most important issues.

The insights you gain with APEX AIOps Incident Management are rich with context, so you get the whole picture of how your systems behave.

It is a single pane of glass for cross-source information, so you don’t have to jump between multiple interfaces to synthesize it. With a tool like this, not only traditional IT operations staff but also SRE and DevOps engineers can benefit from its findings. Before diving into the product, let’s take a moment to visit these characteristics one by one to better understand them.

Noise reduction is a key functionality when you ingest a large amount of data from multiple sources. APEX AIOps Incident Management identifies duplicate events, aggregates them into a single alert, then correlates related alerts into incidents. Having all relevant information in one place makes an issue easier to analyze and remediate.

Image1.png

Here’s one type of data flow.

The source monitoring system forwards the events it detects.

This source event data is then mapped to the Incident Management event data fields.

Incident Management then uses some of the event fields as deduplication keys, and if the values match, those events are bundled into a single alert.

At this point, we have already avoided having different operators work on different occurrences of the same problem at the same time.
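The mapping and deduplication steps above can be illustrated with a minimal sketch. It is not the product's implementation: the field names, the mapping table, and the choice of deduplication keys are all hypothetical, picked only to show how events whose key fields match collapse into one alert.

```python
# Hypothetical mapping from a source system's field names to common event fields.
FIELD_MAP = {
    "host": "source",
    "check_name": "check",
    "state": "severity",
    "message": "description",
}

# Hypothetical deduplication keys: events that agree on these fields
# are treated as occurrences of the same problem.
DEDUP_KEYS = ("source", "check")


def map_event(raw):
    """Rename the source system's fields to the common event fields."""
    return {target: raw[src] for src, target in FIELD_MAP.items() if src in raw}


def deduplicate(events):
    """Bundle events with matching deduplication keys into one alert each."""
    alerts = {}
    for event in events:
        key = tuple(event.get(k) for k in DEDUP_KEYS)
        if key not in alerts:
            alerts[key] = {**event, "event_count": 1}
        else:
            # Keep one alert, refresh it with the latest event, count the repeat.
            alerts[key].update(event)
            alerts[key]["event_count"] += 1
    return list(alerts.values())


raw_events = [
    {"host": "web01", "check_name": "cpu", "state": "warning", "message": "CPU at 85%"},
    {"host": "web01", "check_name": "cpu", "state": "critical", "message": "CPU at 97%"},
    {"host": "db01", "check_name": "disk", "state": "warning", "message": "Disk at 90%"},
]

alerts = deduplicate([map_event(e) for e in raw_events])
for alert in alerts:
    print(alert)
```

In this sketch the three raw events become two alerts: the two CPU events from web01 share the same key values, so they end up in one alert that carries the latest severity and an event count of 2.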

Image2.png

Then, Incident Management's correlation engine further groups those alerts into incidents according to how they are related. By algorithmically identifying the relationships between alerts, Incident Management does the correlation work for you. So by the time you access an incident, all the related data about the issue has already been identified and grouped together.
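To make the idea of grouping alerts by relatedness concrete, here is a deliberately simple sketch. The video does not describe the actual correlation algorithm, so this stand-in assumes one made-up rule: alerts belong to the same incident if they share a `service` value and arrive within a time window of each other. The field names and the 300-second window are assumptions for illustration only.

```python
def correlate(alerts, window_seconds=300):
    """Group alerts into incidents using a toy rule: same service, close in time."""
    incidents = []
    for alert in sorted(alerts, key=lambda a: a["time"]):
        for incident in incidents:
            last = incident["alerts"][-1]
            same_service = incident["service"] == alert["service"]
            close_in_time = alert["time"] - last["time"] <= window_seconds
            if same_service and close_in_time:
                incident["alerts"].append(alert)
                break
        else:
            # No existing incident matched, so this alert opens a new one.
            incidents.append({"service": alert["service"], "alerts": [alert]})
    return incidents


sample_alerts = [
    {"time": 0, "service": "checkout", "check": "cpu"},
    {"time": 120, "service": "checkout", "check": "latency"},
    {"time": 200, "service": "search", "check": "disk"},
    {"time": 1000, "service": "checkout", "check": "memory"},
]

incidents = correlate(sample_alerts)
for incident in incidents:
    print(incident["service"], [a["check"] for a in incident["alerts"]])
```

Here the CPU and latency alerts for the checkout service land in one incident, while the search alert and the much later checkout alert each open their own. A real correlation engine would weigh far more signals than this single rule does.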

Image3.png

Here’s an alternate flow of data. Rather than integrating with your monitoring systems, you can install a collector on your data source and have it perform the data collection. The rest of the noise reduction process works much the same way.

Screen_Shot_2022-06-02_at_3_23_59_PM.png

Next, what do we mean by rich context?

APEX AIOps Incident Management can ingest both raw metrics and the alerts from your monitoring systems.

Then the data is correlated to provide a comprehensive global view of what’s happening in your infrastructure and applications. Let’s use an example to illustrate the benefit.

Let’s say you have Nagios watching the CPU utilization of a server, and it has been creeping up steadily since last week. Meanwhile, your APM system has detected that application A is getting slow.

Typically, when you troubleshoot a case like this, you’d check multiple systems to gather diagnostic information. You see in one dashboard that an application is slowing down. You may also learn from another source that the CPU usage is trending up. Then you may learn from yet another source about a particular process that has become inefficient since the last release. After synthesizing the information, you know why application A has gotten slow.

But wouldn’t it be nice if you didn’t have to compare different types of data from multiple sources and synthesize the information yourself? Wouldn’t it be nice to see all relevant information in one place rather than jumping between multiple systems? That’s what Incident Management does.

Image5.png

Intelligent metrics analysis is another strength of Incident Management.

How does Incident Management know what to report? It can learn the norm on its own, and it responds when the metrics deviate from that norm.
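One common way to "learn the norm" from a metric is to compare each new value against a baseline built from recent values. The sketch below uses a rolling mean and standard deviation; this is a generic technique chosen for illustration, not a description of how the product analyzes metrics. The window size, the threshold, and the sample CPU readings are all assumptions.

```python
from statistics import mean, stdev


def find_anomalies(values, window=5, threshold=3.0):
    """Return indexes of values that deviate strongly from the recent baseline."""
    anomalies = []
    for i in range(window, len(values)):
        baseline = values[i - window:i]
        mu = mean(baseline)
        sigma = stdev(baseline)
        if sigma == 0:
            # A perfectly flat baseline: treat any change at all as a deviation.
            if values[i] != mu:
                anomalies.append(i)
        elif abs(values[i] - mu) > threshold * sigma:
            anomalies.append(i)
    return anomalies


# Hypothetical CPU utilization readings (percent) with one spike.
cpu = [40, 42, 41, 39, 40, 41, 40, 85, 42, 41]
anomalous = find_anomalies(cpu)
print(anomalous)  # [7], the index of the 85% reading
```

Because the baseline is recomputed from the most recent readings, no fixed threshold such as "alert above 80%" has to be configured by hand. The trade-off in this simple version is that a spike temporarily inflates the baseline for the readings that follow it.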

Image8.png

Take advantage of a trial instance and see the power of APEX AIOps Incident Management now!

Thank you for watching!