Use case walkthrough: A tour of Incident Management for DevOps users

This video steps through an example of how a DevOps engineer might use Incident Management to resolve an incident.

*Please note: Moogsoft is now part of Dell's IT Operations solution, APEX AIOps, and has been renamed APEX AIOps Incident Management. The UI in this video may differ slightly, but the content covered is still relevant.

Let’s sample a day in the life of a DevOps engineer in Moogsoft. This incident just came in. We are going to assign it to ourselves and investigate.


We’re in the Situation Room where our team can collaborate on this incident. And this is the timeline window. It logs how the incident unfolded.


Now, judging from the description, the issue seems to be with the message writing service. Interesting. Let’s take a look at the alerts that rolled up into this incident.


This is the first alert. The description says we just deployed an updated message writer service.


Then minutes later, the process duration metric went out of bounds to the critical state. So there seems to be a connection here. Let’s see if this connection makes sense by looking at the metrics.


By default, the metrics screen is set to show the incident range, but we want to see it in the context of what’s normal.


So we are going to zoom out a bit… and set the time range to the last hour.


This is where the process duration metric alert was triggered. Normally, it takes barely 20 milliseconds to write anything to the database, and now it’s taking several seconds. That’s not good.
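As a rough illustration of what "out of bounds" means here, the following Python sketch flags a value that falls far outside a baseline of recent samples. The sample values, the function name `is_out_of_bounds`, and the three-standard-deviation band are all assumptions made for illustration; the video does not describe how Incident Management actually computes its bounds.

```python
from statistics import mean, stdev


def is_out_of_bounds(baseline_ms, value_ms, num_sigmas=3.0):
    """Return True if value_ms lies more than num_sigmas standard
    deviations away from the mean of the baseline samples.

    This is a generic threshold check, not the product's algorithm.
    """
    center = mean(baseline_ms)
    spread = stdev(baseline_ms)
    return abs(value_ms - center) > num_sigmas * spread


# Hypothetical write durations (milliseconds) from a normal period.
baseline = [18.0, 21.0, 19.5, 20.5, 20.0, 19.0, 22.0]

print(is_out_of_bounds(baseline, 20.8))    # a typical write
print(is_out_of_bounds(baseline, 3200.0))  # a write taking several seconds
```

With a baseline clustered around 20 milliseconds, a write that takes several seconds lands far outside the band, which is the kind of jump the metrics screen shows after the deployment.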


Now we have a choice: we could roll back the deployment and fix the issue in the code that’s adding seconds to this process, or accept this as the new norm and have Incident Management learn it. For now, we are going to roll it back.

So we rolled back the service update. And now, a few minutes later, it looks like the metrics are back to normal. We’ll examine the message writing service and deploy an updated version when ready.


So let’s go back to the incident view...and close it. There you go! Now you have seen a sample day in the life of a DevOps engineer in Incident Management. Thanks for watching!
