Moogsoft Docs

Heartbeat Monitor

You can configure the Alert Rules Engine in Moogsoft AIOps to detect missing heartbeat events from monitoring tools such as CA Spectrum and Microsoft SCOM. Both of these tools send regular heartbeats to indicate normal operation.

After you configure the Alert Rules Engine (ARE), AIOps creates a Situation when an event source does not send a heartbeat within a given time period. The ARE holds each heartbeat alert for a period of time; subsequent alerts from the same heartbeat source reset the timer. If the timer expires, a heartbeat has been missed and the alert is forwarded to a Sigaliser.
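The timer-reset behavior can be illustrated outside AIOps with a small function. This is only a sketch under stated assumptions: findMissedHeartbeats is a hypothetical helper, not part of the Alert Rules Engine, and times are plain numbers in seconds.

```javascript
// Illustrative sketch only: findMissedHeartbeats is a hypothetical helper,
// not a Moogsoft API. All times are in seconds.
function findMissedHeartbeats(arrivals, holdSeconds, endTime) {
  var expiries = [];
  for (var i = 0; i < arrivals.length; i++) {
    // Each heartbeat (re)starts the timer, which would expire at this moment.
    var deadline = arrivals[i] + holdSeconds;
    // The next heartbeat, if there is one, resets the timer before the deadline.
    var next = i + 1 < arrivals.length ? arrivals[i + 1] : endTime;
    if (next > deadline) {
      expiries.push(deadline);
    }
  }
  return expiries;
}

// Heartbeats every 10 s, then silence between 20 s and 70 s, with a 30 s timer.
var expiries = findMissedHeartbeats([0, 10, 20, 70, 80], 30, 90);
console.log(expiries); // [ 50 ]
```

Only the gap after the heartbeat at 20 seconds outlasts the 30-second timer, so a single expiry is reported at 50 seconds.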

Before you Begin

Before you set up the heartbeat monitor in Alert Rules Engine, ensure you have met the following requirements:

  • You have an understanding of Alert Rules Engine, Action States and transitions. See Alert Rules Engine.
  • You can identify heartbeat alerts in the integration by description, class or another configurable field. These must be specific, regular events that arrive at consistent intervals to indicate normal operation. If they are not available, the Heartbeat Monitor will not work.
  • You have edited the alerts so they contain the same attribute (via the integration source or through enrichment). In the example below, 'class' is set to 'heartbeat'.

Create a Heartbeat Monitor

To create a heartbeat monitor in Alert Rules Engine, follow these steps:

  1. Edit $MOOGSOFT_HOME/bots/moobots/AlertRulesEngine.js.

  2. Add the heartBeatSeverity exit action to AlertRulesEngine.js. This function changes the alert severity to critical and ensures alerts that are closed (see Status ID Reference) are not forwarded to the Cookbook:

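    // Checks the state of the alert. If the alert's state is 9, the status ID for 'closed', it is not forwarded.
    function heartBeatSeverity(alert, associated) {
      var currentAlert = moogdb.getAlert(alert.value("alert_id"));
      if (currentAlert && currentAlert.value("state") !== 9) {
        // Adds customInfo list displaying times of all missed heartbeats.
        var customInfo = currentAlert.getCustomInfo();
        if (!customInfo) {
          customInfo = {};
        }
        if (!customInfo.missedHeartbeats) {
          customInfo.missedHeartbeats = [];
        }
        // Changes severity of heartbeat alerts to 5, the severity level for 'critical', adds the description 'MISSED' and determines where the alerts are sent.
        var now = new Date();
        var alertDescr = currentAlert.value("description");
        alertDescr = alertDescr.replace(/(OK|SLOW)/, "MISSED");
        // Assumption: the lines below are reconstructed from the comments above; check them against your installation.
        customInfo.missedHeartbeats.push(now.toString());
        currentAlert.setCustomInfo(customInfo);
        currentAlert.set("description", alertDescr);
        currentAlert.set("severity", 5);
        moogdb.updateAlert(currentAlert);
        // Assumption: the forwarding target is the Cookbook named in step 8.
        currentAlert.forward("HeartbeatCookBook");
      }
    }

  3. Navigate to Settings > Action States in the AIOps UI.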

  4. Create a new Action State called "Heartbeat" as follows:

    Setting Name         Input              Value
    Name                 String             Heartbeat
    Remember alerts for  Integer (seconds)  30*
    Cascade on expiry    Boolean            True
    Exit Action          String             heartBeatSeverity


    * The 'Remember alerts for' setting is the timer. Set this to two or three times your heartbeat interval. For example, 30 seconds suits a heartbeat that is sent every 10 to 15 seconds.

  5. Go to Transitions and set up a transition to move your heartbeat alerts to the 'Heartbeat' State. Configure the settings as follows:

    Setting Name    Value
    Name            Heartbeat
    Priority        10
    Active          True
    Trigger Filter  (type = "heartbeat") AND ((((agent = "SPECTRUM") OR (manager = "SCOM")) OR (agent = "MONITOR1")) OR (agent = "MONITOR2"))
    Start State     Ground
    End State       Heartbeat

    Edit the 'Trigger Filter' to meet your requirements. In this example, the transition is triggered by alerts that have the type 'heartbeat' and come from 'SPECTRUM', 'SCOM', 'MONITOR1' or 'MONITOR2'.

  6. Run this command to open moog_farmd.conf in vi:

    vi /usr/share/moogsoft/config/moog_farmd.conf

  7. Ensure Alert Rules Engine is enabled. To do this, set run_on_startup to true.

  8. Add a heartbeat Cookbook for heartbeat alerts. This Cookbook only works with these alerts:

    {
        name                : "HeartbeatCookBook",
        classname           : "CCookbook",
        run_on_startup      : true,
        moobot              : "Cookbook.js",
        process_output_of   : [],
        # Algorithm
        membership_limit    : 5,
        scale_by_severity   : false,
        entropy_threshold   : 0.0,
        single_recipe_matching : false,
        recipes : [
            {
                # Any heartbeat class for the same agent.
                chef                : "CValueRecipe",
                name                : "ScomHeartbeatErrors",
                description         : "SCOM Heartbeat: Missing heartbeat",
                recipe_alert_threshold : 0,
                exclusion           : "state = 9",
                trigger             : "class = 'heartbeat' AND agent = 'SCOM'",
                rate                : 0,   # Given in events per minute
                min_sample_size     : 5,
                max_sample_size     : 10,
                matcher             : { components: [ { name: "agent", similarity: 1.0 } ] }
            },
            {
                chef                : "CValueRecipe",
                name                : "ScomHeartbeatChange",
                description         : "SCOM Heartbeat: Cluster host change",
                recipe_alert_threshold : 0,
                exclusion           : "state = 9",
                trigger             : "class = 'heartbeatRoleChange' AND agent =
                rate                : 0,   # Given in events per minute
                min_sample_size     : 5,
                max_sample_size     : 10,
                matcher             : { components: [ { name: "agent", similarity: 1.0 } ] }
            }
        ],
        cook_for            : 20000
    }

  9. Save the changes and restart moogfarmd:

    service moogfarmd restart

After the heartbeat monitor configuration is complete, heartbeat alerts should start to arrive in AIOps.
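The nested parentheses in the example Trigger Filter from step 5 can be hard to read. As a rough aid, the same logic can be restated as a plain JavaScript predicate. matchesHeartbeatTrigger is a hypothetical helper for illustration only; AIOps evaluates the real filter itself, and the alert objects here are simple stand-ins.

```javascript
// Illustrative sketch only: matchesHeartbeatTrigger is a hypothetical helper that
// mirrors the example Trigger Filter. It is not a Moogsoft API.
function matchesHeartbeatTrigger(alert) {
  // The nested ORs in the filter flatten to "any one of these sources".
  var fromMonitoredSource =
    alert.agent === "SPECTRUM" ||
    alert.manager === "SCOM" ||
    alert.agent === "MONITOR1" ||
    alert.agent === "MONITOR2";
  return alert.type === "heartbeat" && fromMonitoredSource;
}

console.log(matchesHeartbeatTrigger({ type: "heartbeat", agent: "SPECTRUM" })); // true
console.log(matchesHeartbeatTrigger({ type: "heartbeat", manager: "SCOM" }));   // true
console.log(matchesHeartbeatTrigger({ type: "heartbeat", agent: "OTHER" }));    // false
console.log(matchesHeartbeatTrigger({ type: "cpu", agent: "SPECTRUM" }));       // false
```

Note that the SCOM clause tests the manager field while the other clauses test agent, so an alert with agent set to "SCOM" but no matching manager would not trigger the transition.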

Heartbeat Monitor Process

The process flow for a heartbeat alert is as follows:

  • Heartbeat alert arrives at the Alert Rules Engine.
  • The transition moves the alert from the 'Ground' to the 'Heartbeat' action state, which starts the timer.
  • The alert sits in the 'Heartbeat' state waiting for the timer to expire.
  • Any subsequent heartbeat alert resets the timer.
  • If the timer expires, the exit action changes the alert severity to '5' (critical) and cascades the alert to the 'Ground' state.
  • Any subsequent heartbeat updates the severity to '0' (clear) and restarts the timer.
  • You could also add an entry action to close any missed heartbeat situations the event is part of.

This example also updates the alerts with the times of the missing heartbeats for an easy audit trail.
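The process flow above can be walked through with a small stand-alone simulation. This is a sketch under stated assumptions, not the Alert Rules Engine implementation: HeartbeatTracker, onHeartbeat and tick are hypothetical names, times are in seconds, and no Moogsoft APIs are used.

```javascript
// Illustrative sketch only: HeartbeatTracker is a hypothetical stand-in for the
// 'Heartbeat' action state and its exit action.
function HeartbeatTracker(holdSeconds) {
  this.holdSeconds = holdSeconds;
  this.state = "Ground";
  this.severity = 0;           // 0 = clear, 5 = critical
  this.deadline = null;        // time at which the timer would expire
  this.missedHeartbeats = [];  // audit trail of expiry times
}

// Timer expiry: mark critical, record the time and cascade back to 'Ground'.
HeartbeatTracker.prototype.tick = function (now) {
  if (this.state === "Heartbeat" && now > this.deadline) {
    this.severity = 5;
    this.missedHeartbeats.push(this.deadline);
    this.state = "Ground";
  }
};

// A heartbeat moves the alert to 'Heartbeat', clears it and (re)starts the timer.
HeartbeatTracker.prototype.onHeartbeat = function (now) {
  this.tick(now); // handle an expiry that happened before this heartbeat
  this.state = "Heartbeat";
  this.severity = 0;
  this.deadline = now + this.holdSeconds;
};

var tracker = new HeartbeatTracker(30);
tracker.onHeartbeat(0);
tracker.onHeartbeat(10); // resets the timer, new deadline is t=40
tracker.tick(45);        // nothing arrived since t=10, so the timer expired at t=40
console.log(tracker.state, tracker.severity, tracker.missedHeartbeats);
tracker.onHeartbeat(60); // the next heartbeat clears the alert and restarts the timer
console.log(tracker.state, tracker.severity);
```

After the expiry the tracker is back in 'Ground' with severity 5 and one recorded miss; the heartbeat at 60 seconds returns it to 'Heartbeat' with severity 0, while the recorded miss stays in the audit trail.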