Heartbeat Monitor
You can configure the Alert Rules Engine Moolet in Moogsoft Onprem to detect missing heartbeat events from monitoring tools such as CA Spectrum and Microsoft SCOM. Both of these tools send regular heartbeats to indicate normal operation.
After you configure the Alert Rules Engine, Moogsoft Onprem creates a Situation when an event source does not send a heartbeat within a given time period. The Alert Rules Engine holds each heartbeat alert for a period of time; subsequent alerts from the same heartbeat source reset the timer. If the timer expires, a heartbeat has been missed and the alert is forwarded to a Sigaliser (clustering algorithm).
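The hold-and-reset behaviour described above can be pictured with a small standalone simulation. This is only a sketch: it is not Moogsoft code and calls no Moogsoft API, and the class name HeartbeatWatchdog and its methods are hypothetical names chosen for illustration. Time is passed in explicitly (in seconds) so the logic is easy to follow.

```javascript
// Illustrative simulation only: not Moogsoft code, no Moogsoft APIs.
// HeartbeatWatchdog and its method names are hypothetical.
class HeartbeatWatchdog {
  constructor(holdSeconds) {
    this.holdSeconds = holdSeconds; // how long a heartbeat is held
    this.lastSeen = new Map();      // source name -> time of latest heartbeat (seconds)
  }

  // Record a heartbeat; this starts or resets the hold timer for that source.
  heartbeat(source, nowSeconds) {
    this.lastSeen.set(source, nowSeconds);
  }

  // Return the sources whose hold timer has run out at the given time.
  missed(nowSeconds) {
    const result = [];
    for (const [source, seen] of this.lastSeen) {
      if (nowSeconds - seen >= this.holdSeconds) {
        result.push(source);
      }
    }
    return result;
  }
}

const watchdog = new HeartbeatWatchdog(30);
watchdog.heartbeat("SPECTRUM", 0);
watchdog.heartbeat("SCOM", 0);
watchdog.heartbeat("SPECTRUM", 20); // resets the SPECTRUM timer
// At 35 s, SCOM has been silent for 35 s and SPECTRUM for only 15 s.
console.log(watchdog.missed(35));
```

In this model only the source that stopped sending is reported, which mirrors the idea that a heartbeat alert is forwarded only when its own timer expires.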
Before You Begin
Before you set up the heartbeat monitor in Alert Rules Engine, ensure you have met the following requirements:
You have an understanding of Alert Rules Engine, Action States and transitions. See the Alert Rules Engine Moolet, Action States and Transitions for further details.
You can identify heartbeat alerts in the integration by description, class or another configurable field. These must be specific, regular events that arrive at consistent intervals to indicate normal operation. If these are not available, the Heartbeat Monitor will not work.
You have edited the alerts so they contain the same attribute, via the integration source or through enrichment. In the example below, 'type' is 'heartbeat' in the Alert Rules Engine trigger filter and 'class' is 'heartbeat' in the Cookbook Recipe trigger filter.
Create a Heartbeat Monitor
To create a heartbeat monitor in Alert Rules Engine, follow these steps:
Edit $MOOGSOFT_HOME/bots/moobots/AlertRulesEngine.js and add the heartBeatSeverity exit action. This function changes the alert severity to critical and ensures alerts that are closed are not forwarded to the Cookbook. See Status ID Reference for a list of status IDs.

    function heartBeatSeverity(alert, associated) {
        var currentAlert = moogdb.getAlert(alert.value("alert_id"));
        // Skip alerts that no longer exist or are closed (state 9).
        if ( currentAlert && currentAlert.value("state") !== 9 ) {
            alert.set("severity", 5);
            var alertDescr = currentAlert.value("description");
            // Update the description to "MISSED", a successful heartbeat will reset this.
            if ( !/^MISSED/i.test(alertDescr) ) {
                alert.set("description", "MISSED: " + alertDescr);
            }
            moogdb.updateAlert(alert);
            currentAlert.forward("HeartbeatCookBook");
        }
    }
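To see what the exit action does without a running Moogfarmd, the function can be exercised against stand-in objects. The sketch below is a minimal harness under stated assumptions: makeStubAlert, runExitAction and the stubbed moogdb object are hypothetical test doubles that only mimic the calls the exit action makes (value, set, forward, getAlert, updateAlert). They are not the real Moogsoft modules, and the state value 1 is simply "any state other than 9".

```javascript
// Hypothetical stubs for illustration; these are NOT the real Moogsoft objects.
let moogdb; // stand-in for the database module a Moobot normally loads

function makeStubAlert(fields) {
  const forwardedTo = [];
  return {
    value: (name) => fields[name],
    set: (name, v) => { fields[name] = v; },
    forward: (moolet) => { forwardedTo.push(moolet); },
    forwardedTo,
  };
}

// Copy of the exit action, unchanged in behaviour.
function heartBeatSeverity(alert, associated) {
  var currentAlert = moogdb.getAlert(alert.value("alert_id"));
  if (currentAlert && currentAlert.value("state") !== 9) {
    alert.set("severity", 5);
    var alertDescr = currentAlert.value("description");
    if (!/^MISSED/i.test(alertDescr)) {
      alert.set("description", "MISSED: " + alertDescr);
    }
    moogdb.updateAlert(alert);
    currentAlert.forward("HeartbeatCookBook");
  }
}

// Run the exit action once against a stubbed alert and report the outcome.
function runExitAction(fields) {
  const alert = makeStubAlert(fields);
  moogdb = { getAlert: () => alert, updateAlert: () => {} };
  heartBeatSeverity(alert, null);
  return { fields, forwardedTo: alert.forwardedTo };
}

const openResult = runExitAction(
  { alert_id: 1, state: 1, severity: 0, description: "SCOM heartbeat" });
const closedResult = runExitAction(
  { alert_id: 2, state: 9, severity: 0, description: "SCOM heartbeat" });
const repeatResult = runExitAction(
  { alert_id: 3, state: 1, severity: 5, description: "MISSED: SCOM heartbeat" });

console.log(openResult.fields.description, openResult.forwardedTo);
console.log(closedResult.fields.severity, closedResult.forwardedTo);
console.log(repeatResult.fields.description);
```

In this harness the open alert becomes critical, gains the "MISSED: " prefix and is forwarded; the closed alert is left untouched; and an alert already marked as missed is not prefixed a second time.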
Navigate to Settings > Action States in the Moogsoft Onprem UI. Create a new Action State called "Heartbeat" as follows:
| Setting Name | Input | Value |
|---|---|---|
| Name | String | Heartbeat |
| Remember alerts for | Integer (seconds) | 30 * |
| Cascade on expiry | Boolean | True |
| Exit Action | String | heartBeatSeverity |
Warning
* The 'Remember alerts for' setting is the timer. Set this to two or three times your heartbeat interval.
Go to Settings > Transitions in the Moogsoft Onprem UI. Set up a transition to move your heartbeat alerts to the 'Heartbeat' State. Configure the settings as follows:
| Setting Name | Value |
|---|---|
| Name | Heartbeat |
| Priority | 10 |
| Active | True |
| Trigger Filter | (type = "heartbeat") AND ((((agent = "SPECTRUM") OR (manager = "SCOM")) OR (agent = "MONITOR1")) OR (agent = "MONITOR2")) |
| Start State | Ground |
| End State | Heartbeat |
Edit the 'Trigger Filter' to meet your requirements. In this example, the transition is triggered by alerts that have the type 'heartbeat' and come from 'SPECTRUM', 'SCOM', 'MONITOR1' or 'MONITOR2'.
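When adapting the filter, it can help to reason about it as a plain boolean expression. The sketch below restates the example trigger filter as a JavaScript predicate. The function name matchesHeartbeatTransition is hypothetical, alerts are modelled as plain objects, and exact string comparison is an assumption; this is not how the Alert Rules Engine evaluates filters internally.

```javascript
// Hypothetical helper: the example trigger filter written as a predicate.
// Assumes exact, case-sensitive string comparison on plain objects.
function matchesHeartbeatTransition(alert) {
  const fromKnownSource =
    alert.agent === "SPECTRUM" ||
    alert.manager === "SCOM" ||      // note: SCOM is matched on 'manager', not 'agent'
    alert.agent === "MONITOR1" ||
    alert.agent === "MONITOR2";
  return alert.type === "heartbeat" && fromKnownSource;
}

// A heartbeat from a listed source matches.
console.log(matchesHeartbeatTransition({ type: "heartbeat", agent: "SPECTRUM" }));
// A non-heartbeat alert from the same source does not.
console.log(matchesHeartbeatTransition({ type: "alarm", agent: "SPECTRUM" }));
// A heartbeat from an unlisted source does not.
console.log(matchesHeartbeatTransition({ type: "heartbeat", agent: "OTHER" }));
```

Both conditions must hold: the alert has to be tagged as a heartbeat and has to come from one of the listed sources.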
Ensure Alert Rules Engine is enabled. To do this, edit the $MOOGSOFT_HOME/config/moolets/alert_rules_engine.conf file and set run_on_startup to true.
Create a heartbeat.conf configuration file in $MOOGSOFT_HOME/config/moolets to add a Heartbeat Cookbook for heartbeat alerts. This Cookbook only processes heartbeat alerts:

    {
        # Moolet
        name:"HeartbeatCookBook",
        classname:"CCookbook",
        run_on_startup:true,
        metric_path_moolet : true,
        moobot:"Cookbook.js",
        process_output_of:"[]",

        # Algorithm
        membership_limit:5,
        scale_by_severity:false,
        entropy_threshold:0.0,
        single_recipe_matching:false,
        recipes:[
            # Any heartbeat class for the same agent.
            {
                chef:"CValueRecipe",
                name:"ScomHeartbeatErrors",
                description:"SCOM Heartbeat: Missing heartbeat",
                recipe_alert_threshold:0,
                exclusion:"state = 9",
                trigger:"class = 'heartbeat' AND agent = 'SCOM'",
                rate:0, # Given in events per minute
                min_sample_size:5,
                max_sample_size:10,
                matcher:{
                    components:[
                        { name:"agent", similarity:1.0 }
                    ]
                }
            },
            {
                chef:"CValueRecipe",
                name:"ScomHeartbeatChange",
                description:"SCOM Heartbeat: Cluster host change",
                recipe_alert_threshold:0,
                exclusion:"state = 9",
                trigger:"class = 'heartbeatRoleChange' AND agent = 'SCOM'",
                rate:0, # Given in events per minute
                min_sample_size:5,
                max_sample_size:10,
                matcher:{
                    components:[
                        { name:"agent", similarity:1.0 }
                    ]
                }
            }
        ],
        cook_for:20000
    }
Save heartbeat.conf.
Edit the Moogfarmd configuration file $MOOGSOFT_HOME/config/moog_farmd.conf to add a new merge group that references the HeartbeatCookBook Moolet. Configure this merge group to have an alert_threshold of 1 to allow a single alert to create a Situation (by default, a minimum of 2 alerts are required to create a Situation):

    merge_groups: [
        {
            name: "Heartbeat",
            moolets: ["HeartbeatCookBook"],
            alert_threshold : 1,
            sig_similarity_limit : 1
        }
    ],
Include the Moolet configuration by adding the following in $MOOGSOFT_HOME/config/moog_farmd.conf:

    { include : "heartbeat.conf" },
Save the changes to moog_farmd.conf.
Restart Moogfarmd:

    service moogfarmd restart
After the heartbeat monitor configuration is complete, heartbeat alerts should start to arrive in Moogsoft Onprem.
Heartbeat Monitor Process
The process flow for a heartbeat alert is as follows:
Heartbeat alert arrives at the Alert Rules Engine.
The alert transitions from the 'Ground' to the 'Heartbeat' action state, which starts the timer.
The alert sits in the 'Heartbeat' state waiting for the timer to expire.
Any subsequent heartbeat alert resets the timer.
If the timer expires, the exit action changes the alert severity to '5' (critical) and the alert cascades to the 'Ground' state.
Any subsequent heartbeat updates the severity to '0' (clear) and restarts the timer.
You could also add an entry action to close any missed heartbeat situations the event is part of.
This example also updates the alerts with the times of the missing heartbeats for an easy audit trail.
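The process flow above can be traced with a small state model of a single heartbeat alert. This is a sketch for illustration only: HeartbeatAlertModel and its methods are hypothetical names, no Moogsoft API is used, and time is supplied by the caller in seconds. It assumes a hold time of 30 seconds, matching the 'Remember alerts for' value in the example.

```javascript
// Hypothetical model for illustration; not Moogsoft code and not a Moogsoft API.
class HeartbeatAlertModel {
  constructor(holdSeconds) {
    this.holdSeconds = holdSeconds; // the 'Remember alerts for' value
    this.state = "Ground";
    this.severity = 0;
    this.deadline = null;           // time at which the hold timer expires
  }

  // A heartbeat moves the alert into 'Heartbeat', clears the severity
  // and starts or restarts the timer.
  heartbeat(nowSeconds) {
    this.tick(nowSeconds); // apply any expiry that happened before this heartbeat
    this.state = "Heartbeat";
    this.severity = 0;
    this.deadline = nowSeconds + this.holdSeconds;
  }

  // On expiry the exit action raises the severity to 5 (critical)
  // and the alert cascades back to 'Ground'.
  tick(nowSeconds) {
    if (this.state === "Heartbeat" && nowSeconds >= this.deadline) {
      this.severity = 5;
      this.state = "Ground";
      this.deadline = null;
    }
  }
}

const model = new HeartbeatAlertModel(30);
model.heartbeat(0);   // first heartbeat: Ground -> Heartbeat, timer runs to 30 s
model.heartbeat(20);  // second heartbeat resets the timer to 50 s
model.tick(49);       // timer has not expired yet
console.log(model.state, model.severity);
model.tick(50);       // timer expires: heartbeat missed
console.log(model.state, model.severity);
model.heartbeat(60);  // heartbeat resumes: severity cleared, timer restarted
console.log(model.state, model.severity);
```

The trace shows the alert staying in 'Heartbeat' while heartbeats keep arriving, dropping to 'Ground' with critical severity once the timer lapses, and clearing again when the source resumes.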