Maintenance Manager Moolet

Introduction

The Maintenance Manager Moolet compares alerts sent from the AlertBuilder Moolet against active Maintenance Windows (see Maintenance Schedule) and filters out those that match.

Alerts that match the filter of an active Maintenance Schedule are not forwarded to the next part of the chain, which is usually a Sigaliser Moolet that clusters alerts into Situations.

Note

Use Maintenance Schedule when you have scheduled outages and do not want new Situations to be created. During that time period, the Maintenance Manager Moolet ensures that matching alerts are not passed along to Sigalisers and clustered into Situations.
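
As a rough illustration of the filtering described above, the following Python sketch models the decision for a single alert. It is not Moogsoft's implementation: the Window class, the is_active and should_forward helpers, the restriction to an equality test on one alert field, and the treatment of duration as seconds are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Window:
    """Hypothetical stand-in for a maintenance window (not a Moogsoft type)."""
    column: str      # alert field the filter inspects, e.g. "source"
    value: str       # value the field must equal for the alert to match
    start: int       # window start, epoch seconds
    duration: int    # window length, assumed here to be in seconds


def is_active(window: Window, now: int) -> bool:
    """Treat a window as active from its start until start + duration."""
    return window.start <= now < window.start + window.duration


def should_forward(alert: dict, windows: list, now: int) -> bool:
    """Return False when any active window's filter matches the alert."""
    for w in windows:
        if is_active(w, now) and alert.get(w.column) == w.value:
            return False
    return True


windows = [Window("source", "hostWhichIsDown1", start=1473849237, duration=55800)]
alert = {"source": "hostWhichIsDown1", "severity": 5}

print(should_forward(alert, windows, now=1473849237 + 60))      # inside the window
print(should_forward(alert, windows, now=1473849237 + 55800))   # window has ended
```

An alert from a different source is forwarded in both cases, because no window filter matches it.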

Configuration

The Maintenance Window Manager is controlled by a Moolet in moogfarmd, configured in $MOOGSOFT_HOME/config/moog_farmd.conf, which has the default configuration shown below:

{
   name                     : "MaintenanceWindowManager",
   classname                : "CMaintenance",
   run_on_startup           : true,
   persist_state            : false,
   metric_path_moolet       : true,
   process_output_of        : "AlertBuilder",
   maintenance_status_field : "maintenance_status",
   maintenance_status_label : "In maintenance",
   update_captured_alerts   : true
}

The following fields can be configured to change the behaviour of the Maintenance Manager Moolet:

Field                     Input             Description
name                      String            The name of the Moolet. Only use the default 'MaintenanceWindowManager'; this should not be changed.
classname                 String            The class name 'CMaintenance' is hardcoded and cannot be changed.
run_on_startup            Boolean           If enabled, the Moolet will run when moogfarmd is started.
persist_state             Boolean           If enabled, persistence will be turned on (state will be persisted in a cluster).
metric_path_moolet        Boolean           If enabled, the Moolet will be included in the "event_processing_metric" calculation in Self Monitoring.
process_output_of         AlertBuilder,     Tells the Moolet to process the output of the Alert Builder, Alert Rules Engine or Enricher.
                          AlertRulesEngine
                          or Enricher
maintenance_status_field  String            The name of the custom_info field/key used to indicate maintenance status.
maintenance_status_label  String            The value of the custom_info maintenance status field used to indicate that an alert is in maintenance.
update_captured_alerts    Boolean           If enabled, ensures the maintenance status of an alert is set to null once the Maintenance Window that captured it has expired.
                                            If disabled, the maintenance status field of a captured alert will remain as "In maintenance" (or whatever the maintenance_status_label text is set to) until that alert reoccurs, at which point all custom_info maintenance fields will be set to null.
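
The effect of update_captured_alerts can be pictured with a small Python model. This is only a sketch of the behaviour described for that field: the function name on_window_expired, the alert_id key and the dictionary-based alert are hypothetical, and Python's None stands in for null.

```python
def on_window_expired(alert: dict, update_captured_alerts: bool,
                      status_field: str = "maintenance_status") -> dict:
    """Model what happens to a captured alert when its window expires."""
    custom_info = dict(alert.get("custom_info", {}))
    if update_captured_alerts:
        # Enabled: the status is cleared as soon as the window expires.
        custom_info[status_field] = None
    # Disabled: the label stays until the alert reoccurs (not modelled here).
    return {**alert, "custom_info": custom_info}


captured = {"alert_id": 1, "custom_info": {"maintenance_status": "In maintenance"}}

cleared = on_window_expired(captured, update_captured_alerts=True)
kept = on_window_expired(captured, update_captured_alerts=False)
print(cleared["custom_info"]["maintenance_status"])
print(kept["custom_info"]["maintenance_status"])
```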

You can add a column to the alert view that displays the 'Maintenance Status' of each alert. To control the text shown in this column, edit maintenance_status_label in the MaintenanceWindowManager Moolet configuration in $MOOGSOFT_HOME/config/moog_farmd.conf.

For the feature to function, the MaintenanceWindowManager Moolet must be placed before a Sigaliser Moolet in a forwarding chain (configured in $MOOGSOFT_HOME/config/moog_farmd.conf). It is also appropriate to place it before the AlertRulesEngine in the processing chain, which is the clean install configuration.

Updating Captured Alerts

In addition to implementing the maintenance windows, the MaintenanceWindowManager Moolet also updates each alert affected by a maintenance window with several custom_info fields.

By default an alert that is "captured" by a maintenance window will have the following fields set:

Field                           Description
custom_info.maintenance_status  Configurable text label - set to "In maintenance" by default
custom_info.maintenance_id      The numerical id of the maintenance window that captured the alert
custom_info.maintenance_name    The name of the maintenance window that captured the alert
custom_info.forward_alerts      Whether the alert is forwarded to Sigalisers or not - false by default
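
To make the four fields concrete, here is a minimal Python sketch of what a captured alert's custom_info might look like afterwards. The mark_captured helper, the "team" key and the sample values are hypothetical; only the field names and defaults come from the table above, with forward_alerts spelled as in the Graze payload.

```python
def mark_captured(alert: dict, window_id: int, window_name: str,
                  label: str = "In maintenance",
                  forward_alerts: bool = False) -> dict:
    """Return a copy of the alert with the four maintenance fields set."""
    custom_info = dict(alert.get("custom_info") or {})
    custom_info.update({
        "maintenance_status": label,
        "maintenance_id": window_id,
        "maintenance_name": window_name,
        "forward_alerts": forward_alerts,
    })
    return {**alert, "custom_info": custom_info}


alert = {"source": "hostWhichIsDown1", "custom_info": {"team": "network"}}
captured = mark_captured(alert, window_id=123, window_name="my_window_1")
print(captured["custom_info"])
```

The sketch adds to the existing custom_info rather than replacing it, so unrelated keys such as "team" survive.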


Out-of-the-Box Moolet Flow

The default (out-of-the-box) flow for the Moolet looks like this:

+------------+       +------------------------+       +----------------+       +----------+
|AlertBuilder| ----> |MaintenanceWindowManager| ----> |AlertRulesEngine| ----> |Sigalisers|
+------------+       +------------------------+       +----------------+       +----------+


To forward alerts programmatically from an AlertBuilder Moobot to different Sigalisers (using alert.forward('SigaliserName');), you need more than one MaintenanceWindowManager Moolet. The following examples show a selection of different approaches within a single moogfarmd instance:

  • AlertBuilder1 -> if (x) then alert.forward('MaintenanceWindowManager1'); -> MaintenanceWindowManager1 -> Sigaliser (process_output_of 'MaintenanceWindowManager1')
  • AlertBuilder1 -> if (y) then alert.forward('MaintenanceWindowManager2'); -> MaintenanceWindowManager2 -> Cookbook1 (process_output_of 'MaintenanceWindowManager2')
  • AlertBuilder2 -> alert.forward('MaintenanceWindowManager3'); -> MaintenanceWindowManager3 -> Cookbook2 (process_output_of 'MaintenanceWindowManager3')
  • AlertBuilder3 -> alert.forward(this); -> MaintenanceWindowManager4 -> Speedbird (process_output_of 'MaintenanceWindowManager4')

There are also endpoints available to manage this feature with the Graze API.

Create new maintenance window (HTTP POST):

To create a one-time maintenance window (no recurrence), which is filtered on 'source equal to "hostWhichIsDown1"':

curl "https://<YOUR_HOSTNAME>:8080/graze/v1/createMaintenanceWindow" -H "Content-Type: application/json; charset=UTF-8" --insecure -X POST -v --data '{"auth_token": "<YOUR_GRAZE_AUTH_TOKEN>", "name": "my_window_1", "description": "This is my description", "filter": { "column": "source", "op": 0, "value": "hostWhichIsDown1", "type": "LEAF" }, "start_date_time": 1473849237, "duration": 55800, "forward_alerts": false}'
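
For scripted use, the same request can be assembled in Python with only the standard library. This is a sketch under assumptions: build_create_request is a hypothetical helper, moog.example.com is a placeholder hostname, and the request is built but not sent. The start time is converted from a datetime to epoch seconds, which reproduces the 1473849237 value used in the curl example.

```python
import json
import urllib.request
from datetime import datetime, timezone


def build_create_request(hostname: str, auth_token: str, name: str,
                         source: str, start: datetime, duration: int,
                         description: str = "") -> urllib.request.Request:
    """Build (but do not send) a createMaintenanceWindow POST request."""
    payload = {
        "auth_token": auth_token,
        "name": name,
        "description": description,
        # Same filter shape as the curl example: source equal to a value.
        "filter": {"column": "source", "op": 0, "value": source, "type": "LEAF"},
        "start_date_time": int(start.timestamp()),  # epoch seconds
        "duration": duration,
        "forward_alerts": False,
    }
    return urllib.request.Request(
        url=f"https://{hostname}:8080/graze/v1/createMaintenanceWindow",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json; charset=UTF-8"},
        method="POST",
    )


request = build_create_request(
    hostname="moog.example.com",          # placeholder host
    auth_token="<YOUR_GRAZE_AUTH_TOKEN>",
    name="my_window_1",
    source="hostWhichIsDown1",
    start=datetime(2016, 9, 14, 10, 33, 57, tzinfo=timezone.utc),
    duration=55800,
    description="This is my description",
)
print(request.full_url)
print(request.data.decode("utf-8"))
# To send it: urllib.request.urlopen(request)
```

Unlike the curl example, which passes --insecure, urllib.request.urlopen verifies the server certificate by default.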

To create a maintenance window (same filter as above) that recurs once a month from its start_date_time, add the recurring_period property (the only allowed value is 1) and the recurring_period_units property (allowed values are 2 (daily), 3 (weekly) and 4 (monthly)):

curl "https://<YOUR_HOSTNAME>:8080/graze/v1/createMaintenanceWindow" -H "Content-Type: application/json; charset=UTF-8" --insecure -X POST -v --data '{"auth_token": "<YOUR_GRAZE_AUTH_TOKEN>", "name": "my_window_1", "description": "This is my description", "filter": { "column": "source", "op": 0, "value": "hostWhichIsDown1", "type": "LEAF" }, "start_date_time": 1473849237, "duration": 55800, "forward_alerts": false, "recurring_period": 1, "recurring_period_units": 4}'

Delete maintenance window (HTTP POST):

curl "https://<YOUR_HOSTNAME>:8080/graze/v1/deleteMaintenanceWindow" -H "Content-Type: application/json; charset=UTF-8" --insecure -X POST -v --data '{"auth_token": "<YOUR_GRAZE_AUTH_TOKEN>", "id": 123}'

Get current or future maintenance windows (HTTP GET):

The selection returned can be controlled using the start and limit parameters:

curl "https://<YOUR_HOSTNAME>:8080/graze/v1/getMaintenanceWindows?auth_token=<YOUR_GRAZE_AUTH_TOKEN>&start=0&limit=20"

  • This call does not return deleted maintenance windows or maintenance windows that have already expired (that is, windows set in the past)
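
When many windows exist, the start and limit parameters can be stepped to fetch them page by page. The Python sketch below only builds the URLs; it assumes that start is a zero-based offset into the result set, which this page does not spell out. The windows_url helper, the hostname and the token value are placeholders.

```python
from urllib.parse import urlencode


def windows_url(hostname: str, auth_token: str, start: int = 0, limit: int = 20) -> str:
    """Build a getMaintenanceWindows URL for one page of results."""
    query = urlencode({"auth_token": auth_token, "start": start, "limit": limit})
    return f"https://{hostname}:8080/graze/v1/getMaintenanceWindows?{query}"


# URLs for the first three pages of 20 windows each (start assumed to be an offset).
for page in range(3):
    print(windows_url("moog.example.com", "abc123", start=page * 20, limit=20))
```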

Important Note on Maintenance Windows via Graze

The following limitations are in effect:

  • The "filter" syntax must be in the correct Moog JSON format (same format as used in alert and Situation filters in the DB)
  • The "start_date_time" property must be in epoch time (seconds since 1 January 1970 UTC, for example 1473849237)
  • For a recurring maintenance window, the recurring_period property must be 1 - no other value will be accepted
  • For a recurring maintenance window, the recurring_period_units property has allowed values of 2 (daily), 3 (weekly) or 4 (monthly) - no other values will be accepted
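
These limitations can be checked on the client before a request is sent. The following Python sketch is a hypothetical pre-flight check, not part of the Graze API: validate_window and its messages are invented for illustration, it does not inspect the filter syntax, and it assumes a one-time window simply omits both recurrence properties.

```python
ALLOWED_UNITS = {2: "daily", 3: "weekly", 4: "monthly"}


def validate_window(payload: dict) -> list:
    """Return a list of problems found in a createMaintenanceWindow payload."""
    problems = []
    start = payload.get("start_date_time")
    if not isinstance(start, int) or isinstance(start, bool) or start < 0:
        problems.append("start_date_time must be an epoch timestamp")
    recurring = "recurring_period" in payload or "recurring_period_units" in payload
    if recurring:
        if payload.get("recurring_period") != 1:
            problems.append("recurring_period must be 1")
        if payload.get("recurring_period_units") not in ALLOWED_UNITS:
            problems.append("recurring_period_units must be 2, 3 or 4")
    return problems


monthly = {"start_date_time": 1473849237, "recurring_period": 1, "recurring_period_units": 4}
bad = {"start_date_time": "tomorrow", "recurring_period": 2, "recurring_period_units": 1}
print(validate_window(monthly))
print(validate_window(bad))
```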

Note

This feature uses custom_info fields within the alerts. As a result, Moobots should not overwrite the following fields or completely empty the custom_info object within alerts:

  • custom_info.maintenance_status
  • custom_info.maintenance_id
  • custom_info.forward_alerts