Event and Alert Field Best Practice
This best practice aims to promote consistency and reuse of configurations, including the mapping from a source to an inbound event. The fields exposed in the alert/event are:
# | Field | Required | Data Type | Size | Description | Example | Comment |
---|---|---|---|---|---|---|---|
1 | signature | Yes | VARBINARY(binary) | 767 | A special attribute that determines how Moogsoft Onprem deduplicates events into alerts. It can be any combination of one or more of the attributes listed below, constructed so that it identifies a subset of events from a source (also see existing guidance). Separate the constructed fields with "::" so that concatenation cannot produce misleading results. For example, NodeA with event ID 12 would concatenate as NodeA12, which is the same as NodeA1 with event ID 2, whereas NodeA::12 and NodeA1::2 remain distinct. Signatures do not need to be human readable, so clarity isn't a concern. If length becomes an issue, remove whitespace or other extraneous characters (via a LAMbot). | host1::nagios::cpu | |
2 | alert_id | Yes | BIGINT(binary) | 20 | An auto-assigned incremental number. Internally generated DO NOT CHANGE | ||
3 | source_id | Yes | TEXT(utf8) | 65535 | source and source_id refer to the generating source of the event, primarily focused on the host environment. The source should be any unique human-readable name (FQDN, hostname, etc.) and the source_id should be any machine-generated identifier for the source (IP, MAC, CI number, etc.). If the event has no machine identification, as with application or other software-generated events, the source should be some unique identifier of the instance (database name, cluster node, container name, etc.). Again, source_id should be any other unique identifier that is available (container UUID, cluster node UUID, etc.). This attribute can be used for any additional identification attribute of the CI. | 192.168.1.107 | |
4 | external_id | No | TEXT(utf8) | 65535 | Any unique identifier provided in the source event (event ID, incident ID, etc.). This is typically set to the CI's ID in the CMDB or, where the event is emitted from an underlying element management system, it may hold the unique source event identifier. | 12345 | Returns Null if blank |
5 | manager | No | TEXT(utf8) | 65535 | A general identifier of the event generator or intermediary (NAGIOS, SCOM, etc.). In hub-and-spoke and/or relay architectures this is typically the name of the agent manager that pre-aggregates events before sending them to Moogsoft Onprem. For example, there may be a BMC Patrol manager that manages all San Francisco data center alerts. This field is also sometimes used simply to track the name of the Moogsoft Onprem LAM that received the alerts in multi-LAM deployments. | Nagios | Returns Null if blank |
6 | source | Yes | TEXT(utf8) | 65535 | source and source_id refer to the generating source of the event, primarily focused on the host environment. The source should be any unique human-readable name (FQDN, hostname, etc.) and the source_id should be any machine-generated identifier for the source (IP, MAC, CI number, etc.). If the event has no machine identification, as with application or other software-generated events, the source should be some unique identifier of the instance (database name, cluster node, container name, etc.). Again, source_id should be any other unique identifier that is available (container UUID, cluster node UUID, etc.). | host1 | |
7 | class | Yes | TEXT(utf8) | 65535 | class and type are generic classifications for the event in a hierarchy that allows you to maintain a simple event ontology: class, then type (Disk space: free space, Memory: max used...total available, etc.). | cpu | |
8 | agent | Yes | TEXT(utf8) | 65535 | The specific agent that created the event (SCOM REST, NAGIOS SOCKET, SNMP TRAP NATIVE, etc.). This is typically the name of the agent that facilitates the event from the CI, e.g. "nagios-agent-london-7". A simple way to provide this is to set agent:name in the lam.conf and then map $LamInstanceName to agent. This is the default mapping: { name: "agent", rule: "$LamInstanceName" } | Linux | |
9 | agent_location | Yes | TEXT(utf8) | 65535 | This is typically the geographic location of the agent and/or CI, such as "London". Use it consistently for all sources, either as the host machine that the agent is executed from (BEM Server 1, OEM Monitor cluster, etc.) OR the physical location where the agent is executing (NYC Data Centre, Stuttgart Main Station, (51.407139, -0.307321), etc.). | New York, NY | |
10 | agent_time | Yes | | | The timestamp, in epoch seconds, at which the event occurred. Set this across all event sources to provide a common time reference. Timezones should be nullified: all events should be presented in the same time context. If an event source does not provide a suitable time in the payload, use the ingestion time at the LAM. Note: polled event sources (rest_client_lam, SCOM, Netcool) may skew the event time in line with the poll cycle. If an event is generated in a different timezone and is converted to Moogsoft Onprem server time, add the origin time to the custom_info for the event (e.g. custom_info.originalEventTime); this can be operationally useful. agent_time should be in epoch seconds; convert as necessary. Miscalculated event times will cause unpredictable results across the system. Also see the 4.1.2 Release note, [MOOG-2278] - Enhanced Alert Times. If agent_time is not defined, set it to the current epoch time using a JavaScript expression such as: Math.round(Date.now() / 1000); | ||
11 | type | Yes | TEXT(utf8) | 65535 | class and type are generic classifications for the event in a hierarchy that allows you to maintain a simple event ontology: class, then type (Disk space: free space, Memory: max used...total available, etc.). | DOWN | |
12 | severity | Yes | INT(binary) | 11 | Standard 0-5, but be mindful of the significance across all event sources if possible: a low-value event source could produce critical events that would be considered minor in the wider context. Use the "sevMapper" built into the Moogsoft Onprem LAM config file to map your incoming severity values to a number between 0 and 5: 0 = Clear, 1 = Indeterminate, 2 = Warning, 3 = Minor, 4 = Major, 5 = Critical. | 5 | 0 clear - 5 critical |
13 | significance | No | INT(binary) | 11 | This value is calculated by Moogsoft Onprem Events Analyser. Internally generated DO NOT CHANGE | ||
14 | count | No | INT(binary) | 11 | The reference count of deduplicated Events for each Alert. Internally generated DO NOT CHANGE | ||
15 | description | Yes | TEXT(utf8) | 65535 | The main text payload of the event. Add as much textual detail as possible. Remember a human operator will look at the detail and the entropy calculation works best with detailed narratives. | CPU Threshold exceeded: 99% | |
16 | first_event_time | No | BIGINT(binary) | 20 | If you set agent_time in the LAM/LAMbot to the actual epoch seconds timestamp of each event, Moogsoft Onprem will automatically keep track of the first and last occurrence of multiple instances of the same event. Note that setting agent_time will also set first_event_time and last_event_time. Internally generated DO NOT CHANGE | ||
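17 | last_event_time | No | BIGINT(binary) | 20 | Maintained automatically alongside first_event_time when agent_time is set. Internally generated DO NOT CHANGE | ||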
18 | int_last_event_time | No | BIGINT(binary) | 20 | Internally generated DO NOT CHANGE | 1411134582 | From agent_time |
19 | last_state_change | No | BIGINT(binary) | 20 | Internally generated DO NOT CHANGE | ||
20 | state | No | INT(binary) | 11 | 1 = Opened, 2 = Unassigned, 3 = Assigned, 4 = Acknowledged, 5 = Unacknowledged, 6 = Active, 7 = Dormant, 8 = Resolved, 9 = Closed, 10 = SLA Exceeded. Internally generated DO NOT CHANGE | ||
21 | owner | No | INT(binary) | 11 | Set when an operator right-clicks on an alert in the Moogsoft Onprem UI and assigns ownership. Internally generated DO NOT CHANGE | ||
22 | entropy | No | DOUBLE(binary) | 22 | Internally generated DO NOT CHANGE | ||
23 | custom_info | No | TEXT(utf8) | 65535 | custom_info is a special field that provides the mechanism for extending the Moogsoft Onprem alert schema. It is a JSON-encoded string that should contain key-value pairs for each data element that was not supplied in the initial event or that has been enriched via alert transformation. Be consistent with key names so they can be used in Sigalisers and filters. Consider using a LAMbot module that sets a base set of custom_info across all LAMs; this provides a single point of administration for the customer. Take care when setting custom_info in a LAM that it does not overwrite downstream additions (e.g. enrichment via a Moobot) when the event is deduplicated. You can store simple or arbitrarily complex hierarchical JSON attributes in this field. They are serialized for use in the standard JSON.parse/stringify manner, and the Moogsoft Onprem UI is written to display JSON hierarchies of any complexity in a tree-view format. | | Returns Null if blank |
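
The guidance for signature, agent_time and custom_info can be sketched in plain JavaScript, the language used for LAMbots. This is a minimal illustration rather than Moogsoft code: the helper names buildSignature, toEpochSeconds and mergeCustomInfo are hypothetical, the rule that values above 1e11 are milliseconds is an assumption, and the sketch deliberately avoids any LAMbot API calls.

```javascript
// Hypothetical helpers; none of these names are part of the Moogsoft API.

// Join signature components with "::" so that ("NodeA", 12) and ("NodeA1", 2)
// cannot collapse into the same string. Whitespace is removed to save length.
function buildSignature(parts) {
  return parts
    .map(function (part) { return String(part).replace(/\s+/g, ""); })
    .join("::");
}

// Return a timestamp in epoch seconds.
// Assumption: numeric values above 1e11 are treated as epoch milliseconds.
// Missing or non-numeric input falls back to the current time.
function toEpochSeconds(value) {
  var n = Number(value);
  if (value === null || value === undefined || value === "" || !isFinite(n)) {
    return Math.round(Date.now() / 1000);
  }
  return n > 1e11 ? Math.round(n / 1000) : Math.round(n);
}

// Shallow-merge LAM-supplied keys into existing custom_info. Keys already
// present (for example, added by downstream enrichment) are not overwritten.
function mergeCustomInfo(existing, additions) {
  var merged = {};
  var key;
  for (key in additions) {
    if (Object.prototype.hasOwnProperty.call(additions, key)) {
      merged[key] = additions[key];
    }
  }
  for (key in existing || {}) {
    if (Object.prototype.hasOwnProperty.call(existing, key)) {
      merged[key] = existing[key];
    }
  }
  return merged;
}

// Usage with illustrative values taken from the table above.
var signature = buildSignature(["host1", "nagios", "cpu"]); // "host1::nagios::cpu"
var agentTime = toEpochSeconds(1411134582000);              // 1411134582
var customInfo = mergeCustomInfo(
  { team: "db" },                                  // already on the alert
  { team: "unknown", originalEventTime: agentTime } // set by the LAM
);
```

In this sketch the existing value of team is kept and originalEventTime is added, which reflects the advice that a LAM should not overwrite downstream additions when an event is deduplicated.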