Normalize Data

For complex processing of event data you can use a LAMbot.

LAMbot processing

Some event fields may have embedded information that can be used later for enrichment and alert correlation. For example, a hostname may include regional information that you can use to cluster alerts by physical location. As a best practice, parse incoming data fields that include this type of information. A number of utilities, such as the Bot utility, are available to simplify data parsing in the LAMbot.
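
As an illustration of this kind of parsing, here is a minimal sketch in plain JavaScript. It assumes a hypothetical hostname convention (three-letter site code, digits, role letters, instance digits) that happens to fit a name such as "lon35sql04". The function name parseHostname and the SITE_REGIONS table are made up for this example and are not part of the Bot utility; adapt the pattern to your own naming scheme.

```javascript
// Hypothetical convention: <3-letter site><digits><role letters><instance digits>,
// for example "lon35sql04". The site-to-region table is illustrative only.
var SITE_REGIONS = { lon: "EMEA", nyc: "AMER", sin: "APAC" };

function parseHostname(hostname) {
    var match = /^([a-z]{3})(\d+)([a-z]+)(\d+)$/i.exec(String(hostname));
    if (!match) {
        // The hostname does not follow the assumed convention
        return null;
    }
    var site = match[1].toLowerCase();
    return {
        site: site,
        region: SITE_REGIONS[site] || "unknown",
        role: match[3].toLowerCase(),
        instance: parseInt(match[4], 10)
    };
}
```

In a LAMbot, the parsed values would typically be stored in custom_info so that they are available for clustering later.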

Any event field can potentially be used in correlation. Identify early on which fields will be used and assign them accordingly. As a rule, use only fields with reliable data for correlation and avoid unreliable event fields.

Assume a simple incoming raw event of the following format:

{
    "ip_address":"10.42.63.74",
    "event_id":"e3562",
    "manager":"MAN",
    "host":"lon35sql04",
    "eventID":"e4268",
    "check":
    {
        "name": "database",
        "type": "availability"
    },
    "region":"EMEA",
    "datacenter":"London",
    "priority":"crit",
    "message":"Database is down",
    "app" : ["App A", "App B"]
}

The following example shows a mapping to support the incoming event. Note the new severityConverter transformation. In the LAM configuration file:

constants:
{
    custom_severity:
    {
        "crit": 5,
        "ok": 0,
        "warn": 2,
        "minor": 3,
        "error": 4,
        moog_lookup_default: 1
    }
},
conversions:
{
    severityConverter:
    {
        lookup: "custom_severity",
        input:  "STRING",
        output: "INTEGER"
    }
},
mapping:
{
    catchAll: "overflow",
    rules:
    [
        { name: "signature", rule:      "$host::$check.name::$check.type" },
        { name: "source_id", rule:      "$ip_address" },
        { name: "external_id", rule:    "$external_id" },
        { name: "manager", rule:        "$manager" },
        { name: "source", rule:         "$host" },
        { name: "class", rule:          "$check.name" },
        { name: "agent", rule:          "$LamInstanceName" },
        { name: "agent_location", rule: "$region" },
        { name: "type", rule:           "$check.type" },
        { name: "severity", rule:       "$priority", conversion: "severityConverter" },
        { name: "description", rule:    "$message" },
        { name: "agent_time", rule:     "$moog_now" }
    ]
},
filter: 
{
    presend: "lambot.js"
}
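
To make the effect of the severityConverter lookup concrete, the following stand-alone JavaScript sketch mimics it outside the LAM. It is not Moogsoft's implementation: the function name convertSeverity is hypothetical, and the sketch assumes that moog_lookup_default supplies the value for any input that has no entry in the table.

```javascript
// Same table as the custom_severity constant in the LAM configuration
var customSeverity = {
    crit: 5,
    ok: 0,
    warn: 2,
    minor: 3,
    error: 4,
    moog_lookup_default: 1
};

// Hypothetical helper: returns the mapped severity, or the default
// when the incoming priority has no entry in the table.
function convertSeverity(priority) {
    // Use hasOwnProperty rather than a truthiness check, because "ok" maps to 0
    var known = Object.prototype.hasOwnProperty.call(customSeverity, priority);
    return known ? customSeverity[priority] : customSeverity.moog_lookup_default;
}
```

With the sample event, a priority of "crit" would yield severity 5, while an unexpected value such as "fatal" would fall back to 1.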

Nested fields are supported, for example "$check.name". If the incoming field does not exist, the literal rule string (for example, "$incomingField") is used as the value.
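
The substitution behavior can be pictured with a small stand-alone JavaScript sketch. This is an approximation for illustration only, not the LAM's actual code: resolvePath and applyRule are hypothetical names, and the sketch does not handle built-in variables such as $LamInstanceName or $moog_now.

```javascript
// Walk a dotted path such as "check.name" through a nested object.
// Returns undefined when any segment is missing.
function resolvePath(event, path) {
    var parts = path.split(".");
    var current = event;
    for (var i = 0; i < parts.length; i++) {
        if (current === null || typeof current !== "object" || !(parts[i] in current)) {
            return undefined;
        }
        current = current[parts[i]];
    }
    return current;
}

// Replace each $field or $nested.field token in a rule with the event value.
// Tokens that cannot be resolved are left as the literal string.
function applyRule(rule, event) {
    var tokenPattern = /\$([A-Za-z_][A-Za-z0-9_]*(?:\.[A-Za-z_][A-Za-z0-9_]*)*)/g;
    return rule.replace(tokenPattern, function (token, path) {
        var value = resolvePath(event, path);
        return value === undefined ? token : String(value);
    });
}

var sampleEvent = {
    host: "lon35sql04",
    check: { name: "database", type: "availability" }
};
```

Under these assumptions, applying the signature rule "$host::$check.name::$check.type" to sampleEvent gives "lon35sql04::database::availability", while "$external_id" stays as the literal string because the sample event has no such field.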

Any unmapped data from the raw event ends up in the catchAll field, which is called "overflow" by default. The overflow is a JSON object that maps the remaining event fields to keys that you can access using the standard JavaScript object functions. If you need this data for further parsing, access it in the LAMbot as shown below. Moogsoft advises using the Bot utility to do this.
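
For orientation, this is roughly what working with such an object looks like using only standard JavaScript. The overflow literal below is a hand-written stand-in containing a few fields from the sample event, and missingFields is a hypothetical helper; in a real LAMbot you would obtain the object through the Bot utility rather than build it yourself.

```javascript
// Stand-in for an overflow object, written by hand for this example
var overflow = {
    event_id: "e3562",
    datacenter: "London",
    app: ["App A", "App B"]
};

// List the names of all fields present in the object
var fieldNames = Object.keys(overflow);

// Hypothetical helper: return the required field names that are absent
function missingFields(obj, required) {
    return required.filter(function (name) {
        return !Object.prototype.hasOwnProperty.call(obj, name);
    });
}

var missing = missingFields(overflow, ["app", "datacenter", "owner"]);
```

Here fieldNames holds the three keys of the stand-in object, and missing contains only "owner", the one required field that is not present.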

Here is an excerpt from a LAMbot that further parses the event payload from above. Note the usage of Bot utility methods to build a custom info base and add fields to it.

Always use botUtil.setCustomInfo to set your custom_info in a LAMbot. The method also nullifies the overflow before the event is dispatched to the Message Bus, which reduces the event size and minimizes size-related issues on the bus.

Also note the botUtil.checkEvent.validateEvent method, which validates the consistency of the event. In the LAMbot JavaScript file:

function presend(event)
{
 
    // Create a base model for custom_info, consistent across all ingestions
    var custom_info = botUtil.createBaseCustomInfo();

    // Default value to be used in case the mapped field does not exist
    var default_value = "unknown";

    // Get the overflow object
    var overflow = botUtil.getOverflow(event);

    // Print the overflow object in the logs at the Info level.
    // Useful during initial setup.
    botUtil.printOverflow(overflow);

    // Check overflow for a list of expected mandatory attributes
    var attributeList = ["app", "datacenter"];
    var boolean_flag = botUtil.checkOverflow(overflow,attributeList);

    // If the mandatory fields are present, tag the event as complete.
    // Otherwise tag it as incomplete so that you can filter on alerts with an incomplete set of data.

    if (boolean_flag) 
    {
        custom_info.eventDetails.complete = true;
    } 
    else 
    {
        custom_info.eventDetails.complete = false;
    }
 
    // Retain app and region from overflow as these will be used during clustering.
    // Use a default value in case the field is missing.

    custom_info.eventDetails.app = overflow.app ? overflow.app : default_value;
    custom_info.eventDetails.region = overflow.region ? overflow.region : default_value;

    // Set custom info

    botUtil.setCustomInfo(event,custom_info);

    // Check the event for any invalid fields, then print the entire payload to the logs

    botUtil.checkEvent.validateEvent(event,botUtil);
    botUtil.printCEvent(event);
 
    return true;
}

Post normalization

After the LAMbot completes data normalization it exits with one of the following options:

  • Return true: Event is sent to the Message Bus on the default events stream

  • Return { stream: "stream_name", passed: true }: Event is sent to the Message Bus on a stream separate from the default one. You must explicitly configure an AlertBuilder to listen to stream_name.

  • Return false: Event is dropped and not sent to the Message Bus (usually implemented when you want to blacklist certain events such as audit events from being processed by Moogsoft Onprem at all).
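
The three outcomes can be sketched as a small decision function whose result a presend function would return. This is a minimal illustration: the name presendResult is hypothetical, and the choice of which event types to drop ("audit") or to route to the separate stream ("availability") is an assumption made for the example.

```javascript
// Name of the separate stream; an AlertBuilder must be configured to listen to it
var SEPARATE_STREAM = "stream_name";

// Hypothetical helper: decide what presend should return for a given event type
function presendResult(eventType) {
    if (eventType === "audit") {
        // Drop the event; it is not sent to the Message Bus
        return false;
    }
    if (eventType === "availability") {
        // Send the event on the separate stream
        return { stream: SEPARATE_STREAM, passed: true };
    }
    // Send the event on the default events stream
    return true;
}
```

Keeping the routing decision in one function makes it easy to see at a glance which events are dropped, which are diverted, and which follow the default path.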

Sending data on a separate stream is useful if you need processing that differs from that of the main AlertBuilder. If you choose to do so, remember to explicitly configure an AlertBuilder to accept the data on the configured stream, for example:

{
    name            : "AlertBuilder_Stream_",
    classname       : "CAlertBuilder",
    run_on_startup  : true,
    moobot          : "AlertBuilder._Stream_.js",
    process_output_of : "Event Workflows",
 
    # metric_path_moolet - a Moolet included in the
    # calculation of the time taken for events to complete
    # their path through the system from initial ingestion
    # through to complete processing.
    #
    metric_path_moolet : true,
 
    # Specify a list of streams to create alerts for. Reference the
    # streams set in the filter section of the LAM configuration.
    # Defaults to the generic event stream.
    # If you are using the Event Workflow Moolet, configure the
    # process_output_of property instead.
    event_streams : [ "stream_name" ],
 
    threads         : 4,
    events_analyser_config  : "events_analyser.conf",
    priming_stream_name         : null,
    priming_stream_from_topic   : false,
    moolet_queue_size_limit: 0
}