Configure Single Recipe Matching to Prioritize Recipes in a Cookbook


What Is Single Recipe Matching?

When you enable single Recipe matching in a Cookbook, the set of Recipes has a priority order. When an event enters a Cookbook, the Cookbook first evaluates it against the top Recipe in the list. If the Recipe is not able to produce a Situation, the Cookbook evaluates the event against the next Recipe in the list. Once the event satisfies the criteria of a Recipe and becomes part of a Situation, there is no subsequent evaluation.

If an event deduplicates into an existing alert that a Recipe has already placed in a Situation, only Recipes with a higher priority than the Recipe that created that Situation evaluate the new event, unless the alert has been closed. In other words, an alert can exist in only one Situation at first, and afterwards it can only join a Situation generated by a higher priority Recipe.
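The rule above can be sketched in a few lines of Python. This is only an illustration of the described behavior, not Moogsoft code: the function name `eligible_recipes` and its parameters are hypothetical, and the sketch assumes Recipes are identified by name and listed highest priority first.

```python
def eligible_recipes(recipe_order, owning_recipe=None, alert_closed=False):
    """Return the Recipes allowed to evaluate an event, highest priority first.

    recipe_order  : Recipe names, highest priority first (hypothetical input).
    owning_recipe : Recipe whose Situation already contains the alert, or None.
    alert_closed  : True if the alert was closed and has reappeared.
    """
    if owning_recipe is None or alert_closed:
        # A new alert, or one that was closed and reappeared, is open to every Recipe.
        return list(recipe_order)
    # Otherwise only Recipes listed before the owning Recipe may evaluate it.
    return recipe_order[:recipe_order.index(owning_recipe)]


order = ["Recipe A", "Recipe B", "Recipe C"]
print(eligible_recipes(order, "Recipe C"))        # ['Recipe A', 'Recipe B']
print(eligible_recipes(order, "Recipe A"))        # []
print(eligible_recipes(order, "Recipe A", True))  # ['Recipe A', 'Recipe B', 'Recipe C']
```

An alert already owned by the top Recipe gets no further evaluation until it is closed, which is why the second call returns an empty list.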

Here is an example:

The Network team wants to group alerts by the level of impact. You can set up multiple Recipes to handle this.

For the "by Server" Recipe, you can set a low alert threshold so that it catches every alert and adds the details to the Situation description. There are additional Recipes for "by Rack" and "by Floor". Any alert added to a "by Server" cluster is also added to candidate clusters for the "by Rack" and "by Floor" Recipes. If another alert with matching rack values arrives within the cook_for time, the Cookbook creates a Situation from the "by Rack" candidate cluster and merges it with the "by Server" Situation. Similarly, if three more alerts arrive with matching floor values, Moogsoft Enterprise promotes the candidate cluster for "by Floor" to a Situation that supersedes any of the "by Rack" or "by Server" Situations.

Events can be processed concurrently by one or more different instances of Cookbook depending on operational requirements. Within each Cookbook an inbound event passes to each Recipe in turn, in priority order, starting with the highest priority Recipe. A Recipe may return 0, 1, or more matching clusters, so-called “Candidate Clusters”. When single Recipe matching is enabled, the Cookbook transforms only the highest scoring candidate cluster into an actionable Situation. Otherwise, if single Recipe matching is disabled, all candidate clusters become Situations.
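As a rough illustration of the difference between the two modes, the following Python sketch selects which candidate clusters become Situations. It is an assumption-laden simplification: the function `promote`, the dictionary keys, and the numeric `score` values are hypothetical placeholders (how a cluster is scored is not described here), and the candidates are assumed to have already met their Recipe's alert threshold.

```python
def promote(candidates, single_recipe_matching):
    """Return the candidate clusters that become Situations.

    candidates: list of dicts with hypothetical keys "recipe", "alerts", "score".
    """
    if not candidates:
        return []
    if single_recipe_matching:
        # Only the highest scoring candidate cluster becomes a Situation.
        return [max(candidates, key=lambda c: c["score"])]
    # With single Recipe matching disabled, every candidate cluster is promoted.
    return list(candidates)


candidates = [
    {"recipe": "by Rack", "alerts": ["a1", "a2"], "score": 0.9},
    {"recipe": "by Server", "alerts": ["a1"], "score": 0.4},
]
print(len(promote(candidates, single_recipe_matching=True)))   # 1
print(len(promote(candidates, single_recipe_matching=False)))  # 2
```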

When To Use Single Recipe Matching

You should only use single Recipe matching if:

  • Recipes target very different types of alerts that do not need to be shared between separate Recipe contexts. In this case, the Recipes should be unrelated. Otherwise, you might end up with a scenario where unrelated alerts merge into the same Situation.

  • The contexts in individual Recipes are supersets of each other in the direction of merging, which is upwards. For example: Application→Site→Country or Server→Rack→Datacenter.

This is demonstrated in the following diagram:

singleRecipe.png

The animation below presents an example of clustering flow within a Cookbook with single Recipe matching enabled.

To understand this example, assume the following:

  • There are two recipes configured within the same Cookbook with the following priority order:

    1. Recipe A

    2. Recipe B

  • All of the arriving alerts are in the scope of both Recipes.

  • Recipe A has an alert threshold set to 4 while Recipe B has it set to 1.

  • A typical scenario that would fit this configuration is the Server→Datacenter context. In this example, the operators want to be notified about problems on individual servers. However, if there is an underlying problem with a wider impact, they want to be alerted about that as well. The Recipes could be set up as follows:

    1. Recipe A clusters alerts based on the same datacenter. Because this clustering has a wider impact, and more alerts are expected if the problem is truly a datacenter issue, the alert threshold needs to be set high enough.

    2. Recipe B clusters alerts based on the same server name and datacenter. Even a single alert might be an indication of a developing problem so we set the alert threshold to 1.

    3. Recipe B, which clusters by the same server name and datacenter, has the more granular context. Anything this Recipe produces is a subset of the context in Recipe A, which clusters by the same datacenter only.
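One way to sanity-check that two Recipes are nested in this way is to compare the fields they match on. The Python sketch below is a hypothetical helper, not part of the product: it assumes every matcher component uses exact matching (similarity 1.0), in which case a Recipe that matches on all of the wider Recipe's fields plus extra ones can only produce clusters that are subsets of the wider Recipe's clusters.

```python
def is_nested(wider_fields, narrower_fields):
    """True if every field the wider Recipe matches on is also used by the narrower one."""
    return set(wider_fields) <= set(narrower_fields)


# Field names as used for the datacenter and server name in this example.
recipe_a = ["custom_info.eventDetails.datacenter"]            # by datacenter
recipe_b = ["source", "custom_info.eventDetails.datacenter"]  # by server and datacenter

print(is_nested(recipe_a, recipe_b))    # True
print(is_nested(recipe_a, ["source"]))  # False
```

The second call returns False because a Recipe that clusters only by server name could group alerts from different datacenters, so its clusters would not be guaranteed subsets of Recipe A's.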

single_recipe_match_flow_complex_finished.gif

The clustering flow is as follows:

  1. Alert_1 reaches the Cookbook, which evaluates it against the top Recipe, Recipe_A. Cookbook creates a candidate cluster containing Alert_1, but because the cluster has not reached the alert threshold of 4 (AT: 4), it is not promoted to a Situation. The candidate cluster remains in memory, however.

  2. Cookbook then evaluates Alert_1 against Recipe_B. Cookbook creates a candidate cluster which is promoted to Situation_1 because it has reached the alert threshold of 1 (AT: 1).

  3. Alert_2 enters the Cookbook and again Cookbook first evaluates it against Recipe_A. Alert_2 joins the existing candidate cluster according to the attribute similarity configured in Recipe_A. The candidate cluster only contains two alerts so it is still unable to produce a Situation.

  4. Cookbook then evaluates Alert_2 against Recipe_B, which already has a candidate cluster. Alert_2 satisfies the attribute similarity configured in Recipe_B (note the same color), so it is added to the candidate cluster and subsequently to the existing Situation_1.

  5. Alert_3 enters the Cookbook and Cookbook evaluates it against Recipe_A. Alert_3 joins the existing candidate cluster according to the attribute similarity configured in Recipe_A. The candidate cluster only contains three alerts so it is still unable to produce a Situation.

  6. Alert_3 reaches Recipe_B and Cookbook evaluates it against the existing candidate cluster. The alert, however, is not similar enough to join the existing candidate cluster so it creates a separate one on its own. Because it reaches the alert threshold of 1, it creates a corresponding Situation_2.

  7. Alert_4 enters the Cookbook and is evaluated against Recipe_A. It joins the candidate cluster and creates Situation_3 as it reaches the alert threshold of 4. Because Alert_4 is now part of a Situation, Cookbook stops evaluating against any lower priority Recipes.

  8. Because Situation_1 and Situation_2 are complete subsets of Situation_3, they become dormant and merge into Situation_3.

In the context of the real-life Server→Datacenter scenario, the initial contexts of Situation_1 and Situation_2 have morphed from "here are the alerts that indicate a problem with this particular server" into "here are the alerts that indicate a wider datacenter problem" in Situation_3.
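The flow above can be reproduced with a small Python simulation. This is a simplified sketch rather than the actual Cookbook implementation: the `Recipe` class, the `run_cookbook` function, and the server and datacenter values are all hypothetical, matching is assumed to be exact on the listed fields, and the cook_for timer and the deduplication rule are ignored.

```python
from dataclasses import dataclass, field


@dataclass
class Recipe:
    name: str
    match_fields: tuple      # alert attributes that must match exactly
    alert_threshold: int
    clusters: dict = field(default_factory=dict)  # match key -> candidate cluster


def run_cookbook(recipes, alerts):
    """Evaluate each alert against the Recipes in priority order."""
    situations = []
    for alert in alerts:
        for recipe in recipes:  # highest priority first
            key = tuple(alert[f] for f in recipe.match_fields)
            cluster = recipe.clusters.setdefault(key, {"alerts": [], "situation": None})
            cluster["alerts"].append(alert["id"])
            if len(cluster["alerts"]) < recipe.alert_threshold:
                continue  # candidate cluster stays in memory; try the next Recipe
            if cluster["situation"] is None:
                cluster["situation"] = {"id": len(situations) + 1, "recipe": recipe.name,
                                        "alerts": set(), "merged_into": None}
                situations.append(cluster["situation"])
            cluster["situation"]["alerts"] = set(cluster["alerts"])
            break  # the alert is now in a Situation: skip lower priority Recipes
        # A Situation that is a complete subset of another one merges into it.
        for small in situations:
            for big in situations:
                if (small["merged_into"] is None and big["merged_into"] is None
                        and small["alerts"] < big["alerts"]):
                    small["merged_into"] = big["id"]
    return situations


recipes = [
    Recipe("Recipe_A", ("datacenter",), alert_threshold=4),
    Recipe("Recipe_B", ("source", "datacenter"), alert_threshold=1),
]
alerts = [
    {"id": "Alert_1", "source": "server1", "datacenter": "dc1"},
    {"id": "Alert_2", "source": "server1", "datacenter": "dc1"},
    {"id": "Alert_3", "source": "server2", "datacenter": "dc1"},
    {"id": "Alert_4", "source": "server3", "datacenter": "dc1"},
]
situations = run_cookbook(recipes, alerts)
for s in situations:
    print(f"Situation_{s['id']}", s["recipe"], sorted(s["alerts"]),
          "merged into", s["merged_into"])
```

With these inputs, Recipe_B produces Situation_1 (Alert_1 and Alert_2) and Situation_2 (Alert_3), and the fourth alert lets Recipe_A produce Situation_3, into which the other two merge. Alert_4 never reaches Recipe_B.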

The following file-based Cookbook configuration matches the Server→Datacenter example above:

{
    name                : "Datacenter",
    classname           : "CCookbook",
    run_on_startup      : true,
    metric_path_moolet  : true,
    moobot              : "Cookbook.js",
    process_output_of   : "MaintenanceWindowManager",
    membership_limit    : 1,
    scale_by_severity   : false,
    entropy_threshold   : 0.0,
    single_recipe_matching : true,
    recipes : [
        {
            chef                : "CValueRecipeV2",
            name                : "by Datacenter",
            description         : "Multiple issues impacted in $UNIQUE(custom_info.eventDetails.datacenter) datacenter. $CLASS(network_datacenter)",
            recipe_alert_threshold : 4,
            exclusion           : null,
            trigger             : null,
            seed_alert          : null,
            rate                : 0,   # Given in events per minute
            min_sample_size     : 5,
            max_sample_size     : 10,
            cluster_match_type  : "first_match",
            matcher : {
                components: [
                    { name: "custom_info.eventDetails.datacenter", similarity: 1.0 }
                ]
            }
        },
        {
            chef                : "CValueRecipeV2",
            name                : "by Server",
            description         : "Issue impacting server $UNIQUE(source) housed in $UNIQUE(custom_info.eventDetails.datacenter) datacenter. $CLASS(network_source)",
            recipe_alert_threshold : 1,
            exclusion           : null,
            trigger             : null,
            seed_alert          : null,
            rate                : 0,   # Given in events per minute
            min_sample_size     : 5,
            max_sample_size     : 10,
            cluster_match_type  : "first_match",
            matcher : {
                components: [
                    { name: "source", similarity: 1.0 },
                    { name: "custom_info.eventDetails.datacenter", similarity: 1.0 }
                ]
            }
        }
    ],
    cook_for            : 20000
}

How To Enable Single Recipe Matching

To enable the feature for a Cookbook configured in the UI, tick the 'First Recipe Match Only' parameter in the Cookbook configuration tab.

firstRecipeMatchUI.png

If you set up your Cookbook via backend file-based configuration, set the single_recipe_matching parameter to true.

In cookbook.conf:

{
    # Moolet
    name                : "DatacenterCookbook",
 
    .......
 
    # Setting single_recipe_matching to true causes the
    # cookbook to treat the recipes as being in an order of
    # priority, based on the order of configuration in this
    # file, highest priority first.
    #
    # Individual alerts may only:
    #
    # a) appear in a single situation generated by a
    # particular recipe, and
    # b) subsequently may only appear in a situation
    # generated by a higher priority recipe.
    #
    # An alert is treated as being a new alert after it has
    # been closed (and reappeared) and is once again
    # available for inclusion in situations generated by any
    # recipe.
    single_recipe_matching : true,
 
    .......