Situation Analysis
Description
The Situation Analysis workflow configuration allows you to add tags to a Situation based on a set of quantitative analysis rules run against the alerts within the Situation. These rules can be simple, for example “70% of the alerts are critical”, or more complex, such as “add a tag if there are 2 different hostnames, at least 20 alerts, and more than 10 disk space full alerts”.
Each analysis method consists of a set of criteria (a rule), with three possible types:
- threshold: a count or percentage threshold, applied to either:
  - alerts sharing the same attributes
  - alert count
- distinct: a count of unique values of an alert field.
- filter: include or exclude alerts from the analysis.
You can define multiple analysis methods with one as the default to use if the analyseSituation action does not define an explicit method. The action adds the relevant tag only if the Situation meets all criteria: if A and B and C then addTag. Because all criteria must be true, the order is unimportant. The action evaluates filter criteria before all others. This ensures that the thresholds are applied only to the fully filtered set of alerts.
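The evaluation order described above can be sketched in a few lines of Python. This is only an illustration of the logic (filters first, then a logical AND over the remaining criteria), not the product's implementation. The alert shape, the function names `criteria_met` and `update_tags`, and the way criteria are passed as callables are all assumptions made for this sketch.

```python
from typing import Callable, Dict, List, Tuple

Alert = Dict[str, object]                           # hypothetical alert shape
Filter = Tuple[Callable[[Alert], bool], bool]       # (predicate, exclude flag)
Check = Callable[[List[Alert], List[Alert]], bool]  # (filtered, all) -> bool


def criteria_met(alerts: List[Alert], filters: List[Filter], checks: List[Check]) -> bool:
    """Apply every filter first, then require all remaining criteria to hold."""
    filtered = alerts
    for predicate, exclude in filters:
        # Exclude filters drop matching alerts; include filters keep only matches.
        filtered = [a for a in filtered if predicate(a) != exclude]
    # Logical AND, so the order of the checks does not affect the result.
    return all(check(filtered, alerts) for check in checks)


def update_tags(tags: List[str], tag: str, met: bool, remove_existing: bool = False) -> List[str]:
    """Add the tag when the criteria are met; optionally remove it otherwise."""
    if met:
        return tags if tag in tags else tags + [tag]
    if remove_existing:
        return [t for t in tags if t != tag]
    return tags


alerts = [
    {"source": "web01", "severity": "critical"},
    {"source": "web02", "severity": "critical"},
    {"source": "db01", "severity": "minor"},
]
filters = [(lambda a: a["severity"] == "minor", True)]  # exclude minor alerts
checks = [
    lambda filtered, _all: len(filtered) >= 2,                          # alert count
    lambda filtered, _all: len({a["source"] for a in filtered}) >= 2,   # distinct sources
]
met = criteria_met(alerts, filters, checks)
tags = update_tags([], "Urgent", met)
```

Each check receives both the filtered list and the full list, so a check can choose which one to measure against. The `remove_existing` argument mirrors the "Remove existing tag" setting described below.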
In addition to setting and removing tags, the analysis can also add and remove Situation flags that match those tags. This allows external and internal processes to retrieve Situations with specific tags using the getSituationsWithFlag endpoint in the Graze API.
Configuration settings
An analysis method consists of the following configuration items:
Setting | Type | Required | Description |
---|---|---|---|
Tags field | string - global | yes | A JavaScript array that defines where the Situation tags will be set. The default is custom_info.tags. |
Method name | string | yes | A unique name for the analysis method. |
Tag | string | yes | The tag that will be added if all criteria are met. |
Set as Default | Boolean | no | Whether to run this method when the action does not specify a method. All default analysis methods are run. |
Remove existing tag | Boolean | no | If the criteria are not met and the tag was previously set, remove it. |
Add tag as Situation flag | Boolean | no | Sets or removes Situation flags to match any tags. |
Criteria | group | yes | The list of criteria within this analysis method. |
Criteria type | choice | yes | The criteria type. See the following sections: |
Distinct criteria settings
Setting | Type | Required | Description |
---|---|---|---|
Distinct operator | choice | yes | The calculation, i.e. are there more than X distinct values of Y. |
Number of distinct values | number | yes | The threshold of distinct values. |
Attribute name | text | yes | The alert field (such as source) to check. |
Filter criteria settings
Setting | Type | Required | Description |
---|---|---|---|
Exclude alert | Boolean | yes | Whether alerts matching the filter are excluded rather than included. If selected (checked), any alert matching the filter is not evaluated against the distinct or threshold criteria. |
Filter | text | yes | A valid SQL-like alert filter, e.g. description matches 'jvm_stall' |
Threshold criteria settings
Setting | Type | Required | Description |
---|---|---|---|
Threshold | number | yes | The absolute or percentage value (see the Percentage and Threshold type settings). |
Threshold type | choice | yes | The type of threshold: either the number of alerts (as a value or percentage), or the number or percentage of alerts sharing the same value for a group of attributes. Note: If a threshold type of “alert count” is used with a percentage, the percentage is calculated against the total number of alerts in the Situation, not the filtered set. |
List of attributes | list | yes if “same value” was used | The list of attributes to check. For example, to check whether 70% of alerts share the same source and type, add both “source” and “type” as attributes. |
Percentage | Boolean | yes | Specifies whether the threshold is an absolute number or a percentage of all member alerts. |
Simple example
If a Situation has more than 50% of alerts with a severity of critical, add the tag “Urgent” to the tags.
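One possible reading of this rule is a filter criterion that keeps only critical alerts, combined with an "alert count" threshold of 50 with Percentage selected. As noted in the threshold settings, that percentage is calculated against all alerts in the Situation. The Python sketch below only illustrates the arithmetic. The alert fields, the helper names `percent_matching` and `is_urgent`, and the strict "more than" comparison are assumptions.

```python
def percent_matching(alerts, predicate):
    """Percentage of all alerts in the Situation that satisfy the predicate."""
    if not alerts:
        return 0.0
    matching = [a for a in alerts if predicate(a)]
    return 100.0 * len(matching) / len(alerts)


def is_urgent(alerts, threshold_percent=50.0):
    # "More than 50%" is read here as a strict comparison (assumption).
    critical = percent_matching(alerts, lambda a: a.get("severity") == "critical")
    return critical > threshold_percent


# Hypothetical Situation with four member alerts, three of them critical.
situation_alerts = [
    {"alert_id": 1, "severity": "critical"},
    {"alert_id": 2, "severity": "critical"},
    {"alert_id": 3, "severity": "critical"},
    {"alert_id": 4, "severity": "major"},
]

tags = []
if is_urgent(situation_alerts):
    tags.append("Urgent")
```

Under this reading, a Situation in which exactly half of the alerts are critical does not receive the tag.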
Complex example
Add the tag JVM_STALL to the Situation if there are more than 5 critical jvm_stall alerts from 3 different hosts within the same application.
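This rule could be broken down into one filter criterion (critical jvm_stall alerts), one alert count threshold (more than 5), and two distinct criteria (hosts and application). The Python sketch below shows that decomposition under several assumptions: the host is taken from the `source` field, `application` is a hypothetical alert field, "3 different hosts" is read as at least 3, and "the same application" is read as exactly one distinct value.

```python
import re


def jvm_stall_tag_applies(alerts):
    """Return True when the JVM_STALL tag should be added (illustrative only)."""
    # Filter criterion: keep only critical alerts whose description matches jvm_stall.
    filtered = [
        a for a in alerts
        if a.get("severity") == "critical"
        and re.search("jvm_stall", a.get("description", ""))
    ]
    # Threshold criterion: more than 5 alerts remain after filtering.
    enough_alerts = len(filtered) > 5
    # Distinct criteria: at least 3 hosts, exactly 1 application.
    hosts = {a.get("source") for a in filtered}
    applications = {a.get("application") for a in filtered}
    return enough_alerts and len(hosts) >= 3 and len(applications) == 1


# Hypothetical Situation: six matching alerts spread over three hosts.
alerts = [
    {
        "source": f"host{i % 3}",
        "severity": "critical",
        "description": "jvm_stall detected",
        "application": "billing",
    }
    for i in range(6)
]

tags = ["JVM_STALL"] if jvm_stall_tag_applies(alerts) else []
```

Because every criterion must hold, removing one alert or moving one alert to a different application is enough to stop the tag from being added.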