Moogsoft Docs

Configure the AWS CloudWatch LAM

CloudWatch is the monitoring tool for Amazon Web Services (AWS), its applications and other cloud resources. AWS CloudWatch is useful for tracking metrics, collecting log files, setting alarms, and reacting to changes in your AWS resources. It monitors resources including Amazon EC2 instances, Amazon DynamoDB tables, and Amazon RDS DB instances.

See AWS CloudWatch for UI configuration instructions.

The AWS integration fetches alarms and events from AWS CloudWatch. The workflow for gathering alarms and events from AWS and publishing them to Moogsoft AIOps is as follows:

  1. The AWS LAM reads the configuration from the aws_lam.conf file.

  2. The AWS LAM reads the AWS credentials and region from the config file and requests alarms/events from Amazon Web Services.

  3. The AWS LAM parses the received alarms/events, converts them into a map, and submits the map to the Event Factory.

  4. The events are parsed and converted into normalized Moogsoft AIOps events.

  5. The normalized events are then published to the MooMS bus.
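The steps above can be pictured as one polling cycle. The following is only a minimal sketch: the function name run_poll_cycle and its fetch, normalize, and publish callables are hypothetical stand-ins for the AWS request, the Event Factory, and the MooMS bus, and are not part of the LAM's actual API.

```python
from typing import Callable, Dict, Iterable


def run_poll_cycle(
    config: Dict[str, object],
    fetch: Callable[[Dict[str, object]], Iterable[Dict[str, object]]],
    normalize: Callable[[Dict[str, object]], Dict[str, object]],
    publish: Callable[[Dict[str, object]], None],
) -> int:
    """Run one fetch -> normalize -> publish cycle and return the event count.

    All three callables are hypothetical placeholders for illustration.
    """
    count = 0
    for raw in fetch(config):        # step 2: request alarms/events from AWS
        as_map = dict(raw)           # step 3: convert the payload into a map
        event = normalize(as_map)    # step 4: build a normalized event
        publish(event)               # step 5: publish the event to the bus
        count += 1
    return count
```

In this sketch, step 1 (reading aws_lam.conf) is assumed to have happened already and its result is passed in as config.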

Configuration

The alarms/events received from AWS are processed according to the configuration in the aws_lam.conf file. The processed alarms are published to Moogsoft AIOps.

The configuration file contains a JSON object. The top level of the object has a parameter called config, and the object that follows config holds all the information needed to control the LAM.

Monitor

The AWS LAM takes alarm and event data from AWS CloudWatch. Configure the following parameters to establish a connection with AWS:

General

Field

Type

Description

name and class

String

Reserved fields: do not change. Default values are AWS Monitor and CAwsMonitor.

access_key_id

String

Enter the Access Key ID received at the time of creating the AWS account.

encrypted_access_key_id

String

If the access key ID is encrypted, enter the encrypted access key ID in this field and comment out the access_key_id field. Use either access_key_id or encrypted_access_key_id. If both fields are left uncommented, only encrypted_access_key_id is used.

secret_access_key

String

Enter the Secret Access key received at the time of creating the AWS account.

encrypted_secret_access_key

String

If the secret access key is encrypted, enter the encrypted secret access key in this field and comment out the secret_access_key field. Use either secret_access_key or encrypted_secret_access_key. If both fields are left uncommented, only encrypted_secret_access_key is used.

enable_proxy

Boolean

Set to true to use a proxy for communication with AWS.

proxy_host

String

Enter the host name or the URL of the proxy. This field will only work if enable_proxy is set to true.

proxy_port

Integer

Enter the port of the proxy. This field will only work if enable_proxy is set to true.

proxy_userid

String

Enter the username of the user who has the rights to access the proxy. This field will be active only when the enable_proxy is set to true.

proxy_password

String

Enter the password of the user whose user name is given in the proxy_userid field. This field is active if enable_proxy is set to true.

encrypted_proxy_password

String

If the proxy password is encrypted, enter the encrypted password in this field and comment out the proxy_password field. Use either proxy_password or encrypted_proxy_password. If both fields are left uncommented, only encrypted_proxy_password is used.

polling_interval

Integer

The interval, in seconds, between successive requests that fetch event data from AWS.

Default = 60 seconds. If the specified value is less than 1, polling_interval is set to 60 seconds.

max_retries

Integer

The maximum number of attempts to reconnect with the AWS server after a connection failure.

Default = -1. If no value is specified, the LAM retries indefinitely.

If the specified value is greater than 0, the LAM tries to reconnect that many times. Any value less than 0 is treated as the default.

retry_interval

Integer

The time interval between two successive retry attempts.

Default = 60 seconds. If the specified value is less than 1, retry_interval is set to 60 seconds.

retry_recovery

Object

Specifies the behavior of the LAM when it re-establishes a connection after a failure.

- recovery_interval: Length of time to wait between recovery requests in seconds. Must be less than the request_interval set for each target. Defaults to 20.

- max_lookback: The period of time for which to recover missed events in seconds. Defaults to -1 (recover all events since the last successful poll).

timeout

Integer

The timeout value, in seconds, applied to connections, sockets, and requests. If no value is specified, the timeout is set to 120 seconds.

exclude_protected_regions

Boolean

When set to true, US Government and Chinese regions are excluded when "aws_all_regions" is used in either the alarms or events filter. By default, all regions are included.
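The fallback rules described above for polling_interval, max_retries, retry_interval, and timeout can be summarized in a short sketch. The function name normalize_monitor_settings is hypothetical, and the handling of max_retries = 0 is an assumption, because the description only covers values greater than 0 and less than 0.

```python
def normalize_monitor_settings(raw: dict) -> dict:
    """Apply the documented fallbacks to the timing fields (illustrative only)."""
    polling = raw.get("polling_interval", 60)
    if polling < 1:                  # values below 1 fall back to 60 seconds
        polling = 60

    retry_interval = raw.get("retry_interval", 60)
    if retry_interval < 1:           # values below 1 fall back to 60 seconds
        retry_interval = 60

    max_retries = raw.get("max_retries", -1)
    # Assumption: 0 is treated like a negative value; the source does not say.
    if max_retries <= 0:
        max_retries = -1             # -1 means retry indefinitely

    timeout = raw.get("timeout", 120)  # 120 seconds when no value is given

    return {
        "polling_interval": polling,
        "max_retries": max_retries,
        "retry_interval": retry_interval,
        "timeout": timeout,
    }
```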

Note

Below are the minimum access levels required for a user to retrieve data from AWS:

  • AmazonEC2ReadOnlyAccess

  • CloudWatchLogsReadOnlyAccess

  • CloudWatchReadOnlyAccess
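For the credential fields in the General table (access_key_id, secret_access_key, and proxy_password), the rule that the encrypted variant wins when both are present can be sketched as follows. The helper pick_credential is hypothetical, and the sketch assumes that a commented-out field is simply absent from the parsed configuration.

```python
from typing import Optional, Tuple


def pick_credential(config: dict, plain_key: str) -> Optional[Tuple[str, str]]:
    """Return ("encrypted" | "plain", value) for a credential field, or None.

    Hypothetical helper: the encrypted_<name> field takes precedence
    over <name> when both are present.
    """
    encrypted_key = "encrypted_" + plain_key
    if config.get(encrypted_key) is not None:
        return ("encrypted", config[encrypted_key])
    if config.get(plain_key) is not None:
        return ("plain", config[plain_key])
    return None
```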

Secure Sockets Layer

Field

Type

Description

ssl

Boolean

Set to true to enable SSL communication:

  • ssl_keystore_file_path: Enter the path of the keystore file. This is the path where the generated keystore file is copied e.g. "/usr/local/aws_ssl/keystore.jks".

  • ssl_keystore_password: Enter the password of keystore. It is the same password that was entered when the keystore was generated.

Filter

Field

Sub Field

Type

Description

filter

alarms

Object

Alarms are fetched from the regions described in the alarms filter. See the example for more information.

You can filter the alarms from the regions added to the alarms field.

Each region has 2 parameters:

    • alarm_name_prefix: Enter the alarm name prefix.

    • alarms_to_monitor: Enter the name of the alarms.

The alarms filter is applied to the alarms received from AWS CloudWatch on a per-region basis. The alarm_name_prefix parameter filters alarms by the prefix of the alarm name. For example, if "test" is entered, all alarms whose names start with "test" are selected and sent to Moogsoft AIOps.

In alarms_to_monitor, the alarm name is given, for example "alarm1". Only the alarms with the alarm name entered here are sent to Moogsoft AIOps. You can also provide multiple alarm names separated by commas, for example "alarm1","alarm2".

Note

If neither filter is provided, all alarms from the AWS account are forwarded to Moogsoft AIOps. Only one filter is used at a time: either alarm_name_prefix or alarms_to_monitor.

If no configuration is present in the filter section, the LAM does not fetch alarms from any region.

If you want to fetch alarms from all regions, leave the "aws_all_regions" block uncommented. You may specify filter parameters in this block to apply the filter(s) to all regions.
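The alarm filter behavior described above can be illustrated with a small sketch. The function filter_alarms is hypothetical, and which filter wins when both are set is an assumption, since the text only states that one filter is used at a time.

```python
from typing import Iterable, List, Optional


def filter_alarms(
    alarm_names: Iterable[str],
    alarm_name_prefix: Optional[str] = None,
    alarms_to_monitor: Optional[List[str]] = None,
) -> List[str]:
    """Select alarm names for one region (illustrative only)."""
    names = list(alarm_names)
    # Assumption: an explicit name list takes priority over the prefix.
    if alarms_to_monitor:
        wanted = set(alarms_to_monitor)
        return [name for name in names if name in wanted]
    if alarm_name_prefix:
        return [name for name in names if name.startswith(alarm_name_prefix)]
    # Neither filter set: every alarm is forwarded.
    return names
```

For example, with the alarm names "test-cpu", "alarm1", and "prod-disk", a prefix of "test" selects only "test-cpu", while alarms_to_monitor set to ["alarm1"] selects only "alarm1".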

events

Object

Events are fetched from the regions described in the events filter. See the example for more information.

You can filter the events from the regions added to the events field.

Each region has 2 parameters:

    • filter_pattern: Enter the filter pattern.

    • log_group_to_monitor: Enter the log group to monitor.

Only the events that are logged in the log group given in the log_group_to_monitor field, and that match the pattern entered in the filter_pattern field, are forwarded to Moogsoft AIOps. For example, if the log group /aws/lambda/SomethingHappened contains events with the word "scheduled" in them, enter "scheduled" in the filter_pattern field and /aws/lambda/SomethingHappened in the log_group_to_monitor field to select those events.

Note

If no filter is provided for a region, all the events from that region are sent to the LAM.

If no configuration is present in the filter section, then LAM will not fetch events from any region.
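The event selection described above can be sketched as follows. This is a simplification: the function select_events and the event dictionary shape (log_group and message keys) are hypothetical, and filter_pattern is treated as a plain substring, whereas the actual CloudWatch Logs filter pattern syntax is richer.

```python
from typing import Dict, Iterable, List, Optional


def select_events(
    events: Iterable[Dict[str, str]],
    log_group_to_monitor: List[str],
    filter_pattern: Optional[str] = None,
) -> List[Dict[str, str]]:
    """Keep events from the monitored log groups that contain the pattern."""
    groups = set(log_group_to_monitor)
    selected = []
    for event in events:
        # An empty group list means no restriction on log group.
        if groups and event["log_group"] not in groups:
            continue
        # Assumption: the pattern is matched as a simple substring.
        if filter_pattern and filter_pattern not in event["message"]:
            continue
        selected.append(event)
    return selected
```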

Note

If alarms or events are not to be filtered, comment out the complete filter section of the config file. If only alarms are to be filtered, comment out the events section, or vice versa.
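How the region keys of an alarms or events block interact with exclude_protected_regions can be pictured with the sketch below. The function resolve_regions and the known_regions argument are hypothetical. The prefixes "us-gov-" and "cn-" are the standard AWS naming for US Government and Chinese regions, and behavior when "aws_all_regions" appears alongside named regions is not covered here.

```python
from typing import Dict, Iterable, List

# AWS region name prefixes for US Government and Chinese regions.
PROTECTED_PREFIXES = ("us-gov-", "cn-")


def resolve_regions(
    filter_block: Dict[str, dict],
    known_regions: Iterable[str],
    exclude_protected_regions: bool = False,
) -> List[str]:
    """Return the regions to poll for one alarms/events block (illustrative)."""
    if "aws_all_regions" in filter_block:
        regions = list(known_regions)
        if exclude_protected_regions:
            regions = [r for r in regions if not r.startswith(PROTECTED_PREFIXES)]
        return regions
    # Only the explicitly listed regions; an empty block yields no regions.
    return list(filter_block)
```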

Note

The LAM starts fetching events from the current time. After that, it saves the last poll time (in epoch format) in the state file. The state file is generated in the same folder as the config file, e.g. $MOOGSOFT_HOME/config. The LAM names the state file <proc_name>.state. The default proc_name (process name) is aws_lam, so the state file name is aws_lam.state. proc_name is defined in the aws_lam.sh file located at $MOOGSOFT_HOME/bin.

It is recommended not to make any changes to the state file as this may lead to loss of events.
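The naming rule above can be expressed as a short helper for locating the state file, for example when inspecting it read-only. The function state_file_path is hypothetical, and the directory follows the $MOOGSOFT_HOME/config example given above.

```python
import os
import posixpath
from typing import Optional


def state_file_path(proc_name: str = "aws_lam", moogsoft_home: Optional[str] = None) -> str:
    """Build <MOOGSOFT_HOME>/config/<proc_name>.state (illustrative only)."""
    # Fall back to the MOOGSOFT_HOME environment variable when no path is given.
    home = moogsoft_home if moogsoft_home is not None else os.environ.get("MOOGSOFT_HOME", "")
    return posixpath.join(home, "config", proc_name + ".state")
```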

Note

The LAM can fetch alarms from multiple regions. The state file holds a separate timestamp for each of the 15 alarm regions, and for logs one common timestamp that is used to fetch events from all the applicable regions. For example:

{"alarms":{"ap-south-1":1509610912603,"eu-west-3":1509610912603,"eu-west-2":1509610912603,"eu-west-1":1509610912603,"ap-northeast-2":1509610912603,"ap-northeast-1":1509610912603,"ca-central-1":1509610912603,"sa-east-1":1509610912603,"ap-southeast-1":1509610912603,"ap-southeast-2":1509610912603,"eu-central-1":1509610912603,"us-east-1":1509610912603,"us-east-2":1509610912603,"us-west-1":1509610912603,"us-west-2":1509610912603},"logevent":1509610854792}
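To inspect a state file without modifying it, the timestamps can be decoded as in the sketch below. The function read_state is hypothetical, and the sketch assumes the 13-digit values are epoch milliseconds, which the text does not state explicitly.

```python
import json
from datetime import datetime, timezone
from typing import Dict, Optional, Tuple


def read_state(text: str) -> Tuple[Dict[str, datetime], Optional[datetime]]:
    """Decode per-region alarm timestamps and the shared log timestamp."""

    def to_utc(ms: int) -> datetime:
        # Assumption: values are milliseconds since the Unix epoch.
        return datetime.fromtimestamp(ms / 1000, tz=timezone.utc)

    state = json.loads(text)
    alarms = {region: to_utc(ms) for region, ms in state.get("alarms", {}).items()}
    log_ms = state.get("logevent")
    return alarms, (to_utc(log_ms) if log_ms is not None else None)
```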

Example

monitor:
{
name                                    : "AWS Monitor",
class                                   : "CAwsMonitor",
role_arn                        : "",
role_session_validity           : 3600,
access_key_id                   : "",
#encrypted_access_key_id        : "",            
secret_access_key               : "",
#encrypted_secret_access_key    : "",
enable_proxy                    : false,
proxy_host                      : "localhost",
proxy_port                      : 8080,
proxy_userid                    : "userid",
proxy_password                  : "password",
#encrypted_proxy_password       : "",
exclude_protected_regions       : true,
filter:
        {
                alarms:
                {
                        "aws_all_regions":
                        {
                                #alarm_name_prefix              : "",
                                alarms_to_monitor       : []
                        },
                        "us-west-2":
                        {
                                #alarm_name_prefix      : "",
                                alarms_to_monitor       :[]
                        },
                        "ap-south-1":
                        {
                                #alarm_name_prefix      : "",
                                alarms_to_monitor       : []
                        }
                },
                events:
                {
                        "aws_all_regions":
                        {
                                #filter_pattern         : "",
                                log_group_to_monitor    : []
                        },
                        "us-west-2":
                        {
                                #filter_pattern         :"",
                                log_group_to_monitor    :[]
                        },
                        "ap-south-1":
                        {
                                #filter_pattern         :"",
                                log_group_to_monitor    :[]
                        }
                }
        },
polling_interval                                : 60,
max_retries                                     : -1,
retry_interval                                  : 60,
retry_recovery:
        {
                recovery_interval               : 20,
                max_lookback                    : -1
        },
timeout                                                 : 120
},

Agent and Process Log

Agent and Process Log allow you to define the following properties:

  • name: Identifies events the LAM sends to the Message Bus.

  • capture_log: Name and location of the LAM's capture log file.

  • configuration_file: Name and location of the LAM's process log configuration file.

Mapping

You can directly map the AWS alarm/event fields to the fields displayed in Moogsoft AIOps. The mapping example is as follows:

mapping :
        {
            catchAll: "overflow",
            rules:
            [
                { name: "signature", rule:      "" },
                { name: "source_id", rule:      "" },
                { name: "external_id", rule:    "" },
                { name: "manager", rule:        "AWS Cloudwatch" },
                { name: "source", rule:         "" },
                { name: "class", rule:          "$class" },
                { name: "agent", rule:          "$LamInstanceName" },
                { name: "agent_location", rule: "" },
                { name: "type", rule:           "" },
                { name: "severity", rule:       "" },
                { name: "description", rule:    "" },
                { name: "agent_time", rule:     "" }
            ]
        },
        filter:
        {
            presend: "AwsLam.js"
        }

The above example specifies the mapping of the AWS alarm fields with the Moogsoft AIOps fields. Data not mapped to fields goes into "Custom Info".
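The effect of the mapping rules and the catchAll field can be approximated as follows. This is a simplified sketch, not the LAM's actual mapping engine: the function apply_mapping is hypothetical, "$name" references are resolved only against the incoming map, and unmapped fields are collected under the catchAll name.

```python
import re
from typing import Dict, List


def apply_mapping(raw: Dict[str, object], rules: List[Dict[str, str]],
                  catch_all: str = "overflow") -> Dict[str, object]:
    """Build an event from mapping rules (illustrative only)."""
    used = set()

    def substitute(match: "re.Match") -> str:
        key = match.group(1)
        used.add(key)                      # remember which input fields were mapped
        return str(raw.get(key, ""))

    event: Dict[str, object] = {}
    for rule in rules:
        # Replace each $name reference with the matching input value.
        event[rule["name"]] = re.sub(r"\$(\w+)", substitute, rule["rule"])
    # Fields that no rule referenced go into the catch-all field.
    event[catch_all] = {k: v for k, v in raw.items() if k not in used}
    return event
```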

Note

The signature field is used by the LAM to identify correlated alarms.

Constants and Conversions

Constants and Conversions allow you to convert the formats of the received data.

Field

Description

Example

severity and sevConverter

The severity field has a conversion defined as sevConverter in the conversions section. It looks up the incoming severity value in the severity section of constants and returns the corresponding integer.

severity:
{
 "CLEAR"        : 0,
 "INDETERMINATE": 1,
 "WARNING"      : 2,
 "MINOR"        : 3,
 "MAJOR"        : 4,
 "CRITICAL"     : 5
}, 
sevConverter:
{
    lookup : "severity",
    input  : "STRING",
    output : "INTEGER"
},

stringToInt

Used in a conversion to force the system to turn a string token into an integer value.

stringToInt:
{
    input  : "STRING",
    output : "INTEGER"
},

Example

Constants and Conversions

constants:
        {
            severity:
            {
                "CLEAR"             : 0,
                "INDETERMINATE"     : 1,
                "WARNING"           : 2,
                "MINOR"             : 3,
                "MAJOR"             : 4,
                "CRITICAL"          : 5
            }
        },
        conversions:
        {
            sevConverter:
            {
                lookup : "severity",
                input  : "STRING",
                output : "INTEGER"
            },

            stringToInt:
            {
                input  : "STRING",
                output : "INTEGER"
            }
        },

Severity Reference

Moogsoft AIOps Severity Levels

severity:
        {
            "CLEAR"         : 0,
            "INDETERMINATE" : 1,
            "WARNING"       : 2,
            "MINOR"         : 3,
            "MAJOR"         : 4,
            "CRITICAL"      : 5
        }

Level    Description

0        Clear
1        Indeterminate
2        Warning
3        Minor
4        Major
5        Critical
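The sevConverter lookup and the stringToInt conversion described earlier amount to the following. The function names are hypothetical, and falling back to Indeterminate (1) for an unknown severity string is an assumption, not documented behavior.

```python
from typing import Dict

# Severity constants as listed in the Severity Reference.
SEVERITY: Dict[str, int] = {
    "CLEAR": 0,
    "INDETERMINATE": 1,
    "WARNING": 2,
    "MINOR": 3,
    "MAJOR": 4,
    "CRITICAL": 5,
}


def sev_converter(value: str, default: int = 1) -> int:
    """Look up a severity string and return its integer level."""
    # Assumption: unknown strings map to Indeterminate.
    return SEVERITY.get(value, default)


def string_to_int(value: str) -> int:
    """Turn a string token into an integer value."""
    return int(value)
```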

Service Operation Reference

Process Name    Service Name

aws_lam         awslamd

Start the LAM Service:

service awslamd start

Stop the LAM Service:

service awslamd stop

Check the LAM Service status:

service awslamd status