# Moogsoft Docs

### High Availability

High Availability (HA) deployments of Moogsoft AIOps comprise multiple instances of Moogsoft AIOps, Moogfarmd, and the associated LAMs to minimize downtime and data loss. Component redundancy protects against single points of failure. It also provides reliable mechanisms to enable failover from one component to another to avoid performance degradation and data loss.

See HA - Deployment Scenarios for deployment examples.

#### HA Components

Moogsoft AIOps is made up of the following processes and services, which can be implemented in a distributed environment:

• Event ingestion (LAMs)

• Event processing (Moogfarmd)

• User interface (Nginx, servlets running in Apache Tomcat and Elasticsearch)

• RabbitMQ broker (Messaging system)

• Database (MySQL)

See HA - Setup for Dependencies for more information on how to set these up for distributed installations.

Introducing component redundancy (for example two identically configured event processing (Moogfarmd) components) makes HA system architecture possible. Implementing HA system architecture enables failover of Moogsoft AIOps components without loss of data or performance.

Failover of Moogsoft AIOps components is triggered manually using the ha_cntl command line utility, which also shows the status of all components in the HA installation.

###### HA Features at a Glance
• LAMs, Moogfarmd and Tomcat servlets can run in 'Active' mode (normal operation) or 'Passive' mode (not processing messages).

• Instance/Process Group/Cluster naming convention for LAMs, Moogfarmd and Tomcat servlets to build logical groupings for failover scenarios.

• ha_cntl utility to show running HA components and to trigger manual failovers (by setting Instances/Process Groups/Clusters to Active or Passive).

• 'Leader' capability to allow only defined Instances to become Active when their parent Cluster/Process Group becomes Active.

• Moolet state sharing ability (persistence) for Moogfarmd to facilitate data integrity during failover of Moogfarmd.

• MySQL 'failover' connection definition to allow Moogsoft AIOps components to failover to backup MySQL servers if the primary connection goes down.

• Handling for the UI to continue normal operation in the event of a UI failover.

• Self Monitoring pages in UI and moog_monitor command line utility show HA information.

• Product installation using split RPMs (by functional component) for easier distributed deployment.

###### Moogsoft AIOps HA architecture key concepts

| Concept | Description |
| --- | --- |
| Component | An instance of a LAM, servlet or Moogfarmd. HA introduces redundancy at the component, Process Group or Cluster level. |
| Instance | A name for each Moogsoft AIOps component. |
| Process Group | A group of one or more of the same type of Moogsoft AIOps components (such as a group of load sharing socket LAMs). All Moogfarmd components in a Process Group must have identical configuration. |
| Cluster | One or more Process Groups. A Cluster must contain at least one Process Group. |
| Zone | Any number of Clusters, Process Groups and Instances can be defined within a single messaging 'Zone' (RabbitMQ broker vhost). Failover actions for those Clusters/Process Groups/Instances are limited to that Zone. |

##### Architecture

Instances are individual components that run on a single machine. Process Groups and Clusters, however, can span multiple machines. Their configuration allows the flexibility to define architectural groupings for failover actions as long as they are within the same MooMS Zone:

| Component | Description |
| --- | --- |
| Active / Passive mode | Instances are configured to operate in Active or Passive mode. Instances that are operational are set to Active mode. Instances that are backup (redundant components) operate in Passive mode. |
| Leader | Defines which Instance in a Process Group becomes Active when the whole Process Group switches from Passive to Active state. Normally, only one Instance per Process Group should be defined as a Leader. |

Leader definition availability is as follows:

| Component | Leader definition |
| --- | --- |
| Moogfarmd | Mandatory for a Process Group with more than one Moogfarmd |
| Socket LAM, Logfile LAM, Trapd LAM | Optional |
| REST LAM, UI servlets | Not applicable |

Leadership status is a property of Process Groups. There are two states of group leadership status, as seen in the output of the ha_cntl --view command (see below):

| State | Description |
| --- | --- |
| "only leader should be active" | This is the default setting for components where Leader definition is supported (Moogfarmd, Socket LAM, Logfile LAM, Trapd LAM). |
| "no leader - all can be active" | This is the default setting for components where Leader definition is not supported (REST LAM and the UI servlets), or if one or more Instances in the Process Group are configured with only_leader_active set to false. The behavior is dynamic: terminating such an Instance changes the Process Group's status back to "only leader should be active". |
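The rule for the two leadership states can be expressed as a small function. The following is only an illustrative sketch: the `Instance` class, the `NO_LEADER_TYPES` set and the component type labels are hypothetical names chosen for this example and are not part of Moogsoft AIOps.

```python
from dataclasses import dataclass

# Component types for which Leader definition is described as "not applicable".
# The string labels are assumptions made for this sketch.
NO_LEADER_TYPES = {"rest_lam", "servlets"}


@dataclass
class Instance:
    name: str
    component_type: str               # e.g. "moog_farmd", "socket_lam", "rest_lam"
    only_leader_active: bool = True   # documented default where the setting is supported


def group_leadership_status(running_instances):
    """Return the leadership status for the running Instances of one Process Group."""
    for inst in running_instances:
        # One Instance with only_leader_active = false (or an unsupported
        # component type) is enough to switch the whole group.
        if inst.component_type in NO_LEADER_TYPES or not inst.only_leader_active:
            return "no leader - all can be active"
    return "only leader should be active"


group = [
    Instance("SOCK1", "socket_lam"),
    Instance("SOCK2", "socket_lam", only_leader_active=False),
]
print(group_leadership_status(group))      # no leader - all can be active
print(group_leadership_status(group[:1]))  # only leader should be active
```

Dropping SOCK2 from the list mirrors the dynamic behavior described above: once the Instance with only_leader_active set to false terminates, the status reverts.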

All of the above are defined when starting each component. If any of the values are not explicitly defined as parameters at startup, values are taken from the component configuration file (or if not defined there, values in system.conf are used).
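The precedence described above (command line first, then the component configuration file, then system.conf) can be sketched as a simple lookup. This is a minimal illustration that assumes each source has already been parsed into a dictionary; `resolve_ha_setting` is a hypothetical helper, not a Moogsoft API.

```python
def resolve_ha_setting(name, cli_args, component_conf, system_conf):
    """Return the first value found for `name`, checking sources in precedence order."""
    sources = (
        cli_args,                      # highest precedence: startup parameters
        component_conf.get("ha", {}),  # then the component's own ha section
        system_conf.get("ha", {}),     # finally the ha section of system.conf
    )
    for source in sources:
        if source.get(name) is not None:
            return source[name]
    return None


cli = {"instance": "MASTER", "mode": "active"}
farmd_conf = {"ha": {"cluster": "NY", "group": "moog_farmd"}}
system_conf = {"ha": {"cluster": "LON"}}

print(resolve_ha_setting("cluster", cli, farmd_conf, system_conf))  # NY
print(resolve_ha_setting("mode", cli, farmd_conf, system_conf))     # active
```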

Example startup command:

    $MOOGSOFT_HOME/bin/moog_farmd --cluster Surbiton --instance MASTER --leader yes --mode active

The above creates an Instance of moog_farmd and defines it as a member of the Surbiton Cluster, with the Instance name MASTER. It also defines it as the Leader Instance in its Process Group and configures it to operate in Active mode. No Process Group (--group) is defined, so the default name (from the component configuration file) moog_farmd is used.

#### HA Configuration

The information below describes how to configure Moogsoft AIOps components for an HA architecture.

Default Cluster

File: $MOOGSOFT_HOME/config/system.conf

Section: ha section, cluster property

Example:

    "ha":
    {
        "cluster": "NY"
    }

Description: The default name of the Cluster. It can be overwritten in a component's configuration file or on the command line.

LAMs

File: LAMs configuration file

Section: ha section

Example:

    ha:
    {
        cluster: "NY",
        group: "socket_lam",
        accept_conn_when_passive: true
    }

| Property | Description |
| --- | --- |
| cluster | The name of the Cluster. This supersedes anything set in system.conf (can also be overwritten by the command line). |
| group | The name of the Process Group. This defaults to the LAM process name if no value is specified (for example socket_lam). |
| default_leader | A Boolean, indicating if the LAM is the Leader within its Process Group (see above). The default value is true if not specified. |
| only_leader_active | A Boolean that changes the type of Process Group from a Leader Only group to a Process Group where more than one process can be Active. The default is true, except for the REST LAM where it is not supported and it is always treated as false. |
| accept_conn_when_passive | A Boolean instructing the LAM what to do in Passive mode. If true (or not set), the LAM accepts incoming connections but discards any events received. If false, the LAM does not accept incoming connections, and closes the socket from socket/Trapd LAMs. This is to prevent a load balancer from detecting them as unavailable and routing traffic elsewhere. |

moog_farmd

File: moog_farmd.conf

Section: ha section

Example:

    ha:
    {
        cluster: "NY",
        group: "moog_farmd",
    }

| Property | Description |
| --- | --- |
| cluster | The name of the Cluster. This supersedes anything set in system.conf (can also be overwritten by the command line). |
| group | The name of the Process Group. This defaults to moog_farmd. |
| default_leader | A Boolean, indicating if this Moogfarmd is the Leader within its Process Group (see above). Defaults to true if no value is specified. |

Command line overwrites

| Component | Description | Command line |
| --- | --- | --- |
| Cluster | Add --cluster SF to the command line for starting the component | --cluster SF |
| Process Group | Add --group cool_group to the command line for starting the component | --group cool_group |
| Instance | Add --instance instance_3 (for example) to the command line for starting this Instance of the component | --instance instance_3 |
| Passive Mode | Add --mode passive to the command line for starting this Instance of the component | --mode passive |
| Leader (where only_leader_active is set) | Add --leader no to the command line for starting this Instance of the component. This will overwrite the default_leader in the configuration files | --leader no |

Example:

    $MOOGSOFT_HOME/bin/moog_farmd --instance TEST_INSTANCE --group TEST_GROUP --cluster TEST_CLUSTER --mode passive
    $MOOGSOFT_HOME/bin/socket_lam --instance SOCK1 --group SOCKGROUP --cluster CLUSTER1 --leader no --mode passive

Servlets

File: $MOOGSOFT_HOME/config/servlets.conf

Section: ha section

Example:

    ha :
    {
        instance: "servlets",
        group: "UI",
        start_as_passive: false
    }

• Note that all servlets defined in this file act as one HA "instance" and will therefore all fail over together

• If cluster is not specified, the name of the Cluster is taken from the system.conf file

• If group is not specified, the name defaults to "servlets"

• If start_as_passive is not specified, the servlet defaults to a setting of false for this property; hence, it is Active on startup

### Note

• Servlets do not support any 'leader' settings

• The apache-tomcat service must be restarted to apply configuration changes made to the servlets

#### Active and Passive mode behavior

In a High Availability deployment, Moogsoft AIOps components may operate in either Active or Passive mode. In Active mode, their behavior is unchanged from non-HA Moogsoft AIOps installations, carrying out data ingestion, processing, presentation, etc. In Passive mode, these activities do not occur: the component is effectively on standby, waiting for an instruction to start the processing activities defined by its component type and configuration setup.

Failover is the process of converting one or more processes from Active to Passive mode while converting other processes from Passive to Active mode.

The Active/Passive state of the HA components in a Cluster can be viewed in the Moogsoft AIOps UI using Self Monitoring or via the ha_cntl utility (see below). In the UI, Passive processes are indicated by an icon.

Further details of how components behave in Passive mode and how, where relevant, the Passive mode may be identified from the command line are given below.

###### Servlets

When the UI is in Passive mode, the moogsvr servlet will reject all requests with an HTTP status of 503 (server unavailable) and the moogpoller servlet will not accept incoming websocket upgrade requests.
When switching from Active to Passive, the moogsvr servlet will start rejecting requests and the moogpoller servlet will disconnect any existing websocket sessions. A load balancer can therefore determine whether a UI is running in Active or Passive mode by sending a GET request to https://<server>:<port>/moogsvr/hastatus. A 204 response indicates that the UI is Active; a 503 response indicates Passive mode.

The following example curl command can be sent from the command line to check servlet status:

    curl -k https://moogbox2/moogsvr/hastatus -v

The output is

    < HTTP/1.1 204 No Content

if the servlet is in Active mode, or

    < HTTP/1.1 503 Service Unavailable

if the servlet is in Passive mode.

###### Moogfarmd

A Moogfarmd process running in Passive mode will not process events or detect Situations. When it fails over to Active mode, it will be able to carry on using the state from the previously Active Instance if this has been persisted (see below).

When the Moogfarmd state is being persisted, only one Moogfarmd process is allowed to run in Active mode at any given time within a single Moogfarmd Process Group. If more than one Moogfarmd process is started in Active mode, all but the first to become Active will be automatically converted to run in Passive mode within a few seconds. The same applies to new Moogfarmd processes started in Active mode when an Active Moogfarmd is already running. This prevents a condition known as 'split brain', where two Active processes both believe that they are responsible for executing functionality.

### Note

All Instances of Moogfarmd within the same Process Group must have identical configuration.

###### LAMs

LAMs operating in Passive mode do not send Events to the MooMS bus.

The REST LAM in Passive mode will reject POST requests with an HTTP status of 503 (server unavailable).
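The status codes described above (204 for an Active UI on the hastatus endpoint, 503 for a Passive UI or a Passive REST LAM) lend themselves to a scripted health check. The sketch below uses only the Python standard library; `classify_status` and `check_ui_hastatus` are hypothetical helper names, and unlike the curl -k example the request keeps TLS certificate verification enabled.

```python
import urllib.error
import urllib.request


def classify_status(http_status):
    """Map an HTTP status from /moogsvr/hastatus to an HA mode."""
    if http_status == 204:
        return "active"
    if http_status == 503:
        return "passive"
    return "unknown"


def check_ui_hastatus(base_url, timeout=5.0):
    """Send a GET to <base_url>/moogsvr/hastatus and classify the response."""
    url = base_url.rstrip("/") + "/moogsvr/hastatus"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return classify_status(resp.status)
    except urllib.error.HTTPError as err:
        # urllib raises HTTPError for 4xx/5xx responses, including 503.
        return classify_status(err.code)
    except urllib.error.URLError:
        # No HTTP response at all (connection refused, DNS failure, ...).
        return "unreachable"


print(classify_status(204))  # active
print(classify_status(503))  # passive
```

For an Active REST LAM the response code depends on the data posted, so only the 503 mapping carries over to LAM checks.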
Example curl command to check rest_lam status:

    curl -X POST http://moogbox2:9876 -v

The output is

    < HTTP/1.1 503 Service Unavailable

if the rest_lam is in Passive mode. If the rest_lam is in Active mode, the response code depends on the format of the data sent to it, as per normal rest_lam behavior.

#### Configuring persistence of state in moog_farmd

The state of Moogfarmd can be persisted to ensure that context is not lost when failover occurs from one Instance of Moogfarmd to another. This means that information held in memory about the Situations created by the Sigalisers and the current state of the Sigalisers themselves will not be lost. The new Instance of Moogfarmd will continue to process events and detect the same Situations as would have been detected if there had been no failover.

The state of the in-memory database (and the Constants module) will always be persisted if persistence is turned on.

For each of the following Sigalisers, the persist_state configuration parameter in moog_farmd.conf must be set to true to ensure that its state is persisted:

• Classic Sigaliser

• Speedbird

• Cookbook

The state of the Alert Rules Engine Moolet can also be persisted using the persist_state configuration parameter. Similarly, setting persist_state for the AlertBuilder (or any other Moolet) ensures that any tasks queued for that Moolet (in this case, Events that have not yet been processed) are persisted to Hazelcast while queueing and will be processed by another Instance of Moogfarmd after failover.

When failover occurs, events and other pieces of information may be queued in Moolets, waiting to be processed. To ensure that these tasks are processed in the newly Active Instance of Moogfarmd after failover, the persist_state flag is again used. This flag may be used for any Moolet that has a queue of tasks awaiting processing which, for all practical intents and purposes, is every Moolet other than the Scheduler.
### Note

To take advantage of this feature and to ensure that the newly Active Moogfarmd Instance takes over from where the previous one left off, the message_persistence property in the MooMS section of the system.conf file must be set to true.

###### Choice of persistence mechanism and configuration

Persistence may be carried out using a Hazelcast in-memory Cluster. The persistence mechanism is configured in system.conf in the persistence section, for example:

    # Persistence configuration parameters.
    "persistence" :
    {
        # Set persist_state to true to turn persistence on. If set, state
        # will be persisted in a Hazelcast cluster.
        "persist_state" : true,

        # Configuration for the Hazelcast cluster.
        "hazelcast" :
        {
            # The port to connect to on each specified host.
            "network_port" : 5701,

            # If set to true Hazelcast will increment the port number to
            # an available one if the configured port is unavailable.
            "auto_increment" : true,

            # A list of hosts to allow to participate in the cluster.
            "hosts" : ["localhost"],

            # Additional config to allow cluster info to be viewed via
            # Hazelcast's Management Center UI, if running.
            "man_center" :
            {
                "enabled" : false,
                "host" : "localhost",
                "port" : 8091
            }
        },
        ...
    }

As previously mentioned, ensure that the message_persistence property in the MooMS section of the system.conf file is set to true:

    "mooms":
    {
        "zone": "MOOG",
        "brokers": [
            {
                "host": "localhost",
                "port": 5672
            }
        ],
        "username": "moogsoft",
        "password": "m00gs0ft",
        "message_persistence": true,
        "max_retries": 100,
        "retry_interval": 200,
        "cache_on_failure": false,
        "cache_ttl": 900
    }

###### Clearing Persistence Data on Start-up

If persistence is configured, once all Moogfarmd Instances have been stopped, the in-memory persistence data is lost.

Moogfarmd also has a command line option --clear_state which, when specified at start-up, clears any current persistence data for the Process Group that the Moogfarmd Instance is a member of.
This ensures a clean start for that particular Instance (i.e. it would have no memory of previously created Situations) but also impacts any other running Moogfarmd Instances in that Process Group.

### Note

This option does not remove Moogfarmd persistence data from other Process Groups.

    [root@moogbox2 regression-tests]# moog_farmd --help

    -------- Copyright MoogSoft 2012-2015 --------

     Executing: moog_farmd

    ------------ All Rights Reserved -------------

    usage: moog_farmd [ --config=<path to config file> ] [ --loglevel (INFO|WARN|ALL) ] [--clear_state] [ --instance <name> [ --cluster <name> --group <name> [ --mode <passive|active> ] [ --leader <yes|no> ] ] ] [ --version ]
    MoogSoft moog_farmd: Container for our herd of moolets
        --clear_state       Clears any persisted state information associated
                            with this process group on startup.
        --cluster <arg>     Name of HA cluster (to overwrite the config file)
        --config            Specify a full path to the configuration file of
                            this farmd
        --group <arg>       Name of HA group (to overwrite the config file)
        --instance <arg>    Give this farmd herd a name for use with farmd
                            control
        --leader <arg>      Is this instance an HA leader within its group
                            (yes, no)
        --loglevel <arg>    Specify (INFO|WARN|ALL) to choose the amount of
                            debug output - warning ALL is very verbose!
        --mode <arg>        Start the process in passive or active mode
                            (default will be active)
        --version           Return current version of the Moog software

#### Configuring Automatic Failover for Moogfarmd

When configured in an active/passive HA configuration, moog_farmd has the capability for automatic failover. This allows a passive Moogfarmd to automatically take over processing from another (active) Moogfarmd in the same HA process group if the passive Moogfarmd detects that the active Moogfarmd has become inactive and is failing to report its status.
This feature is controlled by three configuration properties:

• automatic_failover

• keepalive_interval

• margin

These properties are in the "failover" block in $MOOGSOFT_HOME/config/system.conf:

...
,
"failover" :
{
"persist_state" : false,
# Configuration for the Hazelcast cluster.
"hazelcast" :
{
...
},
# Failover configuration below currently applies only to moog_farmd.

# Interval (in seconds) at which processes report their
# active/passive status and check statuses of other processes.
"keepalive_interval" : 5,

# At next keepalive_interval, processes will allow <margin> seconds
# before treating active processes who have not reported their
"margin" : 3,

# Number of seconds to wait for previously active process to
# become passive during manual failover. After this time has
# expired the new instance will become active and force the
# process to become passive.
"failover_timeout" : 10,

# Allow a passive process to automatically become active if
# no other active processes are detected in the same process group
"automatic_failover" : false,

# Process will stop indicating that it is active if it fails
# to send <value> consecutive heartbeats.
"heartbeat_failover_after": 2
},
...

| Property | Description | Other |
| --- | --- | --- |
| automatic_failover | Enables or disables the feature. | true\|false |
| keepalive_interval | How often, in seconds, a Moogfarmd process reports its active/passive status to the database and checks the status of other reporting Moogfarmd processes. Defaults to 5. | inactive |
| margin | How long, in seconds, a passive Moogfarmd waits after detecting that the formerly active Moogfarmd in its process group is no longer reporting status before it becomes active and takes over processing. Defaults to 3. | inactive |
| failover_timeout | How long, in seconds, to wait for the previously active process to become passive during a manual failover. After this time has expired, the new instance becomes active and forces the process to become passive. Defaults to 10. | active |
| heartbeat_failover_after | The number of consecutive heartbeats a process can fail to send before it stops indicating that it is active. Defaults to 2. | active |

###### Example Automatic Failover Tuning

Assuming a highly simplified multi-host HA setup such as:

    +---------------------+     +---------------------+
    |server1|             |     |server2|             |
    +-------+             |     +-------+             |
    | moog_farmd (active) |     | moog_farmd (passive)|
    +----------+----------+     +-----------+---------+
               |                            |
               |                            |
               |                            |
               |     +---------------+      |
               |     |server3|       |      |
               |     +-------+       |      |
               +-----+      DB       +------+
                     +---------------+

The moog_farmds on server1 and server2 are in the same process group but in different clusters. All other config is identical.

• With automatic_failover: false set in system.conf on both server1 and server2, if the active Moogfarmd process on server1 is killed, becomes unresponsive, loses contact with the DB or drops off the network, Moogfarmd on server2 will remain passive and will not take over processing unless a manual failover is triggered using ha_cntl.

• With automatic_failover: true set in system.conf on both server1 and server2, and with default keepalive_interval and margin settings, if the active Moogfarmd process on server1 is killed*, becomes unresponsive, loses contact with the DB or drops off the network, Moogfarmd on server2 will automatically become active and take over processing between 3 and 8 seconds later (depending on when the next keepalive_interval occurs). If the Moogfarmd on server1 is then restarted, resumes processing or rejoins the network, it will establish that there is already another active Moogfarmd running in its process group (i.e. the instance now active on server2) and it will become passive to prevent split-brain processing.

Thus, the keepalive_interval and margin properties can be used to tune the sensitivity of automatic failover. In the above example (and with default settings) automatic failover happens promptly. Users may wish to increase or decrease the interval at which Moogfarmd reports its status and also allow more time before a passive Moogfarmd tries to take over processing (possibly useful if the active Moogfarmd suffered a short interruption but has quickly resumed). Setting (for example) automatic_failover : true, keepalive_interval : 3 and margin : 10 would mean for the above system:

• If the active Moogfarmd process on server1 is killed*, becomes unresponsive, loses contact with the DB or drops off the network, Moogfarmd on server2 will automatically become active and take over processing between 10 and 13 seconds later (depending on when the next keepalive_interval occurs). If the Moogfarmd on server1 resumes processing or rejoins the network within 10 seconds of the passive Moogfarmd on server2 detecting it as down, it will continue as the active instance and the Moogfarmd on server2 will remain passive and not take over. Conversely, if the Moogfarmd on server1 had been restarted instead, it would not continue as the active process and the passive Moogfarmd on server2 would become active and take over.

* See note below on failover behaviour when process is killed or shutdown "cleanly".
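The takeover times quoted in the examples above follow from simple arithmetic on keepalive_interval and margin. The function below is a sketch of that arithmetic only, not of the product's scheduling logic; `failover_window` and its `clean_shutdown` parameter are hypothetical names, and the clean shutdown case is an interpretation of the note that the passive process takes over at its next keepalive_interval without waiting the extra margin.

```python
def failover_window(keepalive_interval=5, margin=3, clean_shutdown=False):
    """Return (earliest, latest) seconds before a passive Moogfarmd takes over.

    The passive process notices the missing status report at its next
    keepalive tick, then waits `margin` seconds before becoming active.
    """
    if keepalive_interval <= 0 or margin < 0:
        raise ValueError("keepalive_interval must be > 0 and margin >= 0")
    earliest = 0 if clean_shutdown else margin
    return (earliest, earliest + keepalive_interval)


print(failover_window())                     # (3, 8)   default settings
print(failover_window(3, 10))                # (10, 13) tuned example
print(failover_window(clean_shutdown=True))  # (0, 5)   no margin wait
```

A larger margin gives a briefly interrupted active process more time to recover before the passive process takes over, at the cost of slower failover.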

###### Important Notes and Limitations:
• Automatic failover applies to Moogfarmd only; automatic failover of LAMs or UI servlets is not part of this implementation

• Identical configuration is needed on all servers running as part of the HA setup (as per other HA configuration)

• The feature is time sensitive and requires all servers to be time synchronised. A change to the system time on the DB server in a running HA setup could trigger automatic failover between Moogfarmd instances

• Requires communication with the DB. If the DB or DB server becomes unresponsive to all Moogfarmd instances then the feature will not work as expected

• If, in an automatic failover setup, an active moog_farmd instance is shut down cleanly (i.e. using normal kill, service stop or ctrl-c) then a passive Moogfarmd will take over processing at its next keepalive_interval and will not wait the additional <margin> seconds

###### A Note on Process Startup

If automatic_failover is enabled and a Moogfarmd instance is started in passive mode while no other active Moogfarmd is running in its process group, it will switch to active. Users may wish to factor this in when starting up a system: it is easiest to start the active Moogfarmd instances first.

###### A Note on Split-Brain Handling

The HA implementation has built-in handling to prevent split-brain processing, i.e. two active Moogfarmds (in the same process group) running at the same time and potentially leading to duplicate processing. At its simplest, it prevents a second Moogfarmd being started in active mode if another active instance is already running in the same process group (regardless of cluster or instance name). The second Moogfarmd will start up but will immediately switch to passive mode.
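The "first active wins" rule can be illustrated with a toy in-memory model. This is not how Moogfarmd implements it (the real processes coordinate through status reports, as described earlier); the `ProcessGroup` class is hypothetical and the sketch deliberately ignores automatic promotion of a passive instance.

```python
class ProcessGroup:
    """Toy model of one Moogfarmd Process Group for split-brain illustration."""

    def __init__(self, name):
        self.name = name
        self.modes = {}  # instance name -> "active" or "passive"

    def start(self, instance, requested_mode="active"):
        """Start an instance; demote it to passive if another is already active."""
        already_active = any(mode == "active" for mode in self.modes.values())
        if requested_mode == "active" and already_active:
            mode = "passive"  # the second active start is switched to passive
        else:
            mode = requested_mode
        self.modes[instance] = mode
        return mode


group = ProcessGroup("moog_farmd")
print(group.start("FARM1"))  # active
print(group.start("FARM2"))  # passive
```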

#### Controlling Moogsoft AIOps HA (ha_cntl)

Moogsoft AIOps includes a High Availability Control utility to control the HA architecture.

Use the ha_cntl utility to:

• failover (change status of) Instances, Process Groups or Clusters

• view the current status of all Instances, Process Groups and Clusters

There is also help available for the ha_cntl utility.

### Note

The UI will not continue to function correctly after a failover if only one of the Tomcat servlets is failed over using activate or deactivate commands at a servlet Process Group level. Currently, the UI must be failed over at a Cluster level to ensure continued smooth operation

ha_cntl utility commands are as follows:

| Command | Description |
| --- | --- |
| -a, --activate <arg> | Specify cluster[.group[.instance_name]] to activate all Process Groups within a Cluster, a specific Process Group within a Cluster or a single Instance |
| -d, --deactivate <arg> | Specify cluster[.group[.instance_name]] to deactivate all Process Groups within a Cluster, a specific Process Group within a Cluster or a single Instance |
| -h, --help | Print help text that describes ha_cntl commands |
| -l, --loglevel <arg> | Specify (INFO\|WARN\|ALL) to choose the amount of debug output |
| -t, --time_out <arg> | Specify an amount of time (in seconds) to wait for the last answer. If not set, the default is 2 seconds |
| -v, --view | View the current status of all Instances, Process Groups and Clusters |
| -y, --assumeyes | Answer yes for all prompts. Useful for automation |
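The cluster[.group[.instance_name]] argument accepted by --activate and --deactivate is a dot-separated path with one to three parts. An automation script wrapping ha_cntl might validate it before use, roughly as follows. The `Target` type and `parse_target` function are hypothetical, and the sketch assumes that Cluster, Process Group and Instance names do not themselves contain dots.

```python
from typing import NamedTuple, Optional


class Target(NamedTuple):
    cluster: str
    group: Optional[str] = None
    instance: Optional[str] = None


def parse_target(arg):
    """Split a cluster[.group[.instance_name]] string into its parts."""
    parts = arg.split(".")
    if not 1 <= len(parts) <= 3 or any(part == "" for part in parts):
        raise ValueError("expected cluster[.group[.instance_name]], got %r" % arg)
    return Target(*parts)


print(parse_target("KINGSTON"))
# Target(cluster='KINGSTON', group=None, instance=None)
print(parse_target("SURBITON.socket_lam.SOCK1"))
# Target(cluster='SURBITON', group='socket_lam', instance='SOCK1')
```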

###### Examples

| Command line | Description |
| --- | --- |
| $MOOGSOFT_HOME/bin/ha_cntl -a SURBITON.socket_lam.SOCK1 | This activates the socket_lam Instance SOCK1 within the Process Group socket_lam within the Cluster SURBITON. If the socket_lam Process Group is configured to be leader_only (see above) all other socket LAMs in the Process Group are deactivated |
| $MOOGSOFT_HOME/bin/ha_cntl -a KINGSTON | This activates all Process Groups in the KINGSTON Cluster and deactivates all other Clusters |
| $MOOGSOFT_HOME/bin/ha_cntl -a KINGSTON.rest_lam -y | This activates the rest_lam Process Group in the KINGSTON Cluster (and the -y means there is no 'are you sure?' prompt) and deactivates all other rest_lam Process Groups in all other Clusters |
| $MOOGSOFT_HOME/bin/ha_cntl -d RICHMOND | This deactivates all Process Groups in the RICHMOND Cluster |
| $MOOGSOFT_HOME/bin/ha_cntl -d RICHMOND.trapd_lam | This deactivates the trapd_lam Process Group in the RICHMOND Cluster |
| $MOOGSOFT_HOME/bin/ha_cntl -a KINGSTON.UI -y | This activates the UI group in the KINGSTON Cluster and will deactivate the UI group in all other Clusters, triggering a UI failover (of all servlets) to the KINGSTON Cluster |

    $MOOGSOFT_HOME/bin/ha_cntl -v

The -v option prints detailed status about all Clusters, Groups and Instances that it can discover:

    [root@moogbox2 ~]# ha_cntl -v
    Getting system status
    Cluster: [KINGSTON] passive
        Process Group: [UI] Passive (no leader - all can be active)
            Instance: [servlets] Passive
                Component: moogpoller - not running
                Component: moogsvr - not running
                Component: toolrunner - not running
        Process Group: [moog_farmd] Passive (only leader should be active)
            Instance: FARM Passive Leader
                Moolet: AlertBuilder - not running (will run on activation)
                Moolet: AlertRulesEngine - not running (will run on activation)
                Moolet: Cookbook - not running (will run on activation)
                Moolet: Sigaliser - not running
                Moolet: Speedbird - not running (will run on activation)
                Moolet: TemplateMatcher - not running
        Process Group: [rest_lam] Passive (no leader - all can be active)
            Instance: REST2 Passive
        Process Group: [socket_lam] Passive (only leader should be active)
            Instance: SOCK2 Passive Leader
    Cluster: [SURBITON] active
        Process Group: [UI] Active (no leader - all can be active)
            Instance: [servlets] Active
                Component: moogpoller - running
                Component: moogsvr - running
                Component: toolrunner - running
        Process Group: [moog_farmd] Active (only leader should be active)
            Instance: FARM Active Leader
                Moolet: AlertBuilder - running
                Moolet: AlertRulesEngine - running
                Moolet: Cookbook - running
                Moolet: Default Cookbook - running
                Moolet: Sigaliser - not running
                Moolet: Speedbird - running
                Moolet: TemplateMatcher - not running
        Process Group: [rest_lam] Active (no leader - all can be active)
            Instance: REST1 Active
        Process Group: [socket_lam] Active (only leader should be active)
            Instance: SOCK1 Active Leader

#### farmd_cntl changes for HA

The farmd_cntl utility has two changes for HA.

To send farmd_cntl commands to a specific moog_farmd Instance within an HA environment, the <Cluster>.<Process Group>.<Instance> notation should be used for the --instance option.
For example:

    farmd_cntl --instance SURBITON.moog_farmd.FARM --moolet AlertBuilder --start

farmd_cntl now also gives more feedback on the results of the operation(s) requested:

    [root@moogbox2 ~]# farmd_cntl --instance CLUSTER1.GROUP1.FARM1 --all-moolets --stop
    Response from: CLUSTER1.GROUP1.FARM1
    Status: Action(s) completed successfully.
    Report:
    Moolet AlertBuilder Stopped.
    Moolet Default Cookbook Stopped.
    Moolet Sigaliser Stopped.
    Moolet SituationMgr Stopped.
    Moolet Cookbook Stopped.

    [root@moogbox2 ~]# farmd_cntl --instance CLUSTER1.GROUP1.FARM1 --all-moolets --stop
    Response from: CLUSTER1.GROUP1.FARM1
    Status: Action(s) completed with failures.
    Report:
    No Moolets to stop...

    [root@moogbox2 ~]# farmd_cntl --instance CLUSTER1.GROUP1.FARM1 --all-moolets --start
    Response from: CLUSTER1.GROUP1.FARM1
    Status: Action(s) completed with failures.
    Report:
    Moolet AlertBuilder Started.
    Moolet Speedbird Started.
    Moolet Default Cookbook Started.
    Moolet TokenCounter could NOT be started.
    Moolet Sigaliser Started.
    Moolet TemplateMatcher Started.
    Moolet AlertRulesEngine Started.
    Moolet SituationMgr Started.
    Moolet Cookbook Started.
    Moolet Notifier Started.

    [root@moogbox2 ~]# farmd_cntl --instance CLUSTER1.GROUP1.FARM1 --moolet Sigaliser --restart
    Response from: CLUSTER1.GROUP1.FARM1
    Status: Action(s) completed successfully.
    Report:
    Moolet Sigaliser Stopped.
    Moolet Sigaliser Started.

    [root@moogbox2 ~]# farmd_cntl --instance CLUSTER1.GROUP1.FARM1 --moolet Sigaliser --moolet Speedbird --restart --reconfig
    Response from: CLUSTER1.GROUP1.FARM1
    Status: Action(s) completed successfully.
    Report:
    Moolet Sigaliser Stopped.
    Moolet Sigaliser Configuration Reloaded.
    Moolet Sigaliser Started.
    Moolet Speedbird Stopped.
    Moolet Speedbird Configuration Reloaded.
    Moolet Speedbird Started.
#### MySQL failover for Moogsoft AIOps components

Moogsoft AIOps allows the definition of a list of failover MySQL servers. If the primary connection (as defined in mysql.host) goes down, Moogsoft AIOps components that have a MySQL connection (Moogfarmd, tomcat, rest_lam) will automatically connect to the next available MySQL server in the failover_connections list. This is defined in the failover_connections section of the $MOOGSOFT_HOME/config/system.conf file, as follows:

"mysql" :
{
"host"            : "localhost",
"database"        : "moogdb",
"port"            : 3306
#
# New deadlock retry configuration - default values are as below if
# the config remains commented out.
#
# "maxRetries"      : 5,
# "retryWait"       : 10
#
# To use Multi-Host Connections for failover support use:
#
#  "failover_connections" :
#    [
#      {
#          "host"  : "193.221.20.24",
#          "port"  : 3306
#      },
#      {
#          "host"  : "143.47.254.88",
#          "port"  : 3306
#      },
#      {
#          "host"  : "234.118.117.132",
#          "port"  : 3306
#      }
#    ]
#
}, 

This is useful when the system is used with a replicated/clustered MySQL environment.

###### Example

For the following mysql section in system.conf:

 "mysql" :
{
"host"            : "moogbox1",
"database"        : "moogdb",
"port"            : 3306,
"failover_connections" :
[
{
"host"  : "moogbox2",
"port"  : 3306
},
{
"host"  : "moogbox3",
"port"  : 3306
}
]
},

On startup, the Moogsoft AIOps components that make a MySQL connection will connect to the MySQL server on moogbox1.

If the MySQL server on moogbox1 goes down then the Moogsoft AIOps components will automatically failover their MySQL connection to moogbox2 next. If that is not available or subsequently goes down then the connection will failover to moogbox3. Whilst the failover is occurring, some temporary MySQL connection errors or warnings may be seen in the Moogsoft AIOps components log output.

### Note

If the primary or another failover_connection higher up the list becomes available again, the connection will not automatically failback to that until the Moogsoft AIOps component is restarted or makes a new connection.
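The connection order described in this section (primary first, then each failover_connections entry in list order) can be sketched as below. This is an illustration of the ordering only: `connection_candidates` and `pick_server` are hypothetical helpers, and the `is_reachable` callback stands in for whatever connectivity test a real client performs.

```python
def connection_candidates(mysql_conf):
    """Return (host, port) pairs in the order they would be tried."""
    primary = (mysql_conf["host"], mysql_conf["port"])
    backups = [
        (conn["host"], conn["port"])
        for conn in mysql_conf.get("failover_connections", [])
    ]
    return [primary] + backups


def pick_server(mysql_conf, is_reachable):
    """Return the first reachable server for a NEW connection, or None."""
    for host, port in connection_candidates(mysql_conf):
        if is_reachable(host, port):
            return (host, port)
    return None


mysql_conf = {
    "host": "moogbox1",
    "database": "moogdb",
    "port": 3306,
    "failover_connections": [
        {"host": "moogbox2", "port": 3306},
        {"host": "moogbox3", "port": 3306},
    ],
}

down = {"moogbox1"}  # pretend the primary is unavailable
print(pick_server(mysql_conf, lambda host, port: host not in down))
# ('moogbox2', 3306)
```

Because `pick_server` models only the creation of a new connection, it is consistent with the note above: an established connection to moogbox2 stays where it is even after moogbox1 recovers, until the component restarts or opens a new connection.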