# Moogsoft Docs

## High Availability

High Availability (HA) deployments of Moogsoft AIOps comprise multiple instances of Moogsoft AIOps, Moogfarmd, and the associated LAMs to minimize downtime and data loss. Component redundancy protects against single points of failure. It also provides reliable mechanisms to enable failover from one component to another to avoid performance degradation and data loss.

See HA - Deployment Scenarios for deployment examples.

## HA Components

Moogsoft AIOps is made up of the following processes and service components, which can be implemented in a distributed environment:

• Event ingestion (LAMs)
• Event processing (moog_farmd)
• User interface (Nginx, servlets running in Tomcat and Elasticsearch)
• RabbitMQ broker (MooMS messaging system)
• Database (MySQL 5.7)

See HA - Setup for Dependencies for more information on how to set these up for distributed installations.

Introducing component redundancy (for example two identically configured event processing (moog_farmd) components) makes HA system architecture possible. Implementing HA system architecture enables failover of Moogsoft AIOps components without loss of data or performance.

Failover of Moogsoft AIOps components is manually triggered using the ha_cntl command line utility which also allows the status of all components in the HA installation to be viewed.

### Moogsoft AIOps HA Features at a Glance

• LAMs, moog_farmd and Tomcat servlets can run in 'Active' mode (normal operation) or 'Passive' mode (not processing messages)
• Instance/Process Group/Cluster naming convention for LAMs, moog_farmd and Tomcat servlets to build logical groupings for failover scenarios
• ha_cntl utility to show running HA components and to trigger manual failovers (by setting Instances/Process Groups/Clusters to Active or Passive)
• 'Leader' capability to allow only defined Instances to become Active when their parent Cluster/Process Group becomes Active
• Moolet state sharing ability (persistence) for moog_farmd to facilitate data integrity during failover of moog_farmd
• MySQL 'failover' connection definition to allow Moogsoft AIOps components to failover to backup MySQL servers if the primary connection goes down
• Handling for the UI to continue normal operation in the event of a UI failover
• Self Monitoring pages in UI and moog_monitor command line utility show HA information
• Product installation using split RPMs (by functional component) for easier distributed deployment

### Moogsoft AIOps HA architecture key concepts

| Concept | Description |
|---|---|
| Component | An instance of a Moogsoft AIOps LAM, servlet or moog_farmd. HA introduces redundancy on a component, Process Group or Cluster level. |
| Instance | A name for each Moogsoft AIOps component. |
| Process Group | A group of one or more of the same type of Moogsoft AIOps components (such as a group of load sharing socket LAMs). All moog_farmd components in a Process Group must have identical configuration. |
| Cluster | One or more Process Groups. A Cluster must contain at least one Process Group. |
| Zone | Any number of Clusters, Process Groups and Instances can be defined within a single MooMS 'Zone' (RabbitMQ broker vhost). Failover actions for those Clusters/Process Groups/Instances are limited to that Zone. |

#### Moogsoft AIOps Architecture

Instances are individual components that run on a single machine. Process Groups and Clusters, however, can span multiple machines. Their configuration allows the flexibility to define architectural groupings for failover actions as long as they are within the same MooMS Zone:

| Component | Description |
|---|---|
| Active / Passive mode | Instances are configured to operate in Active or Passive mode. Instances that are operational are set to Active mode. Instances that are backups (redundant components) operate in Passive mode. |
| Leader | Defines which Instance in a Process Group becomes Active when the whole Process Group switches from Passive to Active state. Normally, only one Instance per Process Group should be defined as a Leader. |

Leader definition availability is as follows:

| Component | Leader definition |
|---|---|
| moog_farmd | Mandatory for a Process Group with more than one moog_farmd |
| Socket LAM, Logfile LAM, Trapd LAM | Optional |
| REST LAM, UI servlets | Not applicable |

Leadership status is a property of Process Groups. There are two states of group leadership status (as seen in the output of the `ha_cntl --view` command, see below):

| State | Description |
|---|---|
| "only leader should be active" | This is the default setting for components where Leader definition is supported (moog_farmd, Socket LAM, Logfile LAM, Trapd LAM). |
| "no leader - all can be active" | This is the default setting for components where Leader definition is not supported (REST LAM and the UI servlets), or if one or more Instances in the Process Group are configured with `only_leader_active: false`. The behavior is dynamic: terminating such an Instance changes the Process Group's status back to "only leader should be active". |

All of the above are defined when starting each component. If any of the values are not explicitly defined as parameters at startup, values are taken from the component configuration file (or if not defined there, values in  system.conf  are used).

    $MOOGSOFT_HOME/bin/moog_farmd --cluster Surbiton --instance MASTER --leader yes --mode active

The above creates an Instance of `moog_farmd` and defines it as a member of the `Surbiton` Cluster, with the Instance name `MASTER`. It also defines it as the Leader Instance in its Process Group and configures it to operate in Active mode. No Process Group (`--group`) is defined, so the default name (from the component configuration file) `moog_farmd` is used.

## Moogsoft AIOps HA Configuration

The information in this section describes how to configure Moogsoft AIOps components for an HA architecture.

Default Cluster

`$MOOGSOFT_HOME/config/system.conf`, `ha` section, `cluster` property:

    "ha":
    {
        "cluster": "NY"
    }

The default name of the Cluster. It is used when no Cluster is set in the component configuration file or on the command line.
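The lookup order described above (command line first, then the component configuration file, then `system.conf`) can be illustrated with a minimal Python sketch. The function name and the dictionaries are hypothetical; only the precedence order comes from this page.

```python
from typing import Mapping, Optional


def resolve_ha_setting(
    name: str,
    cli_args: Mapping[str, str],
    component_conf: Mapping[str, str],
    system_conf: Mapping[str, str],
) -> Optional[str]:
    """Return the first value found for `name`, checking the command line,
    then the component configuration file, then system.conf."""
    for source in (cli_args, component_conf, system_conf):
        if name in source:
            return source[name]
    return None


# Hypothetical inputs standing in for parsed settings.
system_conf = {"cluster": "NY"}
component_conf = {"cluster": "SF", "group": "socket_lam"}
cli_args = {"cluster": "Surbiton"}

print(resolve_ha_setting("cluster", cli_args, component_conf, system_conf))
print(resolve_ha_setting("group", cli_args, component_conf, system_conf))
print(resolve_ha_setting("cluster", {}, {}, system_conf))
```

In this sketch the command-line value wins for `cluster`, the component file supplies `group`, and `system.conf` is only consulted when nothing else defines the setting.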

LAMs

LAMs configuration file, `ha` section:

    ha:
    {
        cluster: "NY",
        group: "socket_lam",
        accept_conn_when_passive: true
    }

| Property | Description |
|---|---|
| cluster | The name of the Cluster. This supersedes anything set in `system.conf` (can also be overwritten by the command line). |
| group | The name of the Process Group. This defaults to the LAM process name if no value is specified (for example `socket_lam`). |
| default_leader | A Boolean indicating if the LAM is the Leader within its Process Group (see above). The default value is `true` if not specified. |
| only_leader_active | A Boolean that changes the type of Process Group from a Leader Only group to a Process Group where more than one process can be Active. The default is `true`, except for the REST LAM, where it is not supported and is always treated as `false`. |
| accept_conn_when_passive | A Boolean instructing the LAM what to do in Passive mode. If `true` (or not set), the LAM accepts incoming connections but discards any events received. If `false`, the LAM does not accept incoming connections and, for socket/Trapd LAMs, closes the socket. Closing the socket allows a load balancer to detect the LAM as unavailable and route traffic elsewhere. |
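The interaction between `only_leader_active` and the Process Group leadership status described earlier can be sketched as below. This is an illustrative model only, not the product's implementation: the function name and its inputs are assumptions, and the two status strings are the ones shown in the leadership status table.

```python
from typing import Iterable

ONLY_LEADER = "only leader should be active"
NO_LEADER = "no leader - all can be active"


def group_leadership_status(
    leader_supported: bool, only_leader_active_flags: Iterable[bool]
) -> str:
    """Derive a Process Group's leadership status.

    leader_supported: False for REST LAM and UI servlets (assumed input).
    only_leader_active_flags: one value per currently running Instance.
    """
    if not leader_supported:
        return NO_LEADER
    # A single running Instance with only_leader_active set to false
    # switches the whole group to "no leader - all can be active".
    if any(flag is False for flag in only_leader_active_flags):
        return NO_LEADER
    return ONLY_LEADER


print(group_leadership_status(True, [True, False]))
print(group_leadership_status(True, [True]))
```

Because the flags are passed in per running Instance, removing the Instance that had the `false` setting and calling the function again models the dynamic change back to "only leader should be active".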

moog_farmd

`moog_farmd.conf`, `ha` section:

    ha:
    {
        cluster: "NY",
        group: "moog_farmd"
    }

| Property | Description |
|---|---|
| cluster | The name of the Cluster. This supersedes anything set in `system.conf` (can also be overwritten by the command line). |
| group | The name of the Process Group. This defaults to `moog_farmd`. |
| default_leader | A Boolean indicating if this moog_farmd is the Leader within its Process Group (see above). Defaults to `true` if no value is specified. |

Command line overwrites

| Component | Description | Command line |
|---|---|---|
| Cluster | Add `--cluster SF` to the command line for starting the component | `--cluster SF` |
| Process Group | Add `--group cool_group` to the command line for starting the component | `--group cool_group` |
| Instance | Add `--instance instance_3` (for example) to the command line for starting this Instance of the component | `--instance instance_3` |
| Passive Mode | Add `--mode passive` to the command line for starting this Instance of the component | `--mode passive` |
| Leader (where `only_leader_active` is set) | Add `--leader no` to the command line for starting this Instance of the component. This overwrites the `default_leader` value in the configuration files | `--leader no` |

Example:

    $MOOGSOFT_HOME/bin/moog_farmd --instance TEST_INSTANCE --group TEST_GROUP --cluster TEST_CLUSTER --mode passive

    $MOOGSOFT_HOME/bin/socket_lam --instance SOCK1 --group SOCKGROUP --cluster CLUSTER1 --leader no --mode passive
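When many Instances are started from scripts, it can help to assemble these command lines in one place. The following Python sketch builds an argument list using only the options shown on this page (`--instance`, `--group`, `--cluster`, `--leader`, `--mode`); the helper name and its parameters are hypothetical.

```python
from typing import List, Optional


def build_start_command(
    binary: str,
    instance: str,
    group: Optional[str] = None,
    cluster: Optional[str] = None,
    leader: Optional[bool] = None,
    mode: Optional[str] = None,
) -> List[str]:
    """Build an argv list for starting an HA component.

    Options left as None are omitted, so the component falls back to its
    configuration files for those values.
    """
    cmd = [binary, "--instance", instance]
    if group is not None:
        cmd += ["--group", group]
    if cluster is not None:
        cmd += ["--cluster", cluster]
    if leader is not None:
        cmd += ["--leader", "yes" if leader else "no"]
    if mode is not None:
        if mode not in ("active", "passive"):
            raise ValueError("mode must be 'active' or 'passive'")
        cmd += ["--mode", mode]
    return cmd


print(" ".join(build_start_command(
    "socket_lam", "SOCK1", group="SOCKGROUP", cluster="CLUSTER1",
    leader=False, mode="passive")))
```

The resulting list can be passed to `subprocess.run` without shell quoting concerns; the binary path is whatever the installation uses.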

Servlets

`$MOOGSOFT_HOME/config/servlets.conf`, `ha` section:

    ha:
    {
        instance: "servlets",
        group: "UI",
        start_as_passive: false
    }

• Note that all servlets defined in this file act as one HA "instance" and therefore all fail over together
• If cluster is not specified, the name of the Cluster is taken from the system.conf file
• If group is not specified, the name defaults to "servlets"
• If start_as_passive is not specified, it defaults to false; hence, the servlets are Active on startup

### Note

• Servlets do not support any 'leader' settings
• The apache-tomcat service must be restarted to apply configuration changes made to the servlets

## Active and Passive mode behavior

In a High Availability deployment, Moogsoft AIOps components may operate in either Active or Passive mode. In Active mode, their behavior is unchanged from non-HA Moogsoft AIOps installations, carrying out data ingestion, processing, presentation, etc. In Passive mode, these activities do not occur: the component is effectively on standby, waiting for an instruction to start the processing activities defined by its component type and configuration setup.

Failover is the process of converting one or more processes from Active to Passive mode while converting other processes from Passive to Active mode.

The Active/Passive state of the HA components in a Cluster can be viewed in the Moogsoft AIOps UI using Self Monitoring or via the ha_cntl utility (see below). In the UI, Passive processes are indicated by an icon. Further details of how components behave in Passive mode and how, where relevant, the Passive mode may be identified from the command line are given below.

### Servlets

When the UI is in Passive mode, the moogsvr servlet rejects all requests with an HTTP status of `503` (server unavailable) and the moogpoller servlet does not accept incoming websocket upgrade requests.
When switching from Active to Passive, the moogsvr servlet starts rejecting requests and the moogpoller servlet disconnects any existing websocket sessions. A load balancer can therefore determine whether a UI is running in Active or Passive mode by sending a GET request to `https://<server>:<port>/moogsvr/hastatus`. A `204` response indicates that the UI is Active; a `503` response indicates Passive mode. The following example curl command can be sent from the command line to check servlet status:

    curl -k https://moogbox2/moogsvr/hastatus -v

The output is `< HTTP/1.1 204 No Content` if the servlet is in Active mode, or `< HTTP/1.1 503 Service Unavailable` if the servlet is in Passive mode.

### moog_farmd

A moog_farmd process running in Passive mode does not process events or detect Situations. When it fails over to Active mode, it can carry on using the state from the previously Active Instance if this has been persisted (see below).

When the moog_farmd state is being persisted, only one moog_farmd process is allowed to run in Active mode at any given time within a single moog_farmd Process Group. If more than one moog_farmd process is started in Active mode, all but the first to become Active are automatically converted to run in Passive mode within a few seconds. The same applies to new moog_farmd processes started in Active mode when an Active moog_farmd is already running. This prevents a condition known as 'split brain', where two Active processes both believe that they are responsible for executing functionality.

### Note

All Instances of moog_farmd within the same Process Group must have identical configuration.

### LAMs

LAMs operating in Passive mode do not send Events to the MooMS bus. The REST LAM in Passive mode rejects POST requests with an HTTP status of `503` (server unavailable).
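The status-code conventions above (204 or 503 from the UI `hastatus` endpoint, 503 from a Passive REST LAM) lend themselves to a small health-check helper. The Python sketch below is one possible approach, not part of the product: the function names are hypothetical, and `probe_ui` assumes the server presents a certificate that Python can verify (the curl example uses `-k` to skip that check).

```python
import urllib.error
import urllib.request


def ha_state_from_status(status_code: int) -> str:
    """Map an HTTP status from the hastatus endpoint to an HA state."""
    if status_code == 204:
        return "active"
    if status_code == 503:
        return "passive"
    return "unknown"


def probe_ui(url: str, timeout: float = 5.0) -> str:
    """Send a GET request to a hastatus URL and classify the response."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return ha_state_from_status(response.status)
    except urllib.error.HTTPError as err:
        # urllib raises for 4xx/5xx responses; the code is still available.
        return ha_state_from_status(err.code)
    except OSError:
        # Connection refused, DNS failure, timeout and similar errors.
        return "unknown"


print(ha_state_from_status(204))
print(ha_state_from_status(503))
```

For the REST LAM only the 503 case is meaningful, since an Active REST LAM returns a code that depends on the data posted to it.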
Example curl command to check rest_lam status:

    curl -X POST http://moogbox2:9876 -v

The output is `< HTTP/1.1 503 Service Unavailable` if the rest_lam is in Passive mode. If the rest_lam is in Active mode, the response code depends on the format of the data sent to it, as per normal rest_lam behavior.

## Configuring persistence of state in moog_farmd

The state of moog_farmd can be persisted to ensure that context is not lost when failover occurs from one Instance of moog_farmd to another. This means that information held in memory about the Situations created by the Sigalisers and the current state of the Sigalisers themselves is not lost. The new Instance of moog_farmd continues to process events and detect the same Situations as would have been detected if there had been no failover.

The state of the in-memory database (and the Constants module) is always persisted if persistence is turned on. For each of the following Sigalisers, the `persist_state` configuration parameter in `moog_farmd.conf` must be set to `true` to ensure that its state is persisted:

• Classic Sigaliser
• Speedbird
• Cookbook

The state of the Alert Rules Engine Moolet can also be persisted using the `persist_state` configuration parameter. Similarly, setting `persist_state` for the AlertBuilder (or any other Moolet) ensures that any tasks queued for that Moolet (in this case Events that have not yet been processed) are persisted to Hazelcast while queueing and are processed by another Instance of moog_farmd after failover.

When failover occurs, events and other pieces of information may be queued in Moolets, waiting to be processed. To ensure that these tasks are processed in the newly Active Instance of moog_farmd after failover, the `persist_state` flag is again used. This flag may be used for any Moolet that has a queue of tasks awaiting processing which, for all practical purposes, is every Moolet other than the Scheduler.
### Note

To take advantage of this feature and to ensure that the newly Active moog_farmd Instance takes over from where the previous one left off, the `message_persistence` property in the MooMS section of the `system.conf` file must be set to `true`.

### Choice of persistence mechanism and configuration

Persistence may be carried out using a Hazelcast in-memory Cluster. The persistence mechanism is configured in `system.conf` in the `persistence` section, for example:

    # Persistence configuration parameters.
    "persistence" :
    {
        # Set persist_state to true to turn persistence on. If set, state
        # will be persisted in a Hazelcast cluster.
        "persist_state" : true,

        # Configuration for the Hazelcast cluster.
        "hazelcast" :
        {
            # The port to connect to on each specified host.
            "network_port" : 5701,

            # If set to true Hazelcast will increment the port number to
            # an available one if the configured port is unavailable.
            "auto_increment" : true,

            # A list of hosts to allow to participate in the cluster.
            "hosts" : ["localhost"],

            # Additional config to allow cluster info to be viewed via
            # Hazelcast's Management Center UI, if running.
            "man_center" :
            {
                "enabled" : false,
                "host" : "localhost",
                "port" : 8091
            }
        },
        ...
    }

As previously mentioned, ensure that the `message_persistence` property in the MooMS section of the `system.conf` file is set to `true`:

    "mooms":
    {
        "zone": "MOOG",
        "brokers": [
            { "host": "localhost", "port": 5672 }
        ],
        "username": "moogsoft",
        "password": "m00gs0ft",
        "message_persistence": true,
        "max_retries": 100,
        "retry_interval": 200,
        "cache_on_failure": false,
        "cache_ttl": 900
    }

### Clearing Persistence Data on Start-up

If persistence is configured, once all moog_farmd Instances have been stopped, the in-memory persistence data is lost. moog_farmd also has a command line option `--clear_state` which, when specified at start-up, clears any current persistence data for the Process Group that the moog_farmd Instance is a member of.
This ensures a clean start for that particular Instance (i.e. it would have no memory of previously created Situations) but also impacts any other running moog_farmd Instances in that Process Group.

### Note

This option does not remove moog_farmd persistence data from other Process Groups.

    [root@moogbox2 regression-tests]# moog_farmd --help

    -------- Copyright MoogSoft 2012-2015 --------

     Executing: moog_farmd

    ------------ All Rights Reserved -------------

    usage: moog_farmd [ --config=<path to config file> ] [ --loglevel (INFO|WARN|ALL) ] [--clear_state] [ --instance <name> [ --cluster <name> --group <name> [ --mode <passive|active> ] [ --leader <yes|no> ] ] ] [ --version ]
    MoogSoft moog_farmd: Container for our herd of moolets
     --clear_state       Clears any persisted state information associated
                         with this process group on startup.
     --cluster <arg>     Name of HA cluster (to overwrite the config file)
     --config            Specify a full path to the configuration file of
                         this farmd
     --group <arg>       Name of HA group (to overwrite the config file)
     --instance <arg>    Give this farmd herd a name for use with farmd
                         control
     --leader <arg>      Is this instance an HA leader within its group
                         (yes, no)
     --loglevel <arg>    Specify (INFO|WARN|ALL) to choose the amount of
                         debug output - warning ALL is very verbose!
     --mode <arg>        Start the process in passive or active mode
                         (default will be active)
     --version           Return current version of the Moog software

## Configuring Automatic Failover for moog_farmd

When configured in an active/passive HA configuration, moog_farmd has the capability for automatic failover. This allows a passive moog_farmd to automatically take over processing from another (active) moog_farmd in the same HA process group if the passive moog_farmd detects that the active moog_farmd has become inactive and is failing to report its status.
This feature is controlled by three configuration properties:

• automatic_failover
• keepalive_interval
• margin

These properties are in the "failover" block in `$MOOGSOFT_HOME/config/system.conf`:

    ...
    ,
    "failover" :
    {
        "persist_state" : false,
        # Configuration for the Hazelcast cluster.
        "hazelcast" :
        {
            ...
        },
        # Failover configuration below currently applies only to moog_farmd.

        # Interval (in seconds) at which processes report their
        # active/passive status and check statuses of other processes.
        "keepalive_interval" : 5,

        # At next keepalive_interval, processes will allow <margin> seconds
        # before treating active processes who have not reported their
        "margin" : 3,

        # Number of seconds to wait for previously active process to
        # become passive during manual failover. After this time has
        # expired the new instance will become active and force the
        # process to become passive.
        "failover_timeout" : 10,

        # Allow a passive process to automatically become active if
        # no other active processes are detected in the same process group
        "automatic_failover" : false,

        # Process will stop indicating that it is active if it fails
        # to send <value> consecutive heartbeats.
        "heartbeat_failover_after": 2
    },
    ...

| Property | Description | Other |
|---|---|---|
| automatic_failover | Enables or disables the feature. | true or false |
| keepalive_interval | In seconds, defaults to 5. Defines how often a moog_farmd process reports its active/passive status to the database and checks the status of other reporting moog_farmd processes. | inactive |
| margin | In seconds, defaults to 3. Defines how long a passive moog_farmd waits, after detecting that a formerly active moog_farmd in the same process group is no longer reporting its status, before it becomes active and takes over processing. | inactive |
| failover_timeout | In seconds, defaults to 10. Defines how long to wait for the previously active process to become passive during a manual failover. After this time has expired, the new instance becomes active and forces the process to become passive. | active |
| heartbeat_failover_after | A number, defaults to 2. The process stops indicating that it is active if it fails to send this many consecutive heartbeats. | active |

#### Example Automatic Failover Tuning

Assuming a highly simplified multi-host HA setup such as:

    +---------------------+             +---------------------+
    |server1|             |             |server2|             |
    +-------+             |             +-------+             |
    | moog_farmd (active) |             | moog_farmd (passive)|
    +----------+----------+             +-----------+---------+
               |                                    |
               |                                    |
               |                                    |
               |         +---------------+          |
               |         |server3|       |          |
               |         +-------+       |          |
               +---------+      DB       +----------+
                         +---------------+

The moog_farmds on server1 and server2 are in the same process group but in different clusters. All other config is identical.

• With automatic_failover : false set in system.conf on both server1 and server2, if the active moog_farmd process on server1 is killed, becomes unresponsive, loses contact with the DB or drops off the network, then moog_farmd on server2 remains passive and does not take over processing unless a manual failover is triggered using ha_cntl
• With automatic_failover : true set in system.conf on both server1 and server2, and with default keepalive_interval and margin settings, if the active moog_farmd process on server1 is killed*, becomes unresponsive, loses contact with the DB or drops off the network, then moog_farmd on server2 automatically becomes active and takes over processing between 3 and 8 seconds later (depending on when the next keepalive_interval occurs). If the moog_farmd on server1 is then restarted, resumes processing or rejoins the network, it establishes that there is already another active moog_farmd running in its process group (i.e. the instance now active on server2) and becomes passive to prevent split-brain processing

Thus, the keepalive_interval and margin properties can be used to tune the sensitivity of automatic failover. In the above example (and with default settings) automatic failover happens promptly. Users may wish to increase or decrease the interval at which moog_farmd reports its status, and also allow more time before a passive moog_farmd tries to take over processing (possibly useful if the active moog_farmd suffered a short interruption but quickly resumed). Setting (for example) automatic_failover : true, keepalive_interval : 3 and margin : 10 would mean for the above system:

• If the active moog_farmd process on server1 is killed*, becomes unresponsive, loses contact with the DB or drops off the network, then moog_farmd on server2 automatically becomes active and takes over processing between 10 and 13 seconds later (depending on when the next keepalive_interval occurs). If the moog_farmd on server1 resumes processing or rejoins the network within 10 seconds of the passive moog_farmd on server2 detecting it as down, it continues as the active instance and the moog_farmd on server2 remains passive and does not take over. Conversely, if the moog_farmd on server1 had been restarted instead, it would not continue as the active process and the passive moog_farmd on server2 would become active and take over

* See the note below on failover behaviour when a process is killed or shut down "cleanly".
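The timings quoted in this example follow a simple pattern: takeover happens no earlier than `margin` seconds after the failure and no later than `margin` plus one `keepalive_interval`. The Python sketch below reproduces that arithmetic. The function name is hypothetical, and the `clean_shutdown` branch reflects the note in the next section that a cleanly stopped process is replaced at the next keepalive_interval without the extra margin.

```python
from typing import Tuple


def failover_window(
    keepalive_interval: int, margin: int, clean_shutdown: bool = False
) -> Tuple[int, int]:
    """Return the (earliest, latest) takeover delay in seconds."""
    # After a clean shutdown the passive process does not wait <margin>.
    wait = 0 if clean_shutdown else margin
    return (wait, wait + keepalive_interval)


print(failover_window(5, 3))    # default settings
print(failover_window(3, 10))   # tuned example from the text
```

With the defaults this gives the 3 to 8 second range, and with keepalive_interval 3 and margin 10 it gives 10 to 13 seconds, matching the figures above.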

#### Important Notes and Limitations:

• Applies to moog_farmd only; automatic failover of LAMs or UI servlets is not part of this implementation
• Identical configuration is needed on all servers running as part of the HA setup (as per other HA configuration)
• Time sensitive: requires all servers to be time synchronised. A change to the system time on the DB server in a running HA setup could trigger automatic failover between moog_farmd instances
• Requires communication with the DB. If the DB or DB server becomes unresponsive to all moog_farmd instances, the feature will not work as expected
• If, in an automatic failover setup, an active moog_farmd instance is shut down cleanly (i.e. using normal kill, service stop or ctrl-c), a passive moog_farmd takes over processing at its next keepalive_interval and does not wait the additional <margin> seconds

#### A Note on Process Startup

If automatic_failover is enabled and a moog_farmd instance is started in passive mode while no other active moog_farmd is running in its process group, it switches to active. Users may wish to factor this in when starting up a system, i.e. it is easiest to start up active moog_farmd instances first.

#### A Note on Split-Brain Handling

The HA implementation has built-in handling to prevent split-brain processing, i.e. two active moog_farmds (in the same process group) running at the same time and potentially leading to duplicate processing. At its simplest, it prevents a second moog_farmd being started in active mode if there is another active instance already running in the same process group (regardless of cluster or instance name). The second moog_farmd starts up but immediately switches to passive mode.
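The rule that only the first moog_farmd to become active in a process group stays active can be modelled in a few lines of Python. This is a conceptual sketch, not the product's mechanism: the timestamps, the function name and the tie-break by instance name are all assumptions made for illustration.

```python
from typing import Dict, List, Tuple


def resolve_split_brain(active_since: Dict[str, float]) -> Tuple[str, List[str]]:
    """Pick which active instance in one process group stays active.

    active_since maps an instance name to the time it became active.
    Returns (instance that stays active, instances to switch to passive).
    """
    if not active_since:
        raise ValueError("no active instances supplied")
    # Earliest activation wins; the name is an assumed tie-breaker.
    ordered = sorted(active_since, key=lambda name: (active_since[name], name))
    return ordered[0], ordered[1:]


keeper, demoted = resolve_split_brain({"FARM_B": 12.5, "FARM_A": 10.0})
print(keeper, demoted)
```

Here `FARM_A` became active first, so it is kept and `FARM_B` is listed for conversion to passive mode.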

## Controlling Moogsoft AIOps HA (ha_cntl)

Moogsoft AIOps includes a High Availability Control utility to control the HA architecture.
Use the ha_cntl utility to:

• failover (change status of) Instances, Process Groups or Clusters
• view the current status of all Instances, Process Groups and Clusters

There is also help available for the ha_cntl utility.

### Note

The UI will not continue to function correctly after a failover if only one of the Tomcat servlets is failed over using activate or deactivate commands at a servlet Process Group level. Currently, the UI must be failed over at a Cluster level to ensure continued smooth operation.

ha_cntl utility commands are as follows:

| Command | Description |
|---|---|
| `-a,--activate <arg>` | Specify `cluster[.group[.instance_name]]` to activate all Process Groups within a Cluster, a specific Process Group within a Cluster or a single Instance |
| `-d,--deactivate <arg>` | Specify `cluster[.group[.instance_name]]` to deactivate all Process Groups within a Cluster, a specific Process Group within a Cluster or a single Instance |
| `-h,--help` | Print help text that describes ha_cntl commands |
| `-l,--loglevel <arg>` | Specify `INFO`, `WARN` or `ALL` to choose the amount of debug output |
| `-t,--time_out <arg>` | Specify an amount of time (in seconds) to wait for the last answer. If not set, the default is 2 seconds |
| `-v,--view` | View the current status of all Instances, Process Groups and Clusters |
| `-y,--assumeyes` | Answer yes for all prompts. Useful for automation |
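The `cluster[.group[.instance_name]]` argument accepted by `--activate` and `--deactivate` is a dotted path with one to three parts. Scripts that wrap ha_cntl may want to validate such a target before calling the utility; the Python sketch below shows one way to do that. The names `HaTarget` and `parse_ha_target` are hypothetical, and the sketch assumes that Cluster, Process Group and Instance names do not themselves contain dots.

```python
from typing import NamedTuple, Optional


class HaTarget(NamedTuple):
    cluster: str
    group: Optional[str] = None
    instance: Optional[str] = None


def parse_ha_target(target: str) -> HaTarget:
    """Split a cluster[.group[.instance_name]] string into its parts."""
    parts = target.split(".")
    if not 1 <= len(parts) <= 3 or any(not part for part in parts):
        raise ValueError(
            f"expected cluster[.group[.instance_name]], got {target!r}"
        )
    return HaTarget(*parts)


print(parse_ha_target("SURBITON.socket_lam.SOCK1"))
print(parse_ha_target("KINGSTON"))
```

A missing group or instance is returned as `None`, which mirrors how the utility treats a bare Cluster name as "all Process Groups in that Cluster".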

### Examples

    $MOOGSOFT_HOME/bin/ha_cntl -a SURBITON.socket_lam.SOCK1

This activates the socket_lam Instance `SOCK1` within the Process Group `socket_lam` within the Cluster `SURBITON`. If the `socket_lam` Process Group is configured to be leader_only (see above), all other socket LAMs in the Process Group are deactivated.

    $MOOGSOFT_HOME/bin/ha_cntl -a KINGSTON

This activates all Process Groups in the `KINGSTON` Cluster and deactivates all other Clusters.

    $MOOGSOFT_HOME/bin/ha_cntl -a KINGSTON.rest_lam -y

This activates the `rest_lam` Process Group in the `KINGSTON` Cluster (the `-y` means there is no 'are you sure?' prompt) and deactivates all other `rest_lam` Process Groups in all other Clusters.

    $MOOGSOFT_HOME/bin/ha_cntl -d RICHMOND

This deactivates all Process Groups in the `RICHMOND` Cluster.

    $MOOGSOFT_HOME/bin/ha_cntl -d RICHMOND.trapd_lam

This deactivates the `trapd_lam` Process Group in the `RICHMOND` Cluster.

    $MOOGSOFT_HOME/bin/ha_cntl -a KINGSTON.UI -y

This activates the UI group in the KINGSTON Cluster and deactivates the UI group in all other Clusters, triggering a UI failover (of all servlets) to the KINGSTON Cluster.
    $MOOGSOFT_HOME/bin/ha_cntl -v

The `-v` option prints detailed status about all Clusters, Groups and Instances that it can discover:

    [root@moogbox2 ~]# ha_cntl -v
    Getting system status
    Cluster: [KINGSTON] passive
        Process Group: [UI] Passive (no leader - all can be active)
            Instance: [servlets] Passive
                Component: moogpoller - not running
                Component: moogsvr - not running
                Component: toolrunner - not running
        Process Group: [moog_farmd] Passive (only leader should be active)
            Instance: FARM Passive Leader
                Moolet: AlertBuilder - not running (will run on activation)
                Moolet: AlertRulesEngine - not running (will run on activation)
                Moolet: Cookbook - not running (will run on activation)
                Moolet: Sigaliser - not running
                Moolet: Speedbird - not running (will run on activation)
                Moolet: TemplateMatcher - not running
        Process Group: [rest_lam] Passive (no leader - all can be active)
            Instance: REST2 Passive
        Process Group: [socket_lam] Passive (only leader should be active)
            Instance: SOCK2 Passive Leader
    Cluster: [SURBITON] active
        Process Group: [UI] Active (no leader - all can be active)
            Instance: [servlets] Active
                Component: moogpoller - running
                Component: moogsvr - running
                Component: toolrunner - running
        Process Group: [moog_farmd] Active (only leader should be active)
            Instance: FARM Active Leader
                Moolet: AlertBuilder - running
                Moolet: AlertRulesEngine - running
                Moolet: Cookbook - running
                Moolet: Default Cookbook - running
                Moolet: Sigaliser - not running
                Moolet: Speedbird - running
                Moolet: TemplateMatcher - not running
        Process Group: [rest_lam] Active (no leader - all can be active)
            Instance: REST1 Active
        Process Group: [socket_lam] Active (only leader should be active)
            Instance: SOCK1 Active Leader

## farmd_cntl changes for HA

The farmd_cntl utility has two changes for HA.

To send farmd_cntl commands to a specific moog_farmd Instance within an HA environment, the `<Cluster>.<Process Group>.<Instance>` notation should be used for the `--instance` option.
For example:

    farmd_cntl --instance SURBITON.moog_farmd.FARM --moolet AlertBuilder --start

farmd_cntl now also gives more feedback on the results of the operation(s) requested:

    [root@moogbox2 ~]# farmd_cntl --instance CLUSTER1.GROUP1.FARM1 --all-moolets --stop
    Response from: CLUSTER1.GROUP1.FARM1
    Status: Action(s) completed successfully.
    Report:
    Moolet AlertBuilder Stopped.
    Moolet Default Cookbook Stopped.
    Moolet Sigaliser Stopped.
    Moolet SituationMgr Stopped.
    Moolet Cookbook Stopped.

    [root@moogbox2 ~]# farmd_cntl --instance CLUSTER1.GROUP1.FARM1 --all-moolets --stop
    Response from: CLUSTER1.GROUP1.FARM1
    Status: Action(s) completed with failures.
    Report:
    No Moolets to stop...

    [root@moogbox2 ~]# farmd_cntl --instance CLUSTER1.GROUP1.FARM1 --all-moolets --start
    Response from: CLUSTER1.GROUP1.FARM1
    Status: Action(s) completed with failures.
    Report:
    Moolet AlertBuilder Started.
    Moolet Speedbird Started.
    Moolet Default Cookbook Started.
    Moolet TokenCounter could NOT be started.
    Moolet Sigaliser Started.
    Moolet TemplateMatcher Started.
    Moolet AlertRulesEngine Started.
    Moolet SituationMgr Started.
    Moolet Cookbook Started.
    Moolet Notifier Started.

    [root@moogbox2 ~]# farmd_cntl --instance CLUSTER1.GROUP1.FARM1 --moolet Sigaliser --restart
    Response from: CLUSTER1.GROUP1.FARM1
    Status: Action(s) completed successfully.
    Report:
    Moolet Sigaliser Stopped.
    Moolet Sigaliser Started.

    [root@moogbox2 ~]# farmd_cntl --instance CLUSTER1.GROUP1.FARM1 --moolet Sigaliser --moolet Speedbird --restart --reconfig
    Response from: CLUSTER1.GROUP1.FARM1
    Status: Action(s) completed successfully.
    Report:
    Moolet Sigaliser Stopped.
    Moolet Sigaliser Configuration Reloaded.
    Moolet Sigaliser Started.
    Moolet Speedbird Stopped.
    Moolet Speedbird Configuration Reloaded.
    Moolet Speedbird Started.

## MySQL failover for Moogsoft AIOps components

Moogsoft AIOps allows the definition of a list of MySQL servers. If the primary connection (as defined in `mysql.host`) goes down, Moogsoft AIOps components that have a MySQL connection (moog_farmd, tomcat, rest_lam) automatically connect to the next available MySQL server in the `failover_connections` list. This is defined in the `failover_connections` section of the `$MOOGSOFT_HOME/config/system.conf` file, as follows:

    "mysql" :
    {
        "host"            : "localhost",
        "database"        : "moogdb",
        "port"            : 3306
        #
        # New deadlock retry configuration - default values are as below if
        # the config remains commented out.
        #
        # "maxRetries"      : 5,
        # "retryWait"       : 10
        #
        # To use Multi-Host Connections for failover support use:
        #
        #  "failover_connections" :
        #    [
        #      {
        #          "host"  : "193.221.20.24",
        #          "port"  : 3306
        #      },
        #      {
        #          "host"  : "143.47.254.88",
        #          "port"  : 3306
        #      },
        #      {
        #          "host"  : "234.118.117.132",
        #          "port"  : 3306
        #      }
        #    ]
        #
    },

This is useful when the system is used with a replicated/clustered MySQL environment.

### Example

For the following `mysql` section in `system.conf`:

    "mysql" :
    {
        "host"            : "moogbox1",
        "database"        : "moogdb",
        "port"            : 3306,
        "failover_connections" :
        [
            {
                "host"  : "moogbox2",
                "port"  : 3306
            },
            {
                "host"  : "moogbox3",
                "port"  : 3306
            }
        ]
    },

On startup, the Moogsoft AIOps components that make a MySQL connection will connect to the MySQL server on moogbox1.

If the MySQL server on `moogbox1` goes down, the Moogsoft AIOps components automatically fail over their MySQL connection to `moogbox2`. If that is not available or subsequently goes down, the connection fails over to `moogbox3`. While the failover is occurring, some temporary MySQL connection errors or warnings may be seen in the log output of the Moogsoft AIOps components.
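The order in which servers are tried can be sketched in Python as follows. This only models the ordering described above (primary first, then each `failover_connections` entry in turn); the function names and the `is_up` callback are hypothetical and stand in for whatever availability check a real client performs.

```python
from typing import Any, Callable, Dict, List, Optional, Tuple

HostPort = Tuple[str, int]


def connection_candidates(mysql_conf: Dict[str, Any]) -> List[HostPort]:
    """List MySQL servers in the order they would be tried."""
    primary = (mysql_conf["host"], mysql_conf["port"])
    backups = [
        (entry["host"], entry["port"])
        for entry in mysql_conf.get("failover_connections", [])
    ]
    return [primary] + backups


def first_available(
    candidates: List[HostPort], is_up: Callable[[HostPort], bool]
) -> Optional[HostPort]:
    """Return the first candidate that the (assumed) check reports as up."""
    for candidate in candidates:
        if is_up(candidate):
            return candidate
    return None


mysql_conf = {
    "host": "moogbox1",
    "database": "moogdb",
    "port": 3306,
    "failover_connections": [
        {"host": "moogbox2", "port": 3306},
        {"host": "moogbox3", "port": 3306},
    ],
}
candidates = connection_candidates(mysql_conf)
down = {("moogbox1", 3306)}
print(first_available(candidates, lambda hp: hp not in down))
```

With `moogbox1` marked as down, the sketch selects `moogbox2`, which corresponds to the first failover step in the example.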

### Note

If the primary or another `failover_connections` entry higher up the list becomes available again, the connection does not automatically fail back to it until the Moogsoft AIOps component is restarted or makes a new connection.