
The architectures described here are examples only; internal expertise within your organization must validate any deployment architecture against your HA goals, organizational standards, and system configuration limitations before deployment. The following sections provide detailed descriptions of AIOps deployment scenarios that use a High Availability system architecture:

Considerations for Single and Multi-tiered Architecture


  • Single-tier
    • Advantages: maintenance, budget considerations, zero network latency between components, simple upgrades
    • Disadvantages: single point of failure, CPU resource, memory resource, security (single point of access)
  • Multi-tier
    • Advantages: database has a dedicated machine (higher performance), greater performance (machine dependent), upgrade one node at a time, increased UI client performance, increased LAM performance, high availability, firewalls and security
    • Disadvantages: possible network latency, time-consuming upgrades, maintenance, farmd has no horizontal processing


Single server setup with load balancer for testing/proving purposes

Whilst HA is aimed at distributed setups, it may be useful to perform an "all-on-one-box" install for testing/proving purposes. Instances/Process Groups/Clusters can still be configured within this single server and failovers triggered using the ha_cntl utility.

It is not possible to have multiple Instances of the Tomcat servlets on one machine, so this system is limited to redundancy of the LAMs and moog_farmd.

Architecture diagram

High-level description

  • Machine1 hosts:
    • Active Instances of the socket_lam, rest_lam, moog_farmd and Tomcat - all part of cluster 'SURBITON'
    • Passive Instances of the socket_lam, rest_lam, moog_farmd and Tomcat - all part of cluster 'KINGSTON'
    • MooMS broker (RabbitMQ)
    • MySQL server
    • ElasticSearch
    • nginx
  • Machine2 hosts:
    • LAM load balancer (e.g. HAProxy)

Purpose of system

This system provides all-in-one-box redundancy for the REST LAM, the socket LAM and moog_farmd, so it can be used to test failover of those components. The UI is not part of that redundancy and is configured in its own Cluster to keep it logically separate from the other Clusters.
 

The load balancer is configured to:

  • Route Events to the Active rest_lam, based on the listening port of the LAM being available and the hastatus endpoint not returning a 503 Service Unavailable
  • Route Events to the Active socket_lam, based on the listening port of the LAM being available and accepting connections. The LAM's configuration has accept_conn_when_passive set to false to ensure connection attempts are rejected when in Passive mode

Single-server full product installation

All components installed at the same location


  1. Run:

    yum groupinstall moogsoft

    or

    yum install moogsoft-db moogsoft-lams moogsoft-mooms moogsoft-search moogsoft-server moogsoft-ui moogsoft-utils
  2. Run:

    $MOOGSOFT_HOME/bin/utils/moog_init.sh -I <ZONE> -u root

    where <ZONE> is the name of the MooMS zone you want to create.
     

  3. Respond to the prompts. The moog_init script prompts for the MySQL root user password (blank by default) and then asks whether you wish to change the hostname used by the configuration files (this defaults to the machine hostname returned by the 'hostname' command).

Configuration


Component | Details
General

In file $MOOGSOFT_HOME/config/system.conf set the following properties (changing them from their defaults):

  • mooms.message_persistence : true
  • failover.persist_state : true
  • failover.hazelcast.hosts : ["<Machine1>"]
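
For reference, these dotted property names correspond to nested sections inside system.conf. The fragment below is an illustrative sketch only (the exact nesting and surrounding entries may differ in your version of the file), not verbatim configuration:

    # illustrative fragment only - other entries in each section are omitted
    mooms:
    {
        ...
        message_persistence: true
    },
    failover:
    {
        persist_state: true,
        hazelcast:
        {
            hosts: ["<Machine1>"]
        }
    }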
rest_lam
  1. Create a copy of $MOOGSOFT_HOME/config/rest_lam.conf as $MOOGSOFT_HOME/config/rest_lam1.conf
    1. Edit the ha section and set the following properties:

      ha:
        {
          cluster: "SURBITON",
          group: "rest_lam",
          instance: "REST",
          start_as_passive: false,
          duplicate_source: false
        },
  2. Create another copy as $MOOGSOFT_HOME/config/rest_lam2.conf.
    1. Set a different "port" property from that of the first rest_lam, e.g. 8889
    2. Edit the ha section and set the following properties:

      ha:
        {
          cluster: "KINGSTON",
          group: "rest_lam",
          instance: "REST",
          start_as_passive: true,
          duplicate_source: false
        },
  3. Create a copy of /etc/init.d/restlamd as /etc/init.d/restlamd1. 

    1. Set the CONFIG_FILE property to point to rest_lam1.conf.

  4. Create a copy of /etc/init.d/restlamd as /etc/init.d/restlamd2.

    1. Set the CONFIG_FILE property to point to rest_lam2.conf.

  5. Start both services:

    service restlamd1 start
    service restlamd2 start
socket_lam
  1. Create a copy of $MOOGSOFT_HOME/config/socket_lam.conf as $MOOGSOFT_HOME/config/socket_lam1.conf.
    1. Edit the ha section and set the following properties:

      ha:
        {
          cluster: "SURBITON",
          group: "socket_lam",
          instance: "SOCK",
          only_leader_active: true,
          accept_conn_when_passive: false,
          start_as_passive: false,
          duplicate_source: false
        },
  2. Create another copy as $MOOGSOFT_HOME/config/socket_lam2.conf.
    1. Set a different "port" property from that of the first socket_lam, e.g. 8412
    2. Edit the ha section and set the following properties:

      ha:
        {
          cluster: "KINGSTON",
          group: "socket_lam",
          instance: "SOCK",
          only_leader_active: true,
          accept_conn_when_passive: false,
          start_as_passive: true,
          duplicate_source: false
        },
  3. Create a copy of /etc/init.d/socketlamd as /etc/init.d/socketlamd1.
    1. Set the CONFIG_FILE property to point to socket_lam1.conf.

  4. Create a copy of /etc/init.d/socketlamd as /etc/init.d/socketlamd2.

    1. Set the CONFIG_FILE property to point to socket_lam2.conf.

  5. Start both services:

    service socketlamd1 start
    service socketlamd2 start
moog_farmd
  1. Create a copy of $MOOGSOFT_HOME/config/moog_farmd.conf as $MOOGSOFT_HOME/config/moog_farmd1.conf.

  2. Edit $MOOGSOFT_HOME/config/moog_farmd1.conf and:

    1. Configure it with the required Moolets and persist_state settings for those moolets.

    2. Edit the ha section and set the following properties:

      ha:
        {
          cluster: "SURBITON",
          group: "moog_farmd",
          instance: "FARM",
          start_as_passive: false
        },
  3. Create a copy of $MOOGSOFT_HOME/config/moog_farmd1.conf as $MOOGSOFT_HOME/config/moog_farmd2.conf.

  4. Edit $MOOGSOFT_HOME/config/moog_farmd2.conf and:

    1. Configure it with the required Moolets and persist_state settings for those moolets.

    2. Edit the ha section and set the following properties:

    ha:
      {
        cluster: "KINGSTON",
        group: "moog_farmd",
        instance: "FARM",
        start_as_passive: true
      },
  5. Create a copy of /etc/init.d/moogfarmd as /etc/init.d/moogfarmd1
    1. Set the CONFIG_FILE property to point to moog_farmd1.conf

  6. Create a copy of /etc/init.d/moogfarmd as /etc/init.d/moogfarmd2

    1. Set the CONFIG_FILE property to point to moog_farmd2.conf

  7. Start both services:

    service moogfarmd1 start
    service moogfarmd2 start
UI
  1. Configure the /usr/share/moogsoft/config/servlets.conf file as follows:

    {
       loglevel: "WARN",
       webhost : "https://<Machine1>",
       moogsvr:
       {
            eula_per_user: false,
            cache_root: "/var/lib/moogsoft/moog-data",
            db_connections:	10,
            priority_db_connections: 25
        },
        moogpoller :
        {
        },
        toolrunner :
        {
            sshtimeout: 900000,
            toolrunnerhost: "<Machine1>",
            toolrunneruser: "<toolrunner username>",
            toolrunnerpassword: "<toolrunner password>"
        },
        graze  :
        {
        },
        events  :
        {
        },
        ha :
        {
            cluster: "RICHMOND",
            instance: "servlets",
            group: "UI",
            start_as_passive: false
        }
    }

    ...replacing <Machine1>, <toolrunner username> and <toolrunner password> with appropriate values.

  2. Restart the Apache-tomcat service with the following command:

    service apache-tomcat restart
Load Balancer

Example HAProxy configuration on Machine2 for rest_lam and socket_lam:

global
  log 127.0.0.1   local0
  log 127.0.0.1   local1 notice
  maxconn 4096
  chroot /var/lib/haproxy
  user haproxy
  group haproxy
  daemon
  #debug
  #quiet

defaults
  mode tcp
  maxconn 10000
  timeout connect 5s
  timeout client 100s
  timeout server 100s

listen stats :9090
  balance
  mode http
  stats enable
  stats auth admin:admin
 
frontend rest_lam_frontend
  bind Machine2:8888
  mode http
  default_backend rest_lam_backend

backend rest_lam_backend
  balance roundrobin
  mode http
  option httpchk POST
  http-check expect ! status 503
  server rest_lam_1 Machine1:8888 check
  server rest_lam_2 Machine1:8889 check

frontend socket_lam_frontend
  bind Machine2:8411
  mode tcp
  default_backend socket_lam_backend

backend socket_lam_backend
  balance roundrobin
  mode tcp
  server socket_lam1 Machine1:8411 check
  server socket_lam2 Machine1:8412 check
  • This config offers port 8888 for REST events and port 8411 for SOCKET events.
  • http mode is used for the rest_lam and tcp mode for the socket_lam
  • The httpchk option is used to get the Active/Passive status of the rest_lam and to treat a response of 503 (Passive) as if the LAM was down
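
Before exercising failovers, it can be useful to confirm each LAM's Active/Passive state by hand. The commands below are a sketch only, assuming the ports used in this example: a Passive rest_lam answers the HTTP check with 503 (an Active one may reject the empty POST with a different code, but not 503), and a Passive socket_lam with accept_conn_when_passive set to false refuses the TCP connection. The nc check assumes a netcat that supports -z; substitute an equivalent TCP test if yours does not.

# rest_lam HA state: HAProxy's httpchk POSTs to "/" and treats a 503 response as Passive
curl -s -o /dev/null -w "%{http_code}\n" -X POST http://Machine1:8888/
curl -s -o /dev/null -w "%{http_code}\n" -X POST http://Machine1:8889/

# socket_lam HA state: an Active instance accepts the TCP connection, a Passive one refuses it
nc -vz Machine1 8411
nc -vz Machine1 8412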

Example failover commands




rest_lam failover: To fail over only the rest_lam Process Group from the SURBITON Cluster to the KINGSTON Cluster (i.e. deactivating the SURBITON.rest_lam.REST Instance and activating the KINGSTON.rest_lam.REST Instance), run:
$MOOGSOFT_HOME/bin/ha_cntl -a KINGSTON.rest_lam
socket_lam failover: To fail over only the socket_lam Process Group from the SURBITON Cluster to the KINGSTON Cluster (i.e. deactivating the SURBITON.socket_lam.SOCK Instance and activating the KINGSTON.socket_lam.SOCK Instance), run:
$MOOGSOFT_HOME/bin/ha_cntl -a KINGSTON.socket_lam
moog_farmd failover: To fail over only the moog_farmd Process Group from the SURBITON Cluster to the KINGSTON Cluster (i.e. deactivating the SURBITON.moog_farmd.FARM Instance and activating the KINGSTON.moog_farmd.FARM Instance), run:
$MOOGSOFT_HOME/bin/ha_cntl -a KINGSTON.moog_farmd
Any of the above can be failed back individually by reactivating the Process Group in the SURBITON Cluster, for example:
$MOOGSOFT_HOME/bin/ha_cntl -a SURBITON.rest_lam
$MOOGSOFT_HOME/bin/ha_cntl -a SURBITON.socket_lam
$MOOGSOFT_HOME/bin/ha_cntl -a SURBITON.moog_farmd

It is possible to fail over the LAMs and moog_farmd together from the SURBITON Cluster to the KINGSTON Cluster by activating the whole KINGSTON Cluster, for example:

$MOOGSOFT_HOME/bin/ha_cntl -a KINGSTON

However, this has the side effect of also deactivating the RICHMOND Cluster; that is, the UI servlets all become Passive and the UI cannot be used until all three servlets are reactivated.
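
Following the same per-Process-Group activation pattern shown above, the UI servlets could presumably be reactivated by activating the UI group in the RICHMOND Cluster (shown here as an assumption based on the commands in this section, since the UI group is named "UI" in its configuration):

# assumption: ha_cntl -a accepts <cluster>.<group>, as in the examples above
$MOOGSOFT_HOME/bin/ha_cntl -a RICHMOND.UI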





Three server, two Cluster setup with load balancers

This setup offers redundancy of Event ingestion, Event processing and UI components across two servers; each designated to be a Cluster. The MooMS broker and MySQL server are installed on a dedicated third server. Separate load balancer servers are also configured to route Events to the LAMs and web traffic to the UI Instances.

Architecture Diagram

High-level description

  • Machine1 hosts Active Instances of the socket_lam, rest_lam, moog_farmd and UI - all part of cluster 'SURBITON'
  • Machine2 hosts Passive Instances of the socket_lam, rest_lam, moog_farmd and UI - all part of cluster 'KINGSTON'
  • Machine3 hosts the following:
    • MooMS broker (RabbitMQ)
    • MySQL server
    • ElasticSearch 
  • Machine4 hosts the LAM load balancer (e.g. HAProxy), which routes Events only to the Active Instances of both the rest_lam and socket_lam. In the event of a LAM failover, the load balancer switches the routing to the new Active Instances of those LAMs.
  • Machine5 hosts the UI load balancer (e.g. HAProxy), which routes web traffic only to the nginx instance that fronts the Active UI.

Purpose of system

This system provides redundancy for the REST LAM, the socket LAM, moog_farmd (using Hazelcast persistence) and the UI. The system makes use of a core server containing the MooMS broker, MySQL server and Elasticsearch server.
The LAM load balancer is configured to:

  • Route Events to the Active rest_lam, based on the listening port of the LAM being available and the hastatus endpoint not returning a 503 Service Unavailable
  • Route Events to the Active socket_lam, based on the listening port of the LAM being available and accepting connections. The LAM's configuration has accept_conn_when_passive set to false to ensure connection attempts are rejected when in Passive mode

The UI load balancer is configured to:

  • Route web traffic only to the nginx instance that fronts an Active UI, based on a check of the moogsvr hastatus endpoint in Tomcat.

Installation

Step

1

Install on Machine3

  1. Install the db, mooms, search and utils RPMs:

     yum install moogsoft-db moogsoft-mooms moogsoft-search moogsoft-utils
  2. Initialize the database:

    $MOOGSOFT_HOME/bin/utils/moog_init_db.sh -Iu root

    You are prompted for the password associated with the MySQL root user (blank by default).


  3. Connect to mysql as the root user and grant all access on the moogdb and moog_reference databases to the ermintrude user on both Machine1 and Machine2:

    GRANT ALL ON moogdb.* TO ermintrude@'Machine1' IDENTIFIED BY 'm00';
    GRANT ALL ON moog_reference.* TO ermintrude@'Machine1' IDENTIFIED BY 'm00';
    GRANT ALL ON moogdb.* TO ermintrude@'Machine2' IDENTIFIED BY 'm00';
    GRANT ALL ON moog_reference.* TO ermintrude@'Machine2' IDENTIFIED BY 'm00';
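
    As an optional sanity check, the new grants can be listed from the shell on Machine3; this assumes the ermintrude user and hostnames used above:

    mysql -u root -p -e "SHOW GRANTS FOR 'ermintrude'@'Machine1'; SHOW GRANTS FOR 'ermintrude'@'Machine2';"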
  4. Initialize the MooMS message broker (by creating a 'zone' and enabling the management plugins):

    $MOOGSOFT_HOME/bin/utils/moog_init_mooms.sh -pz <ZONE>

    where <ZONE> is the name of the MooMS zone you want to create.
     

  5. Configure search to connect to MySQL and set up the indexer cron job

    $MOOGSOFT_HOME/bin/utils/moog_init_search.sh -sd localhost:3306
  6. Configure the ElasticSearch process to listen on all interfaces (so it can accept remote connections) by adding the following line to the /etc/elasticsearch/elasticsearch.yml file and then restarting the elasticsearch service:

    http.host: 0.0.0.0
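
    After the restart, a quick check from Machine1 or Machine2 confirms that Elasticsearch now accepts remote connections (it should return a small block of cluster information):

    curl http://Machine3:9200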
  7. Configure the utils with connection details for MySQL and the MooMS broker:

    $MOOGSOFT_HOME/bin/utils/moog_init_utils.sh -z <ZONE> -d Machine3:3306 -m Machine3:5672

    Ensure that the <ZONE> used here is the same as used in step 4 above, when initializing the MooMS message broker.

2

Install on Machine1
  1. Install the lams, server, ui and utils RPMs:

    yum install moogsoft-lams moogsoft-server moogsoft-ui moogsoft-utils

    If you encounter dependency errors involving MySQL, you may have older versions of MySQL libraries on your server. These prevent installation of the AIOps components, but take care not to simply remove them outright in case other dependent packages are impacted. Best practice is to create a script for use with the yum shell command.
    For example, if an existing mysql-libs prevented AIOps installation, then create a file /tmp/install with the following contents:

    remove mysql-libs
    install moogsoft-lams
    install moogsoft-server
    install moogsoft-ui
    install moogsoft-utils
    run

    Then feed the script to yum shell using:

    cat /tmp/install |yum shell

    This displays what is going to be installed and what is going to be removed. If this looks correct, then re-run it with the -y flag to carry out the installation, as follows:

    cat /tmp/install |yum shell -y

    This carries out the installation, removing and replacing the conflicting package without causing dependency errors.
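
    To confirm the outcome, a simple check such as the following can be used; the Moogsoft packages should be listed and the conflicting mysql-libs package should report as not installed:

    # Moogsoft packages should now be present; mysql-libs should report "not installed"
    rpm -qa | grep -i moogsoft
    rpm -q mysql-libs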

  2. Configure the LAMs to connect to the database and MooMS broker on Machine3:

    $MOOGSOFT_HOME/bin/utils/moog_init_lams.sh -bz <ZONE> -d Machine3:3306 -m Machine3:5672

    Ensure that the <ZONE> used here is the same as used in Part 1 step 4 above, when initializing the MooMS message broker.

    The -b option backs up the original config and bot files.


  3. Configure the events analyzer cron job:

    $MOOGSOFT_HOME/bin/utils/moog_init_server.sh -e

    The moog_init_server.sh script can also configure the system.conf database and MooMS settings. However, as this has already been done in the previous step, there is no need to repeat it here.

  4. Configure the UI:

    $MOOGSOFT_HOME/bin/utils/moog_init_ui.sh -otwfxz <ZONE> -c Machine3:15672 -s Machine3:9200

    Ensure that the <ZONE> used here is the same as used in Part 1 step 4 above, when initializing the MooMS message broker.

This does the following:

    • Configures the UI with a connection to the Elasticsearch server running on Machine3:9200
    • Sets the MooMS console to Machine3:15672
    • Sets the UI zone to <ZONE>
    • Generates SSL keys for nginx
    • Rebuilds the webapps
    • Restarts the Tomcat service

There is no need to run moog_init_utils.sh at this stage as everything it requires has already been configured.

3

Install on Machine2

The installation is identical to that of Machine1, so repeat the steps in Part 2 above on Machine2.


Configuration

AIOps uses Hazelcast as a persistence mechanism.

Component | Details
General

On both Machine1 and Machine2 edit file $MOOGSOFT_HOME/config/system.conf and set the following properties (changing them from their defaults):

      • mooms.message_persistence : true
      • failover.persist_state : true
      • failover.hazelcast.hosts : ["<Machine1>","<Machine2>"]
rest_lam
  1. On Machine1 for the active rest_lam edit file $MOOGSOFT_HOME/config/rest_lam.conf and uncomment/edit the ha section as follows:

    ha:
      {
        cluster: "SURBITON",
        group: "rest_lam",
        instance: "REST",
        start_as_passive: false,
        duplicate_source: false
      },
  2. On Machine2 for the passive rest_lam edit file $MOOGSOFT_HOME/config/rest_lam.conf and uncomment/edit the ha section as follows:

    ha:
      {
        cluster: "KINGSTON",
        group: "rest_lam",
        instance: "REST",
        start_as_passive: true,
        duplicate_source: false
      },
  3. Start the restlamd services on both machines:

service restlamd start
socket_lam
  1. On Machine1 for the active socket_lam edit file $MOOGSOFT_HOME/config/socket_lam.conf and uncomment/edit the ha section as follows:

    ha:
      {
        cluster: "SURBITON",
        group: "socket_lam",
        instance: "SOCK",
        only_leader_active: true,
        accept_conn_when_passive: false,
        start_as_passive: false,
        duplicate_source: false
      },
  2. On Machine2 for the passive socket_lam edit file $MOOGSOFT_HOME/config/socket_lam.conf and uncomment/edit the ha section as follows:

    ha:
      {
        cluster: "KINGSTON",
        group: "socket_lam",
        instance: "SOCK",
        only_leader_active: true,
        accept_conn_when_passive: false,
        start_as_passive: true,
        duplicate_source: false
      },
  3. Start the socketlamd services on both machines:

    service socketlamd start
moog_farmd
  1. Edit the $MOOGSOFT_HOME/config/moog_farmd.conf file on both Machine1 and Machine2 and configure it with the required Moolets and persist_state settings for those moolets.

    Other than the ha section, all settings in moog_farmd.conf must be identically configured on both Machine1 and Machine2

  2. On Machine1 edit the $MOOGSOFT_HOME/config/moog_farmd.conf file and set the ha section as follows:

    ha:
      {
        cluster: "SURBITON",
        group: "moog_farmd",
        instance: "FARM",
        start_as_passive: false
      },
  3. On Machine2 edit the $MOOGSOFT_HOME/config/moog_farmd.conf file and set the ha section as follows:

    ha:
      {
        cluster: "KINGSTON",
        group: "moog_farmd",
        instance: "FARM",
        start_as_passive: true
      },
  4. Start the services on both machines.

    service moogfarmd start

UI
  1. On Machine1 edit the $MOOGSOFT_HOME/config/servlets.conf file as follows:

    {
       loglevel: "WARN",
       webhost : "https://<Machine5>",
       moogsvr:
       {
            eula_per_user: false,
            cache_root: "/var/lib/moogsoft/moog-data",
            db_connections:	10,
            priority_db_connections: 25
        },
        moogpoller :
        {
        },
        toolrunner :
        {
            sshtimeout: 900000,
            toolrunnerhost: "<Machine1>",
            toolrunneruser: "<toolrunner username>",
            toolrunnerpassword: "<toolrunner password>"
        },
        graze  :
        {
        },
        events  :
        {
        },
        ha :
        {
            cluster: "SURBITON",
            instance: "servlets",
            group: "UI",
            start_as_passive: false
        }
    }
  2. On Machine2 edit the $MOOGSOFT_HOME/config/servlets.conf file as follows:

    {
       loglevel: "WARN",
       webhost : "https://<Machine5>",
       moogsvr:
       {
            eula_per_user: false,
            cache_root: "/var/lib/moogsoft/moog-data",
            db_connections:	10,
            priority_db_connections: 25
        },
        moogpoller :
        {
        },
        toolrunner :
        {
            sshtimeout: 900000,
            toolrunnerhost: "<Machine2>",
            toolrunneruser: "<toolrunner username>",
            toolrunnerpassword: "<toolrunner password>"
        },
        graze  :
        {
        },
        events  :
        {
        },
        ha :
        {
            cluster: "KINGSTON",
            instance: "servlets",
            group: "UI",
            start_as_passive: true
        }
    }
  3. Restart the Apache-tomcat service on both Machine1 and Machine2:

    service apache-tomcat restart 
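
    To confirm which UI is Active after the restart, the moogsvr hastatus endpoint used by the UI load balancer check below can be queried directly from the shell. A 204 response indicates Active and a 503 indicates Passive; the -k flag is shown on the assumption that the nginx SSL certificate is self-signed:

    # expect 204 from the Active UI (SURBITON) and 503 from the Passive UI (KINGSTON)
    curl -k -s -o /dev/null -w "%{http_code}\n" https://Machine1/moogsvr/hastatus
    curl -k -s -o /dev/null -w "%{http_code}\n" https://Machine2/moogsvr/hastatus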
LAM Load Balancer

Example HAProxy configuration on Machine4 for rest_lam and socket_lam:

global
  log 127.0.0.1   local0
  log 127.0.0.1   local1 notice
  maxconn 4096
  chroot /var/lib/haproxy
  user haproxy
  group haproxy
  daemon
  #debug
  #quiet

defaults
  mode tcp
  maxconn 10000
  timeout connect 5s
  timeout client 100s
  timeout server 100s

listen stats :9090
  balance
  mode http
  stats enable
  stats auth admin:admin
 
frontend rest_lam_frontend
  bind Machine4:8888
  mode http
  default_backend rest_lam_backend

backend rest_lam_backend
  balance roundrobin
  mode http
  option httpchk POST
  http-check expect ! status 503
  server rest_lam_1 Machine1:8888 check
  server rest_lam_2 Machine2:8888 check

frontend socket_lam_frontend
  bind Machine4:8411
  mode tcp
  default_backend socket_lam_backend

backend socket_lam_backend
  balance roundrobin
  mode tcp
  server socket_lam1 Machine1:8411 check
  server socket_lam2 Machine2:8411 check
  • This config offers port 8888 for REST events and port 8411 for SOCKET events.
  • http mode is used for the rest_lam and tcp mode for the socket_lam
  • The httpchk option is used to get the Active/Passive status of the rest_lam and to treat a response of 503 (Passive) as if the LAM was down
UI Load Balancer

Example HAProxy configuration on Machine5 for the UI:

global
  log 127.0.0.1   local0
  log 127.0.0.1   local1 notice
  maxconn 4096
  chroot /var/lib/haproxy
  user haproxy
  group haproxy
  daemon
  #debug
  #quiet

defaults
  mode tcp
  maxconn 10000
  timeout connect 5s
  timeout client 24d
  timeout server 24d

listen stats :9090
  balance
  mode http
  stats enable
  stats auth admin:admin
 
frontend ui_front 
  bind Machine5:443 
  option tcplog 
  mode tcp 
  default_backend ui_back 

backend ui_back 
  mode tcp
  balance source
  option httpchk GET /moogsvr/hastatus
  http-check expect status 204
  server ui_1 Machine1:443 check check-ssl verify none inter 100
  server ui_2 Machine2:443 check check-ssl verify none inter 100 
  • This config offers port 443 for UI traffic.
  • tcp mode is used throughout to handle the https traffic in ssl passthrough mode
  • Note the setting of "timeout client" and "timeout server" to the maximum allowable of 24d (days) - this ensures the browser-server websocket connection does not get disconnected.
  • The httpchk option is used against the /moogsvr/hastatus endpoints to get their Active/Passive status (a 204 response means active, a 503 response is passive)
  • Source balancing is used but as there is only one active UI server at a time the balancing mechanism is irrelevant.


Example failover commands

Command | Description | Command line
rest_lam failover: To fail over only the rest_lam Process Group from the SURBITON Cluster to the KINGSTON Cluster (i.e. deactivating the SURBITON.rest_lam.REST Instance and activating the KINGSTON.rest_lam.REST Instance), run:
$MOOGSOFT_HOME/bin/ha_cntl -a KINGSTON.rest_lam
socket_lam failover: To fail over only the socket_lam Process Group from the SURBITON Cluster to the KINGSTON Cluster (i.e. deactivating the SURBITON.socket_lam.SOCK Instance and activating the KINGSTON.socket_lam.SOCK Instance), run:
$MOOGSOFT_HOME/bin/ha_cntl -a KINGSTON.socket_lam
moog_farmd failover: To fail over only the moog_farmd Process Group from the SURBITON Cluster to the KINGSTON Cluster (i.e. deactivating the SURBITON.moog_farmd.FARM Instance and activating the KINGSTON.moog_farmd.FARM Instance), run:
$MOOGSOFT_HOME/bin/ha_cntl -a KINGSTON.moog_farmd
Any of the above can be failed back individually by reactivating the Process Group in the SURBITON Cluster, for example:
$MOOGSOFT_HOME/bin/ha_cntl -a SURBITON.rest_lam
$MOOGSOFT_HOME/bin/ha_cntl -a SURBITON.socket_lam
$MOOGSOFT_HOME/bin/ha_cntl -a SURBITON.moog_farmd
Cluster failover: To fail over the entire SURBITON Cluster to the KINGSTON Cluster (i.e. deactivating everything in the SURBITON Cluster and activating everything in the KINGSTON Cluster), run:
$MOOGSOFT_HOME/bin/ha_cntl -a KINGSTON

To fail back to the SURBITON Cluster again, run:
$MOOGSOFT_HOME/bin/ha_cntl -a SURBITON


Fully distributed multi-server setup

This setup offers redundancy of multiple components across multiple servers organized into two Clusters. The MooMS broker, MySQL server and Search (Elasticsearch) components are each installed on their own dedicated servers. Separate load balancer servers are also configured to route Events to the LAMs and web traffic to the UI Instances.

Architecture Diagram


For clarity, this diagram does not show all inter-component connection lines

High-level description

  • Machine1 hosts the MooMS broker
  • Machine2 hosts the MySQL database
  • Machine3 hosts ElasticSearch
  • Machine4 hosts the active REST1 and active SOCK1 instances of the REST and SOCKET lams respectively - all part of the SURBITON cluster.
  • Machine5 hosts the active REST2 and passive SOCK2 instances of the REST and SOCKET lams respectively - all part of the SURBITON cluster.
  • Machine6 hosts the active FARM_EP instance of moog_farmd, part of the SURBITON cluster.
  • Machine7 hosts the active FARM_SE instance of moog_farmd, part of the SURBITON cluster.
  • Machine8 hosts nginx and active UI group UI1, part of the SURBITON cluster.
  • Machine9 hosts nginx and active UI group UI2, part of the SURBITON cluster.
  • Machine10 hosts the passive REST1 and passive SOCK1 instances of the REST and SOCKET lams respectively - all part of the KINGSTON cluster.
  • Machine11 hosts the passive REST2 and passive SOCK2 instances of the REST and SOCKET lams respectively - all part of the KINGSTON cluster.
  • Machine12 hosts the passive FARM_EP instance of moog_farmd, part of the KINGSTON cluster.
  • Machine13 hosts the passive FARM_SE instance of moog_farmd, part of the KINGSTON cluster.
  • Machine14 hosts nginx and passive UI group UI1, part of the KINGSTON cluster.
  • Machine15 hosts nginx and passive UI group UI2, part of the KINGSTON cluster.
  • Machine16 hosts the LAM load balancer (e.g. HAProxy), which routes Events only to the Active Instances of both the rest_lam and socket_lam. In the event of a LAM failover, the load balancer switches the routing to the new Active Instances of those LAMs.
  • Machine17 hosts the UI load balancer (e.g. HAProxy), which routes web traffic only to an nginx instance that fronts an Active UI.

Purpose of system

This system provides:

  • Redundancy for the rest_lam, the socket LAM, moog_farmd and the UI
  • Load balancing capability for the rest_lam and UI within a single Cluster
  • The ability to failover socket_lam to another machine within a Cluster
  • Distributed moog_farmd processing


The system makes use of three core servers containing the MooMS broker, MySQL server and Elasticsearch. The MooMS broker and MySQL server here could be replaced with clustered versions of each.
The LAM load balancer is configured to:

  • Round-robin Events between the pair of Active rest_lams in the Cluster and in the event of a failover, route Events to the newly Active rest_lam(s) in the other Cluster
  • Route events to the Active socket_lam and in the event of a failover, route Events to the newly Active socket_lam(s)

The UI load balancer is configured to:

  • Round-robin https traffic (ideally based on sticky sessions or "source") between the pair of Active UI groups and in the event of a failover route traffic to the newly active UI groups in the other Cluster

Installation

Step | Machine | Description | Command line

1. Install the MooMS component

1

Install the mooms RPM
yum install moogsoft-mooms


Initialize the MooMS message broker (by creating a 'zone' and enabling the management plugins)
Where <ZONE> is the name of the MooMS zone you want to create
$MOOGSOFT_HOME/bin/utils/moog_init_mooms.sh -pz <ZONE>

2. Install and configure the database component

2

Install the db RPM
yum install moogsoft-db


Initialize the database
You are prompted for the password associated with the MySQL root user (blank by default)
$MOOGSOFT_HOME/bin/utils/moog_init_db.sh -Iu root


Connect to mysql as the root user and grant all access on the moogdb and moog_reference databases to the ermintrude user from any machine.
GRANT ALL ON moogdb.* TO ermintrude@'%' IDENTIFIED BY 'm00';
GRANT ALL ON moog_reference.* TO ermintrude@'%' IDENTIFIED BY 'm00';

3. Install and configure the search component

3

Install the search RPM
yum install moogsoft-search


Configure search to connect to MySQL and set up the indexer cron job

$MOOGSOFT_HOME/bin/utils/moog_init_search.sh -sd Machine2:3306


Configure the ElasticSearch process to listen on all interfaces (so it can accept remote connections) by adding the following line to the /etc/elasticsearch/elasticsearch.yml file and then restarting the elasticsearch service.
http.host: 0.0.0.0

4. Install the LAMs components

4, 5, 10, 11

Install the lams RPM on each machine
yum install moogsoft-lams


Configure the LAMs to connect to the database on Machine2 and the MooMS broker on Machine1



Ensure that the <ZONE> used here is the same as used in Part 1 step 2 above, when initializing the MooMS message broker

The -b option backs up the original config and bot files

$MOOGSOFT_HOME/bin/utils/moog_init_lams.sh -bz <ZONE> -d Machine2:3306 -m Machine1:5672

5. Install the server and utils components

6, 7, 12, 13

Install the server and utils RPMs on each machine
yum install moogsoft-server moogsoft-utils


Configure the server components to connect to the database on Machine2 and the MooMS broker on Machine1, and configure the events analyzer cron job



Ensure that the <ZONE> used here is the same as used in Part 1 step 2 above, when initializing the MooMS message broker

 The -b option backs up the original config and bot files

There is no need to run moog_init_utils.sh at this stage as everything it needs has already been configured

$MOOGSOFT_HOME/bin/utils/moog_init_server.sh -bez <ZONE> -d Machine2:3306 -m Machine1:5672

6. Install the UI component

8, 9, 14, 15

Install the ui RPM on each machine
yum install moogsoft-ui


Configure the UI component to connect to the database on Machine2, the MooMS broker on Machine1, and Elasticsearch on Machine3




Ensure that the <ZONE> used here is the same as used in Part 1 step 2 above, when initializing the MooMS message broker


 

$MOOGSOFT_HOME/bin/utils/moog_init_ui.sh -otwfxz <ZONE> -c Machine1:15672 -d Machine2:3306 -m Machine1:5672 -s Machine3:9200


Configuration


Component | Details
rest_lam

Using the same mechanisms as described for the rest_lam in the previous 2 system examples, carry out the following actions:

Machine4: Configure ha section in rest_lam.conf with instance: REST1, group: rest_lam, cluster: SURBITON and start_as_passive: false

Machine5: Configure ha section in rest_lam.conf with instance: REST2, group: rest_lam, cluster: SURBITON and start_as_passive: false

Machine10: Configure ha section in rest_lam.conf with instance: REST1, group: rest_lam, cluster: KINGSTON and start_as_passive: true

Machine11: Configure ha section in rest_lam.conf with instance: REST2, group: rest_lam, cluster: KINGSTON and start_as_passive: true

Start the restlamd service on all 4 machines.
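
If the LAM hosts are administered from a central machine, the services can also be started remotely. This is a sketch only, assuming root SSH access to each host and the machine names used in this example:

# start restlamd on each of the four LAM hosts in turn
for host in Machine4 Machine5 Machine10 Machine11; do
  ssh root@"$host" 'service restlamd start'
done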

socket_lam

Using the same mechanisms as described for the socket_lam in the previous 2 system examples, carry out the following actions:

Machine4: Configure ha section in socket_lam.conf with instance: SOCK1, group: socket_lam, cluster: SURBITON, start_as_passive: false and accept_conn_when_passive: false

Machine5: Configure ha section in socket_lam.conf with instance: SOCK2, group: socket_lam, cluster: SURBITON, start_as_passive: true, accept_conn_when_passive: false and default_leader: false

Machine10: Configure ha section in socket_lam.conf with instance: SOCK1, group: socket_lam, cluster: KINGSTON, start_as_passive: true and accept_conn_when_passive: false

Machine11: Configure ha section in socket_lam.conf with instance: SOCK2, group: socket_lam, cluster: KINGSTON, start_as_passive: true, accept_conn_when_passive: false and default_leader: false

Start the socketlamd service on all 4 machines.

moog_farmd

Using the same mechanisms as described for moog_farmd in the previous 2 system examples, carry out the following actions:

Machine6:

  • Configure system.conf and set mooms.message_persistence: true, failover.persist_state: true and failover.hazelcast.hosts: ["<Machine6>","<Machine12>"]
  • Configure moog_farmd.conf with the required moolets (e.g. AlertBuilder, AlertRulesEngine, Sigaliser(s) & SituationMgr) and persist_state settings.
  • Configure ha section of moog_farmd with instance: FARM_EP, group: FARM_EP_GRP and cluster: SURBITON

Machine7:

  • Configure system.conf and set mooms.message_persistence: true, failover.persist_state: true and failover.hazelcast.hosts: ["<Machine7>","<Machine13>"]
  • Configure moog_farmd.conf with the required moolets (e.g. EmptyMoolet, ServiceNow moolet) and persist_state settings.
  • Configure ha section of moog_farmd with instance: FARM_SE, group: FARM_SE_GRP and cluster: SURBITON

Machine12:

  • Configure system.conf as for Machine6 above.
  • Configure moog_farmd.conf with the required moolets as for Machine6 above.
  • Configure ha section of moog_farmd with instance: FARM_EP, group: FARM_EP_GRP, cluster: KINGSTON and start_as_passive: true

Machine13: 

  • Configure system.conf as for Machine7 above.
  • Configure moog_farmd.conf with the required moolets as for Machine7 above.
  • Configure ha section of moog_farmd with instance: FARM_SE, group: FARM_SE_GRP, cluster: KINGSTON and start_as_passive: true

Finally, start the moogfarmd services on Machine6, Machine7, Machine12 and Machine13.

UI servlets

On Machine8, Machine9, Machine14 and Machine15, configure system.conf to set mooms.message_persistence: true

Using the same mechanisms as described for UI servlets in the previous 2 system examples, carry out the following actions:

Machine8: Configure servlets.conf and set webhost : https://<Machine17> and in the ha section set instance: servlets, group: UI1, cluster: SURBITON and start_as_passive: false

Machine9: Configure servlets.conf and set webhost : https://<Machine17> and in the ha section set instance: servlets, group: UI2, cluster: SURBITON and start_as_passive: false

Machine14: Configure servlets.conf and set webhost : https://<Machine17> and in the ha section set instance: servlets, group: UI1, cluster: KINGSTON and start_as_passive: true

Machine15: Configure servlets.conf and set webhost : https://<Machine17> and in the ha section set instance: servlets, group: UI2, cluster: KINGSTON and start_as_passive: true

Finally, restart the apache-tomcat services on Machine8, Machine9, Machine14 and Machine15.

LAM Load Balancer

The following is an example HAProxy configuration on Machine16 for rest_lam and socket_lam. 

global
  log 127.0.0.1   local0
  log 127.0.0.1   local1 notice
  maxconn 4096
  chroot /var/lib/haproxy
  user haproxy
  group haproxy
  daemon
  #debug
  #quiet

defaults
  mode tcp
  maxconn 10000
  timeout connect 5s
  timeout client 100s
  timeout server 100s

listen stats :9090
  balance
  mode http
  stats enable
  stats auth admin:admin
 
frontend rest_lam_frontend
  bind Machine16:8888
  mode http
  default_backend rest_lam_backend

backend rest_lam_backend
  balance roundrobin
  mode http
  option httpchk POST
  http-check expect ! status 503
  server rest_lam_1 Machine4:8888 check
  server rest_lam_2 Machine5:8888 check
  server rest_lam_3 Machine10:8888 check
  server rest_lam_4 Machine11:8888 check

frontend socket_lam_frontend
  bind Machine16:8411
  mode tcp
  default_backend socket_lam_backend

backend socket_lam_backend
  balance roundrobin
  mode tcp
  server socket_lam1 Machine4:8411 check
  server socket_lam2 Machine5:8411 check
  server socket_lam3 Machine10:8411 check
  server socket_lam4 Machine11:8411 check
  • This config offers port 8888 for REST events and port 8411 for SOCKET events.
  • http mode is used for the rest_lam and tcp mode for the socket_lam
  • The httpchk option is used to get the Active/Passive status of the rest_lam and to treat a response of 503 (Passive) as if the LAM was down
UI Load Balancer

The following is an example HAProxy configuration on Machine17 for the UI:

global
  log 127.0.0.1   local0
  log 127.0.0.1   local1 notice
  maxconn 4096
  chroot /var/lib/haproxy
  user haproxy
  group haproxy
  daemon
  #debug
  #quiet

defaults
  mode tcp
  maxconn 10000
  timeout connect 5s
  timeout client 24d
  timeout server 24d

listen stats :9090
  balance
  mode http
  stats enable
  stats auth admin:admin
 
frontend ui_front 
  bind Machine17:443 
  option tcplog 
  mode tcp 
  default_backend ui_back 

 backend ui_back 
  mode tcp 
  balance source 
  option httpchk GET /moogsvr/hastatus 
  http-check expect status 204 
  server ui_1 Machine8:443 check check-ssl verify none inter 50 
  server ui_2 Machine9:443 check check-ssl verify none inter 50
  server ui_3 Machine14:443 check check-ssl verify none inter 50 
  server ui_4 Machine15:443 check check-ssl verify none inter 50
  • This config offers port 443 for UI traffic.
  • tcp mode is used throughout to handle the https traffic in ssl passthrough mode
  • Note the setting of "timeout client" and "timeout server" to the maximum allowable of 24d (days) - this ensures the browser-server websocket connection does not get disconnected.
  • The httpchk option is used against the /moogsvr/hastatus endpoints to get their Active/Passive status (a 204 response means active, a 503 response is passive)
  • Source balancing is used so that requests from different sources are always routed to the same backend server. If all browser clients are behind a NAT and present the same IP then this would not be a suitable method. Session stickiness is another configuration option - see this link: http://cbonte.github.io/haproxy-dconv/configuration-1.6.html#4-balance


Example failover commands

Command | Description | Command line
rest_lam failover: To fail over only the rest_lam Process Group from the SURBITON Cluster to the KINGSTON Cluster (i.e. deactivating the SURBITON.rest_lam.REST1 and SURBITON.rest_lam.REST2 Instances and activating the KINGSTON.rest_lam.REST1 and KINGSTON.rest_lam.REST2 Instances), run:
$MOOGSOFT_HOME/bin/ha_cntl -a KINGSTON.rest_lam

To fail over only the SURBITON.rest_lam.REST1 Instance to the KINGSTON Cluster, run the following two commands manually:
$MOOGSOFT_HOME/bin/ha_cntl -d SURBITON.rest_lam.REST1
$MOOGSOFT_HOME/bin/ha_cntl -a KINGSTON.rest_lam.REST1
socket_lam failover: To fail over only the socket_lam Process Group from the SURBITON Cluster to the KINGSTON Cluster (i.e. deactivating the SURBITON.socket_lam.SOCK1 Instance and activating only the KINGSTON.socket_lam.SOCK1 Instance, as determined by the "leader" setting), run:
$MOOGSOFT_HOME/bin/ha_cntl -a KINGSTON.socket_lam
moog_farmd failover: To fail over only the FARM_EP_GRP moog_farmd Process Group from the SURBITON Cluster to the KINGSTON Cluster (i.e. deactivating the SURBITON.moog_farmd.FARM_EP Instance and activating the KINGSTON.moog_farmd.FARM_EP Instance), run:
$MOOGSOFT_HOME/bin/ha_cntl -a KINGSTON.FARM_EP

To fail over only the FARM_SE_GRP moog_farmd Process Group from the SURBITON Cluster to the KINGSTON Cluster (i.e. deactivating the SURBITON.moog_farmd.FARM_SE Instance and activating the KINGSTON.moog_farmd.FARM_SE Instance), run:
$MOOGSOFT_HOME/bin/ha_cntl -a KINGSTON.FARM_SE
UI servlet failover: To fail over only the UI1 Process Group from the SURBITON Cluster to the KINGSTON Cluster (i.e. deactivating the SURBITON.UI1.servlets Instance and activating the KINGSTON.UI1.servlets Instance), run:
$MOOGSOFT_HOME/bin/ha_cntl -a KINGSTON.UI1

To fail over only the UI2 Process Group from the SURBITON Cluster to the KINGSTON Cluster (i.e. deactivating the SURBITON.UI2.servlets Instance and activating the KINGSTON.UI2.servlets Instance), run:
$MOOGSOFT_HOME/bin/ha_cntl -a KINGSTON.UI2

Any of the above can be failed back individually by reactivating the Process Group in the SURBITON Cluster, for example:
$MOOGSOFT_HOME/bin/ha_cntl -a SURBITON.rest_lam
$MOOGSOFT_HOME/bin/ha_cntl -a SURBITON.socket_lam
$MOOGSOFT_HOME/bin/ha_cntl -a SURBITON.FARM_EP
$MOOGSOFT_HOME/bin/ha_cntl -a SURBITON.FARM_SE
$MOOGSOFT_HOME/bin/ha_cntl -a SURBITON.UI1
$MOOGSOFT_HOME/bin/ha_cntl -a SURBITON.UI2
Cluster failover: To fail over the entire SURBITON Cluster to the KINGSTON Cluster (i.e. deactivating everything in the SURBITON Cluster and activating everything in the KINGSTON Cluster except the SOCK2 Instance, as per the leader setting), run:
$MOOGSOFT_HOME/bin/ha_cntl -a KINGSTON

To fail back to the SURBITON Cluster again, run:
$MOOGSOFT_HOME/bin/ha_cntl -a SURBITON