Set Up the Redundancy Server Role

In Moogsoft AIOps HA architecture, both RabbitMQ and ElasticSearch run as three-node clusters. The three-node clusters prevent issues with ambiguous data state, such as a "split-brain".

RabbitMQ is the Message Bus used by Moogsoft AIOps. Elasticsearch delivers the search functionality.

The three nodes are distributed across the two Core roles and the redundancy server.

HA architecture

In our distributed HA installation, the RabbitMQ and Elasticsearch components are installed on the Core 1, Core 2 and Redundancy servers.

Core 1: RabbitMQ Node 1, Elasticsearch Node 1
Core 2: RabbitMQ Node 2, Elasticsearch Node 2
Redundancy server: RabbitMQ Node 3, Elasticsearch Node 3

Refer to the Distributed HA system Firewall for more information on connectivity within a fully distributed HA architecture.

Install Redundancy server

Install the Moogsoft AIOps components on the Redundancy server.

On the Redundancy server install the following Moogsoft AIOps components:

VERSION=7.3.1.1; yum -y install moogsoft-common-${VERSION} \
    moogsoft-mooms-${VERSION} \
    moogsoft-search-${VERSION} \
    moogsoft-utils-${VERSION}

Edit the ~/.bashrc file to contain the following lines:

export MOOGSOFT_HOME=/usr/share/moogsoft
export APPSERVER_HOME=/usr/share/apache-tomcat
export JAVA_HOME=/usr/java/latest
export PATH=$PATH:$MOOGSOFT_HOME/bin:$MOOGSOFT_HOME/bin/utils

Source the .bashrc file:

source ~/.bashrc

Initialize RabbitMQ cluster node 3 on the Redundancy server and join the cluster.
1. On the Redundancy server initialise RabbitMQ. Use the same zone name as Core 1 and Core 2:
```
moog_init_mooms.sh -pz <zone>
```
2. The erlang cookies must be the same for all RabbitMQ nodes. Replace the erlang cookie on the Redundancy server with the Core 1 erlang cookie located at /var/lib/rabbitmq/.erlang.cookie. Make the Redundancy server cookie read-only:
```
chmod 400 /var/lib/rabbitmq/.erlang.cookie
```
  You may need to change the file permissions on the Redundancy server erlang cookie first to allow this file to be overwritten. For example:
```
chmod 406 /var/lib/rabbitmq/.erlang.cookie
```
3. Restart the rabbitmq-server service and join the cluster. Substitute the Core 1 server short hostname:
```
systemctl restart rabbitmq-server
rabbitmqctl stop_app
rabbitmqctl join_cluster rabbit@<Core 1 server short hostname>
rabbitmqctl start_app
```
  The short hostname is the full hostname excluding the DNS domain name. For example, if the hostname is ip-172-31-82-78.ec2.internal, the short hostname is ip-172-31-82-78. To find out the short hostname, run rabbitmqctl cluster_status on Core 1.
4. Apply the HA mirrored queues policy. Use the same zone name as Core 1:
```
rabbitmqctl set_policy -p <zone> ha-all ".+\.HA" '{"ha-mode":"all"}'
```
5. Run rabbitmqctl cluster_status to verify the cluster status and queue policy. Example output is as follows
```
Cluster status of node rabbit@ldev02 ...
[{nodes,[{disc,[rabbit@ldev01,rabbit@ldev02]}]},
 {running_nodes,[rabbit@ldev01,rabbit@ldev02]},
 {cluster_name,<<"rabbit@ldev02">>},
 {partitions,[]},
 {alarms,[{rabbit@ldev01,[]},{rabbit@ldev02,[]}]}]
[root@ldev02 rabbitmq]# rabbitmqctl -p MOOG list_policies
Listing policies for vhost "MOOG" ...
MOOG    ha-all  .+\.HA  all {"ha-mode":"all"}   0
```
Initialise, configure and start Elasticsearch cluster node 3 on the Redundancy server.
1. Initialize Elasticsearch on the Redundancy server:
```
moog_init_search.sh
```
2. Uncomment and edit the properties of the /etc/elasticsearch/elasticsearch.yml file on the Redundancy server as follows:
```
cluster.name: aiops
node.name: <Redundancy server hostname>
...
network.host: 0.0.0.0
http.port: 9200
discovery.zen.ping.unicast.hosts: [ "<Core 1 hostname>","<Core 2 hostname>","<Redundancy server hostname>"]
discovery.zen.minimum_master_nodes: 1
gateway.recover_after_nodes: 1
node.master: true
```
3. Restart Elasticsearch on the Core 1, Core 2 and Redundancy servers:
```
systemctl restart elasticsearch
```

Verify that the Elasticsearch nodes are working correctly:

curl -X GET "localhost:9200/_cat/health?v&pretty"

Example cluster health status:

epoch timestamp cluster status 
node.total node.data shards pri relo init 
unassign pending_tasks 
max_task_wait_time 
active_shards_percent
1580490422 17:07:02 aiops green 3 3 0 0 0 0 0 0 - 100.0%

In this section:

Set Up the Redundancy Server Role

HA architecture

Install Redundancy server

Search results