Initializing the Core Moogsoft Services

The next step in the HA installation is to set up the Core services on each of your three HA nodes. The primary and secondary nodes will run all of the core services, including MooMs (RabbitMQ), Search, and Moogfarmd, while the redundancy node will run only MooMs (RabbitMQ) and Search. You will cluster the RabbitMQ instances and the Elasticsearch instances into three-node clusters to avoid the split-brain problem, and you will set up Moogfarmd as an active/passive HA pair on your primary and secondary nodes.

Overview

For this exercise, the Moogsoft RPM packages for all services have already been installed on your nodes; you only need to initialize them. In a real-world scenario, you would first need to install each RPM package before initializing it.

You will fulfill the following requirements:

  • Initialize the MooMs (RabbitMQ) service on all three nodes:

    1. Initialize the MooMs (RabbitMQ) service on all three nodes using the standard Moogsoft Enterprise init script for MooMs with the command:

      moog_init_mooms.sh -pz <ZONE> 

      where <ZONE> is your RabbitMQ zone, which is arbitrary, but must match on all three clustered nodes. For this exercise, you will use HA_TRAIN as your zone.

    2. Make the Erlang cookie on your primary node readable by others. This will allow you to copy it into your secondary and redundancy nodes, as all nodes in a RabbitMQ cluster must share the same Erlang cookie.

    3. Overwrite the Erlang cookie on your secondary and redundancy nodes with the one from your primary node. You can use the scp (secure copy) command to accomplish this.

    4. Once the cookie is copied, remove the read permissions for others from the Erlang cookie on your primary node. Final permissions for all .erlang.cookie files should be 400, and can be verified by listing the relevant directory with the command: ls -la.

    5. On your secondary and redundancy nodes, configure the RabbitMQ instances to join a cluster with your primary node by running the following commands. Here <short-hostname-for-primary-node> is the "short" hostname of your primary node (the part before the first dot), in the form ip-X-Y-Z-W, where X.Y.Z.W is the AWS private IP address of the node; the short hostname is part of the standard command prompt on EC2 instances, so you can read it from the command line interface without running a command. <ZONE> is the RabbitMQ zone as above:

      systemctl restart rabbitmq-server
      
      rabbitmqctl stop_app
      
      rabbitmqctl join_cluster rabbit@<short-hostname-for-primary-node>
      
      rabbitmqctl start_app
      
      rabbitmqctl set_policy -p <ZONE> ha-all ".+\.HA" '{"ha-mode":"all"}'
  • Test that the RabbitMQ cluster is correct by running the command:

    rabbitmqctl cluster_status
  • Initialize the Elasticsearch service on all three nodes, and configure the three instances as a three-node cluster.

    1. Initialize the Moogsoft search service on all three nodes using the standard Moogsoft Enterprise init script for Search with the command:

      moog_init_search.sh
    2. Edit the Elasticsearch configuration file at /etc/elasticsearch/elasticsearch.yml and uncomment/edit the following properties:

      cluster.name: moogsoftEnterprise
      node.name: <full hostname of server, for example ip-172-31-82-211.ec2.internal>
      network.host: 0.0.0.0
      http.port: 9200
      discovery.zen.ping.unicast.hosts: ["<PRIMARY_IP>", "<SECONDARY_IP>", "<REDUNDANCY_IP>"]
      discovery.zen.minimum_master_nodes: 1
      gateway.recover_after_nodes: 1

      Note

      The double quotes are required around the IP addresses and there should be no hard line breaks within the property values.

    3. Once your Elasticsearch configuration is correct, restart the elasticsearch service and test your cluster by running this command on your primary node:

      curl -X GET "localhost:9200/_cat/health?v&pretty"
  • Configure system.conf and moog_farmd.conf for HA, and then start the moogfarmd service.

    1. On your primary and secondary nodes, edit the mooms section of the $MOOGSOFT_HOME/config/system.conf file to contain the correct zone (HA_TRAIN) and to include your primary and secondary nodes in the brokers list.

    2. On your primary and secondary nodes, edit the search section of system.conf to include your primary and secondary nodes in the nodes list.

    3. On your primary and secondary nodes:

      1. Edit the failover section of system.conf to set persist_state to true, enable hazelcast for the primary and secondary nodes with cluster_per_group set to true, and set automatic_failover to true.

      2. Edit the ha section of system.conf to set the correct cluster name: PRIMARY for your primary node and SECONDARY for your secondary node.

    4. On your primary and secondary nodes, edit the $MOOGSOFT_HOME/config/moog_farmd.conf file to set up the HA pair for the moogfarmd service. The ha section of those files needs to look like the one below—note that you may need to add a comma before the ha section to preserve correct JSON syntax when you uncomment the section:

      ,
      ha:
      {
           group: "moog_farmd",
           instance: "<instance_name, e.g. primary or secondary>",
           default_leader: true (on primary) or false (on secondary),
           start_as_passive: false
      }
    5. On your primary and secondary nodes, start the Moogfarmd service.

    6. Test your work by looking at the Moogfarmd service log.

Note

If you know how to set up and test the core services (MooMs/RabbitMQ, Elasticsearch, and Moogfarmd) on your three nodes, you are encouraged to implement the exercise using the solution design above and your knowledge from our discussion in class. If you are unsure how to proceed, or if you get stuck, read on for the full step-by-step solution.

Step-By-Step Solution

  1. Connect to your primary node via SSH and become root.

  2. Change to the $MOOGSOFT_HOME/bin/utils directory and run the command:

    moog_init_mooms.sh -pz <ZONE> --accept-eula

    where <ZONE> is the RabbitMQ zone for your cluster. We will use HA_TRAIN as the zone, so your command should be:

    moog_init_mooms.sh -pz HA_TRAIN --accept-eula
  3. Once initialization is complete, change to the /var/lib/rabbitmq directory and add read permissions for others to the .erlang.cookie file with the command:

    chmod o+r /var/lib/rabbitmq/.erlang.cookie
  4. Perform step 2 on your secondary and redundancy nodes.

  5. When initialization is complete, go to your secondary node and change to the /var/lib/rabbitmq directory. Copy the .erlang.cookie file from your primary node by running the command (all on one line):

    scp <user>@<primary-dns-name>:/var/lib/rabbitmq/.erlang.cookie ./.erlang.cookie

    where <user> is the username you use to log into the instances via ssh and <primary-dns-name> is the DNS name of the primary instance (example: nico-primary.datalab.moogsoft.com).

  6. Perform step 5 on your redundancy node.

  7. On your primary node, remove the read permission for others that you just added to the .erlang.cookie file (since it is now copied to your secondary and redundancy nodes) with the command:

    chmod o-r /var/lib/rabbitmq/.erlang.cookie
  8. Check that your Erlang cookie is the same on all three instances by using the command:

    cat .erlang.cookie

    Final permissions for all .erlang.cookie files should be 400. Verify this by listing the relevant directory with the command: ls -la.
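If you would rather script this check than eyeball the listing, here is a minimal sketch (the helper name is hypothetical; stat -c and md5sum are the GNU/Linux forms):

```shell
# check_cookie: print the permission mode and a content checksum for a
# cookie file. Run it on each node: the checksums must match across all
# three nodes, and the mode should be 400.
check_cookie() {
  local f="${1:-/var/lib/rabbitmq/.erlang.cookie}"
  local mode sum
  mode=$(stat -c '%a' "$f")            # numeric permission bits (GNU stat)
  sum=$(md5sum "$f" | cut -d' ' -f1)   # fingerprint to compare across nodes
  echo "$f mode=$mode md5=$sum"
  [ "$mode" = "400" ] || echo "WARN: expected mode 400, found $mode"
}
```

Identical cookies produce identical checksums, so one line of output per node is enough to confirm the cluster-wide match.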

  9. On your secondary node, restart the rabbitmq-server service by running the command:

    systemctl restart rabbitmq-server
  10. Use the rabbitmqctl utility to stop the RabbitMQ application by running the command:

    rabbitmqctl stop_app
  11. Tell RabbitMQ to join in a cluster with your primary node by running the command:

    rabbitmqctl join_cluster rabbit@<short-hostname-for-primary-node>

    where <short-hostname-for-primary-node> is the part of your primary node's AWS private DNS name before the first dot. The short name is part of the standard command prompt on EC2 instances, so you can read it on the command line interface of the primary node. It should be of the form ip-X-Y-Z-W, where X.Y.Z.W is the private IP address of the primary node.

    A sample command, using the primary DNS name ip-172-31-13-94.ec2.internal, would be:

    rabbitmqctl join_cluster rabbit@ip-172-31-13-94
  12. Restart the RabbitMQ application by running the command:

    rabbitmqctl start_app
  13. Set the correct HA policy for RabbitMQ by running the command (all on one line):

    rabbitmqctl set_policy -p <ZONE> ha-all ".+\.HA" '{"ha-mode":"all"}'

    where <ZONE> is the same RabbitMQ zone you used when initializing MooMs (HA_TRAIN). The pattern ".+\.HA" matches the Moogsoft queue names ending in ".HA", and "ha-mode":"all" mirrors those queues across all nodes in the cluster.

  14. Perform steps 9 through 13 above on your redundancy node, making sure to use the same hostname (the one for your primary node) on the join_cluster command.

  15. At this point you should have a correctly configured three-node RabbitMQ cluster. Test your work by running the command:

    rabbitmqctl cluster_status

    from any of your nodes. The output should reflect the fact that all three nodes are now clustered.
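If you want a scripted version of this check, a sketch that counts the distinct node names in the status output (the helper is hypothetical; it assumes node names appear as rabbit@<host>, which holds for both the classic Erlang-term output and the newer formatted output):

```shell
# count_cluster_nodes: count the distinct rabbit@<host> names on stdin.
# Node names appear more than once in cluster_status output (disc nodes
# plus running nodes), so sort -u deduplicates before counting.
count_cluster_nodes() {
  grep -o 'rabbit@[A-Za-z0-9._-]*' | sort -u | wc -l
}
# Usage on any cluster node -- expect 3 for this exercise:
#   rabbitmqctl cluster_status | count_cluster_nodes
```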

  16. Now on to the Elasticsearch service. On your primary node, change to the $MOOGSOFT_HOME/bin/utils directory and run the command:

    moog_init_search.sh

    This initializes the Elasticsearch service.

  17. Change to the /etc/elasticsearch directory and edit the elasticsearch.yml file to uncomment and edit the following properties:

    1. cluster.name: moogsoftEnterprise

    2. node.name: <primary-private-DNS-name>, which is the short hostname followed by .ec2.internal (example: ip-172-31-13-94.ec2.internal)

    3. network.host: 0.0.0.0

    4. http.port: 9200

    5. discovery.zen.ping.unicast.hosts: ["<primary_IP>", "<secondary_IP>", "<redundancy_IP>"], where the <..._IP> values are the private IP addresses for all three of your nodes. Note that the order matters, and that you must include the quotes around the IP addresses.

    6. discovery.zen.minimum_master_nodes: 1

    7. gateway.recover_after_nodes: 1

    Once you are done with the edits, save the file.

  18. Perform steps 16 and 17 above on your secondary and redundancy nodes, making sure to use the respective DNS name in the node.name property for each node.

  19. On your primary node, restart the Elasticsearch service by running the command:

    systemctl restart elasticsearch
  20. Perform step 19 above on your secondary and redundancy nodes.

  21. You should now have a correctly configured three-node Elasticsearch cluster. Test your work by running the following command from your primary node:

    curl -X GET "localhost:9200/_cat/health?v&pretty"

    The output should reflect the fact that the moogsoftEnterprise cluster contains three nodes and has a status of green, with 100% of the shards active.
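The same check can be scripted; a minimal sketch, assuming the default _cat/health column order (status is the fourth column when no explicit column list is requested):

```shell
# health_status: extract the "status" column from `_cat/health?v` output.
# NR==2 skips the header row that the v (verbose) flag adds.
health_status() {
  awk 'NR==2 {print $4}'
}
# Usage:
#   curl -s "localhost:9200/_cat/health?v" | health_status
```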

  22. Finally, on to the Moogfarmd service. This service is only required on the primary and secondary nodes. On your primary node, change to the $MOOGSOFT_HOME/config directory and edit the system.conf file to uncomment and/or edit the following properties:

    1. In the mooms section, ensure the zone property is set to "HA_TRAIN".

    2. In the mooms section, set the brokers list to the private IP addresses for all three of your nodes in the following order:

      {"host": "<primary_IP>", "port": 5672},
      {"host": "<secondary_IP>", "port": 5672},
      {"host": "<redundancy_IP>", "port": 5672}
    3. In the mooms section, set the value of the cache_on_failure property to true.

    4. In the search section, set the nodes list to the private IP addresses for all three of your nodes in the following order:

      {"host": "<primary_IP>", "port": 9200},
      {"host": "<secondary_IP>", "port": 9200},
      {"host": "<redundancy_IP>", "port": 9200}

    5. In the failover section, set:

      1. The persist_state property to true

      2. In the hazelcast property, set:

        1. "hosts": ["<primary_IP>","<secondary_IP>"],

        2. "cluster_per_group": true

      3. The automatic_failover property to true

    6. In the ha section, set the cluster property to "PRIMARY".

    When you are done editing the properties, save the file.

  23. Perform step 22 above on your secondary node, making sure to set the cluster property in the ha section to "SECONDARY".

  24. On the primary node, edit the $MOOGSOFT_HOME/config/moog_farmd.conf file to uncomment the ha section and set the properties as follows. Note that we are removing the cluster property, as it will be obtained from system.conf:

    1. group: "moog_farmd"

    2. instance: "moogfarmd_primary"

    3. default_leader: true

    4. start_as_passive: false

    Note

    Add a comma before the ha section to preserve proper JSON syntax when you uncomment the section. When you are done editing the properties, save the file.

  25. Start the Moogfarmd service by running the command:

    systemctl start moogfarmd
  26. Perform steps 24 and 25 above on your secondary node, making sure to:

    1. Set "moogfarmd_secondary" for the value of the instance property in the ha section.

    2. Set the default_leader property to false in the ha section.

    3. Save the moog_farmd.conf file before starting the moogfarmd service.

  27. You should now have a correctly configured HA pair for the Moogfarmd service, and you can test your work by looking at the Moogfarmd logs. Your primary node should have normal log output, and your secondary node should have detected that the primary node is running and set itself to passive.
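As a final scripted sanity check, here is a sketch that scans a moogfarmd log for a passive marker. The default log path below is an assumption (adjust it to your installation), and the exact log wording varies between releases:

```shell
# check_passive: look for a line mentioning "passive" in a moogfarmd log.
# On a healthy HA pair, the secondary's log should contain such a line.
check_passive() {
  local log="${1:-/var/log/moogsoft/moogfarmd.log}"  # assumed default path
  if grep -qi "passive" "$log"; then
    echo "passive marker found"
  else
    echo "no passive marker found"
  fi
}
```

Run it on the secondary node after both instances are up.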

This concludes the lab section.