Moogsoft Docs

HA - Summary for Moogsoft AIOps

  1. Moogsoft AIOps architecture: Nginx replaces httpd and Elasticsearch replaces Sphinx.
  2. The UI servlets now all act as one HA "instance". This is described under ticket MOOG-3825 in the v6 Release Notes.
  3. The moogsvr hastatus endpoint still works as before, but the moogpoller hastatus endpoint has been removed (because of the websockets changes). Because the UI servlets now all act as one instance, only the moogsvr hastatus endpoint is needed to determine the UI's active/passive status. Additionally, a passive moogpoller servlet will not accept incoming websocket connections, and a servlet that goes from active to passive will disconnect its existing websocket connections.
  4. The switch to websockets means that care must be taken with any load balancer fronting an active/passive UI pair to ensure that the websocket connection is not periodically reset. If it is reset, the UI will refresh by itself. Below is an example haproxy config, which I will be using in the HA docs, for a load balancer on qatest1 fronting an HA pair of UIs on qatest2 and qatest3. It is the "timeout client" and "timeout server" settings (at their maximum allowed values of 24 days) that prevent the websocket from being disconnected:

global
        log   local2
        log   local1 notice
        maxconn 4096
        chroot /var/lib/haproxy
        user haproxy
        group haproxy

defaults
        log     global
        mode    http
        option  httplog
        option  dontlognull
        option  redispatch
        retries 3
        maxconn 2000
        timeout connect 5000
        timeout client  24d
        timeout server  24d

listen haproxy-monitoring *:9090
  mode    http
  stats   enable
  stats   show-legends
  stats   refresh           5s
  stats   uri               /
  stats   realm             Haproxy\ Statistics
  stats   auth              admin:admin
  stats   admin             if TRUE

frontend ui_front
  bind qatest1:443
  option tcplog
  mode tcp
  default_backend ui_back

backend ui_back
  mode tcp
  balance source
  option httpchk GET /moogsvr/hastatus
  http-check expect status 204
  server ui_1 qatest2:443 check check-ssl verify none inter 100
  server ui_2 qatest3:443 check check-ssl verify none inter 100
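
The health check in the backend above can also be reproduced outside haproxy, for example in a monitoring script. The following is a minimal Python sketch, not part of the product: it assumes, as the `http-check expect status 204` line implies, that an active UI answers `GET /moogsvr/hastatus` with 204, and it treats any other result as "not active" because the passive response code is not stated here. The function names (`probe`, `is_active`, `active_servers`) are hypothetical.

```python
import ssl
import urllib.error
import urllib.request
from typing import Dict, List, Optional

# Taken from "http-check expect status 204" in the haproxy config above.
ACTIVE_STATUS = 204


def is_active(status_code: Optional[int]) -> bool:
    """Return True only when the hastatus check reported the active status."""
    return status_code == ACTIVE_STATUS


def active_servers(statuses: Dict[str, Optional[int]]) -> List[str]:
    """Return the names of the servers whose hastatus check reported active."""
    return [name for name, code in statuses.items() if is_active(code)]


def probe(host: str, timeout: float = 2.0) -> Optional[int]:
    """Fetch https://<host>/moogsvr/hastatus and return the HTTP status code.

    Certificate verification is disabled to mirror "verify none" in the
    config. Returns None if the server cannot be reached.
    """
    ctx = ssl.create_default_context()
    ctx.check_hostname = False
    ctx.verify_mode = ssl.CERT_NONE
    url = f"https://{host}/moogsvr/hastatus"
    try:
        with urllib.request.urlopen(url, timeout=timeout, context=ctx) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        return err.code  # non-2xx answers still carry a status code
    except (urllib.error.URLError, OSError):
        return None
```

With the hosts from the example, `active_servers({h: probe(h) for h in ("qatest2", "qatest3")})` would list whichever UI is currently active.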

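The "24d" value used for "timeout client" and "timeout server" can be checked with a little arithmetic. The sketch below assumes that haproxy limits timers to 2^31 - 1 milliseconds (a little under 25 days); under that assumption 24d is the largest whole number of days that fits. The helper names `to_ms` and `fits` are hypothetical.

```python
# Assumption: haproxy timers are limited to 2**31 - 1 milliseconds.
MAX_TIMER_MS = 2**31 - 1

MS_PER_UNIT = {"ms": 1, "s": 1_000, "m": 60_000, "h": 3_600_000, "d": 86_400_000}


def to_ms(value: str) -> int:
    """Convert a duration such as '24d' or '5000' to milliseconds."""
    # "ms" is checked before "s" so that '500ms' is not read as seconds.
    for suffix in ("ms", "s", "m", "h", "d"):
        if value.endswith(suffix):
            return int(value[: -len(suffix)]) * MS_PER_UNIT[suffix]
    return int(value)  # no suffix: the timeout is already in milliseconds


def fits(value: str) -> bool:
    """Return True if the duration is within the assumed timer limit."""
    return to_ms(value) <= MAX_TIMER_MS


print(to_ms("24d"), fits("24d"))  # 2073600000 True
print(to_ms("25d"), fits("25d"))  # 2160000000 False
```
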
  5. The "failover.margin" property in system.conf has been increased from 3 to 10 seconds. This only comes into play if moog_farmd auto-failover is being used, and it means that the passive moog_farmd will allow a little more time before attempting to take over processing. This makes auto-failover a little less "twitchy". Auto-failover (for moog_farmd) was introduced in 5.2.0 and is explained there: