Moogsoft Docs

Troubleshoot Slow UI

If the UI is slow, for example with long login times or summary counters that keep spinning, the problem is likely with Tomcat, the database, or both. The following diagnostic steps will help you track down the cause:

Step

Description

Possible Cause and Resolution

1

Check catalina.out for any obvious errors or warnings.

Cause may be evident from any warnings or errors.
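As a starting point for step 1, the following sketch lists suspicious lines in a Tomcat log. The function name scan_tomcat_log and the match patterns (ERROR, WARN, SEVERE, Exception) are assumptions for illustration, not a documented Moogsoft log format; adjust them to what your catalina.out actually contains.

```shell
# Hypothetical helper: list lines in a log file that look like errors or warnings.
# The match patterns are assumptions; adjust them to your log format.
scan_tomcat_log() {
    log_file=$1
    # -n prefixes each hit with its line number; -E enables alternation.
    grep -nE 'ERROR|WARN|SEVERE|Exception' "$log_file"
}

# Example usage (log path as used elsewhere in this guide):
#   scan_tomcat_log /usr/share/apache-tomcat/logs/catalina.out | tail -n 50
```

Piping the result through tail keeps the focus on the most recent entries, which are usually the ones relevant to a current slowdown.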

2

Check the browser console for any errors or timed-out requests.

Possibly a bug or, more likely, the database query associated with the request is taking longer than 30 seconds (the default browser timeout). Investigate the root cause.

3

Check network latency between browser client machine and server using ping.

Latency of 100 ms or more can make login noticeably slower.
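The latency check in step 3 can be scripted roughly as follows. This is a minimal sketch that assumes the summary line printed by common ping implementations ("rtt min/avg/max/mdev = ..." or "round-trip min/avg/max/stddev = ..."); the function names are hypothetical.

```shell
# Hypothetical helper: read ping output on stdin and print the average
# round-trip time in milliseconds, taken from the summary line, e.g.
#   rtt min/avg/max/mdev = 10.1/120.5/200.2/30.0 ms
avg_rtt_ms() {
    # Splitting on "/" puts the average in the fifth field.
    awk -F'/' '/^(rtt|round-trip) / { print $5 }'
}

# Succeeds (exit 0) when the given latency is at or above 100 ms,
# the level at which login becomes noticeably slower.
latency_is_high() {
    awk -v ms="$1" 'BEGIN { exit !(ms >= 100) }'
}

# Example usage (moog-server is a placeholder hostname):
#   avg=$(ping -c 5 moog-server | avg_rtt_ms)
#   latency_is_high "$avg" && echo "High latency: ${avg} ms"
```

Run the ping from the browser client machine, not from the server, so that the measured path matches the one the UI traffic takes.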

4

Check the CPU/memory usage of the server itself.

If the server as a whole is running close to its CPU or memory limit and no other issues can be found (e.g. rogue processes or memory leaks in the Moogsoft AIOps components), consider adding more resources to the server or distributing the Moogsoft AIOps components.
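For the memory half of step 4, one possible quick check on Linux is to compute the share of memory in use from /proc/meminfo. The helper below is a sketch with a hypothetical name; it relies on the MemTotal and MemAvailable fields, which older kernels may not provide.

```shell
# Hypothetical helper: print the percentage of memory in use, computed from
# MemTotal and MemAvailable in a /proc/meminfo-style file (Linux only).
mem_used_pct() {
    meminfo=${1:-/proc/meminfo}
    awk '/^MemTotal:/     { total = $2 }
         /^MemAvailable:/ { avail = $2 }
         END { if (total > 0) printf "%d\n", (total - avail) * 100 / total }' "$meminfo"
}

# Example usage:
#   echo "Memory in use: $(mem_used_pct)%"
```

Tools such as top or free give the same picture interactively; the function form is only convenient when the value needs to be logged over time.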

5

Check the MoogSvr/Moogpoller/Graze counter logging in catalina.out.

Tomcat may be processing a high number of requests or bus updates.

If the Moogpoller count is zero, something may be wrong with the connection between Tomcat and RabbitMQ. Check the RabbitMQ admin UI for signs of message queue build-up.

6

Check whether the Tomcat java process is showing constantly high CPU or memory usage.

Tomcat may be processing the updates from an event or Situation storm. The backlog should clear once the storm subsides.

7

Has the memory of the Tomcat java process reached a plateau?

Tomcat may have reached its Java heap limit. Check the -Xmx setting in /etc/init.d/apache-tomcat.

Increase the -Xmx setting as appropriate and restart the apache-tomcat service.
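To see the current heap limit before changing it, a small sketch such as the following can pull the -Xmx value out of the init script. The function name show_xmx is hypothetical, and the pattern assumes the usual -Xmx<number><unit> form.

```shell
# Hypothetical helper: print every -Xmx heap setting found in a file.
show_xmx() {
    # -e marks the next argument as the pattern, so its leading "-" is not
    # mistaken for an option; -o prints only the matching text.
    grep -oE -e '-Xmx[0-9]+[kKmMgG]?' "$1"
}

# Example usage:
#   show_xmx /etc/init.d/apache-tomcat
```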

8

Is the database tuned?

Check the innodb_buffer_pool_size and innodb_buffer_pool_instances settings in /etc/my.cnf, as described in the Tuning section above. Ensure they are set appropriately and restart MySQL if you make changes.
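The buffer pool settings from step 8 can be listed with a sketch like the one below. MySQL option files accept dashes and underscores interchangeably in option names, so the pattern allows both; the function name is hypothetical.

```shell
# Hypothetical helper: print the active (non-commented) InnoDB buffer pool
# size and instance settings from a MySQL option file.
show_buffer_pool_settings() {
    cnf=${1:-/etc/my.cnf}
    grep -E '^[[:space:]]*innodb[-_]buffer[-_]pool[-_](size|instances)[[:space:]]*=' "$cnf"
}

# Example usage:
#   show_buffer_pool_settings /etc/my.cnf
```

If the function prints nothing, the settings are not defined in that file and MySQL is likely running with its defaults or with values from another included file.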

9

Check the server for any other processes with high CPU or memory usage, or anything else that might be impacting the database.

Something may be hogging CPU/memory on the server and starving Tomcat of resources.

The Events Analyser utility may be running or a sudden burst of Moogfarmd or Graze activity may be putting pressure on the database and affecting the UI.

10

Run DBPool Diagnostics (see previous section) several times to assess current state of Tomcat > Database connections.

Tomcat database connections may be maxed out with long-running connections, which may indicate a processing deadlock. Run kill -3 <pid> against the Tomcat java process to generate a thread dump (written to catalina.out) and send it to Moogsoft AIOps Support.

Alternatively, Tomcat may be very busy with lots of short but frequent connections to the database. A Graze request bombardment is another possibility (Graze does not currently have a separate DB Pool). Consider increasing the number of DBPool connections for Tomcat by increasing the related properties in servlets.conf and restarting the apache-tomcat service.
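Before sending the thread dump from step 10 to Support, it can be useful to get a rough picture of what the threads are doing. The sketch below assumes the standard HotSpot thread dump format, where each thread has a "java.lang.Thread.State:" line and detected deadlocks are announced with a "Java-level deadlock" message; the function names are hypothetical.

```shell
# Hypothetical helper: count threads per state in a file containing a
# HotSpot thread dump (such as catalina.out after kill -3).
thread_state_summary() {
    # Dumps contain lines like "   java.lang.Thread.State: BLOCKED (on object monitor)".
    grep -oE 'java\.lang\.Thread\.State: [A-Z_]+' "$1" |
        awk '{ n[$2]++ } END { for (s in n) print s, n[s] }' |
        sort
}

# Succeeds (exit 0) when the JVM itself reported a deadlock in the dump.
dump_reports_deadlock() {
    grep -q 'Java-level deadlock' "$1"
}

# Example usage:
#   thread_state_summary /usr/share/apache-tomcat/logs/catalina.out
#   dump_reports_deadlock /usr/share/apache-tomcat/logs/catalina.out && echo "deadlock reported"
```

If catalina.out contains several dumps, the counts are totals across all of them. A large and growing number of BLOCKED threads across successive dumps is consistent with the deadlock scenario described above.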

11

Turn on MySQL slow query logging (see the earlier section on how to do this).

Slow queries from complex filters in the UI may be causing problems; review these filters for efficiency.

Alternatively slow queries from other parts of the system may be causing problems (e.g. inefficient Moobot code).

Slow queries may also be down to the sheer amount of data in the system. Consider enabling Database Split to move old data and/or using the Archiver to remove old data.
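Once slow query logging is on, a quick way to gauge the scale of the problem is to count the logged queries that exceed a threshold. This sketch assumes the standard MySQL slow log format, in which each entry has a "# Query_time: <seconds>" header line; the function name is hypothetical and the log path depends on your MySQL configuration.

```shell
# Hypothetical helper: count slow-log entries whose Query_time is at or
# above a threshold in seconds (default 30, the browser timeout from step 2).
count_slow_queries() {
    slow_log=$1
    threshold=${2:-30}
    # In "# Query_time: 35.000123  Lock_time: ..." the time is the third field.
    awk -v t="$threshold" '/^# Query_time:/ && $3 + 0 >= t + 0 { n++ } END { print n + 0 }' "$slow_log"
}

# Example usage (replace the path with your configured slow query log):
#   count_slow_queries /path/to/slow-query.log 30
```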

12

Is Tomcat memory constantly growing over time, suggesting a memory leak?

Note that Tomcat memory does typically increase for periods of time then is trimmed back via java garbage collection.

Take periodic heap dumps from the Tomcat java process and send them to Moogsoft support so they can analyse the growth. Use the following commands:

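# Use a timestamped file name so that repeated dumps do not overwrite each other.
DUMPFILE=/tmp/tomcat-heapdump-$(date +%s).bin
# Dump the heap of the Tomcat java process, running jmap as the tomcat user.
sudo -u tomcat jmap -dump:format=b,file="$DUMPFILE" $(ps -ef|grep java|grep tomcat|awk '{print $2}')
# Compress the dump before sending it.
bzip2 "$DUMPFILE"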

Notes:

  • jmap requires the Java JDK to be installed. "yum install jdk" should suffice to install it.

  • Generating a heap dump is likely to make the target process very busy for a period of time. It also triggers a garbage collection, so the memory usage of the process may well drop.

  • Heap dump files may be very large.

User Interface (UI) issues
Unavailable UI Login Page
  • Check that port 443 is not being blocked by the firewall on the server.

  • Check that the Nginx service is running with command:

    service nginx status
  • Check that Nginx is listening on port 443. Example expected output:

    netstat -anp|grep 443
    tcp        0      0 0.0.0.0:443                 0.0.0.0:*                   LISTEN      42356/nginx         
    tcp        0      0 :::443                      :::*                        LISTEN      42356/nginx 
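Note that a plain grep for 443 also matches other ports or process IDs that merely contain those digits (for example port 4430). As a stricter alternative, the following sketch checks the netstat columns directly; the function name is hypothetical and the column positions assume the Linux netstat -anp layout shown above.

```shell
# Hypothetical helper: read "netstat -anp" style output on stdin and succeed
# only if some socket is in LISTEN state on exactly the given port.
listening_on_port() {
    awk -v port="$1" '
        $6 == "LISTEN" && $4 ~ (":" port "$") { found = 1 }
        END { exit !found }'
}

# Example usage:
#   netstat -anp | listening_on_port 443 && echo "port 443 is listening"
```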
Login fails with "You could not be logged in. Please try again."
Apache-tomcat service not running
  • Check the apache-tomcat service is running:

    service apache-tomcat status
Communication problem between the UI and MySQL database
  • Check the MySQL service is running:

    service mysqld status
  • If MySQL is running on a different server, check that it is accessible from the Moogsoft AIOps web server and the required permissions have been applied.

Authentication problem between the UI and MySQL database
  • Check that the user exists in the MySQL moogdb.users table.

  • Check that the username and password used for authentication are correct.

Search/Elasticsearch

See Configure Search and Indexing for more information.

Elasticsearch not running or generating errors (such as MySQL connection problems)
  • Check that the Elasticsearch service is running:

    service elasticsearch status
  • Any errors are written to /var/log/elasticsearch/elasticsearch.log

Tomcat cannot connect to Elasticsearch
  • Check /usr/share/apache-tomcat/logs/catalina.out for any errors when attempting a search from the UI.

Cron job errors
  • Check that the cron job that runs moog_indexer (created by the moog_init_search.sh script to re-index against the Moogsoft AIOps database once a minute) exists and is not generating any warnings or errors.

  • List the configured cron jobs:

    crontab -l
  • Errors are written to /var/log/cron

  • Depending on the intervals at which Elasticsearch re-indexes against the Moogsoft AIOps database, it is possible that new alerts, Situations, threads or comments have not yet been indexed, and so will not be searchable.

  • To change the interval manually:

    crontab -e
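To confirm quickly that the indexer job is present and not commented out, the crontab listing can be filtered as in this sketch. The function name is hypothetical, and the check only looks for the string moog_indexer, since the exact command line written by moog_init_search.sh is not shown here.

```shell
# Hypothetical helper: read "crontab -l" output on stdin and print the active
# (non-commented) entries that mention moog_indexer.
indexer_cron_entries() {
    grep -v '^[[:space:]]*#' | grep 'moog_indexer'
}

# Example usage:
#   crontab -l | indexer_cron_entries || echo "no active moog_indexer cron job found"
```

Run the listing as the user that owns the cron job; each user has a separate crontab.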
Elasticsearch fails to start with /tmp directory permission problems

Elasticsearch fails to start with a "java.lang.UnsatisfiedLinkError: /tmp/jna--<blah>" error. For example:

[2017-08-07T14:14:31,173][WARN ][o.e.b.Natives] unable to load JNA native support library, native methods will be disabled.
java.lang.UnsatisfiedLinkError: /tmp/jna--1985354563/jna3872404023206022895.tmp: /tmp/jna--1985354563/jna3872404023206022895.tmp: failed to map segment from shared object: Operation not permitted
   at java.lang.ClassLoader$NativeLibrary.load(Native Method) ~[?:1.8.0_171]
   at java.lang.ClassLoader.loadLibrary0(ClassLoader.java:1941) ~[?:1.8.0_171]
   at java.lang.ClassLoader.loadLibrary(ClassLoader.java:1824) ~[?:1.8.0_171]
   at java.lang.Runtime.load0(Runtime.java:809) ~[?:1.8.0_171]
   at java.lang.System.load(System.java:1086) ~[?:1.8.0_171]
   at com.sun.jna.Native.loadNativeDispatchLibraryFromClasspath(Native.java:851) ~[jna-4.2.2.jar:4.2.2 (b0)]
   at com.sun.jna.Native.loadNativeDispatchLibrary(Native.java:826) ~[jna-4.2.2.jar:4.2.2 (b0)]
   at com.sun.jna.Native.<clinit>(Native.java:140) ~[jna-4.2.2.jar:4.2.2 (b0)]
   at java.lang.Class.forName0(Native Method) ~[?:1.8.0_171]
   at java.lang.Class.forName(Class.java:264) ~[?:1.8.0_171]
   at org.elasticsearch.bootstrap.Natives.<clinit>(Natives.java:45) [elasticsearch-5.6.9.jar:5.6.9]
   at org.elasticsearch.bootstrap.Bootstrap.initializeNatives(Bootstrap.java:104) [elasticsearch-5.6.9.jar:5.6.9]
   at org.elasticsearch.bootstrap.Bootstrap.setup(Bootstrap.java:203) [elasticsearch-5.6.9.jar:5.6.9]
   at org.elasticsearch.bootstrap.Bootstrap.init(Bootstrap.java:333) [elasticsearch-5.6.9.jar:5.6.9]
   at org.elasticsearch.bootstrap.Elasticsearch.init(Elasticsearch.java:121) [elasticsearch-5.6.9.jar:5.6.9]
   at org.elasticsearch.bootstrap.Elasticsearch.execute(Elasticsearch.java:112) [elasticsearch-5.6.9.jar:5.6.9]
   at org.elasticsearch.cli.SettingCommand.execute(SettingCommand.java:54) [elasticsearch-5.6.9.jar:5.6.9]
   at org.elasticsearch.cli.Command.mainWithoutErrorHandling(Command.java:122) [elasticsearch-5.6.9.jar:5.6.9]
   at org.elasticsearch.cli.Command.main(Command.java:88) [elasticsearch-5.6.9.jar:5.6.9]
   at org.elasticsearch.bootstrap.Elasticsearch.main(Elasticsearch.java:89) [elasticsearch-5.6.9.jar:5.6.9]
   at org.elasticsearch.bootstrap.Elasticsearch.main(Elasticsearch.java:82) [elasticsearch-5.6.9.jar:5.6.9]

This is most likely due to the noexec directive in the /tmp mount. The solution is to remove the noexec directive, if it is practical to do so:

sudo mount /tmp -o remount,exec

Or set the following in /etc/sysconfig/elasticsearch:

ES_JAVA_OPTS="-Djna.tmpdir=/var/lib/elasticsearch/tmp"

Restart the Elasticsearch service after either of the above changes.
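To confirm that noexec is the cause before changing anything, the mount options for /tmp can be inspected. The following is a minimal sketch for Linux that reads a /proc/mounts-style file; the function name is hypothetical.

```shell
# Hypothetical helper: succeed if the given mount point is mounted with the
# noexec option, according to a /proc/mounts-style file (Linux only).
mounted_noexec() {
    mount_point=$1
    mounts_file=${2:-/proc/mounts}
    awk -v mp="$mount_point" '
        $2 == mp { opts = $4 }   # keep the options of the last matching mount
        END { exit !(("," opts ",") ~ /,noexec,/) }' "$mounts_file"
}

# Example usage:
#   mounted_noexec /tmp && echo "/tmp is mounted noexec"
```

If /tmp is not a separate mount, it has no entry of its own and the check fails; in that case, inspect the mount options of the parent filesystem instead.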