Troubleshoot the UI
The following sections outline potential problems and solutions related to the Moogsoft Onprem user interface.
Slow UI
If the UI is slow, for example long login times or summary counters that keep spinning, the problem is likely with Apache Tomcat, the database, or both. The following diagnostic steps will help you track down the cause:
Step | Description | Possible cause and resolution |
---|---|---|
1 | Check catalina.out for any obvious errors or warnings. | The cause may be evident from any warnings or errors. |
2 | Check the browser console for any errors or timed-out requests. | Possibly a bug or, more likely, the database query associated with the request is taking longer than 30 seconds (the default browser timeout). Investigate the root cause. |
3 | Check network latency between the browser client machine and the server using ping. | Latency of 100 ms or more can make login noticeably slower. |
4 | Check the CPU/memory usage of the server itself. | If the server as a whole is running close to its CPU or memory limit and no other issues can be found (e.g. rogue processes or memory leaks in the Moogsoft Onprem components), consider adding more resources to the server or distributing the Moogsoft Onprem components. |
5 | Check MoogSvr/Moogpoller/Graze counter logging in catalina.out. | Tomcat may be processing a high number of requests or Message Bus updates. If the Moogpoller count is zero, something may be wrong with the Tomcat > RabbitMQ connection. Check the RabbitMQ admin UI for signs of message queue build-up. |
6 | Check whether the Tomcat java process is showing constant high CPU/memory usage. | Tomcat may be processing the updates from an event or situation storm. The backlog should clear once the storm subsides. |
7 | Has the memory of the Tomcat java process reached a plateau? | Tomcat may have reached its Java heap limit. Check the -Xmx setting, increase it as appropriate, and restart the apache-tomcat service. |
8 | Is the database tuned? | Check the innodb-buffer-pool-size and innodb_buffer_pool_instances settings in |
9 | Check the server for any other processes with high CPU or memory usage, or anything else that might be impacting the database. | Something may be hogging CPU/memory on the server and starving Tomcat of resources. The Events Analyser utility may be running, or a sudden burst of Moogfarmd or Graze activity may be putting pressure on the database and affecting the UI. |
10 | Run DBPool Diagnostics (see previous section) several times to assess the current state of Tomcat > Database connections. | Tomcat database connections may be maxed out with long-running connections, which may indicate a processing deadlock. Alternatively, Tomcat may be very busy with lots of short but frequent connections to the database. A Graze request bombardment is another possibility (Graze does not currently have a separate DB Pool). Consider increasing the number of DBPool connections for Tomcat by increasing the related properties in servlets.conf and restarting the apache-tomcat service. |
11 | Turn on MySQL slow query logging (see earlier section on how to do this). | Slow queries from inefficient filters in the UI may be causing problems; review them for efficiency. Alternatively, slow queries from other parts of the system may be causing problems (e.g. inefficient Moobot code). Slow queries may also be down to the sheer amount of data in the system. Consider enabling Database Split to move old data and/or using the Archiver to remove old data. |
12 | Is Tomcat memory constantly growing over time, and is a memory leak suspected? Note that Tomcat memory typically increases for periods of time and is then trimmed back by Java garbage collection. | Take periodic heap dumps from the Tomcat java process and send them to Moogsoft support so they can analyse the growth. Use the heap dump commands listed after this table. |

Heap dump commands for step 12:
DUMPFILE=/tmp/tomcat-heapdump-$(date +%s).bin
sudo -u tomcat jmap -dump:format=b,file=$DUMPFILE $(ps -ef|grep java|grep tomcat|awk '{print $2}')
bzip2 $DUMPFILE
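The latency check in step 3 can be scripted as a rough first pass. The sketch below assumes the common `ping` summary line format (`rtt min/avg/max/mdev = ...` on Linux, `round-trip min/avg/max/stddev = ...` on macOS). The function names `avg_rtt_ms` and `latency_verdict` are illustrative and are not part of Moogsoft Onprem.

```shell
# Extract the average round-trip time (ms) from ping's summary line on stdin.
# Assumes a summary such as: rtt min/avg/max/mdev = 0.045/0.052/0.061/0.006 ms
avg_rtt_ms() {
    awk -F'/' '/^(rtt|round-trip) / { print $5 }'
}

# Print "slow" when the average latency is at or above the limit
# (default 100 ms, the figure quoted in step 3), otherwise "ok".
# Prints "unknown" when no average could be parsed.
latency_verdict() {
    avg="$1"
    limit="${2:-100}"
    [ -n "$avg" ] || { echo "unknown"; return 1; }
    awk -v a="$avg" -v l="$limit" 'BEGIN { if (a + 0 >= l + 0) print "slow"; else print "ok" }'
}

# Usage (server.example.com is a placeholder hostname):
#   latency_verdict "$(ping -c 5 server.example.com | avg_rtt_ms)"
```

Run this from the browser client machine rather than from the server, since the table refers to latency between the client and the server.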
Search / Opensearch problems
See Configure Search and Indexing for more information.
Check that the Opensearch service is running with one of the following commands:
service opensearch status
process_cntl
Check for any errors. They are written to /var/log/opensearch/<clustername>.log
Check that the Opensearch heap size in /etc/opensearch/jvm.options is large enough. See Configure Opensearch heap size in RPM Installation for more information on setting the JVM heap sizes.
Check /usr/share/apache-tomcat/logs/catalina.out for any errors when attempting a search from the UI.
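A quick way to scan a log such as catalina.out is to filter it by severity. This is a minimal sketch that assumes severity shows up in each line as a literal `ERROR`, `WARN`/`WARNING`, or `SEVERE` token. The actual log layout may differ, and the function names are hypothetical helpers.

```shell
# Count lines in a log file that look like errors or warnings.
# Assumption: severity appears as a literal ERROR, WARN(ING), or SEVERE token.
count_log_problems() {
    grep -cE 'ERROR|WARN|SEVERE' "$1"
}

# Show the most recent matching lines (default: the last 20).
recent_log_problems() {
    grep -E 'ERROR|WARN|SEVERE' "$1" | tail -n "${2:-20}"
}

# Usage:
#   count_log_problems /usr/share/apache-tomcat/logs/catalina.out
#   recent_log_problems /usr/share/apache-tomcat/logs/catalina.out 50
```

Running a search in the UI and then calling `recent_log_problems` straight away makes it easier to tie a log entry to the failed search.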
Check that the cron job that runs the moog_indexer (created by the moog_init_search.sh script to re-index against the Moogsoft Onprem database once a minute) exists and is not generating any warnings or errors.
List the configured cron jobs:
crontab -l
Check for errors. They are written to /var/log/cron.
To change the interval manually:
crontab -e
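The cron check above can be reduced to a one-line test. This sketch only assumes that the crontab entry mentions `moog_indexer` somewhere on the line. The exact command line written by moog_init_search.sh is not reproduced here, and `has_indexer_job` is a hypothetical helper name.

```shell
# Succeeds when a crontab listing on stdin has an active (uncommented)
# entry that mentions moog_indexer.
has_indexer_job() {
    grep -v '^[[:space:]]*#' | grep -q 'moog_indexer'
}

# Usage:
#   crontab -l | has_indexer_job && echo "indexer job present" || echo "indexer job missing"
```

Run it as the same user whose crontab the installation script modified, since `crontab -l` only lists the current user's jobs.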
Opensearch fails to start with "java.lang.UnsatisfiedLinkError: /tmp/jna--<text>" error. For example:
[2017-08-07T14:14:31,173][WARN ][o.o.b.Natives ] [master1] unable to load JNA native support library, native methods will be disabled.
java.lang.UnsatisfiedLinkError: /tmp/opensearch-1073614746573548365/jna9916496385695771284.tmp: /tmp/opensearch-1073614746573548365/jna9916496385695771284.tmp: failed to map segment from shared object
    at jdk.internal.loader.NativeLibraries.load(Native Method) ~[?:?]
    at jdk.internal.loader.NativeLibraries$NativeLibraryImpl.open(NativeLibraries.java:383) ~[?:?]
    at jdk.internal.loader.NativeLibraries.loadLibrary(NativeLibraries.java:227) ~[?:?]
    at jdk.internal.loader.NativeLibraries.loadLibrary(NativeLibraries.java:169) ~[?:?]
    at java.lang.ClassLoader.loadLibrary(ClassLoader.java:2407) ~[?:?]
    at java.lang.Runtime.load0(Runtime.java:747) ~[?:?]
    at java.lang.System.load(System.java:1857) ~[?:?]
    at com.sun.jna.Native.loadNativeDispatchLibraryFromClasspath(Native.java:1018) ~[jna-5.5.0.jar:5.5.0 (b0)]
    at com.sun.jna.Native.loadNativeDispatchLibrary(Native.java:988) ~[jna-5.5.0.jar:5.5.0 (b0)]
    at com.sun.jna.Native.<clinit>(Native.java:195) ~[jna-5.5.0.jar:5.5.0 (b0)]
    at java.lang.Class.forName0(Native Method) ~[?:?]
    at java.lang.Class.forName(Class.java:377) ~[?:?]
    at org.opensearch.bootstrap.Natives.<clinit>(Natives.java:58) [opensearch-1.3.6.jar:1.3.6]
    at org.opensearch.bootstrap.Bootstrap.initializeNatives(Bootstrap.java:123) [opensearch-1.3.6.jar:1.3.6]
    at org.opensearch.bootstrap.Bootstrap.setup(Bootstrap.java:191) [opensearch-1.3.6.jar:1.3.6]
    at org.opensearch.bootstrap.Bootstrap.init(Bootstrap.java:412) [opensearch-1.3.6.jar:1.3.6]
    at org.opensearch.bootstrap.OpenSearch.init(OpenSearch.java:178) [opensearch-1.3.6.jar:1.3.6]
    at org.opensearch.bootstrap.OpenSearch.execute(OpenSearch.java:169) [opensearch-1.3.6.jar:1.3.6]
    at org.opensearch.cli.EnvironmentAwareCommand.execute(EnvironmentAwareCommand.java:100) [opensearch-1.3.6.jar:1.3.6]
    at org.opensearch.cli.Command.mainWithoutErrorHandling(Command.java:138) [opensearch-cli-1.3.6.jar:1.3.6]
    at org.opensearch.cli.Command.main(Command.java:101) [opensearch-cli-1.3.6.jar:1.3.6]
    at org.opensearch.bootstrap.OpenSearch.main(OpenSearch.java:135) [opensearch-1.3.6.jar:1.3.6]
    at org.opensearch.bootstrap.OpenSearch.main(OpenSearch.java:101) [opensearch-1.3.6.jar:1.3.6]
This is most likely due to the noexec directive in the /tmp mount. The solution is to remove the noexec directive, if it is practical to do so:
sudo mount /tmp -o remount,exec
Alternatively, set the following in /etc/sysconfig/opensearch as appropriate:
ES_JAVA_OPTS="-Djna.tmpdir=/var/lib/opensearch/tmp"
Restart the Opensearch service after either of the above changes.
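Before changing anything, it can help to confirm that /tmp really is mounted with noexec. The sketch below reads mount-table lines in the /proc/mounts layout (device, mount point, type, options) and checks the options field. `is_noexec` is a hypothetical helper name.

```shell
# Read mount-table lines (the /proc/mounts format) on stdin and succeed
# when the given mount point carries the noexec option.
is_noexec() {
    awk -v mp="$1" '$2 == mp && $4 ~ /(^|,)noexec(,|$)/ { found = 1 } END { exit !found }'
}

# Usage:
#   is_noexec /tmp < /proc/mounts && echo "/tmp is mounted noexec"
```

If /tmp is not a separate mount, the function finds no matching line and reports failure. In that case /tmp inherits its options from the parent filesystem, which would need to be checked instead.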
Other UI issues
Check that port 443 is not being blocked by the firewall on the server.
Check that the Nginx service is running with the following command:
service nginx status
Check that Nginx is listening on port 443. Example command and expected output:
netstat -anp|grep 443
tcp 0 0 0.0.0.0:443 0.0.0.0:* LISTEN 42356/nginx
tcp 0 0 :::443 :::* LISTEN 42356/nginx
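A plain `grep 443` also matches other ports and PIDs that happen to contain "443". As a stricter sketch, the function below checks the `netstat -anp` columns directly, assuming the column layout shown in the example output above. `nginx_listening_443` is a hypothetical helper name.

```shell
# Read `netstat -anp` output on stdin and succeed when an nginx process
# has a TCP socket in LISTEN state on exactly port 443.
nginx_listening_443() {
    awk '$1 ~ /^tcp/ && $4 ~ /:443$/ && $6 == "LISTEN" && $7 ~ /nginx/ { found = 1 } END { exit !found }'
}

# Usage (run as root so that netstat can show the owning process name):
#   netstat -anp | nginx_listening_443 && echo "nginx is listening on 443"
```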
A possible error is "You could not be logged in. Please try again". First check /usr/share/apache-tomcat/logs/catalina.out to understand the error better. Possible causes are as follows:
Apache Tomcat is not running. Check its status:
service apache-tomcat status
There is a communication problem between the UI and the MySQL database. Check the MySQL service is running:
service mysqld status
If MySQL is running on a different server, check that it is accessible from the Moogsoft Onprem web server and the required permissions have been applied.
There is an authentication problem between the UI and the MySQL database.
Check that the user exists in the MySQL moogdb.users table.
Check that the username and password used for authentication are correct.
If you're using a load balancer, the hostname in the URL you're using to access the UI does not match the webhost in the servlet configuration files.
Set the "webhost" in $MOOGSOFT_HOME/config/servlets.conf on each UI server to the hostname of the load balancer.
The message "your connection is not private" appears in your browser and you are unable to proceed to the UI.
In macOS Catalina or later, the Moogsoft Onprem UI is inaccessible in Chrome, Safari and Edge browsers because self-signed certificates are no longer trusted. For workaround instructions see Catalina Browser Certificate Workaround.
If you are getting an empty column in your alert views for alerts from a particular event source, verify the following:
The events are being processed by the LAM.
Moogfarmd is running.
After that, check the LAM configuration file for configuration and mapping issues.