Percona Cluster 8.0 Tarball Minor Version Upgrade

Use the following instructions to upgrade a tarball-based Percona XtraDB Cluster installation to v8.0.35. The upgrade avoids issues in earlier versions that are known to affect Moogsoft Onprem.

System Setup

This example uses the following sample server configuration for Moogsoft Onprem and the Percona Cluster:

  • Server1 - Moogsoft Onprem 9.x.x.x

  • Server2 - Percona Cluster Node1

  • Server3 - Percona Cluster Node2

  • Server4 - Percona Cluster Node3

Assumptions

This procedure applies to a tarball-based Percona installation. The user who originally performed the Percona installation should also perform the upgrade.

Definitions

  • IST - Incremental State Transfer. Instead of receiving a whole state snapshot, a node catches up with the group by receiving only the writesets it is missing. This is possible only if those writesets are still in the donor’s writeset cache.

  • SST - State Snapshot Transfer. A full copy of the data from one node to another. It is used when a new node joins the cluster and has to receive its data from an existing node.
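Because IST depends on the donor's writeset cache, it can help to know how large that cache is before starting. The following is a minimal sketch, assuming the cache size is exposed as the gcache.size entry of the wsrep_provider_options variable; the function name gcache_size is hypothetical and only parses a string that you pass to it.

```shell
#!/usr/bin/env bash
# Sketch: extract the gcache.size entry from a wsrep_provider_options string.
# gcache_size is a hypothetical helper name, not part of Percona or Moogsoft.

gcache_size() {
    # Split the "key = value; key = value" string on ';', then print the
    # value of the first gcache.size entry (nothing if it is absent).
    printf '%s\n' "$1" \
        | tr ';' '\n' \
        | sed -n 's/^[[:space:]]*gcache\.size[[:space:]]*=[[:space:]]*//p' \
        | head -n 1
}

# Usage (assumes the mysql client can log in on the database node):
#   gcache_size "$(mysql -N -B -e 'SELECT @@wsrep_provider_options')"
```

A small cache relative to the write volume during the maintenance window makes an SST more likely, which is one more reason to stop the Moogsoft services first.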

Upgrade Steps

Step 1 - Stop all Moog services that connect to the database

$MOOGSOFT_HOME/bin/utils/process_cntl rest_lam stop
$MOOGSOFT_HOME/bin/utils/process_cntl socket_lam stop
$MOOGSOFT_HOME/bin/utils/process_cntl moog_farmd stop
$MOOGSOFT_HOME/bin/utils/process_cntl apache-tomcat stop

Note

While it is technically possible to perform the cluster upgrade when the system is running and processing events, doing so greatly increases the chances of issues and/or SST occurrence later in the upgrade.
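The stop commands in Step 1 can be wrapped in a small loop so that a failure is reported instead of being missed. This is only a sketch: the PROCESS_CNTL variable and the stop_services function are hypothetical names, and the sketch assumes that process_cntl returns a non-zero exit code when a stop fails.

```shell
#!/usr/bin/env bash
# Sketch: stop the Moogsoft services from Step 1 one at a time.
# PROCESS_CNTL is a hypothetical override; by default it points at the
# process_cntl utility used in the commands above.
PROCESS_CNTL="${PROCESS_CNTL:-$MOOGSOFT_HOME/bin/utils/process_cntl}"

stop_services() {
    local svc rc=0
    for svc in "$@"; do
        echo "Stopping ${svc}"
        # Assumption: a non-zero exit code means the stop did not succeed.
        "$PROCESS_CNTL" "$svc" stop || { echo "Failed to stop ${svc}" >&2; rc=1; }
    done
    return "$rc"
}

# Usage:
#   stop_services rest_lam socket_lam moog_farmd apache-tomcat
```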

Step 2 - Make sure the three database nodes are synced

Perform the following check from any location:

[moogsoft@server1]# curl http://server2:9198
Percona XtraDB Cluster Node is synced.
[moogsoft@server1]# curl http://server3:9198
Percona XtraDB Cluster Node is synced.
[moogsoft@server1]# curl http://server4:9198
Percona XtraDB Cluster Node is synced.
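The three checks above can be combined into one helper that fails if any node is not synced. This is a minimal sketch, assuming the host names and port 9198 from the sample setup; is_synced and check_cluster are hypothetical function names.

```shell
#!/usr/bin/env bash
# Sketch: report whether every node's health endpoint says "synced".

# Returns 0 only if the response text is the "synced" message.
is_synced() {
    case "$1" in
        *"Percona XtraDB Cluster Node is synced."*) return 0 ;;
        *) return 1 ;;
    esac
}

# Checks each host passed as an argument and prints one line per host.
# Returns non-zero if any host is unreachable or not synced.
check_cluster() {
    local rc=0 host reply
    for host in "$@"; do
        reply=$(curl -s --max-time 10 "http://${host}:9198") || reply=""
        if is_synced "$reply"; then
            echo "${host}: synced"
        else
            echo "${host}: NOT synced"
            rc=1
        fi
    done
    return "$rc"
}

# Usage:
#   check_cluster server2 server3 server4
```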

Step 3 - Perform the actions on Server2

  1. Stop the MySQL node:

    • If this is a database-only node (with no Moogsoft Onprem components installed), OR the $MOOGSOFT_HOME/bin/utils/process_cntl mysql status command does not report ‘running’, run the following commands:

      MYSQL_PID=$(ps -ef | grep mysql | egrep -v 'grep|mysqld_safe' | awk '{print $2}')
      kill -s TERM $MYSQL_PID
    • If this database is on a node with Moogsoft Onprem installed AND the $MOOGSOFT_HOME/bin/utils/process_cntl mysql status command does report ‘running’, run the following commands:

      $MOOGSOFT_HOME/bin/utils/process_cntl mysql stop
  2. Download the Percona files and copy them into the same folder where the existing Percona installation is located (default is ~/install):

    cd ~/install
    curl -L -O https://downloads.percona.com/downloads/Percona-XtraDB-Cluster-80/Percona-XtraDB-Cluster-8.0.35/binary/tarball/Percona-XtraDB-Cluster_8.0.35-27.1_Linux.x86_64.glibc2.17.tar.gz
    curl -L -O https://downloads.percona.com/downloads/Percona-XtraBackup-8.0/Percona-XtraBackup-8.0.35-30/binary/tarball/percona-xtrabackup-8.0.35-30-Linux-x86_64.glibc2.17.tar.gz
  3. Extract the files into the same folder where the existing Percona installation is located (default is ~/install):

    tar -xf Percona-XtraDB-Cluster_*.tar.gz;
    tar -xf percona-xtrabackup-*.tar.gz;
  4. Update the PATH variable in the ~/.bashrc file to reference the newer Percona packages:

    sed -i 's;Percona-XtraDB-Cluster[-_][^/]\+;Percona-XtraDB-Cluster_8.0.35-27.1_Linux.x86_64.glibc2.17;g' ~/.bashrc
    sed -i 's/percona-xtrabackup-[^/]\+/percona-xtrabackup-8.0.35-30-Linux-x86_64.glibc2.17/g' ~/.bashrc
    source ~/.bashrc;
  5. Back up and update the .my.cnf file:

    cp ~/.my.cnf ~/.my.cnf$(date +%s);
    MYSQL_HOME=$(dirname $(dirname $(which mysqld)))
    sed -i "s;basedir.*;basedir = ${MYSQL_HOME};" ~/.my.cnf
    sed -i "s;wsrep_provider\s*=.*;wsrep_provider = ${MYSQL_HOME}/lib/libgalera_smm.so;" ~/.my.cnf
  6. Back up the grastate.dat file as a safety measure:

    LINE_NUM=$(cat -un ~/.my.cnf | grep '\[mysqld\]' | awk '{ print $1 }')
    DATADIR=$(awk "NR>${LINE_NUM}" ~/.my.cnf | egrep '^\s*datadir' | head -1 | awk -F'=' '{ print $2 }' | tr -d '\t ')
    cp ${DATADIR}/grastate.dat ${DATADIR}/grastate.dat.bak
  7. Add the following property to the [mysqld_safe] section of the Percona Node's configuration file ~/.my.cnf to have a generous limit on SST duration:

    service-startup-timeout = 19520

    e.g.:

    [mysqld_safe]
    log-error = <INSTALL_DIR>/var/log/mysqld.log
    pid-file = <INSTALL_DIR>/var/run/mysqld/mysqld.pid
    socket = <INSTALL_DIR>/var/run/mysqld/mysqld.sock
    nice = 0
    service-startup-timeout = 19520
  8. Finally, start the node normally:

    • If this is a database-only node (no Moogsoft Onprem components installed), run the following command:

      (mysqld_safe > /dev/null 2>/dev/null)&
    • If this is a node with Moogsoft Onprem components installed, run the following command:

      $MOOGSOFT_HOME/bin/utils/process_cntl mysql start

    The node will either sync immediately or spend a short time performing an IST from another node. See the following example from mysqld.log:

    2020-08-20T11:12:45.559572-00:00 0 [Note] WSREP: Receiving IST... 13.0% ( 96/739 events) complete.
    2020-08-20T11:12:56.030910-00:00 0 [Note] WSREP: Receiving IST... 23.8% (176/739 events) complete.
    2020-08-20T11:13:06.922647-00:00 0 [Note] WSREP: Receiving IST... 34.6% (256/739 events) complete.
    2020-08-20T11:13:18.481076-00:00 0 [Note] WSREP: Receiving IST... 45.5% (336/739 events) complete.
    2020-08-20T11:13:30.433314-00:00 0 [Note] WSREP: Receiving IST... 56.3% (416/739 events) complete.
    2020-08-20T11:13:42.463826-00:00 0 [Note] WSREP: Receiving IST... 67.1% (496/739 events) complete.
    2020-08-20T11:13:52.673644-00:00 0 [Note] WSREP: Receiving IST... 75.8% (560/739 events) complete.
    2020-08-20T11:14:02.812737-00:00 0 [Note] WSREP: Receiving IST... 84.4% (624/739 events) complete.
    2020-08-20T11:14:06.210387-00:00 0 [Note] WSREP: Receiving IST...100.0% (739/739 events) complete.
    2020-08-20T11:14:06.211793-00:00 2 [Note] WSREP: IST received: beebd113-e080-11ea-8803-6f888df3319a:21784800
    2020-08-20T11:14:06.212523-00:00 0 [Note] WSREP: 0.0 (pxc-node-ldev03): State transfer from 2.0 (pxc-node-ldev04) complete.
    2020-08-20T11:14:06.212564-00:00 0 [Note] WSREP: SST leaving flow control
    2020-08-20T11:14:06.212574-00:00 0 [Note] WSREP: Shifting JOINER -> JOINED (TO: 21784800)
    2020-08-20T11:14:06.213148-00:00 0 [Note] WSREP: Member 0.0 (pxc-node-ldev03) synced with group.
    2020-08-20T11:14:06.213174-00:00 0 [Note] WSREP: Shifting JOINED -> SYNCED (TO: 21784800)
    2020-08-20T11:14:06.213211-00:00 2 [Note] WSREP: Synchronized with group, ready for connections
    2020-08-20T11:14:06.213234-00:00 2 [Note] WSREP: Setting wsrep_ready to true
    2020-08-20T11:14:06.213254-00:00 2 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.
    2020-08-20T11:14:44.973932-00:00 0 [Note] InnoDB: Buffer pool(s) load completed at 200820 11:14:44
  9. Verify this node has synced before proceeding to the next node:

    [root@server1]# curl http://server2:9198
    Percona XtraDB Cluster Node is synced.
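Because an IST can take a while, the final verification in Step 3 can be turned into a polling loop that waits for the node before you move on to the next one. This is a sketch under stated assumptions: wait_for_sync is a hypothetical function name, and the default timeout of 600 seconds and interval of 10 seconds are arbitrary values, not recommendations from Percona or Moogsoft.

```shell
#!/usr/bin/env bash
# Sketch: poll a node's health endpoint until it reports "synced"
# or until a timeout expires.

wait_for_sync() {
    local host=$1 timeout=${2:-600} interval=${3:-10} waited=0 reply
    while [ "$waited" -lt "$timeout" ]; do
        # Port 9198 and the message text are taken from the examples above.
        reply=$(curl -s --max-time 10 "http://${host}:9198") || reply=""
        case "$reply" in
            *"Percona XtraDB Cluster Node is synced."*)
                echo "${host} synced after ${waited}s"
                return 0 ;;
        esac
        sleep "$interval"
        waited=$((waited + interval))
    done
    echo "${host} not synced after ${timeout}s" >&2
    return 1
}

# Usage:
#   wait_for_sync server2 && echo "Safe to continue with the next node"
```

If the function times out, check mysqld.log on the node before continuing, since a long-running SST may still be in progress.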

Step 4 - Perform the actions on Server3

Repeat the same actions on Server3 as you performed on Server2 in Step 3.

Step 5 - Perform the actions on Server4

Repeat the same actions on Server4 as you performed on Server2 in Step 3.

Step 6 - Final Cluster Check

Check that all three database nodes are synced:

[root@server1]# curl http://server2:9198
Percona XtraDB Cluster Node is synced.
[root@server1]# curl http://server3:9198
Percona XtraDB Cluster Node is synced.
[root@server1]# curl http://server4:9198
Percona XtraDB Cluster Node is synced.
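As an additional check, you can confirm on any database node that the cluster reports the expected number of members. The sketch below reads the wsrep_cluster_size status variable; it assumes that the mysql client can log in on that node without extra options, and status_value and cluster_size_ok are hypothetical function names.

```shell
#!/usr/bin/env bash
# Sketch: confirm that the cluster has the expected number of members.

# Extracts the value from a "name<TAB>value" line as printed by mysql -N -B.
status_value() {
    printf '%s\n' "$1" | awk -F'\t' 'NR == 1 { print $2 }'
}

# Prints the reported cluster size and returns 0 only if it matches $1.
cluster_size_ok() {
    local expected=$1 line size
    line=$(mysql -N -B -e "SHOW GLOBAL STATUS LIKE 'wsrep_cluster_size'") || return 1
    size=$(status_value "$line")
    echo "wsrep_cluster_size=${size}"
    [ "$size" = "$expected" ]
}

# Usage (three nodes in the sample setup):
#   cluster_size_ok 3
```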

Step 7 - Restart Moog services

$MOOGSOFT_HOME/bin/utils/process_cntl rest_lam start
$MOOGSOFT_HOME/bin/utils/process_cntl socket_lam start
$MOOGSOFT_HOME/bin/utils/process_cntl moog_farmd start
$MOOGSOFT_HOME/bin/utils/process_cntl apache-tomcat start

Fixing the upgraded node if an SST occurs

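When the upgraded node rejoins the cluster, it is important that it synchronizes with the cluster using IST. If an SST occurs, you may need to upgrade the data directory structure again to make sure it is compatible with the newer version of the binaries. From MySQL 8.0.16 onward, the server performs this upgrade at startup in place of a separate mysql_upgrade run, which is why the steps below only restart the node.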

Note

Check the error log for statements like the following to determine if an SST occurred:

“Check if state gap can be serviced using IST ... State gap can’t be serviced using IST. Switching to SST”, which appears in place of the “Receiving IST: ...” lines that an IST synchronization produces.
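The log check described in this note can be scripted. The following minimal sketch searches a log file for the "Switching to SST" and "Receiving IST" phrases quoted in this document; state_transfer_type is a hypothetical function name, and the log path in the usage line is a placeholder that must be replaced with the log-error location from your ~/.my.cnf.

```shell
#!/usr/bin/env bash
# Sketch: classify the state transfer recorded in a mysqld log file.
# Prints "SST", "IST", or "unknown".

state_transfer_type() {
    local log=$1
    if grep -q "Switching to SST" "$log"; then
        echo "SST"
    elif grep -q "Receiving IST" "$log"; then
        echo "IST"
    else
        echo "unknown"
    fi
}

# Usage (placeholder path):
#   state_transfer_type <INSTALL_DIR>/var/log/mysqld.log
```

The log accumulates entries across restarts, so an "SST" result may come from an earlier start. Compare the timestamps of the matching lines with the time of the most recent restart before acting on the result.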

Perform the following additional steps to upgrade the data directory structure after SST completes:

  1. Shut down the node that rejoined the cluster using SST:

    • If this is a database-only node (no Moogsoft Onprem components installed), OR the $MOOGSOFT_HOME/bin/utils/process_cntl mysql status command doesn’t report ‘running’, run the following commands:

      MYSQL_PID=$(ps -ef | grep mysql | egrep -v 'grep|mysqld_safe' | awk '{print $2}')
      kill -s TERM $MYSQL_PID
    • If this database is on a node with Moogsoft Onprem installed AND the $MOOGSOFT_HOME/bin/utils/process_cntl mysql status command reports ‘running’, run the following commands:

      $MOOGSOFT_HOME/bin/utils/process_cntl mysql stop
  2. Restart the node normally and ensure the node rejoins the cluster using IST:

    • If this is a database-only node (no Moogsoft Onprem components installed), run the following command:

      (mysqld_safe > /dev/null 2>/dev/null)&
    • If this is a node with Moogsoft Onprem components, run the following command:

      $MOOGSOFT_HOME/bin/utils/process_cntl mysql start