Setting Up IBM Data Server Manager as a Highly Available Service


Abstract

This IBM® Redbooks® web doc describes how to build, configure, and deploy IBM Data Server Manager as a high availability (HA) service on a Linux system. Data Server Manager runs on an HA cluster that includes one master node and one backup node. If Data Server Manager stops running on one of the nodes, the other instance of Data Server Manager starts to ensure that the Data Server Manager service is always available. Critical configuration and user data for Data Server Manager are continuously synchronized between the two Data Server Manager nodes.

This web doc guides you through building a Data Server Manager HA solution based on IBM Tivoli® System Automation for Multiplatforms (Tivoli SA MP). Because it must operate on system files, installing and configuring the Data Server Manager HA system requires Linux root authority in most of the following steps. After Data Server Manager HA is installed and configured, a user can view HA cluster information and manage Data Server Manager without root authority.

If you are using the Data Server Manager historical repository, you should also consider making the IBM Db2® database that stores the historical repository highly available. This action helps to ensure uninterrupted access to critical historical monitoring data. IBM Db2 High Availability Disaster Recovery (HADR) and IBM Db2 pureScale® are good options to consider. This document applies to IBM Data Server Manager Version 2.1.2 and later.

With IBM Data Server Manager, you can monitor, analyze, tune, and administer Db2 databases. This document describes how to build the Data Server Manager high availability (HA) environment based on IBM Tivoli® System Automation for Multiplatforms (Tivoli SA MP). Two servers with the Linux operating system are needed. The operating system used in this document is Red Hat Enterprise Linux Server Release 6.7.

Generally, the Tivoli SA MP installation package is bundled with the Db2 installation package. You must manually install Tivoli SA MP on both the master and backup nodes. The Tivoli SA MP installation package can be obtained from the server/db2/linuxamd64/tsamp directory of the Db2 installation package. Perform the following steps to install Tivoli SA MP:

Install the prerequisite components by running the following commands (you must have root authority).
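The original command listing is not reproduced here; the following is a hedged sketch of a typical Tivoli SA MP installation from the Db2 media. The prereqSAM and installSAM scripts ship in the tsamp directory named above; the relative path shown is an assumption about where you unpacked the Db2 installation image.

```shell
# Run on BOTH the master and backup nodes with root authority.
# Path below assumes the Db2 image was unpacked in the current directory.
cd server/db2/linuxamd64/tsamp

# Check that the operating system meets the Tivoli SA MP prerequisites.
./prereqSAM

# Install Tivoli SA MP (RSCT base components and the automation engine).
./installSAM
```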

Data Server Manager user data in the ibm-datasrvrmgr/Config folder is sensitive. The Data Server Manager HA cluster keeps this data safe by synchronizing backups across nodes. After the rsync and inotify tools are installed, any change in this folder is synchronized from one node to the other. If Data Server Manager fails on one node, the instance on the other node starts with the same configuration data.

The rsync and inotify tools must be installed on both the master and backup nodes. Run the following commands on both nodes. These tools monitor and synchronize ibm-datasrvrmgr/Config file changes between the two nodes (root authority is needed):
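The original commands are not reproduced here; a hedged sketch for Red Hat Enterprise Linux 6.x follows. The assumption is that rsync is available from the base repositories and inotify-tools from an additional repository such as EPEL.

```shell
# Run on BOTH nodes with root authority.
yum install -y rsync
yum install -y inotify-tools   # may require the EPEL repository to be enabled

# Confirm that both tools are on the PATH.
which rsync inotifywait
```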

The two nodes act as a Tivoli SA MP cluster. For the remainder of this document, the host names of the two nodes are dsm-master and dsm-backup. You can use different host names, but be sure to replace them in the subsequent steps where they are used. Complete the following steps.

On both master and backup nodes, update the /etc/hosts file with the fully qualified domain name (root authority needed). Figure 5 shows the file contents.
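Figure 5 is not reproduced here; the following is a hypothetical example of the required /etc/hosts entries. The 9.111.97.x addresses and the example.com domain are placeholders, not values from the original document.

```shell
# Example /etc/hosts entries on both nodes; substitute your real
# IP addresses and fully qualified domain names.
9.111.97.121   dsm-master.example.com   dsm-master
9.111.97.122   dsm-backup.example.com   dsm-backup
```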

Figure 5. The contents of /etc/hosts

(Optional) Modify the host name value on both nodes by editing the /etc/sysconfig/network system file. Set HOSTNAME to dsm-master on the master node (Figure 6) and dsm-backup on the backup node (Figure 7); root authority is needed:

[root@dsm-master ~]# cat /etc/sysconfig/network

Figure 6. Set hostname for dsm-master

Figure 7. Set hostname for dsm-backup
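Figures 6 and 7 are not reproduced here; a hedged sketch of the /etc/sysconfig/network contents follows. The NETWORKING line is an assumption based on the usual layout of this file on Red Hat Enterprise Linux 6.

```shell
# /etc/sysconfig/network on the master node
# (set HOSTNAME=dsm-backup in the same file on the backup node)
NETWORKING=yes
HOSTNAME=dsm-master
```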

On the master node, run the following command:

[root@dsm-master ~]# hostname dsm-master

On the backup node, run the following command:

[root@dsm-backup ~]# hostname dsm-backup

Reboot the master and backup nodes after these configuration changes are complete.

Configuration on both nodes

Configure Secure Shell (SSH) on both nodes so that file changes can be synchronized between these two nodes:

Run the ssh-keygen command on both the master and backup nodes:

[root@dsm-master .ssh]# ssh-keygen -t rsa

Copy the contents of the id_rsa.pub file (Figure 8) on the master node and append it to the /root/.ssh/authorized_keys file on the backup node. Also copy the contents of id_rsa.pub on the backup node and append it to the /root/.ssh/authorized_keys file on the master node.

Figure 8. Content of the id_rsa.pub file
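The key exchange described above can also be done with ssh-copy-id, which appends the public key to the remote authorized_keys file for you. This is an equivalent alternative sketch, not the procedure from the original figures.

```shell
# On dsm-master, as root:
ssh-copy-id root@dsm-backup

# On dsm-backup, as root:
ssh-copy-id root@dsm-master

# Verify that password-less login now works in both directions,
# for example from the master node:
ssh root@dsm-backup hostname
```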

Run the Tivoli SA MP preprpnode command on both nodes before creating a domain (root authority needed):

[root@dsm-master .ssh]# preprpnode dsm-master dsm-backup

Creating a domain for two nodes

On the master node only, run the Tivoli SA MP commands to create a domain for the Data Server Manager HA cluster, start the domain, and check its status. Then make dsm a resource on the master node (root authority needed) by using the following commands:
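The original command listing is not reproduced here; a hedged sketch follows. The domain name dsm_domain is an assumption, and the preprpnode step from the previous section must already be complete on both nodes.

```shell
# Run on the master node with root authority.

# Create a peer domain that contains both nodes.
mkrpdomain dsm_domain dsm-master dsm-backup

# Start the domain.
startrpdomain dsm_domain

# Check that the domain and both nodes are online.
lsrpdomain
lsrpnode
```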

To create a resource, you need a script that includes start, stop, and status commands. You also need a definition file. Create a resource for Data Server Manager as follows:

On both the dsm-master and dsm-backup nodes, create a script with the name dsm. The script implements the Tivoli SA MP start, stop, and status actions and maps them to the Data Server Manager start, stop, and status scripts. Save the script in a directory such as /etc/init.d/, as illustrated in this example. The directory highlighted in bold is the directory where Data Server Manager is installed (root authority might be needed).
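The original script is not reproduced here; the following is a hedged sketch of what such a control script could look like. The install directory and the bin/start.sh and bin/stop.sh script names are assumptions about the Data Server Manager installation; adjust them to yours. The exit codes in the status action follow the Tivoli SA MP IBM.Application monitor convention (1 = online, 2 = offline).

```shell
#!/bin/sh
# Hypothetical /etc/init.d/dsm control script -- adapt paths to your setup.
DSM_HOME=/opt/ibm-datasrvrmgr    # assumed Data Server Manager install dir

case "$1" in
  start)
    "$DSM_HOME"/bin/start.sh
    ;;
  stop)
    "$DSM_HOME"/bin/stop.sh
    ;;
  status)
    # Tivoli SA MP monitor convention: exit 1 = online, exit 2 = offline.
    if pgrep -f datasrvrmgr >/dev/null; then
      exit 1
    else
      exit 2
    fi
    ;;
esac
exit 0
```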

On the dsm-master node, create a definition file named dsm.def. The directory highlighted in bold is the directory where the dsm script is created. Save the file in the same directory as the dsm script, which is /etc/init.d/.
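The original dsm.def contents are not reproduced here; a hedged sketch in the RSCT mkrsrc input-file format follows. All attribute values, including the timeouts, are assumptions; the resource would be created with a command such as mkrsrc -f /etc/init.d/dsm.def IBM.Application.

```shell
# Hypothetical dsm.def -- a floating IBM.Application resource that can
# run on either node; adjust values to your environment.
PersistentResourceAttributes::
Name="dsm"
StartCommand="/etc/init.d/dsm start"
StopCommand="/etc/init.d/dsm stop"
MonitorCommand="/etc/init.d/dsm status"
MonitorCommandPeriod=30
MonitorCommandTimeout=25
StartCommandTimeout=120
StopCommandTimeout=120
UserName="root"
ResourceType=1
NodeNameList={"dsm-master","dsm-backup"}
```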

Choose an available IP address as the virtual IP for the Data Server Manager service. Using this virtual IP makes Data Server Manager switching between the master and backup nodes transparent to users. Regardless of which node the Data Server Manager service is active on, users see only the virtual IP. Complete the following steps:

Choose an available IP address (9.111.97.120 is used in this example) as the virtual IP. Then, create the virtual IP resource on the master node (root authority needed):
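The original command listing is not reproduced here; a hedged sketch of a typical Tivoli SA MP service-IP setup follows. The netmask, the eth0 interface name, and the resource, equivalency, and group names are assumptions; only 9.111.97.120 and the dsm resource come from the text.

```shell
# Run on the master node with root authority.

# Create the virtual IP as an IBM.ServiceIP resource.
mkrsrc IBM.ServiceIP Name="dsm_ip" \
    IPAddress="9.111.97.120" NetMask="255.255.255.0" \
    NodeNameList="{'dsm-master','dsm-backup'}"

# Define an equivalency of the network interfaces that can host the IP.
mkequ dsm_netequ IBM.NetworkInterface:eth0:dsm-master,eth0:dsm-backup

# Put Data Server Manager and the virtual IP in one resource group so
# that they always fail over together.
mkrg dsm_rg
addrgmbr -g dsm_rg IBM.Application:dsm IBM.ServiceIP:dsm_ip

# The service IP depends on a working network interface.
mkrel -p DependsOn -S IBM.ServiceIP:dsm_ip \
      -G IBM.Equivalency:dsm_netequ dsm_ip_on_net

# Bring the resource group online.
chrg -o online dsm_rg
```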

To synchronize data between the master and backup nodes, notification resources can be added on both nodes as follows:

Create a script named syncup.sh and save it on both the dsm-master and dsm-backup nodes. Save syncup.sh in a directory of your choosing, but be sure to refer to the proper location in subsequent steps. In these examples, the script is saved in /root/syncup. Replace the value that is assigned to inotifyDir with the directory where the inotifywait command resides on the master and backup nodes. The following command can be used to locate the directory:

[root@dsm-master bin]# locate inotifywait

Figure 18 shows the output of the locate inotifywait command.

Figure 18. Output of the locate inotifywait command

a. Replace the value that is assigned to destIP with the IP address of the pair server. On the master node, destIP is the IP address of the backup node; on the backup node, destIP is the IP address of the master node.

b. Replace the value that is assigned to srcDir with the directory where Data Server Manager is installed on the master node. Replace the value that is assigned to destDir with the directory where Data Server Manager is installed on the backup node.

In this script, the Data Server Manager Config/ folder is defined to be synchronized between the master and backup nodes according to the specified policy:
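The original syncup.sh is not reproduced here; a hedged sketch of the inotifywait-plus-rsync pattern it describes follows. Every value below (paths, IP address, event list) is an assumption to be replaced as described in the steps above.

```shell
#!/bin/sh
# Hypothetical /root/syncup/syncup.sh -- adapt all values to your setup.
inotifyDir=/usr/bin                 # directory containing inotifywait
destIP=9.111.97.122                 # IP address of the pair server
srcDir=/opt/ibm-datasrvrmgr         # DSM install dir on this node
destDir=/opt/ibm-datasrvrmgr        # DSM install dir on the pair server

# Watch the Config folder recursively and push every change
# to the pair server over SSH (password-less login is set up above).
"$inotifyDir"/inotifywait -mrq \
    -e modify,create,delete,attrib,move \
    "$srcDir/Config" |
while read -r path event file; do
    rsync -az --delete "$srcDir/Config/" "root@$destIP:$destDir/Config/"
done
```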

On both the dsm-master and dsm-backup nodes, create a script with the name inotify. The script contains start, stop, and status information. Save the script in a directory of your choosing. In this example, the script is saved in the /etc/init.d directory. The /root/syncup directory is where syncup.sh was created in the previous step.
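The original inotify script is not reproduced here; a hedged sketch follows, using the same IBM.Application exit-code convention as the dsm script (1 = online, 2 = offline). The syncup.sh path is the example location from the previous step.

```shell
#!/bin/sh
# Hypothetical /etc/init.d/inotify control script -- adapt paths as needed.
SYNCUP=/root/syncup/syncup.sh

case "$1" in
  start)
    # Run the watcher in the background, detached from the terminal.
    nohup "$SYNCUP" >/dev/null 2>&1 &
    ;;
  stop)
    pkill -f syncup.sh
    ;;
  status)
    # Tivoli SA MP monitor convention: exit 1 = online, exit 2 = offline.
    if pgrep -f syncup.sh >/dev/null; then
      exit 1
    else
      exit 2
    fi
    ;;
esac
exit 0
```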

Create a definition file with name inotify.def on the dsm-master node. The directory highlighted in bold in the following example is the directory where inotify is created. Save the file in the same directory where the inotify script was saved in a previous step. In this example, that directory is /etc/init.d.

Create a definition file with name inotify2.def on the dsm-backup node. The directory in bold is where inotify is created. Save the file in the same directory where the inotify script was saved in a previous step. In this example, that directory is /etc/init.d.
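The original definition files are not reproduced here; a hedged sketch of inotify.def follows. The assumption is that inotify2.def on dsm-backup is identical except for its Name and NodeNameList, and that each resource is created with mkrsrc -f <file> IBM.Application; all attribute values are illustrative.

```shell
# Hypothetical inotify.def for dsm-master -- a fixed IBM.Application
# resource pinned to one node (inotify2.def uses NodeNameList={"dsm-backup"}).
PersistentResourceAttributes::
Name="inotify"
StartCommand="/etc/init.d/inotify start"
StopCommand="/etc/init.d/inotify stop"
MonitorCommand="/etc/init.d/inotify status"
MonitorCommandPeriod=30
MonitorCommandTimeout=25
UserName="root"
ResourceType=0
NodeNameList={"dsm-master"}
```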

Use the lssam command to display the status of the cluster; this command shows which node is online:

[root@dsm-master init.d]# lssam -V

Figure 19 shows the output of the lssam -V command.

Figure 19. Output of the lssam -V command to display status of the cluster

How HA works when Data Server Manager fails

When Data Server Manager fails, HA works in the following ways:

If Data Server Manager is stopped on the active node, Tivoli SA MP will try to restart it.

If Data Server Manager is restarted successfully, it will continue to run on the same node.

When Tivoli SA MP fails to restart Data Server Manager after several attempts (the number of attempts can be configured), Data Server Manager HA fails over to the other node: Tivoli SA MP switches from the master node to the backup node, or from the backup node to the master node. To observe this behavior, you can temporarily move (or rename) some critical Data Server Manager installation files. With critical files missing, Tivoli SA MP cannot restart Data Server Manager on the master node. Tivoli SA MP then switches to the backup node and starts Data Server Manager there. You can run the lssam -V command, which indicates that the backup node is online and the master node is offline.

In the following example, ibm-datasrvrmgr/bin is renamed to ibm-datasrvrmgr/bin_bak. With this change, Tivoli SA MP cannot start Data Server Manager.

[root@dsm-master ibm-datasrvrmgr]# mv bin bin_bak

Figure 20 shows the output of this command.

Figure 20. Output of the mv bin bin_bak command

Run the lssam -V command. Note that the dsm-master node is now in a "Pending online" state. Tivoli SA MP is trying to restart Data Server Manager on the master node.

After several failed attempts to start Data Server Manager on the master node, Tivoli SA MP starts Data Server Manager on the backup node. Run the command lssam -V to see that the dsm-backup node is online now.

Figure 22 shows the output of the lssam -V command at this point.

Figure 22. Output of the lssam -V command to see that the dsm-backup node is online

Fix the issue on the master node by renaming ibm-datasrvrmgr/bin_bak back to ibm-datasrvrmgr/bin.
Reset the dsm-master node resource by running the following command:
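The original command is not reproduced here; a hedged sketch using the RSCT resetrsrc command follows. The resource name dsm and the node name come from the examples above.

```shell
# Reset the failed dsm resource instance on the master node (root authority).
resetrsrc -s 'Name="dsm" && NodeNameList={"dsm-master"}' IBM.Application
```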

Figure 23. Output of the lssam -V command to see that the dsm-master node is offline

Users do not need to do anything manually during failover. They still visit the same website regardless of which node is running Data Server Manager. This transparency is the result of configuring virtual IP resources and binding them with Data Server Manager resources.

Figure 24 shows that the Data Server Manager website with a virtual IP is always accessible during failover, except during the intervals when Tivoli SA MP attempts to restart Data Server Manager on the master node.

Figure 24. Access Data Server Manager via virtual IP

Related information

For more information, see the following topics in IBM Knowledge Center:

Special Notices

This material has not been submitted to any formal IBM test and is published AS IS. It has not been the subject of rigorous review. IBM assumes no responsibility for its accuracy or completeness. The use of this information or the implementation of any of these techniques is a client responsibility and depends upon the client's ability to evaluate and integrate them into the client's operational environment.