Configuring KDump in an IBM Smart Analytics System 5600 or a Balanced Warehouse D5100 environment

Technote (FAQ)

Question

How do you configure KDump to capture operating system dump files in an IBM Smart Analytics System 5600 environment or a Balanced Warehouse D5100 environment?

Answer

KDump is a Novell Linux tool that collects an operating system dump file containing additional diagnostic information about a system crash or hang. Use it when the standard Linux log files do not provide enough information to determine the root cause. You can configure KDump in your IBM Smart Analytics System 5600 environment or your Balanced Warehouse D5100 environment to collect an operating system dump file in the event of a system crash.

A. Verify that the correct RPM packages are installed on your system

1. Log in as the root user on each administration node, data node, and standby node, and verify that the correct Novell RPM packages are installed on your system:

rpm -qa | grep <package_name>

where <package_name> represents the name of the Novell package.

The following RPM packages need to be installed before you can configure and use KDump in your environment:

kernel-kdump

kdump

kexec-tools

Note: The version of the kernel-kdump package has to match the version of the Linux kernel you are running in your system.

Example: If you are running kernel-smp-2.6.16.60-0.85.1 as your kernel version, then you have to install the kernel-kdump-2.6.16.60-0.85.1 package.

2. If you do not have the correct Novell RPM packages installed, you will need to download the correct packages from the Novell site.

Note: Access to the packages on the Novell site is restricted and requires a valid Novell license and ID.
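
The package check in this section can be scripted. The following is a minimal sketch and not part of the original procedure: the function names kernel_pkg_version, same_kernel_version, and check_kdump_packages are hypothetical, and the version comparison assumes that kernel package names follow the kernel-<flavor>-<version> pattern shown in the example above.

```shell
# Hypothetical helpers; the names are illustrative, not from the technote.

# Print the version part of a kernel package name by removing the
# leading "kernel-<flavor>-" prefix (assumed naming pattern).
kernel_pkg_version() {
    printf '%s\n' "$1" | sed 's/^kernel-[a-z]*-//'
}

# Succeed when two kernel package names carry the same version string.
same_kernel_version() {
    [ "$(kernel_pkg_version "$1")" = "$(kernel_pkg_version "$2")" ]
}

# Report each of the three required packages that rpm does not find.
# Returns non-zero if at least one package is missing.
check_kdump_packages() {
    missing=0
    for pkg in kernel-kdump kdump kexec-tools; do
        if ! rpm -q "$pkg" >/dev/null 2>&1; then
            echo "missing package: $pkg"
            missing=1
        fi
    done
    return "$missing"
}
```

Run check_kdump_packages as the root user on each node. The call same_kernel_version kernel-smp-2.6.16.60-0.85.1 kernel-kdump-2.6.16.60-0.85.1 succeeds because both names carry the same version string.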

B. Configure the kernel parameters

1. Log in as root on each administration node, data node, and standby node.

2. Edit the /etc/sysctl.conf file and add the following parameters:

kernel.suid_dumpable = 1
kernel.sysrq = 1
kernel.panic_on_oops = 1

3. Issue the following command to reload the kernel configuration from the /etc/sysctl.conf file:

sysctl -p

4. Optionally, you can run the following commands to verify that the kernel parameters have been changed.

a. sysctl -A | grep dumpable

The command should return the following output:
kernel.suid_dumpable = 1

b. sysctl -A | grep sysrq

The command should return the following output:
kernel.sysrq = 1

c. sysctl -A | grep panic_on_oops

The command should return the following output:
kernel.panic_on_oops = 1
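
If you repeat this section on many nodes, editing the file by hand can leave duplicate or conflicting entries. The sketch below shows one way to set the three parameters so that repeated runs leave exactly one line per parameter. The function names ensure_setting and apply_kdump_sysctl are hypothetical, and the sketch assumes that the file contains plain "key = value" lines.

```shell
# Hypothetical helper: set "key = value" in a sysctl-style file.
# An existing assignment of the key is rewritten in place; otherwise the
# line is appended, so repeated runs do not create duplicates.
# Note: dots in the key are treated as regex wildcards, which is
# acceptable for the three parameter names used here.
ensure_setting() {
    conf_file=$1
    key=$2
    value=$3
    if grep -q "^[[:space:]]*${key}[[:space:]]*=" "$conf_file" 2>/dev/null; then
        tmp_file=$(mktemp) || return 1
        sed "s|^[[:space:]]*${key}[[:space:]]*=.*|${key} = ${value}|" \
            "$conf_file" > "$tmp_file" &&
            cat "$tmp_file" > "$conf_file"
        status=$?
        rm -f "$tmp_file"
        return "$status"
    fi
    printf '%s = %s\n' "$key" "$value" >> "$conf_file"
}

# Apply the three parameters from step 2 to the file given as $1.
apply_kdump_sysctl() {
    for param in kernel.suid_dumpable kernel.sysrq kernel.panic_on_oops; do
        ensure_setting "$1" "$param" 1 || return 1
    done
}
```

As the root user, you could run apply_kdump_sysctl /etc/sysctl.conf and then sysctl -p to load the new values.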

C. Configure the GRUB boot string

You need to add a kernel parameter to the /boot/grub/menu.lst file to reserve a specific amount of memory for KDump in the event of a system crash.

1. Determine the amount of memory that needs to be reserved for KDump.

a. Determine the amount of physical memory on each administration node, data node, and standby node.

b. Identify the appropriate value for the crashkernel parameter from the table below.

Memory on the node      Value of the crashkernel parameter
13 - 48 GB              128M@16M
49 - 128 GB             256M@16M
129 - 256 GB            512M@16M

Example: For a node with 64 GB of physical memory, select the value in the second row: 256M@16M.

2. Log in as the root user on each administration node, data node, and standby node, and add the following string to the node boot string in the /boot/grub/menu.lst file:

crashkernel=<value_from_table>

where <value_from_table> represents the string in the table you identified in step 1b.

Example: Using the previous example of a node with 64 GB of physical memory, you would add the crashkernel=256M@16M value to the node boot string in the /boot/grub/menu.lst file.

3. If you are configuring only the GRUB boot string, reboot each node now to allow the memory change to take effect.
If you are configuring KDump for the first time, you can skip this step and continue to section D because you will reboot the system after completing that task.
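
The table lookup in step 1 and the edit in step 2 can be checked with a short script. This is a sketch under stated assumptions, not part of the original procedure: the function names crashkernel_for_gb and kernel_lines_without_crashkernel are hypothetical, the memory size is passed in as a whole number of GB, and the second function assumes that boot entries use GRUB "kernel" lines.

```shell
# Hypothetical helper: map physical memory in whole GB to the
# crashkernel value from the table in step 1b.
crashkernel_for_gb() {
    mem_gb=$1
    case $mem_gb in
        ''|*[!0-9]*) echo "not a whole number: $mem_gb" >&2; return 2 ;;
    esac
    if [ "$mem_gb" -ge 13 ] && [ "$mem_gb" -le 48 ]; then
        echo "128M@16M"
    elif [ "$mem_gb" -ge 49 ] && [ "$mem_gb" -le 128 ]; then
        echo "256M@16M"
    elif [ "$mem_gb" -ge 129 ] && [ "$mem_gb" -le 256 ]; then
        echo "512M@16M"
    else
        # The table has no row for this amount of memory.
        echo "no table entry for $mem_gb GB" >&2
        return 1
    fi
}

# Print (with line numbers) the "kernel" lines in a menu.lst-style file
# that do not yet contain a crashkernel= parameter. The exit status is
# that of the last grep: 0 if such lines were printed, 1 if none exist.
kernel_lines_without_crashkernel() {
    grep -n '^[[:space:]]*kernel[[:space:]]' "$1" | grep -v 'crashkernel='
}
```

For the 64 GB example, crashkernel_for_gb 64 prints 256M@16M. Running kernel_lines_without_crashkernel /boot/grub/menu.lst after the edit lists any boot entries that still lack the parameter.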

D. Configure the KDump parameters

Configure KDump by editing the KDump configuration file, /etc/sysconfig/kdump. Set the KDump configuration settings to the same values on all administration nodes, data nodes, and standby nodes in your system.

1. Configure the KDUMP_RUNLEVEL parameter. This parameter sets the runlevel in which the KDump kernel boots.

If the vmcore file is written to a file system that is mounted in multi-user mode or to a file system that is mounted over the network (for example, an NFS-mounted directory), set the value to runlevel 3. Add the following string to the KDump configuration file to set the parameter to runlevel 3:

KDUMP_RUNLEVEL="3"

If the vmcore file is written to any other type of file system, use the default value of runlevel 1. Add the following string to the KDump configuration file:

KDUMP_RUNLEVEL="1"

2. Configure the KDUMP_SAVEDIR parameter. This parameter determines the directory where the vmcore dump file will be copied. You can set the value to any directory you prefer. The following examples show how to set the value to a local directory on the node and how to set the value to an NFS-mounted directory.

NOTE: Do not use any of the preconfigured file systems, such as /db2home, /db2path, or file systems under the /db2fs mount point, as the KDump directory.

Example 1: Copy the dump file to a local directory

To copy the dump file to the
/work_files/DUMPDIR/data01 local directory on the node, add the following string to the KDump configuration file:

The vmcore dump file is copied to the directory
/work_files/DUMPDIR/data01/<timestamp>.

Example 2: Copy the dump file to an NFS-mounted directory

To copy the dump file to an NFS-mounted directory, add the following string to the KDump configuration file:

KDUMP_SAVEDIR="nfs://10.11.12.13://DUMPDIR/data02"

If the node boots to the KDump kernel, the vmcore dump file is copied to the /DUMPDIR/data02/<timestamp> directory on the node with the IP address 10.11.12.13. The /DUMPDIR/data02/ directory is mounted to the /mnt directory on the local node.

Note: The example assumes that the /DUMPDIR/data02 directory is exported on the NFS server (10.11.12.13) and that it can be NFS-mounted. You can use the following command to verify that the file system can be mounted:

mount -t nfs 10.11.12.13://DUMPDIR/data02 /mnt

The NFS directory does not need to be mounted to the /mnt mount point for KDump to work.

3. Configure the KDUMP_DUMPLEVEL parameter. This parameter determines the dump level, that is, what is dumped and what is stripped. Specify a value in the range 0 - 31. If you specify a value of 0, nothing is stripped from the vmcore file and the file is as large as possible. If you specify a value of 31, the maximum amount is stripped from the vmcore file and the file is as small as possible.

To set the dumplevel to the value 31, add the following string to the KDump configuration file:

KDUMP_DUMPLEVEL=31

4. Configure the KDUMP_DUMPFORMAT parameter. This parameter determines the format of the dump file and can be used to reduce the size of the vmcore dump file. To specify a smaller compressed dump file, add the following string to the KDump configuration file:

KDUMP_DUMPFORMAT="compressed"

5. On each administration node, data node, and standby node, set KDump to start each time the node reboots:

chkconfig kdump on

6. Start KDump by issuing the following command on each administration node, data node, and standby node:

rckdump restart

Note: Run the rckdump restart command each time you change the KDump configuration file. You do not need to reboot the node to activate the changes.

7. If you are configuring KDump for the first time, reboot each administration node, data node, and standby node to activate the change you made to the GRUB boot string.
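
Because the settings must be the same on every node, a quick consistency check before you restart KDump can catch typing errors. The sketch below is an illustration only: the function name check_kdump_config is hypothetical, it tests only the two constraints stated in this section (a dump level from 0 to 31, and runlevel 3 when the save directory is an nfs:// location), and it assumes that the configuration file consists of trusted shell-style variable assignments, because the file is sourced.

```shell
# Hypothetical checker for a kdump sysconfig-style file. It only tests
# the constraints described in this section, not every KDump option.
# Pass a path that contains a slash, for example /etc/sysconfig/kdump.
check_kdump_config() (
    # The function body is a subshell, so sourced variables do not leak
    # into the calling shell.
    unset KDUMP_RUNLEVEL KDUMP_SAVEDIR KDUMP_DUMPLEVEL
    . "$1" || exit 2
    rc=0
    # The dump level is expected to be set to a number from 0 to 31.
    case $KDUMP_DUMPLEVEL in
        ''|*[!0-9]*)
            echo "KDUMP_DUMPLEVEL must be a number from 0 to 31"; rc=1 ;;
        *)
            if [ "$KDUMP_DUMPLEVEL" -gt 31 ]; then
                echo "KDUMP_DUMPLEVEL must be a number from 0 to 31"; rc=1
            fi ;;
    esac
    # A network save directory requires runlevel 3 (see step 1).
    case $KDUMP_SAVEDIR in
        nfs://*)
            if [ "$KDUMP_RUNLEVEL" != 3 ]; then
                echo "NFS save directory needs KDUMP_RUNLEVEL=\"3\""; rc=1
            fi ;;
    esac
    exit "$rc"
)
```

For example, check_kdump_config /etc/sysconfig/kdump && rckdump restart restarts KDump only when both checks pass.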

E. Verify that KDump is configured correctly

Optionally, you can verify that KDump is configured correctly by simulating a system crash and generating a vmcore file.

Note: The simulation crashes the system, causes it to reboot, and generates the vmcore file.

1. Log in as the root user on each administration node, data node, and standby node, and issue the following command: