Availability Engineering

Debugging the Data Service Resource Type Implementation

Computer programs often need to be examined to determine the cause of apparent errors or to gain a better understanding of their source code structure and control flow. This examination is called debugging, since its usual objective is to locate and remove program errors (bugs). The need arises both during the development phase of the software life cycle and during subsequent software maintenance.

In this blog we are interested in examining a program to understand the behavior involved. This requires examining the program source and its input and output behavior. It may also require examining the internal state of the program at interesting points of execution. The purpose of this activity is to help formulate hypotheses about the reason for the perceived aberrant behavior. There are various ways of approaching this: single stepping, inserting breakpoints to suspend and examine the program, adding print statements (such as printf), using the syslog interfaces provided by UNIX, and so on. Error messages and warnings differ from debug statements: they are events from the program that notify the user of abnormal behavior. Debug statements, on the other hand, contain function names, variable names, and similar internal details, so they are more useful to developers than to administrators.
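The difference between an administrator-facing error message and a developer-facing debug statement can be sketched with the standard syslog(3) interface. This is only a minimal illustration and is not taken from any agent: the function names format_debug and report_start_failure, the identifier "demo_agent", and the message text are all hypothetical.

```c
#include <stdio.h>
#include <syslog.h>

/* Hypothetical helper: build a developer-oriented debug line that names
 * the function and the variable being inspected. Returns the number of
 * characters that snprintf would have written. */
int format_debug(char *buf, size_t len, const char *func,
                 const char *var, int value)
{
    return snprintf(buf, len, "%s: %s=%d", func, var, value);
}

/* Hypothetical example: report one failure at two different priorities. */
void report_start_failure(int rc)
{
    char line[128];

    openlog("demo_agent", LOG_PID, LOG_DAEMON);

    /* Error message: an event aimed at the administrator. */
    syslog(LOG_ERR, "Failed to start the application.");

    /* Debug statement: internal detail aimed at the developer. */
    format_debug(line, sizeof (line), __func__, "rc", rc);
    syslog(LOG_DEBUG, "%s", line);

    closelog();
}
```

Whether the LOG_DEBUG line is written anywhere depends on the syslog configuration, which is why debug output normally stays out of the administrator's way.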

The diagram below illustrates the debug model using the syslog interface on UNIX.

With this brief introduction in place, note that Sun Cluster HA agent developers are offered a set of DSDL syslog APIs to assist in debugging data services.

To start, download the Open HA Cluster agent source, build tools, and related binaries from http://opensolaris.org/os/community/ha-clusters/ohac/downloads/. These help you understand the source code and use the APIs that the DSDL provides.

Now let's move on to the DSDL's built-in features for syslog.

The DSDL provides the scds_syslog_debug() utility for adding debugging statements to the resource type implementation. The debugging level (a number from 1 to 9) can be set dynamically for each resource type implementation on each cluster node. To set it, create a file named /var/cluster/rgm/rt/<rtname>/loglevel that contains only an integer between 1 and 9. All of the resource type callback methods read this file: the DSDL function scds_initialize() reads it and sets the debug level accordingly. The scds_syslog_debug() function logs at a priority of LOG_DEBUG, using the facility returned by the scha_cluster_getlogfacility() function. You can configure where these debug messages go in the /etc/syslog.conf file.
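As a rough illustration of the mechanism just described, the sketch below reads a level from a loglevel-style file and emits a debug message only when the message's own level does not exceed the configured one. It uses the plain syslog(3) interface rather than the DSDL, so it does not show how scds_initialize() or scds_syslog_debug() are actually implemented. The names read_loglevel, debug_enabled, and demo_debug are hypothetical, and both the gating rule and the "missing file means no debugging" default are assumptions of this sketch.

```c
#include <stdarg.h>
#include <stdio.h>
#include <syslog.h>

/* Hypothetical helper: read a single integer from a loglevel-style file.
 * Returns the level (1-9), or 0 if the file is missing or invalid;
 * treating that case as "debugging off" is an assumption of this sketch. */
int read_loglevel(const char *path)
{
    int level = 0;
    FILE *fp = fopen(path, "r");

    if (fp == NULL)
        return 0;
    if (fscanf(fp, "%d", &level) != 1 || level < 1 || level > 9)
        level = 0;
    fclose(fp);
    return level;
}

/* Assumed gating rule: a message is logged only when its own level
 * does not exceed the configured level. */
int debug_enabled(int configured, int msg_level)
{
    return msg_level >= 1 && msg_level <= configured;
}

/* Hypothetical wrapper: log at LOG_DEBUG if the level check passes.
 * Returns 1 if the message was handed to syslog, 0 otherwise. */
int demo_debug(int configured, int msg_level, const char *fmt, ...)
{
    char line[256];
    va_list ap;

    if (!debug_enabled(configured, msg_level))
        return 0;
    va_start(ap, fmt);
    vsnprintf(line, sizeof (line), fmt, ap);
    va_end(ap);
    syslog(LOG_DAEMON | LOG_DEBUG, "%s", line);
    return 1;
}
```

With this sketch, a call such as demo_debug(read_loglevel(path), 5, "validate: port=%d", port) would log only when the file at path holds 5 or higher (path and port are placeholders). In a real agent, scds_syslog_debug() plays this role once scds_initialize() has run.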

Create a loglevel file under /var/cluster/rgm/rt/<rtname>/. Edit the loglevel file and add "9" as the debug level.

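Restart the syslog daemon so that it picks up the configuration:

# pkill -9 syslogd

Note that this kills syslogd and relies on the system starting it again. Sending SIGHUP instead (pkill -HUP syslogd) makes syslogd reread /etc/syslog.conf without being killed.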

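You should now see the debug messages in /var/adm/ds.out, provided that /etc/syslog.conf directs debug-priority messages for the cluster's log facility to that file.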

Perform failovers and switchovers (using the cluster administration commands) to see how the agent behaves. Reading these debug messages and correlating them with the code gives you a much better understanding of the control flow of these agents.