Improve DBA Operation Productivity – Part Two

2012/07/31

As DBAs, we have to take various actions based on the huge amount of information we receive. From one perspective, our productivity is limited by the bandwidth we have to process that information. And if the information is frequently “polluted” or even “poisoned”, our productivity is further compromised.

I define “polluted information” as information that distracts our attention (even for one second) yet that we do not care about, such as job succeeded/completed messages, backup successful messages, or login successful entries in the SQL Server log (unless, of course, you have a very special reason to need such info). In a broader sense, even information that is necessary in principle, such as a tlog backup job failure notification, can be considered “somewhat polluted”. My argument is that in a complex environment, when a tlog backup job fails the first time (or even the second), I usually do not care, because a network glitch can happen at any time; however, if the failures continue more than twice, I become interested in finding out what is going wrong.
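The consecutive-failure rule above can be sketched in a few lines of Python. This is a hypothetical helper, not tied to any particular monitoring tool: it tracks consecutive failures per job and only releases an alert after the same job has failed more than twice in a row, swallowing one-off glitches.

```python
from collections import defaultdict

# Ignore the first two consecutive failures (likely transient,
# e.g. a network glitch); alert from the third failure onward.
FAILURE_THRESHOLD = 2

# Running count of consecutive failures, keyed by job name.
_consecutive_failures = defaultdict(int)

def should_alert(job_name, succeeded):
    """Return True only when a job has failed more than
    FAILURE_THRESHOLD times in a row; a success resets the count."""
    if succeeded:
        _consecutive_failures[job_name] = 0  # back to normal
        return False
    _consecutive_failures[job_name] += 1
    return _consecutive_failures[job_name] > FAILURE_THRESHOLD
```

With this in place, two isolated tlog backup failures generate no noise at all, while a third consecutive failure pages the DBA.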

Although we can set up various filters to screen out this unwanted info, we still lose out to the unnecessary overhead work. What if we could avoid the “polluted” information entirely in the first place?

I define “poisoned information” as false information that we are unable to recognize as false at the time. I still remember one night as the DBA on duty, I was woken up at around 2:30am by my cell phone because I had received a critical error message: one production server had its tempdb log drive full. After I got up and logged into the server, I found it was a false alarm, and I then had to send a confirmation email to all stakeholders. After that (about 15 minutes later), I could not get back to sleep at all, and the result was a sleepy, non-productive day with dark circles under my eyes.

So improving productivity here is actually pretty straightforward: set up a policy that suppresses all information that does not need immediate attention, and build a facility that automatically double- or triple-validates critical information before it is released.
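The double/triple validation idea can be sketched as follows. This is a minimal illustration, assuming a caller supplies any callable that returns True when the problem condition (say, a full tempdb log drive) is observed; the alert is released only if every re-check confirms it.

```python
import time

def validate_before_alert(check_condition, retries=3, delay_seconds=30):
    """Re-run a critical check several times before raising an alert.

    `check_condition` is a callable returning True while the problem
    is observed.  If any re-check comes back clear, the whole event
    is treated as a false alarm and suppressed.
    """
    for attempt in range(retries):
        if not check_condition():
            return False  # condition cleared: a false alarm, stay quiet
        if attempt < retries - 1:
            time.sleep(delay_seconds)  # wait before re-checking
    return True  # confirmed on every check: safe to page the DBA
```

Had something like this sat in front of the 2:30am tempdb alert, the follow-up check would have come back clear and the page would never have been sent.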

In a broader sense, avoiding “polluted” or “poisoned” information is also good practice for “green database administration”, as this unnecessary info is generated at a cost (some I/O and CPU) over its lifecycle.