22 June, 2007

datacenter confidential #2

I have become used to a certain degree of derision from Steve, the datacenter facilities manager. Here is a man in his mid-50s who has probably seen and done anything and everything I am likely to accomplish in my career 10 times over.

But there was something different in his eyes when he and I both realized that my predecessor had foolishly plugged a 15-amp remote power controller (RPC) into a vertical power strip that was itself plugged into a 20-amp power circuit.

"That thing could catch fire," Steve told me, in the shrillest of tones.

I looked back at him, confused and a little hurt. "What amperage do we require you to use on a 20-amp circuit?" continued Steve.

"15 amps, 80%," I repeated, by rote.

"What's 80% of 15 amps?"

"12 amps?" I replied, using quick math.

The circuit that 15-amp RPC was plugged into was drawing nearly 11 amps. It dawned on me that if I plugged anything else into that RPC there was a good chance that, at best, I would blow the breaker inside the RPC and, at worst, it could catch fire.

This RPC, which I had been haphazardly plugging things into for over a year and a half, was some sort of half-assed jury rig -- a legacy from my predecessor, the same genius who slung all the Ethernet cables hither and thither, put the internal DNS on the corporate web server (a Win2k box), and basically committed every offense conceivable to someone with my background (software RAID? are you kidding me?).

Steve made it clear to me that I had to fix it, and now. It was 10am, and the install was shot -- only later did I realize it was shot in many different ways. I spent the morning snaking cables, tracing power plugs, and moving things around while Greg from the vendor dutifully prepared my two new SAN trays, which I had no hope of plugging in.

"The fibre cables are too short," Greg told me at some point. Sure enough, the cables would not reach the head unit several dozen "U"s (units) down. "I'll call Amanda (our faithful implementation manager at the vendor) and order longer cables -- 2 meters?"

"Yeah, that sounds right," I replied absentmindedly while tracing power cords through two racks. The Fucker, my predecessor, had done it again: he had stolen power from one rack to feed the rack next to it -- another no-no.

I took a break from tracing power to address a bitchy server with a couple of degraded RAIDs. The salient details are these: Greg, looking at the error messages from the server, shook his head several times and said, sympathetically, "no bueno." In order to replace the suspect EIDE cables leading from the source drives to the RAID controllers, I had to forcibly remove a row of fans, and in the process I warped the fan bracket. By the time I had replaced the cables, the fan bracket was so bent up that it could not possibly go back into the chassis, so the fans would have to sit on top of the EIDE cables, bracket-less.

"Chris, check this out..." said Greg, from the vendor.

I looked up as he pointed at the second shelf of drives we had just installed.

"What?"

"We're going to have to send this back."

I looked closer. Greg was pointing to the right corner of the back of the drive shelf, which was clearly bent, as if it had been dropped. I quickly reviewed in my head receiving the equipment and putting it on a forklift jack -- nope, didn't drop anything.

Let's review:

(1) the power was fucked up and needed to be fixed; (2) the fibre cables were too short; (3) one of the trays was damaged.

This is truly what it means to be in the weeds, but it can quickly get worse, and it did, just over a week later.