A CTO Must Never Do This…

A couple years back I was contacted to look at a very strange problem.

The firm ran flash sales. An email goes out at noon, the website traffic explodes for a couple of hours, then settles back down to a trickle.

Of course you might imagine where this is going. During that peak, the MySQL database was brought to its knees. I was asked to do analysis during this peak load, and identify and fix problems. Make it go faster, please!

First day on the job, I was working with a team of outsourced DBAs. I was also working with a sort of SWAT team chatting on Skype, while monitoring the systems closely.

Then up popped one comment from a gentleman I hadn’t worked with. He insisted there was contention for a little-known MySQL resource called the AUTO_INC lock. Since I wanted to know more, I asked who the guy was, and to my surprise he turned out to be the CTO.

The CTO was tuning and troubleshooting the database!

Wow, that’s a first. I thought I’d seen it all. A CTO is normally overseeing technology & the team rather than crawling around in the trenches on the front line.

This all raised some important points:

1. The app was having major growing pains
2. Current architecture was not scaling
3. Amazon elasticity was not helping at the database layer
4. People & process were also failing, hence the CTO’s hands-on approach

It was shocking to see a problem deteriorate to this point, but when you consider the context it’s understandable. A company like this is struggling with hypergrowth to such a degree that each day seems like a hurricane. With emergency meetings followed by hardware & application emergencies, trouble seems constant. It can be very difficult to step back and see the larger picture.

The takeaway from this experience…

o Amazon EC2 can’t do it all – consider physical servers for disk-intensive apps
o MySQL still has some real scalability limitations
o Use technology for its intended purpose – MySQL isn’t great for queueing
o A CTO tuning the database means problems have deteriorated too far
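
To make the queueing point a little more concrete: the comments below mention that ORDER BY RAND() was one of the pain points. Here is a minimal sketch of why that pattern hurts, not the actual schema or code from this engagement. The `jobs` table and both function names are hypothetical, and SQLite stands in for MySQL so the example runs anywhere (SQLite spells the function RANDOM(), MySQL spells it RAND()).

```python
import random
import sqlite3

# Hypothetical queue table; SQLite is a stand-in for MySQL in this sketch.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE jobs (id INTEGER PRIMARY KEY, payload TEXT)")
conn.executemany(
    "INSERT INTO jobs (id, payload) VALUES (?, ?)",
    [(i, f"job-{i}") for i in range(1, 1001)],
)


def pick_by_sort(conn):
    """Random pick that gives every row a random value, then sorts them all."""
    row = conn.execute("SELECT id FROM jobs ORDER BY RANDOM() LIMIT 1").fetchone()
    return row[0]


def pick_by_key(conn):
    """Random pick that seeks into the primary key instead of sorting."""
    lo, hi = conn.execute("SELECT MIN(id), MAX(id) FROM jobs").fetchone()
    target = random.randint(lo, hi)
    # First id at or above the target, so gaps in the id sequence are tolerated
    # (at the cost of a slightly uneven distribution when gaps exist).
    row = conn.execute(
        "SELECT id FROM jobs WHERE id >= ? ORDER BY id LIMIT 1", (target,)
    ).fetchone()
    return row[0]


print(pick_by_sort(conn), pick_by_key(conn))
```

The first function touches every row on every call, which gets expensive fast when many workers poll a large table. The second only removes the full sort; it does not turn a relational table into a good queue under heavy writes, which is the larger point of the takeaway above.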

Reminds me of when a recruiter once contacted me about a job and attempted to woo me with the phrase: “Even our CTO contributes to the code!” That might have been understandable if the company had 10 employees (at that point a CTO title is somewhat of a formality), but this one had a team of nearly 100 developers. The recruiter couldn’t understand why I suddenly lost any desire to work for them.

hullsean

LOL. Recruiters… we all have a love/hate relationship with them I guess. I still talk to some regularly who don’t know the difference between a developer and an operations person. I could never understand how someone could spend day in and day out placing people in positions that they hadn’t made the tiniest effort to understand.

Rob Smith

When the shit is hitting the fan, why ignore any technical resources? Title or no title, the guy knows something about MySQL and wants to contribute. It’s very different than being the DBA day to day.

hullsean

Yep, for sure Rob.

MitchN

I’m a CTO, and also wear the DBA hat when the situation calls for it. Mostly it’s because:
A – I am not allowed by the business to employ a DBA, so when shit hits the fan someone has to fix it. The house-full sign is up, so we have to use the resources we have and if I’m one of the few that can tinker with MySQL, rebuild masters/slaves in a complex multi-master environment, then I’ll do that.
B – Whilst being a CTO, I do also enjoy keeping my technological skills up to date. I do limit the input to being predominantly R&D and DBA tasks, but I don’t sit in an ivory tower and rule from above.

Perhaps it would be better not to have to do this, but at the moment it works for us and it works for me!

hullsean

All good points Mitch. I’ve never been a CTO myself. In some cases I’d find it refreshing to see a CTO get his hands dirty, for sure. In this case it was definitely a symptom of a larger set of problems, some architectural, and some wrong tool for the job.

http://twitter.com/alexgorbachev Alex Gorbachev

I don’t think a CTO’s hands-on activities themselves should be the universal indicator of something wrong. A significant part of a CTO’s responsibilities is providing leadership to his organization. There are multiple scenarios in which hands-on collaboration with a group or a person who is several levels below a leader can be leveraged effectively. Leadership at scale doesn’t have to be top-down. Yes, the top-down approach (you lead your direct reports and help them lead theirs) is the universal model of scalability, but identifying groups/individuals/projects/problems that are exceptional (exceptionally talented/critical/risky/demanding/…) and directing more focus there is appropriate. Of course, everything requires moderation, and that’s very difficult in a whack-a-mole style environment.

hullsean

Thx for your very reasoned & seasoned advice. I hear you about the balance between top-down and hands-on, and obviously you’re the CTO, not me! In this war story it was a standing problem that hadn’t been resolved. Although we were bumping into real MySQL locking problems, the larger problem was twofold. First, MySQL was being used for queuing (the famous ORDER BY RAND() was causing a lot of problems), and developers were resistant to rearchitecting this. Second, the application was write-intensive during the flash sales, and we all know Amazon EC2 has serious disk I/O bottlenecks. As always though, your mileage may vary. Thx again for the comments, Alex.

As always thx for the feedback Gautam. Always excited to see folks reading, and comments very much appreciated!

Certainly hear what you’re saying, and as a sort of caveat, I’m of course not a CTO myself.

What I was hinting at was delegation of power. Take the military, for example: with Lieutenants, Captains, Majors, Colonels, and at the top Generals, you have a hierarchy of command. My point was that the General is more tactical and strategic, and when situations find him on the front line & in the trenches, something has broken down, either with the chain of command or execution and so forth.

http://twitter.com/gautamg Gautam Guliani

For small to medium shops (with tech departments of fewer than 100 people, let’s say) and a highly technical product, the power of a CTO (or any other senior technologist) should derive from her “distance from code”. In other words, the CTO and other senior technologists should be as hands-on as possible. Otherwise the hierarchy ends up being too much of an overhead.

As an analogy, think of a submarine (or the starship Enterprise in Star Trek). There is a clear command hierarchy and delegation of power, but everyone up the chain of command must know how to operate critical ship functions in case of a crisis. Otherwise they end up dead in the water (or space).

hullsean

We don’t want to be dead in the water, definitely not!

http://www.themusingsofthebigredcar.com/ JLM

As a former professional soldier I must disagree with you.

The tactical commander, of any rank, is where the point of the spear is touching the enemy because that is where success is being created.

There is a huge difference between strategy, tactics, objectives and operations. Most folks, outside the military, have no clue on this subject.

As is said: “The first casualty of contact with the enemy is the plan.”

A well trained military unit will know what to do even when the plan goes to shit. Most other armies go to defense, circle the wagons and hunker down.

The American army improvises, overcomes and drives on — they stay on the offensive.

Here is a great appreciation of just that fact from the perspective of a French infantryman in A’stan.