Twitter outage: Not a hack but a cascading bug, says microblogging site

When a website or online service goes down, many people ordinarily turn to Twitter to vent their frustration and anger. Trouble is, the service that went down Thursday happened to be Twitter itself, leaving many users wondering where to turn.

Intermittent outages occurred throughout the day, affecting some 140 million users worldwide. The first sign of trouble came in the morning, when the web client stopped working for all users. Meanwhile, mobile clients were failing to show new tweets.

The microblogging giant posted an explanation later in the day, putting the problems down to a ‘cascading bug’ and not, as Twitter’s VP of engineering Mazen Rawashdeh said in the post, “due to a hack or our new office or Euro 2012 or GIF avatars, as some have speculated today.”

Rawashdeh summed up how the Twitter team felt about the crash, opening his post with the line, “Not how we wanted today to go.”

Explaining to users exactly what happened, Rawashdeh wrote, “A ‘cascading bug’ is a bug with an effect that isn’t confined to a particular software element, but rather its effect ‘cascades’ into other elements as well.”

He continued, “One of the characteristics of such a bug is that it can have a significant impact on all users, worldwide, which was the case today. As soon as we discovered it, we took corrective actions, which included rolling back to a previous stable version of Twitter.”

It took the company several hours to get the site up and running again. Rawashdeh added that his team is now carrying out a comprehensive review to make sure it doesn’t happen again.

However, he was keen to point out that for the vast majority of the time, Twitter runs without a hitch. “For the past six months, we’ve enjoyed our highest marks for site reliability and stability ever: at least 99.96% and often 99.99%,” he wrote. “In simpler terms, this means that in an average 24-hour period, twitter.com has been stable and available to everyone for roughly 23 hours, 59 minutes and 40-ish seconds.”
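For readers who want to check the arithmetic behind those figures, here is a minimal sketch that converts an availability percentage into uptime per 24-hour day. The function name `daily_uptime` is purely illustrative and does not come from Twitter's post; the calculation simply assumes a day of 86,400 seconds and rounds to the nearest second.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds in a 24-hour period


def daily_uptime(availability_percent):
    """Return (hours, minutes, seconds) of uptime in one 24-hour day.

    Illustrative helper only: rounds to the nearest whole second.
    """
    up_seconds = round(SECONDS_PER_DAY * availability_percent / 100)
    hours, remainder = divmod(up_seconds, 3600)
    minutes, seconds = divmod(remainder, 60)
    return hours, minutes, seconds


# The two availability figures quoted in the post.
for pct in (99.96, 99.99):
    h, m, s = daily_uptime(pct)
    down = SECONDS_PER_DAY - (h * 3600 + m * 60 + s)
    print(f"{pct}% -> {h}h {m}m {s}s up, about {down}s down per day")
```

Under these assumptions, 99.96% works out to about 23 hours, 59 minutes and 25 seconds a day, and 99.99% to about 23 hours, 59 minutes and 51 seconds, so the "40-ish seconds" in the post sits between the two figures Rawashdeh quoted.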