Trading time for quality to improve Exchange 2013 updates


Tony Redmond

Thu, 2013-08-15 06:00

Microsoft's "mea maxima culpa" posted to the EHLO blog following yesterday's withdrawal of the MS13-061 security patch must have been embarrassing for a proud engineering group. But the really interesting information is buried in the Q&A section at the end of the post, including an indication that Exchange 2013's quarterly update cadence might just be threatened.

This wasn’t the first problem with an update for Exchange. It probably won’t be the last either. What’s upsetting many customers is that no improvement seems to have occurred over the last 18 months as we have had a succession of updates and patches issued and withdrawn. Microsoft promised that the new servicing model would be much better because “the same code is already deployed in the Exchange Online service and has been validated against millions of mailboxes.” In other words, Office 365 would have discovered any lingering bugs that lurked inside an update before customers even had a chance to install the code.

Given MS13-061, you would be forgiven for doubting this statement. However, it still holds true because MS13-061 is not a cumulative update. It’s a patch and, as Microsoft explains, “Exchange Online does not deploy .msp patches into the environment; instead, Exchange Online deploys new full builds of the product (cumulative updates, if you will) on a regular release cadence.” What this means is that the servers running Exchange Online are regularly taken offline, reduced to bare metal, and completely reinstalled with the latest and greatest software, so there’s no necessity to apply security patches. Reimaging servers makes a lot of sense when you have to manage tens of thousands of servers. There’s no way that letting Windows Update do its stuff would work inside such a massive automated datacenter environment.

So Microsoft couldn’t detect the problem within Exchange Online. But they run some on-premises Exchange too, don’t they? The answer is that they do, in the famous “dogfood” environment that is specifically designed to allow Microsoft engineers to enjoy the fruit of their labors by using their own code in production. Microsoft regularly updates dogfood with new builds of Exchange. Alas, dogfood didn’t come to the rescue here either because Microsoft did not deploy MS13-061 into the dogfood environment. To be fair, Microsoft admits this oversight and says “Unfortunately, this security update did not get deployed into our dogfood environment prior to release.” Given the previous history of problematic updates, such an omission is curious, to say the least.

At the end of the post, Microsoft poses a question that many customers would have asked:

You have told us time and time again that you were going to improve your testing procedures, and yet each time you have to tell us that you missed something. When will it end?

It’s a horrible situation for an engineering organization to have to answer a question like this because of the implicit admission that problems exist in their testing procedures “time and time again”.

That being said, Microsoft did as well as they could in answering the question. The most interesting comment is “we have recently made the decision to delay the release of Exchange 2013 RTM CU3 by several weeks to ensure that we have enough run time testing within our dogfood environment.”

Microsoft admits that additional testing means that the quarterly update cadence to which they aspired for Exchange 2013 might have to change. I think everyone who uses Exchange 2013 will breathe a deep sigh of relief at the prospect of higher quality updates. I would certainly trade time for quality any day. Achieving quality in releases and updates is just about the only way the Exchange development team can now rebuild its reputation.