
Microsoft Azure afflicted again

Microsoft Azure is having troubles again, the company acknowledged Tuesday night. According to a status update, Azure’s Traffic Manager was experiencing “a multi-region partial performance degradation.”

Starting at 18 Nov, 2014 16:34 UTC a subset of customers using Traffic Manager may experience latency while attempting to create or modify their Traffic Manager profiles. Customers may see 20 to 30 minutes for their profiles to become effective. We have identified a potential root cause, and are working to mitigate the issue. The next update will be provided in 60 minutes.

Traffic Manager applies policy to Domain Name System (DNS) queries to route traffic most efficiently to the appropriate endpoints, according to Microsoft.

As of Wednesday morning at 8:30 a.m. EST, the status page reported that all “core services” were running normally but also mentioned “issues” impacting VM performance for some customers in Europe and an “extended recovery” in process for Visual Studio Online customers who use Application Insights to monitor their applications. Per a linked blog post:

Application Insights Services were impacted by Azure Storage Services. At this moment issue has been resolved by our partner Azure Team but customer will see a data gap during impacted window starting from 11/19/2014 01:00 UTC.

We continue to process the data and update this post once fully recovered.

We apologize for the inconvenience this may have caused

Application Insights Service Delivery Team

Details about the problem, its causes and how widespread it was remained sketchy. Users took to social networks to complain that it took the status page a long time to reflect the problems they were seeing. There was also grumbling about a broader lack of transparency, not only in this case, which to be fair is brand new, but also about past snafus. And that is more concerning.

Lydia Leong, Gartner’s cloud analyst, for example, pointed out that as far as she knows, Microsoft has yet to come out with a clear postmortem for Azure problems that cropped up in August. In February 2013, Azure was laid low by an expired security certificate, but at least that cause was made public.

A lack of clarity about underlying issues, or even a perception of that lack, is certainly not good for a company trying to woo enterprise accounts and business applications to its cloud and catch up with public cloud leader Amazon Web Services.

Be hard to ever trust that status page again… huge worldwide outage and the status page tells us that everything is great :) #Azure