Between 19:29 and 22:35 UTC on 02 May 2019, customers may have experienced connectivity issues with Microsoft cloud services including Azure, Microsoft 365, Dynamics 365 and Azure DevOps. Most services were recovered by 21:40 UTC with the remaining recovered by 22:35 UTC.

Azure status page writer

One of the questions that came up in the aftermath was what happens to automated and recurring flows whose triggers or actions are affected by an outage (thank you, Jerry, for asking).

I can’t think of a better person to answer this than Stephen Siciliano, a Principal PM Director for Microsoft Flow.

For flows, we can divide the impact into triggers and actions. For triggers:

For automated flows that poll, these flows would have failed to start new runs during the interruption. However, polling triggers work by checking for new data every so often, which means they automatically “heal” when the system is healthy: they simply process all of the events in the window since they last ran successfully, albeit significantly delayed. For webhook-triggered automated flows, those events would have to be resent.

For instant flow triggers (e.g. flows started manually by users on demand), users would have immediately received an error upon trying to trigger the flow. Each user will need to retry running the flow. Since the triggering itself failed, there is no failed run for an admin to ‘resubmit’.

For scheduled flow triggers, there may have been intervals that were skipped. These flows automatically resume upon the system healing.
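
To make the "healing" of polling triggers more concrete, here is a minimal Python sketch of the general idea: a poller that remembers the last event it handled and, after a failed poll, simply picks up from there. It is an illustration only and not how Microsoft Flow is implemented; `Event`, `PollingTrigger`, `FakeSource` and `simulate_outage` are hypothetical names invented for this example.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Event:
    timestamp: int  # hypothetical ordering key for the event
    payload: str


class FakeSource:
    """Hypothetical data source that can be switched to an 'outage' state."""

    def __init__(self, events: List[Event]) -> None:
        self.events = events
        self.healthy = True

    def fetch_since(self, watermark: int) -> List[Event]:
        if not self.healthy:
            raise ConnectionError("service unavailable")
        return [e for e in self.events if e.timestamp > watermark]


@dataclass
class PollingTrigger:
    """Hypothetical polling trigger that tracks the last event it handled."""

    fetch_since: Callable[[int], List[Event]]
    watermark: int = 0
    processed: List[str] = field(default_factory=list)

    def poll(self) -> int:
        """Run one polling cycle and return the number of events processed."""
        try:
            events = self.fetch_since(self.watermark)
        except ConnectionError:
            # Outage: keep the watermark unchanged so no events are lost.
            return 0
        for event in sorted(events, key=lambda e: e.timestamp):
            self.processed.append(event.payload)
            self.watermark = event.timestamp
        return len(events)


def simulate_outage() -> List[str]:
    source = FakeSource([Event(1, "a")])
    trigger = PollingTrigger(fetch_since=source.fetch_since)
    trigger.poll()            # healthy: processes "a"
    source.healthy = False
    source.events += [Event(2, "b"), Event(3, "c")]
    trigger.poll()            # outage: nothing processed, watermark kept
    source.healthy = True
    trigger.poll()            # recovered: catches up on "b" and "c"
    return trigger.processed
```

In this sketch the events that arrived during the simulated outage are processed late rather than lost, which mirrors the delayed catch-up described above. A webhook-style trigger has no such watermark to fall back on, which is why those events would have to be resent by the sender.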

For actions, a flow could have failed mid-execution if it had already been triggered before the actions began failing. Flow makers may want to Resubmit failed flow runs. A maker can see the failures across their flows by going to the Alerts icon at the top of the Flow portal and selecting the flow runs that failed (they don’t need to inspect each flow individually).

And for all the naysayers out there, I can’t think of a better way to express my attitude towards what happened in Azure:

When #Office365 goes down, millions of system admins the world over scream "see, our infrastructure is more reliable!".

You know what I get to do now? NOTHING IT WILL BE FIXED.
You know what I had to do when on-premises infrastructure failed? STAY UP ALL NIGHT FIXING IT. pic.twitter.com/koaHRSRJ0R