There are quite a lot of differences between having an on-premise data center and using the cloud. One of these differences is the (guaranteed) uptime and the latency between the different servers. When creating your local on-premise datacenter you will have a pretty stable network connection between the different servers and it’s probably really fast. The cloud can be pretty fast also, especially when you are located in the same datacenter/container. However, you don’t have any real guarantees on where your stuff gets deployed, let alone the latency or connection between the servers.

This might lead to some small problems when you have to hit a different server, like when accessing files from storage or querying a database. This is why the Microsoft Patterns & Practices team has published the Transient Fault Handling Application Block. This application block states there are several transient exceptions which can occur when you try to access a service. These exceptions are known to sometimes be automatically be resolved over a small period of time, therefore retrying within a small period of time might fix the problem.

Basically, all this application block does is helping you add a retry-mechanism so you don’t have to worry much about these transient errors yourself. There are a couple of retrying strategies available out of the box when using the Entity Framework. The SQL team has also provided a new retrying mechanism to be used with the Elastic Scale libraries.

When you have downloaded the Elastic Scale sample applications you’ll have access to the SqlDatabaseUtils in which you will find the retry policy defined for the sample application.

If an exception occurs within the scope of this policy, it’ll check if the exception is transient. If so, the command is executed again in 5 seconds. To determine if a SqlException is transient the policy checks if the error number matches one of a specified list. Below is an excerpt of the decompiled piece code.

When executing this piece of code and you are faced with a transient SqlException, the request is executed again to the same connection/context until the constraints of the policy is met.

The code runs a bit different in my project, since I’m not creating the database connection and the database context in the same place. The reason for this is because sometimes I have to do a query/command on multiple shards and the so I had to resort to a small work around to get the proper connections and database contexts.

One of my connection managers now contains the following piece of code

You might notice a rather strange piece of code over here in the catch block and the ContinueWith method. This has a reason though. When I ran my tests on this piece of code it became clear to me the exceptions thrown within the retry policy scope were swallowed, even when they weren’t transient exceptions. As you can imagine, this isn’t a desirable scenario so I had to think of a work around for this. Now, when an exception occurs, I’m able to store it locally and throw it later on, out of the scope of the retry policy.

Using the above examples it’s rather doable to create a stable connection to the database. Hope it helps!