Delayed Durability – An Introduction

When first learning about the A.C.I.D.(Atomicity, Consistency, Isolation, Durability) principles of modern relational databases it can be tempting to think of them as a set of immutable laws, as unbending and inviolable as the laws of physics.

Once we look a bit deeper however we begin to see that all is not as it seems. Some of these properties are implemented along a spectrum of strictness, sometimes even allowing the developer to configure exactly how A.C.I.D. compliant a given system should behave in certain regards.

The reason for this configurability is relatively straightforward – strict A.C.I.D. compliance is computationally expensive. What’s more, under certain scenarios at least, strict compliance is also unnecessary.

One of the most interesting changes coming in SQL Server 2014 is that for the first time we will be able to control the database engine’s approach to transaction Durability using a new feature called Delayed Durability. This is similar, though not quite as configurable as the Isolation Level modifiers we can use to control how strictly SQL Server enforces transaction Isolation.

As we shall see in Part 2, taking advantage of the Delayed Durability option can lead to a substantial improvement to write throughput. Sticking with this post for now though – I want to provide some background material on the concept of delayed durability.

Why is Durability Important Anyway?

In ACID parlance, Durability is the principal that once COMMITTED, a transaction becomes permanent.

This seems simple enough, but lets take a look at a concrete example of why it would be very bad if this were not true in most systems.

An insurance brokerage system places a transaction to take out a £25 million insurance contract on an oil tanker. The database informs the brokerage application that the transaction has been committed successfully. The application then automatically issues paperwork advising the client that they are insured and initiates various other workflows relating to billing.

The importance of transaction permanence or durability here should be quite clear. Based on what the database has told us – we have advised the client that they are insured and have potentially set in motion a number of related actions in secondary systems. If we later found out that the transaction was not really committed – we would run in to a lot of problems very quickly – not least of all because our transaction has caused a cascading series of events and possible legal obligations that may well be impossible to unwind.

Write Ahead Logging

SQL Server achieves transaction durability by ensuring the client application is only notified of success once the transaction has made it to the transaction log on disk. This concept is known as write-ahead logging.The transaction will also be committed in memory and will eventually be flushed out to the data file but it is this idea of writing to the log file on disk *before* notifying the client that guarantees durability.

Although durability is an essential property of many systems it is unfortunately a very expensive safety mechanism. Unlike read operations that are free to make extensive use of lightning fast RAM, all write ahead logging operations must contend with the latencies involved in writing to a permanent storage system – often a spinning disk.

Even taking a fairly benign scenario – where we have a high-end storage array fronted by some sort of cache – our writes to permanent storage are going to be some orders of magnitude slower than a write to RAM. In systems that require extreme levels of write throughput it is not an exaggeration to say that IO on the transaction log is likely to be the bottleneck.

This situation is made worse in scenarios where we have large numbers of very small, discrete transactions that each need to be acknowledged in serial as they are written to disk.

Delayed Durability – What Happens When I Switch it On?

When Delayed Durability is enabled – either at the database or transaction level – SQL Server simply doesn’t wait for Windows to advise it on whether a write was successful. Writes will eventually be flushed out to the transaction log as a batch but the transaction will have long since completed as far as SQL Server is concerned.

This has a number of advantageous effects on write throughput (at the expense of durability):

1. The CPU can continue processing without needing to engage in monstrously long waits for the disk to commit. Less waiting means more useful work on the processor.

2. Because our writes are now bundled up into batches the IO operations themselves are considerably more efficient when compared to writing out each individual write transaction as it arrives.

With this in mind then – and assuming we are willing to accept the risks, it would be reasonable to assume that we should get a pretty sizeable performance improvement by switching Delayed Durability on.

Take a look at Part 2 to see just how big a difference and Part 3 for some guidance on whether Delayed Durability is really what you need in your life.