Patterns of Service-oriented Architecture: Database Transactions

Dave Copeland

November 22, 2017
- Washington, DC

I’ve been lax in updating the “Patterns of Service-oriented Architecture” series, so here’s a new one on a time-honored and critical technique: Database Transactions. This is a powerful feature of most SQL databases that allows you to apply a series of changes to the database in an “all or nothing” type of approach.

Intent

Ensure all data written to a database is written, or none of it is.

Motivation

Many simple operations involve writing data to more than one table in the same database. If a failure should occur in between those writes, it could leave the database in a half-updated state that is potentially invalid and likely difficult to recover from.

For example, suppose a customer changes their shipping address. Suppose further that we store the shipping address used for each order, and when the customer updates their shipping address, we want to update the shipping address of any unshipped orders to use the new address.

If we were to experience some sort of failure while updating an order’s shipping address, we will have successfully updated the customer’s shipping address, but one or more orders will ship to the old address.

If we’re using a remotely modern relational database, this can be solved by wrapping the operation in a database transaction, assuming all database operations go against the same database.

The database ensures that all writes happen or none of them do. If nothing goes wrong, the transaction is said to be committed and the data changes take effect, becoming visible to everyone else using the
database. If something does go wrong, it’s rolled back and it’s as if the changes never took place. No one will have seen them, and they will never be made. In this case we will have experienced an error, but at least
it’s one with predictable behavior. If the error is due to something transient, and we are running code like this in a Background Job, that job can safely retry, because the transaction
ensures the operation is idempotent.

You may also want to roll back explicitly as a means of undoing a series of complex database insertions that might not apply once
things get to a certain state. For example, you might have logic that requires that certain orders not be shipped to certain
addresses. In the logic above, you might only discover that while iterating through the orders and after several updates to the
database have happened. You could initiate a rollback inside the loop to undo all that and exit the routine.

Applicability

Operations that are entirely based on writing data to the same data store should be wrapped in a database transaction.
This can be a critical piece in implementing an asynchronous transaction, or in relying on idempotency keys.

Structure

How this works depends on your language. In our example above, Ruby on Rails provides a block syntax that means “run all this code inside a transaction”. Other languages that lack this convenience require a
bit of care to make sure you explicitly commit the transaction. Here’s a similar example in Java:

Consult the documentation of your database access library to know how to properly use transactions. Also note that most application frameworks provide a more expedient way to do this (in the case of Java, most Java frameworks provide something higher-level than what we just saw, such as running all controller actions in a transaction).

Anti-Patterns and Gotchas

Transactions have a very specific applicability. Any irreversible action taken inside a transaction won’t be rolled back if the transaction is rolled back. For example, making an API call inside a transaction won’t be undone during a rollback.

Performing long operations inside a transaction will have a deleterious effect on the database. In order to make transactions work the way they do, the database will consume extra resources while the transaction is open (meaning that it has not been committed nor rolled-back). This can create contention inside your database, usually by locking certain tables or rows. Other connections operating on those tables or rows may have to wait for the transaction to complete, and possibly time out as a result. Transactions also consume CPU and memory. The longer a transaction is open, the worse it is for your database and its other users.

This means that any operation like interacting with other data stores, queuing background jobs, or sending messages should be avoided while inside a transaction. This affects the design of your solutions, so you want to
think this through ahead of time when you are working out how to implement a feature and not just after the fact.