Database Transactions

I am often asked when it makes sense to use Commit with your database code. The answer is always “it depends”. To go into more detail, let’s talk a bit about transactions.

A transaction is a collection of changes to the database. It can be a single change, such as an INSERT statement, or it can be many changes such as a combination of INSERT and UPDATE statements.
When changes are made within a transaction, they are generally not permanent (and not visible to other connections looking at the database) until you commit them*. In order to use transactions effectively, you have to understand how they work with the database engine you are using. Some databases start a transaction for you automatically. And some will also automatically commit for you after each change so you don’t have to actually commit manually.

Note: Some databases let you use a “read uncommitted” mode where data that has not been comitted can be viewed by other connections. This is usually a special case use for specific performance reasons.

Generally, you do not want to commit after every database change as seen by the following Banking example.

Banking Example

You want to transfer $10 from a savings account to a checking account.

Here are the starting amounts:

Savings: $100

Checking: $50

If you remove $10 from savings and commit, you have a database that looks like this:

Savings: $90
Checking: $50

There is now $10 in limbo. The next step is to add the $10 to checking and commit resulting in this:

Savings: $90
Checking: $60

Having that $10 in limbo is a big gamble. If your app or database crashes before it can do the last step, you end up with the $10 being lost and your database integrity compromised.

Transactions prevent this problem. Look at the example again.

Here are the starting amounts:

Savings: $100
Checking: $50

You remove $10 from savings and you add $10 to checking and then you commit just once after both steps are done. The database now looks like this:

Savings: $90
Checking: $60

Because you changed the values in a single transaction, the data was never permanently in the database in its intermediate form. If something bad happened before you were able to commit then the database would have remained in its original state with $100 in savings and $50 in checking.

Transactions give you database integrity. But there is another good reason to use them: performance.

Performance

If you commit after every database change, you force your database to do a lot of work behind the scenes to make the data permanently available. This is not a big problem when dealing with small amounts of data, but it can really add up when dealing with lots of data.

For example, if you are importing thousands of rows of data into a specific table, committing after each row could cause your import to take several minutes because of all the overhead. Switching to a transaction that only commits at the end (or even every 1000 rows) could result in a tremendous improvement. I’ve seen times drop from several minutes down to several seconds when using a transaction in this manner.

Transactions give you better performance.

Using Transactions

So now you know that transactions are often the way to go, how do you use them? Unfortunately, that varies depending on the database engine.

With SQLite (SQLiteDatabase), if you do not explicitly start a transaction, then your command is put in an implicit transaction which essentially means that each database change is made permanent for you automatically. As I’ve shown you above, this may not be what you want.

With SQLite, you use the “BEGIN TRANSACTION” SQL command to start a transaction. Other database engines have similar commands. When you are finished with the transaction and want to make the changes permanent, you send the “COMMIT” command. Or if you need to cancel the transaction, you send the “ROLLBACK” command.