2 Replies - 918 Views - Last Post: 12 June 2013 - 07:08 PMRate Topic:

What has to be considered in order to keep a database consistent?

Posted 05 June 2013 - 02:15 PM

Hello,

I have developed a Winforms application which handles database requests, with the database lying on a local network server. Therefor I used LINQ to SQL.

Now I read about how to use LINQ's support for optimistic concurrency here. So I simply have to catch the according ChangeConflictExceptionand handle it.
I am actually not familiar with databases so I was wondering.

Is concurrency really completely solved this way.

What other scenarios should I consider in order to keep my consistent?

Replies To: What has to be considered in order to keep a database consistent?

Re: What has to be considered in order to keep a database consistent?

When referring to consistency in the terms of ACID properties; consistency is that the database should always be in a valid state after every single write (INSERT or UPDATE).

So the consistency of your database is defined by the requirements and rules you establish for it, so foreign keys constraints, proper data types and probably most importantly transactions (and their isolation levels). LInQ handles a lot of transactional work for you, but I think you should research into transactions and isolation levels.

A quick summary though: A transaction is used when you need an "all or nothing" approach to 2+ writes. So if you need to write to one table, and then update another, after the first write, the database is in an invalid state, until the update to the second table is done. If the write succeeds and the update fails, the database is in an invalid state (or more relevantly inconsistent state). Wrapping these two commands in a transaction will effectively isolate the commands; meaning they have only happened in the scope of the current execution. Committing the transaction will make it visible to all, and rolling the transaction back will erase all trace of it happening.

This may seem a rather trivial example, but if the update was actually an update to 1000000 rows, there'd be some time between the database being in consistent states.

Re: What has to be considered in order to keep a database consistent?

Posted 12 June 2013 - 07:08 PM

An often-cited real-world example of the use of transactions is double-entry bookkeeping. You always have to debit one account and credit another. (Well, at least one of each, to be accurate.) Suppose you're moving some money from savings to checking. You have to first debit your savings account and then credit your checking. If for some reason you manage to do the debit successfully, and then the attempt to credit the checking fails, you have to put the money back in the savings account. So a transaction is a way of requiring that two or more activities either all happen or do not happen at all.

A transaction has a begin phase (locks the necessary tables and/or records), a commit phase (posts changes to the locked tables and/or records), and an optional rollback phase (undoes all changes) if something goes wrong with the commit phase.

As Andy mentions, if you are updating a large number of rows, the database will not be in a consistent state until you get them all done. A transaction is a way of making sure that consistency is maintained, by undoing all of the updates if any one of them fails. (This would be a bad idea: look up "granularity" to see why.)

This all-or-nothing requirement is the first of the ACID properties: Atomicity. Atomic is a Greek word that means can't be cut--atoms are called atoms because when they came up with the word they couldn't be split. The other properties are Consistency, which Andy has already explained, Isolation, which means that one transaction can't be dependent on another one (each transaction has to maintain Consistency independently of any other transaction), and Durability, which says that the transaction has to persist. Typically, the Durability property is achieved by requiring the transaction to be posted to non-volatile media, i. e. "saved".

Now, concurrency is something different: it has to do with how to handle multiple users making changes to the same data, or "conflicts". "Optimistic concurrency" simply means that you will expect that there won't be a conflict, and will handle it if it happens. "Pessimistic concurrency" means that you expect that there will be one if you allow it, and so will not allow it. The mechanism for optimistic concurrency is typically throwing an exception if a confict occurs, and the mechanism for pessimistic concurrency is locking the stuff you are going to use to prevent anyone else from using it.

There are reasons for both. Locking stuff slows things down, resolving conflicts also slows things down. If you expect a lot of conflicts (in other words, you are pessimistic about the likelihood that conflicts will not occur), then it will probably require less overhead to lock stuff than it will to handle conflicts; if you do not (you are optimistic that they will not occur), the reverse will probably be true.