Database Administrators Stack Exchange is a question and answer site for database professionals who wish to improve their database skills and learn from others in the community. It's 100% free, no registration required.

This is a humble question asked in the spirit of increasing my knowledge; kindly be gentle in your response.

As a long-time application developer, I know at some level what a transaction is (I use them all the time). Leaving aside transaction isolation levels for the moment, at a high level a transaction allows a block of work to be completed entirely or not at all, and allows for a certain amount of isolation from other database-modifying activity.

I also know what (in various databases) a lock is, or at least how one behaves (if I lock a table in some way explicitly, then no other process or thread can update anything about that table).

What I am most distinctly not clear about is: in various databases, when I explicitly lock a row or a table, am I employing the exact same constructs that are used by the database's transaction facilities under the covers to make the transaction work properly?

That is, it occurs to me that in order for a transaction to be atomic and isolated, it must be doing some locking. Is this transaction-initiated, tranasction-hidden locking the same sort of locking that various databases let me access through constructs such as SELECT FOR UPDATE or explicit LOCK commands? Or are these two concepts completely different?

Again, I apologize for the naïveté of this question; I am happy to be pointed to more foundational sources.

5 Answers
5

when I explicitly lock a row or a table, am I employing the exact same constructs that are used by the database's transaction facilities under the covers to make the transaction work properly?

Yes. If that would not be true, then your own 'locking' would only be scoped to other similar 'locking' and not interact with the engine own locking. So you would lock a row in a table so that it cannot be locked by another application in the same manner, but your lock would be ignored by the engine itself. These semantics are seldom desired. Most of the time an application locking a row means 'lock it against any means of access/modify'. Side note that locking mechanisms that are strictly application specific do exists, because they are useful. For instance SQL Server has application locks.

it occurs to me that in order for a transaction to be atomic and isolated, it must be doing some locking.

Locking is one means to achieve this. The major alternative is versioning. Nowadays most databases support both (which also means that if you 'lock' a row in the app but another transaction uses versioning to read the row, it will read it because your locking does not block versioned reads).

You are sort of circling around a concept known in the database implementation world as 'two phase locking protocol'. the linked Wikipedia article is a good starter. If you want to read more detailed explanation about this topic I recommend head to the library and ask for a loan on Transaction Processing: Concepts and Techniques. Pretty much every database out there is, at its core, an implementation of that book.

Aha! Now we're talking. Indeed, lurking at the back of my mind was MVCC. Thank you for the well-articulated answer, the great references and for taking the time to really dig into my question.
–
Laird NelsonJan 23 '14 at 17:46

In very simple terms, a transaction is a sequence of work that has to be performed as a single logical unit in its entirety and must maintain ACID properties.

Any RDBMS must provide "Locking facilities" that can be used to complete the transaction in its entirety, there by preserving the transaction isolation and its durability. This ensures the physical integrity of the database.

Most importantly, by default - transactions are managed at the connection level. So when a transaction is started on a connection, all the T-SQL statements (S/I/U/D) executed on that connection are a part of the transaction until the transaction ends. (MARS is handled differently)

Now back to your questions :

when I explicitly lock a row or a table, am I employing the exact same constructs that are used by the database's transaction facilities under the covers to make the transaction work properly?

Yes. This means that you have to be careful in determining the sequence of data that will be modified and which will leave the database to a consistent state. In other words, your DML operation should leave the database in a consistent state which confines to your organization's business rules. Still, the RDBMS (here SQL Server) can enforce the physical integrity of the transaction.

From BOL: Locking and row versioning prevent users from reading uncommitted data and prevent multiple users from attempting to change the same data at the same time. Without locking or row versioning, queries executed against that data could produce unexpected results by returning data that has not yet been committed in the database.

Is this transaction-initiated, transaction-hidden locking the same sort of locking that various databases let me access through constructs such as SELECT FOR UPDATE or explicit LOCK commands?

Everything in sql server is contained in a transaction. Its when you access your data, the RDBMS has to take locks depending on the isolation level and the Operations you are performing on your data. Check this answer for more details.

I'd say transactions are part of database "interface" in a sense is that you as a developer decide when to begin, end, what to do within the scope of transactions, etc. Locks, as I see them, belong to implementation details and used for access synchronization to different of objects. In most cases the engine itself decides what and for how long should be locked . There are many system level locks that cannot be directly manipulated (for instance, the engine may lock certain memory areas ). Even when it comes to DML locks, many of them happen behind the scene (for example, to ensure referential integrity Oracle and, as far as I remember, SQLServer may put a lock on a corresponding row in master table if new record is inserted into details table) as a result of DML statements issued within transaction.

When it comes to transactions you can expect more or less consistent behaviour from any RDMS that claims to comply SQL and support transactions, but when it comes to locks, almost every vendor uses different strategy and terminology. The common part across all RMDS as far as I can tell is that concurrency between transactions is defined by isolation level whereas concurrency between locks is controlled by lock types (shared, exclusive, etc).

To summarize, locks are low level mechanism to control consistency of objects and concurrency . Locks can be issued during SQL statements execution . Depends on transaction isolation level implementation, the engine may put different types of locks on affected objects (rows, group of rows, indexes, etc). There is a limited number of command available to manually issue locks (SELECT FOR UPDATE, LOCK). DML locks can be escalated (depends on RDMS, for example, in SQLServer row->page->partition->table ). Locks also can be issued by the database engine during connection initiation, backups, restore, procedure/trigger/function/etc recompilation, startup, shutdowns, etc.

I'm not sure if that answers your question, but I hope it makes sense.

Thank you for your comment. You're definitely the closest so far. I'm still trying to see if transactions are always implemented in terms of the locks that are used by, say, explicit LOCK or SELECT FOR UPDATE statements, or via some other mechanism.
–
Laird NelsonJan 23 '14 at 16:28

As far as I know, BEGIN TRANSACTION itself issues no locks. Locks will appear after DMLs within transaction.
–
a1ex07Jan 23 '14 at 16:31

Clarification - I meant BEGIN TRANSACTION itself doesn't create DML locks; it should in fact issue some internal locks because it has to allocate resources, add an entry to system table[s] (if any) that holds active transactions, etc.
–
a1ex07Jan 23 '14 at 16:38

I will use SQL Server jargon, but the concepts should be the same for other vendors:

Every command you execute is executed inside a transaction.
That transaction can be opened explicitly with BEGIN TRAN, or implicitly, by the database engine.
The reason an implicit transaction is opened is that the engine still needs to maintain ACID compliance and the ability for a rollback.

When you do a SELECT FOR UPDATE, it just means that while the transaction is in place, it will hold a certain lock.

Thank you for your comment. That much I know. But my question is still: when that transaction is opened, is its isolation accomplished by holding locks of its own? If so, are those locks the same kinds of locks I can explicitly acquire? Or does the transaction achieve isolation through other means?
–
Laird NelsonJan 23 '14 at 16:02

1

Yes, this is the same mechanism. Isolation is achieved using locks in both modes, the same locks you can explicitly acquire. The difference is that if you don't explicitly open a transaction, the locks will be released when the command is finished, whereas in an explicit transaction the locks are held until you commit (not 100% accurate because of isolation levels, but that's the general idea).
–
Matan YungmanJan 23 '14 at 17:13

Thank you for your comment. The reason I'm asking my question is that I read somewhere that some databases use MVCC as the means to achieve ACID transactions, which would seem to me to be a lock-free way of doing it. In such cases, then, I am not clear on when I would ever want to issue a lock explicitly. But that's probably a separate question. :-)
–
Laird NelsonJan 23 '14 at 17:43

Lock are necessary and they make the database. This prevents data from being corrupted or invalidated when multiple users try to read while others write to the database.
Transactional isolation is usually implemented by locking whatever is accessed in a transaction.
Bad design applications make a big use of the database lock concept :) !! So to avoid lock concentrate on your FK and data layout.

It's all about ACID: - read this and it will clear your mind!
ACID is a set of properties that you would like to apply when modifying a database.

**Atomicity

Consistency

Isolation

Durability**

A transaction is a set of related changes which is used to achieve some of the ACID properties. Transactions are tools to achieve the ACID properties.

Atomicity means that you can guarantee that all of a transaction happens, or none of it does; you can do complex operations as one single unit, all or nothing, and a crash, power failure, error, or anything else won't allow you to be in a state in which only some of the related changes have happened.

Consistency means that you guarantee that your data will be consistent; none of the constraints you have on related data will ever be violated.

Isolation means that one transaction cannot read data from another transaction that is not yet completed. If two transactions are executing concurrently, each one will see the world as if they were executing sequentially, and if one needs to read data that is written by another, it will have to wait until the other is finished.

Durability means that once a transaction is complete, it is guaranteed that all of the changes have been recorded to a durable medium (such as a hard disk), and the fact that the transaction has been completed is likewise recorded.

So, transactions are a mechanism for guaranteeing these properties; they are a way of grouping related actions together such that as a whole, a group of operations can be atomic, produce consistent results, be isolated from other operations, and be durably recorded.

Thanks for your comment. I'm at least passably aware of the properties of ACID. What I'm still not clear on is: do transactions implement ACID by using the same kinds of locks that I can use directly via explicit LOCK statements, or do they do it using some other mechanism?
–
Laird NelsonJan 23 '14 at 16:25

Databases offer a number of transaction isolation levels, which control the degree of locking that occurs when selecting data.Serializable,Repeatable reads,Read committed,Read uncommitted.
–
Up_OneJan 23 '14 at 16:34