Database Administrators Stack Exchange is a question and answer site for database professionals who wish to improve their database skills and learn from others in the community. It's 100% free, no registration required.

if anyone could please help me understand why the following transactions deadlock?
I am providing the 2 transactions and the indexes for each table:

Transaction 1:

SELECT blockId, blockData, syncedToGeneration, itemId
FROM blocks
WHERE indexId=@indexId
and itemId IN (SELECT itemId
FROM KeywordReferences
WHERE indexId=@indexId
AND keywordRootId IN (360,4498,359,1229))

Transaction 2:

UPDATE blocks
SET blockData=@blockData,
syncedToGeneration=@syncedToGeneration
WHERE indexId=@indexId
AND blockId=@blockId

(please notice that the 'IN' section in the first transaction is much longer and contains about 30 values, which i truncated for sake of readability)

The blocks table has the following indexes:
- indexId->blockId (Clustered)
- indexId->itemId

That's how SQL Server works in the default installation works if I'm not mistaken. Readers block writer and writers block readers. You can probably fix that by using the NOLOCK hint in the SELECT statement (or switch to snapshot isolation to make it behave like most other DBMS)
–
a_horse_with_no_nameApr 22 '12 at 8:29

I have actually implemented snapshot isolation to work around this, but am still very much keen to understand exactly why this DEADLOCKS. as reader and writers lock eachother, I still fail to understand why the queries deadlock.
–
Amit Ben ShaharApr 22 '12 at 9:24

Does seem like it could be avoided by the reading transaction releasing its PAGE: 5:1:91669 lock before requesting the PAGE: 5:1:91671 lock. Wonder why it doesn't as is at read committed isolation level. What do both execution plans look like? Have you disabled row locks on the index or just using defaults?
–
Martin SmithApr 22 '12 at 9:52

I am using defaults, what would be the impact of disabling row locks? (performance/stability)
–
Amit Ben ShaharApr 22 '12 at 12:37

1

@AmitBenShahar - No. It was just a request for more information, not a recommendation. You can use profiler to capture the execution plans or retrieve them from the DMVs. See answer here
–
Martin SmithApr 22 '12 at 15:46

3 Answers
3

This looks like a typical case of table scan (possible missing indexes, but could be more). The SELECT has chosen a page lock granularity, indicative of a large scan (on blocks table). Also note how all the locks are on the same resource rowset, the clustered index, another indication that the SELECT does not use a selective secondary index to locate the rows. You seem to had hoped that the nonclustered index on (indexId, itemId), which you describe as existing, would had beed chosen and perhaps, had it be chosen, the query would not deadlock. As things stand it looks like is ignored most likely because of selectivity (it hits the tipping point). Is difficult to advise for a better index w/o knowing your workload and data model. At the very least I can say that the requirement to project blockData, syncedToGeneration in the SELECT list makes the index non-covering, so perhaps adding them as included columns would be a first step. But w/o knowing the size of these columns (blockData seems like the name of a large one...) is, again, difficult to predict the outcome.

To exemplify why the scan is the culprit, consider any two rows updated by the UPDATE statement. The UPDATE will update them in the order (indexId, blockId) but the SELECT scan will see them in the order (indexId, itemId). Unless there is a functional dependency between itemId and blockId that guarantees that the two indexe keys return all the rows int he same order, there will always be some pair of rows for which the order is reversed between the clustered index and the non-clustered index. The gist of it is that because the SELECT does a scan, it is guaranteed to visit every row. Therefore any pair of rows updated by the UPDATE will be visited by the scan. There are four possibilities:

both rows UPDATEd are behind (in the clustered index key order) the current position of the scan (the current row being scanned). This is safe, the UPDATE blocks and waits for the scan.

both rows UPDATEd are ahead of the position of the scan. Also safe, the scan blocks and waits for the update.

the first row UPDATEd is behind the current position of the scan, but the second is ahead. Safe, the update blocks.

the first row UPDATEd is ahead of the current position of the scan, but the second is behind. This is a guaranteed deadlock. Note that the further appart the rows are in the clustered index order, the higher the probability of one being ahead and one being behind any scan position.

This situation will always arrise with multi-row updates in the presence of scans. The solution is usually to replace the scan with a more selective operation, which has the added benefit of improving performance. If the SELECT and the UPDATE only visit the rows they really desire to process then a deadlock can occur only if two operations occur concurrently on the same logical data (ie. the same item is being read and updated at the same time) but most times on OLTP systems this is prevented by the business workflow. It can still happen, but with a much reduced frequency. As a side note, even when operating on disjoint logical items it is still a game of probabilities due to hash collisions.

One easy way out is, indeed, deploying snapshot isolation. But the underlying problem (the scan) will be only masked, not alleviated. Other problems arising from the scan (poor performance, high latency and slow response time, buffer pool pollution etc) are stil going to be present. You should fix the scan problem and deploy snapshot isolation.

P.S. note that the fact that the scan uses PAGE granularity locks has no relevance. Had the scan be using ROW granularity the deadlock would simply occur on rows instead of pages. But the fact that PAGE granularity is chose is significant as a telltale for a scan.

According to his comment, he is already using snapshot isolation.
–
a_horse_with_no_nameApr 23 '12 at 6:57

+@a_horse_with_no_name I have applied snapshot isolation and it did solve the immediate issue, but i am very much interested in removing any scans - in the presented SELECT there are 71 itemIDs, probably returning somewhere are 150 rows, and the table has about 150,000 rows, so as much as I don't fully understand the tipping-point mechanics I'm not sure that is the problem - is there a deterministic way to see if there is a scan and why?
–
Amit Ben ShaharApr 23 '12 at 7:44

I think that the deadlock is due to the engine acquiring row locks on each of those itemid values, one at at time. Each query is working on an unsorted list of itemid values and acquires those locks in a different order, allowing them to get intertwined and, eventually, each query winds up with a lock the other query already has and has a lock the other query wants. The other thing is that SQL might be trying to promote the locks to a table lock. SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED on the connection that SELECTS might help, or you might get rid of the deadlock by dropping the locking granularity to page or even table.

This makes sense, as the row locking order may be different in transaction1 (order of the IN clause) for locking order in the 'where' clause in transaction 2.
–
Amit Ben ShaharApr 22 '12 at 12:42

But why is the SELECT blocked at all? Isn't that the whole purpose of snapshot isolation that readers are no longer blocked by writers?
–
a_horse_with_no_nameApr 22 '12 at 17:23

@a_horse_with_no_name yes, I have only now introduced snapshot isolation that resolved the issue but am still looking to understand the problem more intimately
–
Amit Ben ShaharApr 23 '12 at 7:46

@AmitBenShahar: ah. I thought you get the blocking even though you have snapshot isolation.
–
a_horse_with_no_nameApr 23 '12 at 8:02

@AmitBenShahar: In SQL Server, under the default conditions, writers block readers. With snapshot isolation, as with heavy use of NOLOCK hinting or changing the TRANSACTION ISOLATION level (which I prefer), writers do not block readers.
–
darin straitApr 23 '12 at 11:19