Even with all the data and index pages already in memory, the update took over 16 seconds to update just 1,000 rows, performing over 52,000 lob logical reads (nearly 16,000 of those using read-ahead).

Why on earth is SQL Server reading LOB data for a statement that only updates a single integer column?

The Execution Plan

The execution plan for test 3 is a bit more complex than before:

The bottom level of the plan is exactly the same as we saw with the non-unique index test. The top level has heaps of new stuff, which I will address shortly.

Now, you might be expecting to find that the Clustered Index Scan is reading the lob_data column (for some reason). After all, we need to explain where all the LOB logical reads are coming from.

Yet when we look at the properties of the Clustered Index Scan, we see exactly the same as before:

SQL Server is still only reading the pk and some_value columns . So what is doing all those reported LOB reads?

Updates that Sneakily Read Data

We have to look as the Clustered Index Update operator before we see the LOB column in the output list:

[Expr1020] is a bit flag added by an earlier Compute Scalar. It is set true if the some_value column has not been changed (part of the non-updating updates optimization I mentioned earlier).

The Clustered Index Update operator adds two new columns, the lob_data column, and some_value_OLD.

The some_value_OLD column, as the name suggests, is the pre-update value of the some_value column.

At this point, the clustered index has been updated with the new value, but we haven’t touched the nonclustered index yet.

An interesting observation here is that the Clustered Index Update operator can read a column into the data flow as part of its operation.

SQL Server could have read the LOB data as part of the initial Clustered Index Scan, but that would mean carrying the LOB data through all the operations that occur prior to the Clustered Index Update.

The server knows it will have to go back to the clustered index row to update it, so it delays reading the LOB data until then. Sneaky!

Why the LOB Data Is Needed

This is all very interesting (I hope), but why is SQL Server reading the LOB data? For that matter, why does it need to pass the pre-update value of the some_value column out of the Clustered Index Update?

The answer relates to the top row of the execution plan for test 3, reproduced here for ease of reference:

This is a wide (per-index) update plan. SQL Server used a narrow (per-row) update plan in test 2, where the Clustered Index Update took care of maintaining the nonclustered index as well. More about this difference shortly.

Split, Sort, Collapse

The Split/Sort/Collapse combination is an optimization, which aims to make per-index update plans more efficient. It does this by:

Breaking each update into a delete/insert pair

Reordering the operations

Removing any redundant operations

Applying only the net changes to the nonclustered index.

Imagine we had a unique index which currently holds three rows with the values 1, 2, and 3. If we run a query that adds 1 to each row value, we would end up with values 2, 3, and 4.

The net effect of all the changes is the same as if we simply deleted the value 1, and added a new value 4.

By applying net changes, SQL Server can also avoid false intermediate unique-key violations. If we tried to immediately update the value 1 to a 2, it would conflict with the existing value 2 (which would soon be updated to 3 of course) and the query would fail.

You might argue that SQL Server could avoid the uniqueness violation by starting with the highest value (3) and working down. That’s fine, but it’s not possible to generalize this logic to work with every possible update query, and all possible row processing orders.

Wide updates

SQL Server has to use a wide update plan if it sees any risk of false uniqueness violations. It’s worth noting that the logic SQL Server uses to detect whether these violations are possible has definite limits. As a result, you will often receive a wide update plan, even when you can see that no violations are possible.

Another benefit of this optimization is that it includes a sort on the index key as part of its work. Processing the index changes in index key order promotes sequential I/O against the nonclustered index.

Introduced inserts

A side-effect of all the above is that the net changes might include one or more inserts.

Now, in order to insert a new row in the index, SQL Server needs all the columns — the key column and the included LOB column. This is the reason SQL Server reads the LOB data as part of the Clustered Index Update — it needs it for an insert.

In addition, the some_value_OLD column is required by the Split operator (it turns updates into delete/insert pairs). In order to generate the correct index key delete operation as part of its work, it needs the old key value.

The irony is that in this case the Split/Sort/Collapse optimization is anything but an optimization. Reading all that LOB data is extremely expensive, so it is a bit sad that the current version of SQL Server has no way to avoid it.

Finally, for completeness, I should mention that the Filter operator is there to filter out the non-updating updates.

Beating the Set-Based Update with a Cursor

One situation where SQL Server can see that false unique-key violations aren’t possible is where it can guarantee that only one row is being updated.

Armed with this knowledge, we can write a cursor (or WHILE-loop equivalent) that updates one row at a time, and so avoids reading the LOB data:

Notice how the LOB data is read twice; once at the Clustered Index Scan, and again from the work table in tempdb used by the Eager Table Spool.

If you try the same test with a non-unique clustered index (rather than a primary key), you’ll get a much more efficient plan that just passes the clustering key (including uniqueifier) around (no LOB data or other non-key columns):

A unique nonclustered index (on a heap) works well too:

Both updates complete in a few tens of milliseconds, with no LOB reads, and just a few thousand logical reads. In fact the heap is rather more efficient.

There are lots more fun combinations to try that I don’t have space for here.

Final Thoughts

The behaviour shown in this post is not limited to LOB data by any means. If the conditions are met, any unique index that has included columns can produce similar behaviour — something to bear in mind when adding large INCLUDE columns to achieve covering queries, perhaps.