Featured Database Articles

Complex Updates Using the Case Statement

Introduction

One of the keys to database performance if keeping your
transactions as short as possible. In this article we will look at a couple of
tricks using the CASE statement to perform multiple updates on a table in a
single operation.

Multiple updates to a single column

This example uses the pubs database to adjust book prices for
a sale by different amounts according to different criteria. In the example I am
going to knock 25% off all business books from any publisher, and 10% off any
non-business books from a particular publisher. You might be tempted to wrap two
separate update statements into one transaction like this:

begin tran
update titles set ...
update titles set ...
commit tran

The down side of this technique is that it will read through
the table twice, once for each update. If we code our update like the example
below, then the table will only need to be read once. For large tables, this can
save us a lot of disk IO, especially if the query requires a table scan over a
long table

Note that there is a definite "top-down" priority
involved in the CASE statement. For business books from publisher 0736 the
"business" discount will apply because this is the first condition in
the list to be fulfilled. However, we will not give a further 10% publisher
discount, even though the criteria for the second "when" clause is
satisfied, because the CASE statement only evaluates criteria until it finds the
first one that fits.

Multi-column updates

We can use the CASE statement to update multiple columns in a
table, even using separate update criteria for each column. This example updates
the publishers table to set the state column to "--" for non-USA
companies, and changes the city for one particular publisher, all in one table
read operation.

update publishers
set
state = case
when country <> "USA"
then "--"
else state
end,
city = case
when pub_id = "9999"
then "LYON"
else city
end
where country <> "USA" OR
pub_id = "9999"

The same format will work for updates across three or more
rows with different update criteria.

You may come across fewer opportunities to use this second
technique efficiently. This query will almost invariably result in a table scan
because we are selecting on multiple columns that are unlikely to all be in a
covering index. If each column is updated only a small number of times, and is
indexed, it may still be more efficient to do separate updates.

A good place to use this technique might be in cleaning up
multiple columns in a long interface file from another system.

Because we are using two separate case statements, one for
each test criteria/update, each case statement will be evaluated for every row,
and updates applied where required. Therefore if more than one column in the row
requires an update, they will all be updated.

Two things are particularly important to remember in this
example:

The else [column] clause
is required for each case statement used, otherwise you will end up nulling-out
data you do not want to.

The where clause at the
end must be used to restrict the update to rows that require at least one
column updating, otherwise every column in the table will be updated,
increasing both execution time and pressure on the transaction log.