Locking

Subversion's copy-modify-merge version control model lives
and dies on its data merging algorithms—specifically on
how well those algorithms perform when trying to resolve
conflicts caused by multiple users modifying the same file
concurrently. Subversion itself provides only one such
algorithm: a three-way differencing algorithm that is smart
enough to handle data at a granularity of a single line of text.
Subversion also allows you to supplement its content merge
processing with external differencing utilities (as described in
the section called “External diff3”), some
of which may do an even better job, perhaps providing
granularity of a word or a single character of text. But common
among those algorithms is that they generally work only on text
files. The landscape starts to look pretty grim when you start
talking about content merges of nontextual file formats. And
when you can't find a tool that can handle that type of merging,
you begin to run into problems with the copy-modify-merge
model.

Let's look at a real-life example of where this model runs
aground. Harry and Sally are both graphic designers working on
the same project, a bit of marketing collateral for an
automobile mechanic. Central to the design of a particular
poster is an image of a car in need of some bodywork, stored in
a file using the PNG image format. The poster's layout is
almost finished, and both Harry and Sally are pleased with the
particular photo they chose for their damaged car—a baby
blue 1967 Ford Mustang with an unfortunate bit of crumpling on
the left front fender.

Now, as is common in graphic design work, there's a change
in plans, which causes the car's color to be a concern. So Sally
updates her working copy to HEAD, fires up
her photo-editing software, and sets about tweaking the image so
that the car is now cherry red. Meanwhile, Harry, feeling
particularly inspired that day, decides that the image would
have greater impact if the car also appears to have suffered
greater impact. He, too, updates to HEAD,
and then draws some cracks on the vehicle's windshield. He
manages to finish his work before Sally finishes hers, and after
admiring the fruits of his undeniable talent, he commits the
modified image. Shortly thereafter, Sally is finished with the
car's new finish and tries to commit her changes. But, as
expected, Subversion fails the commit, informing Sally that
her version of the image is now out of date.

Here's where the difficulty sets in. If Harry and Sally
were making changes to a text file, Sally would simply update
her working copy, receiving Harry's changes in the process. In
the worst possible case, they would have modified the same
region of the file, and Sally would have to work out by hand the
proper resolution to the conflict. But these aren't text
files—they are binary images. And while it's a simple
matter to describe what one would expect the results of this
content merge to be, there is precious little chance that any
software exists that is smart enough to examine the common
baseline image that each of these graphic artists worked
against, the changes that Harry made, and the changes that Sally
made, and then spit out an image of a busted-up red Mustang with
a cracked windshield!

Of course, things would have gone more smoothly if Harry and
Sally had serialized their modifications to the image—if, say,
Harry had waited to draw his windshield cracks on Sally's
now-red car, or if Sally had tweaked the color of a car whose
windshield was already cracked. As is discussed in
the section called “The Copy-Modify-Merge Solution”, most of these
types of problems go away entirely where perfect communication
between Harry and Sally exists. [14]
But as one's version control system is, in fact, one form of
communication, it follows that having that software facilitate
the serialization of nonparallelizable editing efforts is no
bad thing. This is where Subversion's implementation of the
lock-modify-unlock model steps into the spotlight: its locking
feature, which is similar to the “reserved
checkouts” mechanisms of other version control
systems.

Subversion's locking feature exists ultimately to minimize
wasted time and effort. Because a user can programmatically
claim the exclusive right to change a file in the repository,
that user can be reasonably confident that any energy he invests
in unmergeable changes won't be wasted, since his commit of those
changes will succeed. Also, because Subversion communicates to
other users that serialization is in effect for a particular
versioned object, those users can reasonably expect that the
object is about to be changed by someone else. They, too, can
then avoid wasting their time and energy on unmergeable changes
that could not be committed because their copies would be out of
date.

When referring to Subversion's locking feature, one is
actually talking about a fairly diverse collection of behaviors,
which include the ability to lock a versioned file
(claiming the exclusive right to modify the file), to unlock
that file (yielding that exclusive right to modify), to see
reports about which files are locked and by whom, to annotate
files for which locking before editing is strongly advised, and
so on. In this section, we'll cover all of these facets of the
larger locking feature.

The Three Meanings of “Lock”

In this section, and almost everywhere in this book, the
words “lock” and “locking” describe
a mechanism for mutual exclusion between users to avoid
clashing commits. Unfortunately, there are two other sorts
of “lock” with which Subversion, and therefore
this book, sometimes needs to be concerned.

The second is working copy locks,
used internally by Subversion to prevent clashes between
multiple Subversion clients operating on the same working
copy. This is the sort of lock indicated by an
L in the third column of
svn status output, and removed by the
svn cleanup command, as described in the section called “Sometimes You Just Need to Clean Up”.

Third, there are database locks,
used internally by the Berkeley DB backend to prevent clashes
between multiple programs trying to access the database. This
is the sort of lock whose unwanted persistence after an error
can cause a repository to be “wedged,” as
described in the section called “Berkeley DB Recovery”.

You can generally forget about these other kinds of locks
until something goes wrong that requires you to care about
them. In this book, “lock” means the first sort
unless the contrary is either clear from context or explicitly
stated.

Creating Locks

In the Subversion repository, a
lock is a piece of metadata that
grants exclusive access to one user to change a file. This
user is said to be the lock owner.
Each lock also has a unique identifier, typically a long
string of characters, known as the lock
token. The repository manages locks, ultimately
handling their creation, enforcement, and removal. If any
commit transaction attempts to modify or delete a locked file
(or delete one of the parent directories of the file), the
repository will demand two pieces of information—that
the client performing the commit be authenticated as the lock
owner, and that the lock token has been provided as part of
the commit process as a form of proof that the client knows which
lock it is using.

To demonstrate lock creation, let's refer back to our
example of multiple graphic designers working on the same
binary image files. Harry has decided to change a JPEG image.
To prevent other people from committing changes to the file
while he is modifying it (as well as alerting them that he is
about to change it), he locks the file in the repository using
the svn lock command.
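
A minimal sketch of that step follows. The helper name lock_with_comment and the comment text are illustrative assumptions; the file name banana.jpg is the one used later in this section, and svn lock with its --message option is the actual command involved.

```shell
# Hypothetical helper: lock one working copy file with an explanatory comment.
#   $1 = path to the file in the working copy
#   $2 = lock comment shown to other users
lock_with_comment() {
    svn lock --message "$2" "$1"
}

# Example (run inside Harry's working copy; the comment text is made up):
#   lock_with_comment banana.jpg "Editing file for tomorrow's release."
```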

The preceding example demonstrates a number of new things.
First, notice that Harry passed the
--message (-m) option to
svn lock. Similar to svn
commit, the svn lock command can
take comments—via either --message
(-m) or --file
(-F)—to describe the reason for locking the
file. Unlike svn commit, however,
svn lock will not demand a message by
launching your preferred text editor. Lock comments are
optional, but still recommended to aid communication.

Second, the lock attempt succeeded. This means that the
file wasn't already locked, and that Harry had the latest
version of the file. If Harry's working copy of the file had
been out of date, the repository would have rejected the
request, forcing Harry to svn update and
reattempt the locking command. The locking command would also
have failed if the file had already been locked by someone
else.

As you can see, the svn lock command
prints confirmation of the successful lock. At this point,
the fact that the file is locked becomes apparent in the
output of the svn status and svn
info reporting subcommands.
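
Both checks can be sketched as follows. The helper names are illustrative, and the sed expression assumes that svn info reports the token on a line beginning with "Lock Token: ".

```shell
# Hypothetical helper: show the status line for a file.
# A K in the lock column means this working copy holds the lock token.
show_lock_status() {
    svn status "$1"
}

# Hypothetical helper: print the lock token cached in the working copy for
# a file, or nothing if no token is present.
# Assumes svn info reports the token on a line starting with "Lock Token: ".
cached_lock_token() {
    svn info "$1" | sed -n 's/^Lock Token: //p'
}
```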

The fact that the svn info command,
which does not contact the repository when run against working
copy paths, can display the lock token reveals an important
piece of information about those tokens: they are cached in
the working copy. The presence of the lock token is critical.
It gives the working copy authorization to make use of the
lock later on. Also, the svn status
command shows a K next to the file (short
for locKed), indicating that the lock token is present.

Regarding Lock Tokens

A lock token isn't so much an authentication token as
an authorization token. The token
isn't a protected secret. In fact, a lock's unique token is
discoverable by anyone who runs svn info
URL. A lock token is special only when it lives
inside a working copy. It's proof that the lock was created
in that particular working copy, and not somewhere else by
some other client. Merely authenticating as the lock owner
isn't enough to prevent accidents.

For example, suppose you lock a file using a computer at
your office, but leave work for the day before you finish
your changes to that file. It should not be possible to
accidentally commit changes to that same file from your home
computer later that evening simply because you've
authenticated as the lock's owner. In other words, the lock
token prevents one piece of Subversion-related software from
undermining the work of another. (In our example, if you
really need to change the file from an alternative working
copy, you would need to break the lock and relock the
file.)

Now that Harry has locked banana.jpg,
Sally is unable to change or delete that file. If she tries, her
commit is rejected because the file is locked by another user.

But Harry, after touching up the banana's shade of yellow,
is able to commit his changes to the file. That's because he
authenticates as the lock owner and also because his working
copy holds the correct lock token.

After the commit is finished, svn
status shows that the lock token is no longer
present in the working copy. This is the standard behavior of
svn commit—it searches the working
copy (or list of targets, if you provide such a list) for
local modifications and sends all the lock tokens it
encounters during this walk to the server as part of the
commit transaction. After the commit completes successfully,
all of the repository locks that were mentioned are
released—even on files that weren't
committed. This is meant to discourage users from
being sloppy about locking or from holding locks for too long.
If Harry haphazardly locks 30 files in a directory named
images because he's unsure of which files
he needs to change, yet changes only four of those files, when he
runs svn commit images, the process will
still release all 30 locks.

This behavior of automatically releasing locks can be
overridden with the --no-unlock option to
svn commit. This is best used for those
times when you want to commit changes, but still plan to make
more changes and thus need to retain existing locks. You can
also make this your default behavior by setting the
no-unlock runtime configuration option (see
the section called “Runtime Configuration Area”).
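
Both approaches can be sketched as follows. The helper name is illustrative, and the configuration fragment in the comment assumes that the option belongs to the [miscellany] section of the per-user runtime configuration file.

```shell
# Hypothetical helper: commit changes but keep any locks held on the targets.
#   $1 = log message; remaining arguments = paths to commit
commit_keep_locks() {
    message="$1"
    shift
    svn commit --no-unlock --message "$message" "$@"
}

# To make this the default for every commit, the runtime configuration file
# (assumed here to be ~/.subversion/config) can contain:
#
#   [miscellany]
#   no-unlock = yes
```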

Of course, locking a file doesn't oblige one to commit a
change to it. The lock can be released at any time with a
simple svn unlock command:

$ svn unlock banana.jpg
'banana.jpg' unlocked.

Discovering Locks

When a commit fails due to someone else's locks, it's
fairly easy to learn about them. The easiest way is to run
svn status -u:
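
A minimal sketch that runs this command and keeps only the lines for files locked elsewhere. The helper name is illustrative, and the awk filter assumes that the lock indicator occupies the sixth character of each status line.

```shell
# Hypothetical helper: list items that are locked in the repository by
# somebody else.
# Assumes the lock indicator is the sixth character of each status line;
# an O there marks a lock created elsewhere.
locked_by_others() {
    svn status -u "$@" | awk 'substr($0, 6, 1) == "O"'
}
```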

In this example, Sally can see not only that her copy of
foo.h is out of date, but also that one of the
two modified files she plans to commit is locked in the
repository. The O symbol stands for
“Other,” meaning that a lock exists on the file
and was created by somebody else. If she were to attempt a
commit, the lock on raisin.jpg would
prevent it. Sally is left wondering who made the lock, when,
and why. Once again, svn info has the
answers:
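
A minimal sketch of that query, run against the file's repository URL. The helper name is illustrative, the URL in the usage comment is a placeholder, and the sed range assumes that svn info prints the lock fields last, beginning with the "Lock Token:" line.

```shell
# Hypothetical helper: print the lock details the repository holds for a URL
# (token, owner, creation date, and comment), or nothing if it is unlocked.
# Assumes the lock fields appear last in svn info output, beginning with
# the "Lock Token:" line.
repo_lock_details() {
    svn info "$1" | sed -n '/^Lock Token:/,$p'
}

# Example (placeholder URL):
#   repo_lock_details http://svn.example.com/repos/project/raisin.jpg
```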

Just as you can use svn info to examine
objects in the working copy, you can also use it to examine
objects in the repository. If the main argument to
svn info is a working copy path, then all
of the working copy's cached information is displayed; any
mention of a lock means that the working copy is holding a
lock token (if a file is locked by another user or in another
working copy, svn info on a working copy
path will show no lock information at all). If the main
argument to svn info is a URL, the
information reflects the latest version of an object in the
repository, and any mention of a lock describes the current
lock on the object.

So in this particular example, Sally can see that Harry
locked the file on February 16 to “make a quick
tweak.” It being June, she suspects that he probably
forgot all about the lock. She might phone Harry to complain
and ask him to release the lock. If he's unavailable, she
might try to forcibly break the lock herself or ask an
administrator to do so.

Breaking and Stealing Locks

A repository lock isn't sacred—in Subversion's
default configuration state, locks can be released not only by
the person who created them, but by anyone. When somebody
other than the original lock creator destroys a lock, we refer
to this as breaking the lock.

From the administrator's chair, it's simple to break
locks. The svnlook
and svnadmin programs have the ability to
display and remove locks directly from the repository. (For
more information about these tools, see
the section called “An Administrator's Toolkit”.)
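
A sketch of the administrator's side follows. The helper names and the repository path /var/svn/repos are illustrative; svnadmin lslocks and svnadmin rmlocks are the subcommands that list and remove locks.

```shell
# Hypothetical helper: list every lock in a repository.
#   $1 = path to the repository on the server's disk
list_repo_locks() {
    svnadmin lslocks "$1"
}

# Hypothetical helper: remove the lock on one or more paths, whoever owns it.
#   $1 = repository path; remaining arguments = locked paths in the repository
remove_repo_locks() {
    repos="$1"
    shift
    svnadmin rmlocks "$repos" "$@"
}

# Example (run on the machine that hosts the repository):
#   list_repo_locks /var/svn/repos
#   remove_repo_locks /var/svn/repos /project/raisin.jpg
```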

Users can also break each other's locks over the network.
Suppose Sally first runs svn unlock directly on her working copy
of the file. That attempt fails because no lock token is present
there. To remove the lock directly from the repository, she needs
to pass a URL to svn unlock. Her first attempt to unlock
the URL also fails, because she can't authenticate as the lock
owner (nor does she have the lock token). But when she
passes --force, the authentication and
authorization requirements are ignored, and the remote lock is
broken.
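
The forced unlock can be sketched as follows. The helper name is illustrative and the URL in the usage comment is a placeholder.

```shell
# Hypothetical helper: break whatever lock exists on a repository URL,
# regardless of who owns it.
break_lock() {
    svn unlock --force "$1"
}

# Example (placeholder URL):
#   break_lock http://svn.example.com/repos/project/raisin.jpg
```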

Simply breaking a lock may not be enough. In
the running example, Sally may not only want to break Harry's
long-forgotten lock, but relock the file for her own use.
She can accomplish this by using svn unlock
with --force and then svn lock
back-to-back, but there's a small chance that somebody else
might lock the file between the two commands. The simpler thing
to do is to steal the lock, which involves
breaking and relocking the file all in one atomic step. To
do this, Sally passes the --force option
to svn lock:
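
A minimal sketch, with an illustrative helper name and a file name carried over from the running example:

```shell
# Hypothetical helper: steal the lock on one or more files, replacing any
# existing lock with a new one owned by the current user.
steal_lock() {
    svn lock --force "$@"
}

# Example (run inside Sally's working copy):
#   steal_lock raisin.jpg
```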

In any case, whether the lock is broken or stolen, Harry
may be in for a surprise. Harry's working copy still contains
the original lock token, but that lock no longer exists. The
lock token is said to be defunct. The
lock represented by the lock token has either been broken (no
longer in the repository) or stolen (replaced with a
different lock). Either way, Harry can see this by asking
svn status to contact the
repository.

If the repository lock was broken, then svn
status --show-updates (-u)
displays a B (Broken) symbol next to the
file. If a new lock exists in place of the old one, then a
T (sTolen) symbol is shown. Finally,
svn update notices any defunct lock tokens
and removes them from the working copy.
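
The lock indicators mentioned in this section can be summarized in a small lookup. This is only a sketch: the function name and the wording of the descriptions are illustrative, while the letters themselves are the ones svn status displays.

```shell
# Hypothetical helper: translate a lock indicator from svn status output
# into a short description.
describe_lock_code() {
    case "$1" in
        K) echo "locked: this working copy holds the lock token" ;;
        O) echo "other: locked in the repository by somebody else" ;;
        B) echo "broken: the token here is defunct and the lock is gone" ;;
        T) echo "stolen: the token here is defunct and a new lock exists" ;;
        *) echo "no lock information" ;;
    esac
}
```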

Locking Policies

Different systems have different notions of how strict a
lock should be. Some folks argue that locks must be
strictly enforced at all costs, releasable only by the
original creator or administrator. They argue that if
anyone can break a lock, chaos runs rampant and the
whole point of locking is defeated. The other side argues
that locks are first and foremost a communication tool. If
users are constantly breaking each other's locks, it
represents a cultural failure within the team and the
problem falls outside the scope of software enforcement.

Subversion defaults to the “softer”
approach, but still allows administrators to create stricter
enforcement policies through the use of hook scripts. In
particular, the pre-lock and
pre-unlock hooks allow administrators
to decide when lock creation and lock releases are allowed
to happen. Depending on whether a lock already exists,
these two hooks can decide whether to allow a certain user
to break or steal a lock. The
post-lock and
post-unlock hooks are also available,
and can be used to send email after locking actions. To
learn more about repository hooks, see the section called “Implementing Repository Hooks”.
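
As a sketch of a stricter policy, the function below captures the decision a pre-unlock hook might make: allow the unlock only when the requesting user owns the lock. The function name is illustrative. The sketch assumes that the hook receives the repository path, the locked path, and the username as its first three arguments, and that svnlook lock reports the owner on a line beginning with "Owner: ".

```shell
# Hypothetical policy check for a pre-unlock hook.
#   $1 = repository path, $2 = locked path in the repository, $3 = user
# Returns 0 (allow) if the path is unlocked or the user owns the lock,
# and 1 (deny) otherwise.
may_unlock() {
    # Assumes svnlook lock prints the lock owner as "Owner: NAME".
    owner=$(svnlook lock "$1" "$2" | sed -n 's/^Owner: //p')
    if [ -z "$owner" ] || [ "$owner" = "$3" ]; then
        return 0
    fi
    echo "Error: $2 is locked by $owner; only the owner may unlock it." >&2
    return 1
}

# In the hook script itself, the last line could be:
#   may_unlock "$1" "$2" "$3" || exit 1
```

A hook that exits with a nonzero status causes the requested operation to be refused, so a policy like this one would stop Sally from breaking Harry's lock with svn unlock --force.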

Lock Communication

We've seen how svn lock
and svn unlock can be used to create,
release, break, and steal locks. This satisfies the goal of
serializing commit access to a file. But what about the
larger problem of preventing wasted time?

For example, suppose Harry locks an image file and then
begins editing it. Meanwhile, miles away, Sally wants to do
the same thing. She doesn't think to run svn status
-u, so she has no idea that Harry has
already locked the file. She spends hours editing the file,
and when she tries to commit her change, she discovers either
that the file is locked or that her copy is out of date.
Regardless, her changes aren't mergeable with Harry's. One of
these two people has to throw away his or her work, and a lot of
time has been wasted.

Subversion's solution to this problem is to provide a
mechanism to remind users that a file ought to be locked
before the editing begins. The mechanism
is a special property: svn:needs-lock. If
that property is attached to a file (its value is
irrelevant), Subversion will try to use
filesystem-level permissions to make the file read-only, unless,
of course, the user has explicitly locked the file.
When a lock token is present (as a result of using
svn lock), the file becomes read/write.
When the lock is released, the file becomes read-only
again.

The theory, then, is that if the image file has this
property attached, Sally would immediately notice
something is strange when she opens the file for editing:
many applications alert users immediately when a read-only
file is opened for editing, and nearly all would
prevent her from saving changes to the file. This
reminds her to lock the file before editing, and in attempting
to do so she discovers the preexisting lock.

Users and administrators alike are encouraged to attach
the svn:needs-lock property to any file
that cannot be contextually merged. This is the primary
technique for encouraging good locking habits and preventing
wasted effort.
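
Attaching the property can be sketched as follows. The helper name is illustrative, and the value "*" is an arbitrary choice, since the property's value is irrelevant.

```shell
# Hypothetical helper: mark files as needing a lock before they are edited.
# The property value does not matter; "*" is used here as a placeholder.
require_lock() {
    svn propset svn:needs-lock '*' "$@"
}

# Example (the property change still has to be committed):
#   require_lock raisin.jpg
#   svn commit -m "Require a lock before editing raisin.jpg." raisin.jpg
```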

Note that this property is a communication tool that
works independently from the locking system. In other words,
any file can be locked, whether or not this property is
present. And conversely, the presence of this property
doesn't make the repository require a lock when
committing.

Unfortunately, the system isn't flawless. It's possible
that even when a file has the property, the read-only reminder
won't always work. Sometimes applications misbehave and
“hijack” the read-only file, silently allowing
users to edit and save the file anyway. There's not much that
Subversion can do in this situation. At the end of the
day, there's simply no substitute for good interpersonal
communication.

[14] Communication wouldn't have been such bad medicine for
Harry and Sally's Hollywood namesakes, either, for that
matter.

You are reading Version Control with Subversion (for Subversion 1.6), by Ben Collins-Sussman, Brian W. Fitzpatrick, and C. Michael Pilato.
This work is licensed under the Creative Commons Attribution License v2.0.
To submit comments, corrections, or other contributions to the text, please visit http://www.svnbook.com/.