Multi-threaded assignment surprises

By Steve Ball & John Miller Crawford

09/19/2000

A volatile brew is formed by mixing assignment (topic of our previous column—"Assignment
surprises," Vol. 3, No. 7) and threads. Perils and surprises lurk within the most innocent-looking
statement. This time, we expose those perils and surprises and point out where you need to proceed
with due caution if you are to ensure the effective use of locked objects.

Maxim 2: Never Assign To A Locked Object
We'll examine these pitfalls and their unhappy consequences within the setting of an aircraft ground
service system. In this system, "checking out" an Airplane object
and performing a large number of checks and modifications corresponds to the real-world allocation
of an actual airplane to one of a number of service crews for refueling, repair, and refurbishment.
For safety reasons, only one service crew is permitted to work on the airplane at a time.
Because of that limitation, the methods for each service operation acquire exclusive access to
the Airplane object for the duration of the task, locking the object
with the synchronized statement. This forces all the other crews to wait until the airplane
has been released for its next service operation.¹

Let's say that the mechanical service crew is the first to get to work today and is currently checking
the airplane's airworthiness. Meanwhile the hospitality crew is waiting for access to the airplane so
that they can give it a cleaning and load the next flight's meals.
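
A minimal sketch of how the two service methods might be written, assuming a ParkingBay class that holds the airplane in an instance variable and locks it with the synchronized statement. The method bodies, the getAirplane() accessor, and the members of Airplane shown here are our assumptions, for illustration only.

```java
// Hypothetical sketch: names other than Airplane, ParkingBay, mechanicalService()
// and hospitalityService() are assumptions made for illustration.
class Airplane {
    // Deliberately contains no synchronization of its own.
    private boolean airworthy = true;
    private boolean cleaned = false;

    boolean isAirworthy() { return airworthy; }
    void clean() { cleaned = true; }
    boolean isCleaned() { return cleaned; }
}

class ParkingBay {
    private Airplane airplane = new Airplane();

    Airplane getAirplane() { return airplane; }

    void mechanicalService() {
        // Exclusive access for the whole task: other crews block at their
        // own synchronized statement until this block exits.
        synchronized (airplane) {
            airplane.isAirworthy(); // stands in for the airworthiness checks
        }
    }

    void hospitalityService() {
        synchronized (airplane) {
            airplane.clean(); // stands in for cleaning and loading meals
        }
    }
}
```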

Because the mechanicalService() method started first, the
hospitalityService() method is blocked and will not resume until
the mechanicalService() method releases its lock on the
Airplane object when it completes.
The advantage of this resource contention scheme is its simplicity. (Its disadvantage is that it
does not permit the service crews to work in parallel on the same airplane.) Because the
Airplane class has so many methods that would require synchronization,
the designer of the class chose not to add any synchronization at all within the class. This requires
its users to provide these checks themselves at a higher level, as we've done in
the ParkingBay class.
This is a perfectly reasonable compromise to make. On the one hand, placing multi-threaded checks
in a class may complicate its implementation enormously. On the other hand, controlling access to
objects of the class at a higher level places the responsibility for ensuring serialized access
onto the users of the class and also reduces the amount of possible sharing of the class because
it makes the locking coarser-grained.
However, despite the simplicity of this scheme, it demonstrates the trickiness of working with multiple
threads, as the peril we spoke of is lurking within the mechanicalService()
method. The problem arises when the mechanical crew determines that the airplane is not airworthy and
decides to replace it with a new plane that the airline keeps for just this sort of contingency:

airplane = new Airplane();

This new instance is not locked. That's okay, though, because only the mechanical crew is privy to this
exchange of airplanes (the other crews will go about their tasks unaware that the planes have been switched).
But what happens when the blocked hospitalityService() method proceeds,
having acquired exclusive access to the Airplane object? Figure 1 shows
the sequence of events.
The T0 point on the timeline represents the state of the parking bay just after the airplane has
been parked there and before any service crews have started. The instance variable airplane refers to
the parked plane (Instance 1).
At T1, the mechanicalService() method is invoked and it acquires
an exclusive lock on the Airplane instance using the synchronized statement.
The hospitality crew arrives later at point T2 but discovers that the mechanical crew is not yet
finished. The hospitalityService() method attempts to lock the
Airplane instance and blocks.
At point T3, the mechanical crew concludes that the airplane is not fit to fly and decides to
swap in a replacement. A new Airplane instance (Instance 2) is assigned
to the airplane instance variable. The refueling of the new plane is completed and so at point T4,
by exiting the synchronized block, the mechanicalService() method releases
its lock on the first Airplane instance, the one that was previously
referred to by airplane. Now remember, at the point that the mechanicalService()
method released its lock, the airplane instance variable had already been modified to refer to Instance 2,
so the airplane switch should have been transparent to the other crews.
However, the hospitalityService() method has been waiting since the
airworthiness checks started to get a lock on the original Airplane
instance. At point T5 (which occurs very shortly after the release of the lock at point T4)
the hospitalityService() method acquires access to the original
Airplane instance on which it synchronized at point T2.
Unfortunately, the airplane instance variable now refers to a different Airplane
object.
The resulting situation at point T6 is obviously not a desirable one. The
Airplane object now referenced by the airplane instance variable
(and the one on which the hospitality crew will be working) has not been locked by the
hospitalityService() method, and the one on which that method does hold
an exclusive lock is now being melted down in a local foundry.²
Things get worse. Suppose a further ParkingBay method,
performTestFlight(), has been invoked in order to check the new airplane's
ability to get off the ground after having been mothballed for so long. This method will also attempt to
get exclusive access to the airplane by locking the instance now referred to by the
airplane instance variable. This will immediately succeed because no thread
currently has that instance locked. One hopes that the pilot would think to disconnect the vacuum cleaner
power leads trailing out the back door before taxiing off down the runway!
We can trace the origin of this misfortune back to a failure to realize the implications of Java's
distinction between object instances and object references. The lock held by a thread is a lock on the
instance, not the reference. On the other hand, assignment acts on the reference, not the instance,
as we emphasized in our last column ("Assignment surprises," Vol. 3, No. 7). Applied to a locked object in
a threaded environment, assignment switches the reference to another instance, while any currently blocked
lock requests will eventually be granted on the original instance; hence, our maxim warning you
against assigning to a locked object.
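
The distinction can be observed directly in a single thread. The following is a minimal sketch, assuming a hypothetical ReferenceVersusInstance class of our own invention; it uses the standard Thread.holdsLock() method to report which instance the current thread has actually locked after the assignment.

```java
// Hypothetical demonstration class; not part of the ground service system.
class Airplane { }

class ReferenceVersusInstance {
    private Airplane airplane = new Airplane();

    // Returns { lock held on original instance, lock held on instance now referenced }.
    boolean[] swapWhileLocked() {
        Airplane original = airplane;
        synchronized (airplane) {          // locks the instance, not the variable
            airplane = new Airplane();     // rebinds the reference only
            return new boolean[] {
                Thread.holdsLock(original),
                Thread.holdsLock(airplane)
            };
        }
    }
}
```

The first element is true and the second is false: the lock stays with the original instance even though the variable inside the synchronized statement now refers to a different one.
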
There is a solution that will allow a method to gain a lock on the current instance associated with an
object reference even when other threads may perform assignment on it. Applied to the
hospitalityService() method, it looks like this:
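
A minimal sketch of such a loop, under our own assumptions: the airplane variable is declared volatile so that the unlocked read sees the latest assignment, and every assignment (here a hypothetical replaceAirplane() method) goes through the same lock-and-verify steps.

```java
// Hypothetical sketch; replaceAirplane(), getAirplane() and the Airplane
// members are assumptions made for illustration.
class Airplane {
    private boolean cleaned = false;
    void clean() { cleaned = true; }
    boolean isCleaned() { return cleaned; }
}

class ParkingBay {
    // volatile is our assumption: it makes the unlocked reads below
    // see the most recent assignment.
    private volatile Airplane airplane = new Airplane();

    Airplane getAirplane() { return airplane; }

    // Assigns only while the instance currently referred to is locked.
    void replaceAirplane() {
        while (true) {
            Airplane candidate = airplane;
            synchronized (candidate) {
                if (candidate == airplane) {
                    airplane = new Airplane();
                    return;
                }
            }
        }
    }

    void hospitalityService() {
        while (true) {
            Airplane candidate = airplane;      // snapshot of the reference
            synchronized (candidate) {
                if (candidate == airplane) {    // still the current instance?
                    candidate.clean();          // safe: locked and current
                    return;
                }
            }
            // The reference changed while we waited for the lock; try again.
        }
    }
}
```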

We loop around locking the instances referred to by airplane until the locked instance matches the current
value of the reference. If assignments to airplane always occur while the
instance it refers to is locked, then this method will operate on the latest instance referred to by airplane
and that instance will have been locked by the method.
A better solution would be one that avoids needing to make assignments to locked objects in the first place.
Accepting that the assignment performs a vital action that cannot be avoided, to solve the problem we will
have to eliminate locking on the same object, which brings us to the next maxim.

Maxim 3: Lock With One Class And Assign With Another
Despite its problems, the simplicity of the original scheme has some appeal, as we still want to avoid placing
the locking code into the Airplane class. The solution entails taking a lock
not just on the right object but on the right object of the right class.
Let's review what went wrong with the airplane servicing. The hospitality crew requested a lock on a
particular airplane scheduled for service. By the time that airplane had become available (and the lock
granted), it had already been consigned to the scrap heap. The hospitality crew unwittingly carried out
their duties on an airplane different from the one they'd locked. The designers of this system could have
prevented this mishap if they'd arranged for the lock to be obtained on the parking bay instead of the
airplane. This is safer because, even though airplanes may be swapped into and out of service unpredictably,
the number of parking bays an airport has is generally fixed. Presumably, the
Airport class is defined something like this:
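
A minimal sketch of what such a definition might look like, assuming a fixed array of parking bays and service methods declared synchronized so that the lock is taken on the bay rather than on the airplane. The constructor, getBay(), getAirplane(), and the method bodies are our assumptions.

```java
// Hypothetical sketch; only the class names and the two service method
// names come from the discussion, the rest is assumed for illustration.
class Airplane {
    private boolean cleaned = false;
    void clean() { cleaned = true; }
    boolean isCleaned() { return cleaned; }
}

class ParkingBay {
    private Airplane airplane = new Airplane();

    // A synchronized method locks 'this' (the bay), which cannot be reassigned.
    synchronized void mechanicalService() {
        airplane = new Airplane(); // swapping planes is safe: the bay is locked
    }

    synchronized void hospitalityService() {
        airplane.clean(); // always the airplane currently in the bay
    }

    synchronized Airplane getAirplane() { return airplane; }
}

class Airport {
    // The set of bays is fixed once the airport is built.
    private final ParkingBay[] bays;

    Airport(int bayCount) {
        bays = new ParkingBay[bayCount];
        for (int i = 0; i < bayCount; i++) {
            bays[i] = new ParkingBay();
        }
    }

    ParkingBay getBay(int index) { return bays[index]; }
}
```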

The problems of assigning to locked objects cannot arise when the this
reference is locked or synchronized methods are invoked because assignment
to the this reference is not permitted.
The mechanical crew, having carried out their tasks in drastic fashion by substituting the airplane in
the parking bay, in due course would have released their lock on that bay. This would have allowed the
hospitality crew to enter and lock the parking bay (and the replacement airplane now in it) and obstruct
the actions of the test flight crew, who would then be compelled to wait their turn.
This solution works by ensuring that the layer of abstraction that is subjected to synchronization
constraints is a different one than the layer at which new instances are assigned to existing references.
We achieved this by moving the locking up to a higher level of abstraction: from
the Airplane class to the ParkingBay class.

Maxim 4: Encapsulate Your Locking Mechanisms In Separate Classes
An alternative application of Maxim 3 could separate the locking from the assignment by creating
a peer class for the Airplane class that would handle the synchronization
issues for it.
To do this we'll need the assistance of a separate locking class.
Listing 1 defines a Mutex (mutual
exclusion) class that can be used for this purpose.
Using separate locking classes introduces its own problems—most notably that stand-alone locking
objects are not automatically unlocked—but is generally a good idea in any non-trivial application.
It also provides the opportunity we've been waiting for (no pun intended) to signal that the airplane is
not yet ready for the next crew, which would allow them to perform some other task rather than merely wait
for the airplane to become available.
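
A minimal sketch of a mutual exclusion class along these lines, together with a hypothetical peer class, AirplaneGuard, that uses it. This is our own illustration under stated assumptions, not a reproduction of Listing 1: the method names lock(), tryLock(), and unlock() are assumed, and the mutex is deliberately non-reentrant.

```java
// Hypothetical sketch of a stand-alone lock; not reentrant.
class Mutex {
    private Thread owner = null;

    // Blocks until the mutex is free, then claims it for the calling thread.
    synchronized void lock() throws InterruptedException {
        while (owner != null) {
            wait();
        }
        owner = Thread.currentThread();
    }

    // Claims the mutex only if it is free; never blocks.
    synchronized boolean tryLock() {
        if (owner != null) {
            return false;
        }
        owner = Thread.currentThread();
        return true;
    }

    // Must be called explicitly: nothing unlocks a Mutex automatically.
    synchronized void unlock() {
        if (owner != Thread.currentThread()) {
            throw new IllegalMonitorStateException("caller does not own this Mutex");
        }
        owner = null;
        notifyAll();
    }
}

class Airplane {
    private boolean cleaned = false;
    void clean() { cleaned = true; }
    boolean isCleaned() { return cleaned; }
}

// Hypothetical peer class: locking is on the Mutex, assignment is on the airplane.
class AirplaneGuard {
    private final Mutex mutex = new Mutex();
    private Airplane airplane = new Airplane();

    void mechanicalService() throws InterruptedException {
        mutex.lock();
        try {
            airplane = new Airplane(); // the lock is unaffected by this assignment
        } finally {
            mutex.unlock();            // try/finally stands in for automatic release
        }
    }

    // Returns false at once if another crew is busy, so the caller can do other work.
    boolean tryHospitalityService() {
        if (!mutex.tryLock()) {
            return false;
        }
        try {
            airplane.clean();
            return true;
        } finally {
            mutex.unlock();
        }
    }
}
```

The try/finally blocks address the first problem noted above, and tryLock() gives a crew the chance to move on to another task instead of blocking.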

Writing thread-safe code requires special vigilance, especially when contention is over objects that may be
assigned new instances. An awareness of the difference between object instances and object references is,
as always, key. When you face that type of resource contention, look for other related objects that can be
locked instead so that you will never need to assign to a locked object.

1. Requiring the service crews to wait idly for the airplane to become available (in our code, to block in the synchronized statement) is far from ideal. A more sophisticated scheme would allow a service crew to inquire as to the airplane's availability and move on to a different airplane if appropriate.

2. If the hospitalityService() method were to invoke the
wait() or notify() methods on
the airplane instance variable, these methods would throw an
IllegalMonitorStateException even though the methods
would be invoked on the same reference as the one used with the synchronized statement that
contains them!
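
The effect described in note 2 can be reproduced in a single thread. This is a minimal sketch, assuming a hypothetical StaleMonitorDemo class in which the same thread performs the assignment (in the scenario above it would be another thread); the exception arises either way because the monitor held belongs to the original instance.

```java
// Hypothetical demonstration class; not part of the ground service system.
class Airplane { }

class StaleMonitorDemo {
    private Airplane airplane = new Airplane();

    // Returns true if notify() on the reassigned variable is rejected.
    boolean notifyAfterSwapThrows() {
        synchronized (airplane) {          // monitor of the original instance
            airplane = new Airplane();     // variable now refers to an unlocked instance
            try {
                airplane.notify();         // same variable, different monitor
                return false;
            } catch (IllegalMonitorStateException e) {
                return true;
            }
        }
    }
}
```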