That seems like a lot of work, acquiring and releasing the spinlock just to decrement
the refcount. Well, remember the 'lock' instruction prefix we used to build
test_and_set? We can be a little fancier with it and make a 'dec_and_testz' routine:

int locked_dec_and_testz (int *refcount); // decrement *refcount, return whether or not it went to zero

    .globl locked_dec_and_testz
locked_dec_and_testz:
    xorl %eax,%eax      # clear all of %eax (done first, because xorl also changes the flags)
    movl 4(%esp),%edx   # get pointer to refcount
    lock                # keep other CPUs out of the next instruction
    decl (%edx)         # decrement refcount, keeping other CPUs out of refcount
    setz %al            # set %eax=1 if result is zero, leave %eax=0 otherwise
    ret                 # return %eax as the result
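
If you would rather stay out of assembly, the same idea can be sketched in C with the
C11 <stdatomic.h> header. This is only a sketch under some assumptions, not the routine
above: it takes an atomic_int * instead of a plain int *, and 'object_t' and
'object_release' are made-up names for illustration. On x86, compilers typically turn
atomic_fetch_sub into a lock-prefixed instruction.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical refcounted object; the field names are illustrative only. */
typedef struct {
    atomic_int refcount;
    int payload;
} object_t;

/* Atomically decrement *refcount; return 1 if it went to zero, 0 otherwise. */
static int locked_dec_and_testz(atomic_int *refcount)
{
    /* atomic_fetch_sub returns the value held BEFORE the subtraction */
    int old = atomic_fetch_sub(refcount, 1);
    assert(old > 0);    /* dropping more references than were taken is a bug */
    return old == 1;    /* old was 1, so the count is now zero */
}

/* Drop one reference and free the object when the last one goes away.
   Returns 1 if the object was freed, 0 if other references remain. */
static int object_release(object_t *obj)
{
    if (locked_dec_and_testz(&obj->refcount)) {
        free(obj);
        return 1;
    }
    return 0;
}
```

Only the caller whose decrement takes the count to zero sees a return value of 1, so
exactly one caller frees the object and no spinlock is needed around the decrement.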

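Now this has a little gotcha. 'refcount' must now be thought of as a variable protected
not by a spinlock, but by 'lock instruction prefix' access only! So every modification
of refcount must be done with the lock prefix. That means we can no longer increment it
with an ordinary instruction inside a spinlock-protected section; the increment needs
its own lock-prefixed routine too.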
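
Here is a minimal sketch of the difference, again using C11 atomics rather than
hand-written assembly. The names 'locked_inc', 'object_acquire', 'acquire_spinlock' and
'release_spinlock' are assumptions made up for this illustration.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical counterpart to locked_dec_and_testz: atomically increment. */
static void locked_inc(atomic_int *refcount)
{
    atomic_fetch_add(refcount, 1);
}

/* Atomically decrement *refcount; return 1 if it went to zero, 0 otherwise. */
static int locked_dec_and_testz(atomic_int *refcount)
{
    return atomic_fetch_sub(refcount, 1) == 1;
}

/*
 * Take an additional reference.
 *
 * What we can no longer do (spinlock routine names are hypothetical):
 *
 *     acquire_spinlock (&lock);
 *     refcount ++;                 // plain read-modify-write
 *     release_spinlock (&lock);
 *
 * The decrement no longer takes the spinlock, so holding the spinlock here
 * does nothing to keep a concurrent decrement out of the increment.
 */
static void object_acquire(atomic_int *refcount)
{
    assert(atomic_load(refcount) > 0);  /* caller must already hold a reference */
    locked_inc(refcount);
}
```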

But we still come out ahead, because there is no possibility of doing any spinning in
any of these routines. (Any waiting is done at the CPU bus level while the locked
instruction executes, which is very fast.)

Atomic arithmetic

Now we can generalize increment/decrement to any single arithmetic operation on
a single variable. It is possible to write routines to do add, subtract, or, and, xor,
etc., and have the routine return the previous value.

For example, this routine or's 'new_bits' into *value, and returns the
previous contents of *value:
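
A minimal sketch of such a routine in C, assuming C11 <stdatomic.h> and a
compare-and-exchange retry loop. The name 'atomic_or_return_old' and the argument order
are assumptions for illustration. C11 also provides atomic_fetch_or, which does the same
job in one call; the loop is written out here to show the read, compute, and conditional
store steps.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* OR new_bits into *value; return the contents *value had beforehand. */
static int atomic_or_return_old(atomic_int *value, int new_bits)
{
    assert(value != NULL);

    int old = atomic_load(value);   /* sample the current contents */

    /* Store old|new_bits only if *value still equals 'old'. On failure,
       atomic_compare_exchange_weak refreshes 'old' with the current contents
       of *value, so the next pass retries with up-to-date data. */
    while (!atomic_compare_exchange_weak(value, &old, old | new_bits)) {
        /* *value changed (or the weak exchange failed spuriously); try again */
    }

    return old;                     /* the contents just before our OR */
}
```

For example, with *value holding 0x05, calling atomic_or_return_old(&v, 0x0A) returns
0x05 and leaves 0x0F in *value.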

Now you will notice this does have a little 'spin' in it: it loops back. But the
loop is so short that the chance of conflicting with another modification of
*value is very slim, so it is highly unlikely that it will ever loop.