Thursday, February 28, 2008

Transactional Integrity Problem

An astute reader pointed out that there is a transactional integrity
problem with the HAppS application built over the last 4 posts. The
function checkAndAdd in Finished HAppS Application contains a call to
"query $ IsUser" as well as a call to "update $ AddUser". This
violates the ACID guarantee that was desired from the checkAndAdd
function. If two people
simultaneously try to create the same username, it's possible that
both of them could get past the "query" and "if exists" statements
before either of the "update AddUser" statements is executed. In
this case, both of the AddUser updates would succeed and both users
would think their account was created. But if they had the same
username, then the first one would be overwritten by the second one. The
second user wouldn't notice a problem, but the first user would not be
able to log in to the newly created account because his password would
probably be different from the password chosen by the second user.
This wouldn't be the end of the world, but it would certainly create
frustration for the first user.
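
The interleaving described above can be replayed deterministically
with a pure model. This is only a sketch: the Users map, the check
and insertIfUnseen helpers, and the "bob" data are hypothetical
stand-ins for illustration, not the application's actual code.

```haskell
import qualified Data.Map as M

-- Hypothetical stand-in for the user store: username -> password.
type Users = M.Map String String

-- Step 1 of a request: the separate "query" that checks for the name.
check :: String -> Users -> Bool
check = M.member

-- Step 2 of a request: the separate "update", which trusts the
-- (possibly stale) result of the earlier check.
insertIfUnseen :: Bool -> String -> String -> Users -> Users
insertIfUnseen seen name pw users
  | seen      = users
  | otherwise = M.insert name pw users

-- Both requests run their check against the same empty store
-- before either insert is applied.
racyInterleaving :: (Bool, Bool, Users)
racyInterleaving = (seenByA, seenByB, s2)
  where
    s0      = M.empty
    seenByA = check "bob" s0
    seenByB = check "bob" s0
    s1      = insertIfUnseen seenByA "bob" "passA" s0
    s2      = insertIfUnseen seenByB "bob" "passB" s1
```

Neither check sees the name, so both inserts run, and the final map
holds only the second request's password for "bob".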

The problem exists because HAppS gives us transactional guarantees at
the query and update level only. I just didn't think about it when I
originally wrote the code. I could just give the fix, but I'll
outline incorrect attempts I made before I got to the fix. Hopefully
it will be more beneficial to see some wrong solutions and how they
got corrected. Since I'm still trying to overcome tendencies learned
from years of imperative programming, maybe this will be useful to
other people in the same position.

First we need to modify the addUser function so that it checks for
the existence of the user itself. What we want is something like the
following:

addUser name u = do
exists

Haskell's if statement requires an else clause. The else clause must
be the same type as modUsers, so that means it has to be "else return
()". Then I found out that Haskell's "unless" function does exactly
the same thing. So you can replace the whole if statement with
"unless exists $ modUsers $ M.insert name u".

For those who are still trying to understand monads, I should point
out that you can't avoid binding exists by writing "if not (isUser
name)". The problem is that "isUser name" is of type "m Bool", and
the not function needs a Bool. The bind operator is the mechanism
responsible for allowing us to effectively pull a result out of the
monad to be passed to another function. Information never actually
comes out of the monad though, because the function has to return a
monad-encapsulated value.
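
Both points, the if/else versus unless equivalence and the need to
bind before applying not, can be tried out in a small standalone
sketch. It assumes a plain State monad from the mtl package in place
of the HAppS monads, and the names isUserS, addWithIf, addWithUnless
and isNewUser are made up for this illustration.

```haskell
import Control.Monad (unless)
import Control.Monad.State (State, evalState, execState, gets, modify)
import qualified Data.Map as M

-- Hypothetical stand-in for the user store: username -> password.
type Users = M.Map String String

-- A monadic check: the result is "State Users Bool", not a bare Bool.
isUserS :: String -> State Users Bool
isUserS name = gets (M.member name)

-- The if form needs an else branch of the same type as the then branch.
addWithIf :: String -> String -> State Users ()
addWithIf name pw = do
  exists <- isUserS name
  if not exists
    then modify (M.insert name pw)
    else return ()

-- unless packages up the "else return ()" for us.
addWithUnless :: String -> String -> State Users ()
addWithUnless name pw = do
  exists <- isUserS name
  unless exists $ modify (M.insert name pw)

-- This would NOT type-check, because not wants a Bool and
-- "isUserS name" is a State Users Bool:
--   broken name = if not (isUserS name) then ... else ...

-- Applying not inside the monad works; the result is still wrapped.
isNewUser :: String -> State Users Bool
isNewUser name = not <$> isUserS name
```

In GHCi, "execState (addWithUnless "bob" "pw") M.empty" yields a map
containing bob, while running either add function against a map that
already contains bob leaves it unchanged.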

The reason for the change is that isUser is a MonadReader action and
modUsers is a MonadState action. The new type signature is saying
that code using this function must be an instance of both MonadReader
and MonadState. Fortunately, the update method has both of these, so
we can get away with using the two different monads.
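
To see how one function can demand both class constraints, here is a
rough sketch using the mtl classes. The Update type below is a
hand-rolled toy defined only for this example, not the real HAppS
type, and the String passwords and the signatures shown are
assumptions.

```haskell
{-# LANGUAGE FlexibleContexts, FlexibleInstances, MultiParamTypeClasses #-}
import Control.Monad (unless)
import Control.Monad.Reader (MonadReader (..), asks)
import Control.Monad.State (MonadState (..), modify)
import qualified Data.Map as M

-- Hypothetical stand-in for the user store: username -> password.
type Users = M.Map String String

-- Toy monad that can both read and modify a state of type s.
newtype Update s a = Update { runUpdate :: s -> (a, s) }

instance Functor (Update s) where
  fmap f (Update g) = Update $ \s -> let (a, s') = g s in (f a, s')

instance Applicative (Update s) where
  pure a = Update $ \s -> (a, s)
  Update f <*> Update g = Update $ \s ->
    let (h, s1) = f s
        (a, s2) = g s1
    in (h a, s2)

instance Monad (Update s) where
  Update g >>= k = Update $ \s -> let (a, s1) = g s in runUpdate (k a) s1

instance MonadState s (Update s) where
  get   = Update $ \s -> (s, s)
  put s = Update $ \_ -> ((), s)

instance MonadReader s (Update s) where
  ask = Update $ \s -> (s, s)
  -- Run the action against a modified view, then keep the original state.
  local f (Update g) = Update $ \s -> let (a, _) = g (f s) in (a, s)

-- Reader-style query: only needs to look at the store.
isUser :: MonadReader Users m => String -> m Bool
isUser name = asks (M.member name)

-- State-style update: needs to change the store.
modUsers :: MonadState Users m => (Users -> Users) -> m ()
modUsers = modify

-- Using both helpers forces both constraints onto the caller's monad.
addUser :: (MonadReader Users m, MonadState Users m) => String -> String -> m ()
addUser name pw = do
  exists <- isUser name
  unless exists $ modUsers (M.insert name pw)
```

Any monad with both instances will do; the toy Update type is just
the smallest one that makes the example run.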

There's still one problem with this definition of addUser. There is
no way for the caller to find out whether a new user was created or
not. The simplest way to communicate this information is to just
return exists at the end of the computation. So our finished addUser
function in Session.hs looks like this:

addUser name u = do
  exists <- isUser name
  unless exists $ modUsers $ M.insert name u
  return exists

And our checkAndAdd function becomes:

checkAndAdd user pass = do
existed

Now we have a single transaction and arguably cleaner code. What
have we learned from this? Your update and query functions need to be
carefully designed to provide a framework with the appropriate
transactional guarantees needed in your system.
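
The same shape, one atomic step that both checks and inserts and
then reports what it found, can be sketched outside HAppS. This
version swaps in an IORef with atomicModifyIORef' from base as the
atomic primitive. The store type, the function names ending in
Atomic or Sketch, and the result messages are all assumptions made
for the example.

```haskell
import Data.IORef (IORef, atomicModifyIORef', newIORef, readIORef)
import qualified Data.Map as M

-- Hypothetical stand-in for the user store: username -> password.
type Users = M.Map String String

-- Check and insert in a single atomic step.
-- Returns True if the name already existed (and nothing was changed).
addUserAtomic :: IORef Users -> String -> String -> IO Bool
addUserAtomic ref name pw = atomicModifyIORef' ref $ \users ->
  if M.member name users
    then (users, True)
    else (M.insert name pw users, False)

-- The caller no longer queries first; it only inspects the flag
-- returned by the single atomic step. The messages are illustrative.
checkAndAddSketch :: IORef Users -> String -> String -> IO String
checkAndAddSketch ref user pass = do
  existed <- addUserAtomic ref user pass
  return $ if existed then "User already exists" else "User created"
```

Because the check and the insert happen inside one atomic
modification, a second request for the same name always sees the
first one's insert and is told the user already exists.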