03 May, 2011

For the past few months we've had this bug (PROD only, and unreproducible in any of our other environments) where every now and then we could not get data from our DB. The DB was up and running just fine, but SqlClient on the web server just could not read the data. We'd get two errors: one about not being able to cast the data from the reader to a certain type, and another about trying to read from a reader that was already closed. In case you're interested, the exact errors were:

System.InvalidCastException: Specified cast is not valid. at System.Data.SqlClient.SqlBuffer.get_Int32()

Invalid attempt to call Read when reader is closed. (This one was also thrown from somewhere in System.Data.SqlClient.)

For a while we just lived with the error; it didn't happen very often and it only affected a few customers. Then the problem got worse, so we immediately blamed it on some environment change... and we recycled the app pool regularly while trying to figure out what the problem was.

Finally, when the problem got bad enough, I did a little bit of research and concluded that we were seeing the problem because we were not disposing our LINQ contexts correctly. After trying to prove that the problem was in someone else's code (and failing), I quickly concluded that it must be Unity that was not disposing the contexts.

You see, I had assumed that Unity would new up an instance of our db context on every call to resolve, but it turns out that by default the lifetime of the dependencies Unity resolves is the same as the lifetime of the IoC container. In our case, the IoC container stayed alive throughout the entire app (because we needed to resolve stuff in our MVC controllers) and so the longer the app went without a restart, and the more requests it handled, the more we saw the error.
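
To make the difference concrete, here is a small sketch of the two lifetimes. It assumes Unity 2.0 (the Microsoft.Practices.Unity namespace) and that the context was registered with a container-controlled lifetime, which is a guess since the registration code isn't shown in this post. FakeDbContext is a made-up stand-in for a real LINQ context.

```csharp
using System;
using Microsoft.Practices.Unity;

// Hypothetical stand-in for a real LINQ context.
public class FakeDbContext { }

public static class LifetimeDemo
{
    public static void Main()
    {
        // Container-controlled: one instance for as long as the container lives.
        var shared = new UnityContainer();
        shared.RegisterType<FakeDbContext>(new ContainerControlledLifetimeManager());
        Console.WriteLine(ReferenceEquals(
            shared.Resolve<FakeDbContext>(),
            shared.Resolve<FakeDbContext>())); // True: same instance every time

        // Transient: a fresh instance on every Resolve call.
        var fresh = new UnityContainer();
        fresh.RegisterType<FakeDbContext>(new TransientLifetimeManager());
        Console.WriteLine(ReferenceEquals(
            fresh.Resolve<FakeDbContext>(),
            fresh.Resolve<FakeDbContext>())); // False: two different instances
    }
}
```

With the first kind of registration, every request in a long-running web app ends up sharing a single context, which matches what we were seeing.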

Fortunately, because we were using Unity to resolve all of our dependencies, and because the Unity code was all in one place and well isolated from the rest of the app, it was really easy to fix. All I had to do was write a per-HTTP-request lifetime manager and plug it into our container. I'll be posting the code for that lifetime manager soon... but now I gotta go to bed. Looks like we'll have quite a few bugs to fight tomorrow. :)
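
Until the real code is posted, here is a minimal sketch of what a per-request lifetime manager can look like. It assumes Unity 2.0's LifetimeManager base class (with its GetValue, SetValue and RemoveValue methods) and ASP.NET's HttpContext.Current.Items. The class name PerHttpRequestLifetimeManager and the MyDataContext type mentioned in the comments are made up for illustration, and this is not necessarily the code that went to PROD.

```csharp
using System;
using System.Web;
using Microsoft.Practices.Unity;

// Hypothetical name: keeps one resolved instance per HTTP request.
public class PerHttpRequestLifetimeManager : LifetimeManager
{
    // Unique key per registration, so different types don't collide in Items.
    private readonly object key = new object();

    // Unity calls this first; returning null tells it to build a new instance.
    public override object GetValue()
    {
        HttpContext context = HttpContext.Current;
        return context == null ? null : context.Items[key];
    }

    // Unity calls this with the freshly built instance.
    public override void SetValue(object newValue)
    {
        HttpContext context = HttpContext.Current;
        if (context != null)
        {
            context.Items[key] = newValue;
        }
    }

    // Drops the instance for the current request and disposes it if possible.
    public override void RemoveValue()
    {
        HttpContext context = HttpContext.Current;
        if (context == null)
        {
            return;
        }

        var disposable = context.Items[key] as IDisposable;
        context.Items.Remove(key);
        if (disposable != null)
        {
            disposable.Dispose();
        }
    }
}

// Usage (MyDataContext is a hypothetical context type):
//   container.RegisterType<MyDataContext>(new PerHttpRequestLifetimeManager());
```

One thing to keep in mind with a sketch like this: as far as I know, nothing in Unity calls RemoveValue at the end of a request, so the disposal still has to be triggered by the app itself, for example from Application_EndRequest or an HTTP module. HttpContext.Items is thrown away with the request either way, but the context would not be disposed.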