ADO.NET Connection Pooling Explained

Because the .NET managed providers manage the connection pool for us, using
shared database connections is as easy as a summertime splash in the kiddie
pool. But if those connections unexpectedly become invalid, you could find yourself
floundering in the deep end. Any number of things can cause connections in the
pool to become invalid; maybe the database administrator killed the wrong session.
Perhaps the database server was rebooted. Or it could be that the firewall between
the web server and the database server closed what it thought were idle connections.
Whatever the reason, your application won't know if a pooled connection is invalid
until it's too late and an exception has been thrown. Microsoft documents this
behavior in an
MSDN white paper on connection pooling:

"If a connection exists to a server that has disappeared, it is possible
for this connection to be drawn from the pool even if the connection pooler
has not detected the severed connection and marked it as invalid. When this
occurs, an exception is generated."

Most of us have already experienced this painful problem firsthand (and if
you haven't, it's only a matter of time before you do). We don't yet know whether
enhancements to the System.Data.SqlClient namespace in the upcoming
Whidbey release will address it. It's possible that Microsoft will introduce
some sort of failover solution behind the scenes in the SqlConnection
class that will first try to use other connections in the pool before throwing
an exception. However, even if they do address the problem, you don't want to
rely on a vendor-specific solution. After all, your data
access component might be using the SqlConnection class today,
but next year it might use OracleConnection or some other managed
provider's IDBConnection implementation. Therefore, your best bet
is to handle the problem in your own data access component.

One possible solution is to ping the server with something like "select date=getdate()"
(T-SQL) or "select
sysdate from dual" (PL/SQL) before each
real query. The idea behind this is to receive a date to verify that the pooled connection is
valid and will work correctly for the real query. I've seen this "pessimistic
ping" solution put into production, but it's a bad idea for two reasons. First,
it's an unnecessary and inefficient round trip to the server. Each query ends up being two round
trips, which could impact the performance of the application. Second, in a
busy connection pool, there's no guarantee you'll get the same connection on each
query. The ping query's connection might be unavailable when the real query is
executed, and who knows if the real query's connection is valid or invalid; you're right back to where you started. In this article, I recommend an optimistic
solution to the problem of invalid connections in the pool. But first, to
reproduce the problem, let's examine the behavior of the connection pool with a
simple test harness.

PoolTest

Create a console application called PoolTest, reference the System.Data and
System.Data.SqlClient namespaces, and rename the default Class1.cs to
PoolTest.cs. Then overwrite the default class with this one:

Obviously, you'll want to replace the uid and pwd
connection-string parameters with credentials local to your database instance.
Run the application and notice that the current date writes out to the console.
That's not very interesting. But what is interesting is what happened behind
the scenes. When cn.Open() was called, the managed provider instantiated
an internal class called SqlConnectionPoolManager and invoked its
GetPooledConnection method, passing into it the connection string.
The pool manager examined all current pools to see if there was one that used
a connection string that exactly matched the one it was given. In our case, there
were none. Since there wasn't one, it constructed a new ConnectionPool
object passing in the connection string as a unique identifier for that pool.
It then seeded the ConnectionPool object with three connections.
Why three? Because that's how many we defined in the Min Pool Size
parameter. Finally, when ExecuteReader was called, the SqlConnection
instance used one of these connections to execute the query and fetch the current
system date from SQL Server.

As long as we don't change the signature of our connection string, every time
we press the Enter key, the managed provider will reuse one of the
existing connections in the pool. This boosts performance greatly, because the
managed provider doesn't have to go through the expensive process of instantiating
a new Connection from scratch every time there is a database query.

We can see the database processes that the managed provider created when it
seeded the connection pool. Before shutting down the console application, go
into Query Analyzer and run the following query:

The connections to these three processes are maintained by the managed provider
and match the Min Pool Size value that we set in our connection
string. Now let's simulate a network problem in which the connections in the
pool become invalid without our managed provider's knowledge. To do that, we'll
manually kill all three processes while the test harness is running. In Query
Analyzer, use the T-SQL kill command and, showing no mercy, kill
all three of them:

kill 52;
kill 53;
kill 54;

If you're using Oracle on the back end, then run the following
query instead: SELECT SID, SERIAL#, STATUS FROM v$session WHERE OSUSER =
'ASPNET'. To kill a session, run ALTER SYSTEM KILL SESSION 'x,y'; (where x = SID and y = SERIAL#).

Press the Enter key on the test harness again, and an ugly
SqlException is thrown. Our SqlConnection instance didn't know its
underlying connection had been severed when Open() was called, and it was only after the
ExecuteReader method tried to query the database that the problem
was discovered and the error thrown. Obviously, it
would be desirable for us to handle the exception gracefully rather than being
unprepared for it and letting it
awkwardly bubble up to the caller.

Handling Invalid Connections

So how should we handle the exception? We could trap it at the user interface level and ask the user to try the action again.
Even better, we could handle the exception in such a way that the user is never
made aware of the problem. This pseudocode describes behavior that would be very
desirable:

Now if you run the test harness again and kill the three processes in the pool,
you'll discover that the application recovers quite nicely by trying again until
it finds a good connection or the pool manager creates and adds another valid one. It's
important to close both the data reader and connection as soon as they are no
longer needed. If you don't and they go out of scope, the application will leak a connection. I wrapped
them in a finally block to make sure that even if an exception is thrown they
will get closed properly. In fact, if an exception is thrown then you must
explicitly close the connection to mark it as invalid in the pool. Only if a
connection is marked as invalid will the pool manager remove it from the pool on
its next scan.

A Simple Data Access Component

Console applications are one thing, but what about implementing this solution
in a real-world, data access component? In this section, I'll suggest one
way of doing just that by creating a simple data access component (DAC) that
builds on what we've learned so far about connection pool behavior. To begin,
add a new class to the PoolTest project called SqlHelper.cs
and make sure it references the System.Data and
System.Data.SqlClient namespaces. Then add these two private variables and
a static constructor to the class:

Notice that in the catch block a recursive call is made. If the connection
was severed, then closing it will mark it as invalid in the pool. That way, the
recursive call will get a different connection on the next try. (If you're using
Oracle, this will not work. See the "gotcha" below.) Now, with the SqlHelper class built, we can modify the test harness to call the
ExecuteReader method. Replace the old while loop with this one:

Now run the test harness and pull back a few dates from the database. This
time, if you
kill all three sessions while it is running, the ExecuteReader method
will recover and try again. In this way, the DAC can successfully handle the exception
and get a valid connection from the pool.
Only if there is a persistent problem of some sort, and the number of tries reaches MAX_TRIES, is an error
allowed to bubble up to the caller.

Oracle 9i Gotcha

The details of ODP.NET,
Oracle's implementation of the .NET managed provider, are hidden from us, so
I can't explain why a certain bug happens. But in particular circumstances,
such as when a session is killed (raising OracleException ORA-00028)
or the connection is otherwise severed, the OracleConnection class
cannot reconnect to the database. The bottom line is that if you port the simple
DAC above over to use against an Oracle 9i database instance, you'll be disappointed.
Instead of recovering from the exception as expected, you'll next get OracleException
ORA-01012 ("Not logged on") on each recursive call to ExecuteReader
until MAX_TRIES is finally reached, and the error bubbles up to
the caller.

I do have a workaround for Oracle that I've tested thoroughly. My design
involves a straightforward
Observer Pattern in which
an observer class is notified if the subject class encounters an OracleException,
and then re-seeds the connection pool for the subject before the recursive call
is made. If you use Oracle on the
back end, feel free to contact me at still_james@hotmail.com and I'll email my implementation to you.

A Final Thought

In the interest of making the test harness simple, I used the integer constant
MAX_TRIES to prevent an infinite regress. But in a real-world application,
we have to assume that all connections up to the Max Pool Size parameter
value might be invalid. (If this parameter is not explicitly set in the connection
string, then for SqlConnection the default is 100.) A better solution than
hard-coding a constant is to use the Max Pool Size parameter instead of
an arbitrary MAX_TRIES constant. My final thought, expressed in code (of
course), is a handy method that will return the value of any connection string parameter
passed into it. I'll leave it to you to improve the DAC with it: