Description

I have found an incomplete-initialization bug the WSGIHandler when running under Apache mpm_worker with mod_wsgi. Code review shows this is a potential problem with all of the handlers.

On the first request the WSGIHandler.__call__(...), we set up the middleware. There is a lock wrapping the load of the middleware, but the logic still allows incomplete initialization on other threads. Below is the block of code in question from django/core/handlers/wsgi.py (line 226):

if self._request_middleware is None:
self.initLock.acquire()
# Check that middleware is still uninitialised.
if self._request_middleware is None:
self.load_middleware()
self.initLock.release()

Example:

Initial start or restart of the Apache instance

Thread T1 - a request comes in. self._request_middleware is None

Thread T1 - acquires the self.initLock, calls self.load_middleware()

Thread T2 - a request comes in (T1 is part of the way through self.load_middleware())

Thread T2 - self._request_middleware is not None, but it's not completely loaded either, continues with the request

Thread T1 - completes self.load_middleware(), release the lock

The attached patch changes BaseHandler.load_middleware() to do an atomic set of all four middleware lists. self._request_middleware is set last to signal the completion of middleware loading.

shai, tuple assignment is not atomic, so the suggestion does not help.

The patch looks OK, although, strictly speaking, temporary list is only required for _request_middleware as only that is checked in client code (i.e. only that serves as initialization guard, not the other lists).

It would make things a bit less opaque if a _middleware_not_loaded boolean property was added to BaseHandler (that would do the very same return self._request_middleware is None) and used throughout client code. It may make sense to avoid this for optimization though, method call would add quite a few bytecode instructions.

mrts, I will take your word on tuple assignment as I can't find any reference in a quick search.

However, your threading analysis is wrong: In your line 5, T2 doesn't wait for the initLock. This is because without the (original) patch, self._request_middleware is not None (django/core/handlers/wsgi.py line 226).

Ok I think we've been arguing past each other, the issue isn't anything that occurs in that block of code, the issue is that once that block of code begins exceuting for real in the first thread if a second thread comes in it won't block on that, while it should.

However, your threading analysis is wrong: In your line 5, T2 doesn't wait for the initLock. This is because without the (original) patch, self._request_middleware is not None (django/core/handlers/wsgi.py line 226).