Description

Under some circumstances (I'll get to guesses about these at the end), the db connection isn't closed at the end of the request and resurfaces later to service another request. Thus one request writes to the database, and then the next request tries to look for that data and fails, because it uses an outdated connection that hasn't yet been notified of the change (per PEP 249, different connections aren't required to see the same data). The workaround is to add a django.db.connection.close() call before that query, which brings to mind bug #13533. Similarly, this bug manifests on MySQL InnoDB; I haven't checked other databases, so that might just be coincidence. It is also possible that the non-closing connection bug exists for all backends, but that auto-commit is only insufficient to prevent the damage on MySQL InnoDB.

The situation under which I've seen these connections retained is when sending an HttpResponse with a generator for content. It might be that when sending such content, the lines at django.db:44-47, which are supposed to make sure the connection closes, don't run because the signal isn't sent. It is, however, still a riddle how the old connection gets attached to a new request. Bug #19117 might also be related to this.
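For context, the mechanism those lines implement is a close_connection receiver hooked up to the request_finished signal. Here is a self-contained sketch of that wiring; the Signal and FakeConnection classes below are minimal stand-ins for illustration, not Django's actual code:

```python
class Signal:
    """Tiny stand-in for django.dispatch.Signal."""
    def __init__(self):
        self._receivers = []

    def connect(self, receiver):
        self._receivers.append(receiver)

    def send(self, sender=None):
        for receiver in self._receivers:
            receiver(sender=sender)


class FakeConnection:
    """Stand-in for a database connection wrapper."""
    def __init__(self):
        self.closed = False

    def close(self):
        self.closed = True


request_finished = Signal()
connections = [FakeConnection()]


def close_connection(**kwargs):
    # Mirrors the intent of the django.db receiver: close every db
    # connection once request_finished fires at the end of a request.
    for conn in connections:
        conn.close()


request_finished.connect(close_connection)

# End of a normal (non-streaming) request: the signal fires and the
# connection gets closed.
request_finished.send()
print(connections[0].closed)  # True
```

The bug described above would then amount to request_finished never being sent for a generator-content response, so close_connection never runs.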

What does this mean? When using StreamingHttpResponse, the request_finished signal is sent before the response is actually generated, meaning the database connections are already closed, and a new query inside the iterator causes a new connection to open.
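The ordering problem can be simulated without Django at all. In this hedged sketch (FakeConnection, streaming_body and handler are hypothetical names, not Django APIs), the "handler" closes the connection where request_finished currently fires, and only afterwards consumes the streaming body, forcing a second connection to be opened mid-stream:

```python
class FakeConnection:
    """Counts how many times a 'connection' had to be (re)opened."""
    def __init__(self):
        self.closed = False
        self.opened_count = 1  # opened once at request start

    def close(self):
        self.closed = True

    def cursor(self):
        if self.closed:
            # Django transparently reconnects when a query arrives on a
            # closed connection; model that as a reopen.
            self.closed = False
            self.opened_count += 1
        return "cursor"


conn = FakeConnection()


def streaming_body():
    for _ in range(3):
        conn.cursor()  # a query made while the response streams
        yield b"row\n"


def handler():
    body = streaming_body()   # generator created, nothing executed yet
    conn.close()              # request_finished fires here -- too early
    return b"".join(body)     # body is only consumed afterwards


handler()
print(conn.opened_count)  # 2: a fresh connection was needed mid-stream
```

The extra connection is exactly the "resurfacing" behaviour described at the top of the ticket.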

Do we have a problem with transactions here? The transaction used for the request could be closed before the response is actually generated. If that is the case, then documenting it with "if using the transaction middleware, the transaction used for the request will be closed before the response is generated" seems like the easy way out. Another option is to add hooks for request finished with success/exception, but that is out of scope for 1.5.

Yes, the transaction middleware will also commit the transaction before the response is generated. Same goes for decorators like commit_on_success. But independent of that, the whole connection is closed before the response is generated.

Ok, so I attached an initial patch. Anssi provided some valuable feedback: he suggested that we only do the .close() dance if we are actually working with a StreamingResponse, to keep the patch backwards-compatible (although we don't know yet whether it is backwards-incompatible at all). I somewhat agree with that, but on the other hand I'd like to handle both cases the same way.

Next question is how to handle streaming responses in the test client. Currently they are not consumed at all, and Anssi suggested always consuming them (we could add a flag later on to prevent consuming them, but that's probably not needed for 1.5).

Last thing: I am still thinking about how to test this, since the TestClient uses its own WSGI handler which does things a bit differently (it actually disconnects the close_connections receiver before it fires request_finished ;)).

Any thoughts from someone with more knowledge of the WSGI protocol than me would be appreciated :)

Related: http://bugs.python.org/issue16220 -- We either have to raise the required Python versions or fix it in our wsgiref subclass (probably the latter, since Python didn't fix it in 2.6, unless it wasn't broken there, which I doubt).

Do we have a problem with transactions here? The transaction used for the request could be closed before the response is actually generated.

This is less likely to bite people in real life. There are obvious use cases for reading from the database while generating a streaming response (e.g. huge CSV exports); I don't see writing to the database as being as common.

Related: http://bugs.python.org/issue16220 -- We either have to raise the required Python versions or fix it in our wsgiref subclass (probably the latter, since Python didn't fix it in 2.6, unless it wasn't broken there, which I doubt).

This fix will be available in Python 2.7.4, which isn't released yet.

If we want to fix this by triggering request_finished in the WSGI iterable's close() method we must ensure it's reliably closed.
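A rough sketch of what "triggering request_finished from close()" could look like: wrap the response iterable so that the server-invoked close() runs the cleanup after the body has actually been sent. This is an illustrative sketch, not the patch itself; ClosingIterator and the cleanup callback are hypothetical names:

```python
class ClosingIterator:
    """Wrap a WSGI response iterable so request cleanup runs in close()."""

    def __init__(self, iterable, cleanup):
        self._iterator = iter(iterable)
        self._cleanup = cleanup  # e.g. lambda: request_finished.send(...)

    def __iter__(self):
        return self

    def __next__(self):
        return next(self._iterator)

    def close(self):
        # A compliant WSGI server calls close() once the response has been
        # fully sent (or aborted), so this is the moment to fire
        # request_finished and let it close the db connections.
        try:
            inner_close = getattr(self._iterator, "close", None)
            if inner_close is not None:
                inner_close()  # also close the wrapped generator, if any
        finally:
            self._cleanup()


events = []
body = ClosingIterator([b"a", b"b"], lambda: events.append("request_finished"))
print(b"".join(body))  # b'ab' -- body consumed first
body.close()
print(events)          # ['request_finished'] -- cleanup fired after
```

As the comment in close() hints, the whole approach stands or falls with servers reliably calling close(), which is exactly the concern above.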

This isn't testable within Django's test framework because the test client explicitly disables closing the database connection at the end of each request (since #8138).

I've used Florian's test project to validate manually that the pull request fixes the issue.

Per PEP 3333, "the close() method requirement is to support resource release by the application". Currently, Django uses the request_finished signal to close the connections to the databases and caches. That's exactly the purpose of close(), and therefore I think it's a good idea to send the signal from there.

If we supported only the iteration approach, then current frameworks that assume the availability of "push" suffer. But, if we only support pushing via write(), then server performance suffers for transmission of e.g. large files (if a worker thread can't begin work on a new request until all of the output has been sent). Thus, this compromise allows an application framework to support both approaches, as appropriate, but with only a little more burden to the server implementor than a push-only approach would require.

To me this suggests it is possible that a different thread will finish the request, and/or that another request is being processed at the same time by the same thread. That would mean we could be closing connections for different threads or different requests. I hope I am 100% wrong here.

If the above is the case, this is going to get a lot more complex. The connection handling is bound to threads, not requests. What we could do is detach the connection from the thread at the old request_finished point, and then when the request is actually finished (the new request_finished point) we could close the connection.
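The detach-then-close idea could look roughly like this; a hedged sketch only, with _local, get_connection and detach_connection as hypothetical names modelling the thread-bound connection registry:

```python
import threading

# Connections are bound to threads, not requests, via a thread-local
# registry. Instead of closing the thread's connection at the old
# request_finished point (which may run in another thread, or while the
# same thread already serves a new request), detach it there and close
# the detached object only when the response iterable is really done.
_local = threading.local()


class FakeConnection:
    def __init__(self):
        self.closed = False

    def close(self):
        self.closed = True


def get_connection():
    # Lazily open one connection per thread.
    if getattr(_local, "conn", None) is None:
        _local.conn = FakeConnection()
    return _local.conn


def detach_connection():
    # Old request_finished point: unbind from the thread, but don't close.
    conn = getattr(_local, "conn", None)
    _local.conn = None
    return conn


first = get_connection()
detached = detach_connection()

# The thread is now free to serve the next request on a fresh connection,
# regardless of whether the old streaming response is still being sent.
assert get_connection() is not detached

# New request_finished point (response fully sent): close the detached one.
detached.close()
print(detached.closed)  # True
```

This keeps the close() for request A from tearing down the connection that request B, running on the same thread, is already using.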