Ian Bicking recently promoted the idea of a WSGI reference library, to possibly include the following components (among others):

Sessions middleware

Logging middleware/library (I assume he meant request logging)

Error reporting middleware/library

Test frameworks

A file application (handling If-Modified-Since, etc)

A proxy application

Libraries for parsing query strings and all that.

Authentication.

URL parsers.

And maybe a few of the more boring servers, like the CGI server, which will otherwise be homeless (or widely repeated).

Not being the most careful reader in the world, I was thrown by the phrase, "...collaborating on a ... library of WSGI middleware"; I read the list as if he meant each piece would be a middleware component! Of course he did not intend that. Many of the items in the list are WSGI applications, which sit at the end of the software stack.

Some of the items in the list are, in fact, paraware; that is, they parallel the main application. Traditional programming libraries/toolkits are a common example of paraware. They provide functionality by supplying input and output hooks, which are supplied and consumed by the main application:

mylib.set_value(3)
mylib.munge(obj)
result = mylib.get('f')

Middleware, on the other hand, handles/munges a content stream, and sits between at least two other components in a software stack. Middleware is a nasty thing in many environments, because each middleware component must manage I/O of all shared objects, in two directions (both its caller and the next component in the stack). In Python, however (and specifically WSGI), the shared objects are all on the same heap, and can all be passed by reference.

I see problems with writing most of these components as middleware. WSGI has a shot at being ubiquitous because it enforces a set of interfaces and a data model; this same enforcement, however, can also be a liability, since WSGI is not yet ubiquitous. As a developer of a web framework, I have a dilemma: I need to provide the same functionality whether my users use WSGI or not. This means I need to write such components as libraries (so they can be used as paraware) and then wrap them with WSGI boilerplate (so they can be used as middleware). This leads to serious code smell. WSGI's callback structure is complicated enough without me introducing library-code wrappers. Perhaps what we need are generic pieces of WSGI middleware which you can init with a callback from your library code. Hmmm.

Potential components from Cation

I've been meaning for a while now to investigate breaking my Cation app framework down into a set of libraries (instead of the monolithic framework it is today). You can see from the dearth of recent checkins that I haven't done any of it yet. Many of those could be added to a WSGI library (some are already on Ian's list). Here are the ones I'd be most interested in writing:

Top-level error trapping, logging, and pretty printing

I'd like to do this myself because Cation keeps a list of application developers (usernames), and shows full tracebacks in the browser to developers. Ordinary users get a "pretty" error message, and the full traceback goes into the log only. I'm pretty sure a standard library version wouldn't do that. Integrating the usernames into the error handling logic leads me to want to provide this as paraware, since middleware components are usually not expected to interoperate.

This isn't WSGI-specific, and shouldn't be a candidate for WSGI. But it's something I'd like to rewrite in more of a library style, instead of a framework.

Centrally registered and managed requests

For example, this would assist a WSGI application in fulfilling a request to shut down--each active web request (thread) could be sent a shutdown message and kill itself gracefully from outside the application itself.

Data type coercion (both inbound and outbound), including encoding

Since HTML form values are always received as strings, a standard (but overridable) way to convert them into Python values would be helpful. In the other direction, values need to be coerced to strings, put in the encoding of the server (or of the page), and often quoted safely. Again, this would probably need enough customizability that it would be a poor candidate for middleware, but a good candidate for a set of library calls.

Profiling

Classic middleware, meeting a need orthogonal to the actual content delivery, and not needing customization or context.

HTTP uploads.

Something that might on occasion need to be specialized, but ultimately a commodity for 90+% of cases. The standard implementation would be nothing more than a pretty interface over simple (but secure) file management.

That's enough for the next year or so Pity I have so many other projects to work on simultaneously.