05/30/06

Ryan and I took a spur-of-the-moment backpacking trip to the Sespe Wilderness this weekend. His notes and photos are here.

All in all, I felt it was a so-so trip. It was all rushed, and only three days long, so I didn't quite get into the 2-person backpacking vibe. When you hike with only one other person, you find yourself with a lot more solo time than with 3 people or more. I also was hoping to go all the way to the Sespe hot springs on the first day, but blisters prevented that. We settled for the Willette Hot Springs, which took a couple of hours to find and was pretty disappointing. I think I've been spoiled by Big Sur. We also had some technical problems (always, always buy real Nalgene!) that didn't help the overall enjoyment of the trip.

Despite all that, it was worth it. The Sespe river in May is moderately strong, and our trail crossed it at least 6 times. So if you ever go in May, hike in river-running shoes, so you don't have to stop and change shoes every hour.

05/09/06

Looks like CherryPy 3 will be significantly faster than CP 2.2. Here are some quick benchmark (Apache ab) stats from my little Win2k laptop. The first three are from the same test (1000 requests, 14 byte response body, 10 server threads), for 10 to 50 client threads:

These two are from a different test (1000 requests, 50 client threads, 10 server threads), for response sizes of 10 bytes, 100, 1K, 10K, 100K, and 100M:

I believe the improvement comes from three areas. First, the lowercase_api flag and checks are no longer needed. Second, filters are no longer called just to see if they're turned on. Third, all of the configs and special attributes are now looked up once, inline with the page handler (i.e., controller method) lookup.

04/23/06

I committed the first round of changes for CherryPy version 3 on Friday. It's nowhere near complete, but it hopefully can give hints about the future.

Before I dive into the meat, you should know I moved some things around:

_cphttptools is now called _cprequest

There's a new 'tools.py' module (see below).

All of the code in the /filters folder still exists, but it's all been moved into the /lib folder. The filters folder has been removed.

You can now call functions and instantiate objects in config files. For example: now = cherrypy.lib.httptools.HTTPDate()

Dispatchers

In CherryPy 2.2, you're able to replace the page-handler-dispatch mechanism by using a custom Request class; that is, you would subclass _cphttptools.Request and override the main or mapPathToObject methods. That can be tedious, since you can't specify the Request class on a per-request basis; the Request object has already been formed by the time the URL has been parsed.

In CP 3, there's a new _cprequest.dispatch function, and each Request object calls it. If you don't like the way CP looks up page handlers by default, you can declare your own dispatcher in the config:

dispatcher = my.custom.dispatcher.function

or

dispatcher = my.custom.DispatchClass(blah)

The only requirement is that the right-hand-side be a callable: it takes a "path" argument and should return a page handler (a callable). The default Dispatcher also sets request.virtual_path, so unless you're also setting request.execute_main to False you should probably do the same.
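Here's a rough sketch of what such a callable might look like. The TableDispatcher class, the route table, and the three handlers are all made up for illustration; only the contract (takes a "path", returns a callable page handler) comes from the description above. A real dispatcher would probably also want to set request.virtual_path, which this toy leaves out.

```python
class TableDispatcher:
    """Hypothetical dispatcher: maps exact paths to page handlers."""

    def __init__(self, routes, default):
        self.routes = dict(routes)
        self.default = default

    def __call__(self, path):
        # Normalize so "/about/" and "/about" find the same handler.
        key = "/" + path.strip("/")
        return self.routes.get(key, self.default)


def index():
    return "Hello from index"


def about():
    return "About this site"


def not_found():
    return "404 Not Found"


# An instance is itself a callable, so it could sit on the right-hand side
# of a "dispatcher = ..." config entry.
dispatcher = TableDispatcher({"/": index, "/about": about}, default=not_found)
handler = dispatcher("/about/")
body = handler()
```

Because the only requirement is "be a callable", a plain function would work just as well as a class instance; the class just makes it easy to pass in configuration.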

Filtering is now Hooking

I had a good long look at filters in CP 2.2. Despite their name, they don't really "filter" anything; nothing "passes through them". Some of them modify cherrypy.request attributes, but just as many of them don't. They're not implemented as filters; instead, they're "hooks".

A "hook" usually means a place where callbacks are called, and CherryPy filters have always been called from a pre-determined set of hooks (e.g. before_request_body). So I went ahead and changed the terminology throughout the codebase.

But there's a much bigger change than just the name. People have been pining for CP to release its grip over both filter declaration (which filters are available) and filter invocation (CP 2 calls all filter methods whether enabled or not). These issues have largely been solved in the current trunk by moving control out of the global cherrypy.filters module and into each Request object. Every Request object now possesses a "hooks" attribute, a _cprequest.HookMap object. The HookMap class has the following attributes and methods:

callbacks: a dict of the form {hookpoint: [callback, ...]}. The "hookpoint" is one of our old filter method names, like "before_finalize".

failsafe: a list of hooknames that should run all their callbacks, even if some of those callbacks raise exceptions.

attach(point, callback, conf=None): allows you to attach a callback to be invoked by this request. Any code can do this, and can do it on the fly! See the new caching module for an example; if the request is served from cache before_main, then the logic which would cache the page handler output is never attached, and therefore never invoked.

run(point): runs all registered callbacks for the given hook point.

populate_from_config(): this is called automatically by the Request object, and searches for Tools which it can call to set up hooks. What's a Tool? Read on...
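To make the shape of that concrete, here's a toy version of the idea. This is not the real _cprequest.HookMap; it's a minimal sketch using the attribute and method names listed above. Two things are my assumptions: conf entries get bound to the callback as keyword arguments, and a failsafe hook point re-raises the first error once every callback has had its turn. populate_from_config is omitted.

```python
import functools


class HookMap:
    """Toy per-request hook registry (a sketch, not CherryPy's class)."""

    def __init__(self, failsafe=()):
        self.callbacks = {}              # {hookpoint: [callback, ...]}
        self.failsafe = list(failsafe)   # points that always run everything

    def attach(self, point, callback, conf=None):
        # Assumption: conf entries become keyword arguments of the callback.
        if conf:
            callback = functools.partial(callback, **conf)
        self.callbacks.setdefault(point, []).append(callback)

    def run(self, point):
        first_error = None
        for callback in self.callbacks.get(point, []):
            try:
                callback()
            except Exception as exc:
                if point not in self.failsafe:
                    raise
                # Failsafe point: keep going, remember the first failure.
                if first_error is None:
                    first_error = exc
        if first_error is not None:
            raise first_error


log = []


def add_header(name, value):
    log.append((name, value))


def broken_cleanup():
    raise RuntimeError("cleanup failed")


def close_session():
    log.append("session closed")


hooks = HookMap(failsafe=["on_end_request"])
hooks.attach("before_finalize", add_header, conf={"name": "X-Demo", "value": "1"})
hooks.attach("on_end_request", broken_cleanup)
hooks.attach("on_end_request", close_session)

hooks.run("before_finalize")
try:
    hooks.run("on_end_request")
except RuntimeError:
    pass  # close_session still ran, because the point is failsafe
```

Since each request would own its own HookMap, attaching a callback halfway through one request can't leak into any other request.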

Tools

CherryPy has always included a number of extensions and libraries which help you design web applications more quickly. In addition, many people have designed their own extensions to CP, some as custom filters, some as decorators, some as base classes to be subclassed, some as WSGI middleware, custom Request objects, on*Start methods, etc., etc., etc.

I'd like to call all of these extensions "features" for the rest of this post. A "feature" in this sense is any function(s) or module which could be implemented in a variety of ways. If the feature should apply site-wide, you probably want to run it like a CP2-style filter, and perhaps declare its scope in the config dict/file. But if it only applies to a page handler or two, you might think a decorator would be more attractive syntax. Sometimes, you want to invoke the feature from inside the page handler, after you've inspected a certain header, or after a lookup has failed.

However, it often happens that implementing your feature to be used in one of these ways harms its use in another: if you make a lovely decorator out of your feature, chances are that you cannot just "plug it in" as a before_main handler and expect it to work. This was a big problem for CP 2; a lot of logic could be useful elsewhere, but wasn't available because it was "locked away" inside a filter or some other construct.

A Tool is my new term for "feature adapter". If you can write your feature as a normal Python function, with normal Python arguments instead of config.get calls, chances are it can be wrapped in a Tool in a single line of code:

Your function can be called from the tools namespace: tools.cool_stuff(*args, **kwargs).

Your function can be used as a decorator via @tools.cool_stuff.wrap(*args, **kwargs). Any arguments passed to wrap() get passed to your function whenever it is called.

Your function can be used as a hook and managed in config. Remember the populate_from_config method (above)? It scans through the current config, finds any items that start with "tools.", and checks to see that "tools.cool_stuff.on" is True. If it is, it takes all other "tools.cool_stuff.*" config entries and passes them as named arguments to your cool_stuff function, at the hook point you requested.
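The three uses above can be sketched with a toy adapter. To be clear, this is not the code in tools.py: the constructor arguments, the signature of setup(), and the "attach" callable it receives are all assumptions of mine, chosen only to show how one plain function can be reached three ways.

```python
import functools


class Tool:
    """Toy feature adapter (a sketch, not CherryPy's actual Tool class)."""

    def __init__(self, point, func):
        self.point = point   # hook point used when enabled from config
        self.func = func     # the plain Python function being adapted

    def __call__(self, *args, **kwargs):
        # Use 1: call the feature directly, e.g. inside a page handler.
        return self.func(*args, **kwargs)

    def wrap(self, *args, **kwargs):
        # Use 2: decorator. Run the feature, then the page handler.
        def decorator(handler):
            @functools.wraps(handler)
            def wrapper(*h_args, **h_kwargs):
                self.func(*args, **kwargs)
                return handler(*h_args, **h_kwargs)
            return wrapper
        return decorator

    def setup(self, attach, conf):
        # Use 3: config. If "on" is true, attach the feature at our hook
        # point, passing the remaining entries as named arguments.
        conf = dict(conf)
        if conf.pop("on", False):
            attach(self.point, functools.partial(self.func, **conf))


calls = []


def cool_stuff(level=1):
    calls.append(level)


cool = Tool("before_main", cool_stuff)

cool(level=2)                      # direct call


@cool.wrap(level=3)                # decorator
def page():
    return "page body"


page_body = page()

attached = []                      # stand-in for a request's hook registry
cool.setup(lambda point, cb: attached.append((point, cb)),
           {"on": True, "level": 4})
for point, callback in attached:
    callback()                     # what running the hook point would do
```

The point of the exercise is that cool_stuff itself never has to know which of the three routes it was called through.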

That is the "simple case", and there is sufficient room for very complex additions to that (grep for the setup method). If your feature needs to replace the page handler, for example (as caching, static, and xmlrpc do), there's a tools.MainTool class; when used as a decorator or a hook, it automatically skips the page handler for you if your function returns True (meaning "I've handled this request, thanks").

I plan to explore other Tool improvements in the near future:

Argument inspection is high on the list, so that decorators, etc. get the same argspec as your original function. You might also be able to import tools and let your IDE auto-complete your config entries, which in my mind would cut down on reaching for manuals quite a bit. It would have to be optional, because IIRC Jython doesn't have an "inspect" module.

Other hook points are possible. Investigate using hooks in a more generic fashion.

Using a tool as a decorator effectively means that it is not overridable in config. This "feature lock" is something I've wanted for quite a while, but there may need to be some means of allowing config to override such features or their arguments. For example, a developer may want to insist that a "staticfilter" be in place, but not particularly care about the OS path to its resources.

There are other issues that need to be addressed in CherryPy 3, of course (separating the CP server and the HTTP server springs instantly to mind). But these changes should give us a good basis for consolidation of a lot of code, and the freedom to use all our beautiful library logic in whatever way is most appropriate to each application and installation. I look forward to all your ideas and improvements.

04/05/06

...By my understanding of Single Table Inheritance, the flight leg, ground leg, and scheduled meeting legs data would all be mashed up in one table. If I were designing tables in an RDBMS, I would never design that way – and I’m no genius Relational designer.

I guess, after thinking about it, I would write the three legs as separate tables/objects and write one legs action to combine them (in Rails)?

That's how I'd do it in Dejavu (three different tables). But Dejavu allows you to recall objects from those three tables either individually or collectively, without having to write your own "combine action". If you recall the subclass, you get just that subclass. If you recall the superclass, you get objects from all subclasses together in the same result set (you also get objects from the superclass, although quite often it's abstract and there aren't any).
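The recall behavior is easy to picture with a toy in-memory store. None of this is Dejavu's API; the Store class and the three leg classes are hypothetical stand-ins, with one list per class playing the role of one table per class.

```python
class Leg:
    """Abstract superclass; there is no row data of its own here."""


class FlightLeg(Leg):
    def __init__(self, flight_no):
        self.flight_no = flight_no


class GroundLeg(Leg):
    def __init__(self, vehicle):
        self.vehicle = vehicle


class MeetingLeg(Leg):
    def __init__(self, room):
        self.room = room


class Store:
    """Toy store: one 'table' (a list) per concrete class."""

    def __init__(self):
        self.tables = {}

    def save(self, obj):
        self.tables.setdefault(type(obj), []).append(obj)

    def recall(self, cls):
        # Gather rows from the table for cls and for every subclass of it.
        results = []
        for table_cls, rows in self.tables.items():
            if issubclass(table_cls, cls):
                results.extend(rows)
        return results


store = Store()
flight = FlightLeg("UA 123")
store.save(flight)
store.save(GroundLeg("shuttle"))
store.save(MeetingLeg("Room 4"))

only_flights = store.recall(FlightLeg)   # just the one subclass
all_legs = store.recall(Leg)             # every subclass, one result set
```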

I've never worked with Rails' inheritance, but I have been horrified to see mashup tables in plenty of databases. You know the ones: three columns common to all records, 28 columns that only apply to 50% of the rows, and 34 additional columns that only apply to the other 50%. Pick larger column-counts and smaller percentages if you're into mental masochism.

04/03/06

You may have run into this problem if you've ever tried to use "Require group" and "SSPIOmitDomain" simultaneously with mod_auth_sspi. I encountered it while trying to set up a new Trac site, to which I only wanted to allow access by staff members.

The problem seems to be that, if you set "SSPIOmitDomain On", then no SSPI call is made to check credentials. This could be because the domain is "omitted" before the authentication is done (I cared on Friday night, but I don't now). Regardless, the group requirement seems to fall through the sspi handler, at which point Apache complains that no group file could be found.

Anyway, rather than patch mod_auth_sspi, there's an easy workaround: use a PythonFixupHandler to strip the domain, instead of using SSPIOmitDomain.
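A handler along these lines should do it. Treat it as a sketch rather than a tested drop-in: it assumes the authenticated name arrives in req.user as DOMAIN\user, and that req.user is writable in your mod_python version. The ImportError fallback is only there so the function can be tried outside Apache.

```python
try:
    from mod_python import apache
    OK = apache.OK
except ImportError:
    # Outside Apache (e.g. when testing); apache.OK is 0.
    OK = 0


def strip_domain(user):
    """Return the part after the first backslash, or the name unchanged."""
    if user and "\\" in user:
        return user.split("\\", 1)[1]
    return user


def fixuphandler(req):
    # The fixup phase runs after authentication and authorization, so the
    # group check has already seen the full DOMAIN\user name.
    if req.user:
        req.user = strip_domain(req.user)
    return OK
```

You'd then point a PythonFixupHandler directive at whatever module you save this in, and leave SSPIOmitDomain off.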

Hope that helps someone. At the least, it should save you the trouble of porting mod_auth_sspi to a PythonAuthenHandler, as I tried to do first. Since there's limited access to the Apache connection objects in mod_python (e.g., you can't define connection cleanup functions), that's a whole 'nother barrel of fun...

03/23/06

When writers, editors, and publishers manage to be interesting, informative, or entertaining -- in short, useful -- we attract readers. If we are consistently useful, a relationship bond may form. And if we are clever, we will figure out how the tangible expression of that bond -- the RSS subscription -- can mediate exchanges of money for value. But language determines thought, and our language of sausage and traffic prevents us from focusing on what we actually do, why it matters, and how to reinvent ourselves in a networked world.

No, no, no, no, NO. If you are consistently useful, I will pay you for being useful, each time you are useful. If you continue to be clever and expect a relationship, where I pay you whether you are useful or not, you are clevering yourself out of the networked world.

I do not have a relationship with any sausage vendor, nor do I want one. The only reason I may have one is because they are geographically nearby, and the cost of switching to another is therefore high. In a "networked world", no site is farther away than any other, and switching costs are zero. If you charge for your relationship, sooner or later you will be beaten out on the Net by someone who does not.

The "language that determines thought" here is not "traffic", it is "relationship". Stop using that word.

03/19/06

I spent a few hours of my weekend working on getting a Read-Eval-Print Loop (sometimes called an "interactive interpreter") in a web browser. It was surprisingly easy to do so using Python's builtin code module and CherryPy. You can get it here: http://projects.amor.org/misc/wiki/HTTPREPL If anyone wants to contribute adapters for other web frameworks, I'd be happy to include them.

Anyway, now that you can build your application completely on the fly, we're one step closer to Smalltalk-style web nirvana. Maybe I should include a textarea option for larger chunks of code? Maybe an option to save the command history with the prompts stripped out? Hm...
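For the curious, the heart of such a REPL can be sketched with nothing but the standard library. This is not the HTTPREPL code itself; CapturingConsole and push_line are names I've made up here, and the web framework part (one console per session, a form that posts each line) is left out entirely.

```python
import code
import io
from contextlib import redirect_stdout


class CapturingConsole(code.InteractiveConsole):
    """Console that returns its output as a string instead of printing."""

    def __init__(self):
        super().__init__()
        self._errors = []

    def write(self, data):
        # The code module calls write() for tracebacks and syntax errors.
        self._errors.append(data)

    def push_line(self, line):
        """Feed one line of source; return (needs_more_input, output)."""
        out = io.StringIO()
        self._errors = []
        with redirect_stdout(out):
            more = self.push(line)
        return more, out.getvalue() + "".join(self._errors)


console = CapturingConsole()
console.push_line("x = 2 + 3")
more, output = console.push_line("x * 2")
```

The "more" flag is what tells the page whether to show a continuation prompt, since push() returns true while a multi-line statement is still incomplete.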

03/07/06

I've often thought that sucking less every year is how humble programmers improve. You should be unhappy with code you wrote a year ago. If you aren't, that means either A) you haven't learned anything in a year, B) your code can't be improved, or C) you never revisit old code. All of these are the kiss of death for software developers.

They are also, therefore, the kiss of death for open-source software projects.

And yet, somehow, both new and experienced developers look at new and even gently-used open-source software efforts and expect them to be perfect. Not even, "this works as advertised" perfect, but "this is exactly how I would have written it" perfect. Never mind that, if they had written it, it would suck in a year, too.

This has been bothering me for a long time, and boils down to this maxim: "Everything's hard until you've done it once." Computation is hard. Search is hard. Concurrency is hard. Interfaces are hard. It's easy for smart people to believe that what they know today has always been known by everyone, and to conclude that, if you don't know it yet, you are a total dunce. Don't fall into that trap, people.

One of the things I like best about the Python community in particular is a healthy dose of humility. Let's keep it going.