Thursday, January 06, 2005

Leaky Abstractions and UI Toolkits

Late last night, I was digging around in the bowels of ATL for some work-related research and got distracted into doing some wxPython digging, after which I got distracted into doing some Wax digging. (Long night.)

I like the clean and idiomatic Wax design. But when you commit to Wax, you're committing to a lot. For example, on Windows, your dependency chain looks like this:

YourApp -> Wax -> wxPython -> wxWidgets -> Win32

Each of those arrows is an abstraction (or for those who like big words, a "paradigm boundary transition"). And each abstraction leaks. This is true for all abstractions, because an abstraction is just a bridge between two different sets of assumptions.

One good example of abstraction leaks occurs when you wrap a procedural process like Win32 window creation in an object framework (I'll use C++ for this example). To create a window from C, you first create and register a window class structure, which includes a pointer to a window procedure (a callback function that handles all messages for windows of that class). Then you call CreateWindowEx(), specifying your window class. This creates the window and returns a unique identifier for the window (called a window handle, or HWND). Simple, right?

Object-oriented folks would want to create a C++ class for each "window class" and one instance of that C++ class for each window. So you implement the window procedure as a static class method, because Windows expects a plain function pointer and an ordinary (non-static) method carries a hidden "this" parameter. Your window procedure needs to dispatch messages to the appropriate object, so you need a map of HWNDs to instances, which you populate in your object's constructor with the HWND returned by CreateWindowEx.

But there's a subtle gotcha: when CreateWindowEx creates the window it immediately sends several messages to the window procedure and processes the results before returning. When your static window procedure receives these messages, CreateWindowEx hasn't returned yet, so your object hasn't updated the mapping table yet, which means your window procedure doesn't know which object should handle the message!
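Here is a minimal sketch of that gotcha as a pure-Python simulation. Nothing in it touches the real Win32 API: create_window_ex is a hypothetical stand-in for CreateWindowEx, the Window class and its "orphaned" list are invented for illustration, and only the message constants are the real Win32 values.

```python
import itertools

# Real Win32 message ids; everything else here is a simulation.
WM_NCCREATE, WM_CREATE, WM_PAINT = 0x0081, 0x0001, 0x000F

_next_handle = itertools.count(1)


def create_window_ex(wndproc):
    """Hypothetical stand-in for CreateWindowEx.

    Like the real call, it sends creation messages to the window
    procedure BEFORE it returns the new handle to the caller.
    """
    hwnd = next(_next_handle)
    wndproc(hwnd, WM_NCCREATE)
    wndproc(hwnd, WM_CREATE)
    return hwnd


class Window:
    instances = {}  # HWND -> Window, the dispatch map
    orphaned = []   # messages that arrived for an HWND nobody owns yet

    def __init__(self):
        self.seen = []
        self.hwnd = create_window_ex(Window.wndproc)
        # Too late: the creation messages have already been delivered.
        Window.instances[self.hwnd] = self

    @staticmethod
    def wndproc(hwnd, msg):
        target = Window.instances.get(hwnd)
        if target is None:
            Window.orphaned.append((hwnd, msg))
            return
        target.seen.append(msg)


win = Window()
Window.wndproc(win.hwnd, WM_PAINT)  # a later message dispatches fine

print(Window.orphaned)  # the two creation messages, never routed to win
print(win.seen)         # only the message sent after construction
```

The creation messages land in the orphaned list because the map entry does not exist until after create_window_ex returns, which is exactly the window of time the real toolkits have to paper over.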

Object toolkits solve this with different but equally egregious hacks. MFC and wxWidgets abuse a little-known but documented Windows feature called a CBT hook to get a notification at the moment a window is created (before messages are processed). ATL is more evil--it puts a hand-written assembly language thunk in front of your window procedure to replace the HWND parameter with the address of the C++ object (of course, this means they have to hand-write assembly code for each CPU they support, but as far as Microsoft cares, "portability is for canoes").

The point is that abstractions have to do some interesting gymnastics to jump to the next paradigm. Gymnastics means code and data, and code and data mean additional performance cost. A Wax application has four levels of abstraction above the "native" environment. That's why (on my machine) a Wax version of "Hello World" has a memory footprint of many megabytes and takes about five seconds to start, while a bare-metal Win32 version written in C eats less than 50KB and starts in under a second.

wxWidgets was written over a decade ago as a cross-platform toolkit for C++ programmers, so it implements what C++ programmers in the 1990s needed (like a string class, cross-platform sockets, and the Windows CBT hook hack). wxPython is a Python binding for wxWidgets, which means it can rely on some well-used and well-tested code, but it brings along parts of wxWidgets that Python programmers don't need. And Wax is a more idiomatic API on top of wxPython, but it, in turn, has to bring along parts of wxPython that it doesn't need.

I'm not knocking any of the libraries (or their authors). Each decision to adapt the previous library was a good one, in its own context. But it adds up to a tall library stack with a big footprint.

On the other hand, if someone got the itch to take something like the Wax API, and implement it more directly (say, with ctypes interfacing to the native toolkit)... that would be cool.
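For a taste of what the ctypes route involves, here is a small sketch of wrapping a Python function as a native-callable callback. The callback type is only loosely modeled on a window procedure and is my own assumption; it uses ctypes.CFUNCTYPE so it runs anywhere, whereas real Win32 code would use ctypes.WINFUNCTYPE (Windows only) and hand the callback to the actual API rather than calling it directly.

```python
import ctypes

# Assumed signature, loosely shaped like a window procedure:
# (handle, message, wparam, lparam) -> result
WNDPROC = ctypes.CFUNCTYPE(
    ctypes.c_ssize_t,  # result
    ctypes.c_void_p,   # window handle
    ctypes.c_uint,     # message id
    ctypes.c_size_t,   # wparam
    ctypes.c_ssize_t,  # lparam
)

received = []


def py_wndproc(hwnd, msg, wparam, lparam):
    """Plain Python function that records each message it is handed."""
    received.append((hwnd, msg, wparam, lparam))
    return 0


# ctypes builds a native function pointer that forwards to py_wndproc.
native_callback = WNDPROC(py_wndproc)

# Calling through the pointer here stands in for the OS calling it.
result = native_callback(0x1234, 0x0001, 0, 0)
print(received, result)
```

One practical caveat with this approach: the Python side has to keep a reference to native_callback for as long as native code might call it, or the function pointer is freed out from under the toolkit. Yet another leak.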