Monday, May 29, 2006

Redesigned Cocoa binding reduces image size by 2 MB

The old Cocoa binding generated one Factor word for each Objective-C method. With only a dozen or so classes imported, this added 2 megabytes to the image size. The new binding instead stores a global mapping of selector names to return/argument types, and only compiles one word for each distinct return/argument type combination. This reduces the number of stub words from several thousand to 22!
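To make the saving concrete, here is a rough model of the idea in Python rather than Factor. It is only a sketch: the selector table, the type strings, and the names SIGNATURES, STUBS, make_stub and stub_for are all made up for illustration and are not part of the binding.

```python
# Hypothetical model of the redesigned binding; all names and data are illustrative.

# Global table: selector name -> (return type, argument types).
SIGNATURES = {
    "alloc":      ("id",   ("id", "SEL")),
    "init":       ("id",   ("id", "SEL")),
    "retain":     ("id",   ("id", "SEL")),
    "release":    ("void", ("id", "SEL")),
    "dealloc":    ("void", ("id", "SEL")),
    "isEqual:":   ("char", ("id", "SEL", "id")),
    "addObject:": ("void", ("id", "SEL", "id")),
}

# Cache of stubs, keyed by signature instead of by method.
STUBS = {}


def make_stub(signature):
    """Build one message sender for a given return/argument type combination."""
    return_type, arg_types = signature

    def stub(receiver, selector, *args):
        # Stands in for the real foreign call; it just records what was sent.
        return (return_type, receiver, selector, args)

    return stub


def stub_for(selector):
    """Return the shared stub for a selector, creating it on first use."""
    signature = SIGNATURES[selector]
    if signature not in STUBS:
        STUBS[signature] = make_stub(signature)
    return STUBS[signature]


for name in SIGNATURES:
    stub_for(name)

print(len(SIGNATURES), "selectors share", len(STUBS), "stubs")
# prints: 7 selectors share 4 stubs
```

The number of stubs grows with the number of distinct type combinations, not with the number of methods, which is why thousands of methods can end up sharing a couple of dozen words.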

The new syntax is a bit more verbose, but it no longer requires importing a vocabulary for each class you intend to use. Instead, you write:

NSObject -> alloc -> init

This is in fact equivalent to the following, since -> is a parsing word:

NSObject "alloc" send "init" send
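The rewriting that -> performs can be pictured as a token-level transformation. The following Python sketch only models the effect on the source text; it is not how Factor parsing words are implemented, and the function name expand_arrows is hypothetical.

```python
def expand_arrows(source):
    """Rewrite every `-> name` pair into `"name" send`, token by token."""
    tokens = source.split()
    out = []
    i = 0
    while i < len(tokens):
        if tokens[i] == "->" and i + 1 < len(tokens):
            # The token after -> is taken as the selector name.
            out.append('"%s"' % tokens[i + 1])
            out.append("send")
            i += 2
        else:
            out.append(tokens[i])
            i += 1
    return " ".join(out)


print(expand_arrows("NSObject -> alloc -> init"))
# prints: NSObject "alloc" send "init" send
```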

Calling methods in a superclass is done in a similar way:

SUPER-> dealloc

As with ->, this is equivalent to:

"dealloc" send-super

The compiler transforms the calls to send; in fact, the lookup of the selector is done at compile time, and the compiler turns the above snippet into something like this:

The selector object caches the Objective-C selector (a sort of interned string, analogous to a Lisp symbol). The gensym is the cached message sender, which depends only on the argument list. Here is a typical definition:

"id" f "objc_msgSend" { "id" "SEL" } alien-invoke

So the new binding style uses less space in the image and is no slower, since in the end the same code is generated, only with less duplication. There is one disadvantage, though: if two imported methods have the same selector but different argument or return types, they will clash, and you will not be able to call one of the two using send or send-super. There are various ways around this: either make method dispatch slower by looking up argument types at runtime (this also makes it harder, but not impossible, to compile the call), or provide an alternative form of send, perhaps send*, which takes a class name and can be used when selectors clash.
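The clash, and how a class-qualified lookup would sidestep it, can be modelled in a few lines of Python. This is a sketch under assumptions: the classes Foo and Bar, the selector "value", and the two tables are invented for illustration, and the choice that the later import wins is a modelling decision, not something the binding is known to do.

```python
# Hypothetical model of the selector clash; all names are illustrative.

BY_SELECTOR = {}  # what send consults: selector -> signature
BY_CLASS = {}     # what a send*-style word could consult: (class, selector) -> signature


def import_method(class_name, selector, signature):
    """Record a method's signature in both tables."""
    BY_SELECTOR[selector] = signature  # assumption: a later import overwrites an earlier one
    BY_CLASS[(class_name, selector)] = signature


INT_SIG = ("int", ("id", "SEL"))
FLOAT_SIG = ("float", ("id", "SEL"))

import_method("Foo", "value", INT_SIG)
import_method("Bar", "value", FLOAT_SIG)

# A lookup by selector alone only sees one of the two signatures...
print(BY_SELECTOR["value"])
# prints: ('float', ('id', 'SEL'))

# ...while a lookup that also takes the class name can still reach both.
print(BY_CLASS[("Foo", "value")])
# prints: ('int', ('id', 'SEL'))
```

The class-keyed table costs extra space only for the selectors that actually need disambiguating if it is populated lazily, which is one reason the second option looks cheaper than doing every lookup at runtime.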