a detailed Naiad description

(from the Spoon release notes)

Spoon is a project to make Smalltalk systems more understandable, by removing unnecessary stuff, reorganizing what remains, and making it easier to share and deploy. If you’re interested in teaching, modules, bootstrapping, minimalism, remote messaging, proxies, metaprogramming, streams, sockets, or namespaces, you may be interested in this.

Motivation

Smalltalk began as a part of a vision to provide extraordinary computing power to creative individuals. Although computer networks grew alongside it, Smalltalk was released into the world while its support for collaboration was still rudimentary. The system uses the concept of a virtual machine, including a model of the machine’s memory, composed entirely of objects. Besides providing a very portable execution model, this scheme gives the system a powerful continuity: everything happens as the result of sending messages to objects. When it comes to sharing objects with other people, however, this continuity is violated.

Traditionally, to share code with another Smalltalk programmer, one writes source code to a file, and shares the file. One is no longer sending messages to objects, and the behavior being shared is reduced from live objects to a textual form which requires compilation to reanimate. When it is in that form, one cannot ask it to do things, as with objects. Worse, source code is ambiguous; the result of compilation depends on the environment in which it takes place. It’s easy for the target compilation environment to be different from the original one in very subtle ways. Even referring to a class by name is risky; there is no way to be sure that the class with that name in the target compilation environment (if there even is one) is equivalent to the one in the original.

With the means of code-sharing being relatively unexpressive and inconvenient, Smalltalk object memories develop mostly in isolation, and quickly diverge from one another. And within each memory, support for describing the component behavior of different subsystems is weak. The result, after over thirty years of evolution, is that it is difficult even to describe what behavior one wants to transfer, before subjecting it to the inaccuracies of the transfer itself. Object memories have grown into large knots which are hard to untangle manually. Smalltalk’s designers pursued a system that one person could understand and maintain. These problems hinder that pursuit, and make working in teams much harder than it could be. They make the experience of teaching and learning the system awkward, and this is one reason why the Smalltalk community is small.

Approach

Smalltalk programmers who persevere in spite of these problems live with them for quite a while. They often don’t even notice them, both because the system in general is so much more powerful and fun than those they have used before, and because the peculiarities of the development process are longstanding traditions. When the problems become too painful to ignore, many people contemplate something better, but end up using workarounds that seem to make a reasonable trade between improvement and compatibility. There always seem to be more important things to work on. I think the workarounds created so far (for example, Monticello and SqueakSource) have made team development tolerable, but they don’t make the system much easier to understand.

For me, the point at which the pain became intolerable came while I was teaching a Squeak course at Stanford University. There’s nothing like explaining the system to a room full of inquisitive college students to make its shortcomings clear. You see the system with new eyes, and a fresh sensibility. In order to explain one concept, I had to explain several interdependent pieces of folklore. I wanted to show how object-oriented programming enables powerful divisions of responsibility, but it was tricky to do this with Smalltalk as my exemplar.

I decided it was time for the extraneous stuff to go. I wanted to know what is truly essential for a functioning Smalltalk system, so I could show it to others. Once I had identified that, I wanted a way to compose larger systems while avoiding the knots of the past.

Nuke It From Space; It’s The Only Way To Be Sure

I started by creating a proxy system and a remote system browser that used it, so that I could remove things from afar. As anyone who has tried to refactor the user interface knows, it’s tricky to do that while using it. By changing the system remotely, I could use another system’s user interface instead. I was able to remove large swathes of the system with impunity, which was fun.

I got pretty far with this before I realized there’s a better way to discard unused objects: throw away all the methods that haven’t been run recently. I modified the virtual machine to set a mark bit in each method it runs, and wrote a version of the garbage collector which treats unmarked methods as garbage. I could clear the marks on all the methods in the system, then run the system for a while (ideally, run unit tests). When I ran the new collector, all the unmarked methods got discarded.

More importantly, all the literals of those methods got discarded, if there was no other path from them to some marked method. By running this from a remote browser while the target system ran headlessly, I was able to discard the entire graphics system, including Morphic, in one stroke. This led to a series of small object memories, the smallest being 1,337 bytes long (it adds 3 and 4 and then exits).
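The mark-and-discard idea can be illustrated with a toy model. This is a minimal sketch in Python rather than Smalltalk, and it is not Spoon's collector: the `Method` class, the `marked` flag standing in for the VM's mark bit, and the one-level reachability check are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Method:
    selector: str
    # Names of the objects this method refers to (its literals).
    literals: list = field(default_factory=list)
    # Stands in for the per-method mark bit set by the modified VM.
    marked: bool = False


def clear_marks(methods):
    for m in methods:
        m.marked = False


def run(method):
    # A real VM would set the bit as a side effect of executing the method.
    method.marked = True


def collect(methods, other_roots=()):
    """Keep marked methods, plus literals still reachable from a survivor."""
    survivors = [m for m in methods if m.marked]
    live = set(other_roots)
    for m in survivors:
        live.update(m.literals)
    return survivors, live


methods = [
    Method("add", literals=["SmallInteger"]),
    Method("drawOn:", literals=["Morph", "Form"]),
    Method("printOn:", literals=["String", "Form"]),
]
clear_marks(methods)
run(methods[0])  # pretend the test suite exercised these two methods
run(methods[2])
survivors, live = collect(methods)
```

Here `drawOn:` is discarded because it never ran. `Form` survives because a marked method still refers to it, while `Morph` has no remaining path from a marked method and goes away with it.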

Visualization

I realized I now had memories with fewer bytes than there were pixels on my screen. I wanted to see what they looked like. I extended Squeak’s virtual machine simulator to write the object memory as a picture, one pixel per object byte, with the color of each pixel determined by the class of the corresponding object. By writing such a picture every few instructions, I made animations that showed contexts being created, and objects allocated and reclaimed. I made an interactive browser that could magnify a region of one picture, and print detailed information about the object corresponding to a pixel chosen with the mouse.
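The pixel-per-byte mapping can be sketched as follows. This is a hypothetical Python illustration, not the simulator extension itself: the hash-based `class_color` function, the `(class_name, size)` representation of objects, and the row width are assumptions.

```python
import zlib


def class_color(class_name):
    """Derive a stable RGB color from a class name (an arbitrary choice)."""
    h = zlib.crc32(class_name.encode("utf-8"))
    return ((h >> 16) & 0xFF, (h >> 8) & 0xFF, h & 0xFF)


def memory_to_pixels(objects, width):
    """objects: list of (class_name, size_in_bytes) in memory order.

    Returns rows of RGB tuples, one pixel per object byte.
    """
    flat = []
    for class_name, size in objects:
        flat.extend([class_color(class_name)] * size)
    # Pad the last row with black so every row has the same width.
    remainder = len(flat) % width
    if remainder:
        flat.extend([(0, 0, 0)] * (width - remainder))
    return [flat[i:i + width] for i in range(0, len(flat), width)]


def object_at(objects, pixel_index):
    """Map a pixel back to (class_name, start_offset) of the owning object."""
    offset = 0
    for class_name, size in objects:
        if pixel_index < offset + size:
            return class_name, offset
        offset += size
    return None  # the pixel is padding past the end of memory


objects = [("MethodContext", 6), ("Array", 4), ("CompiledMethod", 5)]
rows = memory_to_pixels(objects, width=4)
```

`object_at` plays the role of the interactive browser's lookup: given the pixel under the mouse, it finds the object that owns that byte.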

At the same time I was ripping things out, I was using remote messaging to put things in. I designed a way to transfer compiled methods from one system to another directly. I call this imprinting. I extended my remote browser to compile source code in the user-interface memory, and transfer the results directly to the remote memory. With this direct transfer, the compilation environment of the target memory matters much less, because no compilation is necessary in the target memory. If you can transfer a method’s literals and instructions correctly, then source code becomes an optional piece of documentation for the benefit of humans. In Spoon, literal markers describe method literals. They are transferred instead of the literals themselves, and they can recreate those literals when their methods are installed.

Transferring some method literals, like strings, is trivial. Others pose a challenge. In particular, how should one transfer a class reference? If a class object in one system and the corresponding class object in another system are truly equivalent in how they define the state of their instances, then we can refer to them using a single unique identifier. Like source code generally, class names become an optional piece of documentation. Compiled methods, and the virtual machines that run them, don’t care what the names of the classes are. It is only important that the instances the methods manipulate are of the expected storage format. This led to the namesake concept of Spoon’s module system.
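To make the role of literal markers concrete, here is a small Python sketch. The names `StringMarker`, `ClassMarker`, `resolve`, and `install` are hypothetical, and the dictionary standing in for a target memory's classes is an assumption; Spoon's actual marker protocol is not shown in this text.

```python
import uuid
from dataclasses import dataclass


@dataclass(frozen=True)
class StringMarker:
    value: str

    def resolve(self, target_classes):
        # A string literal can simply be recreated from its contents.
        return self.value


@dataclass(frozen=True)
class ClassMarker:
    class_id: uuid.UUID

    def resolve(self, target_classes):
        # Look the class up by identity; its name in the target is irrelevant.
        return target_classes[self.class_id]


def install(markers, target_classes):
    """Recreate a method's literals in the target memory from its markers."""
    return [marker.resolve(target_classes) for marker in markers]


point_id = uuid.uuid4()
# The target memory happens to call the equivalent class something else.
target_classes = {point_id: {"name": "Punkt", "format": "two named fields"}}
literals = install([StringMarker("hello"), ClassMarker(point_id)], target_classes)
```

The class reference resolves correctly even though the target names the class differently, which is the sense in which the name is only documentation.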

NAIAD: Name and Identity Are Distinct

When you compile a method from source code that refers to a class, that method object has a reference to a unique live class object in the system. By using a unique identifier to refer to that class, rather than a textual name, we can preserve the identity of that class object across multiple object memories. By transferring methods directly, without resorting to recompilation, we can preserve the behavior of the instances of that class across memories too.

Naiad, Spoon’s module system, provides a framework for structured imprinting. With it, we can not only transfer methods, but also refer to particular versions of execution environment components, including methods, modules (groups of methods), classes, authors, comments, tags, and checkpoints. We can create collections of these references, called edit histories, that can answer important questions about the execution environment over time.

Identifiers and Editions

The basis for these references is the identifier, or ID. Each class in the system (both meta and non-meta) is given a universally unique identifier, or UUID, also known as its base ID. Add to that the UUID of the author who defined the class, and a version, and you have a class ID. Add to that the selector, version, and author UUID of a method of that class, and you have a method ID.
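The way these IDs nest can be sketched with two small record types. This is an illustrative Python model only: the field names and the use of an integer for the version are assumptions, since the text does not specify how Naiad encodes them.

```python
import uuid
from dataclasses import dataclass


@dataclass(frozen=True)
class ClassID:
    base_id: uuid.UUID    # the class's own UUID
    author_id: uuid.UUID  # UUID of the author who defined the class
    version: int          # assumed to be a simple counter


@dataclass(frozen=True)
class MethodID:
    class_id: ClassID
    selector: str
    version: int
    author_id: uuid.UUID


author = uuid.uuid4()
point_v1 = ClassID(base_id=uuid.uuid4(), author_id=author, version=1)
point_v2 = ClassID(base_id=point_v1.base_id, author_id=author, version=2)
plus = MethodID(class_id=point_v1, selector="+", version=1, author_id=author)
```

Two versions of the same class share a base ID but are distinct class IDs, and a method ID pins down exactly which class version it belongs to.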

With these IDs, we can record the act of installing the corresponding components, as editions. An edition is a description of a component, sufficient to create an identical instance of that component in another memory. For example, a method edition records the header, literal markers, initial and ending program counter values, instructions, and class edition of a method, and a class edition records the format, superclass ID, and method editions of a class.

A class in Spoon still has a name, but it is part of the state of that class, along with its base ID. Each class is responsible for its own name, and there is no longer a need for a central registry of names. In Spoon there is no system dictionary. The compiler asks a class directly for its name binding. The system roots, including class Object, are known to the compiler. One could argue that, apart from the pseudovariables (true, false, nil, and thisContext), there are no global variables in Spoon.

Edit Histories and Modules

The place where we record editions is an edit history. Here editions are associated with the IDs of their components. We can then look up the edition for any component, given its ID. For example, we can ask a class for its base ID, and ask the edit history for the active class edition for that base ID. Editions have references to the previous and next editions of their component in time, so we can make more complex queries such as “Which classes in the system have ever had a method version written by this author?”
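A query like that one can be answered by walking the chain of editions. The following Python sketch is a simplified model under stated assumptions: `MethodEdition`, `EditHistory`, and `classes_touched_by` are hypothetical names, and plain strings stand in for base IDs and author UUIDs.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class MethodEdition:
    class_base_id: str
    selector: str
    author: str
    # Links to the neighbouring editions of the same method over time.
    previous: Optional["MethodEdition"] = field(default=None, repr=False, compare=False)
    next: Optional["MethodEdition"] = field(default=None, repr=False, compare=False)


class EditHistory:
    def __init__(self):
        # (class_base_id, selector) -> active (latest) edition
        self.active = {}

    def record(self, edition):
        key = (edition.class_base_id, edition.selector)
        older = self.active.get(key)
        if older is not None:
            older.next = edition
            edition.previous = older
        self.active[key] = edition

    def classes_touched_by(self, author):
        """Base IDs of classes with any method edition, past or present, by author."""
        found = set()
        for edition in self.active.values():
            e = edition
            while e is not None:
                if e.author == author:
                    found.add(e.class_base_id)
                    break
                e = e.previous
        return found


history = EditHistory()
history.record(MethodEdition("point-id", "+", "alice"))
history.record(MethodEdition("point-id", "+", "bob"))
history.record(MethodEdition("rect-id", "area", "bob"))
```

Although the active edition of `+` is by "bob", the walk through `previous` still finds the earlier edition by "alice", so the class counts as one she has touched.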

Modules are collections of method IDs. A module transfers the methods it describes to a remote memory by creating a copy of itself in the remote system, and guiding that system through a synchronization of the two memories. By “synchronization”, I mean that the original module is smart about not installing components which are already present in the remote memory. Many of the questions that an original module will ask its remote copy are ultimately answered by the edit history of the module copy’s system.
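The "install only what is missing" behavior can be sketched in a few lines. This Python example is a hypothetical simplification: dictionaries stand in for the two memories, strings stand in for method IDs, and the remote messaging between the module and its copy is collapsed into a direct membership check.

```python
def synchronize(module_method_ids, local_store, remote_store):
    """Install in remote_store only the methods it lacks; return what was sent."""
    # In Naiad this question would be answered by the remote edit history.
    missing = [mid for mid in module_method_ids if mid not in remote_store]
    for mid in missing:
        remote_store[mid] = local_store[mid]
    return missing


local_store = {
    "Point>>+ v1": b"\x10\x11",
    "Point>>x v1": b"\x00",
    "Point>>y v1": b"\x01",
}
remote_store = {"Point>>x v1": b"\x00"}  # already present remotely
module = ["Point>>+ v1", "Point>>x v1", "Point>>y v1"]

sent = synchronize(module, local_store, remote_store)
```

Only the two absent methods are transferred, and running the synchronization a second time transfers nothing.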

The edit history also augments the traditional Smalltalk source files (“changes” and “sources”). For a given development system, it occupies its own object memory, called a history memory, maintaining a remote messaging connection to the memory it describes, called the subject memory. It’s feasible to give the edit history its own memory because memories are now small by default.

Note that in cases where live system synchronization is not possible due to connectivity constraints, one can still send an entire history memory asynchronously to the requestor and let the synchronization happen locally. Similarly, the history memory that a developer uses can be located on any connected machine. For example, one could use a subject memory on an iPhone that has its history memory in the cloud.

Module Discovery

Finally, Naiad provides a means of finding module objects that one would like to install. It uses Google to index web pages that describe all available modules. Each module page includes a link to a local Spoon web server that, when followed, instructs the local Spoon system about how to install the module.

There are at least two significant things missing from the current release. The first is a reorganization of the virtual machine, which is itself another sprawling and confusing knot (although the least so of the Smalltalk VMs I’ve encountered). The second is support for secure messaging (unless one uses an end-to-end scheme based on SSH tunnels or a VPN). The remote messaging system here is a minimal bootstrapping measure. Clearly one can imagine using something more sophisticated, like TeaTime. Again, though, I think a major simplification would be in order.

I want to extend Spoon beyond Squeak to other Smalltalk implementations. That’s my project for Camp Smalltalk at various conferences. I’m available to apply this work in paid and academic contexts. I’m currently a consultant based in Amsterdam, eager to work with clients worldwide. I would also like to pursue a PhD degree based on this work. I would love to hear from you.

Thanks

Thanks for reading this far, or just being interested enough to see what’s at the end. :) If you have any questions, please feel free to ask! I am usually on the Spoon IRC channel between 9am and 9pm GMT.
