Imagine a world in which we couldn't choose what our programming
languages could talk to; where programmers had to rely on language
vendors for access to libraries, operating system calls, or
devices.
Fortunately, those of us using open source languages generally have
an alternative.
Today's most successful languages are all capable of
being extended more or less easily to integrate with new systems
and devices.
This ease depends on a number of factors, but generally,
dynamic languages (like Smalltalk, Ruby, or Perl) can be harder to
integrate with external libraries than C, C++, or assembly
language.
Most of them provide memory management that differs from the
memory management (if any) of external libraries written in C.
Further, since the execution model is probably different, there is
usually some glue code required between the language and the
extension library.

I recently decided that I was going to make Squeak Smalltalk work
with an open source package that is available as a static
link library.
It took me a while to learn how to do this correctly, so I thought
I'd share what I learned with you.

Squeak (http://www.squeak.org) is an open-source Smalltalk
language development system that comes with a powerful development
environment, graphics frameworks, and a number of other tools.

You can write code to do most of the things you need to do in
Squeak directly in the Smalltalk language.
But it's also possible to extend Squeak using code
that's written in C or another language.

These extensions are called plugins, and contain
primitives, which are named subroutines that can be called
directly from Squeak code.

This article will explain how to make your own plugin in Squeak,
and will take you through the construction of an example plugin.
There is an appendix at the end for use as a quick
language reference.

I assume that the reader has some familiarity with both Smalltalk
and C syntax, and at least some familiarity with non-blocking file
I/O and the select() runtime library function.

Why bother writing primitives?

Why would you choose to write a primitive rather than writing a
method in Smalltalk?

One reason is that the primitive will probably run faster than
Smalltalk.
Since Squeak uses a byte code interpreter, individual
instructions run more slowly than native code produced by a good
compiler would.
For some applications -- realtime streaming video or audio,
compression, crypto algorithms, and JPEG decoding, for
instance -- this gain in speed can make the difference between
an application being usable or not.

Another reason to write a primitive is to use the services of a
pre-existing library.
This could be anything
from native OS services (like sockets, asynchronous file support, or
serial port usage),
to extension libraries like zlib (compression) or
pcre (regular expressions),
to interfaces with other programs (like OLE, Applescript, or X11
servers).

Primitives also let you deal with callbacks from external
sources -- somewhat.
Unfortunately, the Squeak interpreter doesn't let you call
Smalltalk code from external code.
Because of this, the usual idiom is to receive the callback in a
routine written in another language, and signal a Squeak
Semaphore to let a Squeak Process continue running to handle the
condition.

The other important justification for Squeak primitives is to make sure
that Squeak doesn't block while waiting for I/O.
The problem is that Squeak runs in a single OS process, and has
its own multi-tasker internally.
If one Squeak Process blocks at an OS level, no other Squeak
Process can run.
To make non-blocking I/O possible, most ports of Squeak have a
provision for checking I/O events (files or sockets that have
become readable or writable, or sockets that have exceptions) and
calling back to user code in a plugin.
This code then signals a Semaphore as described above in the
discussion of callbacks.
Using this scheme, a Squeak Process that needs I/O service can
start the request and block on a Semaphore until the transfer is
complete, letting other Processes run.

For some examples of existing primitives, you can look at the
classes in the class category VMConstruction-Plugins.
Good examples include:

the DSAPlugin, which is an example of a plugin created for the
purpose of speed. It calls no external libraries.

the Mpeg3Plugin, which is an example of a plugin that accesses an
external library.

the AsynchFilePlugin, which is an example of calling OS services
and using asynchronous notification via semaphore signaling.

About the Spread plugin

I can best demonstrate how to write a plugin by showing you a
concrete example: my Spread plugin.
This is a plugin that I made to interface with an external library,
in this case the Spread library libsp.
Let me introduce you to Spread and take you through the process of
writing this plugin.

Spread (http://www.spread.org) is a group communications
system that allows messaging to groups across the
network.
I want to add Spread capability to Squeak so that I can
experiment with various broadcast, collaboration, and
distributed object schemes.
A Spread system consists of one or more Spread daemons
that receive requests from Spread clients and pass messages
between themselves and between Spread clients.
A Spread client can be in as many groups as it wants,
and it can send messages to as many groups as it wants (even
ones that it doesn't belong to).

After looking at the Spread API documentation and source code,
I saw two choices for connecting Squeak to Spread.

One was to duplicate all the client logic in Smalltalk, down
to sending packets over the network.
I could read the C or Java implementations and duplicate them
in Smalltalk.
This has the advantage of not requiring a compiled plugin, but
has the serious disadvantage of being a lot of work.

The other choice that I saw was to hook Squeak up to the
Spread client library libsp, which is written in
C and is available as a static linker library.
Although this choice requires compilation of a plugin, it has
the advantage of being able to track new versions of Spread
easily by a simple re-compile.
But most important for me, it looks like much less work,
so this is the strategy I chose.

libsp is written in reasonably portable C and is
compilable on all the popular desktop platforms that support the
standard Sockets API.
This means that my plugin potentially can be used on most of the
computers that run Squeak.

The Spread API itself is quite simple.
At its core, it consists of the following functions:

SP_connect()

Connect an application to a Spread daemon.

SP_disconnect()

Disconnect an application from a Spread daemon.

SP_join()

Add a client to a (possibly newly created) group.

SP_leave()

Remove a client from a (possibly nonexistent) group.

SP_multicast() (and its variants)

Send a message to all members of one or more groups.
The message will be marked as having originated from the
sending client.
There are six different levels of service that specify
different guarantees on message reception and ordering.

SP_receive()

Receive the next message on a given connection.
Messages are either regular messages that are explicitly sent
by a client, or are membership notification messages of
various kinds that are sent upon changes to group membership.
SP_receive() will block until a message is
received.

SP_poll()

Return the number of bytes waiting on a particular connection
without blocking.
I could use this by itself rather than using a
Semaphore, but I don't want the additional overhead of having
to poll periodically.

Getting Squeak to use these functions seems straightforward
enough, except for SP_receive().
I don't want to call a blocking function, because if I do,
none of the other Squeak Processes will have a chance to run
until the function completes.
So I'm going to have to avoid calling SP_receive() until I know
that it won't block.
This requires knowing that there are bytes to be received on
the socket that is being used by the Spread client
connection.
Luckily, one of the return values from
SP_connect() is actually a socket file
descriptor (though this isn't documented).

Using the socket file descriptor to test for readiness to
receive requires Squeak to call the runtime library select()
periodically to test whether the file is ready.
This support has already been built into Squeak for the use of
Squeak's native sockets and asynchronous file support, so I need
to hook into it.
Unfortunately, there isn't yet a standard API for this
select() polling, so there will have to be a
platform-specific portion of my Spread plugin.

To make the SpreadPlugin easy to port, I should write it so that
the Smalltalk part doesn't have to change for different
platforms.
So it looks like I'll end up with these files:

SpreadPlugin.c

Platform-independent code.

SpreadPlugin.h

Declarations of types and external
functions needed by the SpreadPlugin.c code.

sq{Platform}Spread.c

Platform-specific code needed
for hooking up the functions in SpreadPlugin.c to
the Spread API and the operating system specific Squeak polling
mechanism.

I'm writing this first for Linux, so my platform-specific file
will be called sqUnixSpread.c.

Anatomy of the plugin

Now that I've figured out a broad strategy, what do I need to get
this plugin to work?
The required pieces are independent of the way I choose to write
the plugin code itself.
From the top down, I will have:

Squeak client code that uses the plugin.

A Squeak class that supplies the interface for the client code.

Methods in the interface class that call functions within the
plugin.

Functions within the plugin that are designed to be called by
Squeak.
These are the primitives.

Other functions required internally by the plugin.

The Spread API itself.

It probably makes sense to define these from the top down, so that
the client interface is as clear and Squeak-friendly as possible.
So let's go over each of these pieces in order from the top down.

At the highest level, my requirements for using this plugin from
Squeak seem pretty simple:

The users of the plugin shouldn't have to know that they're
using a plugin.

I want it to be possible for one or more Processes to be sending
messages while another Process is waiting for messages.

Ideally, the different kinds of Spread messages (regular and
membership) should be represented by instances of different
classes, so I can use polymorphism to dispatch the messages.

There are also some unknowns, and some code that I don't want to write
right now:

I'm not too sure what kind of message polymorphism I need yet,
so I'm going to defer that decision until I've had some
experience using the plugin.
In other words, I'll start with a single SpreadMessage class
that looks very much like the messages I receive from
SP_receive().

I'm not sure how I'm going to handle the probable situation
where different Processes want to receive messages that were
sent to different groups.
I'll assume for now that I'm going to build a dispatcher on top
of whatever plugin interface I come up with.

The next level below the client code is the interface class.
Since all of the operations in the Spread API either require or
return mailboxes (which identify individual connections,
and are represented by socket file descriptors), it makes sense
to have the interface class represent a connection.
I'll call it a SpreadConnection.
It will have to present the appropriate API for client code, of
course, but it will also have to hold whatever data I need to
represent the state of the connection itself for the use of the
plugin code.

This state data includes at least the file descriptor
returned from SP_connect() and the Semaphore
used to block a single Squeak Process while waiting for a
receive.
It might also be nice to maintain whatever data pertaining to
the connection that SP_connect() returns for the
sake of client code, though I might not actually need it.
So I'll add the private group name that is returned by
SP_connect().
Maybe later I'll also save the name and/or port of the Spread
daemon for error reporting, but not now.

At first, I'll make the interface of the
SpreadConnection class mirror the Spread API.
This will make debugging easier, but may not be appropriate
for final use.
As I discover more about the needs of my programs that use
this plugin, I can add to or change the interface.

All the Spread API calls return a numeric error code of some
sort; some of the calls also pass back a byte count in the
error code.
I'm going to return the same error code from my Smalltalk
API for the time being, because it makes testing easier.

Connect to the Spread daemon named by daemonName, with a
(daemon-)unique privateName (if nil, one is assigned).
wantsGroupMembershipMessages indicates whether or not group
membership messages will be sent to me. Answer the Spread
error code.

disconnect

Disconnect from Spread.
Wake up all Processes that were waiting for data to appear on
this connection. Answer the Spread error code.

Send the message mess, with user message type
messType (16 bits),
and Spread service type serviceType, to the groups whose names are listed
in groups.
Answer the Spread error code.

poll

Answer the number of bytes waiting to be read, without
blocking.

receive

Block the current Process if necessary until data is ready,
then answer a single message.

After thinking about it a while, I realize that these
connections will have to be able to withstand an image save
and startup.
However, if part of their state is a file descriptor, that
descriptor will certainly not be valid when the image comes
back up.
I could either disconnect all the connections on a shutdown
using a class shutDown method, or I could just
keep track of whether they're valid somehow.
I think I'll do both, because I also need to be able to close
out connections when they get garbage collected, if someone
forgets to disconnect them.
And I need to tell if a connection that has been closed is
safe to use.
So I'm going to add a way to query validity from Smalltalk
(I get to figure out how to do this later):

isValid

Answer whether I am a valid connection.
If I am not, all other Spread API calls will cause an Error.

Now that I've mapped out the top SpreadConnection
layer, I have to actually call from Squeak to the primitives in
the plugin.
I'll call these interface methods here.
Squeak has a special syntax for these calls.
They look like this:

The first line of these calls looks like the first line of
a normal Smalltalk method, with the name of the method and
arguments, if any.
This is followed by the special syntax
<primitive: 'primitiveIsValid' module:'SpreadPlugin'>
which calls a named primitive (in this case,
primitiveIsValid())
in a named plugin (here SpreadPlugin).

After the primitive call is Smalltalk code. This code is
only run if the primitive call fails for some reason.
Reasons for a primitive failing include not having the proper
plugin, not being able to load it because of library
dependencies or sending the wrong kind of parameters.
Since the SpreadPlugin requires the use of the
primitives and can't be effectively replaced by Smalltalk
code, all of the plugin's primitive calls raise an exception
if they fail, except for
primIsValid: and primConnect:... .

Another thing that the interface code can do easily is to
translate and prepare argument data for the primitive, and to
modify the output data from the primitive.
I've found that it's often easier to do this kind of translation in
Smalltalk than down in the primitive code.
For instance, C often wants NUL-terminated strings.
But Squeak's strings have a count and no NUL.
My first draft of a couple of these interface methods passed the primitive
the string and its count, so that I could avoid counting the string
in the primitive.
Some translation that did survive my optimization is the packing
and unpacking of group names.
SP_multicast() takes a list of groups as an input, and
SP_receive()
returns a list of groups.
The C interface to the Spread plugin expects these names to be in
fixed-size arrays, 32 bytes per group name.
However, it's much more natural for Squeak to pass around
collections of group names.
Rather than requiring a specific data type to be passed (say,
requiring an Array of Strings),
I allocate and stuff a ByteArray with the
characters going in to the SP_multicast() call,
and I allocate a buffer of a nominal size to
return the groups from the SP_receive() call.
Luckily, Spread will return an error code telling me if my
buffers are too short, and will also indicate how long they have to
be.
So my Smalltalk code allocates nominally sized buffers, calls the
primitive, and then reallocates buffers and calls the primitive
again until the buffers are big enough.
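
My retry loop lives in Smalltalk, but the same idea can be sketched in C. Everything named here is hypothetical: the error codes, the receive_fn callback type, and fake_receive() are stand-ins for illustration, not Spread's real API or error values.

```c
#include <stdlib.h>
#include <string.h>

#define ERR_TOO_SHORT (-1)   /* hypothetical code, not Spread's */
#define ERR_NO_MEMORY (-2)   /* hypothetical code, not Spread's */

/* Hypothetical receive call: fills buf and answers the byte count, or
   answers ERR_TOO_SHORT and stores the required size in *needed. */
typedef int (*receive_fn)(char *buf, size_t cap, size_t *needed);

/* Call receive with a nominal buffer, growing it until the data fits.
   On return *out holds the buffer (caller frees it) or NULL. */
static int receive_with_retry(receive_fn receive, size_t initial_cap, char **out)
{
    size_t cap = initial_cap > 0 ? initial_cap : 1;
    char *buf = malloc(cap);

    *out = NULL;
    while (buf != NULL) {
        size_t needed = 0;
        int rc = receive(buf, cap, &needed);
        if (rc != ERR_TOO_SHORT) {
            *out = buf;             /* success, or an unrelated error */
            return rc;
        }
        free(buf);
        if (needed <= cap)
            return ERR_TOO_SHORT;   /* no usable size hint; give up */
        cap = needed;
        buf = malloc(cap);
    }
    return ERR_NO_MEMORY;
}

/* Stand-in for the real call, so the loop can be exercised. */
static const char fake_message[] = "hello, spread";

static int fake_receive(char *buf, size_t cap, size_t *needed)
{
    size_t len = sizeof fake_message - 1;
    if (cap < len) {
        *needed = len;
        return ERR_TOO_SHORT;
    }
    memcpy(buf, fake_message, len);
    return (int)len;
}
```

Starting with a 4-byte buffer, receive_with_retry(fake_receive, 4, &buf) fails once, reallocates to the reported size, and succeeds on the second call.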

Now I'm ready to specify how the interface methods look.
These look very similar to the higher level interface; the
differences are mostly because some of these API calls have
multiple return values that have to be returned in Smalltalk
objects.
The interface section of the SpreadConnection
class looks like this:

Here the collection of groups passed in to the higher level
routine has been copied to an array of fixed-length character
arrays, as expected by the C interface to
SP_multicast().
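
I do this packing in Smalltalk, but for readers who think in C, the layout can be sketched as follows. The 32-byte slot size comes from the discussion above; the function name pack_group_names() and the zero-padding of each slot are my assumptions for illustration.

```c
#include <stddef.h>
#include <string.h>

#define GROUP_NAME_SIZE 32   /* bytes per group name, as described above */

/* Copy count names into out, which must hold count * GROUP_NAME_SIZE
   bytes. Each slot is zero-padded, so every name ends up NUL-terminated.
   Answers 0 on success, or -1 if a name does not fit in a slot. */
static int pack_group_names(const char *const *names, size_t count, char *out)
{
    size_t i;

    for (i = 0; i < count; i++) {
        size_t len = strlen(names[i]);
        char *slot = out + i * GROUP_NAME_SIZE;

        if (len >= GROUP_NAME_SIZE)
            return -1;
        memset(slot, 0, GROUP_NAME_SIZE);
        memcpy(slot, names[i], len);
    }
    return 0;
}
```

The packed buffer can then be handed to C code that expects an array of fixed-length character arrays.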

primPoll

primReceive: message dropRecv: drop

In this primitive, a message object gets all the return
values from SP_receive() except for the error
code.

The linkage between Squeak and the named primitives in the
plugin is managed by Squeak, which will load the external
library (or hook up the internal library) when needed and
arrange for the method calls to call the primitives.

Primitives don't take arguments or return values like, say, C
functions do;
rather, they get inputs from and leave their output on
the Squeak object stack.
Further, the contents of the Squeak stack aren't C objects,
so translation is usually needed before your C code can do
anything with the arguments or receiver of the message.
You also have to translate the C return value back into a
Smalltalk object.

I could write my primitives directly in C, but decided
instead to have the C generated by a compiler for a language
called Slang.
This compiler comes standard with Squeak.
Its syntax is that of Smalltalk, but it generates plain old
(non-object) C code.
So method calls become regular function calls, plugin instance
variables become plugin globals, and other Smalltalk
expressions become C expressions.

All of the Squeak plugins that I know about have been compiled
from C.
And most (but not all) of this C has been generated from Slang code in
the image.
There's nothing magical about either C or Slang;
since all that is needed is to have functions in a library that
have C calling convention, I could write my primitives in
(non-object) C++, or assembly language, or Delphi, or any other
language that was compatible.

If I were writing in C, I'd declare the primitives as
taking no arguments and returning an (unused) int or
void.
However, I'm using Slang to generate my C, so the Slang compiler will
automatically generate code to deal with the Squeak stack.

Because I'm using Slang, the primitives are declared just
like the interface methods.
However, their exported names are the names given in the
interface methods (i.e. primitiveReceive rather
than primitiveReceive:dropRecv:).
They declare these exported names in the Slang code.

There are also three methods that will be called in
plugins that define them; their names are fixed by the runtime system.
Since I need both startup and shutdown processing, I define
all of them:

initialiseModule

This routine is optional; if a plugin
exports this, it will be called immediately after loading the
plugin.
In my code, I use it to initialize a global data structure.

shutdownModule

This routine is optional; if a plugin
exports this, it will be called at shutdown time.
It will also be called if you manually unload your module,
like this:

Smalltalk unloadModule: 'SpreadPlugin'.

Manually unloading modules can be handy during module
development, as it lets you re-compile an external plugin and
test it without leaving Squeak.

setInterpreter()

This routine is required, and will be
generated automatically by Slang if you use it.
If you're writing your primitives directly in C, you have to
provide this function.
It takes a single int argument, which is
actually a pointer to the interpreterProxy.
Plugin code saves this in a file-scoped global called
interpreterProxy.

Then there's also an internal routine that my plugin code
will use to convert a SpreadConnection Smalltalk object into
a C struct SpreadConnection:

mboxPointerFrom:

Inside a primitive

Now that I've mapped out the general structure of my plugin, let me
talk about the primitives themselves.

The primitives have access to whatever module-wide variables
they declare (which appear as plugin instance variables using
Slang), as well as a global pointer to something called the
interpreterProxy.
This is a C structure that has pointers to many of the
Interpreter's methods, for use by the plugin code.
The pointer is set up by a call to a function within the plugin
called setInterpreter() that is called before any
primitives.
Within a primitive, the "instance variable"
interpreterProxy refers to this C structure.
It is through the interpreterProxy that all stack
access, memory allocation and most object conversion is done.

Since I have to call platform-specific code (in this case, for
the asynchronous notification from select()), the
place to do it is not in the primitive method itself, but in a
separate function called by the primitive.
Remember, I have separated all the platform-specific bits into a separate
file called sqUnixSpread.c.

So what does the primitive see when it's run?
When a primitive is called, the receiver (in my case a
SpreadConnection object) and any arguments to
the method are pushed on the stack in left-to-right
order.
So slot 0 (the top of stack) is the last argument in a call
with N parameters,
N-1 is the first argument,
and N is the receiver.

The stack holds 32-bit numbers which are called
Oops.
There are two kinds of Oops: pointers to objects, and
SmallIntegers.
Since the SmallInteger objects are common,
small, and immutable, not using a pointer saves lots of
space and time.
To tell the difference between a SmallInteger
and a pointer, Squeak uses the low bit of an Oop to mark
it as a SmallInteger.
This leaves SmallIntegers with a 31-bit
range.
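
The tagging scheme is easy to sketch in C. This is an illustration rather than the interpreter's own code: the names oop_t, small_integer_oop(), and small_integer_value() are hypothetical, and I'm assuming that a set low bit means SmallInteger while object pointers, being word-aligned, have a clear low bit.

```c
#include <stdint.h>

typedef uint32_t oop_t;                 /* hypothetical name for a 32-bit Oop */

#define SMALL_INT_MIN (-1073741824)     /* -2^30 */
#define SMALL_INT_MAX   1073741823      /*  2^30 - 1 */

/* A set low bit marks the Oop as a SmallInteger. */
static int is_small_integer(oop_t oop)
{
    return (oop & 1u) != 0;
}

/* Only 31 bits are left for the value once the tag bit is taken. */
static int fits_small_integer(int32_t value)
{
    return value >= SMALL_INT_MIN && value <= SMALL_INT_MAX;
}

/* Shift the value up one bit and set the tag bit. */
static oop_t small_integer_oop(int32_t value)
{
    return ((uint32_t)value << 1) | 1u;
}

/* Remove the tag bit and recover the signed value. */
static int32_t small_integer_value(oop_t oop)
{
    int32_t tagged = (int32_t)oop;   /* assumes two's complement */
    return (tagged - 1) / 2;         /* tagged is odd, so this is exact */
}
```

For example, the integer 5 becomes the Oop 11 (binary 1011), and decoding 11 gives 5 back.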

Inside the primitives, I convert the arguments into primitive (C)
types, do whatever other preparation is required (like
allocating temporary buffers) and then call the Spread API
functions.
I then convert the return values from these functions (which are numeric error
codes) into Squeak SmallInteger objects, drop all
the arguments and receiver from the stack, and push the
converted error code.
If there is an error at the primitive level that keeps me from
even calling the Spread function, then the primitive leaves the
stack intact (for the Smalltalk cleanup code), and signals its failure
by calling interpreterProxy->primitiveFail() or
interpreterProxy->success(aBoolean) with a false argument;
interpreterProxy->failed() then reports that a failure has been recorded.
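
The stack conventions above (slot numbering, popping the receiver and arguments on success, leaving everything in place on failure) can be sketched with a toy model. The struct toy_vm and its helpers are made up for illustration and hold plain ints rather than Oops; the real work goes through the interpreterProxy.

```c
#include <stdbool.h>

#define STACK_DEPTH 16

/* Toy stand-in for the interpreter state; NOT the real interpreterProxy. */
struct toy_vm {
    int  stack[STACK_DEPTH];
    int  depth;     /* number of live slots */
    bool failed;    /* primitive failure flag */
};

static void toy_push(struct toy_vm *vm, int value) { vm->stack[vm->depth++] = value; }
static void toy_pop(struct toy_vm *vm, int count)  { vm->depth -= count; }

/* Slot 0 is the top of the stack, slot 1 the one below it, and so on. */
static int toy_stack_value(const struct toy_vm *vm, int offset)
{
    return vm->stack[vm->depth - 1 - offset];
}

/* A two-argument "primitive" that answers firstArg / secondArg.
   The receiver sits in slot 2 and is not otherwise used here. */
static void toy_primitive_divide(struct toy_vm *vm)
{
    int divisor  = toy_stack_value(vm, 0);   /* last argument  */
    int dividend = toy_stack_value(vm, 1);   /* first argument */

    if (divisor == 0) {
        vm->failed = true;   /* fail: leave the stack exactly as it was */
        return;
    }
    toy_pop(vm, 3);                      /* two arguments plus the receiver */
    toy_push(vm, dividend / divisor);    /* leave the result on the stack */
}
```

After a successful call the stack holds only the result; after a failed call it still holds the receiver and both arguments, so the fallback Smalltalk code can run.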

Ordinarily, since Squeak is single-threaded from an OS point of
view, the rest of Squeak doesn't run while a primitive is
being executed.
Because of this, I'm free to use pointers to Squeak
ByteArrays, Strings,
SmallIntegers, etc. in my primitive.

However, because Squeak moves objects around during garbage
collection, I can't hang on to a pointer to a Squeak object
after the primitive ends.
So buffers and other structures that have to be accessed
asynchronously by C code (in my case, the socket file descriptor
and semaphore index) must be allocated from C code.
In my primitiveConnect call, I only need to hang on
to the file descriptor, which is an integer.
When I pass this to the asynchronous notification code, that
code copies it into memory it allocated.

I could, if I needed to, also call back to the
interpreterProxy to allocate Smalltalk objects.
This is one alternative for making variant return values from a
primitive.
However, when Squeak is asked to allocate an object, it may decide
to do a garbage collection.
So any pointers to Squeak objects will not necessarily be
valid after allocating a Squeak object.

To deal with the moving objects problem, you can either copy
data into locally allocated memory, or you can take advantage
of a technique for protecting objects from the garbage
collector.

The interpreter supports a separate stack of objects that
the garbage collector keeps track of.
Within a primitive, you can push Squeak Oops onto this stack
before allocating; if the GC moves those objects, it updates the
saved Oops, so the values you pop afterwards are valid again.
So if you're doing allocation of Squeak objects and require
access to the internals of other Squeak objects later, you
should use this technique.
In Slang, this means bracketing the allocation with sends of
pushRemappableOop: and popRemappableOop to the interpreterProxy
(the C version merely has a different syntax).
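
To show the shape of the technique without a running VM, here is a rough C analogue built on a toy heap. All of the names (push_remappable_oop(), pop_remappable_oop(), toy_allocate()) are stand-ins I made up, an "oop" here is just an array index, and the toy collector moves every object on every allocation, which exaggerates what a real collector does. It also assumes, as I understand the mechanism, that saved oops are updated when objects move rather than the objects being pinned.

```c
#define HEAP_SLOTS  8
#define REMAP_DEPTH 4

/* Toy object memory: an "oop" is an index into heap[]. */
static int heap[HEAP_SLOTS];
static int heap_used;
static int remap_stack[REMAP_DEPTH];
static int remap_top;

static void push_remappable_oop(int oop) { remap_stack[remap_top++] = oop; }
static int  pop_remappable_oop(void)     { return remap_stack[--remap_top]; }

/* Allocate a new object holding value. Every existing object slides up
   one slot, and the oops saved on the remap stack are fixed up to match.
   Answers the new object's oop, or -1 if the heap is full. */
static int toy_allocate(int value)
{
    int i;

    if (heap_used == HEAP_SLOTS)
        return -1;
    for (i = heap_used; i > 0; i--)
        heap[i] = heap[i - 1];
    for (i = 0; i < remap_top; i++)
        remap_stack[i] += 1;
    heap[0] = value;
    heap_used++;
    return 0;
}

/* Allocate a new object, then add its value to an existing object's. */
static int toy_sum_with_new(int existing_oop, int new_value)
{
    int new_oop;

    push_remappable_oop(existing_oop);      /* save it before allocating */
    new_oop = toy_allocate(new_value);      /* existing object may move */
    existing_oop = pop_remappable_oop();    /* reload the updated oop */
    if (new_oop < 0)
        return -1;
    return heap[existing_oop] + heap[new_oop];
}
```

The important part is the push before the allocation and the pop after it; using the old value of existing_oop after toy_allocate() would read the wrong object.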

How to make a plugin

Using Slang to write your primitives

Putting it all together: all the code for one primitive

Let's look at all the code that I need to connect a Squeak method
to a primitive.
For this example, I chose
SpreadConnection>>connectTo:privateName:wantsGroupMembershipMessages:,
which is one of the more complicated routines in this plugin.

At the highest level, I have the Smalltalk method
SpreadConnection>>connectTo: daemonName privateName: privateNameOrNil wantGroupMembershipMessages: wantsGroupMembershipMessages.

It's responsible for:

Initializing instance variables.

In this case, I have to initialize:

wantsGroupMessages

privateName

semaphore

This gets a new Semaphore that is then registered with the
system (via my class register method).

mbox

This is initialized to a ByteArray of 12 bytes.
mbox is what is used to communicate the
connection state (file descriptor, semaphore index, session
ID) to the C code.

Registering the semaphore.

There is a system method called
Smalltalk>>registerExternalObject: that takes a
Semaphore and saves it in a special table of
registered semaphores.
It returns a small positive integer that is then used by
lower-level code to signal the Semaphore.

Translating data types as needed for the next layer.

Here I translate privateName into a
ByteArray, and
wantsGroupMembershipMessages into a
SmallInteger.
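
The 12-byte mbox mentioned above corresponds to three 32-bit fields on the C side. The following is a sketch only: the field names and their order are my assumptions, as are the helper functions mbox_store() and mbox_load(). The plugin itself works through a pointer into the ByteArray, while this sketch copies with memcpy() to stay clear of alignment questions.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical layout of the connection state; field order is assumed. */
struct SpreadConnection {
    int32_t fd;               /* socket file descriptor */
    int32_t semaphoreIndex;   /* index of the registered Semaphore */
    int32_t sessionID;        /* session ID */
};

/* Compile-time check that the struct matches the 12-byte ByteArray. */
typedef char mbox_must_be_12_bytes[
    (sizeof(struct SpreadConnection) == 12) ? 1 : -1];

/* Copy the connection state into the bytes of an mbox. */
static void mbox_store(unsigned char bytes[12], const struct SpreadConnection *c)
{
    memcpy(bytes, c, sizeof *c);
}

/* Copy the connection state back out of the bytes of an mbox. */
static void mbox_load(struct SpreadConnection *c, const unsigned char bytes[12])
{
    memcpy(c, bytes, sizeof *c);
}
```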

At the next level down, I have the interface method that calls the
primitive.
There's nothing too interesting here, except that if the primitive
fails I return an error code rather than raising an exception.
Since this is the first Spread primitive call that will be made,
this allows testing for the presence of the Spread plugin without
throwing exceptions.

Going down to the next level, I have the primitive itself (in the
SpreadPlugin class).
This is written in Slang.
Since I'm using the TestInterpreterPlugin, I can
declare the types of the method arguments using the
primitive:parameters: method.
This way I don't have to remember what slot numbers everything is
in on the stack.
The return value of this method is the receiver Oop (called
connection here), which I pass to the (inline) routine
mboxPointerFrom: to get the pointer to the struct
SpreadConnection that the C code will need.

Then I get the lengths of the two strings, re-check the
interpreterProxy success flag once more just in case, and call the
C code function sqSpreadConnect().
Since this function returns a C int, I have to convert
it into a SmallInteger Oop.

Note the trick here, which I learned from Juan Vuletich's JPEG
plugin code, of using cCode:inSmalltalk: to fool the
compiler into thinking I've used the temporary variables.
Without this, the compiler will complain up to twice for each
variable when I accept a method change.
When I compile it, though, the statement generates no C code at
all (actually it generates a bare semicolon, but I can live
with that).

Here's the C result of the Slang-to-C translation, which is what
I'd have to write in C by hand if I weren't using Slang.
This was translated by VMMaker into the file
src/plugins/SpreadPlugin/SpreadPlugin.c.
Note that if the interpreterProxy reports a failure, the stack is left as it
was; it's only on successful exit that the receiver and parameters
are popped and the return value is pushed on the stack.

The generated SpreadPlugin.c code includes the cross-platform header file
platforms/Cross/plugins/SpreadPlugin/SpreadPlugin.h.
I wrote this by hand along with my sqUnixSpread.c interface code.
In this header, my C connect routine is declared as:

At the lowest level is the platform-specific function
sqSpreadConnect(), which lives in the
platforms/unix/plugins/SpreadPlugin/sqUnixSpread.c file.
This is where the SP_connect() routine is actually called.

Note that I'm making copies of the passed in strings
daemonName and
privateName.
I'm doing this because I have to provide a NUL-terminated string,
and the incoming Smalltalk strings aren't large enough for me to
insert a NUL.
These get freed right after the SP_connect() call.
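
The copy step is simple enough to sketch on its own. This is an illustration rather than the code in sqUnixSpread.c, and the helper name copy_counted_string() is hypothetical.

```c
#include <stdlib.h>
#include <string.h>

/* Answer a freshly allocated, NUL-terminated copy of the first len
   bytes of a counted string, or NULL if allocation fails.
   The caller is responsible for calling free() on the result. */
static char *copy_counted_string(const char *bytes, size_t len)
{
    char *copy = malloc(len + 1);   /* one extra byte for the NUL */

    if (copy == NULL)
        return NULL;
    memcpy(copy, bytes, len);
    copy[len] = '\0';
    return copy;
}
```

The result can be passed to any C function that expects a NUL-terminated string, and released with free() once that call returns.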

If the connect call succeeds, I fill in the fields of
mbox with the
file descriptor, session ID, and semaphore index, then register the
file descriptor with the async IO routines using
aioEnable() and
aioHandle().
I also keep track of which file descriptors this plugin is using so
that at plugin shutdown time I can tell the aio layer
not to watch the file descriptors any more.

This is a somewhat older document that covers numbered primitives.
Some of its discussion (getting to things on the stack, for
instance) has some parts that are still relevant, though the
TestInterpreterPlugin improves the situation a good deal.
(http://www.create.ucsb.edu/squeak/DIYSqPrims.html)