Saturday, 07 Aug 2010

Go Puzzler

Russ Cox explained a while ago how interface types work in the original
implementation of the Go language. But reading about an implementation when your
grasp of the language is shaky is sometimes more confusing than helpful.

For example, I happened to see the beginning of a dispute on the mailing list
[edit: which turned out to be just a misunderstanding] about
whether interface values are copied by value or by reference. I had forgotten
and couldn't find an answer in the Go spec, so I wrote a test program. See
if you can guess what this program does without running it:

The key to understanding the output is knowing that whenever you call a method in Go,
the method sees a copy of its receiver. This is consistent with other languages
when the receiver is either a pointer or an immutable value, but it's a surprise
when the receiver is a struct. [1]

Here's where knowing a little about the implementation might help you guess wrong.
When the value being wrapped by an interface won't fit in a machine word, the
"gc" implementation uses a two-word header with the second word pointing to
some extra data. This allows all interface values to fit in a header of the same size,
and makes them cheaper to pass around since the extra data doesn't need to be copied
most of the time.

However, as far as I can tell, that's just an implementation detail. The Go designers
have cleverly arranged things so that there's no way to modify the extra data of an
interface value without copying it first. So the presence of a pointer in the
implementation doesn't necessarily mean copy-by-reference happens in the language.

Going back to my opening question, when you implement an interface, you can choose
whether it acts like a value type or like a reference type. If you choose a data structure
with no pointers, it will behave like an immutable value. If you use
pointers, then you'll get copy-by-reference semantics for the targets of the
pointers. But there's no way to put a struct-like object in an interface so that it can be both
copied by value (via assignment) and modified in place by calling methods on it.
Even if you implement your objects with a struct, they will behave like
immutable objects when you call methods that take the struct itself as the receiver.

So if you have a value of type interface{}, it could behave like a pointer
or like a large chunk of immutable data. At a high level, it acts like void * in
C or like an Object reference in Java, but the details are different, because you
also get immutable objects for free.

[Edit: clarified second-to-last paragraph.]

[1] The rationale for this puzzling behavior is
that Go copies structs when it passes them as regular parameters, so it should also
copy them when passed in the receiver position. Very consistent, but not all that
intuitive, since we ordinarily think of a method call as sending a message to a
particular object.

The "mutating your receiver does nothing" behavior is sufficiently weird compared to other
languages that I think Go might be easier to learn if modifying a struct receiver were a compiler error.
There's nothing you can do with it that couldn't also be done by explicitly copying the
receiver to a local variable first. The compiler could generate the same code and
newcomers wouldn't be surprised by the copy. Otherwise, my guess is that we'll
eventually have tools that emit a warning for this case.