I’ve been using Go to implement some performance-sensitive services
recently and, so far, it’s been a very nice. When writing Go code, it
often feels like a (slightly verbose) scripting language, but has nice
type-safety and good performance.

But I’ve realized that this feeling is really only true on a
line-to-line basis. What I’ve noticed recently is that a lot of my
code ends up being “collection munging”: creating, modifying, and
transforming collections. Python excels at this – the built-in core
functional working set (map, reduce, and friends) combined with
iterators/generators and list/dict comprehensions allows for some
very concise, readable code. And often it even has decent performance.

But it isn’t always good enough performance, and that was a reason for
moving to Go (or any higher performance, more systems-y
language). But I found myself even missing C++98 because collections
are actually pretty nice there. Iterators and templates allow code
nearly as concise as the Python version, even with all those type
declarations.

Unfortunately generics aren’t coming to Go soon 1. But
iterators are just interfaces. They abstract away the underlying
container to make it look like a sequence of values of some type with
a standard interface (e.g. Next(), Prev(), etc). This allows you
to write at least partially generic code, e.g. a sum function that
assumes an iterator producing a specific type. If each container can
generate an IntIterator, we can write one Sum() for all of
them. Without generics we still need SumInt() vs. SumFloat(),
but at least we don’t need SumIntSlice(), SumIntDictValues(),
SumFloatSlice(), and SumFloatDictValues().

Once I realized that this is what I was missing, I wanted to know what
the canonical iterator pattern in Go is: what’s the interface and
the implementation? I ended up in this
thread
on the golang-nuts mailing list which covers the basic
options for a forward-only iterator (which admittedly simplifies the
problem, mainly focusing on implementation rather than the appropriate
interface, but this is also the most common type of iterator we need).

But this discussion only covered the basics of the different options
and some pitfalls, but didn’t address performance issues and focused
primarily on the iterator’s implementation without much consideration
for the caller’s code, which I think is the more important part. The
whole idea is to provide a good abstraction to the caller and
hopefully not compromise performance. So which of the iterator
patterns gives the best balance? Or, a more realistic question – when
I’m writing code, which version should I use in each case: a) I just want
to get the code working b) I want to extract maximum performance or c)
I want a balance between readability and performance?

Iterator Patterns

To start, let’s review both the implementer and calling code for each
of the options, hopefully giving a clearer summary than that original
mailing list thread. None of the patterns explicitly define an iterator
interface. They either use a
channel, which is
a very natural stand-in for a forward-only iterator, or a function
which produces values for the caller. The two key aspects I want to
look at are a) how does the iterator implementation work and b)
compared to ideal caller code that looks something like

for item := range Foo.Iterator() {
do_ops_with(item)
}

what does the calling code look like? What, if any, hoops do they need
to jump through to get things working with that pattern, and how does
it affect readability/speed of coding?

In all the examples, I’ll use iterating over a []int to make the
code as simple as possible. To show what the caller code looks like,
we’ll implement sum() for each type of iterator.

Pattern 1: Iterator Function with Callback

The first approach looks the least like an
iterator. Instead, we create a function with the body of the loop and
the “iterator” gives a callback for each element:

I’ve written it out in steps to make it clear, but we can write it
more “loop-like” by writing the function directly in the call:

var sum int = 0
IntCallbackIterator(func(val int) {
sum += val
})

Because Go has anonymous functions and with closures, this
doesn’t look too bad. It feels a lot like writing JavaScript. To me,
one of the major drawbacks is that we’ve lost type declaration
elision: we have to explicitly define the full type of the function,
including the type you’re iterating over.

You’ll notice there’s no return value on the callback. We can add
control, e.g. break, by allowing the function to return a
value. Unfortunately, that also then means it always must return a
value, or you need two iterators if you want to have concise code when
you don’t need break.

Pattern 2: Closures

The second pattern also uses anonymous functions with closures, but in
the iterator implementation itself. The basic idea is that the
returned iterator is a generator function that you call repeatedly:

The return value has two components: the iterated value followed by a
bool indicating if there is another value, i.e. has_next. The
initial call to get the iterator also returns this has_next so you
can tell if there are any elements.

I’d hope there’s a better way of implementing that, but this is what I
was able to came up with. There are two reasons it’s ugly. First, you
can’t use a regular for loop because the iteration needs to happen
during successive calls to the returned function. Second, you need to
also indicate whether there is a next value so the caller can
terminate the loop. We could also do this by returning a *int to
indicate the validity of the current value, but then we need to
dereference in the caller.

Again, there might be a better way here.
It looks similar to iterator interfaces in, e.g.,
Java. One major annoyance is that the value can’t be
declared in the third part of the for so we have to declare it
earlier. The entire for line is muddied up with extra control info –
has_next needs to be stored and checked. This is downright ugly
compared to a nice rangefor loop.

Finally, some people like to wrap this type of iterator function into an
interface, i.e. they return an object with an interface like this:

type IntIterator interface {
Next() (int,bool)
}

which directly wraps the function, similar to the style common in
other languages like Java. I don’t think it much matters, but some
people find the it.Next() call to help clarify what the loop is
doing.

Pattern 3: Channels

The final pattern uses channels and fits very nicely into Go. The
basic idea is to fire off a goroutine which generates values and
pushes them into a channel:

Control is returned to the caller who can then just read from the
channel until it’s exhausted, which range does automatically:

var sum int = 0
for val := range IntChannelIterator() {
sum += val
}

This is very Go-ish (or whatever the equivalent of “Pythonic” is for
golang…). Because channels have direct support in the language
through range, the calling code is nice, concise, and clear. And in
theory shouldn’t be that expensive because goroutines are cheap (they
aren’t “real” threads).

This is as good as it’s going to get in terms of simplicity from
the caller’s side, barring the introduction of something like list
comprehensions.

However, there is one issue with this approach. If we do something
like this:

we leave uncollectible garbage. The goroutine continues to “run” in
the background, but is blocked from doing any work after it generates
one value after the calling loop exited since it can’t write to the
channel anymore. I haven’t yet thought much about fixing this
problem. I suspect using a bidirectional channel (or two channels) as
a request/response mechanism would work, but you’d still need to make
sure either side close()d the channel at the right time. Maybe a
defer statement could handle that nicely even with break
statements. Handling all the cases properly might not be trivial and
probably requires the calling code to be more complicated than the
simple for-range loop.

Buffered Channels

Although goroutines are cheaper than threads, context switches still
cost something. If you’re only doing a small amount of work to move
to the next item (as the iterator) and a small amount of work
consuming the item (as the caller), the cost of dealing with the
channel and switching between goroutines could be expensive.

Conveniently, Go, provides buffered channels. They have a buffer of
data instead of just one value. This means that the Go runtime can
choose to fill the entire buffer before switching to a different
goroutine, avoiding the overhead of switching so frequently. The only
implementation change is choosing the buffer size and specifying it in
the make() call for the channel. Instead of

ch := make(chan int)

we can use

ch := make(chan int, 10)

The trick is deciding how large that buffer should be. Unfortunately,
this is completely dependent on the context.

Pattern 4: Stateful Iterators

A common pattern in other languages, as mentioned in the section about
closure-based iterators, is to use an interface with some variant of
HasNext() and Next() functions. ryszard contributed this
implementation, which he refers to as Stateful because they hold the
iteration state in the iterator struct itself (as opposed to the
implicit storage in the closure-based implementation). The iterator
implementation:

In Go, just adding an interface declaration let’s us specify a generic
interface to an IntIterator:

type StatefulIterator interface {
Value() int
Next() bool
}

which would allow use to write generic functions for any data
structure that could generate an iterator with that interface.

I didn’t include this approach initially (beyond noting you could get
it easily from the closure based approach) because this was part of
what I set out to avoid. It may seem superficial, but I would much
prefer an iteration technique that can make the actual iterator not
appear in the calling code if possible – I just find that more
readable. This is a place where I think C++ seriously fails. Not only
do iterators show up, but often the meaning of expressions is very
unclear. For example, any time you use a std::map, your code ends up
littered with it.first and it.second. On the other hand, if you
want more than a simple forward-only iterator, this is the right way
to handle it since, e.g., bidirectional iteration can’t clearly be
specified via a single function.

Finally, the other reason I don’t like some of the approaches that
require additional variables is that I think it’s important to scope
properly where possible. I really dislike holding on to the iterator
outside the for loop it’s used in. I would probably change the
calling code to this:

Evaluation

I wrote up versions of each of these with simple benchmarks courtesy
the lovely Go testing framework. The code is available
here.

I wrote two versions of each. One directly iterates over a []int,
the other introduces a struct and uses a []*struct which it needs to
extract the data from. The goal was to hopefully avoid benefits from
locality. The results don’t vary much, so either we weren’t
benefiting anyway or the allocated data is still laid out nicely. In
either case, having both makes me feel more confident these are
somewhat reflective of real world performance.

The results aren’t too surprising, but the scale of the performance
difference is interesting. The simple but noisier
callback based approach wins on performance. The
closure-based almost-an-iterator approach isn’t too slow in
comparison, a bit under 50% performance hit. Using this approach
instead is probably
only an issue if it’s in an inner loop of performance critical code. Finally,
the most natural channel based implementation is… slow. By a
factor of nearly 50x. Buffering does help though, reducing that to a
factor of 25x.

I’d probably choose the channel approach for most code, then use
callbacks for performance sensitive code. I think the callback code is
cleaner than the closure-based iterator and it only gets too ugly when
you have too many nested “loops”.

Updated: The stateful iterator results are new. Not surprisingly,
since all the state management is done manually, they are on par
with the basic callback version. The two are essentially equivalent
when you use the struct directly – the iterator struct takes the
place of the closure, and both require one function call per
iteration. The interface version is more expensive because it
requires extra runtime checks and indirection to find the correct
function to call.

Apparently, based on some examples from the standard packages, this
style (or a version using a pointer return value to indicate both
availability and value at once) are the “Gothic” way of implementing
this. I went looking for other patterns because I assumed such a
verbose approach wouldn’t be preferred.

Notes

Apparently channel performance has improved significantly in Go over
the past couple of years, but clearly the overhead is still too
much. My guess would be that the overhead of the goroutine scheduler
is the source of the cost, although with only 2 goroutines it’s not
clear why it’s as high as it is.

To me, the biggest drawback to all of these options is that they don’t
all provide the same type, so they aren’t interchangeable. We can’t
modify the iterator implementation independently of the calling code
to get better performance.

The channel approach fits perfectly with the language because range
directly supports iterating over channels. It would be great to be
able to unify these all with a channel interface, but there’s no way
to adapt the more efficient approaches without involving a
goroutine. I’d guess that the overhead is with scheduling, not with
the channel itself (which is effectively a 1-element buffer). It looks
like the best option for unifying them for now is to write boilerplate
wrapper code, i.e. define something like this:

type IntIterator interface {
Next() (int,bool)
}

and expose a function that returns that interface. Unfortunately, this
generator-like function also produces the ugliest, hardest to read (in
my opinion) caller code. So for now, I just use the channel interface
first for ease of implementation and assume I’ll have to pay the
refactoring cost to uglify the calling code if I need better
performance.

What I’d really like to see is some sort of range adapter: provide
an interface in the builtin package that lets you use any iterable
in a range expression. That should be the unifying interface and
allow you to use any of these interchangeably and get really clean
calling code. It could just be the usual HasNext()/Next()
interface seen in other languages. I assume the issue with this is
that there is no way to specify those generically.

Thanks

Thanks to Faolan Cheslack-Postava for editing this post. Thanks to ryszard for the Stateful
Iterator implementations.

I understand why generics have not appeared in Go
yet. I’m glad they are taking their time to make sure they get it
right since generics in many languages have some seriously ugly
warts. On the other hand, I clearly disagree with the Go designers on
how important they are. I think they are an obviously big win for the
language, and the fact that I have a bunch of FooInt(), FooFloat(),
FooString() functions is a testament to that. ↩