Published

This post is about an issue with Scala collections I encountered recently. Although it is purely an issue of Scala standard library, I haven’t faced it for almost 3 years of my Scala development experience.

The consequences of not being familiar with the matter can lead to very painful results: unexpected stack overflow exceptions in your application.

The core

This issue can arise in variety of different forms, so let’s first look at the core: the simplest reproducer. It involves folding over a standard Iterator:

1

2

3

4

5

6

7

valinitial=(0to10).toIterator

valsums=(0to48000).foldLeft(initial){

case(acc,i)=>acc.map(_+i)

}

sums.isEmpty

As any Scala developer knows from day one, foldLeft is stack safe. But don’t let the feeling of safety trick you here. This code blows up the stack:

1

2

3

4

5

6

7

8

9

10

java.lang.StackOverflowError

at scala.collection.Iterator$$anon$10.hasNext(Iterator.scala:447)

at scala.collection.Iterator$$anon$10.hasNext(Iterator.scala:447)

at scala.collection.Iterator$$anon$10.hasNext(Iterator.scala:447)

at scala.collection.Iterator$$anon$10.hasNext(Iterator.scala:447)

at scala.collection.Iterator$$anon$10.hasNext(Iterator.scala:447)

at scala.collection.Iterator$$anon$10.hasNext(Iterator.scala:447)

at scala.collection.Iterator$$anon$10.hasNext(Iterator.scala:447)

at scala.collection.Iterator$$anon$10.hasNext(Iterator.scala:447)

...

And, of course, it is not foldLeft. The biggest “WTF?” moment here is that stack overflow is caused by this trivial-looking call:

1

sums.isEmpty

How is that?

It appears, that for Iterator most transformation methods, like map, filter, etc., do not immediately evaluate the resulting collection. Instead, they build another Iterator on top of the original one.

Calling these transformations several times in a row builds a chain of Iterators. Each one delegates in some way to the respective underlying iterator. Let’s look for the evidence in the source:

1

2

3

4

5

6

7

8

9

10

11

12

traitIterator[+A]extendsTraversableOnce[A]{

self=>

//...

defmap[B](f:A=>B):Iterator[B]=newAbstractIterator[B]{

defhasNext=self.hasNext

defnext()=f(self.next())

}

//...

}

As you can see, hasNext of the new iterator just calls the hasNext of underlying one. At this moment the vulnerability should be pretty clear: it’s possible to build a chain of arbitrary depth, that can eventually implode.

What’s important to note, that it’s not the transformation itself, that blows the stack. To make the iterator crash you have to call something that triggers the chain. It can be something as simple as .toString or .isEmpty. As we will see later, it can make debugging harder.

The faces

In hindsight, this issue looks straightforward and even obvious. However, it can take so many forms and shapes, that it can be a total riddle from the first look. Debugging experience can be very painful, especially when you are not familiar with the issue before it comes to the surface. I had that painful experience, so let’s dive into details.

I guess, this kind of behaviour is fine for Iterator by it’s very nature. Also, one can fairly say, that Iterator is not a frequent citizen of our code bases. I would agree — I can’t recall using it directly. But things are not so simple.

There are places, where underlying Iterator can leak out of well-known standard collections. I, personally, burnt my fingers with mapValues method on standard Map.

Let’s now take it to the next level. The foldLeft reproducer we’ve seen is very focused. Everything is in one place and, thus, quite simple to debug. But with advent of Reactive Programming, various distribution abstractions like actors and streams are used more and more often.

What it means, is that you may have the very same Iterator trap, scattered over multiple stages of your stream. Or some actor can silently build up an iterator chain and then message it to some other actor, which triggers evaluation and blows up the stack.

I my case, an akka-persistence-query stream crashed after several thousands of replayed events. It’s quite mind boggling, when you see a stack overflow error inside the stream stage, where there’re no signs of infinite recursion or complex calculations.

To illustrate, here’s a simplified version of my case. We build an event-sourcing application with CQRS. When streaming events to the query-side, we want to ensure, that each event is processed exactly once by each consumer.

For that purpose, there’s a small Deduplicator stage, which accumulates a map of last processed event numbers. This allows the stage to mark each event as an original or a duplicate separately for each consumer:

// Each consumer is tagged with a String key, which is used in versions Map

objectDeduplicator{

// EventEnvelope => Map[String, Deduplicated]

defapply(initialVersions:Map[String,Long])=

Flow[EventEnvelope]

.statefulMapConcat(()=>{

varlastVersions=initialVersions

(e:EventEnvelope)=>{

// for each consumer key deduplicate the event

// and provide new threshold version

valdeduplicated=lastVersions.mapValues{lastVersion=>

valisOriginal=e.sequenceNr>lastVersion

if(isOriginal)e.sequenceNr->Deduplicated(e,isDuplicate=false)

elselastVersion->Deduplicated(e,isDuplicate=true)

}

// update thresholds

lastVersions=deduplicated.mapValues(_._1)

// pass deduplicated events further down the stream

List(deduplicated.mapValues(_._2))

}

})

}

Although not trivial, the code is reasonably simple. And it is a time bomb.

Notice, that deduplicated value is constructed using mapValues [line 20]. Then, after another mapValues it becomes the stage state for the next stream element [line 26].

So this is effectively the same fold, just an asynchronous one. After some number of events the Iterator chain becomes larger, than our stack can fit in. The bomb is ready.

Last ingredient for the explosion is the trigger. And our innocent Deduplicator guy just passes the trigger down the stream [line 28]. The actual crash happens somewhere in the consumer code, which happens to access the Map.

The fix

One can fairly argue, that Deduplicator is not optimally implemented and can be more efficient. But for the sake of our study, let’s fix it without rewriting.

The solution to an overly long Iterator chain is to break it. First idea that comes to mind is to trigger evaluation of mapValues result immediately.

There are many ways to do that. For example, there are several good suggestions in this SO thread. Developers from Lightbend argued, that .view.force call is the best way to “evaluate” the collection.

Conclusion

So this is it. Now any time you’ll see a stack overflow with Iterator involved, you know what to look for.

While debugging this issue, I was lucky to have access to paid Lightbend subscription. The support was incredible, I saved tons of time using their experience and guidance, while looking for the root cause and eliminating it.