Search This Blog

Idiomatic concurrency: flatMap() vs. parallel() - RxJava FAQ

Simple, effective and safe concurrency was one of the design principles of RxJava. Yet, ironically, it's probably one of the most misunderstood aspects of this library. Let's take a simple example: imagine we have a bunch of UUIDs and for each one of them we must perform a set of tasks. The first problem is to perform I/O intensive operation per each UUID, for example loading an object from a database:

First I'm generating 100 random UUIDs just for the sake of testing. Then for each UUID I'd like to load a record using the following method:

Person slowLoadBy(UUID id) {
//...
}

The implementation of slowLoadBy() is irrelevant, just keep in mind it's slow and blocking. Using subscribe() to invoke slowLoadBy() has many disadvantages:

subscribe() is single-threaded by design and there is no way around it. Each UUID is loaded sequentially

when you call subscribe() you can not transform Person object further. It's a terminal operation

A more robust, and even more broken, approach is to map() each UUID:

Flowable<Person> people = ids
.map(id -> slowLoadBy(id)); //BROKEN

This is very readable but unfortunately broken. Operators, just like subscribers, are single-threaded. This means at any given time only one UUID can be mapped, no concurrency is allowed here as well. To make matters worse, we are inheriting thread/worker from upstream. This has several drawbacks. If the upstream produces events using some dedicated scheduler, we will hijack threads from that scheduler. For example many operators, like interval(), use Schedulers.computation() thread pool transparently. We suddenly start to perform I/O intensive operations on a pool that is totally not suitable for that. Moreover, we slow down the whole pipeline with this one blocking, sequential step. Very, very bad.

You might have heard about this subscribeOn() operator and how it enables concurrency. Indeed, but you have to be very careful when applying it. The following sample is (again) wrong:

The code snippet above is still broken. subscribeOn() (and observeOn() for that matter) barely switch execution to a different worker (thread) without introducing any concurrency. The stream still sequentially processes all events, but on a different thread. In other words - rather than consuming events sequentially on a thread inherited from upstream, we now consume them sequentially on io() thread. So what about this mythical flatMap() operator?

flatMap() operator to the rescue

flatMap() operator enables concurrency by splitting a stream of events into a stream of substreams. But first, one more broken example:

Oh gosh, this is still broken! flatMap() operator logically does two things:

applying the transformation (id -> asyncLoadBy(id)) on each upstream event - this produces Flowable<Flowable<Person>>. This makes sense, for each upstream UUID we get a Flowable<Person> so we end up with a stream of streams of Person objects

then flatMap() tries to subscribe to all of these inner sub-streams at once. Whenever any of the substreams emit a Person event, it is transparently passed as an outcome of outer Flowable.

Technically, flatMap() only creates and subscribes to the first 128 (by default, optional maxConcurrency parameter) substreams. Also when the last substream completes, outer stream of Person completes as well. Now, why on earth is this broken? RxJava doesn't introduce any thread pool unless explicitly asked for. For example this piece of code is still blocking:

But we did use subscribeOn() last time, what's going on? Well, subscribeOn() on the outer stream level basically said that all events should be processed sequentially, within this stream, on a different thread. We didn't say that there should many sub-streams running concurrently. And because all sub-streams are blocking, when RxJava tries to subscribe to all of them, it effectively subscribes sequentially to one after another. asyncLoadBy() is not really async, thus it blocks when flatMap() operator tries to subscribe to it. The fix is easy. Normally you would put subscribeOn() inside asyncLoadBy() but for educational purposes I'll place it directly in the main pipeline:

Now it works like a charm! By default RxJava will take first 128 upstream events (UUIDs), turn them into sub-streams and subscribe to all of them. If sub-streams are asynchronous and highly parallelizable (e.g. network calls), we get 128 concurrent invocations of asyncLoadBy(). The concurrency level (128) is configurable via maxConcurrency parameter:

That was a lot of work, don't you think? Shouldn't concurrency be even more declarative? We no longer deal with Executors and futures, but still, it seems this approach is too error prone. Can't it be as simple as parallel() in Java 8 streams?

Enter ParallelFlowable

Let's first look again at our example and make it even more complex by adding filter():

asyncHasLowRisk() is rather obscure - it either returns a single-element stream when predicate passes or an empty stream when it fails. This is how you emulate filter() using flatMap(). Can we do better? Since RxJava 2.0.5 there is a new operator called... parallel()! It's quite surprising because operator with the same name was removed before RxJava became 1.0 due to many misconceptions and being misused. parallel() in 2.x seems to finally address the problem of idiomatic concurrency in a safe and declarative way. First, let's see some beautiful code!

Just like that! A block of code between parallel() and sequential() runs... in parallel. What do we have here? First of all the new parallel() operator turns Flowable<UUID> into ParallelFlowable<UUID> which has a much smaller API than Flowable. You'll see in a second why. The optional int parameter (10 in our case) defines concurrency, or (as the documentation puts it) how many concurrent "rails" are created. So for us we split single Flowable<Person> into 10 concurrent, independent rails (think: threads). Events from original stream of UUIDs are split (modulo 10) into different rails, sub-streams that are independent from each other. Think of them as sending upstream events into 10 separate threads. But first we have to define where these threads come from - using handy runOn() operator. This is so much better than parallel() on Java 8 streams where you have no control over concurrency level.

At this point we have a ParallelFlowable. When an event appears in upstream (UUID) it is delegated to one of 10 "rails", concurrent, independent pipelines. Pipeline provides a limited subset of operators that are safe to run concurrently, e.g. map() and filter(), but also reduce(). There is no buffer(), take() etc. as their semantics are unclear when invoked on many sub-streams at once. Our blocking slowLoadBy() as well as hasLowRisk() are still invoked sequentially, but only within single "rail". Because we now have 10 concurrent "rails", we effectively parallelized them without much effort.

When events reach the end of sub-stream ("rail") they encounter sequential() operator. This operator turns ParallelFlowable back into Flowable. As long as our mappers and filters are thread-safe, parallel()/sequential() pair provides very easy way of parallelizing streams. One small caveat - you will inevitably get messages reordered. Sequential map() and filter() always preserve order (like most operators). But once you run them within parallel() block, the order is lost. This allows for greater concurrency, but you have to keep that in mind.

Should you use parallel() rather than nested flatMap() to parallelize your code? It's up to you, but parallel() seems to be much easier to read and grasp.

Ex: I create a stream of 3 strings: A, B, C and I introduce high latency for C and the same latency for A and B. By using the operator first() I unsubscribe too faster to C and C does not have the time to be executed so the unscription kill C. Is there a easy way to leave C continuing running in the background???