Wednesday, September 30, 2015

Introduction

In this final part about Subjects, I'll show how to implement a PublishSubject. To make things a bit more interesting, I'll implement it in a way that prevents PublishSubject from overflowing its child subscribers when they don't request fast enough.

PublishSubject

The main body of the PublishSubject will look very similar to UnicastSubject from the last post, therefore, I'm going to skip those details and focus on the State class.

The State class will look different because in PublishSubject, there could be many Subscribers at once, each with its own unsubscribed, requested and wip counters. Therefore State can't implement Producer and Subscription anymore. We need to implement those in a new class, SubscriberState, and have an instance for each child Subscriber.

One thing before detailing the classes, namely the backpressure handling strategy. We want to give users the ability to specify the backpressure behavior of the PublishSubject, so they don't have to apply onBackpressureXXX methods on the output. For this, I define an enum with 3 values:
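A minimal sketch of such an enum follows. The type name BackpressureStrategy and the three constant names are my assumptions for illustration; they describe what the subject could do with a value when a child Subscriber has no outstanding requests.

```java
// Hypothetical names: how the subject reacts to an onNext when a child
// Subscriber currently has no outstanding requests.
enum BackpressureStrategy {
    /** Keep the value in a per-subscriber queue until it is requested. */
    BUFFER,
    /** Silently discard the value for that subscriber. */
    DROP,
    /** Signal an error to that subscriber instead of the value. */
    ERROR
}
```

The strategy would be chosen once, when the subject is created, and consulted by each per-subscriber state object.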

Nothing standing out so far. We have a volatile array of SubscriberState instances, add, remove and terminate methods. By using EMPTY, we will avoid allocating an empty array whenever all subscribers unsubscribe. This pattern should be familiar from an earlier post about Subscription containers. Now let's see the implementation of add().

For the sake of diversity, the State class will use synchronized instead of an atomic CAS loop. The block is essentially a copy-on-write implementation. The benefit of such an implementation is that looping through the current array of subscribers is faster; it relies on the observation that many Subjects don't actually serve too many Subscribers at once. If, however, one encounters a case where the number of subscribers is large, one can use any list or set based container inside the block instead. The drawback there is that one needs a safe way to iterate over the collection, which may only be possible by making a defensive copy every time.
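The copy-on-write idea described above can be sketched roughly as follows. CowSubscribers is a hypothetical, simplified stand-in for the State class: it stores plain objects instead of SubscriberState instances and uses an assumed TERMINATED marker array to reject late subscribers.

```java
import java.util.Arrays;

// Hypothetical, simplified stand-in for State: a copy-on-write array of
// subscribers guarded by synchronized instead of a CAS loop.
final class CowSubscribers<S> {
    static final Object[] EMPTY = new Object[0];
    static final Object[] TERMINATED = new Object[0];

    // volatile so emitters can iterate over a snapshot without locking
    volatile Object[] subscribers = EMPTY;

    /** Returns false if the container has already been terminated. */
    boolean add(S s) {
        synchronized (this) {
            Object[] a = subscribers;
            if (a == TERMINATED) {
                return false;
            }
            Object[] b = Arrays.copyOf(a, a.length + 1);
            b[a.length] = s;
            subscribers = b;
            return true;
        }
    }

    void remove(S s) {
        synchronized (this) {
            Object[] a = subscribers;
            if (a == TERMINATED || a == EMPTY) {
                return;
            }
            int j = -1;
            for (int i = 0; i < a.length; i++) {
                if (a[i] == s) { j = i; break; }
            }
            if (j < 0) {
                return;
            }
            if (a.length == 1) {
                subscribers = EMPTY; // reuse the shared constant, no allocation
                return;
            }
            Object[] b = new Object[a.length - 1];
            System.arraycopy(a, 0, b, 0, j);
            System.arraycopy(a, j + 1, b, j, a.length - j - 1);
            subscribers = b;
        }
    }

    /** Swaps in the TERMINATED marker and returns the last snapshot. */
    Object[] terminate() {
        synchronized (this) {
            Object[] a = subscribers;
            if (a != TERMINATED) {
                subscribers = TERMINATED;
            }
            return a;
        }
    }
}
```

An emitter only needs to read the volatile field once and loop over that snapshot; concurrent add() and remove() calls never modify an array that is being iterated.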

If we detect unsubscription, we clear the queue and remove ourselves from the active set of subscribers (1). However, we don't need to call remove when reaching a done and empty state (2) because at this point, the state contains no subscribers anymore and is also terminated.

A word about BehaviorSubject.

BehaviorSubject is a special kind of subject sitting between a PublishSubject and ReplaySubject: it replays the very last onNext event before relaying other events from then on. One might think it can be emulated via a size-bound ReplaySubject; however, their terminal behaviors differ. A ReplaySubject of size 1 will always replay 1 event and 1 terminal event, whereas a terminated BehaviorSubject won't retain the very last onNext event and will emit only the terminal event.

From a concurrency perspective, the single value-replay creates complications when there is a race between a subscription and onNext. Since the requirement is that no events can be missed from the point where the subscription happens, one has to somehow capture the last value, emit it and then emit any other value.

In the current 1.x implementation of BehaviorSubject, this is achieved via a per-subscriber lock and a split mode: first and next. When the subscription happens, the subscribing thread tries to enter into the first mode which reads out the last value from the subject and emits it. If there is a concurrent onNext call, it is temporarily blocked. Once the first mode finishes, it switches to next mode and now onNext calls are immediately relayed.

Basically, it is an asymmetric emitter loop: if emitNext() wins, emitFirst() won't run, and who is to say the first onNext() the child receives wasn't the last one when the subscription happens asynchronously?

There is, however, a very subtle bug still lurking in this approach. It is possible emitFirst() will emit the same value twice!

In the right interleaved conditions, onNext sets the last value in the state, emitFirst picks up the last state. Then onNext tries to run emitNext() which finds an emitting state and queues the value. Finally, emitFirst notices there is still work to do and dequeues the value and now we have emitted the same value twice.

The solution, although it works, is a bit complicated and can be seen in the RxJava 2.x implementation. Basically, one has to add a version tag to the value, lock out onNext() for a short duration and drop old indexed values when emitting. The clear drawback is that we now have another lock in the execution path and, in theory, any concurrently subscribing child can now block the emitter thread. A lock-free approach is possible, but it requires allocating an immutable value+index pair every time an onNext event appears.
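To make the index idea a bit more tangible, here is a rough sketch of the lock-free flavor. All three classes (Indexed, VersionedLatest, ChildCursor) are hypothetical names of mine, not RxJava types; the sketch assumes onNext calls are serialized and that each child's cursor is only touched from inside its own emitter loop.

```java
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical immutable value+index pair, allocated on every onNext.
final class Indexed<T> {
    final long index;
    final T value;

    Indexed(long index, T value) {
        this.index = index;
        this.value = value;
    }
}

// Hypothetical holder of the subject's latest value.
final class VersionedLatest<T> {
    final AtomicReference<Indexed<T>> latest = new AtomicReference<>();

    /** Called from the serialized onNext: tags the value with the next index. */
    Indexed<T> next(T value) {
        Indexed<T> prev = latest.get();
        long idx = prev == null ? 0L : prev.index + 1;
        Indexed<T> n = new Indexed<>(idx, value);
        latest.set(n); // single writer, so a plain volatile write is enough
        return n;
    }
}

// Hypothetical per-child filter used inside the child's emitter loop.
final class ChildCursor<T> {
    long lastEmitted = -1L;

    /** Returns true if the value is newer than anything emitted so far. */
    boolean accept(Indexed<T> v) {
        if (v.index <= lastEmitted) {
            return false; // already seen via the "first" path: drop it
        }
        lastEmitted = v.index;
        return true;
    }
}
```

With this in place, the interleaving described above becomes harmless: the value picked up by emitFirst and the same value queued by emitNext carry the same index, so the second one is filtered out.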

Conclusion

In this post, I've shown how to implement a PublishSubject with 3 different kinds of backpressure-handling strategies. I consider this as the final piece about Subjects.

If you look into RxJava 1.x, you may see that the standard Subjects aren't implemented this way; however, the 2.x Subjects are. This is no accident: the 2.x implementation comes from lessons learned from the 1.x implementation.

In the next blog post-series, we're going to utilize the knowledge about Subject internals and I'm going to show how to implement ConnectableObservables.

Introduction

Sorry for the delay in posting, I was busy with my professional work and I've been implementing a fully reactive-streams compliant RxJava 2.0 in the meantime.

In this blog post, I'm going to talk about the requirements of building Subjects, the related structures and algorithms and I'm going to build a backpressure-aware special subject, UnicastSubject, with them.

Requirements

Since subjects implement both Observer and Observable, they have to conform to both:

[Observer] onXXX events have to be sequential and expected to be sequential

In addition, since subjects can reach a terminal state via onError() or onCompleted(), we must deal with the situation when a Subscriber subscribes to a Subject after such an event. Clearly, keeping the Subscriber hanging at this point isn't a good idea. The standard RxJava subjects, therefore, re-emit their terminal event to such latecomers (ReplaySubject may emit onNext events before that though).

Given that, we want the UnicastSubject to allow only a single Subscriber, buffer incoming events until this single Subscriber subscribes and replay/relay events while conforming to the backpressure requests of the Subscriber.

Luckily, we already saw all components needed for implementing the UnicastSubject: tracking the Subscriber's presence and using queue-drain to replay/relay events to it.

The main drawback of the Java language regarding RxJava is that there are no extension methods: methods that appear to be part of a class but in reality, they are static methods somewhere else and the compiler turns a fluent invocation to them into a regular imperative static method call. Therefore, a fluent API requires a wrapper class, Observable, to hold onto all operators and methods.

Since we need to customize the subscription actions for each custom Observable, the Observable class has a protected constructor with a callback, OnSubscribe<T> on it. However, Subjects need to both handle the OnSubscribe<T> calls and the onXXX methods at the same time. Java forbids calling instance methods before the constructor calls super, therefore, the following code doesn't compile:

The workaround is somewhat awkward, but nonetheless working: use static factory methods to create Subjects and have a shared state object between which is then used as an OnSubscribe<T> target and serves as the state of the Observable itself (1).
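The construction pattern can be sketched without RxJava on the classpath. MiniObservable, MiniOnSubscribe, MiniSubscriber and MiniSubject below are hypothetical, stripped-down stand-ins for Observable, OnSubscribe, Subscriber and the subject; only the way the shared state object is created and passed around is the point here.

```java
// Hypothetical stand-ins for rx.Subscriber and rx.Observable.OnSubscribe.
interface MiniSubscriber<T> {
    void onNext(T t);
}

interface MiniOnSubscribe<T> {
    void call(MiniSubscriber<T> child);
}

// Hypothetical stand-in for rx.Observable with its protected constructor.
class MiniObservable<T> {
    final MiniOnSubscribe<T> onSubscribe;

    protected MiniObservable(MiniOnSubscribe<T> onSubscribe) {
        this.onSubscribe = onSubscribe;
    }

    final void subscribe(MiniSubscriber<T> child) {
        onSubscribe.call(child);
    }
}

final class MiniSubject<T> extends MiniObservable<T> {

    /** Static factory: the state exists before any constructor runs. */
    static <T> MiniSubject<T> create() {
        State<T> state = new State<>();
        return new MiniSubject<>(state);
    }

    final State<T> state;

    private MiniSubject(State<T> state) {
        super(state);       // the state doubles as the OnSubscribe callback
        this.state = state; // and as the target of the onXXX delegation
    }

    void onNext(T t) {
        state.onNext(t);
    }

    static final class State<T> implements MiniOnSubscribe<T>, MiniSubscriber<T> {
        volatile MiniSubscriber<T> child;

        @Override
        public void call(MiniSubscriber<T> child) {
            this.child = child;
        }

        @Override
        public void onNext(T t) {
            MiniSubscriber<T> c = child;
            if (c != null) {
                c.onNext(t);
            }
        }
    }
}
```

Because the state object is created in the factory method, it can be handed to super() as a plain argument, which sidesteps the restriction on touching instance members before the superclass constructor has run.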

Note that most 1.x Observable extensions, such as the standard Subjects, use separate objects for state and subscription handling. 2.x has been an improvement in this regard and its Subjects also use a single state object for both tasks, similar to what I'm showing here.

Given a single State<T> object, we use it as the OnSubscribe<T> callback and store it inside the UnicastSubject (2, 3). The State class itself implements a bunch of interfaces (4):

OnSubscribe for handling subscription,

Observer, to conveniently have the onXXX methods that the UnicastSubject.onXXX methods delegate to,

Producer, since we know there will be only a single Subscriber which needs only a single Producer to communicate with (saving on allocation and on cross-communication) and

Subscription for handling the unsubscription call coming from the child directly (again saving on allocation and cross-communication).

The implementation of the main onXXX methods is straightforward delegation into the state object:

The child field will hold onto the only subscriber when it arrives and will be set back to null once it leaves. It has to be volatile because hasObservers() needs to check it in thread-safe manner.

We need to make sure only a single Subscriber is let in during the entire lifetime of the subject and we do this via an atomic boolean. Naturally, this field could have been inlined into State by having State extend AtomicBoolean. An alternative, although it requires more work, would have been to use child directly and have a private static Subscriber instance that indicates a terminal state.

This queue will hold onto the incoming values until there is a Subscriber or said subscriber requests some elements. Here, I'm using a single-producer single-consumer linked queue implementation from RxJava, but this queue is relatively expensive due to constant node allocation. In an upcoming PR and in RxJava 2.x, we have a SpscLinkedArrayQueue (it is a slightly modified version of the JCTools SpscUnboundedArrayQueue) that amortizes this allocation by using 'islands' of Spsc buffers.

We hold onto the terminal event (which might be an error) in these two fields. Since error will be written once before done and read after done, it doesn't need to be a volatile in itself.

The child may leave at any time, and it would be infeasible to keep buffering events indefinitely because no one else could observe them after that. This flag, along with the done flag, will be used in the onXXX methods to drop events.

We need to keep track of the requested amount of the child subscriber so we emit values no more than requested.

The wip field is part of the queue-drain approach explained in an earlier post and makes sure only a single thread is emitting values to the child subscriber, if any.

In the next step, let's implement the call() method of OnSubscribe and manage the incoming Subscribers:

If once is not set and we succeed setting it to true atomically, we now have our single child subscriber.

It is important that setting up the unsubscription callback and the producer happens before the store to the child field; otherwise, an asynchronous onNext call may end up running before this setup. This isn't much of a problem in RxJava 1.x but it is a major concern in a reactive-streams compliant Publisher (to which we'd like to port our code more easily).

Once the producer/unsubscription are set up, we set the child reference. Whether or not the child is cancelled at this point, we need to call drain which will take care of replaying buffered values and cleaning up if the child has unsubscribed in the meantime.

If there is/was a subscriber already there, we need to check if the subject is actually terminated.

If the subject is terminated, we simply emit the terminal event (error or completion), similar to how the standard subjects behave.

Otherwise, if there is a subscriber and the subject isn't terminated, we simply emit an IllegalStateException explaining the situation.
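The three outcomes of call() walked through above can be condensed into a small sketch. Child and SingleChildGate are hypothetical names; the producer setup, the unsubscription callback and the drain() call are only indicated by comments because they depend on the rest of the State class.

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical stand-in for rx.Subscriber, reduced to its three signals.
interface Child<T> {
    void onNext(T t);
    void onError(Throwable e);
    void onCompleted();
}

// Hypothetical slice of the State class: only the subscriber admission logic.
final class SingleChildGate<T> {
    final AtomicBoolean once = new AtomicBoolean();
    volatile Child<T> child;
    volatile boolean done;
    Throwable error; // written before done, read after done

    void call(Child<T> c) {
        if (!once.get() && once.compareAndSet(false, true)) {
            // the real subject would register the unsubscription callback
            // and set the producer on c here, before publishing the child
            child = c;
            // the real subject would now call drain() to replay the buffer
        } else if (done) {
            // late subscriber on a terminated subject: re-emit the terminal event
            Throwable e = error;
            if (e != null) {
                c.onError(e);
            } else {
                c.onCompleted();
            }
        } else {
            c.onError(new IllegalStateException("Only a single Subscriber is allowed"));
        }
    }
}
```

The cheap once.get() check in front of the compareAndSet avoids an atomic write for every late subscriber once the single slot has been taken.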

Next comes onNext, which could be implemented in a simple way and in a more complicated way. Let's see the simpler way:

Not very exciting, is it? If the subject is neither done nor unsubscribed, offer the value to the queue and call drain(). (Note that I'm again omitting the null handling here for brevity.)

The complicated way is to have a fast-path implementation that bypasses the queue if there is no contention and, more importantly, when the child has caught up and thus the queue is empty. Note, however, that such a caught-up state is not permanent because the child can slow down and not request in time. In this case, we still have to put the value into the queue.

This is the entry to the fast-path; if we manage to set the wip counter from 0 to 1, we are in.

We retrieve the requested amount.

We need to check if the requested amount is non-zero and if the queue is empty. This queue check is crucial because otherwise the fast-path would skip the contents of the queue and thus reorder the events.

If the queue happens to be empty and the child runs in bounded mode, we emit the event and decrement the requested amount.

We also decrement the wip count and if it is zero, we simply return. If it is non-zero, it means a concurrent request() call arrived and we need to deal with it. The execution stays in "emission mode" and continues on line (8).

If there is no request or the queue is non-empty, we enqueue the current value and let the drainLoop() deal with the situation (8).

If we couldn't enter the fast-path, we offer the value into the queue and try to enter the drain loop by incrementing wip. If it was actually zero, we enter the drain loop; otherwise, we have indicated to a running drain loop that there is more work to do.

Finally, while still in "emission mode", we jump to the drain loop.
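Putting the steps above together, a fast-path onNext might look roughly like the sketch below. FastPathEmitter is a hypothetical, trimmed-down class: it has no terminal events or unsubscription, it uses a java.util.function.Consumer as the child and a ConcurrentLinkedQueue instead of an SPSC queue, and it assumes onNext is called by one thread at a time.

```java
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.Consumer;

// Hypothetical, trimmed-down emitter: values and requests only.
final class FastPathEmitter<T> {
    final Queue<T> queue = new ConcurrentLinkedQueue<>();
    final AtomicInteger wip = new AtomicInteger();
    final AtomicLong requested = new AtomicLong();
    final Consumer<T> child;

    FastPathEmitter(Consumer<T> child) {
        this.child = child;
    }

    void request(long n) {
        requested.addAndGet(n); // simplified: no capping at Long.MAX_VALUE
        if (wip.getAndIncrement() == 0) {
            drainLoop();
        }
    }

    void onNext(T t) {
        if (wip.get() == 0 && wip.compareAndSet(0, 1)) {
            // fast path entered: nobody else is emitting right now
            long r = requested.get();
            if (r != 0L && queue.isEmpty()) {
                // the child has caught up, so the queue can be bypassed
                child.accept(t);
                if (r != Long.MAX_VALUE) {
                    requested.decrementAndGet();
                }
                if (wip.decrementAndGet() == 0) {
                    return;
                }
                // a concurrent request() arrived: stay in emission mode
            } else {
                // no request or older values pending: keep the order
                queue.offer(t);
            }
        } else {
            queue.offer(t);
            if (wip.getAndIncrement() != 0) {
                return; // a running drain loop has been notified
            }
        }
        drainLoop();
    }

    void drainLoop() {
        int missed = 1;
        for (;;) {
            long r = requested.get();
            long e = 0L;
            while (e != r) {
                T v = queue.poll();
                if (v == null) {
                    break;
                }
                child.accept(v);
                e++;
            }
            if (e != 0L && r != Long.MAX_VALUE) {
                requested.addAndGet(-e);
            }
            missed = wip.addAndGet(-missed);
            if (missed == 0) {
                break;
            }
        }
    }
}
```

The queue.isEmpty() check is what keeps the fast path honest: without it, a value could overtake older values that are still waiting for a request.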

The implementation of the onError and onCompleted isn't that different from the simpler onNext implementation:

In request() (1), we add the amount to the requested value and then call drain. The unsubscribe() is a bit more interesting (2). We set the unsubscribed flag (the get and then set isn't atomic, but it doesn't matter here) then increment the wip counter, which may seem odd. The idea here is that if there is no contention, the transition from 0 to 1 makes sure the cleanup code runs only once and also prevents any further attempts to enter the emission loop by other means. If there is a drain loop running, this will indicate more work is available and since drainLoop() will check for unsubscription before any other action, drainLoop() will call the cleanup for us.

The clear() method clears the queue and nulls out the child. Since this only runs while wip != 0, there is no chance the child gets null and onNext tries to invoke a method on it, yielding NullPointerException. The drain() method simply increments wip and if it was zero, enters the drainLoop().

Instead of decrementing the wip counter one by one, we first assume we only missed 1 drain call. Later on, if we happened to miss multiple calls, missed will be bigger and we'll subtract all of them at once on line (9), reducing the likelihood of looping too many times.

We cache the child and queue fields to avoid re-reading them all the time.

If we don't have a child subscriber, there is nothing we can do as of then.

We have to check if a terminal state has been reached. Since onError and onCompleted travel without requests, we need to do the check before checking the requested amount and quit the loop if so. It is important to remember that checking the done flag before the emptiness of the queue is mandatory, because values added to the queue happen before the setting of the done flag.

We read the requested amount, check if it is unbounded and prepare an emission counter.

We read the done flag, poll an element from the queue and if that value is null, we set an empty flag. The call to checkTerminated again makes sure the unsubscription and terminal events are handled.

We decrement the requested amount and the emission count. By decrementing instead of incrementing, we save a negation on line (8).

If there was emission and the request amount is not unbounded, we decrement the requested amount by the emission amount (e is negative).

Once all that could be done emission-wise, we update the wip counter by subtracting the known missed amount. It is possible more calls were missed and thus the return value won't be zero. In this case, we loop again, otherwise we quit the drain loop.

If we missed an event, it is possible a subscriber arrived thus the child field has to be re-read if we know it as being null locally.
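For reference, the drain loop described in the steps above could be reconstructed along these lines. This is a sketch under assumptions: DrainChild and DrainState are hypothetical names, a ConcurrentLinkedQueue stands in for the SPSC queue, onError is left out (it would mirror onCompleted after storing the error), and the delayed-error flavor of checkTerminated is used.

```java
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical stand-in for rx.Subscriber, reduced to its three signals.
interface DrainChild<T> {
    void onNext(T t);
    void onError(Throwable e);
    void onCompleted();
}

final class DrainState<T> {
    final Queue<T> queue = new ConcurrentLinkedQueue<>();
    final AtomicInteger wip = new AtomicInteger();
    final AtomicLong requested = new AtomicLong();
    volatile DrainChild<T> child;
    volatile boolean done;
    volatile boolean unsubscribed;
    Throwable error; // written before done, read after done

    void onNext(T t) {
        if (done || unsubscribed) return;
        queue.offer(t);
        drain();
    }

    void onCompleted() {
        if (done || unsubscribed) return;
        done = true;
        drain();
    }

    void request(long n) {
        requested.addAndGet(n); // simplified: no capping at Long.MAX_VALUE
        drain();
    }

    void drain() {
        if (wip.getAndIncrement() == 0) drainLoop();
    }

    void drainLoop() {
        int missed = 1; // assume a single missed drain() call at first
        Queue<T> q = queue;
        DrainChild<T> c = child;
        for (;;) {
            if (c != null) {
                // done is read before the emptiness check (left-to-right)
                if (checkTerminated(done, q.isEmpty(), c)) return;
                long r = requested.get();
                boolean unbounded = r == Long.MAX_VALUE;
                long e = 0L; // counts downwards, so it is zero or negative
                while (r != 0L) {
                    boolean d = done;
                    T v = q.poll();
                    boolean empty = v == null;
                    if (checkTerminated(d, empty, c)) return;
                    if (empty) break;
                    c.onNext(v);
                    r--;
                    e--;
                }
                if (e != 0L && !unbounded) requested.addAndGet(e);
            }
            missed = wip.addAndGet(-missed); // subtract all known misses at once
            if (missed == 0) break;
            if (c == null) c = child; // a subscriber may have arrived meanwhile
        }
    }

    // delayed-error flavor: terminal events wait until the queue is empty
    boolean checkTerminated(boolean d, boolean empty, DrainChild<T> c) {
        if (unsubscribed) {
            queue.clear();
            child = null;
            return true;
        }
        if (d && empty) {
            unsubscribed = true;
            child = null;
            Throwable e = error;
            if (e != null) c.onError(e); else c.onCompleted();
            return true;
        }
        return false;
    }
}
```

Note that when checkTerminated returns true, the loop exits without decrementing wip. The counter then stays non-zero, which locks out any later attempt to enter the emission loop.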

The final method is checkTerminated, and then we are done. Depending on whether one wants to delay errors or not, there are two possible implementations. If errors should be delayed, the method looks like this:

Here, we first check for the unsubscribed flag, if set, we clear the queue, null out the child field and indicate the loop should simply quit (1). Otherwise, we check if the source is done and the queue is empty, at which point we set the unsubscribed flag (for convenience), null out the child and emit the terminal event. Any other case will keep the loop running (i.e., done but queue not empty).

The alternative implementation sends an error as soon as it is detected, ignoring the contents of the queue.

If the done flag is set, we check if there is an associated error as well. In this case, we clear the queue, null out the child via clear() and emit the error (1). Otherwise, we need to check if the queue is empty since we first have to emit the events and only then complete the child (2). There is no need to clear the queue since we know it to be empty at this point, hence no call to clear().
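The two flavors can be put side by side in a small sketch. TerminalSink and TerminalCheck are hypothetical names; clear() is reduced to setting a flag so the difference in behavior is easy to observe, whereas the real method would clear the queue and null out the child.

```java
// Hypothetical stand-in for the terminal half of rx.Subscriber.
interface TerminalSink {
    void onError(Throwable e);
    void onCompleted();
}

// Hypothetical holder of just the fields both variants need.
final class TerminalCheck {
    boolean unsubscribed;
    Throwable error;
    boolean cleared; // records that clear() ran; for illustration only

    void clear() {
        cleared = true;
    }

    /** Delayed errors: any terminal event waits until the queue is empty. */
    boolean checkDelayed(boolean done, boolean empty, TerminalSink child) {
        if (unsubscribed) {
            clear();
            return true;
        }
        if (done && empty) {
            unsubscribed = true;
            Throwable e = error;
            if (e != null) {
                child.onError(e);
            } else {
                child.onCompleted();
            }
            return true;
        }
        return false;
    }

    /** Eager errors: an error cuts ahead of whatever is still queued. */
    boolean checkEager(boolean done, boolean empty, TerminalSink child) {
        if (unsubscribed) {
            clear();
            return true;
        }
        if (done) {
            Throwable e = error;
            if (e != null) {
                clear();
                unsubscribed = true;
                child.onError(e);
                return true;
            } else if (empty) {
                // queue known to be empty, so no clear() is needed
                unsubscribed = true;
                child.onCompleted();
                return true;
            }
        }
        return false;
    }
}
```

Both methods return true when the drain loop should quit. The only behavioral difference is what happens with a pending error while the queue still holds values.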

Conclusion

In this blog post, I've shown how to build a custom UnicastSubject that honors backpressure and allows only a single Subscriber to subscribe to it.

In the next blog post, I'll show how to handle multiple Subscribers within a subject by re-implementing the PublishSubject.