Thursday, August 19, 2010

Rx Part 7 – Hot and Cold Observables

STOP THE PRESS! This series has now been superseded by the online book www.IntroToRx.com. The new site/book offers far better explanations, samples and depth of content. I hope you enjoy!

In this post we will look at how to describe and handle two styles of observable streams:

Streams that are passive and start publishing on request,

Streams that are active and publish regardless of subscriptions.

In this sense passive streams are called Cold and active streams are called Hot. You can draw some similarities between implementations of the IObservable<T> interface and implementations of the IEnumerable<T> interface with regard to Hot and Cold. With IEnumerable<T> you could have an “on demand” collection via the yield return syntax, or you could have eager evaluation by populating a List<T>, for example, and returning that (as per the example below).
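The two IEnumerable<T> styles can be sketched as follows. This is a minimal illustration rather than the original sample; the class name, method names and the Log list are hypothetical, and exist only so you can see when each value is actually produced.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class EnumerableStyles
{
    // Hypothetical log that records when each value is produced.
    public static readonly List<string> Log = new List<string>();

    // "On demand": nothing in this body runs until a caller starts enumerating.
    public static IEnumerable<int> LazyNumbers()
    {
        for (var i = 1; i <= 3; i++)
        {
            Log.Add("lazy produced " + i);
            yield return i;
        }
    }

    // Eager: the whole list is populated before the method returns.
    public static IEnumerable<int> EagerNumbers()
    {
        var result = new List<int>();
        for (var i = 1; i <= 3; i++)
        {
            Log.Add("eager produced " + i);
            result.Add(i);
        }
        return result;
    }
}
```

Calling EnumerableStyles.LazyNumbers() adds nothing to the log until the result is enumerated (for example with First() or a foreach), whereas EnumerableStyles.EagerNumbers() logs all three values before it returns.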

Implementations of IObservable<T> can exhibit similar variations in style.
Examples of Hot observables that could publish regardless of whether there are any subscribers would be:

Mouse movements

Timer events

Broadcasts like ESB channels or UDP network packets

Price ticks from a trading exchange

Some examples of Cold observables would be:

Subscription to a queue

When Rx is used for an asynchronous request

On demand streams

In this post we will look at 3 scenarios in which cold, hot and both cold & hot are implemented.

Cold Observables

In this first example we have a requirement to fetch a list of products from a service. In our implementation we choose to return an IObservable<string>, and as we get the results we publish them until we have the full list, and then we publish an OnCompleted. This is a pretty simple example.

This style of API would allow for a non-blocking call to fetch the list of products and would inform the consumer when the list was complete. This is fairly common stuff, but note that every time this is called, the database will be accessed again.
In the example above I use the Disposable.Create factory method. This factory method just creates an implementation of IDisposable that executes a given action when disposed. This is perfect for doing a Console.WriteLine once the subscription has been disposed.
In the example below, we have a consumer of our above code, but it explicitly only wants up to 3 values (the full set has 128 values). This code illustrates that the Take(3) expression will restrict what the consumer receives, but the GetProducts() method will still publish all of the values.
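The producer and consumer described above might look something like the sketch below. It assumes the System.Reactive NuGet package. A simple loop stands in for the database read, and the ValuesPublished counter and the ProductConsumer class are hypothetical additions so that the effect of Take(3) can be observed.

```csharp
using System;
using System.Collections.Generic;
using System.Reactive.Disposables;
using System.Reactive.Linq;

public static class ProductService
{
    // Hypothetical counter: how many values the producer pushed in total.
    public static int ValuesPublished;

    public static IObservable<string> GetProducts()
    {
        return Observable.Create<string>(observer =>
        {
            // Stand-in for reading 128 rows from a database.
            // This body runs again for every new subscription (cold).
            for (var i = 0; i < 128; i++)
            {
                ValuesPublished++;
                observer.OnNext("Product " + i);
            }
            observer.OnCompleted();
            // Runs the given action when the subscription is disposed.
            return Disposable.Create(() => Console.WriteLine("--Disposed--"));
        });
    }
}

public static class ProductConsumer
{
    // Only wants the first three products.
    public static List<string> FirstThree()
    {
        var received = new List<string>();
        ProductService.GetProducts()
            .Take(3)
            .Subscribe(product => received.Add(product));
        return received;
    }
}
```

After calling ProductConsumer.FirstThree(), the returned list holds three items, yet ProductService.ValuesPublished has grown by 128. The loop in this sketch never checks whether anyone is still listening, so it keeps pushing values that are simply dropped.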

Hot Observables

Trying to come up with an example for Hot Observables has been a real pain. I have started off with examples with some sort of context (streaming stock prices or weather information) but this all seemed to detract from the real working of the code. So I think it is best to step through this slowly with a contrived demo and build it up to a piece of code you might actually want to use.
Let us start with subscribing to an Interval. In the example below we subscribe to the same Observable that is created via the Interval extension method. The delay between the two subscriptions should demonstrate that while they are subscribed to the same observable instance, it is not the same logical stream of data.
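A rough sketch of that experiment, assuming the System.Reactive NuGet package, is shown here. The class name and the log list are hypothetical; the original presumably wrote to the console instead.

```csharp
using System;
using System.Collections.Generic;
using System.Reactive.Linq;
using System.Threading;

public static class ColdIntervalDemo
{
    // Subscribes twice to one Interval instance and returns what each subscriber saw.
    public static List<string> Run(TimeSpan period)
    {
        var log = new List<string>();
        // Interval raises values on a background thread, so guard the list.
        Action<string> write = line => { lock (log) { log.Add(line); } };
        var pause = TimeSpan.FromMilliseconds(period.TotalMilliseconds * 2.5);

        var observable = Observable.Interval(period);
        using (observable.Subscribe(i => write("first subscription : " + i)))
        {
            // Let the first subscription get ahead before the second one joins.
            Thread.Sleep(pause);
            using (observable.Subscribe(i => write("second subscription : " + i)))
            {
                Thread.Sleep(pause);
            }
        }
        lock (log) { return new List<string>(log); }
    }
}
```

Running ColdIntervalDemo.Run(TimeSpan.FromMilliseconds(200)) should show the second subscription starting again from 0 while the first has already moved on. In other words, each subscription gets its own sequence.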

Publish and Connect

If I want to be able to share the actual stream of data and not just the instance of the observable, I can use the Publish() extension method. This will return an IConnectableObservable<T>, which extends IObservable<T> by adding the single Connect() method. By calling Publish() and then Connect(), we can get this functionality.

In the example above the observable variable is an IConnectableObservable<T>, and calling Connect() subscribes it to the underlying observable (the Observable.Interval). In this case we are quick enough to subscribe before the first item is published, but only on the first subscription. The second subscription subscribes late and misses the first publication. We could move the invocation of the Connect() method until after both of the subscriptions have been made, so that even with the Thread.Sleep we won't really subscribe to the underlying observable until after both subscriptions are made. This would be done as follows:
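Both orderings can be sketched with one helper, assuming the System.Reactive NuGet package. The class name, the connectBeforeSubscribing flag and the log list are hypothetical conveniences, not part of the original sample.

```csharp
using System;
using System.Collections.Generic;
using System.Reactive.Disposables;
using System.Reactive.Linq;
using System.Reactive.Subjects;
using System.Threading;

public static class PublishConnectDemo
{
    public static List<string> Run(TimeSpan period, bool connectBeforeSubscribing)
    {
        var log = new List<string>();
        Action<string> write = line => { lock (log) { log.Add(line); } };

        IConnectableObservable<long> observable = Observable.Interval(period).Publish();
        IDisposable connection = Disposable.Empty;

        // Connecting first starts the shared Interval straight away.
        if (connectBeforeSubscribing) connection = observable.Connect();

        var first = observable.Subscribe(i => write("first subscription : " + i));
        Thread.Sleep(TimeSpan.FromMilliseconds(period.TotalMilliseconds * 1.5));
        var second = observable.Subscribe(i => write("second subscription : " + i));

        // Connecting last means neither subscriber can miss a value.
        if (!connectBeforeSubscribing) connection = observable.Connect();
        Thread.Sleep(TimeSpan.FromMilliseconds(period.TotalMilliseconds * 2.5));

        first.Dispose();
        second.Dispose();
        connection.Dispose();
        lock (log) { return new List<string>(log); }
    }
}
```

With connectBeforeSubscribing set to true, the second subscription joins the shared stream part-way through and never sees 0. With it set to false, both subscriptions start from 0 together.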

You can probably imagine how this could be quite useful where an application had the need to share streams of data. In a trading application if you wanted to consume a price stream for a certain asset in more than one place, you would want to reuse that stream and not have to make another subscription to the server providing that data. Publish() and Connect() are perfect solutions for this.

Disposal of connections and subscriptions

What does become interesting is how disposal is performed. What was not covered above is that the Connect() method returns an IDisposable. By disposing of the “connection” you can turn the stream on and off (Connect() to turn it on, and then disposing of the connection to turn it off). In this example we see that the stream can be connected and disconnected multiple times.
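To keep the on/off behaviour deterministic, the sketch below swaps the timer for a Subject<int> that is pushed by hand. That is an assumption made for illustration; the original most likely used an Interval and key presses. It assumes the System.Reactive NuGet package, and the class name is hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Reactive.Linq;
using System.Reactive.Subjects;

public static class ConnectionToggleDemo
{
    public static List<int> Run()
    {
        var received = new List<int>();
        var source = new Subject<int>();          // hypothetical hot source
        var observable = source.Publish();
        observable.Subscribe(i => received.Add(i));

        source.OnNext(1);                         // not connected yet: dropped
        var connection = observable.Connect();    // stream on
        source.OnNext(2);                         // delivered
        connection.Dispose();                     // stream off
        source.OnNext(3);                         // dropped
        connection = observable.Connect();        // stream on again
        source.OnNext(4);                         // delivered
        connection.Dispose();

        return received;
    }
}
```

The subscription itself is never disposed here. Only the connection is toggled, and the subscriber simply resumes receiving values each time the stream is reconnected.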

Let us finally consider automatic disposal of a connection. It would be commonplace for a single stream to be shared between subscriptions, as per the price stream example mentioned above. It would, however, also be commonplace for the developer to want the stream running hot only while there are subscriptions to it. It therefore seems obvious that there should be a mechanism not only for automatically connecting (once a subscription has been made), but also for disconnecting (once there are no more subscriptions) from a stream. First let us look at what happens to a stream when we connect with no subscribers, and then later unsubscribe:

I use the Do extension method to create side effects on the stream (i.e. writing to the console). This allows us to see when the stream is actually connected.

We connect first and then subscribe, which means we can be publishing without any subscriptions.

We dispose of our subscription but don’t dispose of the connection which means the stream will still be running. This means we will be publishing even though there are no subscriptions.
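The three points above can be reproduced with a small sketch, assuming the System.Reactive NuGet package. As an illustrative substitution, a hand-driven Subject<int> stands in for the timer so the outcome is deterministic, and a log list replaces the console; the class name is hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Reactive.Linq;
using System.Reactive.Subjects;

public static class ConnectWithoutSubscribersDemo
{
    public static List<string> Run()
    {
        var log = new List<string>();
        var source = new Subject<int>();                  // hypothetical hot source
        var observable = source
            .Do(i => log.Add("publishing " + i))          // side effect: shows the stream is live
            .Publish();

        var connection = observable.Connect();            // connected, no subscribers yet
        source.OnNext(0);                                 // published to nobody

        var subscription = observable.Subscribe(i => log.Add("subscription : " + i));
        source.OnNext(1);                                 // published and received

        subscription.Dispose();                           // subscriber gone, connection still open
        source.OnNext(2);                                 // still published, to nobody

        connection.Dispose();                             // only now does the stream stop
        source.OnNext(3);                                 // nothing is logged
        return log;
    }
}
```

The returned log contains "publishing 0" and "publishing 2" with no matching subscription lines, which is the wasted work that the next section sets out to remove.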

RefCount

Taking the last example, if we just comment out the line that makes the connection and append one further extension method, RefCount, to the creation of our observable, we have magically implemented all of our requirements. RefCount will take an IConnectableObservable<T> and turn it back into an IObservable<T>, automatically implementing the connect and disconnect behaviour we are looking for.
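A sketch of the RefCount variant, assuming the System.Reactive NuGet package, is below. As before, the hand-driven Subject<int> and the log list are illustrative substitutions for a timer and the console, and the class name is hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Reactive.Linq;
using System.Reactive.Subjects;

public static class RefCountDemo
{
    public static List<string> Run()
    {
        var log = new List<string>();
        var source = new Subject<int>();                  // hypothetical hot source
        IObservable<int> observable = source
            .Do(i => log.Add("publishing " + i))
            .Publish()
            .RefCount();                                  // no explicit Connect() anywhere

        source.OnNext(0);                                 // no subscribers, so not connected

        var subscription = observable.Subscribe(i => log.Add("subscription : " + i));
        source.OnNext(1);                                 // first subscriber triggered the connect

        subscription.Dispose();                           // last subscriber gone, so disconnected
        source.OnNext(2);                                 // nothing is published
        return log;
    }
}
```

Only the value pushed while a subscriber was present shows up in the log, so the stream runs hot only for as long as somebody is listening.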

Other Connectable Observables

While this is a post about Hot and Cold Observables, I think it is worth mentioning the other ways IConnectableObservable<T> can pop up.

Prune

The Prune method is effectively a non-blocking .Last() call. You can consider it similar to an AsyncSubject<T> wrapping your target Observable, so that you get the equivalent semantics of returning only the last value of an observable, and only once it completes.
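The sketch below shows these semantics, assuming the System.Reactive NuGet package. To the best of my knowledge, later releases of Rx expose this operator as PublishLast rather than Prune, so treat that name mapping as an assumption to check against the version you are using. The Subject<int> source, the log list and the class name are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Reactive.Linq;
using System.Reactive.Subjects;

public static class LastValueDemo
{
    public static List<string> Run()
    {
        var log = new List<string>();
        var source = new Subject<int>();                  // hypothetical source
        var observable = source.PublishLast();            // assumed successor to Prune
        observable.Subscribe(i => log.Add("subscription : " + i));

        using (observable.Connect())
        {
            source.OnNext(1);
            source.OnNext(2);
            source.OnNext(3);                             // nothing delivered so far
            log.Add("about to complete");
            source.OnCompleted();                         // only now is the last value delivered
        }
        return log;
    }
}
```

The subscriber receives a single value, 3, and only after the source has completed.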

Replay

The Replay extension method allows you to take an existing Observable and give it “replay” semantics as per the ReplaySubject<T>. As a reminder, the ReplaySubject<T> will cache all values so that any late subscribers will also get all of the values. In this example two subscriptions are made on time, and then a third subscription is made after they complete. Even though the third subscription is made after the OnCompleted, it still gets all of the values.
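A minimal sketch of that scenario, assuming the System.Reactive NuGet package, could look like this. The hand-driven Subject<int>, the log list and the class name are hypothetical stand-ins chosen to keep the result deterministic.

```csharp
using System;
using System.Collections.Generic;
using System.Reactive.Linq;
using System.Reactive.Subjects;

public static class ReplayDemo
{
    public static List<string> Run()
    {
        var log = new List<string>();
        var source = new Subject<int>();                  // hypothetical source
        var observable = source.Replay();                 // caches every value

        using (observable.Connect())
        {
            // Two subscriptions made on time.
            observable.Subscribe(i => log.Add("first subscription : " + i));
            observable.Subscribe(i => log.Add("second subscription : " + i));

            source.OnNext(0);
            source.OnNext(1);
            source.OnCompleted();

            // A third subscription made after completion still gets everything.
            observable.Subscribe(i => log.Add("third subscription : " + i));
        }
        return log;
    }
}
```

The log ends up with six entries: values 0 and 1 for each of the three subscriptions, including the late one.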

Apologies for the delay in my reply. The reason you will not see any progress is because you are subscribing and observing on the same scheduler.

Try changing your one line
    s.Subscribe(Console.WriteLine);
to
    s.SubscribeOn(Scheduler.ThreadPool).Subscribe(Console.WriteLine);

Otherwise you are effectively telling the code to block on the current thread while you run your for loop.

An even easier way to do what you have here is to use the built-in methods:
    var s = Observable.Interval(TimeSpan.FromMilliseconds(300), scheduler)
        .Take(100) // Else it will run forever
        .Finally(() => Console.WriteLine("--Disposed--"));

The latest drop of Rx has some new windowing functionality that would help you here. To answer your question properly I would need to know more:
Are all the values ready, or are they getting streamed to you?
If they are being streamed and you get many values within the period, do you want to buffer the values or ignore all but the latest?
If you were streamed values and in one period no values came through, do you want to OnNext something, or perhaps wait until the next value comes and immediately publish the result, or only publish on the set period if there has been a value?