An Acceptable Blog

Menu

Memoized Dataflow Streams

In reactive programming we can choose between two models: “pull,” in which we run a computation each time output is required, and “push,” in which we run the computation each time input arrives.

Which model we use depends on whether we are working with high-frequency or low-frequency data. If we are writing a piece of avionics software that measures pitch, yaw, and roll, then we need to constantly adjust the plane’s aerodynamic surfaces based on those variables. We don’t need a notification when these variables change, because they change constantly. The pull model would be perfect in this case.

On the other hand, engine temperature is every bit as critical to the health of the vehicle, but presumably that variable remains in equilibrium for long stretches of time, and small variations aren’t important. We don’t want to waste CPU time monitoring temperature 100 cycles per second. We might simply want to receive a notification whenever the engine temperature changes by 1 degree or more. The push model works better here.

The problem: How do I efficiently embed a low-frequency signal in a high-frequency channel? If I pass the low-frequency signal naively, it will work, but entail much redundant computation.

When we want to avoid recomputing a value, we often use a memoization strategy. However, in this case we need to memoize a data stream, not a function.

In the engine temperature example, it would be easy to memoize a function of type Int -> a. But we want to compose this function as part of a signal. After all, if the engine temperature is low frequency, then so is any signal derived from the engine temperature. The chain of transformations should be memoized along its entire length. Further, this function risks artificially escalating a meaningless low-amplitude high-frequency component of an otherwise low frequency signal by imposing an arbitrary boundary: suppose that some engine vibration causes the temperature to oscillate rapidly between 198.9 and 199.05 degrees, which would truncate to 198 and 199? This does not yield the notification heuristic we are looking for.

The solution: Tag information with a unique signature at its point of departure and then memoize it at the point of arrival. Transformations of the data stream also need to be tagged. A source signature is either a unique integer, or an annotation of applying one signature to another. There is some overhead associated with comparing signatures, but this overhead can not be greater than the cost of performing the underlying operations.

Memoizable messages are very similar to applicative functors. They can not, however, implement the Control.Applicative interface, because any pure constructor would be unsigned and therefore destroy memoization.

This memoization scheme requires three operations:

Transmit: Sign a message with a unique signature, indicating its source. If a subsequent signal is sufficiently similar, reuse the same signature.

Receive: Unpack a signal, memoizing against the signature of the previous input.