Benchmark design

Each of the benchmarks in the FireHose suite has two parts: a
front-end generator that writes a stream of events (datums) to one
or more UDP sockets in a text-based format, and a back-end analytic
that reads the stream and processes the datums.
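
The two-part layout can be illustrated with a minimal Python sketch. It is not the FireHose code: the "key,value" line format, the port number, and every function name below are assumptions made for illustration only, and the real generators define their own packet layout.

```python
import socket

# Hypothetical datum format for illustration: one "key,value" line per datum,
# several datums packed into each UDP packet.

def pack_datums(datums):
    """Join (key, value) datums into one newline-terminated ASCII payload."""
    return "".join(f"{key},{value}\n" for key, value in datums).encode("ascii")

def unpack_datums(payload):
    """Split a payload back into (key, value) tuples."""
    lines = payload.decode("ascii").splitlines()
    return [(key, int(value)) for key, value in (line.split(",") for line in lines)]

def send_packets(payloads, host="127.0.0.1", port=55555):
    """Front end: write packets to the socket without waiting for any reply."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        for payload in payloads:
            sock.sendto(payload, (host, port))

def receive_datums(max_packets, host="127.0.0.1", port=55555, timeout=1.0):
    """Back end: read up to max_packets packets and return their datums."""
    datums = []
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.bind((host, port))
        sock.settimeout(timeout)
        for _ in range(max_packets):
            try:
                payload, _sender = sock.recvfrom(65536)
            except socket.timeout:
                break  # stream went quiet; any remaining packets never arrived
            datums.extend(unpack_datums(payload))
    return datums
```

In this sketch the analytic would call receive_datums() in one process while the generator calls send_packets() in another. The sender never blocks on the receiver.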

The generators create a stream of datums with the
following attributes:

reproducible

scalable rate, up to tens of millions of datums/sec

serial or parallel generation
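
As a rough sketch of how these three attributes can coexist, the Python below seeds each generator from a base seed plus a generator ID, so that a run is repeatable and several generators can run side by side, and it paces output toward a target rate. The datum contents, the seeding formula, and the names generate_datums and paced are hypothetical, not taken from the FireHose generators.

```python
import random
import time

def generate_datums(count, seed=12345, gen_id=0):
    """Return a repeatable list of (key, value) datums for one generator.

    The same (seed, gen_id) pair always yields the same stream; different
    gen_id values give parallel generators distinct streams.
    """
    rng = random.Random(seed * 1_000_003 + gen_id)
    return [(rng.randrange(1_000_000), rng.randrange(2)) for _ in range(count)]

def paced(datums, rate, batch=50):
    """Yield batches of datums, sleeping so the average rate stays near rate/sec."""
    start = time.perf_counter()
    sent = 0
    for i in range(0, len(datums), batch):
        chunk = datums[i:i + batch]
        # Time at which this batch is due if the stream runs exactly at `rate`.
        delay = start + sent / rate - time.perf_counter()
        if delay > 0:
            time.sleep(delay)
        yield chunk
        sent += len(chunk)
```

Each batch yielded by paced() would become one packet. A pure-Python loop like this will not reach tens of millions of datums/sec; it only shows the pacing idea.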

The computations performed by the analytics have the
following attributes:

well-defined and relatively easy to implement

allow for use of more sophisticated data structures and algorithms

scoring metrics for the processing rate and accuracy of the result

allow for serial or parallel implementation (shared or distributed memory)
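
One plausible way to report the two kinds of score is sketched below in Python. This is an assumption about how such metrics could be computed, not the scoring procedure defined by the benchmark: it treats the analytic's result as a set of reported keys and compares it with a known set of expected keys.

```python
def score(reported, expected, datums_processed, elapsed_seconds):
    """Summarize processing rate and accuracy for one run (hypothetical metrics)."""
    reported, expected = set(reported), set(expected)
    return {
        # Throughput actually achieved by the analytic.
        "rate": datums_processed / elapsed_seconds,
        # Keys the analytic reported that it should have reported.
        "true_positives": len(reported & expected),
        # Keys reported in error.
        "false_positives": len(reported - expected),
        # Keys that should have been reported but were missed.
        "false_negatives": len(expected - reported),
    }
```

For example, score({"a", "b"}, {"b", "c"}, 1000, 0.5) gives a rate of 2000.0 datums/sec with one true positive, one false positive, and one false negative.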

The FireHose tarball provides code for each generator. The generators
are meant to be used as-is, without modification. The tarball also provides
sample implementations of the analytics, but users are free to
re-implement them in a manner optimized for their streaming framework
or their machine, or with better algorithms, so long as they follow
the rules of the benchmark and measure the relevant scoring metrics.

UDP sockets link the generator to the analytic so that the generator
is not throttled by the analytic, just as in stream-processing
scenarios where data must be processed in real time as it arrives.
Because UDP has no flow control and does not retransmit lost packets,
the analytic will drop datums if it cannot keep up with the generated
stream, which is one of the effects we wish to measure.
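
A simple way to quantify that loss, assuming each packet carries a sequence number and the total number of packets sent is known, is sketched below. Both assumptions, and the name drop_stats, are hypothetical and are not stated in this section.

```python
def drop_stats(seen_sequence_numbers, total_sent):
    """Return (received, dropped, drop_fraction) for one run.

    Duplicated packets are counted once, since UDP may also deliver duplicates.
    """
    received = len(set(seen_sequence_numbers))
    dropped = total_sent - received
    return received, dropped, dropped / total_sent
```

For example, if six packets were sent and the analytic saw sequence numbers 0, 1, 3, 4, 4, then four distinct packets arrived and two were dropped.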