Introduction

flow is a header-only C++11 framework which provides the building blocks for streaming data packets through a graph of data-transforming nodes. Note that this library has nothing to do with computer networking. In the context of this framework, a data packet is a slice of a data stream. A flow::graph will typically be composed of producer nodes, transformer nodes and consumer nodes. A data packet is produced by a single producer node, can then go through any number of transformer nodes and is finally consumed by a single consumer node. Nodes are connected to one another by pipes attached to their input pins and output pins. The graph and base node classes already provide the necessary API to build a graph by connecting nodes together and to run the graph. As a library user, you are only expected to write concrete node classes that perform the tasks you require.

Here's an example of a simple graph. The two producer nodes could be capturing data from some hardware or be generating a steady stream of data on their own. The transformer node processes the data coming in from both producers. The transformer's output data finally goes to a consumer node.

Data flow for a simple graph

Should we need to monitor the data coming in from producer 2, we can tee it to another consumer node. This new consumer node could save all the data it receives to a file or log it in real-time without preserving it. The tee transformer node is an example of a concrete node that duplicates incoming data to all its outputs. It is provided in the framework and can be found in the flow::samples::generic namespace.

Data flow for a graph with a tee transformer node
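To make the idea of duplicating incoming data concrete, here is a minimal sketch of what a tee step has to do when packets are held by std::unique_ptr: every output except one needs its own copy. The names tee_packet, tee_output and tee are hypothetical and exist only for this illustration; they are not the classes found in flow::samples::generic.

```cpp
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical packet type, for illustration only; not flow's packet class.
struct tee_packet
{
    int value;
    explicit tee_packet(int v) : value(v) {}
};

// Hypothetical stand-in for whatever is attached to one output pin.
typedef std::vector<std::unique_ptr<tee_packet>> tee_output;

// Sends one incoming packet to every output.
// All outputs but the last receive a copy; the last receives the original,
// so each packet still has exactly one owner.
void tee(std::unique_ptr<tee_packet> in, std::vector<tee_output>& outputs)
{
    if (!in || outputs.empty())
        return; // nothing to forward; the packet (if any) is destroyed here

    for (std::size_t i = 0; i + 1 < outputs.size(); ++i)
        outputs[i].push_back(std::unique_ptr<tee_packet>(new tee_packet(*in)));

    outputs.back().push_back(std::move(in));
}
```

Handing the original to the last output saves one copy per packet. Whether the framework's own tee node does the same is not something this sketch claims.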

Technical considerations

This implementation:

uses templates heavily.

requires RTTI.

depends on many of C++11's language features and library headers.

has been tested with Visual Studio 2012 RC, GCC 4.6.3 and GCC 4.7.0.

uses CMake as the build and packaging tool. As a user of flow, you do not need to build anything since the library is header-only.

Design principles

Use of std::unique_ptr

When flowing through the graph, data packets are wrapped in std::unique_ptr. This helps memory management tremendously and enforces the idea that, at any point in time, only a single entity (a pipe or a node) is responsible for a data packet.
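The single-owner idea can be illustrated with a small, self-contained sketch. The types sample_packet and sample_pipe below are hypothetical stand-ins written for this example; they are not flow's packet or pipe classes, and the real API may look quite different.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <memory>
#include <utility>

// Hypothetical packet type, for illustration only.
struct sample_packet
{
    int value;
    explicit sample_packet(int v) : value(v) {}
};

// Hypothetical single-owner queue standing in for a pipe.
class sample_pipe
{
public:
    // Takes ownership of the packet; the caller's pointer is left null.
    void push(std::unique_ptr<sample_packet> p)
    {
        assert(p && "cannot push an empty packet");
        m_packets.push_back(std::move(p));
    }

    // Hands ownership to the caller, or returns null if the pipe is empty.
    std::unique_ptr<sample_packet> pop()
    {
        if (m_packets.empty())
            return std::unique_ptr<sample_packet>();

        std::unique_ptr<sample_packet> p = std::move(m_packets.front());
        m_packets.pop_front();
        return p;
    }

    std::size_t size() const { return m_packets.size(); }

private:
    std::deque<std::unique_ptr<sample_packet>> m_packets;
};
```

Because std::unique_ptr cannot be copied, a producer has to write pipe.push(std::move(p)), after which p is null. The compiler therefore rejects any attempt to keep using a packet that has already been handed over.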

A thread per node

flow is multi-threaded in that the graph assigns a thread of execution to each of its nodes. The lifetime of these threads is taken care of by the graph. As a library user, the only multi-threaded code you would write is whatever a node would require to perform its work.

Node state

A node can be in one of three states: paused, started or stopped. When instantiated, a node is in the paused state. When a node is in the started state, a thread of execution is created for it and it is actively consuming and/or producing packets. When a node is in the paused state, it is no longer consuming and/or producing packets. If a concrete node class has internal state, that state should be frozen such that, when the node is restarted, packet processing continues as if the node had not been paused. When a node is in the stopped state, its thread of execution is joined. If a concrete node class has internal state, that state should be reset.

Before a node can transition to a new state, it must be added to a graph. Transitioning between these states is done by calling the corresponding member function of the graph class. For this release, all nodes in a graph are always in the same state. Regardless of the nodes' state, nodes can be added to and removed from a graph at any time and can be connected to and disconnected from another node at any time.
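The three states and their effect on a node's thread and internal state can be sketched as follows. This is an illustration under stated assumptions, not flow's implementation: sample_node, node_state and the member functions are hypothetical, and here the node drives its own transitions whereas in flow the graph does. The sketch also assumes that the thread stays alive but idle while paused, which the description above does not specify.

```cpp
#include <chrono>
#include <mutex>
#include <thread>

enum class node_state { paused, started, stopped };

// Hypothetical node whose "internal state" is a count of processed packets.
class sample_node
{
public:
    sample_node() : m_state(node_state::paused), m_processed(0) {}
    ~sample_node() { stop(); }

    sample_node(const sample_node&) = delete;
    sample_node& operator=(const sample_node&) = delete;

    // Enters the started state; creates the thread if none is running.
    void start()
    {
        std::lock_guard<std::mutex> lock(m_mutex);
        m_state = node_state::started;
        if (!m_thread.joinable())
            m_thread = std::thread(&sample_node::run, this);
    }

    // Enters the paused state; m_processed is left untouched (frozen).
    void pause()
    {
        std::lock_guard<std::mutex> lock(m_mutex);
        m_state = node_state::paused;
    }

    // Enters the stopped state, joins the thread and resets internal state.
    void stop()
    {
        {
            std::lock_guard<std::mutex> lock(m_mutex);
            m_state = node_state::stopped;
        }
        if (m_thread.joinable())
            m_thread.join();

        std::lock_guard<std::mutex> lock(m_mutex);
        m_processed = 0;
    }

    node_state state() const
    {
        std::lock_guard<std::mutex> lock(m_mutex);
        return m_state;
    }

    unsigned long processed() const
    {
        std::lock_guard<std::mutex> lock(m_mutex);
        return m_processed;
    }

private:
    // Thread body: does work only while started, exits once stopped.
    void run()
    {
        for (;;)
        {
            {
                std::lock_guard<std::mutex> lock(m_mutex);
                if (m_state == node_state::stopped)
                    return;
                if (m_state == node_state::started)
                    ++m_processed; // stands in for handling one packet
            }
            std::this_thread::sleep_for(std::chrono::milliseconds(1));
        }
    }

    mutable std::mutex m_mutex;
    node_state m_state;
    unsigned long m_processed;
    std::thread m_thread;
};
```

Checking the state and doing the work under the same mutex means that once pause() returns, no further packet is counted until start() is called again.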

Packet consumption time

Consumption time is the time at which a data packet can be set to be consumed by a consumer node. When a data packet with an assigned consumption time arrives at a consumer node and the consumption time is:

in the future: the consumer node waits or sleeps until the current time and the consumption time match, then consumes the packet.

in the past: the packet is unused and discarded.

Note that consumption time is optional. Data packets with no consumption time are consumed as soon as they reach a consumer node.
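The three cases above (no consumption time, consumption time in the future, consumption time in the past) can be sketched as a small decision function. Everything in this example is hypothetical: timed_packet, packet_action, decide and consume are names made up for illustration, and std::chrono::steady_clock is an assumed clock, since the text does not say which clock the framework uses.

```cpp
#include <chrono>
#include <thread>

typedef std::chrono::steady_clock sample_clock; // assumed clock

// Hypothetical packet carrying an optional consumption time.
struct timed_packet
{
    bool has_consumption_time;
    sample_clock::time_point consumption_time;
};

enum class packet_action { consume_now, wait_then_consume, discard };

// Decides what a consumer should do with a packet arriving at time `now`.
packet_action decide(const timed_packet& p, sample_clock::time_point now)
{
    if (!p.has_consumption_time)
        return packet_action::consume_now;       // untimed: use immediately
    if (p.consumption_time < now)
        return packet_action::discard;           // in the past: drop it
    if (p.consumption_time > now)
        return packet_action::wait_then_consume; // in the future: wait first
    return packet_action::consume_now;           // due exactly now
}

// Returns true if the packet was consumed, false if it was discarded.
bool consume(const timed_packet& p)
{
    const packet_action action = decide(p, sample_clock::now());
    if (action == packet_action::discard)
        return false;
    if (action == packet_action::wait_then_consume)
        std::this_thread::sleep_until(p.consumption_time);
    // A real consumer node would use the packet's payload here.
    return true;
}
```

Keeping the decision separate from the waiting makes the policy easy to check with a fixed value of now, without depending on real elapsed time.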

Named building blocks

All classes in flow, including the node base class, derive from named. As a result, every concrete node class must be given a name too. This feature serves two purposes:

nodes can be referred to by their names when building a graph, which greatly improves code readability.

debugging is easier, especially since all pins and pipes are also named, with names generated automatically from what they are connected to.
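As a rough illustration of how a common named base class lets objects be looked up by name, consider the sketch below. sample_named and named_registry are hypothetical and are not flow's named or graph classes. Rejecting duplicate names is an assumption made for this example only, and the automatic naming of pins and pipes is not modelled because its format is not described here.

```cpp
#include <map>
#include <memory>
#include <string>
#include <utility>

// Hypothetical base class giving every derived object a name.
class sample_named
{
public:
    explicit sample_named(std::string name) : m_name(std::move(name)) {}
    virtual ~sample_named() {}

    const std::string& name() const { return m_name; }

private:
    std::string m_name;
};

// Hypothetical container that finds objects by name.
class named_registry
{
public:
    // Returns false if the name is already taken (assumed policy).
    bool add(std::shared_ptr<sample_named> item)
    {
        const std::string key = item->name();
        return m_items.insert(std::make_pair(key, std::move(item))).second;
    }

    // Returns the object with the given name, or null if there is none.
    std::shared_ptr<sample_named> find(const std::string& name) const
    {
        auto it = m_items.find(name);
        if (it == m_items.end())
            return std::shared_ptr<sample_named>();
        return it->second;
    }

private:
    std::map<std::string, std::shared_ptr<sample_named>> m_items;
};
```

With such a lookup, graph-building code can say which two nodes it is connecting by name rather than by passing pointers around, which is the readability benefit mentioned above.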

Thanks

License

Boost Software License - Version 1.0 - August 17th, 2003
Permission is hereby granted, free of charge, to any person or organization
obtaining a copy of the software and accompanying documentation covered by
this license (the "Software") to use, reproduce, display, distribute,
execute, and transmit the Software, and to prepare derivative works of the
Software, and to permit third-parties to whom the Software is furnished to
do so, all subject to the following:
The copyright notices in the Software and this entire statement, including
the above license grant, this restriction and the following disclaimer,
must be included in all copies of the Software, in whole or in part, and
all derivative works of the Software, unless such copies or derivative
works are solely in the form of machine-executable object code generated by
a source language processor.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE, TITLE AND NON-INFRINGEMENT. IN NO EVENT
SHALL THE COPYRIGHT HOLDERS OR ANYONE DISTRIBUTING THE SOFTWARE BE LIABLE
FOR ANY DAMAGES OR OTHER LIABILITY, WHETHER IN CONTRACT, TORT OR OTHERWISE,
ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
DEALINGS IN THE SOFTWARE.