Consider a system that must process multiple files simultaneously.
We want to improve system performance, but we also want to monitor the processing in real time.

To achieve this goal, we suggest building a distributed architecture consisting of a REST web server (ASP.NET Web API, SignalR), a web client (ASP.NET MVC and AngularJS) and a web service that processes files (WCF or another technology).

For this tutorial, however, we will use a single project to keep things easy to read.

To follow this tutorial, you must understand ASP.NET Web API, SignalR and TPL Dataflow.

TECHNOLOGY ARCHITECTURE

Hub Server: ASP.NET Web API and SignalR

Monitoring Client: ASP.NET MVC and AngularJS

Processing Server: TPL Dataflow, FileSystemWatcher

1.HUB SERVER

To build our hub server, we will use ASP.NET Web API, because clients connect to the hub by uploading JSON data.

We will also use SignalR because it allows bidirectional communication between server and client: the server can push content to connected clients as soon as it becomes available. SignalR also supports WebSockets.

When the server invokes Hub.Clients.All.LoadBalance(item) (where item is a Processor), the data is pushed through the hub and becomes available to clients as follows:

The monitor controller, MonitorCtrl, uses MonitorSvc.

2.MONITORING CLIENTS

Clients connect to the LoadBalance function of the hub as follows:

The client uses MonitorCtrl and iterates through Processor to display items in real time. This is possible because MonitorCtrl pushes each item into an array named Processor:

// Items pushed by the hub; the view iterates over this array.
$scope.Processor = [];

// Appends one item received from the hub.
var addProcessor = function (data) {
    $scope.Processor.push(data);
};
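
The wiring between the hub and that array can be sketched as follows. This is a minimal sketch, not the tutorial's actual MonitorSvc: the function name bindProcessorFeed is hypothetical, and hubProxy is assumed to be any object exposing on(eventName, callback), such as the proxy returned by connection.createHubProxy(...) in the SignalR 2 JavaScript client.

```javascript
// Sketch only: bindProcessorFeed is a hypothetical helper, not part of SignalR or AngularJS.
function bindProcessorFeed($scope, hubProxy) {
  // The array the view iterates over.
  $scope.Processor = [];

  // Runs each time the server calls Clients.All.LoadBalance(item).
  hubProxy.on('LoadBalance', function (data) {
    // SignalR callbacks run outside Angular's digest cycle, so $apply is
    // needed for the view to notice the new item.
    $scope.$apply(function () {
      $scope.Processor.push(data);
    });
  });
}
```

In a controller this would typically be called with a proxy created from $.hubConnection() before the connection is started; the hub name passed to createHubProxy depends on the server project and is not given here.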

3.PROCESSING SERVER

We can avoid performance bottlenecks and improve the overall responsiveness of our application by using asynchronous programming. However, traditional techniques for writing asynchronous applications can be complex, which makes those applications difficult to write, debug and maintain.

There are different techniques for building asynchronous systems:

THREAD

We can start, stop, abort and coordinate (Join) threads.

TASK

A task represents an asynchronous operation that can return a value.

ASYNC and AWAIT

PARALLEL PROGRAMMING

TPL DATAFLOW

We want to just write the code and have its structure rule out synchronization issues, so that we don't have to think about synchronization at all. In this world each object has its own private thread of execution, and only ever manipulates its own internal state.

Instead of one single thread executing through many objects by calling object methods, objects send asynchronous messages to each other.

If the object is busy processing a previous message, the message is queued. When the object is no longer busy it then processes the next message.
Fundamentally, if each object only has one thread of execution, then updating its own internal state is perfectly safe.
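
The idea can be illustrated with a small conceptual model. This is only a sketch of the message-queue principle, not the TPL Dataflow API: the Mailbox class and the counter example are hypothetical names introduced for illustration.

```javascript
// Conceptual model only: each object owns a mailbox and handles one message at a time.
class Mailbox {
  constructor(handler) {
    this.handler = handler; // async function (message)
    this.queue = [];
    this.busy = false;
  }

  // Senders never call the handler directly; they only enqueue a message.
  post(message) {
    this.queue.push(message);
    if (!this.busy) this.drain();
  }

  // Processes queued messages strictly one after another.
  async drain() {
    this.busy = true;
    while (this.queue.length > 0) {
      const message = this.queue.shift();
      try {
        await this.handler(message);
      } catch (err) {
        console.error('handler failed:', err);
      }
    }
    this.busy = false;
  }
}

// Usage: the counter's state is only ever touched by its own handler.
const counter = { total: 0 };
const counterBox = new Mailbox(async (amount) => {
  const seen = counter.total;
  await new Promise((resolve) => setTimeout(resolve, 1)); // simulate slow work
  counter.total = seen + amount; // safe: no other message runs in between
});

counterBox.post(1);
counterBox.post(2);
counterBox.post(3);
// Once the queue is drained, counter.total is 6.
```

If the three updates ran concurrently instead, each would read total as 0 and the last write would win; queuing the messages removes that race without any explicit lock.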

TPL Dataflow enables us to achieve this goal by building blocks. A block is essentially a message source, a target, or both. In addition to receiving and sending messages, a block represents an element of concurrency for processing the messages it receives.

Multiple blocks are linked together to produce networks of blocks. Messages are then posted asynchronously into the network for processing.

Consider the following use case. Several files are sent to a server (into a directory). The data contained in each file needs to be transformed into a data object ready to be sent to the web service. For network efficiency, the web service receives multiple data objects as part of a single request, up to a defined maximum.

This process can be broken down into a series of blocks, where each block is responsible for one part of the overall processing.

_bufferBlock is responsible for fetching files from the directory as they arrive.

_receptorBlockOne, _receptorBlockTwo and _receptorBlockThree are responsible for load balancing the fetched files.

For better performance, we want to load balance our process. Our next step is therefore to create three load-balanced receptors: if _receptorBlockOne is busy, _receptorBlockTwo or _receptorBlockThree will process the item. The three receptors are blocks, so if a message is refused by one block, the next linked block is offered the message. If all blocks refuse the message, the first block to become available will process it. To achieve this, we have to make each block non-greedy, which simply means setting its queue length to 1 (the BoundedCapacity option in TPL Dataflow).
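
The offer-and-refuse behaviour described above can be modelled in a few lines. This is a conceptual sketch under stated assumptions, not the TPL Dataflow implementation: Receptor and dispatch are hypothetical names, and a receptor with a queue length of 1 is modelled as one that accepts a message only while idle.

```javascript
// Conceptual model only: a receptor with a queue length of 1 refuses work while busy.
class Receptor {
  constructor(name, handler) {
    this.name = name;
    this.handler = handler; // async function (message)
    this.busy = false;
  }

  // Returns a promise for the work if accepted, or null if refused.
  offer(message) {
    if (this.busy) return null;
    this.busy = true;
    return Promise.resolve()
      .then(() => this.handler(message))
      .finally(() => { this.busy = false; });
  }
}

// Offers each message to the receptors in link order; when all refuse,
// waits until one running handler finishes and offers the message again.
async function dispatch(messages, receptors) {
  const running = new Set();
  const handledBy = [];
  for (const message of messages) {
    let accepted = false;
    while (!accepted) {
      for (const receptor of receptors) {
        const work = receptor.offer(message);
        if (work !== null) {
          handledBy.push(receptor.name);
          const tracked = work.finally(() => running.delete(tracked));
          running.add(tracked);
          accepted = true;
          break;
        }
      }
      if (!accepted) await Promise.race(running);
    }
  }
  await Promise.all(running);
  return handledBy; // receptor names, in the order messages were accepted
}
```

With a larger queue length the first receptor would keep accepting messages while the others sat idle, which is why the non-greedy setting matters for load balancing.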

_transformBlockToManyFiles is responsible for transforming a FileOrderEntity into a List<FileOrderEntity>: large files must be split into many small files.

_printingBlock is responsible for printing the outputs.
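
The one-to-many step can be sketched as a plain function. The shape used here, an object with a name and an array of lines, is an assumption made for illustration; the real FileOrderEntity may look different, and splitOrder is a hypothetical name.

```javascript
// Hypothetical shape: an order is { name, lines }; the real FileOrderEntity may differ.
function splitOrder(order, maxLines) {
  if (!Number.isInteger(maxLines) || maxLines < 1) {
    throw new RangeError('maxLines must be a positive integer');
  }
  const parts = [];
  // Cut the lines into consecutive slices of at most maxLines each.
  for (let start = 0; start < order.lines.length; start += maxLines) {
    parts.push({
      name: order.name + '.part' + (parts.length + 1),
      lines: order.lines.slice(start, start + maxLines),
    });
  }
  return parts;
}
```

A function of this kind takes one input and returns a list, which is the contract a one-to-many block needs: each element of the returned list then travels through the rest of the network as its own message.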

Now we are going to build our Dataflow network by linking blocks.

To visualize the TPL Dataflow network, launch the debugger and then click the search icon.

The diagram below represents our Dataflow network, that is, the workflow that will execute at runtime.

Finally, let us use FileSystemWatcher, which listens for file system change notifications and raises events when a directory receives files.