Introduction

The robustness and scalability of software depend heavily on the separation of procedural, logical, functional, and physical components, as well as the separation of tiers within each component.

This article will introduce a method for abstracting sequential code by identifying and managing functional separations, and for constructing a framework for writing distributed computer programs without worrying about the details of their distribution and execution.

Part 1 of the article will focus on the theories behind this framework, while Part 2 will focus on the implementation of the pattern.

Background

It is probably a good idea to read about:

n-tiered architecture

Distributed computing

Event Driven Architecture

Service Oriented Architecture

Threading

Procedural programming language (C#)

Distributed Shared Memory (DSM)

Why Distribute Code Sequence?

Distributing tasks requires a clear understanding at both the micro level and the macro level. The process of abstraction that forms higher-order tasks demands this understanding. By abstracting tasks, we can not only manage performance effectively by introducing a division of labour, but also introduce parallelism by identifying the tasks' inner associativities and dependencies.

How?

In the simplest form, we could use a simple data structure such as LinkedList<Action<I>> to store a sequence of code blocks whose only relationship is their order of execution. We could then execute this list of actions by iterating through the LinkedList from the first Action<I> to the last. Assuming there are no dependencies, this is much the same as queuing work on a ThreadPool. If the actions run strictly in order, it is exactly what procedural programming is: trivial and not interesting.
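As a minimal sketch of this trivial case (the shared log and the three actions are purely illustrative), iterating the LinkedList from first to last is plain procedural execution:

```csharp
using System;
using System.Collections.Generic;

class SequentialRunner
{
    // Execute a sequence of code blocks whose only relationship is order.
    public static List<string> Run()
    {
        var log = new List<string>();

        var actions = new LinkedList<Action<List<string>>>();
        actions.AddLast(s => s.Add("A"));
        actions.AddLast(s => s.Add("B"));
        actions.AddLast(s => s.Add("C"));

        // Iterating from first to last is plain procedural execution.
        foreach (var action in actions)
            action(log);

        return log;
    }

    static void Main()
    {
        Console.WriteLine(string.Join(",", Run())); // prints "A,B,C"
    }
}
```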

A slightly more interesting case is to pipeline functions with a LinkedList<Func<I,O>>, where the output of one function is pipelined into the next function's input. The precondition for this scenario is that every function has a single input parameter (somewhat like ML or F#). Again, this case fails to leverage any kind of gain over the traditional approach.
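A sketch of such a pipeline, assuming for simplicity that every function maps int to int so the list is homogeneous (the three sample functions are illustrative):

```csharp
using System;
using System.Collections.Generic;

class Pipeline
{
    // Pipe the output of each function into the next function's input.
    public static int Run(LinkedList<Func<int, int>> funcs, int input)
    {
        int value = input;
        foreach (var f in funcs)
            value = f(value);
        return value;
    }

    static void Main()
    {
        var funcs = new LinkedList<Func<int, int>>();
        funcs.AddLast(x => x + 1);   // A
        funcs.AddLast(x => x * 2);   // B
        funcs.AddLast(x => x - 3);   // C

        Console.WriteLine(Run(funcs, 5)); // (5+1)*2-3 = 9
    }
}
```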

What Do We Really Want?

What we really want is to leverage a distributed architecture through asynchronous execution, multithreaded execution, and parallel execution on multiple processors. As we all know, coding in these areas can be extremely difficult in every respect. Furthermore, we want to be able to write good code without worrying about these architectures at all.

Abstracts

To implement this concept, our data structure can no longer be a LinkedList, which implies the order of execution. Instead, we need to explicitly define dependencies on each function to gain more flexibility and control.

Functional Dependency

Dependencies among functions can be understood through this rule: given functions A and B, A is related to B if and only if either A depends on B or B depends on A.

We can write this relationship using sets as {A, B, C | dp(B,A) ^ dp(B,C)}. This means that, given the set of functions {A, B, C}, B depends on A and on C.

Examples

The sequence of functions <A, B, C> can be viewed as a set of functions {A, B, C} such that B depends on A, and C depends on B. The fact that C depends on A is, in this case, implied. Also, since A does not depend on any other function, it can be executed immediately.

Given a set of functions {A, B, C} such that B depends on A, and C depends on A or B: in this case, C can be executed once either A or B has executed.

Given a set of functions {A, B} with no dependencies, A and B can happen in parallel. The precondition is that there are no side effects.

{A, B | dp(A,B) ^ dp(B,A)}. As a result of the mutual dependency, this program would never execute.
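These rules can be sketched in code (the function names and dp(...) pairs are illustrative): a function is ready to execute once everything it depends on has executed, and the mutual dependency {A, B | dp(A,B) ^ dp(B,A)} leaves nothing ready, so the program never runs.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

class DependencyDemo
{
    // deps[x] = the set of functions that x depends on, i.e. dp(x, y) for each y.
    public static List<string> ExecutionOrder(Dictionary<string, HashSet<string>> deps)
    {
        var executed = new List<string>();
        var remaining = new HashSet<string>(deps.Keys);

        while (remaining.Count > 0)
        {
            // A function is ready once all of its dependencies have executed.
            var ready = remaining
                .Where(f => deps[f].All(d => executed.Contains(d)))
                .ToList();

            if (ready.Count == 0)
                throw new InvalidOperationException("Cyclic dependency: nothing can execute.");

            foreach (var f in ready)
            {
                executed.Add(f);
                remaining.Remove(f);
            }
        }
        return executed;
    }

    static void Main()
    {
        // The sequence <A, B, C>: dp(B,A) and dp(C,B).
        var sequence = new Dictionary<string, HashSet<string>>
        {
            ["A"] = new HashSet<string>(),
            ["B"] = new HashSet<string> { "A" },
            ["C"] = new HashSet<string> { "B" },
        };
        Console.WriteLine(string.Join(",", ExecutionOrder(sequence))); // A,B,C

        // {A, B | dp(A,B) ^ dp(B,A)}: never executes.
        var cycle = new Dictionary<string, HashSet<string>>
        {
            ["A"] = new HashSet<string> { "B" },
            ["B"] = new HashSet<string> { "A" },
        };
        try { ExecutionOrder(cycle); }
        catch (InvalidOperationException) { Console.WriteLine("deadlock detected"); }
    }
}
```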

The concept of dependencies between distinct functions is especially useful because it enables us to partition the functions into individual groups.

(Note: Keep in mind that we only care about the procedural distribution, rather than the functional distribution. In example 5 above, "Walk" could be a recursive functional routine that runs for 10 minutes.)

Procedural Groups

Such a group is defined as follows: procedural group G1 is a sequence of functions <x0, x1, ..., xn> such that x0 depends only on functions yi that do not exist in G1, and each subsequent xk depends on xk-1.
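A sketch of this partitioning, under the simplifying assumption that each function has at most one dependency (the names and the helper are illustrative): a function extends a chain only if it is the sole dependent of its predecessor; otherwise it starts a new procedural group.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

class GroupingDemo
{
    // A sketch, assuming each function has at most one dependency and no cycles.
    // deps[x] = the single function x depends on, or null.
    public static List<List<string>> ProceduralGroups(Dictionary<string, string> deps)
    {
        // Count how many functions depend on each function.
        var dependents = deps.Values.Where(d => d != null)
                             .GroupBy(d => d)
                             .ToDictionary(g => g.Key, g => g.Count());

        var groups = new List<List<string>>();
        var groupOf = new Dictionary<string, List<string>>();

        // Visit functions so that each dependency is placed before its dependents.
        foreach (var f in TopologicalOrder(deps))
        {
            var d = deps[f];
            // Extend a chain only when f is its dependency's sole dependent.
            if (d != null && dependents[d] == 1 && groupOf.ContainsKey(d))
            {
                groupOf[d].Add(f);
                groupOf[f] = groupOf[d];
            }
            else
            {
                var g = new List<string> { f };
                groups.Add(g);
                groupOf[f] = g;
            }
        }
        return groups;
    }

    static IEnumerable<string> TopologicalOrder(Dictionary<string, string> deps)
    {
        var done = new HashSet<string>();
        while (done.Count < deps.Count)
            foreach (var f in deps.Keys)
                if (!done.Contains(f) && (deps[f] == null || done.Contains(deps[f])))
                { done.Add(f); yield return f; }
    }

    static void Main()
    {
        // B and D both depend on A, and C depends on B:
        // B cannot chain onto A (A has two dependents), but C chains onto B.
        var deps = new Dictionary<string, string>
        { ["A"] = null, ["B"] = "A", ["C"] = "B", ["D"] = "A" };

        foreach (var g in ProceduralGroups(deps))
            Console.WriteLine(string.Join(",", g)); // prints "A", "B,C", "D"
    }
}
```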

Type Inferencing and Parameters

Communication and the transfer of data is a key element of any distributed system. To ensure the flow of information, we need to establish strict type-checking rules. Similar to ML, the type of function A is denoted as A:a'->b', where a' is the input parameter type and b' is the output type of function A. Since we do not know the inner workings of A, just by looking at A we cannot say anything about its input and output types. So {A, B | dp(B,A)} => A:a'->b' and B:b'->c'. In this case, the return type of A must be the same as the input type of B.

{A, B, C | dp(C,A) v dp(C,B)} => A:a'->b' and B:c'->b' and C:b'->d'. In this case, both A and B must have the same return type.

{A, B, C | dp(A,B) ^ (dp(C,B) v dp(C,A))} => B:a'->b' and A:b'->b' and C:b'->c'. Because C depends on B or A, and A depends on B, A must have the same input and output type.
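A sketch of this type check using .NET Type objects (the Function record and all names here are illustrative, not part of the framework): for every dp(B,A), the output type of A must equal the input type of B.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

class TypeCheckDemo
{
    // Illustrative function description: name, input type, output type.
    public record Function(string Name, Type Input, Type Output);

    // deps holds (dependent, dependency) pairs, i.e. dp(Dependent, Dependency).
    public static bool TypesAreUniform(IEnumerable<Function> funcs,
                                       IEnumerable<(string Dependent, string Dependency)> deps)
    {
        var byName = funcs.ToDictionary(f => f.Name);
        // For each dp(B, A): A's output type must equal B's input type.
        return deps.All(d => byName[d.Dependency].Output == byName[d.Dependent].Input);
    }

    static void Main()
    {
        var funcs = new[]
        {
            new Function("A", typeof(string), typeof(int)),   // A : string -> int
            new Function("B", typeof(int),    typeof(bool)),  // B : int -> bool
        };

        // {A, B | dp(B,A)}: A's output (int) matches B's input (int).
        Console.WriteLine(TypesAreUniform(funcs, new[] { ("B", "A") })); // True

        // dp(A,B) would require B's output (bool) to match A's input (string).
        Console.WriteLine(TypesAreUniform(funcs, new[] { ("A", "B") })); // False
    }
}
```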

By ensuring type uniformity of dependent functions, we can now shift our focus to the communication and transport of information.

Communication

Many methods for transport exist today: COM+, MSMQ, Remoting, web services, etc., all dealing with service-oriented distributed architecture and event-driven architecture. All of this is great, but we still need to configure and implement it. Products such as BizTalk and SQL Server Enterprise enable distributed computing with load balancing, clustering, and parallel processing without a line of code. If only this type of technology were available for normal compilers, which could generate distributable code clusters that not only execute functional code, but can communicate and synchronize with one another depending on the number of servers, processors, and so on. We could use COM+ and MSMQ to take care of the transport layer for us if we really wanted to, but the point is to investigate it ourselves!

The Coordinator and Agents do not know about each other; their only interface is the Service Bus.

Events

Since we are using an event driven architecture, we will need to define some events for both the Coordinator, and the Agents.

Coordinator

Distribute_Group

Message: Group

Description: The coordinator issues the next group in the stack.

Agents

Request_Group

Message: null

Description: The Agent is not working on any groups so it sends out a group request.

Processed_Function

Message: Function

Description: An Agent has finished processing a Function.
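These events can be sketched as simple message contracts on the service bus (all type and field names below are illustrative stand-ins for the real Group and Function payloads):

```csharp
using System;

// Coordinator -> bus: the next group to be processed.
class DistributeGroupEvent
{
    public int GroupId;
    public string[] FunctionNames;   // stand-in for the real Group payload
}

// Agent -> bus: the agent is idle and requests work.
class RequestGroupEvent
{
    public Guid AgentId;
}

// Agent -> bus: a function has been processed; carries its output.
class ProcessedFunctionEvent
{
    public int FunctionId;
    public object Output;
}

class EventDemo
{
    static void Main()
    {
        var e = new ProcessedFunctionEvent { FunctionId = 42, Output = "done" };
        Console.WriteLine($"{e.FunctionId}:{e.Output}"); // prints "42:done"
    }
}
```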

Execution Preparation

Coordinator

Each Function is assigned a Unique ID.

IDictionary<int,Function> GenerateID(IEnumerable<Function>)

The Coordinator figures out the Procedural Groups of the Program.

IEnumerable<Group> DistributeFunctions(IDictionary<int,Function>)

Each Group is Assigned a Unique ID.

IDictionary<int,Group> GenerateID(IEnumerable<Group>)
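The ID-assignment step can be sketched as a single generic method, which covers both GenerateID overloads above (the stand-in string items replace the real Function and Group types):

```csharp
using System;
using System.Collections.Generic;

class PreparationDemo
{
    // Assign a unique, increasing ID to each item (functions and groups alike).
    public static IDictionary<int, T> GenerateID<T>(IEnumerable<T> items)
    {
        var table = new Dictionary<int, T>();
        int nextId = 0;
        foreach (var item in items)
            table[nextId++] = item;
        return table;
    }

    static void Main()
    {
        var ids = GenerateID(new[] { "A", "B", "C" });
        Console.WriteLine($"{ids.Count} {ids[0]} {ids[2]}"); // prints "3 A C"
    }
}
```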

Agents

No preparation required.

Execution

Coordinator

Wait for Request_Group event.

Select a Group from the Dictionary.

Send out Distribute_Group event with the Group as its message.

Agents

If Agent is Free

Send out Request_Group event

Wait for Distribute_Group event

Intercept event

Agent is now Processing Group Functions

If Agent is Processing Group Functions

For each Function f(x) in Group sequence

For each dependency d(x) of f(x)

Wait For Processed_Function event of the d(x)

Process f(x) with event message

Send out a Processed_Function event with the output of f(x) as the message.

Agent is now free
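The agent's processing loop above can be sketched as follows (the Function type is an illustrative stand-in, and the blocking wait for Processed_Function events is simulated with an in-memory dictionary of outputs; real bus transport is omitted):

```csharp
using System;
using System.Collections.Generic;

class AgentDemo
{
    // Illustrative stand-in for the real Function payload.
    public class Function
    {
        public int Id;
        public int[] Dependencies;            // IDs of functions f(x) depends on
        public Func<string[], string> Body;   // process inputs -> output
    }

    // processed: outputs of Processed_Function events seen so far.
    public static void ProcessGroup(IEnumerable<Function> group,
                                    IDictionary<int, string> processed)
    {
        foreach (var f in group)              // for each Function f(x) in Group sequence
        {
            var inputs = new List<string>();
            foreach (var d in f.Dependencies) // wait for the Processed_Function
                inputs.Add(processed[d]);     // event of each dependency d(x)

            // Process f(x) with the event messages, then publish its output
            // as a Processed_Function event.
            processed[f.Id] = f.Body(inputs.ToArray());
        }
    }

    static void Main()
    {
        var processed = new Dictionary<int, string>();
        var group = new[]
        {
            new Function { Id = 0, Dependencies = new int[0],   Body = _ => "a" },
            new Function { Id = 1, Dependencies = new[] { 0 },  Body = i => i[0] + "b" },
        };
        ProcessGroup(group, processed);
        Console.WriteLine(processed[1]); // prints "ab"
    }
}
```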

Deployment

The main assembly consisting of the set of functional relationships is deployed on the Coordinator. Once deployed, the coordinator will take care of the rest.

As more agents join the service bus, the overall performance improves.

Function Compilation

In order for the Agents to execute the Functions received from the coordinator, each Group needs to be its own assembly. This way, the Agents themselves do not need to contain any code at the start, but rather receive different code at runtime. We will explore this in more depth in Part 2 of this article.

Summary

Once we have a clear separation of functions, their dependencies, procedural groups, and communication, we can put together a program that actually works, in a distributed fashion!

Part 1 of this article covers the abstract of the framework. The next part will cover its implementation.

Thank you. Still working on the implementation. I will talk about the use of dynamic assembly injection and creation, priority queues, event-driven architecture, and asynchronous message processing. It will be good.

For example, consider a group with functions A, B, and C, with A depending on B and B on C. A finishes executing, but before B is finished, the group's thread/processor/computer terminates (i.e., PC shutdown).

Will the coordinator be able to detect this situation and reassign the group (with only B and C, not A which was executed)?

Good question!
Since this is an abstract framework, I did not go into too many specifics of error handling and failure handling in this article, but I will in the implementation article. Sorry, I am currently caught up with an important client for my business, which is causing some delays in the completion of this implementation.

So I will attempt to explain this portion briefly here.

For exception handling, we will need to create a new generic event which will be pushed out by the agent. We could then set up the application to terminate on the first exception, or to continue on with the other items.

For failure handling (power failure, forced shutdown), this one is a bit challenging to handle, since everything is done asynchronously with a call-and-forget model. The solution I am proposing is to create a Processing_Group event which is sent out by the Agents after intercepting a group. The coordinator would handle this event and register the agent's IP address in a hash. We can then use a monitoring service that pings the IP to check whether it is in fact done, and then terminate the application or issue that item to the next available agent.

A setting could be used to try an item several times before it is dropped. For example, the coordinator could repeat an item, and if it fails 3 times, we terminate or continue on with the other items.
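A sketch of such a retry setting (the names and the boolean-returning processItem delegate are illustrative):

```csharp
using System;

class RetryDemo
{
    // Try an item up to maxAttempts times before giving up.
    public static bool TryProcess(Func<bool> processItem, int maxAttempts)
    {
        for (int attempt = 1; attempt <= maxAttempts; attempt++)
            if (processItem())
                return true;    // succeeded, stop retrying
        return false;           // dropped after maxAttempts failures
    }

    static void Main()
    {
        int calls = 0;
        // An item that fails twice, then succeeds on the third attempt.
        bool ok = TryProcess(() => ++calls >= 3, 3);
        Console.WriteLine($"{ok} after {calls} attempts"); // prints "True after 3 attempts"
    }
}
```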

The monitoring service is good.

Yang Yu wrote:

we can then utilize a monitoring service that will try to ping the ip if in fact it is done.

Mh... at what level will you ping? If you only ping the machine, you will not know whether the agents are still active.

IMO a ping should be sent to each machine, and the response should return a list of active agents/threads (perhaps with a stalled/active/idle flag). This would permit the coordinator to know exactly which agents are working.