TCPHA

TCPHA can be used to build a high-performance, highly available server based on a cluster of Linux servers.

The architecture of TCPHA is illustrated as follows:

Note: The algorithm used for request distribution may itself be distributed, so it is not always run by the FE (front end).
It is shown that way here only for simplification.

TCPHA implements an architecture for scalable content-aware request distribution in cluster-based servers. It implements kernel-level Layer-7 switching based on TCP Handoff for the Linux operating system. Since the overhead of Layer-7 switching in user space is very high, it is better to implement it inside the kernel, which avoids the overhead of context switching and memory copying between user space and kernel space. Furthermore, the responses are sent directly to clients without passing through the dispatcher, which greatly improves the performance of the cluster.
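To make "content-aware request distribution" concrete, here is a minimal user-space sketch in Python. It is not TCPHA's kernel code: the back-end addresses, the function names, and the choice of hashing the URL path are all assumptions made for illustration only.

```python
import zlib

# Hypothetical back-end addresses; not taken from TCPHA.
BACKENDS = ["10.0.0.11", "10.0.0.12", "10.0.0.13"]


def request_path(raw_request: bytes) -> str:
    """Extract the path from the first line of an HTTP request."""
    request_line = raw_request.split(b"\r\n", 1)[0].decode("latin-1")
    parts = request_line.split(" ")
    if len(parts) != 3:
        raise ValueError("malformed request line: %r" % request_line)
    return parts[1]


def choose_backend(raw_request: bytes, backends=BACKENDS) -> str:
    """Map a request to a back end by hashing its path (an assumed policy)."""
    path = request_path(raw_request)
    # crc32 is stable across runs, unlike Python's built-in hash().
    index = zlib.crc32(path.encode("latin-1")) % len(backends)
    return backends[index]
```

Because the choice depends only on the path, repeated requests for the same file always reach the same back end, which is what lets each back end keep a warm cache for its share of the content.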

TCPHA, inspired by KTCPVS and IPVS, merges their strong points. In addition, installation and configuration are very simple.

This is the initial release, debugged on the official Linux kernel 2.4.20. There is a lot of work to do, such as a good scheduling algorithm, better persistent connection handling, and implementing it on Windows (I don't think it is so difficult :)). If you are interested in the development, you are very welcome; hopefully we will make it a useful one in the near future.

TCPHA is released under the GNU GPL (General Public License).

TCPHA's Features

High Performance

TCP HAndoff (that is why TCPHA is so named :)), so that responses can
be sent directly to clients by the BE (back end).

Local Node Function (since tcpha-0.1.1), so that the FE can handle requests directly. This is useful when the load is light, when there are only a few BEs, or when some requests cannot be scheduled.
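The local-node decision can be pictured with the small Python sketch below. The threshold value, the names, and the exact order of the checks are assumptions for illustration; this document does not state the criteria TCPHA actually applies.

```python
# Hypothetical threshold; TCPHA's real criteria are not given in this document.
LIGHT_LOAD_MAX_CONNS = 8

LOCAL = "local"


def pick_target(active_conns, schedulable_backends, preferred=None):
    """Return LOCAL if the FE should serve the request itself, else a back end."""
    if not schedulable_backends:
        return LOCAL  # nothing to hand off to
    if active_conns <= LIGHT_LOAD_MAX_CONNS:
        return LOCAL  # load is light, so skip the handoff
    if preferred in schedulable_backends:
        return preferred  # normal case: hand the connection off
    # The scheduler could not map this request to a usable back end.
    return LOCAL
```

For example, pick_target(100, ["be1", "be2"], "be2") hands the request off to "be2", while the same call with an empty back-end list falls back to the FE itself.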

High Availability

(Since tcpha-0.1.4) The FE detects each BE's status periodically (the interval is settable). If the FE finds that a BE is not available, it marks its status NOTAVAILABLE, so that the BE is no longer scheduled.
Conversely, if the FE finds that the BE is available again, it marks its status AVAILABLE, so that the BE is scheduled again.
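The status marking described above can be sketched as follows. Only the status names AVAILABLE and NOTAVAILABLE come from the text; the class, its methods, and the probe callback are hypothetical, and how TCPHA actually probes a BE is not described here.

```python
AVAILABLE = "AVAILABLE"
NOTAVAILABLE = "NOTAVAILABLE"


class BackendStatus:
    """Tracks which back ends may be scheduled (illustrative only)."""

    def __init__(self, backends):
        # Assumption: every back end starts out available.
        self.status = {be: AVAILABLE for be in backends}

    def check_all(self, probe):
        """Run one detection round; probe(be) returns True if be responds."""
        for be in self.status:
            self.status[be] = AVAILABLE if probe(be) else NOTAVAILABLE

    def schedulable(self):
        """Back ends the scheduler is allowed to pick."""
        return [be for be, st in self.status.items() if st == AVAILABLE]
```

A caller would invoke check_all once per detection interval and pass only schedulable() to the scheduler, so a BE that recovers is picked up again on the next round.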

Scalability

(Since tcpha-0.1.1) Because response data does not pass through the FE, and because the TCP Handoff implementation is efficient, the load on the FE is light.

(Since tcpha-0.1.4) BEs can register dynamically, which means there is no need to restart the system to add a new BE to the cluster.
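The idea of adding a BE without a restart can be illustrated with a registry that accepts new entries while the dispatcher keeps reading it. This is a user-space analogy with hypothetical names, not TCPHA's registration protocol, which this document does not describe.

```python
import threading


class BackendRegistry:
    """A back-end list that can grow while the dispatcher keeps running."""

    def __init__(self):
        self._lock = threading.Lock()
        self._backends = []

    def register(self, address):
        """Add a back end; returns False if it was already registered."""
        with self._lock:
            if address in self._backends:
                return False
            self._backends.append(address)
            return True

    def snapshot(self):
        """Copy of the current list, safe to iterate while others register."""
        with self._lock:
            return list(self._backends)
```

The scheduler works on snapshot(), so a registration that arrives in the middle of a scheduling pass simply becomes visible on the next pass.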

Easy to Use

Others

(Since tcpha-0.1.4) Users can set the CONFIG_TCPHA_USELOG option to use an individual log file instead of /var/log/messages,
but this affects performance, because the logging is then done by TCPHA itself rather than by klogd. It is therefore recommended for debugging only.
Users can also set the debug level to control the number of debug messages printed, and cat the file "/proc/net/tcpha_fe_conn" to see the
status of the connections handled by the FE.

Latest news about TCPHA...

The TCPHA module (version 0.2.0) was released on August 27, 2005.
It adopts a symmetric multi-threaded event-driven architecture, and some bugs
are fixed. It runs well on the official Linux kernel 2.4.20 (including the SMP kernel).
A paper, Design and Implementation of TCPHA (draft release), was also made available.

A glossary and all the configuration details were added to
the Readme on January 10, 2005.

The TCPHA module (version 0.1.4) was released on January 10, 2005.
I haven't touched the exciting TCPHA code for a long time, being busy with other things. I'm
very sorry for not having replied to some letters. Recently, I added some new
features to TCPHA, mainly dealing with fault tolerance, debugging, logging, BE
dynamic registration, etc. See above for all of TCPHA's features. I also fixed some
tiny bugs and tidied up the code, with some code rewritten and many comments added. Thanks go to Dan Tang, Mulyadi Santosa and cheaney Chen for some helpful discussions.

The TCPHA module (version 0.1.3) was released on May 19, 2004.
In this version, I tidied up the code and made some tiny changes to the algorithm
used in version 0.1.x.

The TCPHA module (version 0.3.1) was released on May 15, 2004.
In this version, I tried a new mechanism to support P-HTTP (Persistent HTTP).
The documents about it are being tidied up. In short, I let the TCP
connection endpoint go 'bounding' between BEs. It looks interesting, doesn't it?
The algorithm is far from perfect, and the code is only a very rough and ugly draft.
I only want to present an algorithm framework.
But at least it is REALLY content-aware request distribution.
That means EVERY HTTP request is distributed by taking its content into
account, which will greatly improve the cache hit rate on the BEs. In this
algorithm, the BEs also take part in scheduling, so it introduces no additional overhead
on the FE. Furthermore, although the algorithm currently used cannot guarantee it,
I found no problems in my limited tests when I stored files on the BEs by
file type. If you are interested in the issue, you are VERY welcome;
more discussion and ideas are VERY much needed for supporting P-HTTP. If you want
to download this version to serve your cluster, I recommend you do NOT. It has not
been tested enough. At least, DON'T store files by type :)
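To show what distributing EVERY request on a persistent connection means, here is a small Python sketch that splits the pipelined requests arriving on one connection and routes each one by file type. The mapping table and back-end names are hypothetical, the requests are assumed to be well-formed and to carry no body, and this is not the mechanism implemented in version 0.3.1, which is not documented here.

```python
import posixpath

# Hypothetical mapping from file type to back end; not part of TCPHA.
BACKEND_BY_TYPE = {".html": "be-text", ".jpg": "be-images", ".png": "be-images"}
DEFAULT_BACKEND = "be-text"


def split_requests(stream: bytes):
    """Split pipelined, body-less HTTP requests at the blank line ending each."""
    return [chunk for chunk in stream.split(b"\r\n\r\n") if chunk]


def route_each(stream: bytes):
    """Return (path, back end) for every request on one persistent connection."""
    routes = []
    for request in split_requests(stream):
        request_line = request.split(b"\r\n", 1)[0].decode("latin-1")
        path = request_line.split(" ")[1]
        extension = posixpath.splitext(path)[1].lower()
        routes.append((path, BACKEND_BY_TYPE.get(extension, DEFAULT_BACKEND)))
    return routes
```

With connection-level distribution, both requests in a stream such as "GET /a.html" followed by "GET /img/b.jpg" would go to whichever BE received the connection; here each one is routed on its own.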

The TCPHA module (version 0.1.2) was released on May 8, 2004.
In this version, I improved the TCP Handoff mechanism. In fact,
only ONE packet exchange is needed between the FE (front end) and
the BE (back end) during the TCP Handoff process. I think it should
be efficient enough :)