To run JASocket you will need the JAR files for compatible versions of JActor, JID, JASocket and JFile. Download these projects from here and extract the jar files to a common directory, in our case c:\jaconfig. You will also need jar files from slf4j, sshd, joda-time and jline.

Next you need a few shell scripts. Here are some which work with windows:

node.bat - Runs JAConfig. When an optional port number is not provided, JAConfig uses port 8880 and also starts an operator console:

SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further details.

*** ConsoleApp 10.0.0.2:8880 ***

name: admin

password:

1>

The config database now holds the new admin password.

Accounts for operators are created just by assigning a password. Any operator can do this, but you will need the admin password. Note however that there is no difference between creating an operator account and changing another operator's password.

1>server config setPassword Fred

admin password>

password>

confirm>

OK

2>server config values

operator.Fred.password = **********

operator.admin.password = **********

3>

Quorum

A quorum requires the participation of N/2+1 hosts. So with one host, you need one host to achieve a quorum. With two hosts you need both hosts running to achieve. But with three hosts, only two hosts are needed. Fault tolerance then is only achieved when running with 3 or more hosts.

The quorum server tracks the number of hosts that are available and gets the total host count from the config database--which must be manually assigned.

The JASocket package makes it easy to create distributed, scalable software. JAConfig, in turn, makes it easy to manage in production.

The Config DB

JAConfig provides a fully replicated non-transactional, eventually consistent, key/value pair database for maintaining both configuration data and operator passwords. The database also provides change notifications, so servers can react to configuration changes. Every node in the cluster has a copy of this database both on disk and in memory, ensuring that the database is fully robust and supports fast queries. And there is neither a separate log file nor any need for a recovery mechanism--on startup, if the database is not valid its contents are discarded.

The underlying assumption of the database is that changes are infrequent, and that the system clocks of all the nodes in the cluster all have roughly the same time. Key/value pairs in the database always carry the timestamp of when the last change was made. Changes are propagated across all nodes in the cluster and shared when a node connects to another node. For each key, the change with the latest timestamp is retained.

The database is peer-based. So there is no single point of failure. And because there is no master copy, there are no warm or hot backups and fallover time is effectively 0. On the flip side, status information is completely out of scope, as frequent updates will break the underlying assumptions.

Quorum

A cluster can be split into 2 or more smaller clusters by something as simple as a loose cable. If these smaller clusters act independently, inconsistent results can occur. This is managed by knowing the total number of host computers in the cluster and only allowing some activities to occur the the number of hosts currently connected to a given cluster is equal to or grater than (totalNumberOfHosts / 2) + 1. Clusters connected to this number of hosts have what is called a quorum and as there can not be two sub-clusters with a quorum of hosts, at most only one sub-cluster will be active.

Note that we are talking about a quorum of hosts rather than a quorum of nodes. Multiple nodes can run on each host and indeed there may be a large host which runs many of the nodes in a cluster. So if the quorum was based on nodes, it is possible that all the nodes of the quorum are running on the same host, which creates a single point of failure.

Cluster and Host Server Managers

At this time there are two types of server managers, the Cluster Manager and the Host Manager. These managers are started (and monitored) by a kingmaker server, which runs in every node. The kingmaker servers are responsible for having one cluster manager running in the cluster and one host manager running on every host.

The cluster manager uses the data in the config database to start and monitor a number of cluster servers, where each type of cluster server has only a single instance running somewhere in the cluster. The cluster manager and all the cluster servers stop running when the node is not a part of the active cluster (a cluster with a quorum of hosts).

Similarly, the host managers use the data in the config database to start and monitor a number of host servers, where each type of host server has only a single instance running on each host. Unlike the cluster manager and servers, the host manager and servers are unaffected by quorum considerations.

Load Balancing

The cluster and host managers use a server named ranker to determine which node to use when starting a server. A simple ranker is provided which provides a list of nodes ordered by the number of servers running on each node. Alternative ranker implementations can be used as they are developed.