Daily Archives: 05/10/2016

Last time we look at remoting. You can kind of think of clustering as an extension to remoting, as some of the same underlying parts are used. But as we will see clustering is way more powerful (and more fault tolerant too).

My hope is by the end of this post that you will know enough about Akka clustering that you would be able to create your own clustered Akka apps.

A Note About All The Demos In This Topic

I wanted the demos in this section to be as close to real life as possible. The official akka examples tend to have a single process. Which I personally think is quite confusing when you are trying to deal with quite hard concepts. As such I decided to go with multi process projects to demonstrate things. I do however only have 1 laptop, so they are hosted on the same node, but they are separate processes/JVMs.

I am hoping by doing this it will make the learning process easier, as it is closer to what you would do in real life rather than have 1 main method that spawns an entire cluster. You just would not have that in real life.

What Is Akka Clustering?

Unlike remoting which is peer to peer, a cluster may constitute many members, which can grow and contract depending on demand/failure. There is also the concept of roles for actors with a cluster, which this post will talk about.

You can see how this could be very useful, in fact you could see how this may be used to create a general purpose grid calculation engine such as Apache Spark.

Seed Nodes

Akka has the concept of some initial contact points within the cluster to allow the cluster to bootstrap itself as it were.

Here is what the official Akka docs say on this:

You may decide if joining to the cluster should be done manually or automatically to configured initial contact points, so-called seed nodes. When a new node is started it sends a message to all seed nodes and then sends join command to the one that answers first. If no one of the seed nodes replied (might not be started yet) it retries this procedure until successful or shutdown.

You may choose to configure these “seed nodes” in code, but the easiest way is via configuration. The relevant part of the demo apps configuration is here

The seed nodes can be started in any order and it is not necessary to have all seed nodes running, but the node configured as the first element in the seed-nodes configuration list must be started when initially starting a cluster, otherwise the other seed-nodes will not become initialized and no other node can join the cluster. The reason for the special first seed node is to avoid forming separated islands when starting from an empty cluster. It is quickest to start all configured seed nodes at the same time (order doesn’t matter), otherwise it can take up to the configured seed-node-timeout until the nodes can join.

Once more than two seed nodes have been started it is no problem to shut down the first seed node. If the first seed node is restarted, it will first try to join the other seed nodes in the existing cluster.

We will see the entire configuration for the demo app later on this post. For now just be aware that there is a concept of seed nodes and the best way to configure those for the cluster is via configuration.

Saying that there may be some amongst you that would prefer to use the JVM property system which you may do as follows:

Roles

Akka clustering comes with the concept of roles.You may be asking why would we need that?

Well its quite simple really, say we have a higher than normal volume of data coming through you akka cluster system, you may want to increase the total processing power of the cluster to deal with this. How do we do that, we spin up more actors within a particular role. The role here may be “backend” that do work designated to them by some other actor say “frontend” role.

By using roles we can manage which bits of the cluster get dynamically allocated more/less actors.

You can configure the minimum number of role actor in configuration, which you can read more about here:

We will see more on this within the demo code which we will walk through later

ClusterClient

What use is a cluster which cant receive commands from the outside world?

Well luckily we don’t have to care about that as Akka comes with 2 things that make this otherwise glib situation ok.

Akka comes with a ClusterClient which allows actors which are not part of the cluster to talk to the cluster. Here is what the offical Akka docs have to say about this

An actor system that is not part of the cluster can communicate with actors somewhere in the cluster via this ClusterClient. The client can of course be part of another cluster. It only needs to know the location of one (or more) nodes to use as initial contact points. It will establish a connection to a ClusterReceptionist somewhere in the cluster. It will monitor the connection to the receptionist and establish a new connection if the link goes down. When looking for a new receptionist it uses fresh contact points retrieved from previous establishment, or periodically refreshed contacts, i.e. not necessarily the initial contact points.

Receptionist

As mentioned above the ClusterClient makes use of a ClusterReceptionist, but what is that, and how do we make a cluster actor available to the client using that?

The ClusterReceptionist is an Akka contrib extension, and must be configured on ALL the nodes that the ClusterClient will need to talk to.

There are 2 parts this, firstly we must ensure that the ClusterReceptionist is started on the nodes that ClusterClient will need to communicate with. This is easily done using the following config:

The other thing that needs doing, is that any actor within the cluster that you want to be able to talk to using the ClusterClient will need to register itself as a service with the ClusterClientReceptionist. Here is an example of how to do that

The “Official” example as it is, provides a cluster which contains “frontend” and “backend” roles. The “frontend” actors will take a text message and pass it to the register workers (“Backend”s) who will UPPERCASE the message and return to the “frontend”.

I have taken this sample and added the ability to use the ClusterClient with it, which works using Future[T] and the ask pattern, such that the ClusterClient will get a response from the cluster request.

We will dive into all of this in just a moment

For the demo this is what we are trying to build

SBT / Dependencies

Before we dive into the demo code (which as I say is based largely on the official lightbend clustering example anyway) I would just like to dive into the SBT file that drives the demo projects

frontend : frontend cluster based actors (the client will talk to these)

backend : backend cluster based actors

The Projects

Now that we have seen the projects involved from an SBT point of view, lets continue to look at how the actual projects perform their duties

Remember the workflow we are trying to achieve is something like this

We should ensure that a frontend (seed node) is started first

We should ensure a backend (seed node) is started. This will have the effect of the backend actor registering itself as a worker with the already running frontend actor

At this point we could start more frontend/backend non seed nodes actors, if we chose to

We start the client app (root) which will periodically send messages to the frontend actor that is looked up by its known seed node information. We would expect the frontend actor to delegate work of to one of its known backend actors, and then send the response back to the client (ClusterClient) where we can use the response to send to a local actor, or consume the response directly

Common

The common project simply contains the common objects across the other projects. Which for this demo app are just the messages as shown below

As you can see this is a regular actor, but there are several important things to note here:

We setup the ClusterClient with a known set of seed nodes that we can expect to be able to contact within the cluster (remember these nodes MUST have registered themselves as available services with the ClusterClientReceptionist

That we use a new type of actor a ClusterClient

That we use the ClusterClient to send a message to a seed node within the cluster (frontend) in our case. We use the ask pattern which will give use a Future[T] which represents the response.

We use the response to send a local message to ourself

FrontEnd

As previously stated the “frontend” role actors serve as the seed nodes for the ClusterClient. There is only one seed node for the frontend which we just saw the client app uses via the ClusterClient.

So what happens when the client app uses the frontend actors via the ClusterClient, well its quite simple the client app (once a connection is made to the frontend seed node) send a simple TransformationJob which is a simple message that contains a bit of text that the frontend actor will pass on to one of its registered backend workers for processing.

The backend actor (also in the cluster) will simply convert the TransformationJob contained text to UPPERCASE and return it to the frontend actor. The frontend actor will then send this TransformationResult back to the sender which happens to be the ClusterClient. The client app will listen to this (which was done using the ask pattern) and will hook up a callback for the Future[T] and will the send the TransformationResult to the clients own actor.

Happy days.

So that is what we are trying to achieve, lets see what bits and bobs we need for the frontend side of things

The important parts here are that we embellish the read config with the role of “frontend”, and that we also register the frontend actor with the ClusterClientReceptionist such that the actor is available to communicate with by the ClusterClient

Other than that it is all pretty vanilla akka to be honest

So lets now focus our attention to the actual frontend actor, which is shown below

That when a backend registers it will send a BackendRegistration, which we then watch and monitor, and if that backend terminates it is removed from the list of this frontend actors known backend actors

That we palm off the incoming TransformationJob to a random backend, and then use the pipe pattern to pipe the response back to the client

And with that, all that is left to do is examine the backend code, lets looks at that now

BackEnd

As always lets start with the configuration, which for the backend is as follows:

That we use the cluster events, to subscribe to MemberUp such that if its “frontend” role actor, we will register this backend with it by sending a BackendRegistration message to it

That for any TrasformationJob received (from the frontend which is ultimately for the client app) we do the work, and send a TransformationResult back, which will make its way all the way back to the client

And in a nutshell that is how the entire demo hangs together. I hope I have not lost anyone along the way.

Anyway lets now see how we can run the demo

How do I Run The Demo

You will need to ensure that you run the following 3 projects in this order (as a minimum. You can run more NON seed node frontend/backend versions before you start the root (client) if you like)

Frontend (seed node) : frontend with command line args : 2551

Backend (seed node) : backend with command line args : 2551

Optionally run more frontend/backend projects but DON’T supply any command line args. This is how you get them to not be treated as seed nodes