Designing a Fully Scalable Application

There are many issues a software engineer needs to take into account when designing a new application, including functionality, performance, security, and graphical user interface (GUI). But there are some hidden issues that will be harder for you to spot and integrate into the initial design process due to a variety of initially unknown and unpredictable variables, such as the number of users that will be using the application in the future and their specific needs.

For these and other reasons you may need to extend the functionality of your application beyond its design limits in the future or, in other words, scale it up. For example, the application may need to handle much greater loads than it was designed for, or allocate resources that are greater than one computer can provide, which would force you to split your application into several smaller applications residing on different computers.

This article provides guidelines for writing your application while taking into account that you may need to scale it up in the future. These rules of thumb will enable your application to start small and scale up as needed. In addition, this article will introduce a new set of utilities provided by MantaRay, an innovative, open source data messaging project based on peer-to-peer, serverless architecture. These utilities allow you to write the same code for your application whether it is running in a single JVM or distributed over several computers/JVMs.

As an example, we will use a simple application called WebMonitor. It starts out as a very simple application with a GUI written in Swing. The application gets a URL as input from the user and provides information on whether or not the URL is responding to HTTP requests. The application monitors the URLs from time to time and provides an up-to-date status report about them.

Figure 1 shows the WebMonitor example application GUI.

Figure 1. WebMonitor example application GUI

Modularity

The first rule of thumb is to separate your application into modules: pieces of code that can be viewed as separate independent entities, each with its own dedicated responsibility. A module can be composed of one or more objects, but not the other way around--it is bad practice for one object to encapsulate more than one module (in the "Decoupling the Modules" section, we will go over the reason for this).

It is easy to spot at least two modules in the WebMonitor application. The first is a GUI module that is responsible for getting user input and displaying the results. The second is an engine module that is responsible for checking the URLs and coming back with a result indicating whether or not the URLs are responding.

Actually, one can also spot a third module, a "model" module, that acts like a data storage facility; one that notifies "listener" modules when data changes, thus completing the MVC paradigm. The rest of this example has been intentionally simplified by ignoring the third module and concentrating only on the engine and GUI modules.

Identifying the modules in your application is necessary if you want to know how your application will scale up in the future. Modules can scale up and become separate smaller applications when the time comes; we will discuss how to separate them later on in this article.
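As a rough sketch of what a single-responsibility module can look like, the hypothetical UrlCheckEngine class below does nothing but check URLs and has no reference to any GUI class. The class name, the describe helper, the use of a HEAD request, the three-second timeouts, and the rule that status codes from 200 to 399 count as "responding" are all assumptions made for illustration; none of them is taken from the WebMonitor sample.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URI;

// Hypothetical engine module: its only responsibility is checking URLs.
// It knows nothing about Swing or any other user interface.
class UrlCheckEngine {

    // Turns an HTTP status code into the text a view module would display.
    // Treating 200-399 as "responding" is an assumption of this sketch.
    static String describe(int httpStatus) {
        return (httpStatus >= 200 && httpStatus < 400)
                ? "Responding" : "Not responding";
    }

    // Sends a HEAD request and reports whether the URL answered.
    String check(String urlStr) {
        try {
            HttpURLConnection conn = (HttpURLConnection)
                    URI.create(urlStr).toURL().openConnection();
            conn.setRequestMethod("HEAD");
            conn.setConnectTimeout(3000);
            conn.setReadTimeout(3000);
            try {
                return describe(conn.getResponseCode());
            } finally {
                conn.disconnect();
            }
        } catch (IOException | IllegalArgumentException | ClassCastException e) {
            // Malformed or non-HTTP URL, timeout, or connection failure
            return "Not responding";
        }
    }
}
```

Because the class exposes only strings in and strings out, any user interface can call it, and it can later move to another machine without dragging GUI code along.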

Decoupling the Modules

Going back to our example, let's imagine some time has passed and the WebMonitor has become very successful. Your clients are very happy, but have requested that you create additional user interfaces. The IT department wants a command-line user interface, the support department wants a web-based user interface, while the developers would like to keep the Swing GUI. To do all of that, we need to decouple the WebMonitor's modules.

In order to decouple modules we have to make sure that each one is an independent entity. Each decoupled module has to be a "black box" with a well-defined interface in order to communicate with the other modules.

Let's take the WebMonitor and break it into decoupled modules.

Figure 3 shows the decoupled modules in the WebMonitor.

Figure 3. Decoupled Modules in the WebMonitor

The modules communicate with each other through a mediator object that decouples the modules and eliminates the detailed knowledge one module has of the others. You can think of these mediators as buffers between one module and another. While these mediators may not seem important at this stage, they are crucial for what will be covered in the next section of this article.
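The idea of a mediator can be sketched in a few lines. In the illustration below, every name (Mediator, DirectMediator, CommandLineInput, EngineInbox) is hypothetical and is not part of the WebMonitor sample or of MantaRay; the point is only that the sending module depends on the mediator interface and never on the receiving module.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical mediator contract: the only type a sending module depends on.
interface Mediator<T> {
    void send(T message);
}

// Simplest implementation: hand the message to the receiver in the same JVM.
class DirectMediator<T> implements Mediator<T> {
    private final Consumer<T> receiver;

    DirectMediator(Consumer<T> receiver) {
        this.receiver = receiver;
    }

    @Override
    public void send(T message) {
        receiver.accept(message);
    }
}

// A user-interface module: it knows the mediator, never the engine.
class CommandLineInput {
    private final Mediator<String> toEngine;

    CommandLineInput(Mediator<String> toEngine) {
        this.toEngine = toEngine;
    }

    void submit(String url) {
        toEngine.send(url.trim());
    }
}

// The engine side: it receives URLs without knowing which interface sent them.
class EngineInbox {
    final List<String> pending = new ArrayList<>();

    void accept(String url) {
        pending.add(url);
    }
}
```

The two modules are wired together from the outside, for example with new CommandLineInput(new DirectMediator<String>(inbox::accept)). Replacing DirectMediator with another implementation of the same interface would change how messages travel without touching either module.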

After we have decoupled the modules, it becomes easy to comply with the clients' requests to create additional user interfaces. The necessary modules are added to the WebMonitor and can communicate with the older modules as needed.

Figure 4 shows the WebMonitor with additional modules.

Figure 4. WebMonitor with additional modules

Scaling up the Application

By now the WebMonitor application is a great success; it monitors thousands of sites and serves hundreds of users. You start to get reports that the application is becoming slow to respond. It seems that the engine takes a lot of memory and CPU and has reached the limits of the machine. Clients want to add more URLs to be monitored, but they can't because the machine crashes.

All the work you have done on the application--identifying the modules and decoupling them with mediators--will now pay off. Because you designed the WebMonitor to be scalable from the ground up, it is now easy to put the engine on a separate machine and even create several engines that divide the load among them. Because the modules are decoupled by mediators, you do not need to rewrite the code inside the modules; you simply use different mediators. This way, you can seamlessly scale up your distributed application to be used over many computers and easily handle heavy loads.

Figure 5 shows the WebMonitor distributed over several machines.

Figure 5. WebMonitor distributed over several machines

As you may have noticed, the mediators are not drawn inside of any of the machines because they are logical entities that can be referenced from all the machines.

The next section will introduce a set of mediators provided by the MantaRay open source project that can work both in memory and in a scaled-up, distributed environment. Using these mediators, no code modifications are needed in order to scale up the application.

Scalable Stage and Dispatcher

MantaRay provides two types of mediators, point-to-point (stage) and publish-subscribe (dispatcher), which answer different mediation needs. Together, they form a complete set that serves all of the mediation needs described in this article.

Stage

The SEDA project defines a Staged Event-Driven Architecture that introduces a well-decoupled methodology for module interaction, in which stages are used to mediate between modules. Modules queue events into stages, while other modules that serve as handlers to these stages receive the events, triggering an operation. These modules, in turn, queue events into other stages as needed.
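The queue-and-handler mechanics described above can be illustrated with plain java.util.concurrent classes. This is a minimal in-memory sketch, not SEDA's or MantaRay's implementation; the QueueStage class and its method names are hypothetical.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.function.Consumer;

// Hypothetical in-memory stage: an event queue drained by one handler thread.
class QueueStage<E> {
    private final BlockingQueue<E> events = new LinkedBlockingQueue<>();
    private final Thread worker;

    QueueStage(Consumer<E> handler) {
        worker = new Thread(() -> {
            try {
                while (true) {
                    // take() blocks until a producer queues an event
                    handler.accept(events.take());
                }
            } catch (InterruptedException e) {
                // stop() was called: let the thread end
            }
        });
        worker.setDaemon(true);
        worker.start();
    }

    // Producers return immediately; the handler runs on the stage's thread.
    void queue(E event) {
        events.add(event);
    }

    void stop() {
        worker.interrupt();
    }
}
```

A handler is free to queue follow-up events into another QueueStage, which is how a chain of stages forms. The producing module never calls the handler directly; it only sees the queue method.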

Figure 6 shows how a stage decouples modules.

Figure 6. Decoupling modules with a stage

The MantaRay project takes the stage idea to the next level. Modules obtain a reference to a stage through a factory object. The factory dynamically creates the proper implementation, depending on whether the stage is in a distributed environment or in memory.
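To make the factory idea concrete, here is a small sketch of a factory that returns one of two implementations behind a common interface. It is not MantaRay's code: SketchStage, LocalSketchStage, RemoteSketchStage, and SketchStageFactory are hypothetical names, the remote class is only a placeholder, and returning the same instance for the same name is an assumption about how two modules end up sharing a stage.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Consumer;

// Hypothetical stage contract shared by every implementation.
interface SketchStage {
    void addHandler(Consumer<Object> handler);
    void queue(Object event);
}

// In-memory flavor: events never leave the JVM.
class LocalSketchStage implements SketchStage {
    private Consumer<Object> handler;

    @Override
    public synchronized void addHandler(Consumer<Object> handler) {
        this.handler = handler;
    }

    @Override
    public synchronized void queue(Object event) {
        if (handler == null) {
            throw new IllegalStateException("no handler registered");
        }
        handler.accept(event);
    }
}

// Placeholder for a network-backed flavor; the transport is out of scope.
class RemoteSketchStage implements SketchStage {
    @Override
    public void addHandler(Consumer<Object> handler) {
        throw new UnsupportedOperationException("transport not part of this sketch");
    }

    @Override
    public void queue(Object event) {
        throw new UnsupportedOperationException("transport not part of this sketch");
    }
}

// Callers ask for a stage by name and never name a concrete class.
class SketchStageFactory {
    private static final Map<String, SketchStage> STAGES = new HashMap<>();

    static synchronized SketchStage getStage(String name, boolean distributed) {
        String key = (distributed ? "remote:" : "local:") + name;
        return STAGES.computeIfAbsent(key,
                k -> distributed ? new RemoteSketchStage() : new LocalSketchStage());
    }
}
```

Module code written against SketchStage compiles and behaves the same way whichever class the factory returns, so the choice between in-memory and distributed operation reduces to one boolean.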

Several modules (or several instances of the same module) can add themselves as handlers to the same stage, but a single event is sent to one and only one handler. An event can be any object that implements the marker interface Serializable. Most Java objects can be serialized, but objects like Socket and OutputStream cannot be serialized because by nature they cannot be passed through the network. You can add this marker interface to any class without needing to implement additional methods.
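The following sketch shows what implementing the marker interface involves and what "passing through the network" means for an event. The UrlCheckRequest event class and the EventCodec helper are hypothetical; the sketch uses standard Java object streams to turn an event into bytes and back, which is an illustration of serialization in general and not a description of MantaRay's wire format.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical event type. Serializable declares no methods, so
// implementing it only requires the "implements" clause.
class UrlCheckRequest implements Serializable {
    private static final long serialVersionUID = 1L;
    final String url;

    UrlCheckRequest(String url) {
        this.url = url;
    }
}

// Hypothetical helper that converts events to bytes and back,
// the way a distributed mediator would have to before sending them.
class EventCodec {
    static byte[] toBytes(Serializable event) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(event);
        }
        return buffer.toByteArray();
    }

    static Object fromBytes(byte[] bytes)
            throws IOException, ClassNotFoundException {
        try (ObjectInputStream in =
                new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            return in.readObject();
        }
    }
}
```

An event that survives this round trip in a single JVM is a reasonable candidate for a distributed stage as well, which makes the round trip a convenient unit test for new event classes.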

In the WebMonitor example, the GUI modules send a request to the engine to monitor a URL through a stage. The engines register themselves as handlers to the stage, which in turn acts as a software load balancer in addition to its role as a mediator.
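The load-balancing effect follows directly from the rule that each event reaches exactly one handler. The sketch below uses a round-robin policy to pick that handler; the article does not say which policy MantaRay uses, so round-robin, like the RoundRobinStage class itself, is an assumption made for illustration.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Consumer;

// Hypothetical point-to-point stage: each event goes to exactly one handler.
class RoundRobinStage {
    private final List<Consumer<Object>> handlers = new CopyOnWriteArrayList<>();
    private final AtomicInteger next = new AtomicInteger();

    void addHandler(Consumer<Object> handler) {
        handlers.add(handler);
    }

    void queue(Object event) {
        if (handlers.isEmpty()) {
            throw new IllegalStateException("no handler registered");
        }
        // floorMod keeps the index valid even after the counter overflows
        int index = Math.floorMod(next.getAndIncrement(), handlers.size());
        handlers.get(index).accept(event);
    }
}
```

With two engines registered, four queued URLs are split two and two. Adding a third engine spreads the load further without any change to the module that queues the events.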

Dispatcher

The dispatcher is very similar to a stage, with one significant difference. As opposed to a staged event that has one and only one handler, a dispatcher event is sent to all handlers that have added themselves to the dispatcher. Thus, a dispatcher serves as a one-to-many mediator.
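The one-to-many behavior can be sketched as a loop over every registered handler. FanOutDispatcher is a hypothetical in-memory class written for this illustration and is not MantaRay's dispatcher.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.function.Consumer;

// Hypothetical publish-subscribe mediator: every handler gets every event.
class FanOutDispatcher<E> {
    private final List<Consumer<E>> handlers = new CopyOnWriteArrayList<>();

    void addHandler(Consumer<E> handler) {
        handlers.add(handler);
    }

    void dispatch(E event) {
        // Unlike a stage, no handler is chosen: all of them are called
        for (Consumer<E> handler : handlers) {
            handler.accept(event);
        }
    }
}
```

The only difference from a point-to-point stage is the delivery rule, which is why the publishing module needs no knowledge of how many handlers are listening.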

Figure 7 shows how a dispatcher decouples modules.

Figure 7. Decoupling modules with a dispatcher

The same rules apply to the events dispatched on a dispatcher as to the events queued to a stage. Additionally, the methodology used to obtain a reference to the dispatcher through a factory is the same as the one used to obtain a stage.

In the WebMonitor example, engines notify the GUI about changes to the status of the monitored URLs through a dispatcher. All of the GUI modules register themselves as handlers to the dispatcher, and all of them get notifications about changes to the status of the monitored URLs.

Below is a short code example of how the GUI and the engine work with the stage and the dispatcher:

GUI Code

import org.mr.api.blocks.ScalableDispatcher;
import org.mr.api.blocks.ScalableFactory;
import org.mr.api.blocks.ScalableHandler;
import org.mr.api.blocks.ScalableStage;
...
public class WebMonitorGui implements ActionListener, ScalableHandler {
    // The input text of the URL to be checked
    JTextField urlInput;
    // The outgoing stage to the engine
    public ScalableStage engineStage;
    // The inbound dispatcher from the engine
    public ScalableDispatcher guiDispatcher;

    /**
     * The GUI gets a reference to the incoming
     * dispatcher and the outgoing stage;
     * the distributed parameter is passed to the
     * factory of these objects.
     * @param distributed if this boolean is true, then
     * this sample is running in a distributed environment
     * and the engine is in a different VM
     */
    public WebMonitorGui(boolean distributed) {
        // Get the stage from the factory
        engineStage = ScalableFactory.getStage("engine", distributed);
        // Get the dispatcher from the factory
        guiDispatcher = ScalableFactory.getDispatcher("gui", distributed);
        // Register this object as the handler of all
        // incoming events
        guiDispatcher.addHandler(this);
    }

    /**
     * Start the Swing GUI
     */
    public void startGUI() {
        // Make sure we have a nice GUI.
        ...
    } // startGUI

    /**
     * Called when there is input from the user;
     * sends the URL entered by the user to the engine
     * via the stage
     */
    public void actionPerformed(ActionEvent e) {
        engineStage.queue(urlInput.getText());
    }

    /**
     * This is an implementation method of the
     * ScalableHandler interface.
     * Called by the dispatcher when an event (results)
     * is received from the engine; updates the display
     */
    public void handle(Object event) {
        HashMap result = (HashMap) event;
        Iterator urls = result.keySet().iterator();
        while (urls.hasNext()) {
            String url = (String) urls.next();
            String status = (String) result.get(url);
            updateURLStatus(url, status);
        }
    }
} // WebMonitorGui

Engine Code

import org.mr.api.blocks.ScalableDispatcher;
import org.mr.api.blocks.ScalableFactory;
import org.mr.api.blocks.ScalableHandler;
import org.mr.api.blocks.ScalableStage;
...
public class WebMonitorEngine extends Thread implements ScalableHandler {
    // Inbound stage: this engine gets the URLs
    // to check from this stage
    public ScalableStage engineStage;
    // Outbound dispatcher: this engine returns a
    // result map on this dispatcher
    public ScalableDispatcher guiDispatcher;
    // The result map
    public HashMap urlsToStatus = new HashMap();

    /**
     * @param distributed if this boolean is true,
     * then this sample is running in a distributed
     * environment and the engine is in a different VM
     */
    public WebMonitorEngine(boolean distributed) {
        // Get the inbound stage and register as handler
        engineStage = ScalableFactory.getStage("engine", distributed);
        engineStage.addHandler(this);
        // Get the outbound dispatcher
        guiDispatcher = ScalableFactory.getDispatcher("gui", distributed);
    }

    /**
     * This is an implementation method of the
     * ScalableHandler interface. Called by the stage
     * when an event (request) is received from
     * the GUI; adds the URL to the "to be checked" map
     * called urlsToStatus
     */
    public void handle(Object event) {
        // The event is a string like "java.sun.com"
        String urlStr = (String) event;
        // Put in the map of URLs to be checked
        urlsToStatus.put(urlStr, "Checking");
    }

    /**
     * The engine is a thread that runs and checks the
     * URLs and creates a report about status changes
     */
    public void run() {
        while (true) {
            HashMap result = getUrlNewStatus();
            // Send only if the result is not empty
            if (!result.isEmpty())
                guiDispatcher.dispatch(urlsToStatus);
        } // while
    } // run
} // WebMonitorEngine

The full code of the WebMonitor example using both stage and dispatcher methodologies can be found in the sample folder of MantaRay's latest release. This article simplified the code example to make it more suitable for the format of an article.

Conclusion

We all hope that we are successful, that our clients like the software we have developed, and that the software will be widely used, but we must make sure that we are ready for this success. We must plan ahead in order to ensure that when the time comes and millions of users are happily using our product, it can gracefully and flexibly scale up to handle any additional load, necessary modifications, and additional functionality.

MantaRay's scalable set of utilities makes it easy to create an application that is ready to be scaled up from day one, with no code modification. A factory method dynamically creates the required module-decoupling implementation, whether it is distributed or in memory, thus seamlessly transforming a small-scale application into an enterprise-grade one.