The Architecting Magic Behind Taking Mash Ups Offline : Page 3

While most of the complexity you encounter in synchronization efforts will be unique to your application and infrastructure, synchronization works best when it's designed to be a core feature in each data store.

by Eric Farrar

Feb 12, 2008

Offline Architectures
With some of the high-level synchronization concerns identified, you can evaluate some advanced synchronization architectures. With the criteria just discussed in mind, two general architectures emerge: direct remote synchronization and staged synchronization. They are certainly not the only ways to synchronize; think of them as two extreme approaches, with a whole spectrum of hybrids in between.

However, before diving into the architectures, note that the offline application often ends up quite dissimilar from its online version. While the data from the web services can be stored offline, the logic behind the web services usually cannot, unless it is duplicated in full in the offline version. Additionally, you should ask, "Does this application make sense when it is offline?" An application that mashes together a contact's address with an interactive mapping service, for example, would simply not work without access to the mapping service. However, a feature-limited version of that same application (such as one displaying a single, static map tile) may turn out to be very feasible offline. You must plan an offline application carefully to decide which features are both sensible and feasible.

A diagram of a connected application can show the connection to the application's data sources through Java Message Service (JMS), SOAP, and XML-RPC (see Figure 1).

Direct Remote Synchronization
The most obvious approach would be to try to use the same application, but have the offline version access the data from its local data store. Upon reconnection, the application performs its synchronization with each data source (see Figure 2).

Figure 1. Connected Architecture: JMS, SOAP, and XML-RPC can connect an application to its data sources.

Figure 2. Direct Remote Synchronization Architecture: One approach to synchronization is using the same application locally. The offline version accesses data from its local data store, and when reconnected it performs its synchronization with each data source.
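
The flow in Figure 2 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `RemoteSource`, `LocalStore`, and `synchronize` are hypothetical names, and the in-memory `RemoteSource` stands in for whatever JMS, SOAP, or XML-RPC client a real application would use.

```python
from dataclasses import dataclass, field
from typing import Dict, List


class RemoteSource:
    """Hypothetical stand-in for one back-end service (JMS, SOAP, XML-RPC, ...)."""

    def __init__(self, name: str, data: Dict[str, str]) -> None:
        self.name = name
        self.data = dict(data)

    def upload(self, changes: Dict[str, str]) -> None:
        # Apply changes that were made while the client was offline.
        self.data.update(changes)

    def download(self) -> Dict[str, str]:
        # Return a full copy of the source's current data.
        return dict(self.data)


@dataclass
class LocalStore:
    """Local data store used by the offline application (assumed shape)."""
    cache: Dict[str, Dict[str, str]] = field(default_factory=dict)
    pending: Dict[str, Dict[str, str]] = field(default_factory=dict)


def synchronize(store: LocalStore, sources: List[RemoteSource]) -> None:
    """On reconnection, run one synchronization session per data source."""
    for source in sources:
        source.upload(store.pending.get(source.name, {}))
        # Clear the pending changes only after the upload call returned.
        store.pending.pop(source.name, None)
        store.cache[source.name] = source.download()


store = LocalStore(pending={"contacts": {"42": "new phone number"}})
sources = [
    RemoteSource("contacts", {"42": "old phone number"}),
    RemoteSource("inventory", {"7": "widget"}),
]
synchronize(store, sources)
```

Note that the loop opens a separate session with every source, which is the property the rest of this section examines.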

The most obvious security question regarding this model is how the application can connect to the back-end data sources when synchronizing over a public network. One solution would be to expose all the services to the Internet directly. This approach is really a non-starter, though, because of the huge security implications of exposing data sources directly to the Internet. Implemented this way, each system would be individually responsible for its own encryption, decryption, and authentication. Granted, you can solve most of these problems by requiring synchronization through a VPN connection, but a VPN solves the problem at the expense of adding a (sometimes frustrating) step before every synchronization.

In terms of robustness, the fewer communications sessions there are, the fewer opportunities there are for problems. Since the application will be interacting directly with each service (often multiple times), there are far more opportunities for communication errors to occur. Therefore, many more opportunities exist to leave the synchronization in an incomplete state that must be rolled back.
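
One rough way to see why the session count matters: if each communication session fails independently with some small probability, the chance that at least one session fails grows with the number of sessions. The independence assumption and the 1% failure rate below are purely illustrative, not measurements, and the function name is hypothetical.

```python
def chance_of_incomplete_sync(sessions: int, failure_rate: float) -> float:
    """Probability that at least one of `sessions` independent sessions fails."""
    return 1.0 - (1.0 - failure_rate) ** sessions


# Compare a single session with the many sessions of direct synchronization.
for sessions in (1, 5, 20):
    print(sessions, round(chance_of_incomplete_sync(sessions, 0.01), 4))
```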

This architecture will also tend to require large volumes of data to be transferred. Consider a situation where your business's inventory is exposed through a series of simple services: getInventoryList(), getInventoryByCategory(), getItemById(), and so on. If your inventory contains 5,000 items and fewer than 10 items change each day, how do you go about synchronizing just the items that change? You can't.

Because the inventory system was only designed for central access, it has no concept of change tracking. The only way to get the updated inventory list is to request the entire list again. With this list, the items that have changed can be determined through the application by comparing the old and new lists, item by item.
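
The item-by-item comparison described above might look something like the following sketch. It is not part of the inventory service: the `id` field, the dictionary shape of each item, and the `diff_inventory` name are all assumptions made for the example.

```python
from typing import Dict, List, Tuple

Item = Dict[str, object]


def diff_inventory(
    old_items: List[Item], new_items: List[Item]
) -> Tuple[List[Item], List[Item], List[Item]]:
    """Compare two full inventory snapshots, keyed by a hypothetical 'id' field.

    Returns (added, changed, removed).
    """
    old_by_id = {item["id"]: item for item in old_items}
    new_by_id = {item["id"]: item for item in new_items}

    added = [item for key, item in new_by_id.items() if key not in old_by_id]
    removed = [item for key, item in old_by_id.items() if key not in new_by_id]
    changed = [
        item
        for key, item in new_by_id.items()
        if key in old_by_id and item != old_by_id[key]
    ]
    return added, changed, removed


old_snapshot = [
    {"id": 1, "name": "Widget", "price": 9.99},
    {"id": 2, "name": "Gadget", "price": 24.50},
]
new_snapshot = [
    {"id": 1, "name": "Widget", "price": 10.49},  # price changed
    {"id": 3, "name": "Gizmo", "price": 5.00},    # new item
]
added, changed, removed = diff_inventory(old_snapshot, new_snapshot)
```

Even with a helper like this, the entire list still has to be downloaded before the comparison can run, so the transfer cost is unchanged.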

Furthermore, assume that the online version of the application lets the user view the item's price in two alternate currencies. When the user requests the price in another currency, a request is sent to an externally hosted currency conversion service. If this same functionality is required offline (because it is unknown which items the user will want to see and in which currencies), all conversion prices must be stored for each item as part of the synchronization. The application, then, must call the conversion service twice for each inventory item that has changed, and because an average of 10 items change between synchronizations, the application will have to make 20 unique requests to the conversion service. Additionally, each application must perform this process. If your business has 100 workers that synchronize daily, that volume translates to 2,000 calls to the conversion service each day, 1,980 of them redundant.
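
The arithmetic above can be checked with a few lines of Python. The function name and parameters are hypothetical; the figures are the ones used in the text (10 changed items, two alternate currencies, 100 workers). "Redundant" here assumes every worker needs the same conversions, so a single set of 20 calls would be enough.

```python
from typing import Dict


def conversion_calls(changed_items: int, currencies: int, clients: int) -> Dict[str, int]:
    """Count conversion-service calls when every client converts on its own."""
    per_client = changed_items * currencies   # calls made by one application
    total = per_client * clients              # calls made by all applications
    redundant = total - per_client            # calls beyond one shared set
    return {"per_client": per_client, "total": total, "redundant": redundant}


counts = conversion_calls(changed_items=10, currencies=2, clients=100)
print(counts)  # {'per_client': 20, 'total': 2000, 'redundant': 1980}
```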