Building Scalable Web Architecture and Distributed Systems

By Kate Matsudaira, December 31, 2012

Like most things in life, taking the time to plan ahead when building a Web service can help in the long run.

In our image server example, it is possible that the single file
server used to store images could be replaced by multiple file
servers, each containing its own unique set of images.
(See Figure 4.) Such an
architecture would allow the system to fill each file server with
images, adding additional servers as the disks become full. The
design would require a naming scheme that tied an image's filename
to the server containing it. An image's name could be formed from a
consistent hashing scheme mapped across the servers. Alternatively,
each image could be assigned an incremental ID, so that when a client
makes a request for an image, the image retrieval service only needs
to maintain the range of IDs that are mapped to each of the servers
(like an index).
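
The two naming schemes described above could be sketched roughly as follows. This is a minimal illustration, not a production design: the server names, the ID ranges, the use of MD5 for placement, and the class names HashRing and IdRangeIndex are all assumptions made for the example.

```python
import bisect
import hashlib

# Hypothetical server names, chosen only for illustration.
SERVERS = ["files-1", "files-2", "files-3"]


def _hash(key: str) -> int:
    """Map a string to a large integer (MD5 is used for placement, not security)."""
    return int(hashlib.md5(key.encode("utf-8")).hexdigest(), 16)


class HashRing:
    """A small consistent-hash ring with several virtual points per server."""

    def __init__(self, servers, replicas=100):
        self._ring = sorted(
            (_hash(f"{server}#{i}"), server)
            for server in servers
            for i in range(replicas)
        )
        self._points = [point for point, _ in self._ring]

    def server_for(self, image_name: str) -> str:
        # Find the first ring point at or after the key's hash,
        # wrapping around to the start of the ring if needed.
        index = bisect.bisect_left(self._points, _hash(image_name)) % len(self._ring)
        return self._ring[index][1]


class IdRangeIndex:
    """Maps incremental image IDs to servers by the first ID each server holds."""

    def __init__(self, first_ids, servers):
        # first_ids must be sorted; first_ids[i] is the lowest ID on servers[i].
        self._first_ids = list(first_ids)
        self._servers = list(servers)

    def server_for(self, image_id: int) -> str:
        position = bisect.bisect_right(self._first_ids, image_id) - 1
        if position < 0:
            raise KeyError(f"no server holds image ID {image_id}")
        return self._servers[position]


ring = HashRing(SERVERS)
index = IdRangeIndex([0, 1_000_000, 2_000_000], SERVERS)
print(ring.server_for("dog.jpg"))
print(index.server_for(1_500_000))  # falls in the second server's range
```

With the ring, adding a server moves only the images whose hashes now land on the new server's points. With the ID index, a new server simply takes the next range of IDs.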

Of course there are challenges distributing data or functionality
across multiple servers. One of the key issues is data
locality; in distributed systems the closer the data to the
operation or point of computation, the better the performance of the
system. Therefore it is potentially problematic to have data
spread across multiple servers, as any time it is needed it may not be
local, forcing the servers to perform a costly fetch of the required
information across the network.

Another potential issue comes in the form of
inconsistency. When there are different services reading and
writing from a shared resource, potentially another service or data
store, there is the chance for race conditions — where some data is
supposed to be updated, but the read happens prior to the update — and
in those cases the data is inconsistent. For example, in the image
hosting scenario, a race condition could occur if one client sent a
request to update the dog image with a new title, changing it from
"Dog" to "Gizmo", but at the same time another client was reading
the image. In that circumstance it is unclear which title, "Dog" or
"Gizmo", would be the one received by the second client.
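
The ambiguity in that scenario can be made concrete with a small sketch. It does not use real threads; it simply replays the two possible orderings of the concurrent requests. The ImageStore class and the run helper are hypothetical names invented for this illustration.

```python
class ImageStore:
    """A toy in-memory store holding one title per image (illustrative only)."""

    def __init__(self):
        self._titles = {"dog.jpg": "Dog"}

    def read_title(self, image):
        return self._titles[image]

    def update_title(self, image, title):
        self._titles[image] = title


def run(order):
    """Apply the two client operations in the given order; return what the reader saw."""
    store = ImageStore()
    seen = None
    for step in order:
        if step == "update":
            store.update_title("dog.jpg", "Gizmo")
        elif step == "read":
            seen = store.read_title("dog.jpg")
    return seen


# The two requests are concurrent, so either ordering is possible.
print(run(["read", "update"]))  # the reader sees the old title
print(run(["update", "read"]))  # the reader sees the new title
```

Nothing in the system as described decides which ordering occurs, which is why the second client's result is unpredictable.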

There are certainly some obstacles associated with partitioning data,
but partitioning allows each problem to be split — by data, load, usage
patterns, etc. — into manageable chunks. This can help with scalability
and manageability, but is not without risk.
There are lots of ways to mitigate risk and handle failures; however,
in the interest of brevity they are not covered in this article. If
you are interested in reading more, you can check out my blog post
on fault tolerance and monitoring.

The Building Blocks of Fast and Scalable Data Access

Having covered some of the core considerations in designing
distributed systems, let's now talk about the hard part: scaling
access to the data.

Figure 5: Simple Web applications

As simple Web applications like the one in Figure 5 grow, there are two
main challenges: scaling access to the
app server and to the database. In a highly scalable application
design, the app (or Web) server is typically minimized and often
embodies a shared-nothing architecture. This makes the app server
layer of the system horizontally scalable. As a result of this design,
the heavy lifting is pushed down the stack to the database server and
supporting services; it's at this layer where the real scaling and
performance challenges come into play.

The rest of this article is devoted to some of the more common
strategies and methods for making these types of services fast and
scalable by providing fast access to data.

Figure 6: Oversimplified Web application

Most systems can be oversimplified to Figure 6.
This is a great place to start. If you have a lot of data, you want
fast and easy access, like keeping a stash of candy in the top drawer
of your desk. Though overly simplified, the previous statement hints
at two hard problems: scalability of storage and fast access of data.

For the sake of this example, let's assume you have many terabytes (TB)
of data and you want to allow users to access small portions
of that data at random. (See Figure 7.)
This is similar to locating an image file
somewhere on the file server in the image application example.

Figure 7: Accessing specific data

This is particularly challenging because it can be very costly to load
TBs of data into memory; this directly translates to disk IO. Reading
from disk is many times slower than from memory — memory access is
as fast as Chuck Norris, whereas disk access is slower than the
line at the DMV. This speed difference really adds up for large
data sets; in real numbers memory access is as little as 6 times
faster for sequential reads, or 100,000 times faster for random
reads, than reading from
disk (see The Pathologies of Big Data). Moreover, even with unique IDs, solving the problem of
knowing where to find that little bit of data can be an arduous
task. It's like
trying to get that last Jolly Rancher from your candy stash without
looking.

Thankfully there are many options that you can employ to make this
easier; four of the more important ones are caches, proxies,
indexes and load balancers. The rest of this article
discusses how each of these concepts can be used to make data
access a lot faster.

Caches

Caches take advantage of the locality of reference
principle: recently requested data is likely to be requested
again. They are used in almost every layer of computing:
hardware, operating systems, Web browsers, Web applications and
more. A cache is like short-term memory: it has a limited amount of
space, but is typically faster than the original data source and
contains the most recently accessed items. Caches can exist at all
levels in architecture, but are often found at the level nearest
to the front end, where they are implemented to return data
quickly without taxing downstream levels.

How can a cache be used to make your data access faster in our API example?
In this case, there are a couple of places you can insert a cache. One
option is to insert a cache on your request layer node, as in
Figure 8.

Figure 8: Inserting a cache on your request layer node

Placing a cache directly on a request layer node enables the local
storage of response data. Each time a request is made to the service,
the node will quickly return local, cached data if it exists. If it
is not in the cache, the request node will query the data from disk. The
cache on a request layer node could live in memory (which is very
fast), on the node's local disk (faster than going to network
storage), or both.
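
A request-node cache of this kind might look something like the sketch below. It assumes an in-memory cache with least-recently-used eviction, which is one common policy and not necessarily the one any given system uses. RequestNodeCache and slow_origin are hypothetical names, and slow_origin stands in for the costly read from disk.

```python
from collections import OrderedDict


class RequestNodeCache:
    """A bounded in-memory cache that evicts the least recently used entry."""

    def __init__(self, fetch_from_origin, capacity=2):
        self._fetch = fetch_from_origin  # called only on a cache miss
        self._capacity = capacity
        self._entries = OrderedDict()
        self.hits = 0
        self.misses = 0

    def get(self, key):
        if key in self._entries:
            self.hits += 1
            self._entries.move_to_end(key)  # mark as most recently used
            return self._entries[key]
        self.misses += 1
        value = self._fetch(key)
        self._entries[key] = value
        if len(self._entries) > self._capacity:
            self._entries.popitem(last=False)  # drop the least recently used
        return value


def slow_origin(key):
    """Stand-in for the costly read from disk or network storage."""
    return f"data-for-{key}"


cache = RequestNodeCache(slow_origin, capacity=2)
for key in ["a", "b", "a", "c", "b"]:
    cache.get(key)
print(cache.hits, cache.misses)
```

The limited capacity is what makes this "short-term memory": once the cache is full, each new item pushes out the one that has gone unused the longest.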

Figure 9: Multiple caches

What happens when you expand this to many nodes?
As you can see in Figure 9, if the request layer is expanded to multiple nodes, it's still quite
possible to have each node host its own cache. However, if your load
balancer randomly distributes requests across the nodes, the same
request will go to different nodes, thus increasing cache misses. Two
choices for overcoming this hurdle are global caches and distributed
caches.
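
The effect of random distribution on per-node caches can be estimated with a rough simulation. The sketch assumes each node has an unbounded local cache and that the balancer picks nodes uniformly at random; count_misses and the sample filenames are made up for the example.

```python
import random


def count_misses(requests, node_count, seed=0):
    """Count origin fetches when each node keeps its own unbounded cache."""
    rng = random.Random(seed)  # seeded so the run is repeatable
    caches = [set() for _ in range(node_count)]
    misses = 0
    for key in requests:
        node = rng.randrange(node_count)  # the balancer picks a node at random
        if key not in caches[node]:
            misses += 1
            caches[node].add(key)
    return misses


requests = ["dog.jpg", "cat.jpg", "bird.jpg"] * 20
print("one node:  ", count_misses(requests, node_count=1))
print("four nodes:", count_misses(requests, node_count=4))
```

With one node, each distinct item is fetched from the origin once. With several nodes, the same item can miss once on every node it happens to reach, so the total number of origin fetches grows with the node count.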

Global Cache

A global cache is just as it sounds: all the nodes use the same single cache
space. This involves adding a server, or file store of some sort,
faster than your original store and accessible by all the request
layer nodes. Each of the request nodes queries the cache in the same
way it would a local one. This kind of caching scheme can get a bit
complicated because it is very easy to overwhelm a single cache as the
number of clients and requests increases, but it is very effective in some
architectures (particularly ones with specialized hardware that makes
this global cache very fast, or that have a fixed dataset that needs to be
cached).
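
As a rough sketch of the idea, the request nodes below all share one cache object. In a real deployment the global cache would be a separate server reached over the network; the in-process GlobalCache here is only a stand-in, and the class names, the origin function, and the choice to have the node itself fetch on a miss are assumptions for illustration.

```python
class GlobalCache:
    """A single cache space shared by every request node (in-process stand-in)."""

    def __init__(self):
        self._entries = {}

    def get(self, key):
        return self._entries.get(key)

    def put(self, key, value):
        self._entries[key] = value


class RequestNode:
    """A request layer node that consults the shared cache before the origin."""

    def __init__(self, name, cache, origin):
        self.name = name
        self._cache = cache
        self._origin = origin

    def handle(self, key):
        value = self._cache.get(key)
        if value is None:
            # Miss: read from the origin and populate the shared cache.
            value = self._origin(key)
            self._cache.put(key, value)
        return value


origin_reads = []


def origin(key):
    """Stand-in for the slower original store; records each read it serves."""
    origin_reads.append(key)
    return f"data-for-{key}"


shared = GlobalCache()
nodes = [RequestNode(f"node-{i}", shared, origin) for i in range(3)]
for i, key in enumerate(["dog.jpg", "dog.jpg", "dog.jpg", "cat.jpg"]):
    nodes[i % len(nodes)].handle(key)  # spread requests across the nodes
print(origin_reads)
```

Because every node sees the same cache, an item fetched by one node is a hit for all the others, at the cost of concentrating all cache traffic on one place.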
