16.4.3 Deploying a MySQL Database Using EC2

Because you cannot guarantee the uptime and availability of your
EC2 instances, when deploying MySQL within the EC2 environment,
use an approach that enables you to easily distribute work among
your EC2 instances. There are a number of ways of doing this.
Sharding, where you split the application across multiple servers
and dedicate specific blocks of your dataset and users to
different servers, is an effective way of doing this. As a
general rule, it is easier to create more EC2 instances to support
more users than to upgrade an instance to a larger machine.

The EC2 architecture works best when you treat the EC2 instances
as temporary, cache-based solutions, rather than as a long-term,
high-availability solution. In addition to using multiple
machines, take advantage of other services, such as memcached, to
provide additional caching for your application. This helps
reduce the load on the MySQL server so that it can concentrate on
writes. On the large and extra large instances within EC2, the
available RAM can provide a large memory cache for data.

Most types of scale-out topology that you would use with your own
hardware can also be applied within the EC2 environment. However,
follow the limitations and advice already given to ensure that a
failure does not cause you to lose data. Also, because the
relative power of each EC2 instance is so low, be prepared to
alter your application to use sharding and to add further EC2
instances to improve the performance of your application.

For example, take the typical scale-out environment shown
following, where a single master replicates to one or more slaves
(three in this example), with a web server running on each
replication slave.

You can reproduce this structure completely within the EC2
environment, using an EC2 instance for the master, and one
instance for each of the web and MySQL slave servers.

Note

Within the EC2 environment, internal (private) IP addresses used
by the EC2 instances are constant. Always use these internal
addresses and names when communicating between instances. Only
use public IP addresses when communicating with the outside
world - for example, when publicizing your application.

To ensure reliability of your database, add at least one
replication slave dedicated to providing an active backup and
storage to the Amazon S3 facility. You can see an example of this
in the following topology.

Using memcached within your EC2 instances should provide better
performance. The large and extra large instances have a
significant amount of RAM. To use memcached in your application,
when loading information from the database, first check whether
the item exists
in the cache. If the data you are looking for exists in the cache,
use it. If not, reload the data from the database and populate the
cache.
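
The check-then-populate sequence described above can be sketched as
follows. This is a minimal illustration, not a definitive
implementation: DictCache is a hypothetical in-memory stand-in for a
memcached client, and load_item and load_from_db are names invented
for this example. In a real deployment you would substitute a
memcached client object and a function that queries MySQL.

```python
from typing import Callable, Dict, Optional


class DictCache:
    """Hypothetical in-memory stand-in for a memcached client."""

    def __init__(self) -> None:
        self._items: Dict[str, str] = {}

    def get(self, key: str) -> Optional[str]:
        # Returns None on a miss, like a typical cache client.
        return self._items.get(key)

    def set(self, key: str, value: str) -> None:
        self._items[key] = value


def load_item(key: str, cache: DictCache,
              load_from_db: Callable[[str], str]) -> str:
    """Return the value for key, querying the database only on a miss."""
    value = cache.get(key)
    if value is not None:
        return value               # cache hit: use the cached copy
    value = load_from_db(key)      # cache miss: reload from the database
    cache.set(key, value)          # populate the cache for later reads
    return value
```

With this arrangement, repeated reads of the same key reach the MySQL
server only once, which leaves it free to handle writes.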

Sharding divides the data in your entire database by allocating
individual machines or machine groups to serve a unique subset of
the data according to an appropriate grouping. For example, you
might put all users with a surname ending
in the letters A-D onto a single server. When a user connects to
the application and their surname is known, queries can be
redirected to the appropriate MySQL server.
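
As a rough sketch of the surname example, the following function maps
a surname to a shard by its final letter. Only the A-D range comes
from the text above. The other ranges, the shard names, and the
function name shard_for_surname are assumptions made for illustration.

```python
# Hypothetical shard layout keyed on the last letter of the surname.
# Only the A-D range is taken from the text; the rest is assumed.
SHARD_RANGES = [
    ("A", "D", "shard1"),
    ("E", "M", "shard2"),
    ("N", "Z", "shard3"),
]


def shard_for_surname(surname: str) -> str:
    """Return the name of the shard that holds users with this surname."""
    letter = surname.strip()[-1:].upper()   # "" if the surname is empty
    for first, last, shard in SHARD_RANGES:
        if first <= letter <= last:
            return shard
    raise ValueError(f"no shard configured for surname {surname!r}")
```

Raising an error for a surname that matches no range is one possible
design choice. It makes a gap in the shard layout visible immediately
rather than silently sending the user to a default server.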

When using sharding with EC2, place the web server and MySQL
server on separate EC2 instances, and implement the sharding
decision logic in your application. Once the application knows
which MySQL server holds the data it needs, it sends its queries
to that server. You can see a sample of this in the following
illustration.
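
One way to keep the sharding decision inside the application is a
small routing layer such as the sketch below. Everything named here
is hypothetical: the ShardRouter class, the host names, and the
execute callable, which stands in for whatever MySQL client code your
application uses to run a statement against a given host.

```python
from typing import Any, Callable, Dict


class ShardRouter:
    """Hypothetical application-side router from shard key to server."""

    def __init__(self, pick_shard: Callable[[str], str],
                 servers: Dict[str, str]) -> None:
        self._pick_shard = pick_shard   # maps a shard key to a shard name
        self._servers = servers         # maps a shard name to a host name

    def server_for(self, shard_key: str) -> str:
        """Return the host that holds the data for shard_key."""
        return self._servers[self._pick_shard(shard_key)]

    def run(self, shard_key: str, sql: str,
            execute: Callable[[str, str], Any]) -> Any:
        """Send sql to the server responsible for shard_key."""
        return execute(self.server_for(shard_key), sql)


# Example wiring. The host names are assumed internal EC2 names, and
# the two-way split on the first character of the key is illustrative.
router = ShardRouter(
    pick_shard=lambda key: "shard1" if key[:1].upper() <= "M" else "shard2",
    servers={"shard1": "mysql-shard1.internal",
             "shard2": "mysql-shard2.internal"},
)
```

Because the web servers hold only the routing table and not the data,
you can add web server instances without changing the shard layout.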

Warning

With sharding and EC2, be careful that the potential for failure
of an instance does not affect your application. If the EC2
instance that provides the MySQL server for a particular shard
fails, then all of the data on that shard becomes unavailable.