Application Scaling

Why is scaling out so important? We have several options regarding to increasing application performance. We can
buy new server hardware,
add more RAM, and so on. This is vertical scaling, or scaling up. However, this option can be pretty costly,
time-consuming, and has its limits.
You can go for cutting-edge technology, but it can still be insufficient for your application.

The second option is horizontal scaling, or scaling out. It implies extending the computing power of the
application by increasing the
number of servers where the application is located, or application instances in case of PaaS. In other words, if
the application is located
on one server, and at some point it stops "pulling its weight", we can simply buy another server,
identical to the first one, and install
our application on it. This way, the queries to the application will be divided between the two servers.

This is the fundamental principle of scaling out applications hosted in a "cloud", but instead of real
physical servers, we have the notion
of a virtual machine. When one copy of a virtual machine becomes insufficient for our application, we can
increase the performance by distributing
the workload between several virtual machines.

The capabilities of Microsoft cloud platform are quite broad. They include auto-scaling and on-demand scaling,
which are available through
a user interface as well as through SDK, REST API and PowerShell.

Although scaling applications (PaaS) or virtual machines (IaaS) is simple ? just define the necessary number of
instances and you got them ?
more questions will arise if your application uses MS SQL databases. The first thing that comes to mind is to
create a cluster of virtual
machines with SQL Server. This solution is relatively simple and widely known. But what if the application uses
the database as a service (SaaS)?
And what if we don't want to fiddle with setting up an SQL Server cluster?

Since we are dealing with Windows Azure, the SQL database should be SQL Azure. This database supports a scaling
out technology (sharding)
called SQL Azure Federations. It is very simple: logically independent lines of one table are stored in separate
databases. A simple example:

This is one and the same table, but its data is stored in separate copies of the database (shards). The account
information with ID 1 is stored
in the first database, with ID 2 - in the second one, and so on.

What do we get from all of this? First of all, data isolation. There is no connection between data from one
account and data from another.
Thereafter, sharding allows us not only to scale out databases, but also to implement multi-tenant scenarios.
Since the amount of data stored
in a specific shard is significantly smaller than the total size of the database, data retrieval is much faster.

SQL Azure limitations

Every technology has its drawbacks, and sharding is not an exception. Let's consider what drawbacks we may
encounter with the sharding technology,
using the database described in the example above.

As specified by the SQL Azure architecture, the service does not support data retrieval from several databases
simultaneously.
One database equals one connection. The shards are not an exception. We would have to run a separate query to
return the number of clients
in the database for every shard.

The logic of summing up the values returned by this query must be implemented in the application. As a result of
using federations, part of the
code will "go into" the application, as some of the database-level capabilities of regular SQL Server
are limited.

Foreword

SQL Azure Federations is not a silver bullet, of course. You can always develop and implement your own principle
for scaling out your database.
Let's assume that the multi-tenant approach is a kind of database scaling out, as one user's data is separated
from the data of another not only
"logically", but also "physically".

If we need to add new users, we configure separate databases for them. The problem is that the application logic
must contain a "routing" mechanism,
meaning the application must know which database it is currently working with.

But let's return to SQL Azure Federations.

The idea offered by Microsoft idea itself is worthy to be praised. It would be great to have a tool for scaling
out databases quickly and easily.
Add to that auto-scaling based on the results of specific queries (but that would be pure sci-fi)...

As a rule, before you make the final decision to start using SQL Azure Federations, you must perform a detailed
analysis of the existing database
(because you usually port existing databases), or plan out in minute detail the architecture of the database
that the application will be using,
as well as the logic of the application that will be working with the database.

That's all for now with the theory, but when it comes to practice, there is usually a significant number of
obstacles. Instead of learning how easy
it is to start using SQL Azure Federations from scratch, we will try to migrate an existing SQL Azure database
to using federations. We will observe
the steps performed by the DBA, as well as the problems that can be encountered during the migration.