12.4 Backup and recovery

In this section, we'll discuss another aspect of application management: data backup and recovery on production servers. We often encounter situations where production servers don't behave as we expect them to. Server network outages, hard drive malfunctions, operating system crashes and other similar events can cause databases to become unavailable. The need to recover from these types of events has led to the emergence of many cold standby/hot standby tools that can help to facilitate disaster recovery remotely. In this section, we'll explain how to backup deployed applications in addition to backing up and restoring any MySQL and Redis databases you might be using.

Application Backup

In most cluster environments, web applications do not need to be backed up since they are actually copies of code from our local development environment, or from a version control system. In many cases however, we need to backup data which has been supplied by the users of our site. For instance, when sites require users to upload files, we need to be able to backup any files that have been uploaded by users to our website. The current approach for providing this kind of redundancy is to utilize so-called cloud storage, where user files and other related resources are persisted into a highly available network of servers. If our system crashes, as long as user data has been persisted onto the cloud, we can at least be sure that no data will be lost.

But what about the cases where we did not backup our data to a cloud service, or where cloud storage was not an option? How do we backup data from our web applications then? Here, we describe a tool called rsync, which can be commonly found on unix-like systems. Rsync is a tool which can be used to synchronize files residing on different systems, and a perfect use-case for this functionality is to keep our website backed up.

Note: Cwrsync is an implementation of rsync for the Windows environment

Rsync installation

You can find the latest version of rsync from its official website. Of course, because rsync is very useful software, many Linux distributions will already have it installed by default.

Note: If want to compile and install the rsync from its source, you have to install gcc compiler tools such as job.

Note: Before using source packages compiled and installed, you have to install gcc compiler tools such as job

Rsync Configuration

Rsync can be configured from three main configuration files: rsyncd.conf which is the main configuration file, rsyncd.secrets which holds passwords, and rsyncd.motd which contains server information.

You can refer to the official documentation on rsync's website for more detailed explanations, but here we will simply introduce the basics of setting up rsync:.

Starting an rsync daemon server-side:

# /usr/bin/rsync --daemon --config=/etc/rsyncd.conf

the --daemon parameter is for running rsync in server mode. Make this the default boot-time setting by joining it to the rc.local file:

echo 'rsync --daemon' >> /etc/rc.d/rc.local

Setup an rsync username and password, making sure that it's owned only by root, so that local unauthorized users or exploits do not have access to it. If these permissions are not set correctly, rsync may not boot:

-avzP are some common options. Use rsync --help to review what these do.

--delete deletes extraneous files on the receiving side. For example, if files are deleted on the sending side, the next time the two machines are synchronized, the receiving sides will automatically delete the corresponding files.

--password-file specifies a password file for accessing an rsync daemon. On the client side, this is typically the client/etc/rsyncd.secrets file, and on the server side, it's /etc/rsyncd.secrets. When using something like Cron to automate rsync, you won't need to manually enter a password.

username specifies the username to be used in conjunction with the server-side /etc/rsyncd.secrets password

192.168.145.5 is the IP address of the server

::www (note the double colons), specifies contacting an rsync daemon directly via TCP for synchronizing the www module according to the server-side configurations located in /etc/rsyncd.conf. When only a single colon is used, the rsync daemon is not contacted directly; instead, a remote-shell program such as ssh is used as the transport .

In order to periodically synchronize files, you can set up a crontab file that will run rsync commands as often as needed. Of course, users can vary the frequency of synchronization according to how critical it is to keep certain directories or files up to date.

MySQL backup

MySQL databases are still the mainstream, go-to solution for most web applications. The two most common methods of backing up MySQL databases are hot backups and cold backups. Hot backups are usually used with systems set up in a master/slave configuration to backup live data (the master/slave synchronization mode is typically used for separating database read/write operations, but can also be used for backing up live data). There is a lot of information available online detailing the various ways one can implement this type of scheme. For cold backups, incoming data is not backed up in real-time as is the case with hot backups. Instead, data backups are performed periodically. This way, if the system fails, the integrity of data before a certain period of time can still be guaranteed. For instance, in cases where a system malfunction causes data to be lost and the master/slave model is unable to retrieve it, cold backups can be used for a partial restoration.

A shell script is generally used to implement regular cold backups of databases, executing synchronization tasks using rsync in a non-local mode.

The following is an example of a backup script that performs scheduled backups for a MySQL database. We use the mysqldump program which allows us to export the database to a file.

This sets up regular backups of your databases to the /var/www/mysql directory every day at 00:00, which can then be synchronized using rsync.

MySQL Recovery

We've just described some commonly used backup techniques for MySQL, namely hot backups and cold backups. To recap, the main goal of a hot backup is to be able to recover data in real-time after an application has failed in some way, such as in the case of a server hard-disk malfunction. We learned that this type of scheme can be implemented by modifying database configuration files so that databases are replicated onto a slave, minimizing interruption to services.

But sometimes we need to perform a cold backup of the SQL data recovery, as with database backup, you can import through the command:
Hot backups are, however, sometimes inadequate. There are certain situations where cold backups are required to perform data recovery, even if it's only a partial one. When you have a cold backup of your database, you can use the following MySQL command to import it:

mysql -u username -p databse < backup.sql

As you can see, importing and exporting database is a fairly simple matter. If you need to manage administrative privileges or deal with different character sets, this process may become a little more complicated, though there are a number of commands which will help you to do this.

Redis backup

Redis is one of the most popular NoSQL databases, and both hot and cold backup techniques can also be used in systems which use it. Like MySQL, Redis also supports master/slave mode, which is ideal for implementing hot backups (refer to Redis' official documentation to learn how to configure this; the process is very straightforward). As for cold backups, Redis routinely saves cached data in memory to the database file on-disk. We can simply use the rsync backup method described above to synchronize it with a non-local machine.

Redis recovery

Similarly, Redis recovery can be divided into hot and cold backup recovery. The methods and objectives of recovering data from a hot backup of a Redis database are the same as those mentioned above for MySQL, as long as the Redis application is using the appropriate database connection.

A Redis cold backup recovery simply involves copying backed-up database files into the working directory, then starting Redis on it. The database files are automatically loaded into memory at boot time; the speed with which Redis boots will depend on the size of the database files.

Summary

In this section, we looked at some techniques for backing up data as well as recovering from disasters which may occur after deploying our applications. We also introduced rsync, a tool which can be used to synchronize files on different systems. Using rsync, we can easily perform backup and restoration procedures for both MySQL and Redis databases, among others. We hope that by being introduced to some of these concepts, you will be able to develop disaster recovery procedures to better protect the data in your web applications.