
Appendix D. Testing Database Migrations

Django-migrations and its predecessor South have been around for ages,
so it’s not usually necessary to test database migrations. But it just
so happens that we’re introducing a dangerous type of migration, i.e., one
that introduces a new integrity constraint on our data. When I first ran
the migration script against staging, I saw an error.

On larger projects, where you have sensitive data, you may want the additional
confidence that comes from testing your migrations in a safe environment
before applying them to production data, so this toy example will hopefully
be a useful rehearsal.

Another common reason to want to test migrations is for speed—migrations
often involve downtime, and sometimes, when they’re applied to very large
datasets, they can take a long time. It’s good to know in advance how long that
might be.

An Attempted Deploy to Staging

When I first tried to deploy our new validation constraints in Chapter 14,
the migration failed.

Some of the existing data in the database violated the integrity constraint,
so the database complained when I tried to apply it.
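To see this kind of failure in miniature, here is a minimal sketch using Python's built-in sqlite3 module. The `items` table, its columns, and the index name are hypothetical stand-ins, not the project's real schema. It shows how to list the offending rows and what happens when a uniqueness constraint is applied on top of them.

```python
import sqlite3

# Hypothetical schema: (list_id, text) pairs are supposed to be unique.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE items (id INTEGER PRIMARY KEY, list_id INTEGER, text TEXT)"
)
conn.executemany(
    "INSERT INTO items (list_id, text) VALUES (?, ?)",
    [(1, "milk"), (1, "milk"), (1, "eggs"), (2, "milk")],
)


def find_duplicates(connection):
    """Return (list_id, text, count) for every pair that appears more than once."""
    return connection.execute(
        "SELECT list_id, text, COUNT(*) FROM items "
        "GROUP BY list_id, text HAVING COUNT(*) > 1 "
        "ORDER BY list_id, text"
    ).fetchall()


def try_add_constraint(connection):
    """Try to add the uniqueness constraint; return True if the database accepts it."""
    try:
        connection.execute(
            "CREATE UNIQUE INDEX items_list_text_uniq ON items (list_id, text)"
        )
    except sqlite3.IntegrityError:
        # Existing rows already violate the constraint, so it is rejected.
        return False
    return True


duplicates = find_duplicates(conn)
constraint_applied = try_add_constraint(conn)
```

Running a query like `find_duplicates` against a copy of the real data before deploying is a cheap way to find out whether a new constraint is going to be rejected.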

In order to deal with this sort of problem, we’ll need to build a “data
migration”. Let’s first set up a local environment to test against.

Running a Test Migration Locally

We’ll use a copy of the live database to test our migration against.
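As a rough sketch of the copying step, and assuming the live database is a single SQLite file that you have already fetched from the server, something like the following makes a working copy to experiment on. The file names are hypothetical, and the demonstration uses throwaway temporary files rather than real data.

```python
import os
import sqlite3
import tempfile


def copy_sqlite_database(source_path, dest_path):
    """Copy a SQLite database file using SQLite's online backup API."""
    source = sqlite3.connect(source_path)
    dest = sqlite3.connect(dest_path)
    try:
        source.backup(dest)
    finally:
        dest.close()
        source.close()


# Demonstration with throwaway files standing in for the real database.
workdir = tempfile.mkdtemp()
live_path = os.path.join(workdir, "live.sqlite3")  # hypothetical name
copy_path = os.path.join(workdir, "copy.sqlite3")  # hypothetical name

live = sqlite3.connect(live_path)
live.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, text TEXT)")
live.execute("INSERT INTO items (text) VALUES ('milk')")
live.commit()
live.close()

copy_sqlite_database(live_path, copy_path)

copy = sqlite3.connect(copy_path)
copied_rows = copy.execute("SELECT text FROM items").fetchall()
copy.close()
```

Working on a copy means a botched migration can be thrown away and retried without touching the original file.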

Warning

Be very, very, very careful when using real data for testing. For
example, you may have real customer email addresses in there, and you don’t
want to accidentally send them a bunch of test emails. Ask me how I know
this.

Entering Problematic Data

Start a list with some duplicate items on your live site, as shown in
Figure D-1.

Inserting a Data Migration

Data migrations are a special type of migration that modifies data in the database
rather than changing the schema. We need to create one that will run before
we apply the integrity constraint, to preventively remove any duplicates.
Here’s how we can do that:
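The original listing is not reproduced here, so what follows is only a sketch of what the function at the heart of such a data migration might look like. It assumes an app called `lists` with `List` and `Item` models, where each item has a `text` field and is reachable through `item_set`. Those names, and the function name itself, are assumptions. Note that it renames duplicates rather than deleting them, which is one way of getting rid of the duplication without losing data.

```python
def rename_duplicate_items(apps, schema_editor):
    """Data-migration step: make item texts unique within each list.

    `apps` is the historical app registry that Django passes to functions
    run via RunPython. The "lists" app and its List/Item models are
    assumptions made for this sketch.
    """
    List = apps.get_model("lists", "List")
    for list_ in List.objects.all():
        seen = set()
        for index, item in enumerate(list_.item_set.all()):
            if item.text in seen:
                # Append the item's position so the duplicate becomes distinct.
                # (A renamed text could still collide with an existing one;
                # this sketch does not handle that case.)
                item.text = "{} ({})".format(item.text, index)
                item.save()
            seen.add(item.text)


# In the migration file itself, the function would be listed in the
# migration's operations, along the lines of:
#
#     from django.db import migrations
#
#     class Migration(migrations.Migration):
#         dependencies = [...]  # the migration just before the constraint
#         operations = [migrations.RunPython(rename_duplicate_items)]
```

An empty migration file to hold this can be generated with `python manage.py makemigrations lists --empty`. The important part is the ordering: the data migration has to come before the migration that adds the constraint, so the latter should depend on it.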

Conclusions

This exercise was primarily aimed at building a data migration and testing it
against some real data. Inevitably, this is only a drop in the ocean of the
possible testing you could do for a migration. You could imagine building
automated tests to check that all your data was preserved, comparing the
database contents before and after. You could write individual unit tests
for the helper functions in a data migration. You could spend more time
measuring the time taken for migrations, and experiment with ways to speed
them up by, e.g., breaking migrations up into more or fewer component steps.
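As one illustration of the "before and after" idea, here is a minimal sketch that checks no rows went missing during a data migration. It uses an in-memory SQLite table with a hypothetical `items` schema, and a single UPDATE statement stands in for the real migration.

```python
import sqlite3


def snapshot(connection, table):
    """Return the set of primary keys currently in `table`."""
    # The table name is interpolated directly, so only pass trusted names.
    rows = connection.execute("SELECT id FROM {}".format(table)).fetchall()
    return {row[0] for row in rows}


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE items (id INTEGER PRIMARY KEY, list_id INTEGER, text TEXT)"
)
conn.executemany(
    "INSERT INTO items (list_id, text) VALUES (?, ?)",
    [(1, "milk"), (1, "milk"), (1, "eggs")],
)

before = snapshot(conn, "items")

# Stand-in for the data migration: rename the later duplicate, don't delete it.
conn.execute("UPDATE items SET text = 'milk (2)' WHERE id = 2")

after = snapshot(conn, "items")
data_preserved = before == after
```

Comparing primary keys only proves that no rows were lost. A more thorough check might also compare the columns the migration was not supposed to touch.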

Remember that this should be a relatively rare case. In my experience, I
haven’t felt the need to test 99% of the migrations I’ve worked on. But,
should you ever feel the need on your project, I hope you’ve found a few
pointers here to get started with.

On Testing Database Migrations

Be wary of migrations which introduce constraints

99% of migrations happen without a hitch, but be wary of any situations,
like this one, where you are introducing a new constraint on columns that
already exist.

Test migrations for speed

Once you have a larger project, you should think about testing how long
your migrations are going to take. Database migrations typically involve
downtime, as, depending on your database, the schema update operation may
lock the table it’s working on until it completes. It’s a good idea to use
your staging site to find out how long a migration will take.

Be extremely careful if using a dump of production data

To test migration speed realistically, you’ll want to fill your staging site’s
database with an amount of data that’s commensurate with the size of your
production data.
Explaining how to do that is outside of the scope of this book, but I will
say this: if you’re tempted to just take a dump of your production
database and load it into staging, be very careful. Production data
contains real customer details, and I’ve personally been responsible for
accidentally sending out a few hundred incorrect invoices after an
automated process on my staging server started processing the copied
production data I’d just loaded into it. Not a fun afternoon.
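The advice above about testing migrations for speed can be sketched with synthetic data rather than a production dump. This is only an illustration: it uses an in-memory SQLite database, a hypothetical `items` table, and the creation of a unique index as a stand-in for the schema change. Timings on a real database server will differ.

```python
import sqlite3
import time


def time_schema_change(row_count):
    """Fill a throwaway table with row_count rows and time adding a unique index."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE items (id INTEGER PRIMARY KEY, list_id INTEGER, text TEXT)"
    )
    # Every text is distinct, so the unique index can always be created.
    conn.executemany(
        "INSERT INTO items (list_id, text) VALUES (?, ?)",
        ((n % 100, "item {}".format(n)) for n in range(row_count)),
    )
    conn.commit()

    start = time.perf_counter()
    conn.execute(
        "CREATE UNIQUE INDEX items_list_text_uniq ON items (list_id, text)"
    )
    elapsed = time.perf_counter() - start
    conn.close()
    return elapsed


for rows in (1_000, 10_000, 100_000):
    print("{:>7} rows: {:.4f}s".format(rows, time_schema_change(rows)))
```

Measuring at a few different sizes gives a rough feel for how the time grows with the amount of data, which is more useful than a single number.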
