Menu

Counting changes

Decoupling your applications is tough work, and if by any change you started after reading the previous topic on this series, I don’t really expect you to be even halfway by now! Still, you might wonder what the next step in this evolution could be. Database Versioning is a sensible change to follow. If you put your code under source control, why wouldn’t you make the same choice with your database?

Keeping your database scripts under a control system allows you to keep detailed history of what has been done, as well as an easy audit of who has changed something. This discourages last-minute freebies, reducing the problems derived of undocumented changes. You could even include Continuous Integration or Continuous Deployment, launching the new scripts against your database server and executing tests to make sure that nothing is broken after any change.

The benefits are clear. We are invested in this change so… Where do we begin?

If this is a new project, it will be easy to define a versioning strategy from the start. Sadly, it won’t be the case in most businesses, which means we need to define a baseline for our database. A “version 0” of sorts. We will take a snapshot of the current database, taking into consideration the whole structure and only the minimum necessary data for our application to work. All this input will be saved into a single file, which we will use to create our database from scratch, if necessary.

After deciding the baseline, we have to include a new table to hold the associated changes. Could be something as simple as this:

The idea behind these fields is to be able to include semantic versioning in our scripts. This means that we provide a numeric representation of the state of the database as it grows and changes over time. Each number is increased once depending on the changes of the script that will be executed:

“Major” is increased when changes are not backwards compatible: changing a column’s name, deleting fields or tables…

“Minor” grows when the script only expand what already exists. Adding a new column or table, or applying conditional changes to existent data. Inserts could also be considered a “minor” change.

“Bugfix” must be updated when the script just solves issues created by previous scripts, without affecting structure. Fixing the content of a line with a specific ID is a good example.

We can check the state of our database with a simple query that just outputs “Major.Minor.Bugfix”. In order for this versioning to work, each single script should update this table after executing and committing the changes. As well, the script files should follow single naming rules that ensure that they always executed in the same order. For example, all scripts could follow the rule {Major}_{Minor}_{Bug}_{FreeText}, and a glance at the filename would reveal what version is the script intended for. This would make 1_0_0_Baseline an adequate name for the “version 0” that we talked about, then 1_1_0_NewTableForBilling the following script, and after that is just making sure that everyone follows the guidelines.

In order to make easier the life of everyone involved in database evolution, you should keep these rules in mind:

No changes to a file after committed: scripts are meant to be one-shots. Editing files already established with a version ruins the concept behind this control. Though inefficient, creating a second script to solve an error created by the first file will make errors easier to track and fix.

No branching: including branching policies on your source control files includes several degrees of complexity with the naming convention of the files. There is no easy way to fix this, so I strongly advise against working elsewhere than on the main timeline.

The goal behind Database Versioning is to push changes in a consistent and repeatable way. This will reduce impacts between teams when there are strong dependencies with the database, and will make any error tracking less of a hassle.