Database migrations with Liquibase

Liquibase is a very powerful and flexible database migration tool. It can be used as a stand-alone application to update the schema of any RDBMS it supports, but most of the time it is more beneficial to integrate it into a Java-based application. In this post I will cover the techniques for integrating it into a project.

Database migration in general

Database migration is the process of updating the schema of a (development, test or production) database from state A to state B. Database states are bound to different revisions of the application source tree. When updating the schema from state A to state B, A can be anything, including an initially empty database, but B must be at least one revision later. These states typically represent different versions of an application whose schema evolved as part of the ongoing development process. Database migration tools are designed to make this process traceable, visible, convenient and more fault-tolerant. Using them provides much the same benefits as using revision control systems for source code.

Application development frameworks and ORM providers often have support for migrations, but most of the time these are meant for rapid prototyping during development and don't meet the requirements for production use. For example, Hibernate has the hbm2ddl option with various settings, one of them being hbm2ddl=update. With it, Hibernate automatically issues the DDL statements necessary to create or modify the schema to fit the application's entities, but executing these statements might result in data loss. For more information, check the official documentation.
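For reference, this is how the setting is typically enabled in a JPA persistence.xml (a sketch; the property name follows Hibernate's configuration conventions):

```xml
<!-- persistence.xml: let Hibernate evolve the schema automatically.
     Convenient during development, risky in production. -->
<property name="hibernate.hbm2ddl.auto" value="update"/>
```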

Leading migration solutions are database-agnostic, so the migration can be portable between RDBMS products (obviously with certain limitations). This unburdens the process of changing the RDBMS that our application uses.

At this point we have to mention that these tools are designed for schema migration, but they can also be used for data migrations (inserts, updates, deletes) when nothing better is available: a changeSet may contain INSERT, UPDATE or any other query at any time.
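Such a data migration changeSet could look like this (a sketch; the table and author names are made up):

```xml
<changeSet id="2" author="alice">
    <!-- plain SQL inside a changeSet: a simple data migration -->
    <sql>UPDATE CAR SET name = 'unknown' WHERE name IS NULL</sql>
</changeSet>
```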

What is Liquibase?

Liquibase is one of the best-known migration tools for Java applications. In general it can be used to:

apply a set of changes,

analyze the schema of a database and export it as a changeLog for later use (which can be useful for introducing Liquibase into an already-in-production project).

It has an extensive feature set yet it’s simple enough to get started with.

Changelog

The very heart of a Liquibase-managed migration is the changeLog file. It defines what changes are necessary and in what order. Traditionally it is an XML document, but these days the XML, YAML, JSON and SQL formats are all supported.

We will stick to XML in the samples. Generally all practices are available in every format, and XML changeLogs are the most common in projects.
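A minimal XML changeLog with a single changeSet might look like this (a sketch; the table and author names are made up):

```xml
<databaseChangeLog
    xmlns="http://www.liquibase.org/xml/ns/dbchangelog"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:schemaLocation="http://www.liquibase.org/xml/ns/dbchangelog
        http://www.liquibase.org/xml/ns/dbchangelog/dbchangelog-3.2.xsd">

    <!-- each changeSet is identified by id + author + file -->
    <changeSet id="1" author="alice">
        <createTable tableName="CAR">
            <column name="id" type="bigint">
                <constraints primaryKey="true" nullable="false"/>
            </column>
            <column name="name" type="varchar(64)"/>
        </createTable>
    </changeSet>
</databaseChangeLog>
```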

General approach

The changelog file describes all the necessary schema changes in a strictly incremental fashion, each change depending on all the previous changes in the order of definition (or, more precisely, expecting that the previous changes have been or will be run in the order of definition). The changelog file is continuously appended to by the developers, and once a change has been introduced into it (and potentially executed against some DB), it must never be altered, because that would break Liquibase's expectations. This is because Liquibase maintains a DATABASECHANGELOG table that records which changes have already run, along with an MD5 hash computed from the metadata and content of each changeSet. The hash is (re)computed for every described change each time an update is executed, and Liquibase checks the DATABASECHANGELOG table to determine whether the given changeSet has run or not.

Advanced features: preconditions, custom preconditions, boolean logic

Basics

You can define preliminary requirements that must hold before a changeSet is run. If the precondition evaluates to true, the change will run. If it's false, the developer can choose the outcome:

HALT: An error message is emitted and the migration process stops. The changeSet and the following changeSets won't be marked as run.

CONTINUE: Skips the actual changeSet and continues with the next one. The changeSet won't be marked as run, and on the next update Liquibase will attempt to execute it again.

MARK_RAN: The changeSet won't be run, but it will be marked as run. This is commonly used when introducing Liquibase to an already-used database (you expect some object not to exist, and on failure you know it's already there, so you mark the change as run).

WARN: A warning message is emitted and execution continues with the next changeSet.
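The options above map to the onFail (and onError) attributes of the preConditions element. A sketch, with made-up table and author names, combining two built-in preconditions (multiple preconditions are ANDed by default):

```xml
<changeSet id="3" author="alice">
    <!-- halt if any NULL names remain; warn (don't halt) if the check
         itself cannot be evaluated -->
    <preConditions onFail="HALT" onError="WARN">
        <dbms type="postgresql"/>
        <sqlCheck expectedResult="0">
            SELECT COUNT(*) FROM CAR WHERE name IS NULL
        </sqlCheck>
    </preConditions>
    <addNotNullConstraint tableName="CAR" columnName="name"
                          columnDataType="varchar(64)"/>
</changeSet>
```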

Custom precondition example

A custom precondition is in fact a Java class, created by the application developer, which implements the liquibase.precondition.CustomPrecondition interface. Using the Liquibase API, an arbitrary precondition can be expressed. There are some conventions to make sure your class signals condition errors the right way; examine the handling of the exceptions in the example.

You can assign any number of attributes to your class. They should be non-final private members without an assigned value. Liquibase will fill them via reflection with the values supplied in the corresponding XML attributes.

In the following example, by nesting the precondition into a not element, we expect the custom precondition to evaluate to false, which gives us a 'foreign key does not exist' precondition before actually creating one.
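The changeSet side of this could look like the following sketch (the class, table, column and constraint names are all made up):

```xml
<changeSet id="4" author="alice">
    <!-- create the FK only if the custom check says it does not exist yet;
         if it already exists, just mark the changeSet as run -->
    <preConditions onFail="MARK_RAN">
        <not>
            <customPrecondition className="com.example.ForeignKeyExistsPrecondition">
                <param name="tableName" value="CAR"/>
                <param name="columnName" value="owner_id"/>
                <param name="foreignTableName" value="PERSON"/>
            </customPrecondition>
        </not>
    </preConditions>
    <addForeignKeyConstraint baseTableName="CAR" baseColumnNames="owner_id"
        referencedTableName="PERSON" referencedColumnNames="id"
        constraintName="fk_car_owner"/>
</changeSet>
```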

The following custom precondition can be used to check whether a column of a table has a foreign key constraint referencing a foreign table, without knowing the name of that constraint.
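A sketch of such a class (the class name and the reflection-filled attribute names are made up; note the convention of throwing CustomPreconditionFailedException for "condition is false" and CustomPreconditionErrorException for "could not evaluate"):

```java
import java.sql.DatabaseMetaData;
import java.sql.ResultSet;

import liquibase.database.Database;
import liquibase.database.jvm.JdbcConnection;
import liquibase.exception.CustomPreconditionErrorException;
import liquibase.exception.CustomPreconditionFailedException;
import liquibase.precondition.CustomPrecondition;

public class ForeignKeyExistsPrecondition implements CustomPrecondition {

    // filled by Liquibase via reflection from the <param> elements
    private String tableName;
    private String columnName;
    private String foreignTableName;

    @Override
    public void check(final Database database)
            throws CustomPreconditionFailedException, CustomPreconditionErrorException {
        try {
            final JdbcConnection conn = (JdbcConnection) database.getConnection();
            final DatabaseMetaData meta = conn.getUnderlyingConnection().getMetaData();
            try (ResultSet rs = meta.getImportedKeys(null, null, tableName)) {
                while (rs.next()) {
                    if (columnName.equalsIgnoreCase(rs.getString("FKCOLUMN_NAME"))
                            && foreignTableName.equalsIgnoreCase(rs.getString("PKTABLE_NAME"))) {
                        return; // FK found: the precondition evaluates to true
                    }
                }
            }
            // the condition is false: signal it as a "failed" precondition
            throw new CustomPreconditionFailedException("No FK from "
                    + tableName + "." + columnName + " to " + foreignTableName);
        } catch (CustomPreconditionFailedException e) {
            throw e;
        } catch (Exception e) {
            // anything else is an evaluation error, not a false condition
            throw new CustomPreconditionErrorException("Error checking foreign key", e);
        }
    }

    // setters for tableName, columnName, foreignTableName go here
}
```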

Further advanced features: custom changes

Why would anyone implement database DDL/DML in Java when there are DB-agnostic refactoring commands or, as a last resort, native SQL queries? Well, the perfect example is a special data migration that is easy to implement in Java thanks to its rich feature set (complex String manipulation, hashing plain-text password columns, and so on), but would be painful or hacky to do the standard ways.

You can include a custom change implementation with the customChange tag.

Using the custom change

```xml
<customChange class="com.ctp.confluence.CustomChange">
    <param name="id" value="2"/>
    <param name="newName" value="Astra"/>
</customChange>
```

Custom change implementation

```java
import java.sql.PreparedStatement;
import java.sql.SQLException;

import liquibase.change.custom.CustomTaskChange;
import liquibase.database.Database;
import liquibase.database.jvm.JdbcConnection;
import liquibase.exception.CustomChangeException;
import liquibase.exception.DatabaseException;

public class CustomChange implements CustomTaskChange {

    private String id;
    private String newName;

    // getters, setters

    @Override
    public void execute(final Database database) throws CustomChangeException {
        final JdbcConnection conn = (JdbcConnection) database.getConnection();
        try {
            conn.setAutoCommit(false);
            final String updateSQL = "UPDATE CAR SET name = ? WHERE id = ?";
            final PreparedStatement stmt = conn.prepareStatement(updateSQL);
            stmt.setString(1, newName);
            stmt.setString(2, id);
            stmt.executeUpdate();
            conn.commit();
        } catch (DatabaseException | SQLException e) {
            // signal the failure to Liquibase instead of swallowing it
            throw new CustomChangeException(e);
        }
    }

    // other overrides (getConfirmationMessage, setUp, setFileOpener, validate)
}
```

Liquibase commands

Liquibase supports a number of commands. The command is what Liquibase will actually do upon invocation, and one invocation is initiated with exactly one of the following commands:

Update: Looks at the provided changeLog and calculates an MD5 hash for each changeSet. Then it looks into the DATABASECHANGELOG table, checks which changeSets have not run yet, runs them, and updates DATABASECHANGELOG to mark them as run.

Rollback: Rolls back previously executed changeSets.

Diff: Checks the database and reports the differences between its schema and the provided changeLog.

Generate changelog: Analyzes the database schema and generates a changeLog which can be used to produce the same schema on an initial database. (Generally it works well, but the documentation states that for various RDBMS products it lacks some features, such as functions for Oracle, which must be added by hand afterwards.)
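From the command line, an update invocation looks roughly like this (a sketch: the driver, URL, credentials and changelog path are all placeholders):

```shell
# run all pending changeSets from the changelog against the target DB
liquibase --driver=org.postgresql.Driver \
          --url=jdbc:postgresql://localhost:5432/appdb \
          --username=app --password=secret \
          --changeLogFile=db/changelog-master.xml \
          update
```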

Integration into the application

This topic is strictly about the technical details of integrating Liquibase into a Java SE or EE app and doesn't cover the steps necessary when you have to introduce it to existing databases with schema and data. That is discussed in a separate section.

Integrating Liquibase

Liquibase is available from Maven Central. The version used here was the latest at the time of writing, so you should check whether there is a newer one.

Liquibase Maven artifact

```xml
<dependency>
    <groupId>org.liquibase</groupId>
    <artifactId>liquibase-core</artifactId>
    <version>3.2.2</version>
</dependency>
```

If you want to use the Liquibase Maven plugin to execute migrations from a Maven goal, you have to include the plugin as well.
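A sketch of the plugin declaration (the URL, credentials and changelog path are placeholders); afterwards, mvn liquibase:update runs the migration:

```xml
<plugin>
    <groupId>org.liquibase</groupId>
    <artifactId>liquibase-maven-plugin</artifactId>
    <version>3.2.2</version>
    <configuration>
        <changeLogFile>src/main/resources/db/changelog-master.xml</changeLogFile>
        <url>jdbc:postgresql://localhost:5432/appdb</url>
        <username>app</username>
        <password>secret</password>
    </configuration>
</plugin>
```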

Important: always place the Liquibase listener before any other listener. The Java EE specification requires container implementations to call listeners in the defined order; not putting it in the first position might cause interference with ORM providers and the like.
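In a web application this means registering Liquibase's servlet listener first in web.xml. A sketch (the changelog path and the datasource JNDI name are placeholders):

```xml
<!-- web.xml: Liquibase must be the FIRST listener declared -->
<context-param>
    <param-name>liquibase.changelog</param-name>
    <param-value>db/changelog-master.xml</param-value>
</context-param>
<context-param>
    <param-name>liquibase.datasource</param-name>
    <param-value>java:comp/env/jdbc/appDS</param-value>
</context-param>
<listener>
    <listener-class>liquibase.integration.servlet.LiquibaseServletListener</listener-class>
</listener>
```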

This way, at each (re)deployment, the schema migration will be handled automatically by Liquibase.

There is another option for the cases when the application uses a CDI container. The official documentation is here.

Introduction into existing databases

Introduction into existing databases means you have one or more databases with schema and data which are valuable, so introducing Liquibase must not alter them in any way.

Technically speaking, the only thing missing from such a database is the DATABASECHANGELOG table, which is created and maintained by Liquibase to track the changes. On the application's side, you also need a correct databaseChangeLog whose execution results in a schema equal to the existing one.

We will divide the process into two different operations:

Creating the correct databaseChangeLog which can produce exactly the same schema we need

Creating the correct meta table, to make Liquibase think it has been taking care of migrations since the beginning

Creating the correct databaseChangeLog

There are a number of possibilities.

The developer analyzes the schema by eye and writes all the changeSets by hand. This is error-prone and could take a very long time (and will drive the developer mad).

(preferred) The developer uses Liquibase's "generate changelog" feature, then checks the documentation for the schema constructs it skipped and appends them by hand (for Oracle, functions won't be exported for sure; the missing support for the various RDBMS products is documented).

Let’s assume we have a complete and correct changeLog that describes our schema with 100% precision, and hop to the next step.

Creating the correct meta table

Again, we have two possibilities. This time we won’t mark any of them as preferred.

As I wrote in the previous section, we have a correct changeLog. We could execute it against an initially empty database, extract the Liquibase meta table and insert it into our production databases. Tempting: we could save valuable time, but it requires strong attention not to mess anything up.

Take our changeLog and insert correct preconditions before each and every change.

This could take some time depending on the complexity of our schema.

In general, we should create a precondition which expects some database object NOT to exist (the object we want to create). Upon failure (it does exist), the change is marked as run with the MARK_RAN option (see above).

The obvious advantage is that we end up with a changeLog that can be applied to any of the in-use databases as well as to an empty database.
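The per-change precondition pattern looks like this sketch (table and author names are made up):

```xml
<changeSet id="1" author="alice">
    <!-- on an existing DB the table is already there: the precondition
         fails and the changeSet is only marked as run; on an empty DB
         the precondition passes and the table gets created -->
    <preConditions onFail="MARK_RAN">
        <not>
            <tableExists tableName="CAR"/>
        </not>
    </preConditions>
    <createTable tableName="CAR">
        <column name="id" type="bigint">
            <constraints primaryKey="true" nullable="false"/>
        </column>
        <column name="name" type="varchar(64)"/>
    </createTable>
</changeSet>
```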

Extensions

Hibernate Integration

Allows automatic generation of changeSets based on the differences between the entity classes and the database schema. Can be found here.

Oracle extensions

Allows many new refactor tags for Oracle like CreateMaterializedView, AddDeferredPrimaryKey and so on. Can be found here.

Conclusions

I used Liquibase for about 3 years on a Java application development project. The tool passed the test of time and its various features were used daily. As time passed, the application underwent several structural changes, such as the migration from container-based operation to standalone operation (GlassFish -> self-contained Jetty) and the introduction of invocation from the build tool alongside direct invocation (using JCommander), but we always kept Liquibase around to do its work.

In my opinion, the best thing about integrating Liquibase into the source tree was that when a developer browsed older revisions of the source tree (with some visual Git repository viewer), the commits contained application source code changes and database schema changes hand in hand, and it was easy to track which part of the schema was modified by whom and when.

There were occasional complex data migration tasks where we used Java-based custom changes. It was very easy to do the necessary DML operations, because we got a standard JDBC connection and could use Java's rich feature set for the task, instead of trying to achieve our goals with raw SQL statements.
