Name

Synopsis

Description

This tutorial explains how to create a Sqitch-enabled PostgreSQL project, use a VCS for deployment planning, and work with other developers to make sure changes remain in sync and in the proper order.

We'll start by creating a new project from scratch, a fictional antisocial networking site called Flipr. All examples use Git as the VCS and PostgreSQL as the storage engine, but for the most part you can substitute other VCSes and database engines in the examples as appropriate.

If you're a Git user and want to follow along with the history, the repository used in these examples is on GitHub.

Now that we have a repository, let's get started with Sqitch. Every Sqitch project must have a name associated with it, and, optionally, a unique URI. We recommend including the URI, as it increases the uniqueness of object identifiers internally, so let's specify one when we initialize Sqitch:
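As a rough sketch, the initialization might look like the following. The project name flipr and the --engine pg option come from this tutorial; the URI is a made-up placeholder, so substitute your own. The guard only skips the command when sqitch is not installed.

```shell
# Initialize a Sqitch project named "flipr" for the PostgreSQL engine.
# The URI is a hypothetical placeholder, not the tutorial's real one.
if command -v sqitch >/dev/null 2>&1; then
    sqitch init flipr --uri https://example.com/flipr/ --engine pg
    init_status=ran
else
    echo "sqitch not found in PATH; skipping init" >&2
    init_status=skipped
fi
```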

Good, it picked up on the fact that we're creating changes for the PostgreSQL engine, thanks to the --engine pg option, and saved it to the configuration file. Furthermore, it wrote a commented-out [engine "pg"] section listing all the available PostgreSQL engine-specific settings, ready to be edited as appropriate.

By default, Sqitch will read sqitch.conf in the current directory for settings. But it will also read ~/.sqitch/sqitch.conf for user-specific settings. Since PostgreSQL's psql client is not in the path on my system, let's go ahead and tell it where to find the client:

> sqitch config --user engine.pg.client /opt/local/pgsql/bin/psql

And let's also tell it who we are, since this data will be used in all of our projects:

Note that it has picked up on the name and URI of the app we're building. Sqitch uses this data to manage cross-project dependencies. The %syntax-version pragma is always set by Sqitch, so that it always knows how to parse the plan, even if the format changes in the future.

The add command adds a database change to the plan and writes deploy, revert, and verify scripts that represent the change. Now we edit these files. The deploy script's job is to create the schema. So we add this to deploy/appschema.sql:

CREATE SCHEMA flipr;

The revert script's job is to precisely undo the change made by the deploy script, so we add this to revert/appschema.sql:

DROP SCHEMA flipr;

Now we can try deploying this change. We tell Sqitch where to send the change via a database URI:

First, Sqitch created the registry tables used to track database changes. The structure and name of the registry vary between databases (PostgreSQL uses a schema to namespace its registry, while SQLite and MySQL use separate databases). Next, Sqitch deploys changes. We only have one so far; the + reinforces the idea that the change is being added to the database.

With this change deployed, if you connect to the database, you'll be able to see the schema:

Trust, But Verify

But that's too much work. Do you really want to do something like that after every deploy?

Here's where the verify script comes in. Its job is to test that the deploy did what it was supposed to. It should do so without regard to any data that might be in the database, and should throw an error if the deploy was not successful. In PostgreSQL, the simplest way to do so for non-queryable objects such as schemas is to take advantage of the access privilege inquiry functions. These functions conveniently throw exceptions if the object being inquired about does not exist. For our new schema, has_schema_privilege() will do very nicely. Put this query into verify/appschema.sql:

SELECT pg_catalog.has_schema_privilege('flipr', 'usage');

Such functionality may not be available in other databases, but you can use any query that will throw an exception if the schema doesn't exist. One handy way to do that is to divide by zero if an object doesn't exist. So for other databases, assuming division by zero is fatal, you could do something like this:
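One possible shape for such a script, written here as a shell heredoc so it can be pasted from the project root. It assumes the engine exposes information_schema.schemata and treats division by zero as an error; check both for your database before relying on it.

```shell
# Write a portable verify script. COUNT(*) is 0 when the schema is missing,
# so 1/COUNT(*) fails on engines where division by zero is an error.
mkdir -p verify
cat > verify/appschema.sql <<'SQL'
SELECT 1/COUNT(*)
  FROM information_schema.schemata
 WHERE schema_name = 'flipr';
SQL
```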

Looks good! If you want to make sure that the verify script correctly dies if the schema doesn't exist, temporarily change the schema name in the script to something that doesn't exist, something like:
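For instance, a temporary version of the PostgreSQL verify script could be written like this. The name nonesuch is arbitrary and only assumed not to exist in your database.

```shell
# Temporarily point the verify query at a schema that should not exist.
# "nonesuch" is a made-up name; verification is expected to fail with it.
mkdir -p verify
cat > verify/appschema.sql <<'SQL'
SELECT pg_catalog.has_schema_privilege('nonesuch', 'usage');
SQL
```

Remember to change the name back to flipr once you've seen the verification fail.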

The revert command first prompts to make sure that we really do want to revert. This is to prevent unnecessary accidents. You can pass the -y option to disable the prompt. Also, notice the - before the change name in the output, which reinforces that the change is being removed from the database. And now the schema should be gone:

On Target

I'm getting a little tired of always having to type db:pg:flipr_test, aren't you? This database connection URI tells Sqitch how to connect to the deployment target, but we don't have to keep using the URI. We can name the target:

> sqitch target add flipr_test db:pg:flipr_test

The target command, inspired by git-remote, allows management of one or more named deployment targets. We've just added a target named flipr_test, which means we can use the string flipr_test for the target, rather than the URI. But since we're doing so much testing, we can also use the engine command to tell Sqitch to deploy to the flipr_test target by default:

> sqitch engine add pg flipr_test

Now we can omit the target argument altogether, unless we need to deploy to another database. Which we will, eventually, but at least our examples will be simpler from here on in, e.g.:

Note that we're requiring the appschema change as a dependency of the new users change. Although that change has already been added to the plan and therefore should always be applied before the users change, it's a good idea to be explicit about dependencies.

Now edit the scripts. When you're done, deploy/users.sql should look like this:
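Here is a sketch of what that file could contain, written via a heredoc. The column names match the verify script used for this change; the column types, constraints, and default are assumptions made for illustration.

```shell
# Write the users deploy script. Line 2 records the appschema dependency.
mkdir -p deploy
cat > deploy/users.sql <<'SQL'
-- Deploy flipr:users to pg
-- requires: appschema

BEGIN;

-- Column types and constraints below are assumptions for this sketch.
CREATE TABLE flipr.users (
    nickname  TEXT        PRIMARY KEY,
    password  TEXT        NOT NULL,
    timestamp TIMESTAMPTZ NOT NULL DEFAULT NOW()
);

COMMIT;
SQL
```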

A few things to notice here. On the second line, the dependency on the appschema change has been listed. This doesn't do anything, but the default PostgreSQL deploy template lists it here for your reference while editing the file. Useful, right?

Notice that all of the SQL code is wrapped in a transaction. This is handy for PostgreSQL deployments, because PostgreSQL DDLs are transactional. The upshot is that if any part of this deploy script fails, the whole change fails. This approach may work less well for database engines that don't support transactional DDLs.

The table itself will be created in the flipr schema. This is why we need to require the appschema change.

Now for the verify script. The simplest way to check that the table was created and has the expected columns without touching the data? Just select from the table with a false WHERE clause. Add this to verify/users.sql:

SELECT nickname, password, timestamp
FROM flipr.users
WHERE FALSE;

Now for the revert script: all we have to do is drop the table. Add this to revert/users.sql:

DROP TABLE flipr.users;

Couldn't be much simpler, right? Let's deploy this bad boy:

> sqitch deploy
Deploying changes to flipr_test
+ users .. ok

We know, since verification is enabled, that the table must have been created. But for the purposes of visibility, let's have a quick look:

Note that we've used the --to option to specify the change to revert to. And what do we revert to? The symbolic tag @HEAD, when passed to revert, always refers to the last change deployed to the database. (For other commands, it refers to the last change in the plan.) Appending the caret (^) tells Sqitch to select the change prior to the last deployed change. So we revert to appschema, the penultimate change. The other potentially useful symbolic tag is @ROOT, which refers to the first change deployed to the database (or in the plan, depending on the command).

Back to the database. The users table should be gone but the flipr schema should still be around:

Flip Out

Now that we've got the basics of user management done, let's get to work on the core of our product, the "flip." Since other folks are working on other tasks in the repository, we'll work on a branch, so we can all stay out of each other's way. So let's branch:

Wash, Rinse, Repeat

Now comes the time to add functions to manage flips. I'm sure you have things nailed down now. Go ahead and add insert_flip and delete_flip changes and commit them. The insert_flip deploy script might look something like:
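A hedged sketch of such a script follows. It assumes a flips change that creates a flipr.flips table with an auto-generated id column plus nickname and body columns; none of those column names are confirmed by this section, so adjust them to your actual table.

```shell
# Write a possible insert_flip deploy script.
mkdir -p deploy
cat > deploy/insert_flip.sql <<'SQL'
-- Deploy flipr:insert_flip to pg
-- requires: flips
-- requires: appschema

BEGIN;

-- Assumes flipr.flips has columns id (generated), nickname, and body.
CREATE OR REPLACE FUNCTION flipr.insert_flip(
    nickname TEXT,
    body     TEXT
) RETURNS BIGINT LANGUAGE sql SECURITY DEFINER AS $$
    INSERT INTO flipr.flips (nickname, body)
    VALUES ($1, $2)
    RETURNING id;
$$;

COMMIT;
SQL
```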

Oh, a conflict in sqitch.plan. Not too surprising, since both the merged lists branch and our flips branch added changes to the plan. Let's try a different approach.

The truth is, we got lazy. Those changes when we pulled master from the origin should have raised a red flag. It's considered a bad practice not to look at what's changed in master before merging in a branch. What one should do is either:

Rebase the flips branch from master before merging. This "rewinds" the branch changes, pulls from master, and then replays the changes back on top of the pulled changes.

Create a patch and apply that to master. This is the sort of thing you might have to do if you're sending changes to another user, especially if the VCS is not Git.

So let's restore things to how they were at master:

> git reset --hard HEAD
HEAD is now at ff60b9b Merge branch 'lists'

That throws out our botched merge. Now let's go back to our branch and rebase it on master:

> git checkout flips
Switched to branch 'flips'
> git rebase master
First, rewinding head to replay your work on top of it...
Applying: Add flips table.
Using index info to reconstruct a base tree...
M sqitch.plan
Falling back to patching base and 3-way merge...
Auto-merging sqitch.plan
CONFLICT (content): Merge conflict in sqitch.plan
Failed to merge in the changes.
Patch failed at 0001 Add flips table.
The copy of the patch that failed is found in:
.git/rebase-apply/patch
When you have resolved this problem, run "git rebase --continue".
If you prefer to skip this patch, run "git rebase --skip" instead.
To check out the original branch and stop rebasing, run "git rebase --abort".

Oy, that's kind of a pain. It seems like no matter what we do, we'll need to resolve conflicts in that file. Except in Git. Fortunately for us, we can tell Git to resolve conflicts in sqitch.plan differently. Because we only ever append lines to the file, we can have it use the "union" merge driver, which, according to its docs:

Run 3-way file level merge for text files, but take lines from both versions, instead of leaving conflict markers. This tends to leave the added lines in the resulting file in random order and the user should verify the result. Do not use this if you do not understand the implications.

This has the effect of appending lines from all the merging files, which is exactly what we need. So let's give it a try. First, back out the botched rebase:

> git rebase --abort

Now add the union merge driver to .gitattributes for sqitch.plan and rebase again:
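The only line that matters for the project is the one that writes .gitattributes. The rest of this sketch is a throwaway demonstration in a scratch repository, with made-up plan contents and committer identity, showing that two branches appending to sqitch.plan merge without conflict markers once the union driver is in place.

```shell
set -e
# Work in a scratch repository so no real project is touched.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Demo User"          # hypothetical identity
git config user.email "demo@example.com"  # hypothetical identity

# A stand-in plan file plus the attribute that selects the union driver.
printf '%%project=flipr\n\nappschema\n' > sqitch.plan
echo 'sqitch.plan merge=union' > .gitattributes
git add sqitch.plan .gitattributes
git commit -q -m 'Initial plan'
base=$(git symbolic-ref --short HEAD)

# Each branch appends its own change to the end of the plan.
git checkout -q -b flips
echo 'flips' >> sqitch.plan
git commit -q -am 'Add flips'

git checkout -q "$base"
echo 'lists' >> sqitch.plan
git commit -q -am 'Add lists'

# Without the attribute this merge would conflict; with it, both lines are kept.
git merge -q --no-edit -m 'Merge flips' flips
cat sqitch.plan
```

As the Git documentation warns, the union driver does not guarantee line order, so it's still worth glancing at the merged plan to confirm that changes appear after the changes they depend on.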

Note the use of rebase, which combines a revert and a deploy into a single command. Handy, right? It correctly reverted our changes, and then deployed them all again in the proper order. So let's commit .gitattributes; seems worthwhile to keep that change:

If user "foo" ever got access to the database, she could quickly discover that user "bar" has the same password and thus be able to exploit the account. Not a great idea. So we need to modify the insert_user() and change_pass() functions to fix that. How?

We'll use pgcrypto's crypt() function to encrypt passwords with a salt, so that they're all unique. We just add a change to add pgcrypto to the database, and then we can use it. The deploy script should be:

CREATE EXTENSION pgcrypto;

And the revert script should be:

DROP EXTENSION pgcrypto;

If you're on PostgreSQL 9.0 or lower, you won't be able to deploy pgcrypto with a Sqitch change, alas. You'll have to install it manually, like so:

psql -d flipr_test -f /path/to/pgsql/share/contrib/pgcrypto.sql

Don't forget to do this with your staging and production databases, too. Or consider upgrading to PostgreSQL 9.1 or higher; the SQL-level extension support is amazingly useful.

We're going to use the crypt() and gen_salt() functions, so in the verify script, let's make sure that the extension exists and that both those functions exist:
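One way to write that verify script is sketched below. It assumes PostgreSQL 9.1 or higher (for the pg_extension catalog) and that the pgcrypto functions are reachable through the search path.

```shell
# Write a verify script for the pgcrypto change.
mkdir -p verify
cat > verify/pgcrypto.sql <<'SQL'
-- Verify flipr:pgcrypto on pg

BEGIN;

-- Division by zero if the extension is not installed.
SELECT 1/COUNT(*) FROM pg_catalog.pg_extension WHERE extname = 'pgcrypto';

-- Each call raises an error if the named function does not exist.
SELECT pg_catalog.has_function_privilege('crypt(text, text)', 'execute');
SELECT pg_catalog.has_function_privilege('gen_salt(text)', 'execute');

ROLLBACK;
SQL
```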

Now we can use pgcrypto. But how to deploy the changes to insert_user() and change_pass()?

Normally, modifying functions in database changes is a PITA. You have to make changes like these:

Copy deploy/insert_user.sql to deploy/insert_user_crypt.sql.

Edit deploy/insert_user_crypt.sql to switch from MD5() to crypt() and to add a dependency on the pgcrypto change.

Copy deploy/insert_user.sql to revert/insert_user_crypt.sql. Yes, copy the original change script to the new revert change.

Copy verify/insert_user.sql to verify/insert_user_crypt.sql.

Edit verify/insert_user_crypt.sql to test that the function now properly uses crypt().

Test the changes to make sure you can deploy and revert the insert_user_crypt change.

Now do the same for the change_pass scripts.
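To make the tedium concrete, the copying part of that manual procedure can be sketched as follows. It runs in a scratch directory with stand-in files (their contents are placeholders), and the editing and testing steps are left as comments.

```shell
# Use a scratch directory with placeholder scripts so nothing real is touched.
scratch=$(mktemp -d)
cd "$scratch"
mkdir -p deploy revert verify
echo '-- original deploy script' > deploy/insert_user.sql
echo '-- original verify script' > verify/insert_user.sql

# Copy the deploy script; the copy is then edited to use crypt().
cp deploy/insert_user.sql deploy/insert_user_crypt.sql
# The original deploy script becomes the revert script for the new change.
cp deploy/insert_user.sql revert/insert_user_crypt.sql
# Copy the verify script; the copy is then edited to check for crypt().
cp verify/insert_user.sql verify/insert_user_crypt.sql

# Remaining manual work: edit the two *_crypt.sql copies, test a deploy and
# a revert, then repeat everything for the change_pass scripts.
ls deploy revert verify
```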

But you can have Sqitch do it for you. The only requirement is that a tag appear between the two instances of a change we want to modify. In general, you're going to make a change like this after a release, which you've tagged anyway, right? Well, we have, with @v1.0.0-dev2 added in the previous section. With that, we can let Sqitch do most of the hard work for us, thanks to the rework command, which is similar to add, including support for the --requires option:

Oh, so we can edit those files in place. Nice! How does Sqitch do it? Well, in point of fact, it has copied the files to stand in for the previous instance of the insert_user change, which we can see via git status:

The "untracked files" part of the output is the first thing to notice. They are all named insert_user@v1.0.0-dev2.sql. What that means is: "the insert_user change as it was implemented as of the @v1.0.0-dev2 tag." These are copies of the original scripts, and thereafter Sqitch will find them when it needs to run scripts for the first instance of the insert_user change. As such, it's important not to change them again. But hey, if you're reworking the change, you shouldn't need to.

The other thing to notice is that revert/insert_user.sql has changed. Sqitch replaced it with the original deploy script. As of now, deploy/insert_user.sql and revert/insert_user.sql are identical. This is on the assumption that the deploy script will be changed (we're reworking it, remember?), and that the revert script should actually change things back to how they were before. Of course, the original deploy script may not be idempotent -- that is, able to be applied multiple times without changing the result beyond the initial application. If it's not, you will likely need to modify it so that it properly restores things to how they were after the original deploy script was deployed. Or, more simply, it should revert changes back to how they were as-of the deployment of deploy/insert_user@v1.0.0-dev2.sql.

Fortunately, our function deploy scripts are already idempotent, thanks to the use of the OR REPLACE clause. No matter how many times a deployment script is run, the end result will be the same instance of the function, with no duplicates or errors.

As a result, there is no need to explicitly add changes. So go ahead. Modify the script to switch to crypt(). Make this change to deploy/insert_user.sql:
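A sketch of the reworked script might look like this. The function signature, the requires list, and the column list are assumptions (the users verify script suggests nickname and password columns), and gen_salt('md5') is just one of the algorithms pgcrypto accepts.

```shell
# Overwrite the deploy script in place with a crypt()-based version.
mkdir -p deploy
cat > deploy/insert_user.sql <<'SQL'
-- Deploy flipr:insert_user to pg
-- requires: users
-- requires: appschema
-- requires: pgcrypto

BEGIN;

-- Signature and column list are assumptions for this sketch.
CREATE OR REPLACE FUNCTION flipr.insert_user(
    nickname TEXT,
    password TEXT
) RETURNS VOID LANGUAGE SQL SECURITY DEFINER AS $$
    INSERT INTO flipr.users (nickname, password)
    VALUES ($1, crypt($2, gen_salt('md5')));
$$;

COMMIT;
SQL
```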

Yes, it works! Sqitch properly finds the original instances of these changes in the new script files that include tags.

But what about the verify script? How can we verify that the functions have been modified to use crypt()? I think the simplest thing to do is to examine the body of the function, using pg_get_functiondef(). So the insert_user verify script looks like this:
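Roughly, such a verify script could be written as below. It assumes the function signature insert_user(text, text) and that the deployed body contains the literal text crypt($2, gen_salt('md5')); change the pattern if your function body differs.

```shell
# Write a verify script that inspects the function body.
mkdir -p verify
cat > verify/insert_user.sql <<'SQL'
-- Verify flipr:insert_user on pg

BEGIN;

-- Raises an error if the function does not exist.
SELECT pg_catalog.has_function_privilege('flipr.insert_user(text, text)', 'execute');

-- Division by zero unless the function body contains the crypt() call.
SELECT 1/COUNT(*)
  FROM pg_catalog.pg_proc
 WHERE proname = 'insert_user'
   AND pg_catalog.pg_get_functiondef(oid)
       LIKE $$%crypt($2, gen_salt('md5'))%$$;

ROLLBACK;
SQL
```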

More to Come

Sqitch is a work in progress. Better integration with version control systems is planned to make managing idempotent reworkings even easier. Stay tuned.

Author

David E. Wheeler <david@justatheory.com>

License

Copyright (c) 2012-2015 iovation Inc.

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
