Ideas on Databases, Logic, and Language by Jeff Davis

Main menu

Post navigation

Exclusion Constraints are generalized SQL UNIQUE

Say you are writing an online reservation system. The first requirement you’ll encounter is that no two reservations may overlap (i.e. no schedule conflicts). But how do you prevent that?

It’s worth thinking about your solution carefully. My claim is that no existing SQL DBMS has a good solution to this problem before PostgreSQL 9.0, which has just been released. This new release includes a feature called Exclusion Constraints (authored by me), which offers a good solution to a class of problems that includes the “schedule conflict” problem.

I previously wrote a two part series (Part 1 and Part 2) on this topic. Chances are that you’ve run into a problem similar to this at one time or another, and these articles will show you the various solutions that people usually employ in the real world, and the serious problems and limitations of those approaches.

The rest of this article will be a brief introduction to Exclusion Constraints to get you started using a much better approach.

First, install PostgreSQL 9.0 (the installation instructions are outside the scope of this article), and launch psql.

Before installing these modules, make sure that PostgreSQL 9.0 is installed and that the 9.0 pg_config is in your PATH environment variable. Also, $SHAREDIR meas the directory listed when you run pg_config --sharedir.

That is identical to a UNIQUE constraint on a.i, except that it uses the Exclusion Constraints mechanism; it even uses a normal BTree to enforce it. The performance will be slightly worse because of some micro-optimizations for UNIQUE constraint, but only slightly, and the performance characteristics should be the same (it’s just as scalable). Most importantly, it behaves the same under high concurrency as a UNIQUE constraint, so you don’t have to worry about excessive locking. If one person inserts 5, that will prevent other transactions from inserting 5 concurrently, but will not interfere with a transaction inserting 6.

Let’s take apart the syntax a little. The normal BTree is the default, so that’s omitted. The (i WITH =) is the interesting part, of course. It means that one tuple TUP1 conflicts with another tuple TUP2 if TUP1.i = TUP2.i. No two tuples may exist in the table if they conflict. In other words, there are no two tuples TUP1 and TUP2 in the table, such that TUP1.i = TUP2.i. That’s the very definition of UNIQUE, so that shows the equivalence. NULLs are always permitted, just like with UNIQUE constraints.

The conditions for a conflicting tuple are ANDed together, just like UNIQUE. So now, in order for two tuples to conflict, TUP1.i = TUP2.i AND TUP1.j = TUP2.j. This is strictly a more permissive constraint, because conflicts require both conditions to be met. Therefore, this is identical to a UNIQUE constraint on (a.i, a.j).

What can we do that UNIQUE can’t? Well, for starters we can use something other than a normal BTree, such as Hash or GiST (for the moment, GIN is not supported, but that’s only because GIN doesn’t support the full index AM API; amgettuple in particular):

So now we can do UNIQUE constraints using hash or gist. But that’s not a real benefit, because a normal btree is probably the most efficient way to support that, anyway (Hash may be in the future, but for the moment it doesn’t use WAL, which is a major disadvantage).

The difference really comes from the ability to change the operator to something other than “=“. It can be any operator that is:

Commutative

Boolean

Searchable by the given index access method (e.g. btree, hash, gist).

For BTree and Hash, the only operator that meets those criteria is “=”. But many data types (including PERIOD, CIRCLE, BOX, etc.) support lots of interesting operators that are searchable using GiST. For instance, “overlaps” (&&).

Ok, now we are getting somewhere. It’s impossible to specify the constraint that no two tuples contain values that overlap with eachother using a UNIQUE constraint; but it is possible to specify such a constraint with an Exclusion Constraint! Let’s try it out.

Now, try out various combinations (including COMMITs and ABORTs), and try with concurrent sessions also trying to insert values. You’ll notice that potential conflicts cause transactions to wait on eachother (like with UNIQUE) but non-conflicting transactions proceed unhindered. A lot better than LOCK TABLE, to say the least.

To be useful in a real situation, let’s make sure that the semantics extend nicely to a more complete problem. In reality, you generally have several exclusive resources in play, such as people, rooms, and time. But out of those, “overlaps” really only makes sense for time (in most situations). So we need to mix these concepts a little.

CREATE TABLE reservation(room TEXT, professor TEXT, during PERIOD);
-- enforce the constraint that the room is not double-booked
ALTER TABLE reservation
ADD EXCLUDE USING gist
(room WITH =, during WITH &&);
-- enforce the constraint that the professor is not double-booked
ALTER TABLE reservation
ADD EXCLUDE USING gist
(professor WITH =, during WITH &&);

Notice that we actually need to enforce two constraints, which is expected because there are two time-exclusive resources: professors and rooms. Multiple constraints on a table are ORed together, in the sense that an ERROR occurs if any constraint is violated. For the academic readers out there, this means that exclusion constraint conflicts are specified in disjunctive normal form (consistent with UNIQUE constraints).

The semantics of Exclusion Constraints extend in a clean way to support this mix of atomic resources (rooms, people) and resource ranges (time). Try it out, again with a mix of concurrency, commits, aborts, conflicting and non-conflicting reservations.

Exclusion constraints allow solving this class of problems quickly (in a couple lines of SQL) in a way that’s safe, robust, generally useful across many applications in many situations, and with higher performance and better scalability than other solutions.

Additionally, Exclusion Constraints support all of the advanced features you’d expect from a system like Postgres9: deferrability, applying the constraint to only a subset of the table (allows a WHERE clause), or using functions/expressions in place of column references.

Peter was of a similar opinion. What really killed that idea, for me at least, was that exclusion constraints can express constraints that are the *opposite* of UNIQUE: Consider the <> operator, which satisfies the necessary criteria, and will be the subject of a future post.

Also, we discussed the matter at length. I think that using the word UNIQUE would have led to superficial understanding, at best. When I get Range Types in, then we can add some syntax sugar for a “range key” that is more intuitive for the common case (temporal keys).

Let’s say you have a relationship table between two entities. Usually, you could say that one of the entities is “superior” to the other, for example, a parent to a child, a group to its members, but even if that is semantically expressed in the design, the UNIQUE constraints cannot control the insertion of a duplicate reverse relationship. For example,

To answer your question directly, if you’d like to prevent a bidirectional relationship, you can use a UNIQUE constraint on a functional index, where the index function puts the elements in a canonical form. For instance, if you have a function
f(x,y):
if x > y then y::text || ‘,’ || x::text
else x::text || ‘,’ || y::text end

Put a functional UNIQUE index on that, and it will prevent the bidirectional relationships.

However, that won’t prevent cycles (a->b->c->a). To prevent cycles, you might need to use some other trick, like saying that a parent ID is always less than a child ID.

You might want the constraint that a child only has one parent, as well, to make it more tree-like.

Also, the two UNIQUE constraints you have in your example are redundant. One implies the other.

I’ve recently started looking at PostgreSQL closely and I am very excited by your work on Temporal support in the Database.
In MS SQL Server I created constraints that called User Defined Functions – to check for Overlap/ Abut/ Contains for Period Ranges. These seemed to work OK in my Dev environment, but I’m not sure how practical would be in production.
Your Exclusion Constraints look perfect for preventing Overlap in a fast efficient manner – What do you suggest for Foreign Key Includes? Is it practical to use a Function called by a Constraint in PostgreSQL, to enforce this (I would prefer this to Triggers) ? Can I use the Geometric Contains Operator (after converting Periods to Integers via a Calendar table) ?
I got excited when I saw the Interval Data Type (not in SQL server) – but then was disappointed when I saw that it lacks a Contains Operator ???
Your Period Type appears to resolve this, but I have found it impossible to install on Windows – any help or clues for installing on Windows for version 9.0 would be very much appreciated, do I need to re-install PostgreSQL from Source Code – Instead of from enterprisedb Installer. If I purchase PostgrePlus would this help ? – does not appear to be version 9.0 though.
If you could please send me some clues or pointers – I will be eternally grateful
Gary Clarke CEO M4 Systems Ltd