The whole point in using SQL to interact with a database is for developers to be able to use the same basic language to interact with any database. The intentions were good but, as it turns out, even something as universally desired as having a primary key that automatically increments itself is not standardized, and, in some cases it isn’t even directly supported.

So, here is how you create an auto generated primary key in the current version of each of several popular SQL dialects. I will demonstrate creating a table with an auto indexed primary key, and how to add rows to it in a plain vanilla way (that works from any client). Unfortunately, there is no plain vanilla way to create the auto incremented primary key, so table creation must be done on a per database basis.

The CREATE TABLE Statments that work

Comments on the MySQL Way

It seems to me that adding an AUTO_INCREMENT attribute to a column just like a NOT NULL constraint or a DEFAULT property is a very straightforward way of handling the problem. It is extremely obvious what the AUTO_INCREMENT field should do, even without consulting the documentation.

I give them full points because the MySQL way is easy to read, easy to write, easy to remember, and it handles both kinds of the generic insert statements below.

Comments on the Sqlite Way

Sqlite generally ignores any data types assigned to a column, and assigns variable length string types to almost everything. The INTEGER PRIMARY KEY option is one of the few exceptions.

Its elegant, clean, not quite as obvious, but very well documented (its question #1 in their FAQ page).

I ding them half a point because there is no obvious way to have another column auto increment. I’ve never had a reason to need that, but it isn’t really an option, and the MySql solution could handle that if I ended up needing it some day. Furthermore, someone coming from another SQL dialect might not consider it intuitive that ALL primary keys are naturally auto generating.

CREATE SEQUENCE item_seq;
CREATE TABLE item
( id NUMBER
, title VARCHAR2(50)
);
CREATE TRIGGER item_auto_increment
BEFORE INSERT ON item
gap_between NUMBER;
FOR EACH ROW
BEGIN
IF NVL(:new.id, 0) = 0 THEN
:new.id := item_seq.NEXTVAL;
ELSE
-- make sure the sequence isn't going to overlap the key value just provided
IF :new.id > item_seq.CURRVAL THEN
gap_between := :new.id - item_seq.CURRVAL;
ALTER SEQUENCE item_seq
INCREMENT BY gap_between;
-- read the sequence so it jumps the gap
gap_between := item_seq.NEXTVAL; -- should now be :new.id
-- now set it back to increment 1 at a time
ALTER SEQUENCE item_seq
INCREMENT BY 1;
END IF;
END IF;
END;
/

note: The above trigger is not the standard trigger seen in most oracle books, and online tutorials, it is my own derived version. This is the standard auto-incrementing trigger you might find elsewhere.

--not my style, but works great
CREATE TRIGGER item_auto_increment
BEFORE INSERT ON item
gap_between NUMBER;
FOR EACH ROW WHEN (new.id is null)
BEGIN
select item_seq.nextval into :new.id from dual;
END;

Points: 8 out of 10

Comments on the Oracle Way

First off, let me say gross. In fact, let me say it again gross. You have to create your own trigger to pull from a sequence. Furthermore, you have to explicitly create the sequence (PostgreSQL was nice enough to do that for you and tell you it did it, I’m not sure how mysql does it, but I doubt it uses a trigger because it supported AUTO_INCREMENT before it supported triggers).

I ding 2 points for sheer uglyness because having to do the whole thing yourself is tedious and error prone. On the other hand, once it is there, it works, and you can do all of what is needed as part of a table creation script, and once it is done, developers can use the table just as easily as the other SQL dialects discussed, so I won’t doc them too much for it.

The fact that you have to build the trigger yourself has one advantage: You can make it behave exactly the same way MySQL behaves, and so it contributes to the “vanilla” factor. Specifically, you can make it handle both NULL and 0 values the same way, and you can make sure that when a value is specified, our sequence can be updated to make sure we don’t try to reuse that value. Ugly, YES, but it works.

Why I like my trigger more than the industry standard

Upon careful inspection of my trigger, there are a lot of people who would argue that it is long and kind of pointless. The normal way uses the WHEN clause to only execute when a null is provided, mine doesn’t, so it is less efficient when an ID is provided. And, it has a lot more code, which means the database has to do more thinking each time I insert a record. Seems inefficient.

To them I say Yes, but… I pretty much always use the trigger to add the key, so it isn’t a big deal that it executes my trigger when I don’t. Furthermore, my trigger protects from the possibility that someone will supply an ID on a new record that is not in sequence, and that, months down the road will create a conflict with the key values generated by the sequence. That will cause a series of strange errors that no one can account for, and that I don’t want to be called about. Its an ounce of prevention in a critical location, and its worth the overhead it might cause on the few cases where people aren’t dong things the normal way.

Comments on the Microsoft Way

My only real comment here is that IDENTITY is not as obvious a name as AUTO_INCREMENT, but the use is simple enough, looks nice, works as expected, … I like it.

I ding a point for using a word that sounds more synonymous with PRIMARY KEY then it does with AUTO_INCREMENT the word IDENTITY has little or nothing to do with picking sequentially growing key values for your primary key.

I ding 2 points for not being able to use the second form of INSERT command (the one that uses the default signature for the table). This happens because they protect you from inserting anything into this column. I have nothing against protection, and in this case I would even agree that it is important protection. But, the end result of this protection is that you can’t use the second form of insert statement because you are specifying a column value (NULL) on a read only field.

Furthermore, often times, when you initially populate tables, you create a bunch of records with specific primary key values, during that phase of database creation, you don’t want the protection so you could create the column as IDENTITY_MODIFIABLE, but that eliminates the protection when you are done with the initial records and you do want some protection.

Comments on the Postgres Way

Its clean, straightforward enough, and makes sense. I like it but I don’t really love it, here’s why:

I dinged half a point for picking the keyword SERIAL. It is obviously not as good as AUTO_INCREMENT because in their own documentation they explain it by saying it works like AUTO_INCREMENT in “other” databases. I decided not to ding them too bad for this because SERIAL is a much better choice of words than IDENTITY. At least serial implies that it will be assigned sequential values. The first time I saw it, it made sense to me.

I ding one point for replacing the data type “INTEGER” with the pseudo data type “SERIAL” rather than using SERIAL as a modifier. I do this because it is good to know what kind of field it is so you can make your foreign keys match. The current implementation requires you to know what is going on behind the scenes in order to use it anyway. Furthermore, instead of adding one modifier, they created two pseudo types (the other is BIGSERIAL).

To be honest, I really like the idea of having the data type be SERIAL instead of INTEGER, but they way they implemented it ruins the reason I like the serial type. If SERIAL were its own data type, then Postgres could move it to a larger int size down the road and handle that for you when you upgrade. They could even substitute in a GUID or some other magic ID. BUT, instead, it makes it an integer, and you have to know what the data type is in order to reference it. There is no SERIAL_REFERENCE data type. So you hard code it to be an integer in other tables… so there is no benefit from a somewhat mysterious SERIAL datatype.

I ding two more points because the insert statement using the default table signature fails complaining that you have provided a value for the column, which is not allowed (even though NULL isn’t much of a provided value).

Insert Statement that works for MOST of the above

INSERT INTO item
VALUES (null, 'some item title');

Inserting based on the default signature for the table is a common (albeit a little bit quick and dirty) way to add the initial records to a table in the schema creation SQL script. This sort of insert will work with an auto generated key for some of the databases, but not all of them. I dinged the Microsoft implementation and the PostgreSQL implementation of SQL each two points for not handling it gracefully.

I think that supplying a NULL value for a column is exactly the same thing as not supplying a value for the column, and 3 of the 5 database engines seem to agree with me on that, Microsoft and Postgres need to catch up with the game here.