The gospel of the guid — and why it matters

The Gospel of the GUID

There is only one commandment in the Gospel of the GUID:

— — — I. Thou shalt always use a GUID to uniquely identify a row in a table.— — —

Whenever I hire a new member for my team, this is one of the first rules that I teach them. At first, they almost always look at me like a puppy with their ears up and their head tilted to the side as if to say "huh?"

So I show them that every single table has field that is a unique identifier, and most of the time, that field is the primary key and usually the clustered index at that. For example, if we have an Employee table, it has a field named EmployeeGUID that is a unique identifier, and it is the primary clustered key. It's at this point that I usually start get the "uh oh, my boss is an idiot but I don't want him to know I think that look."

Next the questions and the "but I do x" comments start to come:

1. I just use an int.

2. I don't do that because they are so big.

3. Don't you know how hard it is to go look up something by that, an int is so much easier.

4. Surely you don't do that for many-to-many bridge table?

5. What happens if they run out of GUIDs?

6. I don't think I trust that they won't be duplicated.

7. I have never heard of this, therefore it is probably a bad idea.

8. Don't you think there is a performance penalty for doing that?

So then begins my quest to make them a convert. I have a high conversion rate among "thinking" people, and therefore I challenge you to read this and see if you too will become a believer!

— — — Reason 1: I can make less trips to the database, now THAT is a performance enhancement!— — —

There are many reasons to use a GUID as the primary key for your table. My primary reason is directly related to how I build my object models. I prefer to create a "new" instance of my object without round tripping the database. So if I am going to create a new Order object, I don't have to round trip the database to get the OrderID (OrderGUID in my world) like I would if I was using an int as the "ID" of the order. While not all that impressive at this level, consider this. I can create a new Order object with an OrderGUID, and ten OrderLineItem objects with their own OrderLineItemGUID without EVER round tripping the database unlike the INT solution which needs to make 11 calls to the DB.

My second reason for always using a GUID is data merging. I can't count how many times that I have had to merge data. Back before I saw the light, I too would use INTs or some other thing to make my row unique but when I had to merge data, I always had to write a conversion.

If customer 1 acquires customer 2 and I need to merge their data, I have to convert someone's OrderID 1 and CustomerID 1 to some INT that is not used, and update a lot of rows of data, and possibly monkey around with seed values. Multiply that out by dozens of tables, millions of rows of data, and I have a REALLY big task ahead of me, that will require a lot of testing after I write the SQL and/or code.