Tag Archive for Data Modeling

Robert Pearl (b|t) is hosting this month’s T-SQL Tuesday blog party. The party was started by Adam Machanic (b|t) in December of 2009.

This month’s invitation is about making sure your database is healthy. For me, that is not just making sure that patches are updated and backups restore properly. It also means the database schema is healthy.

In the beginning

One of the simplest things that we can do when creating a new table, is to make sure there is a primary key constraint. A primary key constraint is simply a specialized index. It can be a clustered or non-clustered index. The important part is, it is created as a primary key constraint.

Why?

Two strong reasons come to mind on why you want to have primary key constraints on your tables. The first, is the accuracy of your data. The purpose of a primary key is to uniquely identify a record (just like your fingerprints). By using primary key constraints, SQL Server will guarantee that the primary keys are unique.

I’ve had application developers argue with me that the application will guarantee that the primary key is unique, so a primary key constraint is not needed. I’ve also had other developers tell me that using GUIDS or identities will guarantee uniqueness. Even if you use these data types or have checks in your application to verify the value is unique, it is still possible to go around the application to insert data and it is possible for multiple applications to simultaneously provide what they think is a unique value. A primary key constraint can will prevent duplication.

This does bring me to the second reason. Say, you do a have a spectacular way to keep all of your primary keys unique. Without primary key constraints, you can’t implement Transactional Replication. That’s right. Transactional Replication requires primary key constraints.

How can I find my straying tables?

Now that I’ve (hopefully) peeked your curiosity, let’s look at a way to find a list of tables that are missing primary key constraints.

SELECT
t.name AS TableName
FROM
sys.tables AS t
LEFT JOIN sys.indexes AS i ON i.object_id = t.object_id
AND is_primary_key = 1
WHERE
is_primary_key IS NULL
ORDER BY
t.name

Remember it is not a good idea to “just add” new constraints without determining the impact on the current system. Once you have the list of tables that are missing primary key constraints, you can go through the process of determining if you can add them into the environment without side affects. Some side affects can include having to deal with duplicate data, increase database size by the addition of new indexes, and better performance.

Thanks for all the fish

Thanks go out to Robert Pearl for hosting this month’s T-SQL Tuesday blog party.