Be Aware of These Loopholes in Your Referential Integrity

I am going to discuss several potential problems with referential
integrity, because they keep coming up in the newsgroups.

Parent and Child Tables
Are in Different Databases.

Although you cannot use a foreign key in this situation,
there are workarounds – you can use either triggers or UDFs wrapped in check
constraints. Either way, your data integrity is not completely watertight: if
the database with your parent table crashes and you restore it from a backup,
you may easily end up with orphans.

Comment Notification

Comments

I agree that it is best to put all related tables in the same database. That assumes however, that it is always possible to move all related tables into one database. Many organizations (for multiple reasons) have multiple databases. Of course it is possible to copy an entire table and maintain the data in multiple locations, but in my experience this causes much bigger problems than the downsides of triggers that you mentioned above.

You can further minimize the risks of triggers/udfs by maintaining triggers/udfs on both databases. One to check for parental constraint, the other for checking child constraints. This is still not ideal, but it is useful when maintaining cross-database RI is required.

We can grant permissions to schemas or have even finer level of granularity. If users do not have permissions for a schema or an object, they just do not see it. There is no need to have separate databases to enforce security. Similarly, in Enterprise Edition there are built in ways to partition and scale up without splitting a database.

On the other hand, if you ever have to restore two or more databases with parent-child relationships across database boundaries, you will have to take care of orphan rows. You will not want to do it again once you have experienced it for yourself.

Agree with Alex. Basically the DB designer should not decide which entity goes into same DB. The problem itself tells what entity should or shouldn't be together in the same DB. In other words it's not upto DB designer but rather upto to the purpose of the design. Of course DB designer uses his/her knowledge to identify and to define entities but this should be only if he/she is sure they belong together. Now the question is how we use data that are in different DBs, refering to Maurice's case. The answer is you want to pouplate the same dimensions separately for each DBs. Again, two DBs needs same dimension, then populate the dimensions as if they are representting different entity even though in real world they don't. I referring to dimensions not the facts because facts must not at all be shared between DBs, it's terrible design otherwise.

August 12, 2011 7:10 PM

Leave a Comment

About Alexander Kuznetsov

Alex Kuznetsov has been working with object oriented languages, mostly C# and C++, as well as with databases for more than a decade. He has worked with Sybase, SQL Server, Oracle and DB2. He regularly blogs on sqlblog.com, mostly about database unit testing, defensive programming, and query optimization.
Alex has written a book entitled "Defensive Database Programming with Transact-SQL" and several articles on simple-talk.com and devx.com. Currently he works as an agile developer.