We had two tables, A and B, with B having a foreign key linked to the primary key of A.

On our application, we got two main situations that created the problem :

One where we had to add add some records in A, followed by the creation of some records in B referring to the ones in A. This was just fine, since we created our BLL to handle first the changes in A and them B. The "problem" is, sometimes we will have to delete the records of B and them delete the referred record in A, wich raises an error, since we are trying to delete A before deleting B, what could break the referential integrity of these tables. Of course everything is inside a transaction, and we could deal with that in our BLL code, simply changing the order of wich changes cames first (in A or in B) based on the actions that have to be made, but my question is more conceptual than a "please help me solve my problem" one, since the "problem" is already solved (but of course, I'm accepting suggestions if they came up! =D).

Chatting in the office, we agreed that, inside a transaction, everything is being donne and if some problem occurs, what was done is rolled back.

My idea: Why can't the transaction evaluate everything and them order the statements and execute that in a way that don't break the referential integrity, since after the transaction everything will be done and the order might not matter at all? After thinking a little about it, even I am not sure if this thing would be nice or create some king of living hell, since I can think of some situations where the order of the statements would matter, so I'm posting it here so we can discuss about it.

I hope that my text is not confusing. And just to mention, I'm using Microsoft SQL Server. Dunno if this is possible in other databases.

"Some lines in A", "Some lines in "B"? How about a schema explanation so we have some idea what you are talking about?
–
psrMar 16 '12 at 0:36

I think that the schema isn't really important here, since the question is just something conceptual and could be thought of in just any schema with a foreign key on B pointing to a primary key in A.
–
Liordino NetoMar 16 '12 at 1:14

And sorry about the use of the word "lines", what I wanted to say was "records". I'm editing the post to make it more clear. :)
–
Liordino NetoMar 16 '12 at 1:19

SQL Server lets you do a delete with a where clause, if your BLL knows about them in a way that facilitates that, or you could add the logic to do a topological sort of the dependencies. This could go in the BLL, a data layer of your application, or a SPROC
–
psrMar 16 '12 at 1:34

4 Answers
4

Expanding on BillThor's answer, in Oracle at least, you can create deferrable constraints which are validated at commit time and then freely switch between having the constraint validated immediately and having it validated at the time you commit.

For example, I can create a parent table, a child table with a deferrable constraint, insert a row in the child that references a parent row that doesn't exist, and I don't get an error until I commit.

Your second code, the one where you insert the child before the parent is almost what I was looking for. The difference is that I want to do it with deletes instead of inserts. So, it actually do the inserts inside the transaction, but just enforces the constraints at the end, right?
–
Liordino NetoMar 16 '12 at 14:00

@LiordinoNeto - The same thing works with all DML operations. If the constraints are deferrable and set to deferred, you can delete the parent and then delete the child, insert the child and then insert the parent, or do anything else that temporarily violates the constraint so long as the constraint is not violated when your transaction ends.
–
Justin CaveMar 16 '12 at 14:50

thanks! This is exactly what I was looking for! In fact it is better than what I thought of, since I what I thought was to make the statements be ordered according to the design of the tables, but making the constraints be checked just on the commit is way better.
–
Liordino NetoMar 16 '12 at 14:57

It simply doesn't work that way... When a transaction starts it actually does the deletes. On rollback it undoes the deletes by reading the Log. The entire mechanics of this would have to change to do what you sre asking. It would have to simulate the entire transaction then check for integrity, then perform the transaction (holding locks on all the effected the records the entire time)

Yes, I know that it does not work that way. Its more of a conceptual question asking why should a database not work like that. I'm aware of the cascading delete to.
–
Liordino NetoMar 16 '12 at 1:09

Because of the performance\locking issues caused by what I described above. The DB would basically need to perform the entire transaction twice, While locking a bunch of records.
–
MoronsMar 16 '12 at 1:13

Yeah, I didn't thought about that, and I think you're absolutely right. That could be a big problem, and the larger the database, the larger the problem. If it have a lot of users the locking problems could be tremendous to.
–
Liordino NetoMar 16 '12 at 1:26

Some databases work like that, others don't. Some databases also do, or can be told, to only check constraints when trying to commit transaction, so you can do the operations in any order within transaction and commit fails if you missed something.
–
Jan HudecMar 16 '12 at 11:04

The database can not reorder the step of the transaction as it receives them in order. It doesn't know you intend to delete from B when you requested deletions from A. However there two approaches to handling referential integrity within a transaction. On Delete Cascade operation might solve your problem, but I prefer to limit their use.

There are two approaches to validation:

Immediate validation: Test the integrity as each action is performed. This requires records be added in the proper order. This seems to be the default setting for most databases.

Deferred validation: Test the integrity of the data when the data is committed. This allows record to be added in any order. On large transactions this can slow down commits. If you are adding datasets with circular references, deferred validation is required.

Some (most) databases may not support both validation approaches. On some databases it may be possible to switch validation approaches on a per transaction basis.

I'm aware of the delete cascade option. I think it handle something like that, since it solves everything, but I'm not a fan of this one two, since it take a part that could very well be made on the apps code, instead of separating it between the app and the database. Is it the same that you think for limiting its use or you have other thoughts about it? Reading the descriptions of the validations that you pointed, I think that the immediate one is the one thats used on ms sql server (wich I'm using).
–
Liordino NetoMar 16 '12 at 1:16

And about the deferred validation: what you're saying is that, putting everything on a transaction, it's possible to set this kind of validation on the database (of course, if supported), and if I delete a record on table A, wich is referenced on table B, and them delete the records on B that reference A after that and it will work it all out so it don't break anything? Or it works just for the inserts?
–
Liordino NetoMar 16 '12 at 1:23

I have only used it on updates, as the application only did logical deletes (status cancelled, deleted, etc.). I've used deferred validation as an option on constraints. As I rarely need it, I haven't checked for the option at the database level.
–
BillThorMar 16 '12 at 1:38

Imagine a user at a website processing a purchase and instead of going through your step by step process of filling the shopping cart and processing the payment and shipping, they want to skip the credit card address information because you can just use the shipping address once they fill that in at the end. You demand they fill out the billing address first, but they just typed out that lengthy card number and need a break. Can't they just skip this address and enter an address later on? You can go back and get that address. Bill to the shipping or ship to the billing; does it really matter? I entered one address, obviously it applies to both.

Programmers can get frustrated working with databases because they do some things outside our control. They accomodate our needs for the most part, but there is always a situation where it doesn't obey. Your suggestion is instead of using the settings and capabilities to do exactly what you want in the database, you want to violate a simple rule, but have the database sort out your whole transaction just to see if there is a way it might work. Once all the options are exhausted, then you get an error.

If a transaction isn't going to work, wouldn't you rather know sooner than later? You can control what you send to the database, why not take advantage of that and program it accordinly?

Referential Integrity is enough of a performance issue, it doesn't need to act more intelligently to address a rule violation that could get a pass if it is handled within a transaction. Either setup the database to enforce the rules you want or don't have any rules and let your application handle it.

Many databases can optimize and establish a plan on the first request in hopes of performing better on subsequent requests. It's great when it works. What you are proposing could get complicated with more tables (i.e. a many-to-many situation) or nested transactions. Some simple things could get preplanned, but nothing too complex. Juat be thankful it processes the transactions in the order you sent them; it's saving you a debugging nightmare.

Yeah, the way that I first thought could create a debugging nightmare, but what if it processes everything in the order you sent, but wait to test all the constraints in the end of a commit instead of testing everything as it goes? Dunno if I understood right what @JustinCave wrote, but I think that's what occurs on Oracle if you set the constraints to be deferred.
–
Liordino NetoMar 16 '12 at 14:07

Despite that, one problem that we came up here was about the use of deferred constraints at the same time as triggers, since you can design tables and triggers to expect that a child has its parent in the moment that its created, and if it has not, an error at the moment of the commit would occur.
–
Liordino NetoMar 16 '12 at 14:20