Idle Musings of a Delphi Junkie

Welcome to the Blog of David Clegg. I am a hard-core Delphi fan, so most of my posts will probably reflect this.

Friday, June 01, 2007

Inserting a record only once

A few weeks back, my boss showed me a technique he uses when inserting a record into a FireBird table, which caters for the scenario that the record may already be there and therefore shouldn't be inserted. It was one of those simple yet elegant solutions, and I've been meaning to share it here. Well, I'm waiting for a data import process to finish, so I thought I'd take the time to finally do this.

The technique revolves around turning the concept of a left join on its head and using it to check for failure (i.e a null field value). Here is an example of what I'm talking about :-

The above snippet will attempt to insert a record into the tax_category table using a set of hard-coded values, but because of the left join and where clause, the recordset being used for the insert will be empty if the join finds an existing record in the table.

I hate this "trick", I'm sure a programmer debugging a problem in 2 years will not understand why this insert into select is used, and replace it with a normal insert ....Here are "real" methods that do the same thing.

For those saying he could use a unique index or other mechanisms, it could be possible he wanted to insert a record and ensure its uniqueness WITHOUT raising an error. Why raise an exception if you can avoid it?

In regards to portability, maybe their product works only with Firebird/Interbase, and that's all they need. Maybe the SQL composed is a string returned from a function that takes a database vendor into account.

In regards to using count(*), that isn't optimal because you don't want the database to count each and every occurrence. You are only interested if there is AT LEAST 1. Better option would be using "not exists". A good server will bail out on the first occurrence of the exists being satisfied. Another option might be to join the table with itself. Would have to test that one. Third option would be to create a static table to take the place of RDB$Database. This static table would always contain just one row and be used for nothing other than for occasions like this.

Preventing an error from raising? Are you kidding me? You heard of exception blocks? To try and do the work that a RDMS does and not use the built-in functionality in regard to constraints is the worse thing someone can do, not to add what one of the other users said about it not being readable in 2 years down the road. This is a very good example of bad programming and hacking.

Well, this has turned out to be quite an interesting blog post. There have been agreements, disagreements and inaccurate assertions (and a little touch of irony, which I'll not bother pointing out for now ;-)).

Let me clarify a few things. We are fully aware of what database constraints are and how they can be applied to ensure data integrity. It goes without saying that armed with this knowledge we do use them when applicable.

As Kyle pointed out, performing an insert which would violate a unique constraint would result in an exception being raised. If we are running a script which could possibly be inserting a record which would violate a unique constraint, this is a condition known to us, so why would we not handle it, rather than allow the exception to be raised?

And SQL statements are not always run in contexts which allow conditional logic to be applied. This is one of the instances where the technique I outlined can be used.

But the best thing about this technique is that you don't have to use it. Period. We see it as another useful tool in our development toolbox, and as such apply it in situations where we deem it to be appropriate. You are 100% free to adopt it or ignore it as you see fit.

In dml script code, there's no other way to do it. If a single company has a branch 99, then the entire statement fails (following the atomicity rules).

If you're programming with more control over the statement (as in you did it all in bite sized chunks with a check for "exists" at each record) thenh you're asking for a lot of round-trips to the server and back into your own code.

Surely this type of code makes it easy to get a job done in a very short amount of time, and when the customers are paying for time, either from your boss, or you in his team, you are shooting yourself in the foot.

I applaud the elegance of the technique when applied to problem-solving, but you (and your boss) need to learn some business sense.

I've followed b.p.n.t and as far as I can make out, Cleggy has been working on an ECO system, but then he went all quiet.

If he was selling Win32 stuff, he wouldn't have any where near enough built within this time to make a sale like that, and if he was doing ECO and .NET and had a sale like that we would have heard about it already about it.