Database Administrators Stack Exchange is a question and answer site for database professionals.

Is it good practice (or would it have any adverse effects) to use a set of 4 columns to identify a row as being unique (one being a foreign key, the other three being float data types)? I'm attempting to build a table in which those 4 keys, combined, describe a unique entry. I'm curious whether this is a good plan of attack or if there is a better way.

For visual purposes, picture the following schema for our inventory items ([K] is symbolic of the primary key; the lines are relationships):

As it stands (and as I've shown in my "schema" above), I use a simple (auto-increment) integer primary key for entries in the Sheet_Size table. However, I'd like to know whether it would be better to use a combination of the *Sheet_Type*, Length, Width, & Thickness columns instead. Given that each entry in Sheet_Size should be unique across all of these columns, and that an auto-incrementing field doesn't express that on its own, is this the best route to take?
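Since the schema image isn't shown here, a sketch of the two options being weighed may help. All names and types beyond those mentioned in the question are assumptions, and the syntax is generic/SQL Server-flavored:

```sql
-- Option 1 (current approach): surrogate auto-increment key.
CREATE TABLE Sheet_Size (
    Sheet_Size_ID INT IDENTITY(1,1) PRIMARY KEY,
    Sheet_Type_ID INT NOT NULL REFERENCES Sheet_Type (Sheet_Type_ID),
    Length        FLOAT NOT NULL,
    Width         FLOAT NOT NULL,
    Thickness     FLOAT NOT NULL
);

-- Option 2 (the question): the four columns together as a composite key.
-- Table name is hypothetical, used here only to show both variants.
CREATE TABLE Sheet_Size_Composite (
    Sheet_Type_ID INT NOT NULL REFERENCES Sheet_Type (Sheet_Type_ID),
    Length        FLOAT NOT NULL,
    Width         FLOAT NOT NULL,
    Thickness     FLOAT NOT NULL,
    PRIMARY KEY (Sheet_Type_ID, Length, Width, Thickness)
);
```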

If I'm not explaining the situation well enough, please let me know. I'm finding myself needing to break out these portions (Class vs. Type vs. Actual stock sizes) of an inventoried material for other logic purposes, but I'm up for any other kind of feedback.

Any guidance would be appreciated.

Update (08-12-2011)

After the answers posted, I've decided to do a combination of Mark's answer & X-Zero's answer. I decided it's a good idea to place a unique constraint on the length, width and thickness columns, but I also like the idea of breaking out material sizes into unique rows and linking them with a relationship.

Unfortunately I cannot accept both answers, so I am going to accept X-Zero's for taking (what I feel to be) a more critical look at the problem and offering a schema adjustment.

There should be some sort of unique key (and index) enforced on the dimensions in Sheet_size. Consider: are two sheets of dimensions (48, 96, .5) and (96, 48, .5) equal (that is, does the orientation of the dimensions matter)? This sort of rule may be difficult to enforce if the columns are used as part of the primary key, but it becomes more manageable when using constraints and stored procedures.
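One way to handle the orientation question is to normalize it at the constraint level. A sketch (names assumed): require Length >= Width so that (48, 96, .5) and (96, 48, .5) can't both exist, then enforce uniqueness over the normalized dimensions:

```sql
-- Normalize orientation: callers (or a stored procedure) store the larger
-- dimension as Length, so each physical sheet has exactly one representation.
ALTER TABLE Sheet_Size
    ADD CONSTRAINT CK_Sheet_Size_Orientation CHECK (Length >= Width);

-- With orientation normalized, a unique constraint on the dimensions
-- prevents duplicate size rows.
ALTER TABLE Sheet_Size
    ADD CONSTRAINT UQ_Sheet_Size_Dimensions UNIQUE (Length, Width, Thickness);
```

If orientation *does* matter for your stock, drop the CHECK constraint and keep only the unique constraint.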

The primary key (and index) of Sheet_size_type should use both of the foreign keys, starting with the one with the lower cardinality (probably sheet_type, given your example). You may want an additional index in the other direction, but it might not be necessary.
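A sketch of that junction table (names assumed, syntax SQL Server-flavored), with the composite primary key leading on the lower-cardinality column and an optional index in the other direction:

```sql
CREATE TABLE Sheet_Size_Type (
    Sheet_Type_ID INT NOT NULL REFERENCES Sheet_Type (Sheet_Type_ID),
    Sheet_Size_ID INT NOT NULL REFERENCES Sheet_Size (Sheet_Size_ID),
    -- Lower-cardinality column first, per the recommendation above.
    PRIMARY KEY (Sheet_Type_ID, Sheet_Size_ID)
);

-- Optional index "in the other direction", for queries that start from size.
CREATE INDEX IX_Sheet_Size_Type_BySize
    ON Sheet_Size_Type (Sheet_Size_ID, Sheet_Type_ID);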

This revision will save you database space (as a ratio of the number of sheet types using the same size), and shouldn't impact overhead too much.

There are other potential concerns about equality/uniqueness if you are using a float datatype, as the imprecision may trip you up unexpectedly. You should consider whether or not a fixed-point type, to some given precision, would be more appropriate.
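As an illustration of the fixed-point suggestion, the dimension columns could be declared with explicit precision and scale instead of float. The precisions here are assumptions for the sake of example:

```sql
-- DECIMAL(p, s) stores exact values, so equality comparisons (and the
-- unique constraint) behave predictably, unlike float.
CREATE TABLE Sheet_Size_Exact (  -- hypothetical name, for illustration
    Sheet_Size_ID INT IDENTITY(1,1) PRIMARY KEY,
    Length    DECIMAL(8,2) NOT NULL,  -- e.g. 96.00
    Width     DECIMAL(8,2) NOT NULL,  -- e.g. 48.00
    Thickness DECIMAL(6,3) NOT NULL   -- e.g. 0.125
);
```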

I was planning to limit the Length and Width to a single (possibly two) decimal places, and Thickness would (at most) extend out to three. Beyond that, we're getting more precise than necessary (and the stock itself never comes in at the quoted numbers anyway). That aside, I like the thought of breaking out the sheet sizes, but the problem I'm facing is the other columns involved (that I've excluded). (need more room, see next post)
– Brad Christie Aug 11 '11 at 15:07

Because this is a list of inventoried stock, I have to include other information such as Density and Cost/lb (which is heavily based on the type, and even the thickness). For example, "Steel"/"1018" could be $0.55/lb at 0.018-0.125" thick, but becomes $0.65/lb once the thickness goes beyond 0.125". (This can also differ between a 48"x96"x0.250" sheet of 1018 versus 5052-H32.) In your example, I'd only have one entry for a 48"x96"x0.125" sheet (though I suppose the relationship table could hold these additional metrics).
– Brad Christie Aug 11 '11 at 15:10

If you only need a small number of decimal places, then yes, use a fixed-precision type. And yes, that's where (in this case) you'd put information like that (cost is a dependency of the sheet type and size, for example), although you may want to generate additional tables that can be referenced. You may also want to consider making custom data types (like density) so people don't try to query your data in unexpected ways.
– Clockwork-Muse Aug 11 '11 at 15:46

Personally, if I have a narrow, single-attribute candidate key, I'm tempted to make use of it. With wide and/or composite keys, I add a surrogate to the model by default. In your case, I'd vote for the identity column on Sheet_Size as the primary clustered key and a unique constraint on type/length/width/thickness.
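A sketch of that suggestion (names assumed, SQL Server-flavored syntax): keep the identity column as the primary key and guard the natural key with a single multi-column unique constraint:

```sql
-- One constraint over all four columns together; the identity column
-- remains the primary (clustered) key for foreign-key references.
ALTER TABLE Sheet_Size
    ADD CONSTRAINT UQ_Sheet_Size_Natural_Key
    UNIQUE (Sheet_Type_ID, Length, Width, Thickness);
```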

But given that you now have an "arbitrary" key for the row, how does UNIQUE enforce that the columns (when combined) cannot have duplicate values? It's my understanding that the unique attribute relates to the key. Are you saying Sheet_Size INT PRIMARY KEY and Length UNIQUE, Width UNIQUE, Thickness UNIQUE? I still don't understand how that prevents duplicates in the table (without applying logic at the insertion interface). (Maybe I'm missing something?)
– Brad Christie Aug 11 '11 at 15:01

Thank you for the feedback. I do agree a unique constraint on the columns would be a great solution, but I also like X-Zero's recommendation of breaking the sizes out into a new table (linked with a relationship table). So to combine ideas, I will apply the unique constraint to the "stripped-down" size table, while removing the density & cost/lb information and placing it in the relationship table.
– Brad Christie Aug 12 '11 at 18:32

The composite key makes perfect sense. Implementing that key ensures that the business attributes can't be duplicated. That's a good thing because recording the same data multiple times would cause ambiguity, undesirable dependencies and make user errors and incorrect data more likely.

The auto-incrementing key alone won't protect the integrity of your business data. If the auto-incrementing key serves no particular purpose (e.g. as the target of a foreign key reference in another table) then it can safely be dropped.

... Except dropping the auto-increment would require using all of the dimension columns as part of the foreign key (that is, all four columns, when including type). Not something I want as my foreign key, period - single columns only, please. I agree that it's a good idea to put a unique key (and/or check constraint) on the dimensions (and type, depending on table design).
– Clockwork-Muse Aug 11 '11 at 16:10

@X-Zero, I did make the point about foreign key references in my second paragraph. The question as I read it is whether to implement the composite key, not whether also to have an auto-increment.
– sqlvogel Aug 11 '11 at 16:18