Database Administrators Stack Exchange is a question and answer site for database professionals who wish to improve their database skills and learn from others in the community. It's 100% free, no registration required.

Considering a procurement system as an example. Would it make sense to have one single table for the following: Quote, Order, Invoice and Credit Memo?

There will be some details that are different. For example each list will have its 5 special fields but the other say 50 fields would be exactly the same. Would it make sense in this case to join all tables in a single one. And some fields are unused for some types.

I can see that Navision system does something like that in its database. It uses one table, but then splits into separate tables once the items are posted. I think this may be for performance reasons because the posted items can tend to become very large. & splitting into separate tables would speed up queries. But this is just my thinking.

So my main dilemma is ... what is the recommended approach in such case: 1 table or separate tables? what are the pros and cons of each? What are the business considerations to take into account?

2 Answers
2

There are essentially three options (with the possibility of complexity via combination of the three options):

Keep each table separate

Bring all of the tables together as one major table

Use an associative relationship: a base table with the common columns, and individual tables for the non-common columns

Here are some of the factors to consider:

Do these really represent the same entity? If so, what is that entity? Based on your examples (Quote, Order, Invoice, and Credit Memo), these are separate entities to me. They are 'related' but are not the same or subtypes of a master entity (such as having an Employee table and ExemptEmployee and NonExemptEmployee subtables which have fundamentally different attributes). This tilts in favor of keeping tables separate.

Do you need to query against Quote UNION Order UNION Invoice? Note that this is different from joining Quotes to Orders (to get, say, a percentage of successful sales). If you do need to do those types of queries often, that argues in favor of a single table. I could see doing this if you're looking at "the life of an order." On the other hand, that's still a simple query even when split out into several tables, so it's a minor consideration.

Are there columns which are not common to all of the types? You mentioned that there are several special fields per type. This means you probably do not want to go with option #2 in any case, because option #3 re-introduces relatively simple integrity constraints. Suppose that, in your big table, you have three columns which are required if you are doing an Order, but should be NULL otherwise. Doing this is trivial with associative tables, but requires check constraints with the One Big Table approach. And if column X for an Order needs to be numeric but for an Invoice needs to be text, that just makes validation way more difficult.

Do you ever change types? Not just copying Quote data into an Order, but actually changing the Quote itself to an Order. If so, that's an argument in favor of one big table. But I don't think that's a realistic business process.

How many orders do you normally get? The more orders you get, the more you want to split out those tables. A scan against X pages is still better than a scan against 4X pages, and separate tables increases the likelihood that you'll be able to do seeks instead of scans because of finer-grained control of indexes.

Based on these considerations, I would personally keep the tables separate. Even though they may have similar attributes, they are fundamentally different entities.

What I would try to do, however, is look at those sets of common attributes and see if they really belong on the Quote/Order/Invoice/Credit Memo, or if normalizing those out into their own tables would be a better idea. For example, if you have customer first name, customer last name, customer street 1, etc., splitting that out into a customer key (or customer and address keys, depending upon circumstances) could cut down on the total number of columns in each of the four major entities, reducing repetition and gaining the other benefits of normalization.

It's possible that you've already normalized everything out to the fullest extent and that each of those entities does, in fact, legitimately have 50 attributes which are in common. In that case, I'd still keep separate tables because you're talking about fundamentally different things which happen to relate to one another.

Thanks for the very detailed answer
–
Joseph CaruanaNov 20 '12 at 12:48

1

I have faced this question several times among several projects. The "even have same fields, are different entities" proves right almost all the time. I'd suggest keep separate tables, for a good design you need to look at the business logic, not to worry about repeating data, sometimes repeating data is right and needed, lets say, the product is deleted but an invoice refering to it, must keep a copy of it, then it makes sense copying the product data into the invoice when the invoice is created.
–
NestorApr 24 '13 at 3:32

Size of the tables increases the need for normalization, and complexity of the queries vice versa.

Or to make it simple: if you have a rather constant need for the union of these tables, think of one big table. If the amount of rows is large, normalize.
Or if you case is somewhere in between, go with the proper (from the DB integrity point of view) solution - separate.