If this is your first visit, be sure to
check out the FAQ by clicking the
link above. You may have to register
before you can post: click the register link above to proceed. To start viewing messages,
select the forum that you want to visit from the selection below.

multiple M:M subtables or one big table?

I am currently designing a database for a small business to track documents. Their scheme of issuing documents for every job is so complex that I ended up having a number of many-to-many "intermediary subtables" with consists of a mapping of the primary keys of two tables:
e.g.

Code:

receipt_invoice
-----------------
receipt_noinvoice_no

In fact, i have 15 of these tables. As I said, the design was a complex one to accommodate future modifications in the way the physical documents are handled. Therefore, some of these tables may hold only 1 row for the next year and some may not even contain a row up until the time the owner decides to re-design the database.

Owing to the complexity of the database design and to the number of tables to be tracked, I was thinking of doing an optimization:

Code:

table_mapping
---------------
initial_idsecondary_id
mapping_type

where mapping_type shall indentify which kind of primary keys initial_id and secondary_id are. For example, if initial_id is an invoice number and secondary_id is a receipt number, mapping_type would be INV-RCPT. Unique codes shall be used for the other mapping types.

I am seriously thinking of adopting this scheme. Comments will be highly appreciated!

Why? Simply so that you hve less tables? But you will have thrown away referential integrity (foreign keys), and your queries will be no simpler (in fact, somewhat more complex). So what is the real benefit of doing this?

yes i agree. I also find that the use of outer joins will be a more of an occurance if im querying an all-encompassing single table - again due to filtering, similar to your previous point - although this is specific to my own experiences.

thanks andrew! well, yes, aside from the trivial benefit of having less tables, I was thinking of the occurence of some tables being wasted if not used. As I said, some of the tables may hold only 1 row for the next year and some may not even have the benefit of having any row at all.

Another thing is, the application will be heavily using a function to check if a particular M:M relationship exists. The scenario is actually like this: an order may have an invoice, and an invoice may have a receipt. Also, an order may be issued a receipt directly. The order is paid if a receipt for it exists. The M:M tables would be:

order_invoice
invoice_receipt
order_receipt

To see if the the order has been paid already, I have to see if an M:M for the particular order exists in the invoice_receipt or in the order_receipt table. If I have used the one-big-table approach, I would only checked one table only; whereas in this one, I first have to check invoice_receipt and if it does not exist, check order_receipt.

This scenario repeats at least 5 times on different combinations for the project.

And also on this note, I would like to ask: in balancing good database design (avoid redundancy, avoid null, etc) versus speed in accessing data (redundant data exists but is taken care of by the application's front-end), where and how do we draw the line?

It seems that you are being driven towards a super-generic, everthing M:M everything else model because the requirements have not been clearly analysed. It is difficult to comment with so little background knowledge, but for example if the existence of one receipt associated with an order means that it has been paid then what would be the point of creaing a second receipt for that order? In other words, it seems that each order may be associated with just one receipt.

It still isn't clear to me how having one table simplifies the problem. To see if an order 123 has a receipt the SQL will be something like this with your approach:

Actually, there can exist more than one receipt for every invoice or order as partial payment is allowed. But you hit the point: OTLT. I guess I'm just not comfortable with a lot of database tables for a very simple project. But thanks for the help and will now start reading your blog

database design

to avoid later problems on the use of so many tables, you must fully understand the overall need of the company. This can be avoided with proper gathering of data before design of various entities needed with proper normalization needed with thorough knowledge on how they relate with each other. (entity relationship)

Later problems on design can be avoided if the real problem is fully understood and customer needs is fully met.

If you want to make a new design, please take time to investigate all other possible needs of your clients.