Normalizing an Existing Database

March 6, 2001

Introduction

Last week a Swynk
author posted an article on database normalization. To make the database analysis for this project more
interesting David Faour
has been asked to comment on the normalization of the database. Without getting into a threaded discussion
it should provide some additional insight. This should allow you to get an opinion on the
strengths and weaknesses of the design from an outside perspective. David's
recent article
on database normalization is a good introduction to this article.

The Normalization Process

This is the second article in the series on the process of redesigning a program.
As mentioned in the previous article (Modifying an Existing Program (Overview)) the
starting point in normalizing the current
database was to analyze the existing program and database. Because the program is old and has
been updated many times
there is no programmer available to discuss design issues. The new database design will be based
on user input, the existing program, and the existing database. The
original user interface
was built using visual FoxPro and consisted of 17 tabs. The tabs allowed the information to be organized based
on common characteristics but gave a cluttered appearence to the user interface.
In addition, in many cases information categories had
to use more than one tab to fit all the relevant information. The interface updated a
FoxPro database. The database is not first normal with
the many repeating address fields in the multiple tables. In addition the
tables do not contain data relavant to
one area. In some cases the tables contain data from many categories.

The primary table in the new database is the firm table (tblFrim). This table contains the
unique information for each firm. Although the address of the firm would normally be in the same table in this
situation it became clear that each firm could potentially be linked to a number of different addresses. The
addresses a firm could be linked to was based on the use of the address or role. The role table became the basis
for organizing the addresses, contacts, phone numbers, email addresses and web addresses. A table was created for
each of these categories (tblContact, tblAddresses,tblEmail,tblWeb,tblNumbers) and the tables were linked
to the firm table (tblFirm) and the role table (tblRoleLookup).
A diagram of the entire database depicts these relationships they are on the left
side of the figure.

The next type of information were options and policies. There were three options tables; order, copyright and returns
(tblOrder,tblCopyRight,tblReturns).
The three tables allowed organized fields by category. Each of these three option tables
had a one to one relationship with the firm table(tblfirm). A policy table exists for discounts, returns and copyrights
(tblDiscountPolicies, tblReturnPolicies, tblCopyRightPolicies). Each table has a many to one relationship with the firm table (tblFirm).
This many to one relationship dictated a separate table for this type of information, instead of adding the information to
the appropriate options table. The added benefit is if a
different policy needs to be added it will not affect table structure nor program structure.

Of the remaining tables tblOrderLanguage, tblOrderCode, tblOrderClassifiction are join tables that relate information
from orders to tblLanguageType, tblCodeType, and tblClassificationType, respectively. The join tables allow the many
to many relationships between the orders table the other three tables. The thlOrderISBNImprint has a many to one
relationship with tblOrder and is used to hold ISBN information for each firm.

The last set of tables is going to be used by the program. The tblChangeTracking and tblModification tables hold
information about when, where and who made modifications to the data. This can be used to resolve any questions
about when data was updated and who updated the data.

Although the data is normalized it is not in third normal form. A third normal database is the ideal in
some situations denormalization can be used to help optomize queries on the database. The following are some of
the areas where the data is not normalized and the reasons for denormalization. The true test to whether
denormalization makes sense is in the use and maintenance of not only the database but also the program the
database supports.