2.2 Principle of Unintended Uses

The previously cited examples demonstrate that you cannot separate
data from uses. To assess the quality of data, you must first collect a
thorough specification of the intended uses and then judge the data as to its
suitability for those uses. In a perfect world, database builders would gather
all requirements and then craft a database design and applications to match
them.

In the real world there is a serious problem when dealing with
"unintended uses." These are uses that were not known or defined
at the time the databases were designed and implemented.

Unintended uses arise for a large variety of reasons. Some examples
follow:

The company expands to new markets.

The company purchases another company and consolidates
applications.

External requirements are received, such as a new tax law.

The company grows its usage, particularly in decision
making.

The company migrates to a new, packaged application that has
different needs.

Unintended uses represent the single biggest problem with databases, and they
proliferate at ever-expanding rates. You cannot anticipate all uses for a
database when initially building it unless it is a database with unimportant
content. In the real world you can expect (and in fact depend on) a large
number of unintended uses appearing with surprising regularity. Each of these
can cause a good database to become a bad database. Two things are needed:
anticipation in database design and flexibility in implementations.

Need for Anticipation in Database Design

Database designers need to be schooled in the principles of data
quality. With that grounding, they will be able to prevent some data quality
problems from occurring when unintended uses appear. At the least, they
should be schooled to be diligent in the careful and
thorough documentation of the content. This means that metadata repositories
should be more prevalent, more used, and more valued by information systems
groups.

Anticipation also includes questioning each database design
decision in light of what might appear in the future. For example, name and
address fields should anticipate the company's growth into markets in other
countries where the structure and form of elements may vary. Another example is
anticipating sales amounts in multiple national currencies.
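
As an illustration only, the following Python sketch shows one way this kind of anticipation might look in a record layout. The type names Money and PostalAddress, their fields, and the use of ISO 4217 currency codes and ISO 3166-1 country codes are assumptions made for the example; they are not taken from any particular system.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class Money:
    """A sales amount that always carries its currency (hypothetical type)."""
    amount: Decimal
    currency: str  # assumed to be an ISO 4217 code such as "USD" or "EUR"

    def __post_init__(self):
        code = self.currency
        if len(code) != 3 or not code.isalpha() or not code.isupper():
            raise ValueError(f"not a currency code: {code!r}")

    def __add__(self, other):
        if not isinstance(other, Money):
            return NotImplemented
        # Refuse to mix currencies silently; conversion is a separate decision.
        if other.currency != self.currency:
            raise ValueError("cannot add amounts in different currencies")
        return Money(self.amount + other.amount, self.currency)


@dataclass(frozen=True)
class PostalAddress:
    """An address that does not assume one country's structure (hypothetical type)."""
    lines: tuple          # free-form address lines, in local order
    country: str          # assumed to be an ISO 3166-1 alpha-2 code
    locality: str = ""    # city or town, where the concept applies
    region: str = ""      # state, province, or prefecture; may be empty
    postal_code: str = "" # optional; not every country uses one


total = Money(Decimal("10.50"), "USD") + Money(Decimal("4.25"), "USD")
office = PostalAddress(lines=("Unter den Linden 1",), country="DE",
                       locality="Berlin", postal_code="10117")
```

Storing the currency next to every amount and keeping address lines free-form costs little at design time, and it avoids restructuring the data if the company later sells in other countries.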

Database and application designers should be discouraged from using
confusing and complicated data encoding methods. Many bad techniques
proliferated in the past that were the result of the limited capacity and slow
speed of storage devices. These are no longer excuses for making data
structures and encoding schemes overly complicated.

A good database design is one that is resilient in the face of
unintended uses. This principle and the techniques to achieve it must be taught
to the newer generations of information system designers.

Need for Flexibility in Implementations

We know that changes will be made to our systems. Corporations
always change. They change so much that keeping up with the changes is a major
headache for any CIO. Unintended uses are one of the reasons for change. When a
new use appears, its requirements need to be collected and analyzed against the
data it intends to use. This needs to be done up front and thoroughly. If the
database is not up to the task of the new uses, either the new use needs to be
changed or discarded, or the database and its data-generating applications must
be upgraded to satisfy the requirements. This analysis concept needs to be
incorporated into all new uses of data.
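
The analysis described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a prescribed method: the Requirement class, the assess function, the column names, and the sample rows are all hypothetical. Each requirement of the new use is written as a check on one column, and the existing data is counted against it before the new use is built.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Requirement:
    """One thing a new use needs from the data (hypothetical structure)."""
    column: str
    description: str
    check: Callable[[object], bool]  # returns True when a value is usable


def assess(rows, requirements):
    """Count, per requirement, how many rows fail it."""
    failures = {}
    for req in requirements:
        bad = 0
        for row in rows:
            value = row.get(req.column)
            # A missing value fails the requirement without calling the check.
            if value is None or not req.check(value):
                bad += 1
        failures[req.description] = bad
    return failures


# Hypothetical existing data and the needs of a hypothetical new use.
rows = [
    {"customer": "A", "country": "US", "postal_code": "73301"},
    {"customer": "B", "country": None, "postal_code": "10115"},
    {"customer": "C", "country": "DE", "postal_code": ""},
]
requirements = [
    Requirement("country", "country code present", lambda v: len(v) == 2),
    Requirement("postal_code", "postal code present", lambda v: bool(v.strip())),
]
report = assess(rows, requirements)
```

A report with nonzero counts is the signal to make the choice described above: change or discard the new use, or upgrade the database and the applications that generate its data.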

It is amazing how many companies do not think this way. Too often
the data is force-fit into the new use with poor results. How many times have
you heard about a data warehouse project that was completed but yielded nonsense
results from queries? This generally is the result of force-fitting data to a
design it just does not match.

Database systems are better able to accept changes if they are
designed with flexibility in the first place. Relational-based systems tend to
be more flexible than older database technologies. Systems with thorough,
complete, and current metadata will be much easier to change than those lacking
metadata.

Most information system environments do a very poor job of creating
and maintaining metadata repositories. Part of the blame goes to the repository
vendors who have built insufficient systems. Part goes to practitioners who
fail to take the time to use the metadata systems that are there. Part goes to
a lack of education and awareness of how important these things really are.

Note

Organizations must greatly improve their use of metadata repositories
if they have any hope of improving data quality and recouping some of
the losses they are regularly incurring.