Description

Copyright 2006

Dimensions: 7x9-1/4

Pages: 384

Edition: 1st

Book

ISBN-10: 0-321-29353-3

ISBN-13: 978-0-321-29353-4

Refactoring has proven its value in a wide range of development projects—helping software professionals improve system designs, maintainability, extensibility, and performance. Now, for the first time, leading agile methodologist Scott Ambler and renowned consultant Pramodkumar Sadalage introduce powerful refactoring techniques specifically designed for database systems.

Ambler and Sadalage demonstrate how small changes to table structures, data, stored procedures, and triggers can significantly enhance virtually any database design—without changing semantics. You’ll learn how to evolve database schemas in step with source code—and become far more effective in projects relying on iterative, agile methodologies.

This comprehensive guide and reference helps you overcome the practical obstacles to refactoring real-world databases by covering every fundamental concept underlying database refactoring. Using start-to-finish examples, the authors walk you through refactoring simple standalone database applications as well as sophisticated multi-application scenarios. You’ll master every task involved in refactoring database schemas, and discover best practices for deploying refactorings in even the most complex production environments.

The second half of this book systematically covers five major categories of database refactorings. You’ll learn how to use refactoring to enhance database structure, data quality, and referential integrity; and how to refactor both architectures and methods. This book provides an extensive set of examples built with Oracle and Java and easily adaptable for other languages, such as C#, C++, or VB.NET, and other databases, such as DB2, SQL Server, MySQL, and Sybase.

Using this book’s techniques and examples, you can reduce waste, rework, risk, and cost—and build database systems capable of evolving smoothly, far into the future.

Preface

Refactoring Databases

Evolutionary Database Design

Evolutionary, and often agile, software development methodologies, such as
Extreme Programming (XP), Scrum, the Rational Unified Process (RUP), the Agile
Unified Process (AUP), and Feature-Driven Development (FDD), have taken the
information technology (IT) industry by storm over the past few years. For the
sake of definition, an evolutionary method is one that is both iterative and
incremental in nature, and an agile method is evolutionary and highly collaborative
in nature. Furthermore, agile techniques such as refactoring, pair programming,
Test-Driven Development (TDD), and Agile Model-Driven Development (AMDD) are
also making headway into IT organizations. These methods and techniques have
been developed and have evolved in a grassroots manner over the years, being
honed in the software trenches, as it were, instead of formulated in ivory towers.
In short, this evolutionary and agile stuff seems to work incredibly well in
practice.

In the seminal book Refactoring, Martin Fowler describes a refactoring
as a small change to your source code that improves its design without changing
its semantics. In other words, you improve the quality of your work without
breaking or adding anything. In the book, Martin discusses the idea that just
as it is possible to refactor your application source code, it is also possible
to refactor your database schema. However, he states that database refactoring
is quite hard because of the significant levels of coupling associated with
databases, and therefore he chose to leave it out of his book.

Since 1999, when Refactoring was published, the two of us have found
ways to refactor database schemas. Initially, we worked separately, running
into each other at conferences such as Software Development (http://www.sdexpo.com)
and on mailing lists (http://www.agiledata.org/feedback.html).
We discussed ideas, attended each other's conference tutorials and presentations,
and quickly discovered that our ideas and techniques overlapped and were highly
compatible with one another. So we joined forces to write this book, to share
our experiences and techniques at evolving database schemas via refactoring.

The examples throughout the book are written in Java, Hibernate, and Oracle
code. Virtually every database refactoring description includes code to modify
the database schema itself, and for some of the more interesting refactorings,
we show the effects they would have on Java application code. Because not all
databases are created alike, we include discussions of alternative implementation
strategies when important nuances exist between database products. In some instances
we discuss alternative implementations of an aspect of a refactoring using Oracle-specific
features such as the SET UNUSED or RENAME TO commands, and many of our code
examples take advantage of Oracle's COMMENT ON feature. Other database
products include other features that make database refactoring easier, and a
good DBA will know how to take advantage of these things. Better yet, in the
future, database refactoring tools will do this for us. Furthermore, we have
kept the Java code simple enough so that you should be able to convert it to
C#, C++, or even Visual Basic with little problem at all.
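
As a rough illustration of the Oracle features named above, the following sketch builds one statement of each kind as a plain string. The class name, the Customer and Client tables, and the FName and FirstName columns are hypothetical, chosen only for illustration. The statement forms follow standard Oracle syntax and are not copied from the book's examples.

```java
import java.util.List;

// Hypothetical helper: builds Oracle statements as strings.
// It does not connect to a database.
final class OracleRefactoringDdl {

    // Marks a column as unused so that it can be physically dropped later.
    static String setUnused(String table, String column) {
        return "ALTER TABLE " + table + " SET UNUSED COLUMN " + column;
    }

    // Renames a table in place.
    static String renameTable(String oldName, String newName) {
        return "ALTER TABLE " + oldName + " RENAME TO " + newName;
    }

    // Records a note about a column in the data dictionary.
    static String commentOnColumn(String table, String column, String note) {
        // Single quotes are doubled so the note stays a valid SQL string literal.
        return "COMMENT ON COLUMN " + table + "." + column
                + " IS '" + note.replace("'", "''") + "'";
    }

    public static void main(String[] args) {
        List<String> statements = List.of(
                renameTable("Customer", "Client"),
                setUnused("Client", "FName"),
                commentOnColumn("Client", "FirstName", "Replaces FName"));
        statements.forEach(System.out::println);
    }
}
```

Keeping the statements as strings makes the sketch runnable without a database; in practice they would be executed through a DBA's script or a JDBC connection.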

Why Evolutionary Database Development?

Evolutionary database development is a concept whose time has come. Rather
than trying to design your database schema up front, early in the project, you
build it up throughout the life of the project to reflect the changing
requirements defined by your stakeholders. Like it or not, requirements change
as your project progresses. Traditional approaches have denied this fundamental
reality and have tried to "manage change," a euphemism for preventing
change, through various means. Practitioners of modern development techniques
instead choose to embrace change and follow techniques that enable them to evolve
their work in step with evolving requirements. Programmers have adopted techniques
such as TDD, refactoring, and AMDD and have built new development tools to make
this easy. As we have done this, we have realized that we also need techniques
and tools to support evolutionary database development.

Advantages to an evolutionary approach to database development include the
following:

You minimize waste. An evolutionary, just-in-time (JIT) approach
enables you to avoid the inevitable wastage inherent in serial techniques
when requirements change. Any early investment in detailed requirements,
architecture, and design artifacts is lost when a requirement is later found
to be no longer needed. If you have the skills to do the work up front,
clearly you must have the skills to do the same work JIT.

You avoid significant rework. As you will see in Chapter 1, "Evolutionary
Database Development," you should still do some initial modeling up
front to think major issues through, issues that could potentially lead
to significant rework if identified late in the project; you just do not
need to investigate the details early.

You always know that your system works. With an evolutionary approach,
you regularly produce working software, even if it is deployed only into
a demo environment. When you have a new, working version of
the system every week or two, you dramatically reduce your project's
risk.

You always know that your database design is the highest quality possible.
This is exactly what database refactoring is all about: improving your schema
design a little bit at a time.

You work in a compatible manner with developers. Developers work
in an evolutionary manner, and if data professionals want to be effective
members of modern development teams, they also need to choose to work in
an evolutionary manner.

You reduce the overall effort. By working in an evolutionary manner,
you only do the work that you actually need today and no more.

There are also several disadvantages to evolutionary database development:

Cultural impediments exist. Many data professionals prefer to follow
a serial approach to software development, often insisting that some form
of detailed logical and physical data models be created and baselined before
programming begins. Modern methodologies have abandoned this approach as
being too inefficient and risky, thereby leaving many data professionals
in the cold. Worse yet, many of the "thought leaders" in the data
community are people who cut their teeth in the 1970s and 1980s but who
missed the object revolution of the 1990s, and thereby missed gaining experience
in evolutionary development. The world changed, but they did not seem to
change with it. As you will learn in this book, it is not only possible
for data professionals to work in an evolutionary, if not agile, manner,
it is in fact a preferable way to work.

Learning curve. It takes time to learn these new techniques, and
even longer if you also need to change a serial mindset into an evolutionary
one.

Tool support is still evolving. When Refactoring was published
in 1999, no tools supported the technique. Just a few years later, every
single integrated development environment (IDE) has code-refactoring features
built right into it. At the time of this writing, there are no database
refactoring tools in existence, although we do include all the code that
you need to implement the refactorings by hand. Luckily, the Eclipse Data
Tools Project (DTP) has indicated in their project prospectus the need to
develop database-refactoring functionality in Eclipse, so it is only a matter
of time before the tool vendors catch up.

Agility in a Nutshell

Although this is not specifically a book about agile software development,
the fact is that database refactoring is a primary technique for agile developers.
A process is considered agile when it conforms to the four values of the Agile
Alliance (http://www.agilealliance.org).
The values define preferences, not alternatives, encouraging a focus on certain
areas but not eliminating others. In other words, whereas you should value the
concepts on the right side, you should value the things on the left side even
more. For example, processes and tools are important, but individuals and interactions
are more important. The four agile values are as follows:

Individuals and interactions OVER processes and tools. The most
important factors that you need to consider are the people and how they
work together; if you do not get that right, the best tools and processes
will not be of any use.

Working software OVER comprehensive documentation. The primary goal
of software development is to create working software that meets the needs
of its stakeholders. Documentation still has its place; written properly,
it describes how and why a system is built, and how to work with the system.

Customer collaboration OVER contract negotiation. Only your customer
can tell you what they want. Unfortunately, they are not good at this; they
likely do not have the skills to exactly specify the system, nor will they
get it right at first, and worse yet they will likely change their minds
as time goes on. Having a contract with your customers is important, but
a contract is not a substitute for effective communication. Successful IT
professionals work closely with their customers, they invest the effort
to discover what their customers need, and they educate their customers
along the way.

Responding to change OVER following a plan. As work progresses on
your system, your stakeholders' understanding of what they want changes,
the business environment changes, and so does the underlying technology.
Change is a reality of software development, and as a result, your project
plan and overall approach must reflect your changing environment if it is
to be effective.

How to Read This Book

The majority of this book, Chapters 6 through 11, consists of reference material
that describes each refactoring in detail. The first five chapters describe
the fundamental ideas and techniques of evolutionary database development, and
in particular, database refactoring. You should read these chapters in order:

Chapter 2, "Database Refactoring," explores in detail the concepts
behind database refactoring and why it can be so hard to do in practice.
It also works through a database-refactoring example in both a "simple"
single-application environment as well as in a complex, multi-application
environment.

Chapter 3, "The Process of Database Refactoring," describes
in detail the steps required to refactor your database schema in both simple
and complex environments. With single-application databases, you have much
greater control over your environment, and as a result need to do far less
work to refactor your schema. In multi-application environments, you need
to support a transition period in which your database supports both the
old and new schemas in parallel, enabling the application teams to update
and deploy their code into production.

Chapter 4, "Deploying into Production," describes the process
behind deploying database refactorings into production. This can prove particularly
challenging in a multi-application environment because the changes of several
teams must be merged and tested.

Chapter 5, "Database Refactoring Strategies," summarizes some
of the "best practices" that we have discovered over the years
when it comes to refactoring database schemas. We also float a couple of
ideas that we have been meaning to try out but have not yet been able to
do so.
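
The transition period mentioned in the Chapter 3 summary, in which the database supports both the old and new schemas in parallel, can be sketched for a single column rename. This is a minimal sketch under stated assumptions, not the book's implementation: the class, the Customer table, the FName and FirstName columns, the column type, and the drop date are all hypothetical, and the trigger is deliberately simplified so that it only fills in whichever of the two columns a writer left null.

```java
import java.time.LocalDate;
import java.util.List;

// Hypothetical sketch: generates Oracle statements for renaming a column
// while the old and new columns coexist during a transition period.
final class RenameColumnTransition {
    private final String table;
    private final String oldColumn;
    private final String newColumn;
    private final String columnType;
    private final LocalDate dropDate; // assumed end of the transition period

    RenameColumnTransition(String table, String oldColumn, String newColumn,
                           String columnType, LocalDate dropDate) {
        this.table = table;
        this.oldColumn = oldColumn;
        this.newColumn = newColumn;
        this.columnType = columnType;
        this.dropDate = dropDate;
    }

    private String triggerName() {
        return "Synchronize" + newColumn;
    }

    // Run at deployment: from here on both columns exist and are kept in step.
    List<String> startTransition() {
        return List.of(
            "ALTER TABLE " + table + " ADD " + newColumn + " " + columnType,
            "UPDATE " + table + " SET " + newColumn + " = " + oldColumn,
            // Simplified trigger: copies a value into whichever column is null.
            "CREATE OR REPLACE TRIGGER " + triggerName() + "\n"
                + "BEFORE INSERT OR UPDATE ON " + table + "\n"
                + "FOR EACH ROW\n"
                + "BEGIN\n"
                + "  IF :NEW." + newColumn + " IS NULL THEN\n"
                + "    :NEW." + newColumn + " := :NEW." + oldColumn + ";\n"
                + "  END IF;\n"
                + "  IF :NEW." + oldColumn + " IS NULL THEN\n"
                + "    :NEW." + oldColumn + " := :NEW." + newColumn + ";\n"
                + "  END IF;\n"
                + "END;",
            // Documents the planned removal of the old column.
            "COMMENT ON COLUMN " + table + "." + oldColumn
                + " IS 'Renamed to " + newColumn + ", drop date = " + dropDate + "'");
    }

    // Run after the drop date, once every application uses the new column.
    List<String> endTransition() {
        return List.of(
            "DROP TRIGGER " + triggerName(),
            "ALTER TABLE " + table + " DROP COLUMN " + oldColumn);
    }

    public static void main(String[] args) {
        RenameColumnTransition rename = new RenameColumnTransition(
            "Customer", "FName", "FirstName", "VARCHAR2(40)",
            LocalDate.of(2007, 6, 14));
        rename.startTransition().forEach(System.out::println);
        rename.endTransition().forEach(System.out::println);
    }
}
```

Splitting the work into startTransition and endTransition mirrors the two deployments involved: the first introduces the new schema alongside the old one, and the second removes the old schema once the application teams have updated and deployed their code.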

About the Cover

Each book in the Martin Fowler Signature Series has a picture of a bridge on
the front cover. This tradition reflects the fact that Martin's wife is
a civil engineer, who at the time the book series started worked on horizontal
projects such as bridges and tunnels. This bridge is the Burlington Bay James
N. Allan Skyway in Southern Ontario, which crosses the mouth of Hamilton Harbor.
At this site are three bridges: the two in the picture and the Eastport Drive
lift bridge, not shown. This bridge system is significant for two reasons. Most
importantly, it shows an incremental approach to delivery. The lift bridge originally
bore the traffic through the area, as did another bridge that collapsed in 1952
after being hit by a ship. The first span of the Skyway, the portion in the
front with the metal supports above the roadway, opened in 1958 to replace the
lost bridge. Because the Skyway is a major thoroughfare between Toronto to the
north and Niagara Falls to the south, traffic soon exceeded capacity. The second
span, the one without metal supports, opened in 1985 to support the new load.
Incremental delivery makes good economic sense in both civil engineering and
in software development. The second reason we used this picture is that Scott
was raised in Burlington, Ontario; in fact, he was born in Joseph Brant hospital,
which is near the northern footing of the Skyway. Scott took the cover picture
with a Nikon D70S.