Refactoring Legacy Code

My recent talk at GoGaRuCo is about refactoring legacy code. The talk is
based on the Gilded Rose code kata, which is a really fun kata to try. I
highly recommend setting aside a few hours to play with it. There are
variants of the kata in a number of different languages.

This post reviews the main points I made in the talk, expanding on
some of them a bit more than time allowed in the talk itself.

First of all, if you work with legacy code, you need to read Michael
Feathers’ book, Working Effectively With Legacy Code. This book gives all
kinds of techniques for understanding legacy code and getting tests around
it so that you can safely refactor.

The Gilded Rose kata comes with a spec. I ignored it initially
because with legacy code, it’s hard to tell how accurate the spec is.
It might be out-of-date, or the code might be buggy and not match the
spec. I did come back to the spec at the end once I had refactored
the code and understood it better.

Every time I’ve seen someone tackle the Gilded Rose kata, they’ve chosen
to rewrite the code. It might be an incremental rewrite like Sandi Metz
does in her awesome All The Little Things talk, or it might be a full-on,
throw-all-the-code-away-and-start-over rewrite. I’ve never seen anyone try
to refactor the code into shape. And yet, rewrites are almost never a good
idea. It’s really important to know how to refactor legacy code, so that’s
what I demonstrated.

In order to make progress on necessary features while making the code
better, I tried to balance the following two guidelines:

The Boy Scout Rule: “Leave the campground cleaner than you found
it”. Every time you have to work in a piece of legacy code, do a
bit of extra refactoring to make things better. The more you touch
that code, the cleaner it will get over time. Code that you never
have to touch will stay messy, but that’s OK because you never have
to touch it. This way, the extra cleanup effort goes to the code that
benefits from it most: the code you change most often.

Don’t Boil the Ocean: Stay focused on the task you’re working on.
Yes, make the extra investment in cleaning up the code you have to
touch for the new feature, but don’t go down rabbit trails. You can
clean up other code the next time you work on a feature that touches
it.

Be Safe

If there aren’t tests, write some “characterization tests”, as Feathers
calls them: tests that pin down what the code actually does today,
whether or not that behavior is correct. You really want to be sure that
you’re not changing the visible behavior of the code when you refactor.
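
Here is a minimal sketch of what a characterization test can look like in plain Ruby. Everything in it is made up for illustration: `legacy_shipping_cost` is a hypothetical stand-in for untested legacy code, not anything from the kata. The idea is to run the code over a spread of inputs, record what comes out, and then treat that recording as the expected result.

```ruby
# Hypothetical legacy routine with no tests and no trustworthy spec.
def legacy_shipping_cost(weight, express)
  cost = 5
  if weight > 10
    cost = cost + (weight - 10) * 2
  end
  if express
    cost = cost * 2
    if weight > 10
      cost = cost - 1
    end
  end
  cost
end

# Step 1: run the code over a spread of inputs, including the boundary
# at 10, and record whatever it does today.
def characterize
  [1, 10, 11, 25].product([true, false]).map do |weight, express|
    [weight, express, legacy_shipping_cost(weight, express)]
  end
end

# Step 2: paste the recorded output in as the expected values. These
# numbers describe current behavior; they do not claim it is correct.
def recorded_behavior
  [
    [1, true, 10],  [1, false, 5],
    [10, true, 10], [10, false, 5],
    [11, true, 13], [11, false, 7],
    [25, true, 69], [25, false, 35]
  ]
end

# Step 3: rerun this check after every refactoring step.
def behavior_preserved?
  characterize == recorded_behavior
end

puts behavior_preserved? ? "behavior unchanged" : "behavior changed!"
```

In a real project the same comparison would normally live in whatever test framework the codebase already uses. The plain-method version here is only meant to keep the sketch self-contained.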

Work in very small steps. Do lots of simple micro-refactorings one
step at a time. These baby steps will add up to big changes over time.
Even if you choose to take bigger steps most of the time, it is
important to be able to fall back to really tiny steps when things get
complicated.

In my talk, I performed 80 baby steps to go from the initial state to a
significantly cleaner final state that also implemented the new feature
that was the goal of the kata.

Start Simple

Start with really basic refactorings, preferably ones that can be
performed automatically. I used RubyMine to prepare the talk, and was able
to use its built-in refactoring support quite a bit, especially early on.

Do simple things like reducing noise and removing duplication by
extracting variables and methods, using common language idioms to make
the code cleaner and more expressive, etc.
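
As a rough illustration of the kind of simple, mechanical cleanup meant here, the sketch below shows a before and after. The code is a simplified example loosely inspired by the kata's domain, not the kata's actual source, and the `Item` struct and method names are assumptions made for this example. The changes are small: flip the negated condition, use compound assignment, and use statement modifiers.

```ruby
Item = Struct.new(:name, :quality)

# Before: negated condition first, verbose arithmetic, nested ifs.
def age_before(item)
  if item.name != "Aged Brie"
    if item.quality > 0
      item.quality = item.quality - 1
    end
  else
    if item.quality < 50
      item.quality = item.quality + 1
    end
  end
  item
end

# After: the same behavior, with the positive case first, compound
# assignment, and statement modifiers in place of nested ifs.
def age_after(item)
  if item.name == "Aged Brie"
    item.quality += 1 if item.quality < 50
  else
    item.quality -= 1 if item.quality > 0
  end
  item
end
```

Neither version is smarter than the other. The second one just has less noise, which makes the next refactoring step easier to see.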

When you’re first starting on some new legacy code, you know less
about the code than you ever will. Using simple, mechanical changes
allows you to get your hands on the code and begin to understand it.

Capture Learning

As you learn something about part of the code, try to capture that new
learning in the code. Extract a variable or method and give it a name
that says what the code does. That way you can get that knowledge
out of your head, freeing up bandwidth for later cleanups.

Make sure you express important ideas and domain concepts in the
code. This makes it easier for the next person to understand the code
when they come back to it.
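
To make this concrete, here is a small hedged sketch of capturing a discovery in a name. Suppose you have worked out that `sell_in < 0` means "this item is past its sell-by date" and that such items lose quality twice as fast. The `Item` struct and the names `expired?` and `degradation_rate` are hypothetical choices for this example, and the code is a simplified illustration rather than the kata's real source.

```ruby
Item = Struct.new(:name, :sell_in, :quality)

# Before: the reader has to rediscover what each comparison means.
def degrade_before(item)
  if item.quality > 0
    if item.sell_in < 0
      item.quality = item.quality - 2
    else
      item.quality = item.quality - 1
    end
    item.quality = 0 if item.quality < 0
  end
  item
end

# After: each thing learned about the code now has a name.
def expired?(item)
  item.sell_in < 0
end

def degradation_rate(item)
  expired?(item) ? 2 : 1
end

def degrade_after(item)
  # Guard kept so that items with zero or negative quality stay untouched,
  # exactly as in the original.
  return item unless item.quality > 0

  item.quality = [item.quality - degradation_rate(item), 0].max
  item
end
```

The guard clause in `degrade_after` may look redundant next to the `max`, but it preserves what the original does for items whose quality is already below zero.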

Don’t Blindly Fix Bugs

As you clean up legacy code, you will almost certainly discover bugs.
Don’t blindly fix them. Many times, the bug is actually masked or
worked around by other code or systems. Or people have come to expect
the output they’re getting, and fixing the bug will mess them up.

You should absolutely make a note of the bug and do some research to
figure out if it should be fixed. Talk to others, look at the code
that consumes the buggy result, etc. But don’t just fix the bug
without doing some due diligence.

Until you have a better handle on things, you want to give priority to
preserving the existing behavior of the system, even if it seems
wrong.
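
One way to preserve a suspected bug while still making it visible is to pin it in a test and flag it. The sketch below is an assumption-laden illustration: `legacy_discounted_total` is a hypothetical method invented for this example, with what looks like an off-by-one in its threshold.

```ruby
# Hypothetical legacy method. Orders of exactly 100 get no discount, which
# looks like an off-by-one (">" where ">=" may have been intended).
def legacy_discounted_total(total)
  if total > 100
    total - total / 10 # integer division, kept as the legacy code has it
  else
    total
  end
end

# Pin current behavior, suspected bug included, and flag it for follow-up.
def pinned_cases
  {
    99  => 99,
    100 => 100, # SUSPECTED BUG: no discount at exactly 100. Research first.
    101 => 91,
    200 => 180
  }
end

# Returns the pinned cases that no longer match what the code does.
def surprises
  pinned_cases.reject do |input, expected|
    legacy_discounted_total(input) == expected
  end
end

puts surprises.empty? ? "existing behavior preserved" : "changed: #{surprises}"
```

If the research shows the behavior really is a bug and is safe to fix, the fix becomes a deliberate change: update the pinned case and the code together, separately from the refactoring.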

Practice and Learn

Practice refactoring whenever you can. Try some code katas that
emphasize refactoring. Try refactoring some code in your current
codebase. Even if you choose not to commit the changes, the practice
will help.