There are either too many possible answers, or good answers would be too long for this format. Please add details to narrow the answer set or to isolate an issue that can be answered in a few paragraphs.
If this question can be reworded to fit the rules in the help center, please edit the question.

Of course the point is to add tests and then refactor. The book is largely about how you manage to get impossibly convoluted code under test, and there are a lot of eye-openers on that point. Let's just say that Feathers doesn't take any prisoners!
–
Kilian FothNov 28 '11 at 7:57

5 Answers
5

The key problem with legacy code is that it has no tests. So you need to add some (and then more...).

This in itself would take a lot of work, as @mattnz noted. But the special problem of legacy code is that it was never designed to be testable. So typically it is a huge convoluted mess of spaghetti code, where it is very difficult or downright impossible to isolate small parts to be unit tested. So before unit testing, you need to refactor the code to make it more testable.

However, in order to refactor safely, you must have unit tests to verify that you haven't broken anything with your changes... This is the catch 22 of legacy code.

The book teaches you how to break out of this catch by making the absolute minimal, safest changes to the code just to enable the first unit tests. These aren't meant to make the design nicer - only to enable unit tests. In fact, sometimes they do make the design uglier or more complex. However, they allow you to write tests - and once you have unit tests in place, you are free to make the design better.

There are lots of tricks to make the code testable - some are sort of obvious, some are not at all. There are methods I would have never thought about myself, without reading the book. But what is even more important is that Feathers explains what precisely makes a code unit testable. You need to cut dependencies and introduce barriers into your code, but for two distinct reasons:

sensing - in order to check and verify the effects of executing a piece of code, and

separation - in order to get the specific piece of code into a test harness first of all.

Cutting dependencies safely can be tricky. Introducing interfaces, mocks and Dependency Injection is clean and nice as a goal, just not necessarily safe to do at this point. So sometimes we have to resort to subclassing the class under test in order to override some method which would normally e.g. start a direct request to a DB. Other times, we might even need to replace a dependency class/jar with a fake one in the test environment...

To me, the most important concept brought in by Feathers is seams. A seam is a place in the code where you can change the behaviour of your program without modifying the code itself. Building seams into your code enables separating the piece of code under test, but it also enables you to sense the behaviour of the code under test even when it is difficult or impossible to do directly (e.g. because the call makes changes in another object or subsystem, whose state is not possible to query directly from within the test method).

This knowledge allows you to notice the seeds of testability in the nastiest heap of code, and find the minimal, least disruptive, safest changes to get there. In other words, to avoid making "obvious" refactorings which have a risk of breaking the code without you noticing - because you don't yet have the unit tests to detect that.

I work on a code base of millions of lines of code, some dating back to the 1980's. It's just software, so it's just a matter of writing a few unit tests, so you can go and just refactor it, and make it so much better.

The key word here is just - it's a four-letter word that does not belong in any programmer's vocabulary, let alone one who is working on legacy systems.

How long do you think it takes to write a unit test, to test one hour worth of development effort? For discussion sake, let's say another hour.

How much time is invested in that million-line, 20-year-old legacy system? Let's say, 20 developers for 20 years times 2000 hours / annum (they worked pretty hard). Let's now pick a number - you have new computers and new tools, and you are so much more clever than the guys who wrote this piece of $%^^ in the first place - let's say you are worth 10 of them. Have you got 40 man years, well, have you...?

So the answer to your question is there is much much more. For instance, that routine that's 1000 lines (I have a few that are over 5000), it is overly complex and is a piece of spaghetti. It would only (yet another four letter word) take a couple of days to re-factor it into a few 100 line routines and a few more 20 lines helpers, right? WRONG. Hidden in those 1000 lines is 100 bug fixes, each one an undocumented user requirement or an obscure edge case. It's 1000 lines because the original 100-line routine did not do the job.

You need to work with the mind set "if it ain't broke, don't fix it". When it is broke, you need to be very careful when you fix it - as you make it better, that you don't accidentally change something else. Note that "broke" may include code that is unmaintainable, but working correctly, that depends on the system and its use. Ask "what happens if I screw this up and make it worse", because one day you will, and you will have to tell the bosses' boss why you chose to do that.

These systems can always be made better. You will have a budget to work to, a timeline, whatever. If you don't - go and make one. Stop making it better when the money/time has run out. Add a feature, give yourself time to make it a little better. Fix a bug - again, spend a little extra time and make it better. Never deliver it worse than it was when you started.

thanks for the tips! Are these yours or from the book?
–
AlisonNov 28 '11 at 8:50

2

Probably a bit of both - I read the book after a few years of doing this work, and should probably read it agian. Like any good book, it will make you challange some of what you are currently doing, re-enforce most of what you do, but does not have all the answers for your specific set of problems.
–
mattnzNov 28 '11 at 19:14

3

"It's 1000lines because the orginal 100line routine did not do the job." So very rarely does this appear to actually be the case. More often than not it's 1,000 lines simply because the original developer rolled up his/her sleeves and started coding before sparing even a moment for planning or design.
–
Stephen TousetAug 22 '12 at 15:16

1

No. I'm saying that in most of the cases (I've personally encountered), 1,000 line routines are clear indications that people started writing code before thinking about how to write a proper abstraction. Thousand line routines are virtually by definition too complicated – are you saying that a thousand-line routine with hundreds of hidden, uncommented bugfixes is the hallmark of a responsible developer?
–
Stephen TousetAug 23 '12 at 4:53

5

If you believed every post on this site, everyone has to deal with 1000 line spaghetti code, yet no one ever wrote it. In my experience, 1000 (and 10000) line routines are the hall mark of developers who are doing the best they can with what they have, to deliver what is required of them by the boss who pays their wages. I find it insulting and arrogant the way so many developers feel free to comment from the sidelines with no knowledge of the circumstances, while never having to expose there own work to the community to critique.
–
mattnzAug 24 '12 at 1:41

Whenever you have to change legacy code, you should make sure it has coverage.

As other responders have pointed out, trying to pre-emptively update your existing legacy code is a fool's errand. Instead, whenever you have to make a change to legacy code (for a new feature or a bug fix), take the time to remove its legacy status.