Suppose one had a relatively large program (say 900k SLOC in C#), thoroughly commented and documented, well organized, and working well. The entire code base was written by a single senior developer who is no longer with the company. All the code is testable as is and IoC is used throughout, yet for some strange reason no unit tests were ever written. Now your company wants to branch the code, and wants unit tests added to detect when changes break the core functionality.

Is adding tests a good idea?

If so, how would one even start on something like this?

EDIT

OK, so I had not expected answers making good arguments for opposite conclusions. The issue may be out of my hands anyway. I've read through the "duplicate questions" as well, and the general consensus is that "writing tests is good"... yeah, but not too helpful in this particular case.

I don't think I am alone here in contemplating writing tests for a legacy system. I'm going to keep metrics on how much time is spent and how many times the new tests catch problems (and how many times they don't). I'll come back and update this a year or so from now with my results.

CONCLUSION

So it turns out that it is basically impossible to just add unit tests to existing code with any semblance of orthodoxy. Once the code is working you obviously cannot red-light/green-light your tests; it is usually not clear which behaviors are important to test, not clear where to begin, and certainly not clear when you are finished. Really, even asking this question misses the main point of writing tests in the first place. In the majority of cases I found it actually easier to re-write the code using TDD than to decipher the intended functions and retroactively add in unit tests. When fixing a problem or adding a new feature it is a different story, and I believe that this is the time to add unit tests (as some pointed out below); a sketch of what that looks like follows. Eventually most code gets rewritten, often sooner than you'd expect; taking this approach, I've been able to add test coverage to a surprisingly large chunk of the existing codebase.
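
For anyone following the same route, the pattern that worked for me is: when a bug report comes in, first pin the broken behaviour down in a failing test, then fix the code until it passes. A minimal sketch with xUnit; every type here is an invented stand-in for whatever the real codebase contains:

```csharp
using System.Collections.Generic;
using System.Linq;
using Xunit;

// Minimal stand-ins for the real production types, just to make the
// sketch compile; the actual classes already live in the codebase.
public record LineItem(decimal UnitPrice, int Quantity);

public class InvoiceCalculator
{
    public decimal Total(IEnumerable<LineItem> lines) =>
        lines.Sum(l => l.UnitPrice * l.Quantity);
}

public class InvoiceCalculatorTests
{
    // Written while fixing a reported bug: the old code returned a wrong
    // total for zero-quantity line items. The test failed before the fix,
    // passes after it, and guards against regression from now on.
    [Fact]
    public void Total_IsZero_WhenEveryLineItemHasZeroQuantity()
    {
        var calculator = new InvoiceCalculator();
        var lines = new[] { new LineItem(UnitPrice: 9.99m, Quantity: 0) };

        Assert.Equal(0m, calculator.Total(lines));
    }
}
```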

@DanPichelman You've never experienced a schroedinbug - "A design or implementation bug in a program that doesn't manifest until someone reading source or using the program in an unusual way notices that it never should have worked, at which point the program promptly stops working for everybody until fixed."
– MichaelT, Aug 6 '13 at 15:33


@MichaelT Now that you mention it, I think I have seen one or two of those. My comment should have read "Adding tests usually won't break existing code". Thanks
– Dan Pichelman, Aug 6 '13 at 15:39


Only write tests around things which you intend to refactor or change.
– Steve Evers, Aug 13 '13 at 22:18

7 Answers

While tests are a good idea, the intention was for the original coder to build them as he was building the application, capturing his knowledge of how the code is supposed to work and of what might break; that knowledge would then have been transferred to you.

Writing the tests after the fact, there is a high probability that you will write the tests that are least likely to break, and miss most of the edge cases that would have been discovered while building the application.

The problem is that most of the value comes from those 'gotchas' and less obvious situations. Without those tests, the test suite loses virtually all of its effectiveness. In addition, the company will have a false sense of security around its application, as it will not be significantly more regression-proof.

Typically the way to handle this type of codebase is to write tests for new code and for the refactoring of old code until the legacy codebase is entirely refactored.
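
In practice that often means starting with characterization tests: before touching a piece of legacy code, capture what it does today, right or wrong, so the refactoring can be verified against it. A rough sketch with xUnit; LegacyPricing is an invented stand-in for a real legacy class:

```csharp
using Xunit;

// Stand-in for the legacy class under test; in reality this
// already exists somewhere in the 900k SLOC.
public static class LegacyPricing
{
    public static decimal ApplyDiscount(decimal price, decimal rate) =>
        price - price * rate;
}

public class LegacyPricingCharacterizationTests
{
    // Characterization test: pins down what the code does *today*.
    // The expected values were recorded by running the existing code
    // once. If this fails after a refactor, behaviour has changed,
    // whether intentionally or not.
    [Fact]
    public void ApplyDiscount_MatchesCurrentBehaviour()
    {
        Assert.Equal(95.00m, LegacyPricing.ApplyDiscount(100.00m, 0.05m));
        Assert.Equal(100.00m, LegacyPricing.ApplyDiscount(100.00m, 0.00m));
    }
}
```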

But, even without TDD, unit tests are useful when refactoring.
– pdr, Aug 6 '13 at 16:43


If the code continues to work well, it's not a problem, but the best thing to do would be to test the interface to legacy code whenever you write anything that relies on its behaviour.
– deworde, Aug 6 '13 at 20:10


This is why you measure test coverage. If the tests for a particular section don't cover all the ifs and elses and all the edge cases, then you can't safely refactor that section. Coverage will tell you whether all the lines are hit, so your goal is to increase coverage as much as possible before refactoring.
– omouse, Aug 6 '13 at 22:14


A major shortcoming of TDD is that, whilst the suite may run, to a developer unfamiliar with the code base this gives a false sense of security. BDD is much better in this regard, as the output is the intention of the code in plain English.
– Robbie Dee, Aug 13 '13 at 14:26


Just to mention that 100% code coverage does not mean your code is working correctly 100% of the time. You can have every line of a method tested, but just because it works with value1 does not mean it is guaranteed to work with value2.
– ryanzec, Aug 13 '13 at 20:25
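
ryanzec's point is easy to demonstrate. The hypothetical method below gets 100% line coverage from a single test, yet it is still wrong for other inputs:

```csharp
using Xunit;

public class CoverageIsNotCorrectnessTests
{
    // Every line of Average is executed by the test below, so line
    // coverage reports 100%, yet the method overflows for large inputs.
    public static int Average(int a, int b) => (a + b) / 2;

    [Fact]
    public void Average_OfTwoAndFour_IsThree()
    {
        Assert.Equal(3, Average(2, 4)); // passes; full line coverage
        // Average(int.MaxValue, int.MaxValue) returns -1, not int.MaxValue.
    }
}
```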

You say that it's well documented, and that puts you in a good position. Try creating tests using that documentation as a guide, focusing on the parts of the system that are either critical or subject to frequent change.

Initially, the sheer size of the codebase will probably seem overwhelming compared to the tiny amount of tests, but there's no big-bang approach, and making a start somewhere is more important than agonising about what the best approach will be.
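
The documentation also gives you ready-made test cases: each documented rule can be turned more or less directly into a test. A sketch, assuming xUnit; the documented rule and all type names here are invented for illustration:

```csharp
using System;
using Xunit;

// Invented stand-ins; the real types come from the documented codebase.
public class Order { public bool HasLines { get; set; } }

public class OrderWorkflow
{
    public void Submit(Order order)
    {
        if (!order.HasLines)
            throw new ArgumentException("An order must contain at least one line.");
        // ... the rest of the real submission logic ...
    }
}

public class OrderWorkflowTests
{
    // Suppose the documentation says: "Submitting an empty order is
    // rejected with an ArgumentException." That one sentence is enough
    // to write a first test; no TDD history required.
    [Fact]
    public void Submit_EmptyOrder_Throws()
    {
        var workflow = new OrderWorkflow();
        Assert.Throws<ArgumentException>(() => workflow.Submit(new Order()));
    }
}
```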

Another reason for adding tests: when a bug is found, you can easily add a test case for future regression testing.
– bdares, Aug 7 '13 at 2:57

This strategy is basically the one outlined in the (free) edX online course CS169.2x in the chapter about legacy code. As the teachers tell it: see "Establishing Ground Truth With Characterization Tests", Chapter 9 of the book: beta.saasbook.info/table-of-contents
– FGM, Nov 17 '13 at 15:48

Not all unit tests have equal benefit. The benefit of a unit test comes when it fails. The less likely it is to fail, the less beneficial it is. New or recently changed code is more likely to contain bugs than rarely changed code that is well tested in production. Therefore, unit tests on new or recently changed code are likely to be the most beneficial.

Not all unit tests have equal cost. It's much easier to unit test trivial code you designed yourself today than complex code someone else designed a long time ago. Also, testing during development usually saves development time. On legacy code that cost savings is no longer available.

In an ideal world, you'd have all the time you need to unit test legacy code, but in the real world, at some point it stands to reason that the costs of adding unit tests to legacy code will outweigh the benefits. The trick is to identify that point. Your version control can help by showing you the most recently changed and most frequently changed code, and you can start by putting those under unit test. Also, when you make changes going forward, put those changes and closely related code under unit test.

Following that method, eventually you will have pretty good coverage in the most beneficial areas. If you instead spend months getting unit tests in place before resuming revenue-generating activities, that might be a desirable software maintenance decision, but it's a lousy business decision.

If the code rarely fails, a lot of time and effort could be expended finding arcane issues that could never occur in real life. On the other hand, if the code is buggy and error prone, you could probably start testing anywhere and find issues immediately.
– Robbie Dee, Aug 13 '13 at 14:36

Absolutely, though I find it a little hard to believe that the code is clean and working well and using modern techniques and simply has no unit tests. Are you sure they're not sitting in a separate solution?

Anyway, if you're going to extend/maintain the code, then true unit tests are invaluable to that process.

If so, how would one even start on something like this?

One step at a time. If you're unfamiliar with unit testing, then learn a bit. Once you're comfortable with the concepts, pick one little section of the code and write tests for it. Then the next, and the next. Code coverage tools can help you find spots you've missed.

It's probably best to pick dangerous/risky/vital things to test first, but you might be more effective testing something straightforward to get into a groove first, especially if you or the team isn't used to the codebase and/or unit testing.
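
A concrete way to pick that first little section: find a small, dependency-free helper and pin its behaviour with a couple of obvious cases. A sketch with xUnit; StringUtil is an invented stand-in for whatever equivalent the real codebase contains:

```csharp
using Xunit;

// Invented stand-in for a small, dependency-free helper class.
public static class StringUtil
{
    public static string Slugify(string input) =>
        input.Trim().ToLowerInvariant().Replace(' ', '-');
}

public class StringUtilTests
{
    // Two obvious cases are enough to get started; edge cases
    // (nulls, punctuation, Unicode) can be added as they come up.
    [Fact]
    public void Slugify_LowercasesAndReplacesSpaces() =>
        Assert.Equal("hello-world", StringUtil.Slugify("Hello World"));

    [Fact]
    public void Slugify_LeavesSimpleWordsAlone() =>
        Assert.Equal("hello", StringUtil.Slugify("hello"));
}
```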

"Are you sure they're not sitting in a separate solution?" is a great question. I hope the OP doesn't overlook it.
– Dan Pichelman, Aug 6 '13 at 16:10

No chance, unfortunately. The application was started many years ago, just as TDD was gaining traction, so the intent was to do tests at some point, but for some reason once the project started they never got to it.
– Paul, Aug 6 '13 at 16:17


They probably never got to it because it would have taken them extra time that wasn't worth it. A good developer working alone can definitely create a clean, clear, well-organized, and working application of relatively large size without any automated tests, and generally they can do it faster and just as bug-free as with tests. Since it's all in their own head, there is significantly lower chance of bugs or organizational problems compared to the case of multiple developers creating it.
– Ben Lee, Aug 13 '13 at 17:50

Yes, having tests is a good idea. They will help document that the existing codebase works as intended and catch any unexpected behaviour. Even if some tests initially fail, let them fail, and then refactor the code later so that they pass and the code behaves as intended.

Start writing tests for smaller classes (ones that have no dependencies and are relatively simple) and move on to larger classes (ones that have dependencies and are more complex). It will take a long time, but be patient and persistent so that you can eventually cover the codebase as much as possible.
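
Since the question says IoC is used throughout, the "larger classes" step is usually just a matter of handing each class a fake in place of its injected dependencies. A sketch with xUnit and a hand-rolled fake (a mocking library such as Moq or NSubstitute would do the same job); all types here are invented:

```csharp
using System;
using Xunit;

// Invented types: a small class with one injected dependency.
public interface IClock { DateTime Now { get; } }

public class SessionPolicy
{
    private readonly IClock _clock;
    public SessionPolicy(IClock clock) => _clock = clock;

    public bool IsExpired(DateTime startedAt) =>
        _clock.Now - startedAt > TimeSpan.FromMinutes(30);
}

public class SessionPolicyTests
{
    // A fixed fake clock keeps the test deterministic.
    private sealed class FixedClock : IClock
    {
        public FixedClock(DateTime now) { Now = now; }
        public DateTime Now { get; }
    }

    [Fact]
    public void IsExpired_AfterThirtyOneMinutes_IsTrue()
    {
        var policy = new SessionPolicy(
            new FixedClock(new DateTime(2013, 8, 6, 12, 31, 0)));

        Assert.True(policy.IsExpired(new DateTime(2013, 8, 6, 12, 0, 0)));
    }
}
```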

Would you really add failing tests to a program that is working well (as the OP says it is)?
– MarkJ, Aug 6 '13 at 15:56

Yes, because it shows that something isn't working as intended and requires further review. This will prompt discussion and hopefully correct any misunderstandings or previously unknown defects.
– Bernard, Aug 6 '13 at 17:04

@Bernard - or, the tests may expose your misunderstanding of what the code is supposed to do. Tests written after the fact run the risk of not correctly encapsulating the original intentions.
– Dan Pichelman, Aug 7 '13 at 13:14

@DanPichelman: Agreed, but this shouldn't discourage one from writing any tests at all.
– Bernard, Aug 7 '13 at 13:31

If nothing else, it would indicate that the code hasn't been written defensively.
– Robbie Dee, Aug 13 '13 at 14:38

Adding tests to an existing, working system is going to alter that system, unless the system was written with mocking in mind from the start. I doubt it was, though it's quite possible it has good separation of all components, with easily definable boundaries that you can slip your mock interfaces into. If it doesn't, then you are going to have to make quite significant (relatively speaking) changes that may well break things. In the best case, you're going to spend a load of time writing these tests, time that could be better spent writing detailed design documents, impact analysis documents or solution configuration documents instead. After all, that's the work your boss wants done more than unit tests. Isn't it?

Anyway, I would not add any unit tests whatsoever.

I would concentrate on external, automated testing tools that will give you reasonable coverage without changing a thing. Then, when you come to make modifications... that's when you can start adding unit tests inside the codebase.
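
One way to get that outside-in coverage is to drive the application through its outermost entry point and assert on the result, end to end rather than per class. A rough sketch; ApplicationHost and its Process method are invented names standing in for whatever top-level API the real application exposes:

```csharp
using Xunit;

public class EndToEndSmokeTests
{
    // Exercises the system through its public entry point only;
    // nothing inside the codebase has to change to support this.
    [Fact]
    public void ProcessingASampleFile_ProducesTheExpectedSummary()
    {
        var app = new ApplicationHost();                  // invented facade
        var result = app.Process("testdata/sample-order.xml");

        Assert.True(result.Succeeded);
        Assert.Equal(3, result.LineItemCount);
    }
}
```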