Background: We have inherited a very large codebase (1.4 million lines), primarily in C#. The application consists mainly of ASP.NET 2.0-style ASMX web services that access data in a SQL Server 2008 database and in various XML files. There are no automated tests, but we do have an automated nightly build (CC.NET).

We want to introduce some level of automated testing, but retrofitting granular unit tests onto this much code seems unrealistic. Our first thought is to construct automated tests that simply call each web service with a given set of parameters. That seems like the quickest way to get the most code coverage from automated tests. Is this even called unit testing, or would it be considered something else?

How would I go about isolating the data stores to get consistent test results? Would any test tools work better for this approach than others? xUnit? MSTest? NUnit?

Any suggestions to get us started in the right direction would be greatly appreciated. Thanks

4 Answers

My company has been doing something similar with our code base (C rather than C#) that totals about a million lines. The steps have gone something like this:

1) Write some automated tests like you describe that do system level tests.
2) Implement the rule that new code shall have a unit test.
3) When an area has a few bugs against it, the process of fixing those bugs should include writing a basic unit test.

The point is that step 3 shouldn't require complete unit test coverage (the easier it is, the more likely people are to do it). If you move from 0% test coverage to 40% coverage of a particular module, then you've made great progress.

Though 6 months in you may only be up to 5% of the total code base, that 5% is the code that is changing the most and where you are most likely to introduce bugs. The code I work on now is about 60% covered by integration tests and 15% (by line) covered by unit tests. That doesn't seem like a lot, but it provides significant value and our development effort has benefited from it.

Edit: in response to one of the other comments, the current set of integration tests we run takes about 14 hours. We're now looking at running some of them in parallel to speed things up.
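As a rough illustration of running independent suites side by side, here is a minimal Python sketch. The suite names, the `run_suite` stand-in, and the worker count are all hypothetical; the real suites described above are not shown in this thread. It only helps when the suites do not share mutable state such as a single test database.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def run_suite(name, seconds):
    """Hypothetical stand-in for one independent integration suite."""
    time.sleep(seconds)  # simulates slow, I/O-bound test work
    return name, "passed"


def run_in_parallel(suites, workers=4):
    """Run (name, seconds) suites concurrently and return {name: outcome}."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [pool.submit(run_suite, name, secs) for name, secs in suites]
        # result() blocks until each suite finishes and re-raises its errors
        return dict(future.result() for future in futures)
```

With four workers, four suites of roughly equal length should finish in about the time of the slowest one rather than the sum of all four.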

The tests that you describe sound like more of an end-to-end or integration test than a unit test, strictly speaking. But that's not necessarily a bad thing! In your situation, it may be productive to work down from the end-to-end tests towards the unit tests, rather than up as you would on a new codebase.

Write at least one simple test for each web service API, to ensure that you have some coverage of everything.

Identify the APIs that have historically been prone to failure, based on past bug reports. Write more extensive tests for those APIs.

Every time that you encounter a failing test, drop down a level and write tests for the methods called on that code path. Eventually you'll find that one of those tests is failing. If the bug isn't obvious at that level, drop down a level and repeat.

Rinse, wash, repeat every time you get a new bug report.

The idea here is that you introduce unit-test coverage "as needed" along the code paths that are currently prone to failure. This will harden those code paths for the future, and will gradually expand your unit test coverage over the entire application.
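The first step above (one simple test per web service API) could be sketched as a table of calls that a single loop walks through. This is only an illustration in Python: the base URL, service names, method names, parameters, and expected fragments are all made up, and it assumes the services accept form-encoded HTTP POST requests. The same table-driven shape would carry over to NUnit, xUnit, or MSTest.

```python
import urllib.parse
import urllib.request

# Hypothetical endpoint and cases; none of these names come from the question.
BASE_URL = "http://localhost/Services"

SMOKE_CASES = [
    # (service, method, parameters, fragment expected in the response)
    ("Orders.asmx", "GetOrder", {"orderId": "42"}, "<Order"),
    ("Customers.asmx", "FindCustomer", {"name": "Smith"}, "<Customer"),
]


def http_post(url, params):
    """Send form-encoded parameters and return the response body as text."""
    data = urllib.parse.urlencode(params).encode("utf-8")
    with urllib.request.urlopen(url, data=data, timeout=30) as response:
        return response.read().decode("utf-8")


def run_smoke_cases(cases, transport=http_post):
    """Call every case once and return a list of (label, problem) failures."""
    failures = []
    for service, method, params, expected in cases:
        label = f"{service}/{method}"
        try:
            body = transport(f"{BASE_URL}/{service}/{method}", params)
        except Exception as exc:  # any transport error counts as a failure
            failures.append((label, f"error: {exc}"))
            continue
        if expected not in body:
            failures.append((label, f"missing {expected!r}"))
    return failures
```

Each entry in the returned list marks a code path where it is worth dropping down a level and writing narrower tests. The `transport` parameter exists so the loop itself can be exercised without a running server.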

As JSBangs pointed out, those are called integration tests. And I agree that integration tests are a better fit than unit tests in your case. With 1.4 million lines of code, I am not sure how much of that code base you could realistically unit test.

For isolating the data stores, I like to do the following:

Have a set of hard-coded data to start off with.

After a while, you should have tests that actually create a bunch of test data. Run those tests first and you no longer need the hard-coded data.

When something goes wrong in production due to a set of "bad data", add that data to your tests. Eventually, you will have a good set of test data that covers most of the cases.
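A minimal sketch of that idea, using Python with an in-memory SQLite database as a stand-in for SQL Server: the `customers` table, its rows, and the function under test are invented for illustration. Every test starts from the same known rows, and rows that reproduce a production "bad data" incident get appended to the regression list.

```python
import sqlite3

# Hard-coded baseline rows (hypothetical schema, for illustration only).
BASELINE_CUSTOMERS = [
    (1, "Alice", "alice@example.com"),
    (2, "Bob", "bob@example.com"),
]

# Rows reproducing "bad data" seen in production; extend this list over time.
REGRESSION_CUSTOMERS = [
    (3, "No Email", None),
    (4, "", "blank-name@example.com"),
]


def fresh_database():
    """Build a new, isolated database seeded with the known test rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT, email TEXT)"
    )
    conn.executemany(
        "INSERT INTO customers VALUES (?, ?, ?)",
        BASELINE_CUSTOMERS + REGRESSION_CUSTOMERS,
    )
    conn.commit()
    return conn


def customers_missing_email(conn):
    """Example code under test: return the ids of customers with no email."""
    rows = conn.execute("SELECT id FROM customers WHERE email IS NULL ORDER BY id")
    return [row[0] for row in rows]
```

Because each call to `fresh_database()` returns a separate database, a test that deletes or corrupts rows cannot affect the next one. Against a real server, one common way to get the same effect is to restore a known backup or run a seed script before the suite starts.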

Also, keep in mind that integration tests take longer to run. You might want to run them on a dedicated test machine so they don't tie up your own computer. It's pretty normal for a test suite to take hours to run.

One thing you could look at is writing user stories with something like SpecFlow. I've found that these stories map more naturally to integration tests, which is what you want. An added benefit is that they give you a set of near-use-cases that non-technical teams (e.g. product managers, business analysts) can use as well.