For the project my team and I are working on, we often find that we need large pieces of scaffolding code. Creating domain objects with correct values, setting up mocks for repositories, dealing with the cache... these are all things that occur commonly throughout our tests. A lot of the time we're working with the same basic objects that are central to our domain (person, ...), so many tests create instances of these objects for other objects to work on. We have a lot of different solutions using the base domain, so this kind of scaffolding code ends up spread across those solutions.

I've been thinking about creating common classes that do a lot of this scaffolding. This would allow us to request a fully instantiated person with everything set up (access via repository, caching...). It would remove duplicate code from our individual unit tests, but it would also mean there is a large amount of shared code that probably does 'too much' per test (as it would set up a full object rather than just the parts a given test requires).

Has anyone ever done this? Are there any insights, remarks, thoughts... you can offer that might validate or invalidate this approach?

This question came from our site for professional and enthusiast programmers.

I edited your question slightly before migrating it here to remove stuff that'd normally go into an answer. With you listing advantages and disadvantages in the question itself, it reduces the opportunity for someone to give a comprehensive answer that doesn't duplicate what you already said. This way the points you made will ideally emerge as part of the answers. If they don't, feel free to repost them as an answer and community voting on it can tell you if you're right on the money or not.
– Anna Lear♦ Dec 22 '11 at 19:55

8 Answers

I use the builder pattern to create domain objects for tests, as detailed in the "Growing Object-Oriented Software" book. The builders have static methods that generate default values for common objects, plus methods for overriding those defaults.
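For illustration only (Python rather than any particular code base, and all names are hypothetical), a test data builder in that style might look like:

```python
class Person:
    """Hypothetical domain object used in the sketch below."""
    def __init__(self, name, age, email):
        self.name = name
        self.age = age
        self.email = email


class PersonBuilder:
    """Test data builder: sensible defaults, overridable per test."""
    def __init__(self):
        # Defaults that yield a valid Person without test-specific setup.
        self._name = "Jane Doe"
        self._age = 30
        self._email = "jane@example.com"

    @staticmethod
    def a_person():
        # Static factory so tests read like: PersonBuilder.a_person().build()
        return PersonBuilder()

    def named(self, name):
        self._name = name
        return self  # fluent interface: each override returns the builder

    def aged(self, age):
        self._age = age
        return self

    def build(self):
        return Person(self._name, self._age, self._email)


# A test only states what it cares about; everything else is defaulted.
minor = PersonBuilder.a_person().named("Tim").aged(12).build()
```

The point of the pattern is that each test mentions only the values it actually depends on, so the scaffolding stays in one place and the tests stay readable.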

Setting up shared state for all tests can be a good idea, particularly if you already have a deployment process. Instead of using mocks, we create, deploy, and populate a test database, then point the tests at that instance.
It's not exactly how the manual says to do it, but it works, all the more so for us because we reuse our deployment code to do it.

Let's assume that there is no dependency on the database whatsoever and the objects are POCOs; would you still consider a database to retrieve them from? Would maintaining a database that is only used for testing be any different from maintaining object-creation code that is only used for testing?
– JDT Dec 23 '11 at 6:39

I'm not such a fan of code generation (scaffolding, CodeSmith, LLBLGen, ...). It only helps you write duplicate code faster; it isn't really maintenance-friendly. If a framework is causing you to write lots of duplicate (test) code, maybe you need to find a way to move the code base away from that framework.

If you keep writing repository access code, you might be able to solve this with a generic base repository, and have a generic base test class that your unit tests inherit from.
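As a sketch of that idea, with an in-memory repository standing in for the real thing (all names here are illustrative):

```python
import unittest


class InMemoryRepository:
    """Hypothetical generic repository; real code would hit a database."""
    def __init__(self):
        self._items = {}

    def add(self, key, item):
        self._items[key] = item

    def get(self, key):
        return self._items.get(key)


class RepositoryTestBase(unittest.TestCase):
    """Generic base test class: shared setup lives here once.

    Concrete test classes only declare which repository they need.
    """
    repository_factory = InMemoryRepository  # override in subclasses

    def setUp(self):
        self.repo = self.repository_factory()


class PersonRepositoryTest(RepositoryTestBase):
    def test_roundtrip(self):
        self.repo.add(1, "alice")
        self.assertEqual(self.repo.get(1), "alice")
```

Every concrete test class inherits the setup for free; swapping the `repository_factory` attribute is the only per-class variation needed.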

To avoid code duplication, AOP can also help with things like logging, caching, ...

It's a valid approach - but not for the code base we have. I'm currently working with builder classes that create XML-based messages from our domain objects. No amount of facade classes will prevent me from having to plug in fully constructed objects at some point - something I've had to do several times over in different solutions.
– JDT Dec 22 '11 at 14:40

I can relate to this question. My team tried to do TDD before, and it was the same story: too many tests required a lot of scaffolding/utility methods that we simply didn't have. Because of time pressures and a few other not-so-great decisions, we ended up with unit tests that were very long and reinvented things. Over the course of a release, those unit tests became so convoluted and hard to maintain that we had to drop them.

I'm currently working with a few other developers on reintroducing TDD into the team, but instead of simply telling people "go write tests", we want to do it carefully this time and make sure people have the right skills and the right tools. The absence of common support code is one area we identified for improvement, so that when others do start writing tests, all the code they write is small and to the point.

I don't have as much experience in this area yet as I'd like, but here are a few things I'd suggest based on the work we've done so far and some homework I've done:

Common code is a good thing. Doing the same work five times makes no sense, and under time pressure at least four of those times will be half-assed.

My plan is to build up a common framework, then build a TDD leadership team consisting of 4-6 developers. Have them write unit tests, learn the framework, and provide feedback on it. Then have these people go out to the rest of the group and guide others in producing maintainable unit tests that utilize the power of the framework. Also have them gather feedback and feed it back into the framework.

For the examples you've provided with the Person class, I'm actually investigating the use of the mockpp or googlemock frameworks. I'm leaning towards googlemock for now, but haven't tried either yet. In either case, why write something that can be "borrowed"?
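(As an aside, the same "borrow, don't write" point applies outside C++; for instance, Python's standard library ships `unittest.mock`, which replaces hand-rolled fakes. A tiny illustrative sketch, with all names hypothetical:)

```python
from unittest.mock import Mock

# A hand-rolled fake repository is unnecessary; a configured Mock will do.
repository = Mock()
repository.find_by_id.return_value = {"id": 1, "name": "Jane Doe"}

# Code under test would receive this mock instead of the real repository.
person = repository.find_by_id(1)

# The mock also records how it was called, so interactions can be verified.
repository.find_by_id.assert_called_once_with(1)
```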

Be careful about creating "god" fixtures that instantiate objects for all tests regardless of whether those tests actually need those objects. This is what xUnit Test Patterns calls the "General Fixture" pattern. It can cause a few problems: a) each test takes too long to run due to excessive setup code, and b) tests end up with too many unnecessary dependencies. In our framework, each test class picks exactly which framework features it needs in order to execute, so no unnecessary baggage is included (in the future we might change this so that each method picks independently). I got some inspiration from Boost and Modern C++ Design and came up with a base template class that takes an mpl::vector of needed features and builds a custom class hierarchy specifically for that test. Some really nifty Captain Crunch decoder ring super crazy black magic template stuff :)
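The C++ template machinery doesn't translate literally, but the core idea, each test composing exactly the fixture features it needs, can be sketched with mixins (Python purely for brevity; all names are illustrative):

```python
class FixtureBase:
    """End of the cooperative super() chain."""
    def setup_features(self):
        pass


class CacheFeature:
    """Fixture feature: only tests that mix this in pay for cache setup."""
    def setup_features(self):
        super().setup_features()
        self.cache = {}


class RepositoryFeature:
    """Fixture feature: in-memory stand-in for a repository."""
    def setup_features(self):
        super().setup_features()
        self.repository = {}


def make_fixture(*features):
    """Build a fixture from exactly the requested features, loosely
    analogous to passing an mpl::vector of features in the C++ version."""
    cls = type("Fixture", (*features, FixtureBase), {})
    fixture = cls()
    fixture.setup_features()
    return fixture


# This test needs a repository but not a cache, so no cache is built.
fx = make_fixture(RepositoryFeature)
```

The win is the same as in the template version: a test that never touches the cache never pays for cache setup, and never silently depends on it.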

"Any problem in computer science can be solved with one additional layer of indirection. But that usually will create another problem." – David Wheeler

(disclaimer: I'm part of JDT's team, so I have some more insider knowledge of the problem presented here)

The essential problem here is that the objects in the framework themselves expect a database connection, as the framework is loosely based on the CSLA framework by Rockford Lhotka. This makes mocking them almost impossible (as you'd have to change the framework to support it) and also violates the black-box principle a unit test should have.

Your proposed solution, while not a bad idea in itself, will not only require a LOT of work to get to a stable state, but will also add yet another layer of classes that needs to be maintained and extended, adding even more complexity to an already complex situation.

While I agree that you should always look for the best solution, in a real-world situation such as this, being practical has its values as well.

And practicality suggests the following option:

Use a database.

Yes, I know that technically speaking it's not a "real" unit test anymore, but strictly speaking, the moment you cross a class boundary in your test it's not a real unit test anymore either. What we need to ask ourselves here is: do we want 'pure' unit tests that require loads of scaffolding and glue code, or do we want testable code?

You might as well make use of the fact that the framework encapsulates all database access for you, and run your tests against a clean sandbox database. If you then run each test in its own transaction, and roll back at the end of each 'logical' set of tests, you shouldn't have any problems with interdependency or test order.

Sam, see my comment on Tony Hopkinson's reply. Would you do the same if we're dealing with a project that is all just POCO classes?
– JDT Dec 23 '11 at 6:45

Unless you're testing an algorithm that lives on its own without external resources, sure. The idea would be that your 'test' database only contains minimal data and that any non-infrastructure test data would need to be inserted by the tests themselves, thereby reducing your maintainability cost. But admittedly, with 'clear' POCO objects, the case for a test database is not as strong.
– Sam Dec 23 '11 at 7:29

I would warn against having too much code that does too much 'black magic' without developers knowing exactly what is being set up. Also, in time, you'll become very dependent on this code/data. That's why I'm no fan of basing your tests on test databases. I believe test databases are for testing as a user, not for automated tests.

That said, your problem is real. I'd create some common methods to set up the data you need regularly. Ideally, parameterize those methods, because different tests need different data.

But I'd avoid creating big object graphs. It's best to have tests run with the minimum of data they need. If there's a lot of other data around, it could unexpectedly influence your test results (worst case).

I like to keep my tests as "siloed" as possible. It's a subjective question, though. But I'd go for common methods with parameters, so that you can set up your test data in only a few lines of code.
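A minimal sketch of such a parameterized setup method (Python; the names and fields are illustrative, not from the original code base):

```python
def make_person(name="Jane Doe", age=30, with_address=False):
    """Parameterized setup helper: each test asks only for what it needs."""
    person = {"name": name, "age": age}
    if with_address:
        # Only build the extra object graph when a test explicitly asks.
        person["address"] = {"street": "Main St 1", "city": "Springfield"}
    return person


# One line of setup per test, with only the relevant details spelled out.
adult = make_person(age=42)
resident = make_person(with_address=True)
```

Defaults keep the call sites short, while the parameters make each test's intent explicit and keep unrelated data out of the fixture.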