Still having fun writing code

My Opinions on Data Setup for Functional Tests

I read Jim Holmes’s post “Data Driven Testing: What’s a Good Dataset Size?” with some interest because it’s very relevant to my work. I’ve been heavily involved in test automation efforts over the past decade, and I’ve developed a few opinions about how best to handle test data input for functional tests (as opposed to load/scalability/performance tests). First though, here are a few concerns I have:

Automated tests need to be reliable. Tests that require external, environmental setup can be brittle in unexpected ways. I hate that.

Your tests will fail from time to time with regression bugs. It’s important that your tests are expressed in a way that makes it easy to understand the cause and effect relationship between the “known inputs” and the “expected outcomes.” I can’t tell you how many times I’ve struggled to fix a failing test because I couldn’t even understand what exactly it was supposed to be testing.

My experience says loudly that smaller, more focused automated tests are far easier to diagnose and fix when they fail than very large, multi-step automated tests. Moreover, large tests that drive user interfaces are much more likely to be unstable and unreliable. Regardless of platform, problem domain, and team, I know that I’m far more productive when I’m working with quicker feedback cycles. If any of my colleagues are reading this, now you know why I’m so adamant about having smaller, focused tests rather than large scripted scenarios.

Automated tests should enable your team to change or evolve the internals of your system with more confidence.

Be Very Cautious with Shared Test Data

If I had my druthers, I would not share any test setup data between automated tests except for very fundamental things like the inevitable lookup data and maybe some default user credentials or client information that can safely be considered static. Unlike “real” production coding where “Don’t Repeat Yourself” is crucial for maintainability, in testing code I’m much more concerned with making each test as self-explanatory as possible and keeping the tests completely isolated from one another. If you share test setup data between tests, sooner or later you’ll want to add a little bit more data for a new test, and that addition ends up breaking the assertions in the existing tests. Besides that, using a shared test data set means that you probably have more data than any single test really needs, which makes the diagnosis of test failures harder. For all of those reasons and more, I strongly prefer that my teams copy and paste bits of test data sets to keep them isolated by test rather than shared.
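To make that preference concrete, here’s a minimal sketch in Python. The `Order` type and the `orders_over` function are hypothetical stand-ins for a system under test, not anything from a real project. Each test declares only the rows it needs, so adding data for a new test cannot disturb the assertions of an existing one.

```python
from dataclasses import dataclass


@dataclass
class Order:
    customer: str
    total: float


def orders_over(orders, threshold):
    """Hypothetical function under test: orders whose total exceeds the threshold."""
    return [order for order in orders if order.total > threshold]


def test_returns_only_orders_above_threshold():
    # The data lives inside the test, so the link between the known inputs
    # and the expected outcome is visible in one place.
    orders = [Order("acme", 50.0), Order("globex", 150.0)]
    assert orders_over(orders, 100.0) == [Order("globex", 150.0)]


def test_returns_nothing_when_no_order_qualifies():
    # Deliberately repeats a row from the test above instead of sharing it.
    orders = [Order("acme", 50.0)]
    assert orders_over(orders, 100.0) == []
```

The duplicated `Order("acme", 50.0)` row is the price of isolation: either test can change its data without the other one noticing.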

Self-Contained Tests are Best

I’ve been interested in the idea of executable specifications for a long time. For the tests to have some value as living documentation about the desired behavior of the system, I think the relationship between the germane data inputs and the observed behavior of the system needs to be as clear as possible. Plus, automated tests are completely useless if you cannot reliably run them on demand or inside a Continuous Integration build. In the past I’ve also found that understanding or fixing a test is much harder if I have to constantly ALT-TAB between windows or even just swivel my head between a SQL script or some other external file and the body of the test itself.

I’ve found that both the comprehensibility and reliability of an automated test are improved by making each automated test self-contained. What I mean by that is that every part of the test is expressed in one readable document including the data setup, exercising the system, and verifying the expected outcome. That way the test can be executed at almost any time because it takes care of its own test data setup rather than being dependent on some sort of external action. To pull that off you need to be able to very concisely describe the initial state of the system for the test, and shared data sets, raw SQL scripts, XML, JSON, or raw calls to your system’s API can easily be noisy. Which leads me to say that I think you should…

Decouple Test Input from the Implementation Details

I’m a big believer in the importance of reversibility to the long term success of a system. With that in mind, we write automated tests to pin down the desired behavior of the system and spend a lot of energy designing the structure of our code to more readily accept changes later. All too frequently, I’ve seen systems become harder to change over time specifically because of tight coupling between the automated tests and the implementation details of a system. In this case, the automated test suite will actually slow down or flat out prevent changes to the system instead of enabling you to more confidently change the system. Maybe even worse, that tight coupling means that the team will have to eliminate or rewrite the automated tests in order to make a desired change to the system.

With that in mind, I somewhat strongly recommend expressing your test data input in some form of interpreted format rather than as SQL statements or direct API calls. My team uses Storyteller2 where all test input is expressed in logical tables or “sentences” that are not tightly coupled to the structure of our persisted documents. I think that simple textual formats or interpreted Domain Specific Languages are also viable alternatives. Despite the extra work to write and maintain a testing DSL, I think there are some big advantages to doing it this way:

You’re much more able to make additions to the underlying data storage without having to change the tests. With an interpreted data approach, you can simply add fake data defaults for new columns or fields.

You can express only the data that is germane to the functionality that your test is targeting. More on this when I talk about my current project.

You can frequently make the test data setup mechanically much cheaper per test by reducing the amount of data the test author has to write, with sensible default values supplied behind the scenes. I think this topic is probably worth a blog post on its own someday.

This goes far beyond just the test data setup. I think it’s very advantageous in general to express your functional tests in a way that is independent of implementation details of your application — especially if you’re going to drive a user interface in your testing.
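As a rough illustration of interpreted test input with defaults, here’s a small Python sketch. Everything in it is an assumption made for the example: the `Customer` record, its default values, the pipe-delimited table format, and the `customers_from_table` helper are hypothetical, and this is not how Storyteller2 works internally. The test author writes only the columns that matter, and the interpreter fills in the rest.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Customer:
    name: str
    # Defaults live in one place, so a new field only needs a default here
    # rather than an edit to every existing test.
    region: str = "US"
    credit_limit: float = 1000.0
    active: bool = True


# Hypothetical per-column converters from table text to field values.
_CONVERTERS = {
    "name": str,
    "region": str,
    "credit_limit": float,
    "active": lambda text: text.lower() == "true",
}


def customers_from_table(table: str) -> List[Customer]:
    """Interpret a pipe-delimited table whose first row names the columns."""
    rows = [
        [cell.strip() for cell in line.strip().strip("|").split("|")]
        for line in table.strip().splitlines()
    ]
    header, *body = rows
    return [
        Customer(**{col: _CONVERTERS[col](cell) for col, cell in zip(header, row)})
        for row in body
    ]


# A test that only cares about credit limits states nothing else.
customers = customers_from_table("""
    | name   | credit_limit |
    | Acme   | 250          |
    | Globex | 5000         |
""")
```

Because the table names logical columns rather than storage columns, the mapping inside `customers_from_table` is the only thing that has to change if the persisted structure changes.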

Go in through the Front Door

Very closely related to my concerns about decoupling tests from the implementation details is my advice to avoid “backdoor” ways of setting up test scenarios. My opinion is that you should set up test scenarios by using the real services your application itself uses to persist data. While this does risk making the tests run slower by going through extra runtime hoops, I think it has a few advantages:

It’s easier to keep your test automation code synchronized with your production code as you refactor or evolve the production code and data storage.

It should result in writing less code, period.

It reduces logical duplication between the testing code and the production code. Think database schema changes.

It keeps the test data valid. When you write raw data to the underlying storage mechanisms you can very easily get the application into an invalid state that doesn’t happen in production.

Case in point, I met with another shop a couple years ago that was struggling with their test automation efforts. They were writing a Silverlight client with a .NET backend, but using Ruby scripts with ActiveRecord to set up the initial data sets for automated tests. I know from all of my ex-.NET/now Ruby friends that everything in Ruby is perfect, but in this case it caused the team a lot of headaches: with all the duplication between their C# production code and the Ruby test automation code, the tests broke any time the database schema changed.
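Here’s a minimal Python sketch of the front-door idea, using an in-memory SQLite database from the standard library. The `AccountService` class, its table, and its validation rules are hypothetical stand-ins for whatever real service an application uses to persist data. The point is only that the test never writes an `INSERT` of its own.

```python
import sqlite3


class AccountService:
    """Hypothetical stand-in for the application's real persistence service."""

    def __init__(self, connection):
        self._db = connection
        self._db.execute(
            "CREATE TABLE IF NOT EXISTS accounts (email TEXT PRIMARY KEY, balance REAL)"
        )

    def open_account(self, email, opening_balance=0.0):
        # Rules the application enforces; a raw INSERT from a test would skip them.
        if "@" not in email:
            raise ValueError("invalid email")
        if opening_balance < 0:
            raise ValueError("negative opening balance")
        self._db.execute(
            "INSERT INTO accounts (email, balance) VALUES (?, ?)",
            (email.lower(), opening_balance),
        )

    def balance_of(self, email):
        row = self._db.execute(
            "SELECT balance FROM accounts WHERE email = ?", (email.lower(),)
        ).fetchone()
        return None if row is None else row[0]


def test_balance_lookup_ignores_email_case():
    service = AccountService(sqlite3.connect(":memory:"))
    # Front door: the scenario is set up through the same call production uses,
    # so the stored row is normalized exactly as it would be in real use.
    service.open_account("Pat@Example.com", opening_balance=25.0)
    assert service.balance_of("pat@example.com") == 25.0
```

If the table or the normalization rule changes, only `AccountService` has to change. A test that had inserted `"Pat@Example.com"` directly would have stored a row the application could never have produced.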

Topics for later…

It’s Saturday and my son and I need to go to the park while the weather’s nice, so I’m cutting this short. In a later post I’ll try to get more concrete with examples and maybe an additional post that applies all this theory to the project I’m doing at work.