Testing the Tests, or How to Avoid Recursion

To gain confidence in production code, tests at varying levels
accompany its development. Unittests and integration tests are common,
and sometimes suites of functional/acceptance tests are present as
well. There's been quite a bit of discussion about how best to write
unittests (especially if you're using some xUnit framework variant),
and sometimes these best practices even make their way into
integration tests. Unfortunately, I have yet to see them make their
way into functional tests. In particular, I frequently see functional
tests that are very complicated, obscure, or both.

Before going on, it's worth spelling out why I think complicated and
obscure tests are bad. Production code is typically complicated: the
domain is filled with special cases and unintuitive behavior. To help
keep that code honest, we write unittests that make sure the
production code works as we expect (in addition to driving the design,
if you use TDD). There's a secret fundamental to writing unittests,
yet it's rarely called out explicitly:

We don't test our tests!

Think about that for a moment. We have complicated production code. We
write tests to ensure that the production code works as expected. What
tests the tests? If your test logic is really complicated, are you
actually better off for having written it? Testcases are only
effective if we can assume they are correct, in which case they assert
correct things about the system under test. If testcases are
incorrect, that is, they assert incorrect things about the system,
then the validity of the system under test is unknown, and you're
arguably worse off than before (is a failing test really a failure?).
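
To make this concrete, here's a minimal sketch of a test whose own
logic is broken, so it passes no matter what the production code does.
All names here (apply_discount, the customer types) are hypothetical,
invented purely for illustration:

    import unittest

    def apply_discount(price, customer_type):
        # Hypothetical production code with a bug: "gold" customers
        # should get 20% off, but no discount is ever applied.
        return price

    class TestDiscounts(unittest.TestCase):
        def test_gold_discount(self):
            # Broken test logic: "gold" is missing from the loop, so
            # the assertion never runs and the test passes vacuously.
            for customer_type in ["silver", "bronze"]:
                if customer_type == "gold":
                    self.assertEqual(apply_discount(100, customer_type), 80)

    if __name__ == "__main__":
        unittest.main()

This suite is green, the production code is wrong, and nothing will
ever tell you so.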

So what's the solution? Test the tests, of course! And what if those
tests are still complicated? Well, we'll test those as well! In fact,
it's tests all the way down. To break this recursion, we have to set a
practical limitation on testcases: they must be simple enough not to
require tests of their own. In practice, this typically means two
things (illustrated in the sketch after this list):

No conditionals

Small in length
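
Rewriting the earlier sketch under those two constraints might look
like this (again, apply_discount and the test names are hypothetical):

    import unittest

    def apply_discount(price, customer_type):
        # Hypothetical production code, now correct: "gold" gets 20% off.
        if customer_type == "gold":
            return price * 0.8
        return price

    class TestDiscounts(unittest.TestCase):
        # Straight-line tests: no loops, no branches, one behavior each.
        # Nothing here is complicated enough to need a test of its own,
        # and a failure's name alone tells you what broke.
        def test_gold_customers_get_twenty_percent_off(self):
            self.assertEqual(apply_discount(100, "gold"), 80)

        def test_silver_customers_pay_full_price(self):
            self.assertEqual(apply_discount(100, "silver"), 100)

    if __name__ == "__main__":
        unittest.main()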

Now, how to avoid these two things (or how to replace them with better
alternatives) is a topic of its own. What I'm interested in for this
post is why the trend of simple, small tests still has not made its
way into the realm of functional testing.

If you look at any (reasonable) code base's unittests, they're
typically not so bad. They're short, and you can read one and more or
less understand what it's testing. If the code base contains
functional tests (and by functional tests, I basically mean any test
that simulates interacting with the system in a similar way to how a
user would), you'll typically see these characteristics instead
(sketched below):

Long

Complicated logic

Hard to understand

Fragile (failures sometimes aren't really failures)

Terrible defect localization
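
To make those characteristics concrete, here's a sketch of the shape
such tests often take. Everything in it is hypothetical: FakeBrowser
is a made-up stand-in for a real UI driver, and the pages and buttons
are invented for illustration.

    import unittest

    class FakeBrowser:
        # Hypothetical stand-in for a real UI driver, just enough to
        # make this sketch self-contained and runnable.
        def __init__(self):
            self.page = "login"

        def fill(self, field, value):
            pass

        def click(self, button):
            if button == "Log in":
                self.page = "dashboard"
            elif button == "Checkout":
                self.page = "confirmation"

        def text(self):
            return self.page

    class TestCheckout(unittest.TestCase):
        def test_user_can_check_out(self):
            # One test walks an entire user journey, so it's long; it
            # branches on page state, so its logic is complicated; it
            # retries to paper over flakiness, so failures aren't
            # always real; and when it does fail, any step could be
            # the culprit (terrible defect localization).
            browser = FakeBrowser()
            browser.fill("username", "alice")
            browser.fill("password", "secret")
            browser.click("Log in")
            if "dashboard" not in browser.text():
                self.fail("login failed... or was the page just slow?")
            for _ in range(3):
                browser.click("Checkout")
                if "confirmation" in browser.text():
                    break
            else:
                self.fail("checkout never confirmed")
            self.assertIn("confirmation", browser.text())

    if __name__ == "__main__":
        unittest.main()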

Why is this? Is it because functional testing is still progressing
towards what unittests currently are? Is it because of a lack of
consensus that functional testing should have the same characteristics
of unittests? Is it because writing small tests with no conditionals is
hard to do at such a high level?

My take is that writing good functional tests is much harder than
writing unittests, and because of this, best practices are typically
ignored. Fortunately, I think this situation can be remedied, and in
the next post I'll show several things you can do to achieve more
succinct tests.