Wednesday, 15 June 2016

Mocking Framework Expectations

I’ve never really got on with mocking frameworks as I’ve generally preferred to write my own mocks when I need them. This may be because of my C++ background, where mocking frameworks are a much newer invention as they’re harder to write due to the lack of reflection.

I suspect that much of my distaste comes from ploughing through reams of test code that massively over-specifies exactly how a piece of code works. This is something I’ve covered in the past in “Mock To Test the Outcome, Not the Implementation”, but my recent experiences unearthed another reason to dislike them.

It’s always a tricky question when deciding at what level to write tests. Taking an outside-in approach and writing small, focused services means that high-level acceptance tests can cover most of the functionality. I might still use lower-level tests to cover certain cases that are hard to test from the outside, but as tooling and design practices improve I’ve felt less need to unit test everything at the lowest level.

The trade-off here is the ability to triangulate when a test fails. Ideally, when one or more tests fail you should be able to home in on the problem quickly. It’s likely to be the last thing you changed, but sometimes the failure lies slightly further back and you’ve only just unearthed it with your latest change. Or, if you’re writing your first test for a new piece of code and it doesn’t go green, then perhaps something in the test code itself is wrong.

This latter case is what recently hit the pair of us when working on a simple web API. We had been fleshing out the API by mocking the back-end and some other dependent services with NSubstitute. One test failure in particular eluded our initial reasoning and so we fired up the debugger to see what was going on.

The chain of events that led to the test failure was ultimately the result of a missing method implementation on a mock. The mocking framework, instead of doing something sensible like throwing an exception such as NotImplementedException, chose to return the equivalent of default(T), which for a reference type is null.

While a spurious null reference would normally result in a NullReferenceException soon after, the value was passed straight into JArray.Parse(), which helpfully transformed the bad value into a good one – an empty JSON array. This was a valid scenario (an empty set of things) and so a valid code path was then taken all the way back out again to the caller.
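The failure chain can be sketched outside .NET too. Here is a minimal, hedged analogue in Python: unittest.mock stands in for NSubstitute, json for JArray.Parse, and all the names (BackEnd, fetch_items_json, load_items) are invented for illustration – the leniency that swallows the null is made explicit in the helper, mirroring how the real parser turned the bad value into a good one:

```python
import json
from unittest import mock

class BackEnd:
    """Hypothetical dependency of the web API under test."""
    def fetch_items_json(self):
        """Returns a JSON array of items as a string."""
        raise NotImplementedError

def load_items(backend):
    payload = backend.fetch_items_json()
    # Lenient handling, analogous to the parser in the article's tale:
    # a missing payload is silently coerced into a valid empty array.
    return json.loads(payload) if payload is not None else []

# The mock was never set up to return a real payload; here we make the
# framework hand back None, the equivalent of default(T) for a .NET
# reference type, instead of failing loudly.
backend = mock.Mock(spec=BackEnd)
backend.fetch_items_json.return_value = None

items = load_items(backend)
assert items == []   # a "valid" empty result masks the missing set-up
```

The bug never surfaces where it was made: the default value sails through the lenient parse and the test fails (or passes) somewhere far downstream.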

Once the missing method on the mock was added the code started behaving as expected.

This behaviour of the mock feels completely wrong to me. And it’s not just NSubstitute that does this, I’ve seen it at least once before with one of the older C# mocking frameworks. In my mind any dependency in the code must be explicitly catered for in the test. The framework cannot know what a “safe” default value is and therefore should not attempt to hide the missing dependency.

I suspect that it’s seen as a “convenience” to help people write tests for code which has many dependencies. If so, then that to me is a design smell. If you have a dependency that needs to be satisfied but plays no significant role in the scenario under test, e.g. validating a security token, then create a NOP-style mock that can be reused, e.g. an AlwaysValidIdentityProvider, and configure it in the test fixture SetUp so that it’s visible, but not dominant.
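As a sketch of that idea, here is a Python analogue (the article’s example is .NET, so the interface and the handle_request function are hypothetical): the NOP-style stub is a real, named class, and it is wired up in the fixture’s setUp so the dependency is declared once, visibly, rather than being silently defaulted by a framework:

```python
import unittest

class AlwaysValidIdentityProvider:
    """Reusable NOP-style stub: satisfies the security dependency
    without playing any significant role in the scenario under test."""
    def validate(self, token):   # hypothetical interface
        return True

def handle_request(identity_provider, token, items):
    """Hypothetical code under test: checks the token, then does work."""
    if not identity_provider.validate(token):
        raise PermissionError("invalid token")
    return sorted(items)   # the behaviour we actually care about

class WhenSortingItems(unittest.TestCase):
    def setUp(self):
        # Configured here so the dependency is visible, but not dominant.
        self.identity = AlwaysValidIdentityProvider()

    def test_items_are_sorted(self):
        result = handle_request(self.identity, "any-token", [3, 1, 2])
        self.assertEqual(result, [1, 2, 3])
```

The point is that every dependency the code needs is explicitly catered for in the test, even the ones that only need to get out of the way.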

A tangential lesson here is one that Sir Tony Hoare is personally burdened with: the blatant overloading of a null string reference as a synonym for “no value”. One can easily see how this happens given the way String.IsNullOrEmpty() is used with scant regard for what a null reference should symbolise. My tale of woe in “Extension Methods Should Behave Like Real Methods” is another illustration of this malaise.
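The same conflation exists in most languages. A tiny, hedged Python sketch (display_name is an invented example, and `if not name:` plays the role of String.IsNullOrEmpty()) shows how the distinction between “no value supplied” and “value supplied, but empty” gets erased:

```python
def display_name(name):
    # "if not name" is true for both None and "", so the code can no
    # longer tell "no value was supplied" apart from "an empty value
    # was supplied" - the two cases collapse into one.
    if not name:
        return "(anonymous)"
    return name

assert display_name(None) == "(anonymous)"
assert display_name("") == "(anonymous)"
assert display_name("Bob") == "Bob"
```

Whether those two cases should be treated identically is a domain decision, not something a convenience check should quietly decide for you.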

I did wonder if we’d shot ourselves in the foot by trying to test at too high a level, but I don’t believe so, because you wouldn’t normally write unit tests for 3rd party components like a JSON parser. I might write characterisation tests for a 3rd party dependency such as a database API if I wanted to be sure that we really knew how it would handle certain types of data conversion or failure, or if we relied on some behaviour we suspected was not clearly defined and could therefore silently change in a future upgrade. But it didn’t feel as though there should have been anything surprising in the code we had written, and our dependencies were all very mature components.

Of course, perhaps this behaviour is entirely intentional and I just don’t fit the use case, in which case I’d be interested to know what it is. If it’s configurable, then I believe it’s a poor choice of default behaviour and I’ll add it to those already noted in “Sensible Defaults”.