Musings on software construction

Angular is one of the few front-end frameworks I’ve had the pleasure
to work with that emphasizes testing as a first-class concern. The
dependency injection system encourages a separation of concerns that
makes testing of all components easier, and the framework itself comes
with a number of helpful mocking and test-support pieces. Things
aren’t quite perfect – the DI system is somewhat intrusive, forcing
itself into unit tests where it might not be especially welcome – but
it’s a whole lot better than hacking your way through a pile of jQuery
code.

So what’s the problem?

The problem doesn’t lie with Angular, nor with the way it approaches
TDD. Rather, it lies with the following two observations:

Most front-end developers are not particularly familiar with unit
testing, and especially not TDD, because of the lack of historical
support for it in existing frameworks

TDD is hard. Like any other new approach to software development
it requires practice, perseverance, and a lot of focused
concentration, especially at first.

So, what happens when you start using Angular? Well, you’re obviously
introducing a major new topic: Angular itself. Angular’s approach is
itself substantially different to anything most of your devs are
likely to have seen before, so there’s going to be a lot of time, effort,
and brainpower expended on picking that up.

If, like me, you’re eager to introduce TDD to as many people as
you can, you’re going to be introducing a second major topic:
testing and TDD. This is the problem.

The net result

Attempting to introduce both TDD and Angular at the same time is very
appealing: if we’re building something new with Angular, we want to
start out with something that is well tested, well designed (which our
TDD will drive), and can be reliably worked on by all of our
colleagues. The risk, though, is that the two concepts get confused,
mixed together, and neither of them learned properly: the pace of acquiring
Angular knowledge is reduced, and the quality of test and design is
diminished.

John Lindquist put it well at Fluent Conference 2014 when he said that
when writing Angular code we should strive for testable code from
people who are learning the framework, and accept that tested code
will be something that may come later. We don’t want to give up on
the idea of using TDD to build our Angular codebase, but we equally
don’t want to get so hung up on it that we put everyone off both
approaches.

What do we do about it?

I don’t yet have a full answer to this. I’ve found it depends heavily
on the experience of each engineer, both in prior exposure to TDD and
comfort with the concepts in Angular. Somewhat counter-intuitively,
I’ve found those coming from a server-side background – as opposed to
existing front-end developers – have had an easier time, likely
because TDD and IoC/DI are better established there. I now aim
to teach people Angular first, and introduce testing and TDD as a
second round of important concepts.

This doesn’t mean that when first demonstrating Angular I don’t push
the testing and testability message: I still show how tests can be
written, how separating concerns helps with testability and the
evolution of design, and how that design can be evolved through the
application of TDD. What I don’t do is require that everyone practice
that approach from the start.  Rather, I encourage people to learn the
framework in their own way, and gradually push for more and earlier
testing as they “level-up” in Angular.

Lately I’ve been spending a lot of my time introducing people to TDD.
One of the best mechanisms I’ve found for this is a Coding Dojo,
documented best in Emily Bache’s excellent Coding Dojo Handbook.
In the Coding Dojo, we spend an hour or so with a group of engineers
writing test-driven code to solve very specific, tractable problems
(katas, in the lingo) that are independent of their day-to-day work.

As a result, I’ve been getting to vicariously relive the process of
learning TDD for the first time. One thing that has really stood out
from the sessions is how hard it is to know not when we should test
something, but when we shouldn’t.

There’s a lot to say on this subject, but I’ll start out with a very
simple observation: traditional test driven development is an
example-based approach to verifying system behavior.[1]  In logical
terms, we’re writing there-exists checks, not for-all checks.
Making this simple distinction clear has proven a critical piece in
getting over the hump of “I don’t know when to stop writing tests”.

The clearest example of this I’ve seen is when TDDing the FizzBuzz
problem. Quickly stated, FizzBuzz is a child’s game wherein one has
to count from 1 to 100, replacing any number divisible by 3 with
“Fizz”, any number divisible by 5 with “Buzz”, and any number
divisible by both 3 and 5 with “FizzBuzz”. In at least 50% of the
tests I see written for this problem, I’ll find some variant of code
that looks like the following:
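Sketched in Java (the names FizzBuzz.say and the test method are placeholder stand-ins of mine, not anyone’s actual code):

```java
// A sketch of the looping-style test, with placeholder names.
class FizzBuzzLoopTest {

    // Loops over the whole domain, duplicating the divisibility check
    // from production code and executing the assertion conditionally.
    static void numbersDivisibleByThreePrintFizz() {
        for (int i = 1; i <= 100; i++) {
            if (i % 3 == 0) {
                if (!"Fizz".equals(FizzBuzz.say(i))) {
                    throw new AssertionError("Expected Fizz for " + i);
                }
            }
        }
    }

    public static void main(String[] args) {
        numbersDivisibleByThreePrintFizz();
        System.out.println("green");
    }
}

// Production code at this stage: only the Fizz rule exists.
class FizzBuzz {
    static String say(int i) {
        if (i % 3 == 0) {
            return "Fizz";
        }
        return String.valueOf(i);
    }
}
```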

There are a few things that are problematic with this approach that we
can quickly point out (looping in a test, duplication of conditional
checks between test and production code, conditional execution of
assert, multiple assertions), but the real problem is both more
fundamental and simpler: the test is attempting to establish a for-all
condition over the currently-specified domain of the problem.

So, why don’t we want to do this? Let’s think about what we want out
of our tests:

They must be fast

They should be decoupled/cohesive: a change in requirements should
affect as small a number of tests as possible

They should provide error locality: a test failure should be easy
to pin down to a specific fault in production code

What happens with the looping approach is that we focus on the fact
that we can achieve point 1, and forget about the importance of 2
and 3. We know our domain is 100 numbers, we know we can run that
loop thousands of times a second, so we go ahead and throw it in
there.

When we do that, though, we’re left in a bit of a quandary. To test
every value, we need to know what each value should be, so we slip in
a little conditional check to see if, for a given i, we should be
printing out the word “FIZZ”. It’s a simple check, so we think
nothing of it. We add the assert, run the test, it passes, and we
move on to the next test:
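In the same Java sketch (again, placeholder names), the next test looks like this, shown alongside the production code as it currently stands:

```java
// The next test, plus production code as it stands: the test is red.
class FizzBuzzNextTest {

    static void numbersDivisibleByThreeAndFivePrintFizzBuzz() {
        for (int i = 1; i <= 100; i++) {
            if (i % 3 == 0 && i % 5 == 0) {
                if (!"FizzBuzz".equals(FizzBuzz.say(i))) {
                    throw new AssertionError("Expected FizzBuzz for " + i
                            + " but got " + FizzBuzz.say(i));
                }
            }
        }
    }

    public static void main(String[] args) {
        // Caught here only so the sketch runs standalone; a test runner
        // would simply report the failure.
        try {
            numbersDivisibleByThreeAndFivePrintFizzBuzz();
            System.out.println("green");
        } catch (AssertionError e) {
            System.out.println("red: " + e.getMessage());
        }
    }
}

// Production code so far: only the Fizz rule.
class FizzBuzz {
    static String say(int i) {
        if (i % 3 == 0) {
            return "Fizz";
        }
        return String.valueOf(i);
    }
}
```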

It’s red, so we change our code, run our tests, and boom! Our earlier
numbersDivisibleByThreePrintFizz test blows up. What happened? Of
course, we didn’t account for the 3 and 5 case in our earlier test,
so we hop back over and change it:
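Continuing the Java sketch, the patched test (and the production code after the change) might look like:

```java
class FizzBuzzTests {

    // The earlier test, patched to skip numbers also divisible by 5.
    static void numbersDivisibleByThreePrintFizz() {
        for (int i = 1; i <= 100; i++) {
            if (i % 3 == 0 && i % 5 != 0) {
                if (!"Fizz".equals(FizzBuzz.say(i))) {
                    throw new AssertionError("Expected Fizz for " + i);
                }
            }
        }
    }

    static void numbersDivisibleByThreeAndFivePrintFizzBuzz() {
        for (int i = 1; i <= 100; i++) {
            if (i % 3 == 0 && i % 5 == 0) {
                if (!"FizzBuzz".equals(FizzBuzz.say(i))) {
                    throw new AssertionError("Expected FizzBuzz for " + i);
                }
            }
        }
    }

    public static void main(String[] args) {
        numbersDivisibleByThreePrintFizz();
        numbersDivisibleByThreeAndFivePrintFizzBuzz();
        System.out.println("green");
    }
}

// Production code after the change.
class FizzBuzz {
    static String say(int i) {
        if (i % 3 == 0 && i % 5 == 0) {
            return "FizzBuzz";
        }
        if (i % 3 == 0) {
            return "Fizz";
        }
        return String.valueOf(i);
    }
}
```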

And we’re back to green. But wait a minute…now our two tests are
almost identical: they both have a loop, condition on variations of
the same properties, and assert a very similar result. What’s going
to happen if our boss comes in with a new requirement and says that
the children now are going to have to say “WOOF” every time a number
is divisible by 7? Every test is going to have to be changed! We’re
also starting to notice that there’s lots of duplication between our
test and our production code, which is also doing evenly-divisible-by
checks: yet more places to change!

This is clearly not what TDD promised us. This is, in fact, the exact
reason I hear from many people about why they don’t do TDD: if I
have all of these tests and try to change my code, I now have twice
the work to do because I have to fix the tests too! Yuck!

So, what’s the solution? This is where we go back to the argument
that we’re testing by example and not against all possible
inputs. We’re attempting to establish properties of the system for
specific inputs that we, as developers, feel are valuable in assessing
whether the behavior of the system under other inputs will be
predictable and as expected.

For this to work well, we have to recognize that TDD is inherently
white-box testing: we’re feeding knowledge from the tests into the
code, and knowledge from the code into the tests. The two are in a
symbiotic relationship, and we can use our understanding of what
corners we are and will be exposing in the code to drive what tests we
write next.

For example, rather than testing that all numbers divisible by 3 are
going to print FIZZ, we can just test a couple of points with
different characteristics that we think are interesting:
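For instance, in Java, with the same placeholder names as before:

```java
class FizzBuzzExampleTest {

    static void check(String expected, String actual) {
        if (!expected.equals(actual)) {
            throw new AssertionError("Expected " + expected + " but got " + actual);
        }
    }

    public static void main(String[] args) {
        // Two representative points, rather than the whole domain.
        check("Fizz", FizzBuzz.say(6));  // a small multiple of 3, not of 5
        check("Fizz", FizzBuzz.say(93)); // a large multiple of 3, not of 5
        System.out.println("green");
    }
}

// One plausible implementation of the full set of rules.
class FizzBuzz {
    static String say(int i) {
        if (i % 3 == 0 && i % 5 == 0) {
            return "FizzBuzz";
        }
        if (i % 3 == 0) {
            return "Fizz";
        }
        if (i % 5 == 0) {
            return "Buzz";
        }
        return String.valueOf(i);
    }
}
```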

Although we’ve not checked any of the numbers between 6 and 93, we’re
pretty confident that our ability to implement an evenlyDivisibleBy3
behavior is good enough that if we’re doing the right thing for these
two cases, we’ll be doing the right thing for all the cases in
between. Put another way, our assessment of the risk of our code
being wrong for those unverified values is low enough that we don’t
feel it worthwhile to expend the effort to write additional tests for
them.

What it really comes down to is this: good programming requires good
judgment. TDD offers excellent feedback on whether your judgment is
taking you in a good direction or a bad direction, and whether your
assumptions hold or do not hold. What it does not do is let you stop
thinking (constantly) about what decisions you’re making and why.
You have to decide when you feel you’ve driven out enough examples
to be confident your code is correct. Knowing when you’re at that
point is a mixture of experience, educated guesswork, and diligence.

[1] For the purposes of this article I’m focusing on TDD as it is
commonly practiced, and not looking at techniques like property- or
theory-based testing, which attempt to establish for-all
properties over the SUT.

I recently had a lengthy conversation with some colleagues that proved
a great opportunity to talk about intentionality and the often subtle
nature of premature optimization and cargo-cult coding practices.

It started with a discussion about an optimization in the JVM to
eliminate the historical performance issues with inline String
concatenation. It brought up some interesting questions of design,
both historical and current.

If you remember back to the bad old days, you’ll recall that this
pattern of String use was strongly discouraged:

String result = "Hello, " + name + "!";

This code has the problem of allocating and immediately discarding
intermediate String instances: evaluating "Hello, " + name allocates
"Hello, <name>", which is thrown away as soon as the final
"Hello, <name>!" is built. As you add more concatenations, so you
allocate more intermediate Strings and performance gets progressively
worse (especially if the String grows large).

The solution to this performance problem is to use a mutable builder:
StringBuffer, back before 1.5, and StringBuilder ever since. Now
your code would be:
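Something like this (wrapped in a method here so the snippet stands alone):

```java
class Greeting {
    static String greet(String name) {
        StringBuilder builder = new StringBuilder();
        builder.append("Hello, ");
        builder.append(name);
        builder.append("!");
        return builder.toString();
    }

    public static void main(String[] args) {
        System.out.println(greet("world")); // Hello, world!
    }
}
```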

So, now your code is nice and performant, but at what cost? A trivial
one liner is now either excessively long or a rather unwieldy 5-line
monster. What was previously obvious is now hidden: the “interesting”
behavior of customizing a hello message to a specific person is buried
behind an object construction, a series of method calls, a conversion
back to String…those 4 characters naming the variable to interpolate
are surrounded by a morass of syntax.

The optimization in newer versions of the JVM makes a simple
observation: it’s a trivial, mechanical transform to take the
inefficient + form and convert it to the StringBuilder form.  Indeed,
if you look at the bytecode generated by a recent version of javac
you’ll see that this is exactly what it does. So, no need to use the
messy StringBuilder version: great!

So, how does this all play into the design of a system? Well, I would
argue that even before Java made this optimization it was almost
always a mistake to actually use the StringBuilder form for
immediate String construction.

It is a rare situation that immediate String construction is actually
a performance issue in a system. One or more of the following
properties have to hold for it to actually matter:

The String being constructed must be large enough that the
performance variance is measurable. Large probably means “well
in excess of 1,000 bytes”, depending on frequency of
construction.

The String being constructed must be constructed many times.
Many for a modern system means “multiple times per second”,
not a few thousand times over the lifetime of an app.

In my experience these properties hold far less often than we see
StringBuilder used: generating large Strings in tight
loops is something that does happen, but typically in well-constrained
parts of the system.

This doesn’t mean that StringBuilder isn’t a useful and commonly
applicable API. Instead, it means that you should use it when you
want to communicate a specific intent to the user. StringBuilder
implies mutability: its raison d’être is to allow for
progressive construction of a String. String implies immutability:
it is one of the most immutable structures possible in Java,
being both declared immutable at the spec level and made final to
prevent subclass mutation.

I will use a StringBuilder primarily to indicate that I intend to
do something imperative in nature. The typical example would be
looping through a list of some kind and appending entries to a
string, or pulling together bits of data from here and there to
build something bigger. By contrast, I will tend to use a String
and concatenation where I’m doing something more functional and
side-effect free: for example, when recursively constructing a
String, or building something simple from immediately available
values with no conditional juggling.  I will on very rare
occasions pass a StringBuilder for mutation, though in general I treat
that as a code smell to be refactored out later (the dangers of
out-parameters deserve a post all their own).
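To make the contrast concrete, here is a small sketch; the method names are invented for illustration:

```java
import java.util.List;

class StringIntent {

    // Imperative, progressive construction: StringBuilder signals that
    // we are mutating our way towards the final value.
    static String bulletList(List<String> items) {
        StringBuilder builder = new StringBuilder();
        for (String item : items) {
            builder.append("- ").append(item).append("\n");
        }
        return builder.toString();
    }

    // Functional, side-effect free: plain concatenation says "this is
    // just a value" and reads that way.
    static String greeting(String name) {
        return "Hello, " + name + "!";
    }

    public static void main(String[] args) {
        System.out.print(bulletList(List.of("red", "green")));
        System.out.println(greeting("Ada"));
    }
}
```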

So, my general rule is this: if you’re constructing an immediate,
immutable value, you should always have favored, and should continue
to favor, the + concatenation form.  If you’re starting construction of a
value that you intend to add to progressively,
use the StringBuilder. Your decision should not be based on
performance until you have a provable data point showing that
performance matters: write for your reader, not your compiler.
You’ll get better performance by having an easy to understand and
change design than you’ll ever get from micro-optimizing with a
cargo-culted pattern.