Ten years in software

Last updated December 15th, 2016

I have been working professionally as a Software Engineer for the past 10 years. In that time, I've learned a huge
amount, gained a bit of confidence, and largely ignored the social nature of our field. I haven't given back to the
community and now feel like it's a good time to change that. I've been very lucky in my career thus far and want to
share the broad lessons that I've learned along the way.

This is part five of a series of pieces reflecting on my career.

When writing software, you'll eventually need to verify that it does what you intended. Sometimes this is a manual
process, and other times you'll automate it with code. For better or worse, at every place I've worked so far, it was up
to the judgement of the developer whether to write automated tests or to manually verify behavior. At Sendio, we tended
to lean on the side of manual testing. At IMVU we wrote lots of tests (often in advance of implementation), and aimed to
eliminate slow or intermittent tests. At Etsy, we landed somewhere in between.

Regardless of the practice, a test is only valuable when it fails. A test's failure tells you something. The more
reliable information a test failure gives you, the more valuable that test is.

I believe there are exactly three kinds of tests, and combining qualities of each will only cause trouble:

Unit tests: Tests that verify the contract upheld by individual components or functions.

Fuzz tests: Tests that look for bugs in a function or component by generating unexpected input.

Performance tests: Tests that verify that a component meets its timing or resource constraints (more on those later).
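To make the first two kinds concrete, here's a minimal sketch (the `slugify` function and its tests are hypothetical, invented for illustration). A unit test pins down one guarantee of the contract; a fuzz test throws generated input at the component and checks invariants rather than exact values:

```python
import random
import string

def slugify(title):
    """Hypothetical component under test: turns a title into a URL slug."""
    return "-".join(
        "".join(c for c in word if c.isalnum())
        for word in title.lower().split()
    ).strip("-")

# Unit test: verifies one specific guarantee of slugify's contract.
def test_slugify_joins_words_with_hyphens():
    assert slugify("Hello World") == "hello-world"

# Fuzz test: generates unexpected input and checks invariants, not exact values.
def test_slugify_never_emits_spaces_or_uppercase():
    rng = random.Random(42)  # seeded so any failure is reproducible
    for _ in range(1000):
        title = "".join(rng.choice(string.printable)
                        for _ in range(rng.randint(0, 40)))
        slug = slugify(title)
        assert " " not in slug
        assert slug == slug.lower()
```

Note that the fuzz test is seeded: a fuzz test that fails on input you can't reproduce gives you very little information.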

Good tests are fast, reliable, informative, easy to read, and will help you predictably deliver better software. Bad
tests are slow, unpredictable, useless when they fail, hard to decipher, and will strike fear into your heart when you
have to work with them.

Here's what I've learned about writing and maintaining these kinds of tests.

Tests are like everything else in the world: coming up with a good name is difficult, but doing so will make things much
easier in the future. A test should describe the behavior that it is verifying; it should not describe the value being verified. To demonstrate, here are a bunch of good and bad names for tests. Read each and ask yourself whether the name helps give context:
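Here's one illustrative sketch (the `Cart` class and its tests are hypothetical) contrasting a name that describes a value with names that describe behavior:

```python
import unittest

class Cart:
    """Hypothetical shopping cart, used only to illustrate test naming."""
    def __init__(self):
        self.items = []
    def add(self, price):
        self.items.append(price)
    def total(self):
        return sum(self.items)

class CartTest(unittest.TestCase):
    # Bad: names the expected value, so the name goes stale when the
    # value changes, and a failure tells you nothing about what broke.
    def test_total_is_42(self):
        cart = Cart()
        cart.add(42)
        self.assertEqual(cart.total(), 42)

    # Good: names the behavior, so a failure immediately tells you
    # which guarantee was violated.
    def test_empty_cart_has_zero_total(self):
        self.assertEqual(Cart().total(), 0)

    def test_total_sums_the_prices_of_added_items(self):
        cart = Cart()
        cart.add(10)
        cart.add(32)
        self.assertEqual(cart.total(), 42)
```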

An interface provides the verbs that you can use to operate a component. An object's contract is the set of guarantees
that its public interface provides. Tests are meant to uphold guarantees. This means your test should only interact with
objects through their public interface.

If you're asserting behavior without going through a public interface (and without checking an object's observable side effects), you're testing how it does it, not what it does.

Regressions happen when what it does changes.

Nobody will notice if how it does it changes.

If you change all of the internal data structures abstracted away by an object, your tests should still pass. If you
think you really need to test the internal data structures, you probably should be measuring the performance
characteristics instead (more on that later).
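As a sketch of this distinction (the `UniqueVisitors` class is hypothetical), compare a test that exercises the public interface with one that reaches into internal state:

```python
class UniqueVisitors:
    """Hypothetical component: counts distinct visitor ids.
    The internal data structure is an implementation detail."""
    def __init__(self):
        self._seen = set()  # could later become a dict, a bloom filter, etc.
    def record(self, visitor_id):
        self._seen.add(visitor_id)
    def count(self):
        return len(self._seen)

# Good: exercises only the public interface, so it still passes if the
# internal set is swapped for another structure upholding the same contract.
def test_count_ignores_duplicate_visitors():
    v = UniqueVisitors()
    v.record("a")
    v.record("a")
    v.record("b")
    assert v.count() == 2

# Bad: reaches into the private _seen attribute, so it breaks the moment
# "how it does it" changes, even though "what it does" is unchanged.
def test_seen_is_a_set():
    assert isinstance(UniqueVisitors()._seen, set)
```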

If a test relies on an external value (the database, the network, the system time, or the filesystem), it's prone to failures due to preexisting conditions. To avoid this, use dependency injection.

Provide your code with interfaces which implement access to these external values. In this way, a test can use in-memory data to verify that a "database" contains the correct value, or that the current local time is on a leap second or a daylight saving time boundary, or that a file already exists with unexpected data.

In addition to making your tests more reliable, this technique also removes I/O, making your tests extremely fast.
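For example, here's a minimal sketch of injecting a clock (the `is_happy_hour` rule and both clock classes are hypothetical). Production code passes the real clock; tests inject a fixed one and can pin the "current" time to any boundary they need:

```python
import datetime

class SystemClock:
    """Real implementation: reads the actual system time."""
    def now(self):
        return datetime.datetime.now()

class FixedClock:
    """Test double: always reports the injected instant."""
    def __init__(self, instant):
        self._instant = instant
    def now(self):
        return self._instant

def is_happy_hour(clock):
    """Hypothetical business rule that depends on the current time."""
    return 17 <= clock.now().hour < 18

# The test controls time completely: no flakiness if it runs at 4:59pm.
def test_happy_hour_starts_at_five_pm():
    assert is_happy_hour(FixedClock(datetime.datetime(2016, 12, 15, 17, 0)))
    assert not is_happy_hour(FixedClock(datetime.datetime(2016, 12, 15, 16, 59)))
```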

Here, "mocks" and "stubs" refer to the practice of altering the execution of specific functions at test runtime.

If you need to mock out a method in order for a test to pass, you're testing how it does it, not what it does.
This distinction is really important: doing things wrong is not the same as doing things differently.

If a test tells you that your component is doing things differently, you'll resent your tests when you refactor your
codebase or perform perfectly valid optimizations. However, if your test tells you that your component is doing things
wrong, that's something which will save you and your customers from failure.

Relying on mocks or stubs for individual functions/methods will make a system more fragile over time. Avoid this as much
as possible.
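To illustrate the fragility (the `Greeter` class is hypothetical), compare a test that patches an internal method with one that asserts only on the public contract:

```python
from unittest import mock

class Greeter:
    def _format(self, name):
        return name.title()
    def greet(self, name):
        return f"Hello, {self._format(name)}!"

# Fragile: pins the internal _format method by name. A perfectly valid
# refactor that inlines or renames _format breaks this test, even though
# greet() still does exactly what it promises.
def test_greet_with_mock():
    g = Greeter()
    with mock.patch.object(Greeter, "_format", return_value="Ada"):
        assert g.greet("ada") == "Hello, Ada!"

# Robust: asserts only on the public contract of greet(), so it fails
# only when the component is doing things wrong, not merely differently.
def test_greet_capitalizes_the_name():
    assert Greeter().greet("ada") == "Hello, Ada!"
```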

Here, "fakes" refer to alternate implementations of components which uphold the interface's contract—an example being an
in-memory filesystem fake.

Let's say you have a flaky test which has an external dependency on user account data which is stored in the database.
One way to reduce the flakiness would be to remove its dependence on the database itself. To do this, you could extract
an interface which is an abstraction around access to that user account data and then provide two implementations of
that data: one which uses the real database and the other which uses an in-memory data structure. You can then use
dependency injection so that the test uses the in-memory implementation. This way, the test removes the external
dependency and can always control the user account data.

However, that's not good enough. Now your test depends on the in-memory implementation. If the database implementation
were to change in a breaking way (and not the in-memory implementation), the test would falsely pass, but production
would fail!

To protect against this, you can write a test suite which verifies the user account interface's contract. The exact same
test suite can then be run against the "fake" in-memory implementation and the "real" I/O-backed implementation.

This will ensure that changes to one will not impact the other and that you can use the faster and more reliable
in-memory implementation wherever you need to access user data.
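One common way to share the suite is a mixin of contract tests (sketched below with hypothetical names; the real database-backed variant is shown only as a comment, since it would need an actual connection):

```python
import unittest

class InMemoryUserAccounts:
    """Fake: upholds the same contract as the database-backed store."""
    def __init__(self):
        self._users = {}
    def save(self, user_id, email):
        self._users[user_id] = email
    def find(self, user_id):
        return self._users.get(user_id)

class UserAccountsContract:
    """The same assertions run against every implementation.
    Not a TestCase itself, so it isn't collected on its own."""
    def make_store(self):
        raise NotImplementedError

    def test_find_returns_saved_email(self):
        store = self.make_store()
        store.save("u1", "a@example.com")
        self.assertEqual(store.find("u1"), "a@example.com")

    def test_find_returns_none_for_unknown_user(self):
        self.assertIsNone(self.make_store().find("nope"))

class InMemoryUserAccountsTest(UserAccountsContract, unittest.TestCase):
    def make_store(self):
        return InMemoryUserAccounts()

# The real implementation gets the exact same suite, so a breaking change
# to either one fails the shared contract tests:
#
# class DatabaseUserAccountsTest(UserAccountsContract, unittest.TestCase):
#     def make_store(self):
#         return DatabaseUserAccounts(test_connection)
```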

Sometimes systems need to be fast. Enforcing this can be the same as enforcing the contract that your system
upholds.

Let's say you're building a 3D engine: there will be components which must perform their behavior within a fixed amount of wall clock time (on your minimum supported hardware). Rendering one frame of a 3D scene at 60 Hz will need to take less than one frame period (1/60 s ≈ 16.67 ms).

Write a test to ensure this performance constraint is met! Generate a big load of data which represents your worst case,
and measure the wall clock time of rendering that scene.

This isn't limited to realtime systems. Let's say you're refactoring a system to be more generalized, but don't want to
unintentionally introduce performance regressions. Take a known workload for that system, measure the time it takes to
execute it, and write an automated test to verify that the new system processes that same workload in no more wall clock time than the original. This can be extremely valuable, as unintentional regressions can be detected
early in development.
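The shape of such a test is simple (the `process` function, the workload, and the budget below are all hypothetical; in practice the budget comes from measuring the original system on your target hardware):

```python
import time

def process(workload):
    """Hypothetical system under test."""
    return sorted(workload)

def test_worst_case_workload_meets_budget():
    # A fixed, representative worst-case workload.
    workload = list(range(100_000, 0, -1))
    budget_seconds = 0.5  # derived from measuring the original system
    start = time.perf_counter()
    process(workload)
    elapsed = time.perf_counter() - start
    assert elapsed < budget_seconds, (
        f"took {elapsed:.3f}s, budget was {budget_seconds}s"
    )
```

One caveat: give the budget some headroom over the measured time, or this becomes exactly the kind of intermittent test you set out to eliminate.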

Do you want to learn more? Was something confusing? Was something insightful? In the NYC area
and want to grab a coffee? Feel free to drop me an email at
sufian@gmail.com or send a tweet my way
@sufianrhazi

Disclaimer: Unless stated otherwise, the above words are my own and do not represent the
opinions of any person or business but myself.