Sunday, February 12, 2017

Fifty Quick Ideas to Improve Your Tests is, well, a series of fifty quick ideas that you can apply to your automated test suites to increase their value or lower their creation and maintenance costs.

These ideas are pattern-like in that they are mostly self-contained and often independent of each other. They are distilled from real-world scenarios that the authors (David Evans, Tom Roden and Gojko Adzic) have encountered in their work.

This format helps readability a lot: ideas are organized into themes, so you can focus on the area you want to improve and quickly skip the ideas that do not make sense in your context, or that you find impractical or not worth the effort. I enjoyed the book this one is a sequel to, Fifty Quick Ideas to Improve Your User Stories, for the same reasons. Moreover, both were published on Leanpub, so you can adjust the price to the value you think you'll get out of them; and despite Leanpub's large collection of unfinished books, this one is 100% complete and ready to read without the hassle of having to update to a new version later (who really does that?).

Some selected quotes follow; highlighted phrases are mine.

Something one person considers critical might not even register on the scale of importance for someone from a different group.

Drawing parallels between the different levels of needs, we can create a pyramid of software quality levels: Does it work at all? What are the key features, key technical qualities? Does it work well? What are the key performance, security, scalability aspects? Is it usable? What are the key usability scenarios? Is it useful? What production metrics will show that it is used in real work? Is it successful?

In order to paint the big picture quickly, we often kick things off with a ten-minute session on identifying things that should always happen or that should never be allowed. This helps to set the stage for more interesting questions quickly, because absolute statements such as ‘should always’ and ‘should never’ urge people to come up with exceptions.

Finally, when an aspect of quality is quantified, teams can better evaluate the cost and difficulty of measuring. For example, we quantified a key usability scenario for MindMup as ‘Novice users will be able to create and share simple mind maps in under five minutes’. Once the definition was that clear, it turned out not to be so impossible or expensive to measure it.

Avoid checklists that are used to tick items off as people work (Gawande calls those Read-Do lists). Instead, aim to create lists that allow people to work, then pause and review to see if they missed anything (‘Do-Confirm’ in Gawande’s terminology).

A major problem causing overly complex examples is the misunderstanding that testing can somehow be completely replaced by a set of carefully chosen examples. For most situations we’ve seen, this is a false premise. Checking examples can be a good start, but there are still plenty of other types of tests that are useful to do. Don’t aim to fully replace testing with examples in user stories – aim to create a good shared understanding, and give people the context to do a good job.

Waiting for an event instead of waiting for a period of time is the preferred way of testing asynchronous systems
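
To make the quote above concrete, here is a minimal sketch in Python (my choice of language, not the book's). The background job and its names are hypothetical; the point is only that the test blocks on a threading.Event signalled by the job, and the timeout acts as an upper bound rather than a fixed delay.

```python
import threading


def start_background_job(done: threading.Event, results: list) -> threading.Thread:
    """Hypothetical asynchronous job: stores a result, then signals completion."""
    def work():
        results.append(42)  # stand-in for the real work
        done.set()          # tell any waiter that the result is ready

    thread = threading.Thread(target=work)
    thread.start()
    return thread


def test_job_publishes_result():
    done = threading.Event()
    results = []
    thread = start_background_job(done, results)

    # Wait for the event itself; 5 seconds is only the worst-case bound.
    assert done.wait(timeout=5), "job did not signal completion in time"
    assert results == [42]
    thread.join()


test_job_publishes_result()
```

The test returns as soon as the job signals, so a fast job gives a fast test, and a slow one fails with a clear message instead of a flaky assertion.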

The sequence is important: ‘Given’ comes before ‘When’, and ‘When’ comes before ‘Then’. Those clauses should not be mixed. All parameters should be specified with ‘Given’ clauses, the action under test should be specified with the ‘When’ clause, and all expected outcomes should be listed with ‘Then’ clauses. Each scenario should ideally have only one ‘When’ clause that clearly points to the purpose of the test.
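
The same structure carries over to plain unit tests. The sketch below is my own illustration, not an example from the book: Account is a hypothetical domain class, and the comments mark the three clauses, with a single action under test.

```python
from dataclasses import dataclass


@dataclass
class Account:
    """Hypothetical domain object, used only for illustration."""
    balance: int

    def withdraw(self, amount: int) -> None:
        if amount > self.balance:
            raise ValueError("insufficient funds")
        self.balance -= amount


def test_withdrawal_reduces_balance():
    # Given: every parameter of the scenario is set up first
    account = Account(balance=100)
    amount = 30

    # When: the single action under test
    account.withdraw(amount)

    # Then: all expected outcomes
    assert account.balance == 70


test_withdrawal_reduces_balance()
```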

Difficult testing is a symptom, not a problem. When it is difficult for a team to know if they have a complete picture during testing, then it will also be difficult for it to know if they have a complete picture during development, or during a discussion on requirements. It’s unfortunate that this complexity sometimes clearly shows for the first time during testing, but the cause of the problem is somewhere else.

Although it’s intuitive to think about writing documents from top to bottom, with tests it is actually better to start from the bottom. Write the outputs, the assertions and the checks first. Then try to explain how to get to those outputs. [...] Starting from the outputs makes it highly unlikely that a test will try to check many different things at once,

Technical testing normally requires the use of technical concepts, such as nested structures, recursive pointers and unique identifiers. Such things can be easily described in programming languages, but are not easy to put into the kind of form that non-technical testing tools require.

For each test, ask who needs to resolve a potential failure in the future. A failing test might signal a bug (test is right, implementation is wrong), or it might be an unforeseen impact (implementation is right, test is no longer right). If all the people who need to make the decision work with programming language tools, the test goes in the technical group. If it would not be a technical but a business domain decision, it goes into the business group.

Manual tests suffer from the problem of capacity. Compared to a machine, a person can do very little in the same amount of time. This is why manual tests tend to optimise human time [...] Since automated tests are designed for unattended execution, it’s critically important that failures can be investigated quickly. [...] To save time in execution, it’s common for a single manual test to check lots of different things or to address several risks.

Whenever a test needs to access external resources, in particular if they are created asynchronously or transferred across networks, ensure that the resources are hidden until they are fully complete.
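
One common way to get this behaviour for files is to write to a temporary name and rename into place once the content is complete. The sketch below is an assumption of mine about how that could look in Python, not the book's recipe; publish_atomically is a hypothetical helper, and it assumes consumers only look for the final file name and that the temporary file lives on the same filesystem as the target.

```python
import os
import tempfile


def publish_atomically(path: str, data: bytes) -> None:
    """Write data to a temporary file next to path, then rename it into place."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()
            os.fsync(tmp.fileno())  # make sure the bytes reach the disk
        # The final name appears only once the content is complete.
        os.replace(tmp_path, path)
    except BaseException:
        # Clean up the partial temporary file before re-raising.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
```

A test that polls for the final path then either sees nothing or sees the whole file, never a half-written one.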

Time-based waiting [sleep() instead of polling or waiting for an event in tests] is the equivalent of going out on the street and waiting for half an hour in the rain upon receiving a thirty-minute delivery estimate for pizza, only to discover that the pizza guy came along a different road and dropped it off 10 minutes ago.
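
When the system under test offers no event to wait on, polling with a deadline is the usual fallback. The helper below is a minimal sketch under that assumption; wait_until and its parameters are names I made up, not an API from the book or from any test framework.

```python
import threading
import time


def wait_until(condition, timeout=5.0, interval=0.05):
    """Poll condition() until it is truthy or the timeout elapses."""
    deadline = time.monotonic() + timeout
    while True:
        if condition():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)  # short pause between checks, not a fixed wait


# Usage: a flag flipped by a background timer after roughly 0.1 seconds
state = {"ready": False}
timer = threading.Timer(0.1, lambda: state.update(ready=True))
timer.start()
assert wait_until(lambda: state["ready"], timeout=2.0)
```

Unlike a bare sleep(), the check returns as soon as the condition holds, and the timeout only matters when something has actually gone wrong.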

Instead of just accepting that something is difficult to test and ignoring it, investigate whether you can measure it in the production environment.

Test coverage is a negative metric: it measures how bad something is, not how good it is.

There are just two moments when an automated test provides useful information: the first time it passes and when it subsequently fails.

It’s far better to optimise tests for reading than for writing. Spending half an hour more writing a test will save days of investigation later on. [...] In business-oriented tests, if you need to compromise either ease of maintenance or readability, keep readability.