How do you test your unit tests for false negatives (tests that pass but should have failed)?

For example, I am writing a mission-critical class in JavaScript. To make sure every little nook and cranny is covered, we have well over 300 unit tests for it. From time to time, bugs are introduced, but one or two relevant tests continue to pass, usually because of JavaScript's willingness to coerce various data types to booleans.
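
A minimal sketch of how that coercion bites (the findUser function and its bug are invented for illustration):

```javascript
function findUser(users, name) {
  // BUG: should return the matching user object, but returns its index.
  return users.findIndex(u => u.name === name);
}

const users = [{ name: "Ada" }, { name: "Grace" }];

// Loose truthiness check: 1 is truthy, so this PASSES despite the bug.
console.assert(findUser(users, "Grace"), "expected to find Grace");

// The same check on the first user returns 0, which is falsy -- the
// "identical" assertion now FAILS, exposing how coercion hid the bug.
console.assert(findUser(users, "Ada"), "expected to find Ada");
```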

Or, one or two non-central tests pass for a long time, only to be revealed as false negatives upon close inspection.

Are there well-worn patterns out there in the world used by great QA teams to hunt for false negatives?

Terminology note: in medicine, the purpose of a test is to detect a disease; if it detects the disease, the outcome is positive, otherwise it is negative. Similarly, the purpose of a test is to find bugs, so FAIL is a positive outcome and PASS is a negative outcome. A false positive would be a FAIL that does not actually correspond to a bug, and a false negative would be a PASS that fails to detect a bug. You seem to be using the terms in the opposite sense. See for example "Positive or negative" in en.wikipedia.org/wiki/Medical_test.
– user246 Mar 1 '15 at 22:24

In my question, I describe two scenarios. In Scenario A, the code under test is entirely broken, but a subset of unit tests continues to pass--false positives. In Scenario B, the code under test is partially broken, and the unit tests written for that particular part of the code pass--false positives. I recognize the wording is confusing, though. Feel free to edit as you wish.
– StudentsTea Mar 2 '15 at 11:13

I edited the question to use "positive" and "negative" in the conventional way.
– user246 Mar 2 '15 at 12:08

@Nope, user246 has the correct definition of false positive and false negative. In the case of test code, a "positive" means the code has found a bug - so a false positive is the test code saying something is wrong when there is no bug (that is, the test code is incorrect). A false negative is the test code reporting no bugs when there are bugs present (again, the test code is incorrect).
– Kate Paulk♦ Mar 3 '15 at 12:09

3 Answers

There are two closely related techniques for revealing tests that pass when they should fail (false negatives): Error Seeding and Mutation Testing.

Both techniques are based on deliberately introducing errors into the application's code, mainly in the places where they will have the most dramatic effect on the application.

For example, you can flip boolean values (from true to false) or change comparison operators (from == to >=) to see whether the tests covering that functionality fail as they should.
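
A minimal hand-rolled sketch of the idea (dedicated mutation-testing tools for JavaScript, such as Stryker, automate exactly this loop):

```javascript
// Original implementation:
function isAdult(age) {
  return age >= 18;
}

// Hand-applied mutant: >= changed to >.
function isAdultMutant(age) {
  return age > 18;
}

// A suite that only uses non-boundary values passes against BOTH
// versions -- the mutant "survives", revealing a weak test:
console.assert(isAdult(30) === true);
console.assert(isAdultMutant(30) === true);

// A boundary case kills the mutant:
console.assert(isAdult(18) === true);       // passes
console.assert(isAdultMutant(18) === true); // FAILS -> mutant detected
```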

In addition to the techniques Viktor mentioned, which introduce errors into the program code, you can also introduce errors into the test code. I do this all the time as I'm test-driving new code. Make the test wrong, run it, and observe the results. Then make the test right... which might differ from how it was before.
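
A small sketch of what that looks like in practice (sumCart and its test are illustrative):

```javascript
// Code under test:
function sumCart(items) {
  return items.reduce((sum, item) => sum + item.price, 0);
}

// The real assertion:
console.assert(sumCart([{ price: 2 }, { price: 3 }]) === 5, "total should be 5");

// Deliberately wrong version, run once to prove the test CAN fail:
console.assert(sumCart([{ price: 2 }, { price: 3 }]) === 999, "must FAIL");
// If the sabotaged assertion still appears to pass (an unawaited promise,
// a swallowed exception), the test was never really checking anything.
```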

Also a good technique, and one I actually use when developing automated test scripts to make sure I have a basic degree of confidence in the checks in my code.
– Viktor Malyi Mar 2 '15 at 21:30

How do you write automated tests that don't generate false positives or false negatives?

How do you check existing automated tests for false positives or false negatives?

The two have somewhat different answers: Dale's answer is a good one for writing tests in a way that protects against false positives and false negatives, while Viktor's is good for checking existing tests.

A few more thoughts:

With JavaScript, you can check data types and perform explicit conversions as part of your comparison checks, and prefer strict equality (===) over loose equality (==). It's awkward, but it tends to help. The same goes for any other weakly typed or untyped language. Using object-based comparisons rather than direct comparisons helps too (object.equals(otherObject) rather than object == otherObject).
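
A small sketch of the kind of defensive checks meant here (parseQuantity is illustrative):

```javascript
function parseQuantity(input) {
  return parseInt(input, 10);
}

const result = parseQuantity("3");

// Loose equality coerces types: "3" == 3 is true, so this would pass
// even if parseQuantity returned the raw string by mistake:
console.assert(result == 3);

// Checking the type and using strict equality pins down both:
console.assert(typeof result === "number");
console.assert(result === 3);
```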

When building the test, run it against failing and passing scenarios, as many as you can think of. You won't get them all, but this will help to make sure you don't accidentally pass a failing scenario.
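
For instance (a minimal sketch; the validator and its cases are made up):

```javascript
// Hypothetical validator under test:
function isValidEmail(s) {
  return /^[^@\s]+@[^@\s]+\.[^@\s]+$/.test(s);
}

// Exercise inputs that SHOULD pass and inputs that SHOULD fail,
// asserting the expected outcome for each:
const cases = [
  { input: "a@b.com",      expected: true  },
  { input: "not-an-email", expected: false },
  { input: "",             expected: false },
];
for (const { input, expected } of cases) {
  console.assert(isValidEmail(input) === expected, `case failed: "${input}"`);
}
```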

Keep each test as granular as possible. Depending on your tool and your language, this can be challenging - there is always a tension between granular, independent tests and DRY code.

Where the code you are testing has dependencies, mock/fake/stub/shim them (including with good and bad data).
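
A minimal hand-rolled stub, assuming the dependency is injected as a parameter (no particular mocking library is implied):

```javascript
// Code under test takes its dependency as a parameter:
function chargeCustomer(gateway, amount) {
  if (amount <= 0) throw new Error("invalid amount");
  return gateway.charge(amount);
}

// Stubs covering both good and bad behaviour:
const happyGateway   = { charge: (amount) => ({ ok: true, amount }) };
const failingGateway = { charge: () => { throw new Error("card declined"); } };

console.assert(chargeCustomer(happyGateway, 10).ok === true);

let threw = false;
try { chargeCustomer(failingGateway, 10); } catch (e) { threw = true; }
console.assert(threw, "a declined card should surface as an error");
```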

Include boundary and intersection data - I've found that most hidden bugs tend to lurk around boundaries or around the intersection of multiple objects (which is where having a really good helper library that fakes out all the other objects with a wide variety of good and bad data is helpful).
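
For example, probing a made-up function with two boundaries from both sides:

```javascript
// Illustrative function with boundaries at 5 kg and 20 kg:
function shippingTier(weightKg) {
  if (weightKg <= 0) throw new Error("invalid weight");
  if (weightKg < 5) return "small";
  if (weightKg < 20) return "medium";
  return "large";
}

// Probe each boundary from both sides, not just typical values:
console.assert(shippingTier(4.9)  === "small");
console.assert(shippingTier(5)    === "medium"); // exact boundary
console.assert(shippingTier(19.9) === "medium");
console.assert(shippingTier(20)   === "large");  // exact boundary
```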

Data-driven tests help as well, preferably with the test data stored somewhere other than the test code. Having one test routine that cycles through every row in a CSV file (or some similar input) makes it much easier to add cases that could expose false positives or false negatives than if the data has to be hard-coded into the tests.
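
A sketch of that approach under Node.js (the cases.csv file and its format are assumptions):

```javascript
const fs = require("fs");

// cases.csv is a hypothetical file with a header row, e.g.:
//   input,expected
//   3,3
//   007,7
//   abc,NaN
const rows = fs.readFileSync("cases.csv", "utf8").trim().split(/\r?\n/).slice(1);

for (const row of rows) {
  const [input, expected] = row.split(",");
  // Compare as strings so the NaN row works too (String(NaN) === "NaN"):
  const actual = String(parseInt(input, 10));
  console.assert(actual === expected, `row "${row}": got ${actual}`);
}
```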

Multiple levels of testing help, too. Unit tests are fantastic, but having feature-level and integration-level automated tests as well as end-to-end automation helps to catch false positives and false negatives, because each type of automated test performs a different kind of checking.