While writing automated tests I come across situations where I include multiple verification points in one method, but I have also read/heard that one test should make just one verification. Which approach do you follow, and why?

N.B. I have my own reasons for making multiple verifications in one method and will detail them once I get a satisfactory answer.
Looks like I am posing one more subjective/argumentative question :-/

10 Answers

I incorporate multiple verifications into one test, but only permit them at one point in the execution of the test. The test harness recovers gracefully from a failed verification (verifications should never change state IMO; if gathering data changes state, gather the data as part of the test or make that verification another test) and continues on to perform additional verifications.
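A rough sketch of what I mean, in Python (the posting helper and its result are invented purely for the example): perform the action first, then push every check through a small collector that records failures instead of stopping at the first one.

```python
class VerificationPoint:
    """Collects verification failures instead of stopping at the first one."""

    def __init__(self):
        self.failures = []

    def verify(self, condition, message):
        # Record the failure and keep going; checking never changes application state.
        if not condition:
            self.failures.append(message)

    def report(self):
        # Fail the test once, listing every verification that did not pass.
        if self.failures:
            raise AssertionError("; ".join(self.failures))


def post_comment(text):
    # Stand-in for the real action under test, just enough to run the example.
    return {"visible": True, "count": 1, "error": None, "text": text}


def test_post_comment():
    result = post_comment("hello")       # all the actions happen first

    checks = VerificationPoint()         # then one point with several verifications
    checks.verify(result["visible"], "comment not visible after posting")
    checks.verify(result["count"] == 1, "comment count did not increase")
    checks.verify(result["error"] is None, "unexpected error reported")
    checks.report()
```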

Why have multiple verifications?

If the bug causing the failure is repeatable and consistent, it will just repeat again the next time and cause the same verifications to fail as before. Why waste the setup time if the results are going to be the same each time?

If the bug causing the failure is NOT repeatable and consistent, I want to gather as much information on that specific failure as possible - which is best done with multiple verifications.

Why have them only at one point in the test?

A failure in setup is different than a failure in the actual test, and my test harness handles them differently. I can easily report them differently as well, and that information can be useful for business decisions.

If test B is dependent on test A and test A fails, I can't workaround the bug for test A to fix test B because that would change test A. If A and B are separate tests, then I can rewrite the setup for test B to workaround the bug in test A without impacting test A. For example, if A is posting a comment and B is viewing a posted comment from another account, and posting a comment fails because the JavaScript for the comment "submit" button is broken, I can still test B by updating the database directly in setup and then verifying that if the comment had been submitted properly, another account would be able to view it.

Edited to add: I just realized that the infamous "Given-When-Then" format follows the same rules. You have one test action (the "When" section), and after the "When" is completed, you can verify multiple "Then" statements using the "And" keyword. So that's one methodology's answer to this question.

So this is what I do (and this is something I did not want to reveal while posting the question). When it comes to UI verification of a page, I club (oops, I mean "include" :-)) them in one method, and my test run would report failures for one or more missing elements. But for more "logical" tests I always have a different method. A definite +1
– Tarun, Jun 15 '11 at 16:45

I personally like to have atomic tests that are designed, generally, to test only one or two things, with most verifications called explicitly in the test case itself. That way you can easily see at the top level what is being verified, for example (from parkcalc).

Another drawback to combining too many things in one test is that a failure could prevent other verifications from happening.

Say you combine three verifications and the second fails. Will your automation stop, or continue to the third? And how important is the third? If you can live without it being verified, then combining will save time. However, if verifying the third is a top priority, I would put it by itself and take the time hit.

What if your harness lets test execution continue in the wake of failures (especially when you want it to) and reports all of them in your test report?
– Tarun, Jun 15 '11 at 13:25


Ideally your test harness will distinguish between 10 failures in the same test and one failure each in 10 tests, because the first failure in a test may flag a condition that necessarily causes another 9 failures in the same test.
– user246, Jun 15 '11 at 15:52


To continue test execution even in the wake of failures, we should use verification instead of assertion. An assert stops the entire test case execution on failure, whereas a verify continues and reports the failures at the end.
– Aruna, Jun 15 '11 at 17:53
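In Python's standard unittest, subTest gives roughly this behaviour: each failing sub-check is reported, but the remaining checks still run (the data below is made up for illustration).

```python
import unittest


class CommentTests(unittest.TestCase):
    def test_comment_fields(self):
        comment = {"author": "alice", "text": "hello", "likes": 0}

        # Each subTest behaves like a "verify": a failure is reported,
        # but the remaining checks still execute.
        with self.subTest("author"):
            self.assertEqual(comment["author"], "alice")
        with self.subTest("text"):
            self.assertEqual(comment["text"], "hello")

        # A plain assert* outside subTest behaves like an "assert":
        # a failure here stops the rest of this test method.
        self.assertEqual(comment["likes"], 0)
```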

For unit tests, I greatly prefer to test one thing per test. The reason is this: When the test fails, I want the test to point me as quickly as possible to the broken code.

If a test tests multiple things, then the name of the test is not enough to point to the broken code. I have to seek additional information, such as a stack trace, which guides me to a line of test code. Then I have to look at the line of test code to see what it was doing, and in what context. Now, that's not the most horrible problem I've ever had, but...

If a test tests one thing, then the name of the failing test is often enough to point me in the right direction. Not always, but often.
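A tiny illustration (the discount function is invented purely for the example): when each test checks one behaviour, the failing test's name alone usually tells you where to look.

```python
def calculate_discount(items):
    # Stand-in for real production code, just enough to make the example runnable.
    return 0 if not items else min(0.05 * len(items), 0.5)


def test_discount_is_zero_for_empty_cart():
    assert calculate_discount(items=[]) == 0


def test_discount_caps_at_fifty_percent():
    assert calculate_discount(items=["book"] * 100) <= 0.5
```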

For different reasons, I also prefer to make each acceptance test test one thing. But here, my reason is mostly clarity. I want each acceptance test to describe with crystal clarity the feature or responsibility it's testing. And I want to see why this test matters. The more stuff a test does, the harder it is to understand the point of the test.

I'm more likely to relax my preference for acceptance tests, especially if the test tests through the GUI or browser. The startup and teardown sometimes take a long time. If I do more stuff in each test, then I'm executing fewer of those long-running setups and teardowns, so the suite takes less time to run. On the other hand, each test takes longer to run, and it's not possible to test just the "create a new frobozz" stuff if that test is embedded in a bigger "create a frobozz, edit it, save it, delete it" test.
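One way to pay that startup cost less often without merging the tests themselves, sketched here with pytest and Selenium (the URLs and page titles are made up), is to share the expensive browser session across a module while keeping the checks in separate tests:

```python
import pytest
from selenium import webdriver


@pytest.fixture(scope="module")
def browser():
    # The expensive startup runs once for the whole module, not once per test.
    driver = webdriver.Firefox()
    yield driver
    driver.quit()              # teardown also runs only once


def test_create_frobozz(browser):
    browser.get("https://example.test/frobozz/new")   # made-up URL
    assert "frobozz" in browser.title.lower()


def test_edit_frobozz(browser):
    browser.get("https://example.test/frobozz/1/edit")
    assert "edit" in browser.title.lower()
```

The trade-off is that the tests now share one browser session, so they are no longer fully independent, which is exactly the tension described above.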

I incorporate several verification points in a single test method and the reasons are as below:

Before performing any action on an element, I first verify that the element is present on the page at all: there is always a chance that the UI control has been renamed or removed. The same logic applies to pages. Before performing any testing, I verify at every step that I am on the right page by checking its title. I thus have several verification points in a single test method: first that I am on the right page, then that the required UI control is present, and only after that do I perform the verification point the test case is actually about.

Certain verification points can be logically grouped in a single test method. For example, when I test the "post comment" feature on Facebook I have two verification points: one to verify that the comment is posted successfully, and another to verify that my friend can read the comment I just posted. It makes sense to group these two verification points in a single test method since they are logically related; it would be wrong to claim that posting a comment works if others can't read it.
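A sketch of that flow using Selenium's Python bindings. The URLs and locators are assumptions (this is not a real Facebook test), and the two WebDriver sessions are assumed to be provided already logged in as the two accounts.

```python
from selenium.webdriver.common.by import By


def test_posted_comment_is_visible_to_friend(driver, friend_driver):
    # Verification point 1: are we even on the right page?
    driver.get("https://example.test/post/42")
    assert "Post 42" in driver.title, "landed on the wrong page"

    # Verification point 2: is the UI control still present under that locator?
    boxes = driver.find_elements(By.ID, "comment-box")
    assert boxes, "comment box not present on the page"

    # The action under test.
    boxes[0].send_keys("nice post")
    driver.find_element(By.ID, "comment-submit").click()

    # Verification point 3: the comment was posted successfully.
    assert "nice post" in driver.find_element(By.ID, "comments").text

    # Verification point 4: the friend's session can read it.
    friend_driver.get("https://example.test/post/42")
    assert "nice post" in friend_driver.find_element(By.ID, "comments").text
```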

I know this is going to seem a bit odd, but it depends. A lot of the testing I deal with involves completing a defined transaction (go to this location, select quantity X of item Y, pay with form of payment Z) - for the automation I work with, this constitutes a single test.

One of the ways the team I work with has evolved to get a better idea of where problems originate is in how we structure our scripted error handling: every method in our script code has exception handling, and any exception is logged with the file name and routine name, so what gets logged might look like this: "TransactionLib.RunTransactions: TransactionLib.TransactionStep: PaymentLib.MakePayment: PaymentLib.CashPayment: Authorization form displayed when no authorization required."
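A rough Python analogue of that convention (the payment routines are invented): each routine catches the error, prefixes its own name, and re-raises, so the logged message accumulates the call path on its way up.

```python
import functools


def log_origin(func):
    """Prefix the routine name onto any error bubbling up through it."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        try:
            return func(*args, **kwargs)
        except Exception as exc:
            raise RuntimeError(f"{func.__module__}.{func.__name__}: {exc}") from exc
    return wrapper


@log_origin
def cash_payment():
    raise RuntimeError("Authorization form displayed when no authorization required.")


@log_origin
def make_payment():
    cash_payment()


# Calling make_payment() produces a chained message along the lines of:
# "<module>.make_payment: <module>.cash_payment: Authorization form displayed ..."
```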

Where possible, we try to recover from these so that the test run can complete, although doing so throws the baseline data out of sync and causes a lot of comparison differences as a result. The reasoning is that if the cause of the problem is something that can be configured, or is part of the application, it shouldn't stop the test run; whereas if an access violation or similar-level error occurs, that's a drop-everything-and-fix-it-immediately situation.

Our framework is designed to perform its setup, run the defined set of transaction tests (which we store in what's effectively a relational database of csv files), then do detailed baseline checks and perform cleanup. Whether this counts as running each test individually or stringing them all together would depend on precisely how "test" is defined.
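A stripped-down sketch of that kind of data-driven loop (the column names and the transaction driver are guesses, not our real framework): read the transaction definitions from CSV, run them all while logging recoverable failures, and leave the baseline comparison for the end.

```python
import csv


def run_transaction(location, item, quantity, payment):
    # Stand-in for the real transaction driver; the real one operates the application.
    if quantity <= 0:
        raise ValueError("quantity must be positive")


def run_transaction_suite(path):
    """Run every transaction defined in a CSV file, logging failures but not stopping."""
    failures = []
    with open(path, newline="") as handle:
        for row in csv.DictReader(handle):
            try:
                run_transaction(location=row["location"],
                                item=row["item"],
                                quantity=int(row["quantity"]),
                                payment=row["payment"])
            except Exception as exc:
                # Recoverable problems are logged so the rest of the run can complete.
                failures.append(f"{row['location']}/{row['item']}: {exc}")
    return failures
```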

Each test could use several of a large number of code routines, and we try to keep those to a single action where possible. Each test step can also involve multiple code routines, depending on precisely what's being done - again, it depends a lot on your application under test. A small web application won't need anywhere near the level of complexity I've just described: I spend most of my time working against a massively complex application with over a million lines of code.

The short version: if you can quickly diagnose problems and work out how to get them fixed, it doesn't matter how your tests are structured - although it helps if someone else can learn to diagnose and fix problems quickly too.

I assume the OP intends for "verification point" to mean a condition whose evaluation determines whether a test fails, e.g. what test frameworks often call an "assertion".

When deciding whether the code under test is ready, all you may care about is a binary answer to the question, "Does every test pass?". For all other times, your tests serve two purposes, which I will call verification and diagnosis. Verification is checking whether the software meets its requirements. Diagnosis is narrowing down the specific conditions that cause a failure.

This duality is often a source of tension between testers and developers. If a tester does nothing else, they must decide whether a product passes tests. If a test fails, the developer, if nothing else, must change some code. In between those two events, someone must determine which of the many circumstances leading up to the failure were necessary and sufficient. Armed with that information, the developer will think about how those circumstances map to code. And before that, someone may need to think about whether those circumstances happen often enough or are important enough to justify making changes.

Browse the bug tracking system of any product and you will likely see evidence of the tension between verification and diagnosis. A developer bounces a bug back to the tester because the symptoms are "too vague" or "not specific enough". A tester, having spent a great deal of time reproducing a problem, wonders why a lazy developer can't diagnose a problem for themselves.

So how does this apply to automated tests? Using multiple verification points in a single test increases its diagnostic value. If I need f() to return true, g() to return 1, and h() to return 3 in order for a test to pass, the developer will appreciate knowing which of those three conditions failed.
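As a trivial illustration (f, g, and h below are dummy stand-ins for whatever is really under test): a single combined check only says "the test failed", while labelled checks name the condition that broke.

```python
def f():          # stand-ins for the real code under test
    return True

def g():
    return 1

def h():
    return 3


def test_everything_at_once():
    # Combined: a failure here gives the developer almost nothing to go on.
    assert f() and g() == 1 and h() == 3


def test_with_labelled_checks():
    # Separate checks: the failure message names the condition that broke.
    assert f(), "f() should return True"
    assert g() == 1, "g() should return 1"
    assert h() == 3, "h() should return 3"
```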

There are other reasons for using multiple verification points. Sometimes a test will include "sanity checks": conditions that might point to bad assumptions in your test (e.g. availability of a resource) rather than problems in the code under test. You may also use multiple verification points because the setup expense justifies performing multiple tests at the same time, although doing so may reduce a test's diagnostic value. (In reality, we make this compromise every day; most of us do not rebuild our entire test environment for every test.)

Another reason for incorporating several conditions into one test case is simply to shorten the runtime.
In our system, the setup and teardown of a test can each take more than a minute or two; multiplied by hundreds of test cases, this adds up to a significant amount of time. When I say "setup and teardown" I mean the test framework's own time, so it can't be avoided by doing it once for a whole batch of tests (we run a distributed test framework over slow networks and slow servers).
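To put rough, purely illustrative numbers on that: with 300 test cases and 90 seconds of framework setup plus teardown per case, the overhead alone is 300 × 90 s = 27,000 s, about seven and a half hours per run; grouping related conditions so the suite needs half as many cases removes nearly four of those hours.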

Personally (and I'm fairly convinced that this is the right approach) I'd always hit only one test condition with a unit test. Always, without exception.
For integration and acceptance tests I aim to hit as few test conditions as necessary to exercise the component or functionality under test.
This generally doesn't mean that the checks have only one test condition, and it's possible that the right number of test conditions could actually be many (especially for a large test), but an increasing number of test conditions in higher-level tests can cause automated tests to become brittle and, crucially, can lead to failures that are more difficult to interpret.
To focus my mind when writing these higher-level tests, I keep in mind the question 'what am I actually trying to check?'. If there are test conditions that are important enough to add to the automated checking but not directly relevant to the acceptance tests I am currently writing, I use them as the basis for new, separate test cases.