Just to be clear, I will not accept Jakob Nielsen's Alertbox article "Why You Only Need to Test with 5 Users" (March 19, 2000) without solid empirical data to back it up. I find it hard to believe that in 10 years no one has ever tested Nielsen's claim in a meaningful way.

Further, I find statements from Nielsen like this troubling at best: "The curve clearly shows that you need to test with at least 15 users to discover all the usability problems in the design."

Translation: Test 15 people and you will find 100% of the usability problems. (yeah, right!)

Your translation is incorrect. "Test with at least 15 users" means that you will need at least 15 people to discover all the problems, not that you will discover all the problems with 15 people.
– Rahul♦ Oct 25 '10 at 14:37

Rahul's interpretation is the right one. The statement is made in the context of his explaining why he advocates testing with fewer people - "Since you don't get near finding 100% of the problems until you have fifteen people, why would I do it with fewer?"
– DJClayworth Oct 25 '10 at 16:54

@DJClayworth: You're wrong. The sentence you quote doesn't prove anything, and it's not even present on the page in question. In fact, he SAYS to do the test with fewer people.
– blunders Oct 25 '10 at 17:44

Thanks, that makes a lot more sense. Even after reading that and going back to Nielsen's article to see if that was there all along, it doesn't seem like it was. Nielsen's wording is awkward at best as it relates to the 31%, which is key to understanding the claim being made. Again, thanks!
– blunders Oct 25 '10 at 13:00

While the article is good, it's quite possible that it will disappear with link rot, so it would be good to highlight the key points of the article in your answer. As it stands, this is a comment, not an answer.
– zzzzBov Mar 7 '13 at 22:30

@VirtuosiMedia: Although some testing is better than not testing at all, Nielsen's claim that 5 people are able to find the majority of issues makes no mathematical sense.
– blunders Oct 25 '10 at 2:10

His number was based on testing. He didn't arrive at it through a formula.
– Virtuosi Media Oct 25 '10 at 2:17

@VirtuosiMedia: Yes, I know, but that doesn't mean it's correct. Have you ever heard of a sample size for any test being just 5 people? Nielsen doesn't just say 5 is better than none; he says that more than 5 is a waste. It makes no sense how a sample that small would work to find 80% of the issues - it just does not add up.
– blunders Oct 25 '10 at 2:55

@blunders You're missing Nielsen's point. His point is not that N people is the exact number needed to discover some percentage of problems, but that there is a negligible return on investment with each additional tester beyond 15. Hence, his advice is to use 3 sets of 5 people, for reasons described in his article (such as smart iteration, budgetary concerns, and pragmatism).
– Rahul♦ Oct 25 '10 at 14:42

@Rahul: I got his point regarding diminishing returns; what I missed was the meaning of this sentence: "L is the proportion of usability problems discovered while testing a single user." - which is the key to understanding what he's saying. That is, IF each user tested finds 31% of the usability issues, then after 5 users you will have found about 85% of the existing issues. To get the remaining 15% you must expose 10 more people to the same test, which is not worth it in most cases. The whole issue is IF a user on average really does find 31% of the existing issues.
– blunders Oct 25 '10 at 15:51
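
For anyone who wants to check the arithmetic in the comment above, here is a minimal sketch in Python. It uses the curve from Nielsen's article, 1 - (1 - L)^n, and assumes every user independently finds the same proportion L of the problems, which is exactly the "IF" being questioned. The function name `proportion_found` and its parameter names are made up for illustration.

```python
def proportion_found(n_users: int, single_user_rate: float = 0.31) -> float:
    """Expected share of usability problems found after n_users tests.

    Assumes (as the article's curve does) that each user independently
    uncovers the same proportion `single_user_rate` (L) of all problems.
    """
    if n_users < 0:
        raise ValueError("n_users must be non-negative")
    if not 0.0 <= single_user_rate <= 1.0:
        raise ValueError("single_user_rate must be between 0 and 1")
    # Share still undiscovered after n users is (1 - L)^n.
    return 1.0 - (1.0 - single_user_rate) ** n_users


if __name__ == "__main__":
    # Print the curve for a few group sizes at L = 0.31.
    for n in (1, 3, 5, 10, 15):
        print(f"{n:>2} users: {proportion_found(n):.1%}")
```

With L = 0.31, five users come out at roughly 84%, and fifteen at over 99%, which matches the diminishing-returns argument. The result is very sensitive to L, though: with a hypothetical L of 0.10, five users would find only about 41% of the problems.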

Having just completed two rounds of user testing, each with 15 users, my answer is 15.

Not because Nielsen says so; it was a matter of practicality. Some testing was necessary, but resources and timelines were tight, so a larger group was not feasible.

Did we uncover all issues? Absolutely not. Did we get some support for the basic approach and for the key user tasks? Absolutely.

What's actually more important is a rolling programme of user testing, done repeatedly with a small number of users. People's expectations and behaviours change over time as they use a product and other products that might be similar or a direct competitor.

A user test is just a snapshot of a single point in time. A second user test is not an evaluation of any change; it's just another snapshot. String 5, 10 or 20 rounds of user testing together over a couple of years and you start to get a better understanding of your users.

In The Trouble with Computers, Thomas K. Landauer (1996) elaborates on the minimum number of users in evaluations. As far as I remember, he actually backs up his results with formulas and explanations.

I don't have the book with me right now, so I can give you neither the results nor the calculations :( However, the book was full of good advice and interesting research, so I highly recommend it to anyone interested in the field of usability and empirical user evaluation. I will update this answer when I get home. In the meantime I suggest you go to the library and pick it up too, @blunders.

@jensgram: Landauer's book is basically the same research; I did just look at the book, though... :-) ...thanks for bringing up his book! There is way more data and charts in the book, especially in reference to the cost-benefit ratio of user testing.
– blunders Oct 25 '10 at 11:56