Our Approach to Usability Testing.

At Mäd, we spend a lot of time testing the various products that we build on behalf of our clients, and we spend lots of time thinking of the best way to go about it. Here we share our best practices and why usability testing can be somewhat counterintuitive.

Why Do Usability Testing?

“Most people make the mistake of thinking design is what it looks like. People think it’s this veneer – that the designers are handed this box and told, ‘Make it look good!’ That’s not what we think design is. It’s not just what it looks like and feels like. Design is how it works.” – Steve Jobs

Many people think that usability testing is something that needs to be conducted for only the largest and most important of projects that have huge budgets and long timeframes. While we work on quite a few of these, usability testing is actually something that can (and should!) be integrated into projects of any size, because it can dramatically improve the quality of the product with little effort.

In fact, we would go as far as to say that elaborate user testing is actually a waste of resources, and that doing lots of small tests with a handful of users is best.

The first thing to note is that products are designed by designers, and designers, however strange they may be, are indeed fallible humans - even the ones that we hire at Mäd!

They use their experience and intuition to design what they believe is the best solution to help a user accomplish a certain task. The key thing to remember is that the people designing the product are often not part of the same target market as the end-users. While we work hard at Mäd to develop empathy and understanding with our clients' end-users, nothing beats actually going to speak with them!

So design flaws happen, and usability testing ensures that these become apparent and don't make it into the final product.

We don't ask users what they want; we ask them to complete a task. If they cannot, or if they find it difficult, we ask them to explain what got in their way. The solution is not given at the start.

How We Conduct Tests.

To ensure that the test results are valid, it is important to be diligent in conducting the tests in an impartial way.

The first thing we must do is pick our test users. For B2C products that have a wide target audience, we conduct what is called "Guerilla Testing", which means we simply approach people in the street or at coffee shops, and ask them to try out our product.

Of course, this isn't a great approach for all products, and in the cases where the target audience is far more specific, we then ask our clients to make introductions to their existing customer base, so we can meet them and test the new product.

The key thing about soliciting feedback from users is to ensure that they feel comfortable saying what they actually think. This is why we keep interviews quite informal and only bring a small team, so that users do not become nervous. We explain that the goal is just to test one version of the product, and that we are nowhere near finished, so that the user doesn't feel bad about voicing any criticism they may have.

We speak with curiosity, asking the users to complete certain tasks, but we are careful never to ask leading questions or give users solutions, or even ask them what they think the solution should be.

“If I had asked people what they wanted, they would have said faster horses.” – Henry Ford

We also ask users to speak out loud as they use the product, so we get real-time feedback. This is useful because it leaves little room for users to rationalise their earlier actions incorrectly once they have seen the solution.

The next section covers the numbers that go along with testing, but it is important to note that the focus is not entirely on the numbers (e.g. 95% of users can complete this action easily). It is also on the feelings that users experience while using the prototype.

How Many Users to Test?

Five is the Magic Number.

In essence, lots of small tests with five users give you the best return on your time and money. Any more testing after this and you're getting diminishing returns on your investment of time and effort.

This is because of the following equation:

$$N\left(1 - (1 - L)^{n}\right)$$

This gives the number of usability problems found in a usability test with n users, where N is the total number of usability problems in the design.

The "L" value is the proportion of problems found while testing with a single test user. For a typical product, this hovers at around 30%.
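As a quick worked example, here is the share of problems found with five test users. It assumes L = 0.31, which is only an illustrative average and will differ from product to product:

$$1 - (1 - L)^{n} = 1 - (1 - 0.31)^{5} = 1 - 0.69^{5} \approx 1 - 0.156 = 0.844$$

So under that assumption, five users would be expected to uncover roughly 84% of the problems in the flow being tested.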

The maths is a little complicated, but it comes down to binomial probability: each test user has the same independent chance, L, of running into any given problem.

It is much easier to understand if we plot this out in a graph, which gives the following curve:

The most important point of this graph is that if you do zero usability tests, which happens in a surprising number of projects around the world, you learn absolutely zero. That's not surprising, but then why do so many companies avoid usability testing?

As soon as you collect test data and listen to a single user, you already learn around 30% of everything you need to know.

This trend follows this pattern:

2 test users: 50% of problems

3 test users: 65% of problems

4 test users: 75% of problems

5 test users: 85% of problems

6 test users: 90% of problems

8 test users: 95% of problems

12 test users: 99% of problems
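The figures above are rounded. As a rough sketch, the curve can be recomputed directly from 1 - (1 - L)^n. The function names here are hypothetical, and the default of L = 0.31 is an assumed average rather than a measured value for any particular product:

```python
def share_of_problems_found(n_users: int, single_user_rate: float = 0.31) -> float:
    """Expected share of usability problems found with n_users test users.

    single_user_rate is the "L" value: the share of problems a single
    test user uncovers. The 0.31 default is an assumption, not a measurement.
    """
    if n_users < 0:
        raise ValueError("n_users must be zero or more")
    if not 0.0 <= single_user_rate <= 1.0:
        raise ValueError("single_user_rate must be between 0 and 1")
    return 1.0 - (1.0 - single_user_rate) ** n_users


def curve(user_counts, single_user_rate: float = 0.31) -> dict:
    """Map each user count to the expected percentage of problems found."""
    return {
        n: round(100 * share_of_problems_found(n, single_user_rate))
        for n in user_counts
    }


if __name__ == "__main__":
    for n, pct in curve([0, 1, 2, 3, 4, 5, 6, 8, 12, 15]).items():
        print(f"{n:>2} test users: ~{pct}% of problems")
```

Changing `single_user_rate` shows how quickly the curve flattens for a given product: the higher the value, the fewer users it takes before additional tests stop revealing much that is new.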

If you are testing more than 12 users, you're just being stupid and wasting your time...you have already learned everything you need to know.

The important thing to note here is that you are not testing your product with just five people, but five people per feature or flow. This means that if you have six features you need to test, you'll actually be speaking to around thirty test users.

However, the curve shows that if you speak with 15 users, you'll pretty much learn everything there is to know, so why are we at Mäd recommending testing with only five users instead?

Well, it comes down to budget and time constraints. No project is without a timeline and a budget, and we find it is better to run lots and lots of smaller tests instead of blowing the budget on one larger test.

In fact, we can actually apply the above curve to the number of different tests run instead of the number of users tested, and you can see that you want to run at least five different feature tests to get the best results when testing the entire product overall.

Even if you have more than five features in your product, what you learn from testing five features will most likely carry over to the remaining untested features, which means we can build a high-quality product on time and on budget.

We also have to remember that when we do usability testing, we are not just looking to document the weaknesses in the design. We are actually going to edit the design of the product to solve the problems, and then test that design again. We don't want to simply assume that we have fixed a problem that came out during usability testing; we need to test it!

So, it is much better to run three studies with 5 users each, going back each time to change things, than to do one massive study with 15 users.

When to test more.

If you have more time, or you are designing something incredibly mission-critical, testing up to 12 users per flow might be worthwhile.

When your "L" value is low, you'll need more user tests to ensure that all the subtle problems come up. Imagine you have a really polished, battle-tested product that you are changing. In that case, a 31% average problem frequency may be far too high an estimate; it may be as low as 5%, which means you would need to test more users to find these subtle problems.
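To get a feel for how much a low "L" value changes things, the formula 1 - (1 - L)^n can be rearranged to estimate how many users are needed to reach a target share of problems. This is only a sketch: the function name and the 80% target are assumptions chosen for illustration, and L itself is always an estimate.

```python
import math


def users_needed(single_user_rate: float, target_share: float) -> int:
    """Smallest n such that 1 - (1 - L)^n reaches target_share.

    single_user_rate is the estimated "L" value; target_share is the
    share of problems you want to uncover. Both are assumptions you
    supply, strictly between 0 and 1.
    """
    if not 0.0 < single_user_rate < 1.0:
        raise ValueError("single_user_rate must be strictly between 0 and 1")
    if not 0.0 < target_share < 1.0:
        raise ValueError("target_share must be strictly between 0 and 1")
    # Solve (1 - L)^n <= 1 - target for n, then round up to whole users.
    return math.ceil(math.log(1.0 - target_share) / math.log(1.0 - single_user_rate))


if __name__ == "__main__":
    print(users_needed(0.31, 0.80))  # typical product -> 5
    print(users_needed(0.05, 0.80))  # polished product -> 32
```

Under these assumptions, a product where each user only surfaces 5% of the problems needs about six times as many test users to reach the same coverage as a typical one.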

And, of course, it is actually impossible to know upfront the percentage of average problem frequency without prior testing, so the best way to do this is to estimate based on the complexity of the tasks presented to the test users and the maturity of the product.

If the product is a redesign or has not yet been used by the public at large, you can safely assume a 30% to 50% average problem frequency, which means 5 test users per feature is plenty.

Conclusion

We believe that informal tests with five users per feature are the best approach for the vast majority of products. This allows us to keep our processes simple, move fast, keep costs low, and iterate quickly to benefit our clients.

We always include usability tests in all our projects to ensure that we can build the best possible products.