Tutorial: How to conduct usability testing

General Construct – Moderated Usability Study

In a moderated usability study, there is a moderator (you) and a participant. The moderator guides the process and collects feedback. The study can be remote or in person.

Step 1: Goal

Before starting the test, set the ground rules. Be clear about what the goal of this project is.
Tip: Consider asking yourself or the stakeholders the following questions:

What is the business/usability goal?

Why are we testing this?

Who is the audience of the test?

What are the key parts of the test?

Where is the product in its development cycle?

Why should you talk to stakeholders, and about what?

What is the cost involved with the study?

Once you have that sorted out, you will be able to assess the following:

How much time will the session take?

How much will the participant be compensated?

How many participants do I want to test?

Tip: The current practice is to test 5 to 8 participants per study. Research suggests that testing 5 participants reveals about 80% of the problems, whereas testing 9 participants reveals about 95%. Also, try to limit each session to an hour or less.
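
The figures in the tip are consistent with a simple, commonly used model in which every participant independently uncovers the same share of the problems. The sketch below assumes that model; the default rate of 0.28 is an assumption chosen here only because it reproduces the 80% and 95% figures, and the function name is made up for illustration. Your own study may behave differently.

```python
def proportion_found(participants: int, detection_rate: float = 0.28) -> float:
    """Expected share of problems found after testing `participants` people.

    Assumes each participant independently finds any given problem with
    probability `detection_rate` (an illustrative assumption, not a
    measured value).
    """
    if participants < 0:
        raise ValueError("participants must be non-negative")
    if not 0.0 <= detection_rate <= 1.0:
        raise ValueError("detection_rate must be between 0 and 1")
    # A problem stays hidden only if every participant misses it.
    return 1.0 - (1.0 - detection_rate) ** participants


if __name__ == "__main__":
    for n in (1, 3, 5, 8, 9):
        print(f"{n} participants: about {proportion_found(n):.0%} of problems")
```

The curve flattens quickly, which is one reason small studies of 5 to 8 participants are usually considered good value.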

Step 2: Recruiting

Depending upon the scope and intention of your project, you need to select the right participants. You may want to profile your participants based on:

Drupal experience: Beginner, Intermediate, Expert user

Role: Content Creator, Themer, Site Builder

First-time user or experienced user

E.g.: You are testing a new design of an existing module (say Feeds). You would want to test with a ‘site builder’ who is a first-time user of the new design but has used the old design before. So, you are looking for participants who are 'site builders', have used Feeds before and haven't seen the new design of Feeds.

Where could you find participants?

Consider recruiting friends, family and colleagues if they fit the profile

Step 3: Preparing for the study

Now that we have figured out so many things, here comes the exciting part: working on the structure of the study, that is, what we are going to ask the participants.

A typical study normally consists of three parts:

Part One: Pre-session questions
These are the questions you ask the participants before they dive into the tasks.
Tip: Some typical focus areas that you may consider, depending upon the goal of your test:

What does a participant do with Drupal?

[For a new feature or a new user] What does the participant expect from a feature like this?

[For a new feature or a new user] How does the participant expect something like this to work?

[For an existing feature or an existing user] How was the participant’s experience? What were their pain and pleasure points?

[For a new or existing feature] What are the most important things that you would like to do with this feature?

Part Two: Tasks
This is the meat of the study. Consider allocating a major chunk of the study to this. The tasks should be written in an unbiased and clear way.

How do I select my tasks?
The basic product tasks you would want to focus on:

Tip: Avoid using the same terminology that is used on the interface. E.g. If you are testing the “Modules” page, avoid using the word “Modules” in the task.
At the end of the task, it is also helpful to ask the participants “How do you feel about doing this task?” to gather an overall sense of their experience.

Part Three: Post-session questions
This is the time to ask participants about their overall experience with the feature and how well it works overall. Asking participants to rate their experience is of great value (some of the rating parameters are: Effectiveness, Efficiency, Satisfaction, Ease of Use, Value, etc.)
Tip: Typical post-session questions look like:

“How was your overall experience?”

“What are the things that you liked the most and the least?”

“If you had to rate this feature in terms of ease of use on a scale of 1 to 5, where 1 is completely unusable and 5 is completely usable, how would you rate it?”
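
Once several participants have answered a rating question like the one above, the answers can be summarized. The sketch below is one possible way to do it, assuming a 1 to 5 scale; the function name and the sample ratings are hypothetical. It reports the median and the full distribution rather than only an average, since with a handful of participants a single outlier can distort a mean.

```python
from collections import Counter
from statistics import median


def summarize_ratings(ratings: list[int], low: int = 1, high: int = 5) -> dict:
    """Summarize ratings given on a `low`..`high` scale."""
    if not ratings:
        raise ValueError("at least one rating is required")
    if any(r < low or r > high for r in ratings):
        raise ValueError(f"ratings must be between {low} and {high}")
    counts = Counter(ratings)
    return {
        "median": median(ratings),
        # Include every scale point so gaps in the distribution are visible.
        "distribution": {p: counts.get(p, 0) for p in range(low, high + 1)},
    }


if __name__ == "__main__":
    # Hypothetical ease-of-use ratings from six participants.
    print(summarize_ratings([4, 5, 3, 4, 2, 4]))
```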

For samples of usability test sessions and test plans, refer to the ‘Resources’ section at the end of the document.

Step 4: Conducting

Note: Consistency is the number 1 thing to remember while testing. You want to give the same tasks, in the same order, in the same words.
As you greet the participants, manage their expectations of the test. Tell them how long the session will take, ask them to think aloud and to be candid in their comments (good and bad), explain that you are a neutral observer and, most importantly, that you are here to evaluate the software and not them.

A typical participant briefing looks like this:

Welcome, my name is ___ and I’m a ___.

Thank you for taking the time to participate. This means a lot to us.

Here is the basic structure of the session: a few pre-session questions, a series of tasks, then a few post-session questions.

Your comments are very important to us – we ask that you give open and candid opinions, both good and bad.

Ask for clarification if needed. I will be neutral throughout the test.

Keep in mind we are testing the software, not you

Try to complete the tasks as if you were doing this for real. Spend as little or as much time as you normally would doing these tasks. It is ok if you cannot complete each task, and we may not get to every task

We will be recording the session, so that we might share anything relevant with our community on Drupal.org.

We only record the audio and the screen, so use the mouse when describing an area of the screen.

Do I have your permission to record this session?

We may post the highlights from this conversation online. Your name or any other personally identifiable information will not be associated with the data.

This should take about ___ minutes. The time is now ___ so we should be done around ___.

Any questions before we begin?

Moderating a usability study
They say moderating a usability study is a skill! But with some practice and a few pointers, we guarantee that you will be able to extract the data you need.
Here are some of the tips to consider:

Respect the participant’s rights

Ensure participants’ physical and emotional comfort

Make a connection with the participant

Minimize interruptions

Be unbiased

Watch out for signs of extreme frustration

Don’t give away information inadvertently

Give assists, if needed (and note when you do)

Most importantly, listen to the participant. Let them talk. You should talk only when needed.

Let’s take a look at a few examples and see what’s wrong with them.

Taking Notes
You could take notes during the session, or revisit the recordings and take notes later. Choose whichever way is most comfortable for you.

What to note?

Note everything that the participant is doing: where the participant goes and what the participant says. Record quotes and timestamps for relevant moments. Also, look for verbal cues and facial cues (if in person).

Remember, while taking notes: Refrain from judging what is an issue and what is not. Doing that while taking notes adds to the ‘note taker’s bias’. See yourself as a scribe, taking notes without processing the information. This method helps you collect more data, and data that is closer to what really happened.

Step 5: Analyzing

Now that you have done all the work, it is time to make sense of all the data. Each one of us has a different style of analyzing data. Choose a style that suits you.
Tip: One possible method is:
Browse through the notes
Categorize the notes (in a spreadsheet, maybe) into positives, issues, and observations by participant
Go through all the participant notes for a particular positive, issue or observation and note how many times it was encountered. In other words, look for patterns
After drafting the priorities, assign every issue a severity and note the impact/frequency of the problem
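
The pattern-finding step above can also be done with a short script instead of a spreadsheet. The following is a minimal sketch, assuming each note has already been reduced to a participant, a category and a short label; the sample notes and the function name are hypothetical.

```python
from collections import defaultdict

# Hypothetical notes: (participant, category, short label for the note).
notes = [
    ("P1", "issue", "could not find the import button"),
    ("P2", "issue", "could not find the import button"),
    ("P2", "positive", "liked the preview"),
    ("P3", "issue", "confused by field mapping"),
    ("P3", "issue", "could not find the import button"),
    ("P1", "observation", "used keyboard shortcuts"),
]


def find_patterns(notes):
    """Group notes by category and label, counting distinct participants."""
    people_by_note = defaultdict(set)
    for participant, category, label in notes:
        people_by_note[(category, label)].add(participant)
    rows = [
        (category, label, len(people))
        for (category, label), people in people_by_note.items()
    ]
    # Most widely encountered notes first; ties sorted alphabetically.
    return sorted(rows, key=lambda row: (-row[2], row[0], row[1]))


if __name__ == "__main__":
    for category, label, count in find_patterns(notes):
        print(f"{category:<12} {label:<35} {count} participant(s)")
```

Counting distinct participants, rather than raw mentions, keeps one very talkative participant from inflating how common an issue looks.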
How to rate issues?
Severity / Attributes

Refer to this document for more information -> insert document (Sent email to Bojhan)

Step 6: Reporting

This is simply a write-up of what you have done, namely:

A sample report can be found in the Resources section.

Resources

“Rate the degree to which the software you just used has no performance problem”
Don’t do this. By saying “no performance problem” you are biasing the question and pushing the participant to agree that there is little or no problem. The right way to ask would be:
“How would you rate the performance of this software on a scale of 1 to 5, where:
1 – Very bad performance
2 – Bad performance
3 – Neither good nor bad
4 – Good performance
5 – Very good performance”

“Are you interested in obtaining another degree in the next 10 years?”
Don’t fall into this trap. For the same reason as before, it is almost a rhetorical question. It feels like you want the participant to say “Yes”. You could eliminate the bias by asking:
“What are your thoughts on obtaining another degree in the next 10 years?”

“About how many times in the last year did you use online ‘Help’? ______ Number of times.”
Nope. Nope. The first assumption this question makes is that the participant has used online ‘Help’. It also assumes that the participant remembers how many times they used it. Participants are likely to pick a number that is far from accurate. The goal of asking a question is to get the most accurate information. Consider rephrasing the question as:
“Have you or have you not used online ‘Help’ in the last year?”
“[If the participant says yes] How many times do you think you used it in the last year?”
• 1-3 times
• 4-10 times
• 11-20 times
• 21 or more times
• I don’t remember

Let’s take a look at one more example: How often did you use Product X during the last month? (Check one answer)
• Never
• Rarely
• Occasionally
• Regularly
Seems all right, right? But there is a subtle fault. The options are vague and mean different things to different people. Try giving specific ranges for better accuracy. Consider these options instead:
• 0 times
• 1-3 times
• 4-10 times
• 11-20 times
• 21 or more times
• I don’t remember

• Severity scale

CRITICAL: Usability catastrophe; imperative to fix this before release

MAJOR: Major usability problem; important to fix, high priority

MINOR: Minor usability problem; low priority

NORMAL: Cosmetic problem only; fixed if extra time is available

• Attributes

Frequency: How often does it occur?

Impact: Level of difficulty for the users to overcome the issue

Persistence: Can the user overcome the issue if it persists?

Market Impact

Remember that an issue is measured not only by its frequency but also by its impact. In other words, if all the users face an issue, it is a problem, but you need to gauge how much it keeps them from getting the task done. Conversely, even if only one user faces an issue, it may be a critical issue if it dramatically prevents that user from getting the task done. It might not affect all the users, but it affects a small subset of users catastrophically.
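
One way to keep frequency from overshadowing impact is to sort issues by severity first and use frequency only to break ties. The sketch below assumes the four severity labels from the scale above, in the order listed there; the `Issue` class, the `prioritize` function and the sample issues are all hypothetical.

```python
from dataclasses import dataclass

# Severity labels in the order the scale above lists them, most severe first.
SEVERITY_ORDER = ["CRITICAL", "MAJOR", "MINOR", "NORMAL"]


@dataclass
class Issue:
    title: str
    severity: str      # one of SEVERITY_ORDER
    affected: int      # participants who ran into the issue
    participants: int  # participants in the study

    @property
    def frequency(self) -> float:
        return self.affected / self.participants


def prioritize(issues):
    """Sort by severity first, then by share of participants affected."""
    return sorted(
        issues,
        key=lambda issue: (SEVERITY_ORDER.index(issue.severity), -issue.frequency),
    )


# Hypothetical findings from a five-participant study.
sample_issues = [
    Issue("Labels are inconsistent", "MINOR", affected=5, participants=5),
    Issue("Import silently loses data", "CRITICAL", affected=1, participants=5),
    Issue("Mapping screen is confusing", "MAJOR", affected=3, participants=5),
]

if __name__ == "__main__":
    for issue in prioritize(sample_issues):
        print(f"{issue.severity:<9} {issue.frequency:>4.0%}  {issue.title}")
```

In this sample, the issue that only one participant hit still ends up at the top of the list because it is catastrophic for that participant.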

It is also recommended to take a step back and look not at each granular issue but for overarching issues in the design. Sleep on the data for a while, then refocus the problems to see the bigger picture. Think of yourself as a storyteller, trying to piece everything together and keep the reader engaged.

Positives
Issues (sorted by severity of the problem)
Observations

Goals: Why did we test?

What was the methodology of the study?

What were the participant demographics?

What is the executive summary? (This is a few paragraphs that give the gist of the study's findings)