How do users learn and use software applications? An introduction to mental models

As your users learn to use your software application, website, or device, they gradually form a mental model of how it works and how to operate it.

A mental model is a conceptual representation, in a user’s mind, of how a system works, and how to operate it. A user’s mental model reflects the user’s current understanding, and is subject to change as the user gains experience with the product.

When faced with a new situation, users rely on their mental models to reason about the situation and the system, and to make decisions and formulate strategies on how to proceed. Users also form expectations based on their mental models.

But mental models are not always correct representations of how a system works, and the mismatch between an incorrect mental model and the system can explain many usability problems. It’s important, therefore, for a system to be designed in such a way as to help users form a correct mental model of the system’s operation.

Mental models are cognitive structures in people’s minds. It’s hard to say that mental models have any particular form or structure. A mental model is not inherently visual, although visual images do form an important part of a mental model.

Let’s look at a number of things that I believe make up a mental model for operating a software-based system.

What does a mental model consist of?

1. General appearance

Users will form mental images of the system, or at least of the parts of the system that they have encountered and are familiar with. But these mental images are typically very vague and imperfect.

For a simple physical device, the mental visual image may be quite vivid: I can quite easily picture a screwdriver or a can opener, for example.

For a typical complex software application, users will become familiar with the general layout of the screens, pages, tabs, or windows that they encounter. The level of detail of mental images will vary depending on each user and the frequency of use.

For example, as a frequent user of Microsoft Word, I have a vague image of the layout of the main window in my mind, though I wouldn’t be able to recall the exact sequence of icons in the toolbar or even what pull-down menus exist after “File” and “Edit”. I know some of the dialogs like Font, Find/Replace, etc. well enough that, even if the text of the labels and buttons were blurred, I could still recognize the dialogs by the “shapes” of their layouts. But my recall is not good enough to be able to sketch them out accurately.

2. Concepts, vocabulary, and rules

In general, every software-based system or product solves some sort of problem or issue. The concepts, vocabulary, and rules involved in the context of that problem are referred to as the problem domain, the business domain, or the application domain. A bank uses specialized software to run its operations, and so the application domain of that software system is banking.

For some systems, the application domain is relatively small. The operator of an e-mail client only needs to understand a handful of concepts, like e-mail addresses and attachments. Other systems will demand in-depth knowledge and understanding of a domain. Imagine what a master operator in the control room of a nuclear power plant needs to know!

Some applications, like the nuclear power control system, can safely assume that the user already has the prerequisite knowledge of the domain, whether through education, training, or experience, or some combination of these.

Other applications have the responsibility of communicating their unique concepts to the user. Games, being “imaginary worlds” rather than real-world problem-solving tools, are an extreme example of this. The first time you play, say, Angry Birds, you need to learn what objects are in the game and how they interact; in other words, what the basic rules of the game are.

Or, take Twitter as another example. To use Twitter, you need to understand what a “tweet” is, and you need to learn that you can follow other users and that other users can follow you. If you’ve never used Twitter before, you might learn these concepts by reading the introductory instructions on the Twitter website, or you might figure them out yourself by trying it out (self-discovery). Or, in the unusual case of something as popular as Twitter, you might learn the concepts “accidentally” by watching a friend use it or by hearing about it in the media.

In many cases, users can grasp concepts without knowing the associated vocabulary. For example, web browser users can enter website addresses without knowing that the addresses are technically called URLs.

Additionally, users often don’t need a full understanding of many concepts if the software handles the details for them. Users of a shipping postage calculator may only need to know that a customs duty fee exists; they do not need to know the rules and technical details, as they will trust the software to calculate the fee for them.

Users who aren’t aware of all of the application’s concepts, or don’t understand them completely, are often still able to use the application effectively, if not optimally. Virtually all beginning users of Microsoft Word are unaware of the concept of styles, for instance, but they are still quite capable of producing documents. We’ll see later how we can design applications to help users discover and learn key concepts.

3. Navigation map

Most software systems consist of different “rooms” or “places”, in the form of pages, screens, tabs, or windows, which the user can “visit”. When it is necessary for the user to differentiate between these places and to be able to get to them quickly, the user will gradually form a mental navigation map indicating how to get to the different destinations.

Navigation is often one of the steps needed to carry out a task in an application. For example, to purchase goods on an e-commerce site, you may need to navigate to the “shopping cart” page and then click on a “checkout” button, which may lead to a sequence of pages for finalizing the purchase.

Sometimes there may be more than one way to get to a location. For example, to get to the Print dialog box in most Windows applications, you can navigate to the File menu and choose “Print…”, or you can use the shortcut Ctrl-P, or you can click on the printer icon in the toolbar. The user may not be aware of all ways to navigate to a destination, and users aware of multiple options will tend to use only one of them frequently.
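One way to picture a navigation map is as a small graph of places connected by actions. The sketch below is only an illustration of that idea, not a claim about how users' minds or any real application work: the place names, the action labels, and the routes() helper are all hypothetical, chosen to mirror the Print dialog example above.

```python
# Hypothetical navigation map: each place maps the actions available there
# to the place that each action leads to. All names are illustrative.
NAV = {
    "main window": {
        "open File menu": "File menu",
        "press Ctrl-P": "Print dialog",
        "click printer icon": "Print dialog",
    },
    "File menu": {"choose Print…": "Print dialog"},
    "Print dialog": {},
}


def routes(nav, start, goal, visited=()):
    """List every sequence of actions that leads from start to goal."""
    if start == goal:
        return [[]]
    seen = visited + (start,)
    found = []
    for action, destination in nav[start].items():
        if destination in seen:
            continue  # skip places already on this path to avoid loops
        for rest in routes(nav, destination, goal, seen):
            found.append([action] + rest)
    return found


for route in routes(NAV, "main window", "Print dialog"):
    print(" -> ".join(route))
```

A user's own map would typically hold only a subset of these routes, often just the one they use habitually.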

4. Action plans or strategies for accomplishing tasks or for reacting to situations or problems

Users may memorize plans of actions needed for carrying out certain tasks.

An action plan might take the form of a simple sequence of steps to follow. Or, with sufficient experience with the product, users may internalize a conceptual structure similar to a flowchart diagram that has various decision points and branches with steps to follow under different circumstances. (But note that most users will not actually have a visual depiction of a flowchart in their mind, and keep in mind that the structure may not necessarily be complete or correct.)

Sometimes a user may not necessarily understand why a certain sequence of actions performs a particular task, but they’ve still memorized the sequence and are able to reproduce it. This can happen when the user has been taught how to perform a task in a training session, but some of the fundamental concepts (the “why” behind the actions) haven’t been explained. It can also happen when the user has discovered by accident how to perform a task.

5. General heuristics and conventions

The user’s mental model may include general heuristics and conventions from a broader context that happen to apply to the system at hand. For example, based on the user’s experience with the operating system, one such heuristic might be, “to dismiss a dialog box, click on the ‘OK’ button or click on the ‘X’ in the title bar”.

6. Implementation model

A user may or may not form an idea of how the product works internally.

For simple mechanical devices and machines, you might be able to see all the moving parts, and you can mentally envision how the parts interact when the device is in operation. If you examine a manual can opener, for instance, you can see how the edge of the can is pinched between the wheel and the blade, and you can imagine how turning the handle slices open the can.

For simple mechanical devices, seeing and understanding how the parts work can be helpful and may even be necessary for operating the device correctly.

But for more complex mechanical devices, like the engine of an automobile, the inner workings are often too complicated for non-engineers to understand – and so such machinery is tucked out of sight. Automobile operators are offered simplified and abstracted controls – like the gas pedal and the automatic gearshift – which eliminate the need to know how the engine works. In other words, the user’s mental model of the underlying implementation can be extraordinarily simple (basically: “the engine burns gas to run, so I need to make sure there’s still enough gas in the tank”). The user’s mental model can instead focus on the actions needed to make the car move: put the gearshift into “Drive” and push on the gas pedal.

For software products, designers need to hide the internal workings to the maximum extent possible. While technically sophisticated “power users” might try to envision how the underlying algorithms, data storage, and communications protocols work, users should never have to know about the technical implementation details.

But even if users are “perfectly” shielded from unnecessary technical implementation details, they will still often be able to observe patterns in how the system operates and responds to inputs. From these observations and patterns, users will form simple implementation models, and implementation models on this level of abstraction are a good thing.

Let’s say your application has an on-screen table containing a list of contacts, and there is an “Add” button to let the user add a contact to the list. The user observes that every time a new contact is added, it appears at the bottom of the list, below the other entries. Based on this observation, the user will tend to presume that the contacts are maintained in a sequential list, and new contacts are always simply added to the end of the list (rather than being added at the top of the list or being inserted at appropriate places in order to maintain alphabetical ordering or some other sort order).

This is a very simple and abstract form of implementation model, but it helps the user predict what will happen when a new contact is added. And when the user encounters another on-screen table, they will tend to assume that similar behavior applies there as well.
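To make the contrast concrete, here is a minimal sketch of two of the behaviors the user could be choosing between: appending to the end of the list versus inserting in alphabetical order. The function names and sample contacts are hypothetical, and the code is not meant to describe how any particular application is implemented.

```python
import bisect


def add_appended(contacts, name):
    """Return a new list with the contact added at the end."""
    return contacts + [name]


def add_sorted(contacts, name):
    """Return a new list with the contact inserted in alphabetical order.

    Assumes the incoming list is already sorted.
    """
    result = list(contacts)
    bisect.insort(result, name)
    return result


contacts = ["Chen", "Maria", "Oluwaseun"]

# Append behavior: the new entry always lands at the bottom of the table.
print(add_appended(contacts, "Anna"))

# Sorted behavior: the new entry lands wherever the sort order dictates.
print(add_sorted(contacts, "Anna"))
```

A user who has only ever seen the first behavior will be surprised by the second, even though both are reasonable designs. The surprise comes from the mismatch with the implementation model the user has already formed.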

What’s next?

In upcoming posts, we’ll discover more about mental models by exploring questions such as:

How do users explore and try out products, and how does this impact how mental models are formed?

How do mental models change and evolve as users continue to use a product?

How do mistakes and forgetting impact mental models and the user’s experience?

How do the mental models of beginners, intermediates, and experts differ?

How can we design systems and interfaces so that users can form a mental model that matches the designer’s intended mental model?

Welcome!

Hi, I'm Kevin Matz, founder of Winchelsea Systems Ltd. and creator of the ChapterLab word-processing app. This is my blog about usability and UX design for websites and software products. Let me know what you think!