Heuristics as an Aid to Training a Usability Evaluator’s Expertise

(This is unmarked coursework, part of my HCI course at UCLIC and is released with permission from Prof. Ann Blandfod. Since this is not evaluated and hasn’t gone through any sort of peer review process, it will most certainly contain errors.)

Heuristics as an Aid to Training a Usability Evaluator’s Expertise

Vishnu Gopal

University College London

It is clear that heuristic evaluation as Nielsen envisioned it is a method meant for experts (Nielsen, 1992). Heuristics do not stand alone, and have to be moulded to fit any particular scenario: the general set of heuristics have been expanded into specific guidelines for different kinds of activities: accessibility, internationalization (Gonzalez, Granollers, Pascual, 2008), etc. are examples and these require evaluators who are trained and experienced in separate spheres. Experimental data also seems to suggest that the more experienced the evaluators, the more usability errors they find (Dumas and Redish, 2002). Given this scenario, I seek to explore if heuristics or other similar guidelines can serve as a tool to strengthen a beginning evaluator’s “experience”.

Without doubt, applying heuristics to usability evaluation gives a methodical structure to the task of analysing potential faults in a system. During a recent analysis of e-commerce websites, one major observation that I made was that without rules, it’s easy to miss the forest for the trees—i.e. one might speculate on possible faults: for e.g. the links on the right hand side navigation bar was not prominent or relevant enough, or the visual design was not attractive; but fail to gather the data into meaningful coherent suggestions. The aim, after all, of a successful usability evaluation is to find ways to rectify potential faults. When pitted against an evaluators raw instincts then, following a set of guidelines acts both as a reasonably exhaustive search space and a framework for assessing faults that have been found. It can also be argued that using a set of guidelines methodically can sensitise an evaluator to common errors.

On the other hand, I also observed that there might be faults found more easily not from a strict adherence to guidelines, but from an evaluator’s own prior experience. During the usability evaluation activity, my companion who is a trained visual designer found that much of the website’s apparent clutter was due to it not following a coherent “grid system” (see Chang, Dooley, Tuovinen, 2002). While it might be easy to slot this into either guideline 4 (consistency) or 8 (aesthetic design), it doesn’t cleanly fit into the heuristic framework provided, but is a crucial criticism nevertheless. I suspect that a strict adherence to guidelines without a broader background might harm rather than help a beginning evaluator’s progress, but this requires detailed investigation. It could also be too easy to be trained to look into a series of specific and common problems rather than try to evaluate a system based on its intent.

This is further imperilled by the fact that the minutiae of specific guidelines change often. An example of this debate is how the specific recommendation relating to websites displaying content above the fold (the initial viewable area) changed from 1994 to 1997, a short span of three years (Nielsen 1997).

When compared to other methods of user testing, heuristics pale further in this regard. They remove a vital component from usability evaluation: the serendipity (Stoskopf 2008) that observing a user adds to training an evaluator’s instincts. This is especially important in a field like usability evaluation where observation of real users continues to be stressed (Petrelli, Hansen et. al. 2004), and rightly so, for HCI evaluation has its roots in cognitive psychology and that is a science yet to attain adulthood (Miller 2003).

It would also be instructive to observe how HCI (and accordingly usability evaluation) is taught in University courses worldwide. Saul Greenberg of the University of Calgary remarks that a “fundamental tenant of HCI is that end-users should play an integral role in the design process” and that “performing usability studies in class hammers home the relevance of evaluation”—indeed his course description (Greenberg 1996) is filled with references that directly involve users in class. Interestingly, the course is structured so that “Designing Without the User” is a later event: where lessons learnt from these evaluations are then integrated to try to formulate a theory of user behaviour.

Chan, et. al. exploring issues integrating HCI in master-level MIS programs also stresses the emphasis on users and “empirical testing” and recommends a curriculum that largely ignores heuristics. Faulkner and Culvin in “Integrating HCI and Software Engineering” condenses it well and also explains a crucial difference with software engineering:
“Some HCI practitioners seem to believe that if HCI can be reduced to guides and checklists that anyone can apply to anything, then all will be well. This is tantamount to designing HCI out of software engineering as it is providing rules to be followed without the requisite theoretical under-pinning. Students trained in this way will be chanting mantras and will be woefully unable to deal with problems that have not been solved elsewhere or are not covered by style guides and checklists. Software engineers on the other hand are either keen to embrace these checklists or are unwilling to accept that the age of users having to adapt themselves to systems has gone. Users want systems to work for them and not the other way round.” (Faulkner & Culvin, 2000)

Furthermore, in a study examining how guidelines and patterns might be effective in HCI teaching, Hvannberg et al. “found very little hard evidence” supporting the importance of using patterns or guidelines in HCI teaching. However, they also noted “a desperate need to conduct studies on a suitable scale on the use” of patterns and guidelines in teaching HCI concepts.

There is no doubt that Nielsen’s basic heuristics have stood the test of time as a way to find usability errors. However, as a tool to train a beginning evaluator, they should certainly be supplemented by other evaluation methods.
(1045 words).