Industry case studies are required to assess the suitability of a software technology for the industrial environment. Therefore, this paper presents two case studies carried out in partnership with two companies of an Industrial Park, in order to evaluate and improve the WDP-RT (Web Design Perspectives-Based Inspection - Reading Technique), a usability inspection technique specific to Web applications. The results indicate that the WDP-RT technique fits the development cycles of Web applications in an industrial environment.

Web Applications are interactive, user-based, and hypermedia-based applications in which the user interface plays a key role (18). According to Offutt, usability is one of the three quality criteria that dominate Web development (2). ISO 9241-11 (12) defines usability as “the extent to which a product can be used by specified users to achieve specified goals with effectiveness, efficiency and satisfaction in a specified context of use”. Usability is considered a fundamental factor of Web applications’ quality because users’ acceptance of Web applications seems to depend strictly on the applications’ usability (16).

As in conventional software applications, the quality of Web applications must also be assured. In order to take into consideration the specific aspects of Web applications, some extensions of current evaluation techniques have been proposed (20). However, most Web development companies are not applying these new methods (11). Thus, it is important to investigate why these newly developed technologies are not being transferred to the industry. Possible reasons include a lack of knowledge about these techniques and the presumed cost of their use. It is also important to examine how many of these new proposals are really based on scientific principles and safely transferred to the industry.

Experimental studies must be performed in order to improve the credibility of research in Software Engineering, disseminating to other researchers the knowledge obtained in the execution of the experiment (3)(30). The main goal in executing experimental studies is to construct a knowledge base, grounded in experimentation, that identifies the advantages and the costs of the several proposed techniques and supporting tools in Software Engineering (25).

There are many types of empirical studies that are useful in software engineering. According to Shull et al. (24), the first empirical study to be executed when evaluating a new technology (in this context, the term "technology" is used as a generalization of procedures, tools, techniques and other types of proposals in the Software Engineering area) is a feasibility study, aiming at a comparative evaluation with other technologies. Subsequently, Shull et al. (24) suggest the execution of observational studies to improve the understanding and the cost-effectiveness of the technology, and the execution of case studies to characterize the application of the technology during a real lifecycle. After we have some indication that the proposed technology is effective and fits in a real lifecycle, the next step is to use the new technology in an industrial setting. In vivo studies are studies that involve people in their own workplace and in realistic conditions (30). Case studies conducted in an industrial environment are an important type of in vivo study since they allow the analysis of a specific process in the context of a software lifecycle (24). This kind of study is important to the industry since it allows companies to decide whether or not to adopt a new technology.

Thus, in this paper, we present a new Web usability inspection technique, known as WDP-RT (Web Design Perspectives-Based Inspection - Reading Technique) (8)(10), as well as two case studies conducted to evaluate and improve this proposed technology. The WDP-RT inspection technique was proposed specifically for evaluating the usability of Web applications. This technique is meant to be employed by the software project stakeholders themselves when evaluating the software, allowing them to contribute to the improvement of the software’s usability. The WDP-RT was originally evaluated through controlled experiments. In our previous work (10), we presented the quantitative results obtained from an academic (in vitro) study as well as from one of the case studies performed in the industry. In this paper, we present a new case study conducted in a different company. We also present the qualitative analysis for both case studies, comparing their performance. This qualitative analysis allowed a better understanding of the quantitative results and also helped to improve the WDP-RT.

The contributions of this paper are three-fold: (1) to present the proposed usability inspection technique, the WDP-RT; (2) to describe how the data about our case studies was obtained and analyzed, discussing the results of using the WDP-RT in industrial environments; and (3) to disseminate the knowledge about planning, executing, and analyzing case studies to support the improvement of new technologies in Software Engineering.

This paper is organized as follows. Section 2 presents the background on Usability Evaluation Methods, including some core ideas and important related work. Section 3 presents the WDP-RT technique. In Section 4, we present our case studies and, in Section 5, we detail our qualitative analysis and show how its results helped to improve the WDP-RT. Then, in Section 6, we show the improved version of the WDP-RT. Finally, in Section 7, we present our conclusions.

2 Background

A Usability Evaluation Method (UEM) is a procedure composed of a set of well-defined activities for collecting usage data related to end-user interaction with a software product and/or how the specific properties of this software product contribute to achieving a certain degree of usability (5). General usability evaluation methods can be divided into two broad categories (16): (1) Usability Inspections, which are evaluation methods based on Experts’ Analysis; and (2) Usability Tests, in which empirical methods, observational methods and question techniques can be used to measure usability when users perform tasks on the system. In Usability Tests, usability problems are discovered by means of observation and interaction with users while they perform tasks or provide suggestions about the interface design and its usability. In Usability Inspection, the focus of our research, defects can be discovered by professionals (who may be experts) applying inspection techniques. Usability inspections are naturally less expensive than evaluation methods that involve user participation, since they do not need, besides the inspectors, any special equipment or laboratory (16). Different usability inspection techniques have been developed and used, such as Heuristic Evaluation (17), Cognitive Walkthrough (19), and Usability Based Reading (33).

The Heuristic Evaluation, proposed by Nielsen (17), assists the inspector in usability evaluations using the set of proposed heuristics. These heuristics are a collection of rules that seek to describe common properties of usable interfaces. Initially, inspectors examine the GUIs looking for problems and if one is found it is reported and associated with the heuristics it violated. Afterwards, inspectors can rank problems by their degree of severity.

Polson et al. (19) propose the Cognitive Walkthrough, a method in which a set of inspectors analyze whether a user can make sense of interaction steps as they proceed in a pre-defined task. It assumes that users perform goal-driven explorations in a user interface. During the problem identification phase, the design team answers the following questions at each simulated step: Q1) Will the correct action be made sufficiently evident to the user? Q2) Will the user connect the correct action’s description with what he or she is trying to do? Q3) Will the user interpret the system’s response to the chosen action correctly? Each negative answer for any of the questions must be documented and treated as a usability problem.
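The rule that every negative answer becomes a usability problem can be illustrated with a minimal sketch. The names `StepAnswers` and `find_problems`, as well as the sample steps, are hypothetical and are not part of the method as described by Polson et al. (19); the sketch only shows how the answers to Q1-Q3 could be recorded per simulated step.

```python
from dataclasses import dataclass

# The three questions asked at each simulated step (taken from the text above).
QUESTIONS = {
    "Q1": "Will the correct action be made sufficiently evident to the user?",
    "Q2": "Will the user connect the correct action's description "
          "with what he or she is trying to do?",
    "Q3": "Will the user interpret the system's response "
          "to the chosen action correctly?",
}


@dataclass
class StepAnswers:
    """Answers given by the design team for one simulated step (hypothetical type)."""
    step: str
    answers: dict  # question id -> True (yes) or False (no)


def find_problems(walkthrough):
    """Return one (step, question id) pair for every negative answer."""
    problems = []
    for step_answers in walkthrough:
        for question_id in QUESTIONS:
            if not step_answers.answers[question_id]:
                problems.append((step_answers.step, question_id))
    return problems


# Hypothetical walkthrough of a two-step task.
walkthrough = [
    StepAnswers("open login page", {"Q1": True, "Q2": True, "Q3": True}),
    StepAnswers("submit form", {"Q1": True, "Q2": False, "Q3": False}),
]
problems = find_problems(walkthrough)
```

In this illustration the second step yields two documented problems, one for Q2 and one for Q3, while the first step yields none.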

According to Zhang et al. (33), it is difficult for an inspector to detect all kinds of problems at the same time. For this reason, they proposed a usability inspection technique based on perspectives (Usability Based Reading - UBR). The assumption behind perspective-based inspection techniques is that, thanks to the focus, each inspection session can detect a greater percentage of defects in comparison to other techniques that do not use perspectives. In addition, the combination of different perspectives can detect more defects than the same number of inspection sessions using a general inspection technique (33).

Motivated by the results of Zhang et al. (33), Conte et al. (2) decided to investigate whether the adoption of specific Web design perspectives would make Heuristic Evaluation (17) more efficient for usability inspections of Web Applications. Conte et al. (2) identified design perspectives commonly used in Web development through a systematic literature review:

·Conceptual: represents the conceptual elements that make up the application domain;

·Presentation: represents the characteristics related to application layout and arrangement of interface elements;

·Navigation: represents the navigational space, defining the information access elements and their associations.

Conte et al. (2) proposed the Web Design Perspectives-based Usability Evaluation (WDP) technique, using the Web Design Perspectives as a guide to interpret Nielsen’s heuristics. Hints were provided for each related pair Heuristic x Perspective (HxP) to guide the interpretation of each heuristic from a perspective’s viewpoint.

In light of Web design perspectives, usability is defined as follows:

Usability related to Conceptual Perspective: relates to the clarity and the concision of problem domain’s elements. Under this perspective, the usability is satisfactory if different users easily understand the domain terms, which prevents mistakes caused by ambiguous, inconsistent, or unknown terms.

Usability related to Presentation Perspective: relates to how consistently the information is presented to the user. Under this perspective, the usability is satisfactory if the arrangement of the interface elements allows the user to accomplish his/her tasks effectively, efficiently, and pleasantly.

Usability related to Navigation Perspective: relates to different users’ access to the system functionalities. Under this perspective, the usability is satisfactory if the navigation options allow the user to accomplish his/her tasks effectively, efficiently, and pleasantly.

Table 1 shows the associations between the heuristics and the three Web design perspectives: Presentation (P), Conceptual (C), and Navigation (N) in WDP (3). The correlated pairs of Heuristics x Perspectives are named with the Perspective’s initial (P, C or N) followed by the Heuristic’s number (e.g.: P1 – Perspective: Presentation, Heuristic: 1). For each pair Heuristic x Perspective, we pointed out verification items to guide the use of the heuristic from the perspective’s viewpoint. Figure 1 shows an extract of the WDP technique, presenting examples of verification item hints for the P4 pair.
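The naming convention for the HxP pairs can be sketched as a pair of helper functions. The function names `pair_id` and `parse_pair` are hypothetical, and the sketch assumes heuristic numbers from 1 to 10, matching Nielsen's ten heuristics. It does not encode which pairs are actually related in Table 1.

```python
# Perspective initials as used in the WDP pair identifiers.
PERSPECTIVES = {"P": "Presentation", "C": "Conceptual", "N": "Navigation"}


def pair_id(perspective: str, heuristic: int) -> str:
    """Build an identifier such as 'P1' from a perspective name and heuristic number."""
    initials = {name: initial for initial, name in PERSPECTIVES.items()}
    if perspective not in initials:
        raise ValueError(f"unknown perspective: {perspective}")
    if not 1 <= heuristic <= 10:  # assumption: ten heuristics
        raise ValueError(f"heuristic number out of range: {heuristic}")
    return f"{initials[perspective]}{heuristic}"


def parse_pair(identifier: str):
    """Split an identifier such as 'P4' into (perspective name, heuristic number)."""
    initial, number = identifier[0], int(identifier[1:])
    return PERSPECTIVES[initial], number
```

For example, `pair_id("Presentation", 1)` gives `"P1"`, and `parse_pair("P4")` gives the Presentation perspective with heuristic number 4.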

The results obtained from the experimental studies seemed to indicate the WDP’s feasibility and its possibility to be more effective than, and as efficient as, the Heuristic Evaluation (17). Despite the feasibility of the WDP technique to detect usability defects on web applications, novice inspectors had some difficulties using the technique caused by lack of skills such as experience on usability and inspection, which can affect the outcome of the inspection (3).

3 Web Design Perspectives-based Inspection - Reading Technique

The Web Design Perspectives-Based Inspection - Reading Technique (WDP-RT) (8)(10) is a reading technique based on perspectives for the usability inspection of Web applications. A reading technique (29) is a specific type of inspection technique which contains a series of steps for the individual analysis of a software product to achieve the understanding needed for a particular task.

The WDP-RT is an evolution of the WDP technique and was developed as a reading technique to be employed by inspectors with little knowledge of usability. The main goal is for the project’s stakeholders themselves to use the proposed technique to evaluate the produced software, thus allowing them to contribute to the software’s usability. The development and performance evaluation of the WDP-RT were based on experimentation, using quasi-experiments in an academic environment and case studies in an industrial environment.

In order to increase the coverage of the WDP-RT evaluation, beyond the set of items used in the WDP verification, we analyzed two sets of features to be considered in usability evaluations, the “non-functional usability requirements” (6), and the set of “functional usability features” (14).

The Non-Functional Requirements for any type of system are related to aspects of software, hardware or external factors, conditions or restrictions that determine the desired behavior of the system (21). Ferreira and Leite (6) proposed a taxonomy for the analysis of Web software usability, divided into two categories: requirements related to displaying the information and requirements related to data input. Figure 2 shows the proposed taxonomy.

The set of "Functional Usability Features", proposed by Juristo et al. (14), is the result of extensive research on usability features with significant benefits, according to the literature on usability. In this set, each feature directly related to the usability of the software functionality was incorporated as a functional requirement. Figure 3 shows the functional usability features proposed.

These sets of characteristics were analyzed in detail, considering their relevance to Web applications in general and checking each of their recommendations. An equivalence analysis was performed between the sets of recommendations proposed by Ferreira and Leite (6) and Juristo et al. (14) and the set of items to be verified proposed by the WDP technique (2), based on Nielsen’s heuristics (17). An example of a relationship between the sets can be found among the items proposed by WDP for the heuristic Error Prevention (“evaluate whether the required data in the input data are clearly defined” and “consider whether the interface indicates the correct format for an entry-specific data”), the recommendations of the non-functional usability requirement Error Prevention (“guidance for correct input data”) and the recommendations of the functional usability feature User Input Error Prevention/Correction (“report specific formats and sizes of the data”).

The relations between the sets of usability verification items were also important for separating the recommendations into each of the Web design perspectives. WDP items are already separated by perspectives; consequently, the recommendations of the "non-functional requirements of usability" and "functional usability features" were assigned to the perspective of their associated WDP item. Subsequently, the recommendations that were not related to WDP items were analyzed by checking the types of usability problems that would be found by applying them, and each of them was placed in one or more perspectives. An example of a recommended item not related to the WDP items is the non-functional requirement “Design regardless of monitor resolution”. After checking its recommendations and the usability problems associated with it, an instruction was created in the Presentation perspective for its evaluation. Table 2 presents the usability recommendations unrelated to the WDP that were incorporated in the WDP-RT instructions.

Besides these recommendations, suggestions for international interfaces were also incorporated, based on the non-functional requirements for international interfaces (7). Based on these requirements, the following items were added to the WDP-RT:

Logic flow of information and reading direction; and

Use of visual resources compatible with the local culture (acceptance of images and symbols).

Overall, the recommendations incorporated by WDP-RT are included in the general context of the Nielsen heuristics. For example, an inspector could associate problems of customizing the web application to the heuristic "User Control and Freedom" or associate the lack of guidance for step by step executions to the heuristic "Help and Documentation." However, it is difficult for an inspector, especially a beginner, to get this level of abstraction. Therefore, by including more specific terms, the WDP-RT aims at improving the understanding of the inspector regarding the execution of the usability inspection.

The feasibility of the proposed technique was evaluated using two in vitro studies (8), in which the inspection results of WDP-RT and WDP were compared. In both studies, the results indicated that the WDP-RT helped the inspectors to find a larger number of usability defects than the WDP when used to inspect a Web portal. The first feasibility study also pointed out that the WDP-RT is more effective than, and just as efficient as, the WDP technique.

The version of WDP-RT used in the studies reported below is the second version of the technique (WDP-RT v2), formulated based on the results of the first feasibility study (8). In this version, the instructions of the WDP-RT are grouped into two inspection phases: the instructions for verifying usability in relation to the Presentation and Conceptual perspectives are executed first, followed by the instructions for the Navigation perspective. In Figure 4, we present a small view of WDP-RT v2. The complete reference text for WDP-RT v2 is available in (9).

Experimentation allows researchers to create and maintain a knowledge base in which each item is verified in real-world case studies, making the items more trustworthy (13). Among the several kinds of experimental studies, case studies allow the careful analysis of a specific process in the context of a software lifecycle (24). Thus, the case studies for the evaluation of the WDP-RT were conducted with the main goal of evaluating the adequacy of the technique in the industrial environment.

To evaluate whether the results of applying the WDP-RT in an industrial environment are satisfactory, we used three main indicators:

Efficiency in the Detection Phase: Bolchini and Garzotto (1) state that the efficiency indicates the degree in which a method helps the quick detection of usability problems. It is computed as the ratio between the number of defects and the inspection time.

Effort in Detection and Discrimination Phase: measured by each of the inspectors, it is the main cost factor of executing the inspection;

Learnability degree: indicates how easy it is to learn a new method (1). This indicator was verified using two main factors:

◦Effort spent in the technique training: measured in man-hours, it shows the time spent in training the inspectors to use the technique;

◦Perception of difficulty for applying the technique: the opinion of the inspectors about how hard it was to apply the WDP-RT during the usability inspection.

These indicators allow us to examine whether the technique can be used well when the project’s stakeholders themselves act as the usability inspectors. The Efficiency and Efficacy indicators are usually used in the evaluation of defect detection techniques. However, Efficacy is measured as the ratio between the number of detected defects and the total number of defects, and the total number of usability defects in the inspected applications was not known. Therefore, the Efficacy indicator was not computed.

4.1 First Industry Case Study

The first industry case study was conducted in collaboration with FabriQ Informática Ltda (www.fabriq.com.br), a small Development Company located at the Manaus Industrial Park.

Case Study Object: the Documents Control Module of DOMMA ISO, a document management and flow control software system. In order to facilitate the inspection, two scripts were created (A and B), where script A contained use cases regarding two profiles (responsible for the direction and elaborator) and script B contained use cases related to three other profiles (partial approver, final approver and deployer). Both scripts had an equivalent number of activities. Figure 5 shows the use cases performed in each script by the inspectors.

Participants: the case study had eight professionals as participants (five systems analysts and three support analysts). Each subject filled in a characterization form with questions about his or her knowledge of usability, software evaluation and inspection, and Web development, and about his or her relation to the application module being inspected (analyst, programmer, or tester).

Four subjects performed the activities described in each of the use cases of script A and another four subjects performed the activities of the use cases of script B. To prevent a subject from inspecting a use case that he or she had developed, the teams were divided so that inspectors had as little relationship as possible with the script they were going to execute.

Procedures: the subjects received training on usability and the WDP-RT technique, lasting one hour and fifteen minutes. The defect detection was conducted individually by the inspectors, who had one week to complete it.

Data Collection: seven inspectors (four of script A and three of script B) sent their spreadsheets with annotations and discrepancies.

The researchers involved in this study compiled the discrepancies identified by the inspectors. For each script, a single list containing all identified discrepancies was generated. These discrepancies were then classified as unique or duplicate (a discrepancy identified by more than one inspector) and, finally, the inspector identifier was removed. It is important to note that, although the researchers compiled the list, all identified discrepancies were presented in discrimination meetings so that the whole team of inspectors could judge whether they agreed with the final list.
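The compilation step can be illustrated with a minimal sketch. In the study this work was done manually by the researchers; the function `compile_discrepancies`, the text-normalization rule used to match duplicates, and the sample reports below are assumptions made only for illustration.

```python
from collections import defaultdict


def compile_discrepancies(reports):
    """Merge per-inspector reports into one list for a script.

    reports: dict mapping an inspector identifier to a list of
    discrepancy descriptions (hypothetical input format).
    """
    reporters = defaultdict(set)
    for inspector, descriptions in reports.items():
        for description in descriptions:
            # Assumption: two descriptions denote the same discrepancy when
            # they are equal after lowercasing and collapsing whitespace.
            key = " ".join(description.lower().split())
            reporters[key].add(inspector)
    # The inspector identifiers are dropped from the compiled list.
    return [
        {
            "discrepancy": key,
            "status": "duplicate" if len(inspectors) > 1 else "unique",
        }
        for key, inspectors in sorted(reporters.items())
    ]


# Hypothetical reports from two inspectors of the same script.
reports = {
    "I1": ["Save button has no label", "Error message is unclear"],
    "I2": ["save button has no  label"],
}
compiled = compile_discrepancies(reports)
```

Here the discrepancy reported by both inspectors is marked as duplicate and appears once, and no inspector identifier remains in the output.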

For each script, a discrimination meeting was held with the inspectors, two members of the application’s team, and the researchers of this study. In these meetings, the evaluated interactions were re-executed, allowing the in loco verification of each discrepancy. After the inspectors and team members discussed a discrepancy, it was classified as either a defect or a false positive.

4.1.1 Quantitative Analysis

The first goal of this study was to verify how the WDP-RT assists the inspectors in detecting usability defects. Thus, we observed the number of defects identified by each of the inspectors. Three of the inspectors already had some previous knowledge of usability and inspections, while two of them had already taken part in a usability inspection. Table 3 shows the individual results of the inspection.

The indicator Efficiency in the Detection Phase was computed as the ratio between the number of defects found and the inspection time. A total of 84 defects were found during the inspection. On average, the inspectors spent 1 hour and 32 minutes in detection. Thus, the Efficiency in the Detection Phase is 7.81 defects/hour per inspector.
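The arithmetic behind this indicator can be reproduced approximately with a short sketch. It assumes that the 84 defects are spread over the seven inspectors who returned their spreadsheets and that the per-inspector rate is the total divided by the summed inspection time; the function name `detection_efficiency` is hypothetical.

```python
def detection_efficiency(total_defects, inspectors, mean_minutes):
    """Defects per hour per inspector, assuming equal weighting of inspectors."""
    mean_hours = mean_minutes / 60
    return total_defects / (inspectors * mean_hours)


# First case study: 84 defects, 7 inspectors, 1 hour and 32 minutes on average.
first_study = detection_efficiency(total_defects=84, inspectors=7, mean_minutes=92)
print(round(first_study, 2))
```

Under these assumptions the result is about 7.8 defects/hour per inspector, close to the reported 7.81. The small difference presumably comes from the average time being rounded to whole minutes in the text.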

The second indicator is the Effort in Detection and Discrimination Phase. To compute this indicator, it is important to take into consideration the time spent in the discrimination activity. Two meetings were conducted, one for each script. The first meeting lasted 1 hour, while the second meeting lasted 1 hour and 40 minutes. The inspection cost was therefore low: the average effort of an inspector, adding the detection effort (1 hour and 32 minutes) to the discrimination effort (1 hour and 20 minutes, the mean of the two meetings), was about 2 hours and 52 minutes.
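The effort figure can be checked with a few lines of arithmetic. The sketch assumes that the discrimination effort per inspector is the mean duration of the two meetings; the variable names are chosen only for illustration.

```python
from datetime import timedelta

# Average detection time per inspector, as reported.
detection = timedelta(hours=1, minutes=32)

# Durations of the two discrimination meetings (one per script).
meetings = [timedelta(hours=1), timedelta(hours=1, minutes=40)]

# Assumption: each inspector attends one meeting, so the mean is used.
mean_discrimination = sum(meetings, timedelta()) / len(meetings)

total_effort = detection + mean_discrimination
print(mean_discrimination, total_effort)
```

The mean meeting duration is 1 hour and 20 minutes, and the total is 2 hours and 52 minutes, which matches the reported average effort.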

The third indicator is the Learnability degree. Regarding the Effort spent in the technique training, the time spent in training on the WDP-RT was only 1 hour and 15 minutes per inspector. To capture the Perception of difficulty for applying the technique, an evaluation survey about the technique was conducted, as well as some semi-structured interviews with the inspectors. The analysis of the qualitative data will be shown in Section 5.

4.2 Second Industry Case Study

The second industry case study was conducted in collaboration with Trópico Telecomunicações S.A. (www.tropiconet.com.br), a telecommunications company. This study, besides the original goal of evaluating the use of WDP-RT, also had the goal of evaluating the viability of a new assisting tool for usability inspections, the APIU tool (22).

The inspectors of this experiment used the WDP-RT v2 technique in order to inspect one of the company’s Web applications, and answered an evaluation survey about the technique. In particular, the qualitative data collected about the use of WDP-RT were relevant to its improvement. This experiment will be summarized in this section and the qualitative results will be presented in Section 5.

Case Study Object: a software system used to manage the calls made by the company’s clients. The choice of this application was motivated by its importance for the company and by the fact that the inspectors had no direct relation to this application and had not taken part in its development.

Two scripts (1 and 2) were created, each having four activities. Each activity was briefly described while informing the auxiliary data needed to accomplish it as well as the expected results after its execution.

Participants: six company employees were involved in the study. Four system analysts were selected as inspectors. Another analyst acted as the person responsible for the system and one project manager was responsible for the data collection.

All inspectors filled in a characterization form with questions about their knowledge of usability, software evaluation and inspection, and Web development, and about their relation to the application module being inspected (analyst, programmer, or tester). The participants were divided into two groups of two inspectors (A and B) according to their characterization form answers. This division was necessary due to the procedures required to evaluate the assistant tool.

Procedures: the inspectors’ training consisted of two sessions. The first session lasted 1 hour and covered usability concepts as well as the WDP-RT. The second session, lasting 30 minutes, showed how to use both the APIU assistant tool and the spreadsheet to record the defects.

The inspection was conducted in two parts. In the first part, the inspectors executed the activities in script 1: group A used the assistant tool to record the defects found, while group B used the spreadsheet. In the second part, the inspectors executed the activities in script 2: group A used the spreadsheet, while group B used the assistant tool. The inspectors had three days to execute the individual inspection.
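This crossover arrangement can be written down as a small lookup. The function `recording_medium` and the dictionary `design` are hypothetical names used only to make the assignment of groups, scripts and recording media explicit.

```python
GROUPS = ("A", "B")
SCRIPTS = (1, 2)


def recording_medium(group, script):
    """Return how a group records defects for a given script.

    Group A uses the tool on script 1 and the spreadsheet on script 2;
    group B does the reverse.
    """
    uses_tool = (group == "A") == (script == 1)
    return "APIU tool" if uses_tool else "spreadsheet"


# Full design: every (group, script) combination and its recording medium.
design = {(g, s): recording_medium(g, s) for g in GROUPS for s in SCRIPTS}

# Sessions recorded with the spreadsheet.
spreadsheet_sessions = [key for key, medium in design.items() if medium == "spreadsheet"]
```

Since each group has two inspectors, each script ends up with two spreadsheet-based inspections and two tool-based inspections.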

Data Collection: the project manager compiled the unique list of defects, detecting unique and duplicated defects in both assistant tool and spreadsheets.

The discrimination meeting was conducted by the researcher responsible for the APIU assistant tool and by the analyst responsible for the inspected application. Each discrepancy was evaluated and the analyst responsible for the application classified it as a defect or a false positive.

4.2.1 Quantitative Analysis

Although one of the goals of this study was to evaluate the proposed APIU assistant tool, it was also possible to verify the number of defects found by the inspectors using the WDP-RT technique. Three of the inspectors had developed similar applications; however, they had never participated in the development of the inspected application. Furthermore, only one had previous knowledge about usability inspection. Table 4 presents the individual results of the inspection.

Analyzing the results, we can see that the inspectors found 120 usability problems. Thus, we can say that the WDP-RT accomplishes its goal of allowing inexperienced inspectors to execute a usability inspection. The share of false positives reported by the inspectors was also low (14.89% of the total discrepancies).
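The two figures above can be cross-checked against each other. The sketch below assumes that the 120 problems are the discrepancies classified as defects and that the percentage is the number of false positives divided by all discrepancies. The totals it derives are inferred here for illustration and are not reported in the study.

```python
defects = 120                 # usability problems reported in the text
false_positive_rate = 0.1489  # 14.89% of the total discrepancies

# If rate = false_positives / total, then total = defects / (1 - rate).
estimated_total = defects / (1 - false_positive_rate)

# Inferred whole-number counts (assumption, not data from the paper).
total_discrepancies = round(estimated_total)
false_positives = total_discrepancies - defects

# Recompute the rate from the inferred counts as a consistency check.
recomputed_rate = false_positives / total_discrepancies
print(total_discrepancies, false_positives, round(100 * recomputed_rate, 2))
```

Under these assumptions, whole-number counts that reproduce the reported percentage exist, which suggests the two reported figures are mutually consistent.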

The use of APIU interfered with the productivity of the inspectors, increasing the time spent on inspection (22). As detailed in (22), all inspectors mentioned that, after finding a defect, they had difficulty identifying which item of the technique should be associated with the defects suggested by the tool. The list of defects suggested by APIU was associated with general help terms and not with the items of the technique, causing doubts when registering the discrepancy found.

The inspectors also reported that they had no access to the inspection technique while registering a discrepancy, because it could only be viewed on the main page. This interfered with their work because, whenever they needed to consult the technique, they had to leave the registration page and return to the main page of the application (22).

For this reason, the time-based analysis presented below is restricted to the inspections that used the spreadsheet. The effect can also be seen in Table 4, which shows the longer times recorded by the inspectors when using the tool. Thus, for each script, we analyzed the results of only two inspectors. It is also important to note that the inspectors who executed script 2 had already used the technique in script 1.

Considering only the inspections that used the spreadsheet, the inspectors found 54 defects in an average time of 112.5 minutes, so the indicator Efficiency in the Detection Phase was 7.22 defects/hour per inspector. The indicator Effort in Detection and Discrimination Phase was not computed since the inspectors did not take part in the discrimination meeting.

Regarding the indicator Learnability degree, the effort spent in the technique training for the WDP-RT was only 1 hour per inspector. The perception of difficulty for applying the technique was gathered through evaluation surveys. The qualitative analysis of the data will be shown and discussed in the next section.

5 Qualitative Analysis of the Case Studies

Seaman (23) notes that qualitative data can be used to go beyond the statistics and help explain the reasons behind the hypotheses and relationships. When we use analytic methods to examine qualitative data, we achieve a much deeper understanding of the whole phenomenon.

In order to capture the perception of difficulty for applying the technique, the inspectors of both case studies answered an evaluation survey about the WDP-RT. This evaluation survey was developed using the Technology Acceptance Model (TAM) (4). This model was proposed to help understand the causal relationship between external variables of user acceptance and the actual use of the system, seeking to understand the behavior of the user through his or her perception of usefulness and ease of use (26). Davis (4) found that perceived ease of use and perceived usefulness have a major impact on the use of applications, as suggested by TAM. The TAM has been widely applied to a large set of new technologies (31), including inspection techniques (15).

Besides the questions about ease of use and usefulness of the technique, the survey contained specific questions regarding the WDP-RT, such as the ease of understanding the instructions and perspectives. The questions in the survey were answered on a 6-point scale: Totally Agree, Almost Totally Agree, Partially Agree, Partially Disagree, Almost Totally Disagree, and Totally Disagree. We decided not to use a neutral value, since it provides little information about the direction in which the inspector is most inclined (15).
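A tally over this scale can be sketched as follows. The function names `tally` and `agreement_count` and the sample responses are hypothetical; the sketch only assumes that the first three levels count as agreement and the last three as disagreement, which follows from the absence of a neutral value.

```python
from collections import Counter

# The six levels of the survey scale, from strongest agreement to strongest disagreement.
SCALE = [
    "Totally Agree",
    "Almost Totally Agree",
    "Partially Agree",
    "Partially Disagree",
    "Almost Totally Disagree",
    "Totally Disagree",
]


def tally(responses):
    """Count the answers to one question, listing every level of the scale."""
    unknown = set(responses) - set(SCALE)
    if unknown:
        raise ValueError(f"answers outside the scale: {sorted(unknown)}")
    counts = Counter(responses)
    return {level: counts.get(level, 0) for level in SCALE}


def agreement_count(responses):
    """Number of answers on the agreeing half of the scale."""
    return sum(1 for answer in responses if SCALE.index(answer) < 3)


# Hypothetical answers from four inspectors to a single question.
sample = ["Totally Agree", "Partially Agree", "Partially Disagree", "Totally Agree"]
counts = tally(sample)
agreeing = agreement_count(sample)
```

Because the scale has no midpoint, every answer falls on one side, so the agreeing and disagreeing counts always add up to the number of respondents.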

The inspectors of the first case study (FabriQ) were surveyed through individual semi-structured interviews aimed at collecting information on the applicability of the technique (difficulty in using the technique, how the technique was used, and the order of the instructions and perspectives) as well as their general opinion about the inspection. The inspectors of the second case study (Trópico) answered the survey using forms. Table 5 shows the number of answers for each question in the survey.

Regarding the usefulness of the technique, ten inspectors agreed with all statements in this criterion. Only one inspector, from the first case study, partially disagreed with three statements in this item, as depicted in Table 5. During the interview with this inspector, it was possible to note that his difficulty in applying the technique affected his perception of its usefulness.

Regarding the ease of use, seven inspectors considered the technique easy to use and useful for the inspection. The other four inspectors reported difficulties in using the technique by partially disagreeing with at least one of the questions in this criterion, as shown in Table 5.

Regarding the specific question about the ease of understanding the Web design perspectives, nine inspectors considered the perspectives easy to understand, while the other two inspectors partially disagreed (Table 5). Regarding the ease of understanding the instructions of the WDP-RT, ten inspectors considered the instructions easy to understand and written in simple language, as also shown in Table 5.

Besides the analysis based on the TAM, we performed a specific analysis of the qualitative data collected in the surveys and interviews, based on procedures of the Grounded Theory (GT) method (27). The qualitative data were analyzed using a subset of the phases of the coding process suggested by (27) for the GT method: open coding and axial coding. The purpose of this analysis was to understand the inspectors' perception of the experience of using the WDP-RT. Since we did not intend to create a theory about this, we did not carry out selective coding (the third step of the GT method (27)). The open and axial coding steps were sufficient to understand the difficulties encountered by the inspectors.

The codes were created based on quotes from the inspectors. Twenty-seven codes were identified and grouped into three categories: (i) Positive aspects of the WDP-RT, (ii) Difficulties, and (iii) Suggestions for Improvement.

The category “Positive aspects of the WDP-RT” represents the codes related to the aid of the WDP-RT in the execution of the usability inspections. Eight codes were associated with this category (Figure 6) and are related to aspects of usefulness and ease of use of the WDP-RT.

The code “WDP-RT helps to find defects more easily” was mentioned by four inspectors, who said that usability problems that might otherwise have gone undetected are found more easily with the help of the technique, since it outlines what must be checked in applications. Four inspectors cited the code “WDP-RT helps to improve my knowledge of usability”, and they presume this will help in developing systems with improved usability.

The codes “Technique is easy to use”, cited by five inspectors, “The instructions of the technique are clear”, cited by four inspectors, and other codes of this category demonstrate the ease of use of the WDP-RT and show that it helps in executing the inspection.

The category “Difficulties” presents the codes related to the problems found by inspectors in carrying out the inspection using the WDP-RT (Figure 7). Among the codes associated with this category, the code “Extensive technique” was mentioned by four inspectors and is associated with four other codes: “There are many details to take into consideration”, “It takes much practice since it has a lot of items”, “Tedious technique”, and “Hard to memorize”. Among these codes, the first two are related to the quantity of items to be checked in an inspection.

As in the in vitro study of the WDP-RT (8), two inspectors found the technique tedious. Two other inspectors said the WDP-RT is difficult to memorize since it is a long technique. The analysis of the surveys and interviews revealed that only five inspectors followed the technique linearly, at least early in the process. The other six inspectors read the technique and tried to memorize it, then looked for usability problems in an ad hoc manner and tried to associate them with the instructions of the WDP-RT (code “Tried to memorize the technique and carried out an ad hoc inspection”). This made it difficult to learn and use the technique, since the instructions of the WDP-RT represent a sequence of steps for carrying out the usability inspection.

The codes “Difficulty in understanding some instructions” and “The Presentation and Conceptualization perspectives can confuse the inspector” point to problems in understanding the technique. This is probably because of the similarity between some instructions (code “Some statements are ambiguous”), cited by four inspectors. In addition to the examples given in the survey, we verified that some of the discrepancies reported in the spreadsheets were associated with more than one statement, which may be an indication of ambiguity among the instructions.

Another difficulty pointed out by three inspectors was the association between the defects and the HxP pairs (the relationship between Nielsen’s Heuristics and the Web Design Perspectives) of the WDP (code “HxP Pairs association of the WDP”), which they had seen in the training for the WDP-RT. We noted that this caused some problems in learning the WDP-RT.

The category “Suggestions for Improvement” (Figure 8) shows the codes related to proposed improvements to the WDP-RT. Among the nine codes of this category, we highlight the code “Summarize the technique”, mentioned by two inspectors and directly related to the difficulty expressed in the code “Extensive technique”. The inspectors who made this suggestion pointed out that the technique should be summarized without any loss of information.

Two inspectors suggested that the Navigation perspective should be the first step of the usability inspection (code “Inspect Navigation first”). According to one of these inspectors, this step is less tedious and, after its execution, the inspector would have an overview of the system, facilitating the inspection of the remaining perspectives. This inspector also suggested that the last part of the inspection should be the data input (code “Inspect input lastly”), since it is the most detailed and requires the most testing.

One inspector suggested adding more examples to the technique (code “Include more examples”). Another inspector suggested that the instructions relating to the Presentation and Conceptualization perspectives should be separated (code “Separately inspect Conceptualization and Presentation”). However, this suggestion contradicts a positive aspect mentioned by other inspectors (code “Parallel Inspection of the Presentation and Conceptualization perspectives makes inspection more agile”). The other codes associated with this category are related to specific instructions of the WDP-RT.

Based on the difficulties found and the suggestions proposed, the WDP-RT was reviewed, focusing mainly on finding possible ambiguities among the instructions of the technique, with the goal of summarizing it and making its instructions easier to understand. For this analysis, we considered the suggestions made by the inspectors in the technique surveys as well as the inspectors’ discrepancy spreadsheets (for the analysis of the discrepancies associated with more than one instruction).
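
One way to look for candidate ambiguities in discrepancy spreadsheets is to count how often two instructions are reported together for the same discrepancy. The sketch below is an illustration under stated assumptions, not the procedure actually used in the review: the function name, the discrepancy identifiers, and the sample data are all hypothetical, and only the instruction labels follow the naming style of the technique.

```python
from collections import Counter
from itertools import combinations


def co_reported_instructions(discrepancies):
    """Count how often each pair of instructions is attached to the same discrepancy.

    `discrepancies` maps a discrepancy id to the list of instruction labels
    the inspector associated with it (hypothetical data layout).
    """
    pairs = Counter()
    for instructions in discrepancies.values():
        unique = sorted(set(instructions))      # ignore duplicates and ordering
        for pair in combinations(unique, 2):    # every unordered pair of labels
            pairs[pair] += 1
    return pairs


# Hypothetical spreadsheet rows (illustration only, not data from the case studies).
sample = {
    "D01": ["6C", "9A"],
    "D02": ["9A", "9B"],
    "D03": ["1B"],
    "D04": ["9A", "6C"],
}
overlap = co_reported_instructions(sample)
for pair, count in overlap.most_common():
    print(pair, count)
```

Pairs with high counts are only candidates for review; whether two instructions are truly ambiguous still requires reading their text.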

This analysis revealed that some specific items were already covered by broader statements, such as instructions 6C, 9A, and 9B. After a careful review, it was decided to group these items into a new instruction 8 (Figure 9).

The description of the purpose of each phase of the inspection was summarized, and the text of some instructions was revised. The instructions were standardized by using the construction “check if” and changed, in most cases, into affirmative sentences. The training manual was also revised by removing references to the HxP pairs of the WDP and adding more examples related to the instructions of the technique. The suggestions regarding changes in the order of the inspection require further research before they can be adopted.

These changes resulted in the third version of the technique (WDP-RT v3). This version consists of eight sets of instructions, separated into two evaluation phases. The first phase corresponds to instructions 1 through 6, in which both the Presentation and Conceptualization perspectives of the usability inspection are checked. The second phase evaluates the Navigation perspective (instructions 7 and 8).

Figure 10 shows an example of a usability problem related to instruction 1B of the WDP-RT (Any error message should inform the problems occurred and help the user to solve them, for example, indicating the procedure to be performed or an alternative solution. It may still be expressed with concepts of the problem domain and according to the user profile). In this figure, note that the system displays a generic error message to the user ("Fill out the form correctly"), without stating which problem occurred or helping the user solve it.

This paper presented two industry case studies aimed at evaluating and improving a reading technique for usability inspection of Web applications, the WDP-RT. Based on the quantitative and qualitative data collected in these studies, the inspection technique was revised and improved. The case studies show that the WDP-RT facilitates the usability inspection, and the qualitative and quantitative results are a good indication that it is viable for the project team members themselves to act as inspectors in a usability evaluation using this technique.

Although significant, the findings of these case studies are based on field investigation of only two organisations, which could threaten their external validity. The active involvement of the first author in the training sessions of each usability inspection may constitute another limitation. According to Wohlin et al. (32), experimenters can bias the results of a study both consciously and unconsciously based on what they expect from the experiment. We tried to reduce this threat by involving people from the companies who had no expectations regarding the WDP-RT results. In order to reduce biased opinions, none of the authors classified any of the discrepancies during the discrimination meetings.

In the traditional software development scenario, it is possible to see substantial increases in quality as well as reductions in defects due to the adoption of inspections and other review techniques (28). Also, the results presented in (2), (10), and (16) show that inspections also improve the quality of Web applications regarding usability. Inspections improve productivity since the defects are detected when they are easier and cheaper to fix. Thus, with these results, we also aim to encourage the Web software industry to execute usability inspections more often.

According to the inspectors who participated in the industry case studies, the WDP-RT helped them to improve their knowledge of usability, which is expected to contribute to the development of future applications with improved usability. Also, according to the inspectors, increased usability will increase their products’ credibility, reduce calls to customer service, and reduce the number of patches in the applications.

As future work, we will carry out an observational study of the WDP-RT. This study aims at collecting data on how inspectors apply the technique, and we expect to use it to verify the proposed changes in the order of the instructions of the technique. Further studies will also be carried out in industry to assess the adequacy of the WDP-RT to the industrial environment, especially for new inspectors.

Acknowledgements

We want to thank all participants of the case studies, especially FabriQ Informática Ltda., Trópico Telecomunicações S.A. and PRODAM S.A. We also thank CNPq (575808/2008-0 and 483125/2010-5) and FAPEAM for the financial support.