A multiple-choice certification exam may have acceptable reliability, yet it can likely be improved where it matters most: the precision of candidate ability estimates near the cut score. The article below describes how.

Ross Brown

Manager, Computer Based Testing and Analysis

Targeting Multiple Choice Examinations for Accurate Outcomes

The purpose of certification examinations is to make accurate pass or fail decisions about candidates. The more information available about the candidates who are close to the cut score, the more accurate the pass/fail decisions are. Under the Rasch model, an item is most informative about a candidate whose ability equals the item's difficulty, so including more items with difficulties close to the cut score yields the most information about the candidates near it. This is often called test targeting.

On most tests, the item difficulties range from very easy to very difficult. For test targeting, the items that are too easy or too difficult are replaced with items that have difficulties close to the pass point. In practice, these are the items answered correctly by between 30% and 80% of candidates. The Wright Map in Figure 1, created by the Winsteps program (Linacre, 1989), is ideal for tracking the distribution of item difficulties relative to the pass point. Both candidate measures and item difficulties are on the same scale, so the distribution of candidate measures can be compared to the distribution of item difficulties. In the Wright Map below, the items within the bracket on the right are the targeted items.
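As a rough illustration of the screening rule, the sketch below flags items whose proportion correct falls inside the 30% to 80% band. The item identifiers and proportions are invented for the example, and the function name `targeted_items` is hypothetical; an operational review would work from calibrated item difficulties rather than raw proportions alone.

```python
def targeted_items(p_values, low=0.30, high=0.80):
    """Return the ids of items whose proportion correct lies in [low, high]."""
    return [item for item, p in p_values.items() if low <= p <= high]

# Hypothetical proportions of candidates answering each item correctly.
p_values = {
    "item01": 0.95,  # too easy
    "item02": 0.72,
    "item03": 0.55,
    "item04": 0.31,
    "item05": 0.18,  # too difficult
}

keep = targeted_items(p_values)
replace = [item for item in p_values if item not in keep]
print("targeted:", keep)
print("candidates for replacement:", replace)
```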

There are costs to targeting tests. Item writers often find it difficult to write items that perform at this level, and as tests become more targeted, they appear more difficult to candidates. However, common-item test equating allows more difficult test forms to be developed while maintaining the pass point as an absolute standard. The table shows two examples of tests that were made more targeted. The equating difficulty is the average item difficulty of the test expressed in scaled scores. When more items fall in the target range of .30 to .80 proportion correct, the test becomes more difficult, but the difference in difficulty is accounted for in the analysis. The passing standard is maintained on the more difficult exam forms, while the percent correct necessary to pass decreases as the test becomes more difficult. Thus the test is psychometrically better and does not penalize candidates.
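Why the required percent correct falls while the standard stays fixed can be seen in a small sketch. It assumes both forms are already calibrated on a common Rasch scale, so the cut score is a single ability value, and it takes the raw passing score on each form to be the expected score of a candidate exactly at the cut. The cut score and item difficulties are hypothetical; this is not the equating procedure used for the examinations in the table.

```python
import math

def expected_percent_correct(theta, difficulties):
    """Expected percent correct at ability theta under the Rasch model."""
    probs = [1.0 / (1.0 + math.exp(-(theta - b))) for b in difficulties]
    return 100.0 * sum(probs) / len(probs)

# Hypothetical cut score and item difficulties on a common logit scale.
cut = 0.5
easier_form = [-2.0, -1.5, -1.0, -0.5, 0.0, 0.5, 1.0]
harder_form = [-0.5, 0.0, 0.25, 0.5, 0.75, 1.0, 1.5]

pct_easier = expected_percent_correct(cut, easier_form)
pct_harder = expected_percent_correct(cut, harder_form)
print(f"percent correct at cut, easier form: {pct_easier:.1f}")
print(f"percent correct at cut, harder form: {pct_harder:.1f}")
```

The same ability at the cut corresponds to a lower percent correct on the harder, more targeted form, so candidates are held to the same standard even though the raw passing score drops.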