Identifying "Bad" Respondents

MaxDiff displays an average "fit statistic" (RLH) to the screen during HB estimation and also writes to file the individualized fit statistic along with each respondent's scores on the items. This fit statistic ranges from a low of 0 to a high of 1 and characterizes the internal consistency for each respondent.

It is natural to ask what minimum fit statistic distinguishes thoughtful responses from purely random data. The fit statistic can tell you with a high degree of confidence whether a respondent has provided purely random answers and should be discarded.

The magnitude of the fit statistic depends on how many items were shown in each set. If four items are shown in each set, then any random set of scores should predict a respondent's answers correctly with 25% likelihood (fit statistic=.25). If two items are shown per set (the paired comparison approach), then the expected likelihood given random data is 50% (fit statistic=.5). Thus, as a general rule, the fit statistic should be at least 1/c, where c is the number of items displayed per set. We hope that real respondents perform considerably better than chance. But because the score estimation algorithm used in MaxDiff (HB) attempts to fit each respondent's choices, the fit we observe even from random data is almost always above the chance rate.
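The 1/c chance rate can be checked with a short Python sketch. It assumes that the fit statistic (RLH) is the geometric mean of the probabilities that the scores assign to the choices actually made, and that those probabilities follow a multinomial logit rule. The function names and the n_sets parameter are illustrative only and are not part of the MaxDiff software.

```python
import math

def choice_probabilities(scores):
    """Multinomial logit probabilities for the items shown in one set."""
    exps = [math.exp(s) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def root_likelihood(chosen_probs):
    """Geometric mean of the probabilities assigned to the observed choices."""
    log_sum = sum(math.log(p) for p in chosen_probs)
    return math.exp(log_sum / len(chosen_probs))

def chance_fit(items_per_set, n_sets=10):
    """Fit obtained when the scores carry no information (all scores equal)."""
    # With equal scores, every item shown in a set is equally likely to be picked.
    probs = choice_probabilities([0.0] * items_per_set)
    # Whichever item is chosen in each set, its probability is the same.
    return root_likelihood([probs[0]] * n_sets)

for c in (2, 3, 4, 5):
    print(c, round(chance_fit(c), 4))
```

With uninformative scores, the result equals 1/c regardless of how many sets are answered. The higher fit seen for random data after HB estimation arises because the estimated scores adapt to each respondent's answers, which this sketch does not model.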

The table below displays minimum fit statistics to achieve 95% correct classification of random responders. Respondents with fit statistics below these cutoffs are either responding randomly or answering with a great deal of error. In developing this table, we assumed a 20-item study wherein respondents are shown each item either three or four times across all sets (sparser designs than this make it difficult to distinguish between random and good responders).

There is only a 5% likelihood that a random responder can achieve a fit statistic better than these cutoff values. In other words, if a respondent truly is a random responder, you will be 95% successful in identifying them for exclusion by applying the cutoff values in the table above. Please recognize, however, that this approach to developing cutoff rules doesn’t speak to how likely it is that a respondent answering with a great deal of error (yet not randomly) would be falsely identified as a random responder. In other words, you cannot say that a respondent achieving a fit statistic below these values is 95% likely to be a completely random responder.
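The distinction between these two probabilities can be illustrated with Bayes' rule. In the Python sketch below, the share of random responders in the sample and the rate at which non-random respondents fall below the cutoff are purely hypothetical inputs chosen for illustration. They are not estimates from any study, and the function name is likewise an assumption.

```python
def prob_random_given_flagged(share_random, flag_rate_random, flag_rate_nonrandom):
    """Probability that a respondent below the cutoff is truly a random responder."""
    flagged_random = share_random * flag_rate_random
    flagged_nonrandom = (1 - share_random) * flag_rate_nonrandom
    return flagged_random / (flagged_random + flagged_nonrandom)

# Hypothetical inputs: 5% of the sample answers randomly, the cutoff catches
# 95% of them, and 10% of the remaining (noisy but non-random) respondents
# also fall below the cutoff.
posterior = prob_random_given_flagged(0.05, 0.95, 0.10)
print(round(posterior, 3))
```

Under these assumed inputs, only about one in three flagged respondents is actually answering randomly, even though 95% of random responders are caught.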

Technical Notes: We simulated 1000 respondents answering randomly to each questionnaire, then estimated the scores using HB: prior variance=1, d.f.=5. Both "best" and "worst" answers were simulated for each respondent (with the exception of the two-item case). For each simulated data set, we sorted the respondents' fit statistics from highest to lowest and recorded the 95th percentile fit (the point below which 95% of the data fell). If asking only "bests," because the amount of information for each respondent is reduced, the Suggested Minimum Fit would be higher.
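The percentile step described in these notes can be sketched in Python as follows. The sketch assumes you already have the fit statistics from an HB run on simulated random respondents. It does not perform HB estimation, and it uses a nearest-rank percentile, which may differ slightly from the method used to build the table. The function names and all numbers shown are hypothetical placeholders, not recommended cutoffs.

```python
import math

def percentile_cutoff(fits, pct=95):
    """Nearest-rank percentile: smallest value with at least pct% of fits at or below it."""
    ordered = sorted(fits)
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[max(rank, 1) - 1]

def flag_respondents(real_fits, cutoff):
    """Return the ids of respondents whose fit statistic falls below the cutoff."""
    return [rid for rid, fit in real_fits.items() if fit < cutoff]

# Placeholder values standing in for fit statistics of simulated random
# respondents; real values would come from HB estimation on random answers.
placeholder_random_fits = [i / 100 for i in range(1, 101)]
cutoff = percentile_cutoff(placeholder_random_fits)

# Placeholder fit statistics for respondents in a real study.
placeholder_real_fits = {"resp_1": 0.31, "resp_2": 0.97}
print(cutoff, flag_respondents(placeholder_real_fits, cutoff))
```

In practice, the simulated questionnaire should match your own design (number of items, items per set, and number of sets), because the cutoff depends on how much information each respondent provides.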

Additional Notes: The table above was created using standard MaxDiff, not anchored MaxDiff. Anchored MaxDiff would lead to different norms regarding fit for identifying "bad" respondents. If using anchored MaxDiff, you can estimate the results using HB without the anchor, so that you may use the recommendations above.