National Center for Research on Evaluation, Standards, and Student Testing (CRESST)

It is a well-known problem in testing the fit of models to multinomial data that the full underlying contingency table will inevitably be sparse for tests of reasonable length and for realistic sample sizes. Under such conditions, full-information test statistics such as Pearson's $X^2$ and the likelihood ratio statistic $G^2$ are poorly calibrated. Limited-information fit statistics, such as the $M_2$ statistic proposed by Maydeu-Olivares and Joe (2006), have been suggested as possible alternatives to full-information tests in various modeling contexts, including item response theory models. In this study, we considered the application of $M_2$ to the goodness-of-fit testing of diagnostic classification models (e.g., Rupp, Templin, & Henson, 2010). Through a series of simulation studies, we found that $M_2$ is well calibrated across a range of diagnostic model structures. The sensitivity of the test statistic to various types of model misspecification was also examined. $M_2$ was found to be sensitive to certain misspecifications of the item model (e.g., fitting disjunctive models to data generated according to a conjunctive model), errors in the Q-matrix, and local item dependence due to unmodeled testlet effects. On the other hand, $M_2$ was largely insensitive to misspecifications in the distribution of higher-order latent dimensions and to the specification of an extraneous attribute. To complement the analyses of overall model goodness-of-fit, we investigated the potential utility of the Chen and Thissen (1997) local dependence statistic $X^2_{LD}$ for characterizing sources of misfit. The $X^2_{LD}$ statistic was found to be slightly conservative (with Type I error rates consistently below the nominal level) but still useful in identifying potential misspecifications.
Patterns of local dependence arising due to model misspecifications are illustrated. Finally, we used the $M_2$ and $X^2_{LD}$ statistics to evaluate a diagnostic model fit to data from the Trends in International Mathematics and Science Study (TIMSS), drawing upon analyses previously conducted by Lee, Park, and Taylan (2011).
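As a rough illustration of the pairwise idea behind a local dependence index, the sketch below computes a Pearson-type $X^2$ for the two-way table of a single pair of dichotomous items. It assumes the model-implied joint probabilities for the pair have already been obtained from a fitted model; the function name `pairwise_x2` and the example numbers are hypothetical, and any standardization applied in practice is not shown. It is not the implementation used in this study.

```python
def pairwise_x2(observed_counts, expected_probs):
    """Pearson-type X^2 for one item pair (hypothetical helper).

    observed_counts: 2x2 nested list of observed response frequencies.
    expected_probs:  2x2 nested list of model-implied joint probabilities
                     (assumed to come from an already fitted model).
    """
    n = sum(sum(row) for row in observed_counts)
    total_p = sum(sum(row) for row in expected_probs)
    x2 = 0.0
    for obs_row, p_row in zip(observed_counts, expected_probs):
        for obs, p in zip(obs_row, p_row):
            # Expected frequency under the model, with probabilities
            # renormalized in case they do not sum exactly to one.
            expected = n * p / total_p
            x2 += (obs - expected) ** 2 / expected
    return x2


# Made-up counts for two items that agree more often than a model
# implying independent, equally likely responses would predict.
observed = [[40, 10], [10, 40]]
model_implied = [[0.25, 0.25], [0.25, 0.25]]
print(pairwise_x2(observed, model_implied))
```

A large value for a given pair suggests that the fitted model does not reproduce the association between those two items, which is the kind of pattern the abstract describes inspecting to locate sources of misfit.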