This talk reviews results from a broader simulation study exploring the effects that poorly functioning items have on test performance (i.e., optimal test length, estimate-to-total correlations, estimate-to-outcome correlations, and false-positive and false-negative rates) across several testing frameworks (i.e., unweighted summed scores, α-optimization, factor analysis, and Item Response Theory (IRT)).

Findings suggest that all simulated methods perform adequately. However, the testing frameworks differ in their ability to identify poorly functioning items: IRT and factor analysis perform best when tests are longer (e.g., 40 to 80 items), whereas α-optimization performs best when tests are shorter (e.g., 20 items). Finally, IRT typically generates substantially shorter tests without compromising accuracy on any assessed outcome. This may have implications for test development and construction contexts that require parsimonious scales (e.g., short forms and clinical applications where administration time is limited).
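To make the α-optimization framework concrete, here is a minimal sketch of how such a procedure is commonly implemented: greedily dropping the item whose removal most increases Cronbach's α, and stopping once no single removal helps. This is an illustrative assumption about the procedure, not the study's actual code; the simulated data, function names, and stopping rule are all hypothetical.

```python
import numpy as np

def cronbach_alpha(items: np.ndarray) -> float:
    """Cronbach's alpha for an (n_respondents, n_items) score matrix."""
    k = items.shape[1]
    item_vars = items.var(axis=0, ddof=1).sum()
    total_var = items.sum(axis=1).var(ddof=1)
    return k / (k - 1) * (1 - item_vars / total_var)

def alpha_optimize(items: np.ndarray, min_items: int = 3):
    """Greedily remove items while each removal increases alpha."""
    kept = list(range(items.shape[1]))
    alpha = cronbach_alpha(items[:, kept])
    while len(kept) > min_items:
        # alpha obtained by dropping each remaining item in turn
        trials = [(cronbach_alpha(items[:, [j for j in kept if j != i]]), i)
                  for i in kept]
        best_alpha, worst_item = max(trials)
        if best_alpha <= alpha:
            break  # no single removal improves alpha; stop
        kept.remove(worst_item)
        alpha = best_alpha
    return kept, alpha

# Hypothetical simulated data: 500 respondents, 9 items loading on one
# latent trait, plus one pure-noise "poorly functioning" item (index 9).
rng = np.random.default_rng(0)
theta = rng.normal(size=(500, 1))
good = theta + rng.normal(scale=0.8, size=(500, 9))
noise = rng.normal(size=(500, 1))
data = np.hstack([good, noise])

kept, alpha = alpha_optimize(data)
print(kept, round(alpha, 3))  # the noise item (index 9) is dropped
```

In this toy setup the procedure removes only the uncorrelated item, since dropping any item that loads on the latent trait would lower α.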