Dissecting knowledge, guessing, and blunder in multiple choice assessments
Multiple-choice (MC) test results are inherently probabilistic outcomes: correct responses reflect a combination of knowledge and guessing, while incorrect responses additionally reflect blunder, a confidently committed mistake. To objectively resolve knowledge from responses in an MC test, we evaluated probabilistic models that explicitly account for knowledge, guessing, and blunder using eight assessments (>9,000 responses) from an undergraduate biotechnology curriculum. A Bayesian implementation of the models, aimed at assessing their robustness to prior beliefs about examinee knowledge, showed that explicit estimators of knowledge are markedly sensitive to prior beliefs when scores are the sole input. To overcome this limitation, we examined self-ranked confidence as a proxy indicator of knowledge. For our test set, three levels of confidence resolved test performance. Responses rated as least confident were correct more frequently than expected from random selection, reflecting partial knowledge, but this effect was balanced by blunder among the most confident responses. By translating evidence-based guessing and blunder rates into pass marks that statistically qualify a desired level of examinee knowledge, our approach finds practical utility in test analysis and design.
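The idea of translating guessing and blunder rates into a pass mark can be sketched with a simple binomial model. The code below is an illustrative reconstruction, not the authors' implementation: it assumes a response model in which an examinee answers known items correctly unless a blunder occurs, and guesses uniformly on unknown items, then finds the smallest score that is statistically unlikely (at level alpha) for an examinee whose knowledge falls below a target level. All function names and defaults are hypothetical.

```python
from math import comb

def p_correct(k, blunder=0.0, n_options=4):
    """P(correct response) under the assumed model: a fraction k of items
    is known (answered correctly unless blundered); the rest are guessed
    uniformly among n_options choices."""
    return k * (1.0 - blunder) + (1.0 - k) / n_options

def binom_sf(s, n, p):
    """Survival function P(X >= s) for X ~ Binomial(n, p)."""
    return sum(comb(n, x) * p**x * (1.0 - p)**(n - x) for x in range(s, n + 1))

def pass_mark(n_items, target_k, blunder=0.0, n_options=4, alpha=0.05):
    """Smallest score s such that an examinee with knowledge target_k
    reaches s or more with probability < alpha, i.e. the score cannot
    plausibly be explained by that (or any lower) knowledge level."""
    p = p_correct(target_k, blunder, n_options)
    for s in range(n_items + 1):
        if binom_sf(s, n_items, p) < alpha:
            return s
    return n_items + 1  # no attainable score is sufficiently unlikely
```

For example, `pass_mark(40, 0.0)` returns the score on a 40-item, four-option test that a pure guesser would reach less than 5% of the time; raising `target_k` raises the mark accordingly.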
Item and Test Characteristic Curves of Rank-2PL Models for Multidimensional Forced-Choice Questionnaires
A process is proposed to create the one-dimensional expected item characteristic curve (ICC) and test characteristic curve (TCC) for each trait in multidimensional forced-choice questionnaires, based on the Rank-2PL (two-parameter logistic) item response theory models for forced-choice items with two or three statements. Examples of ICC and TCC plots from a real pair-form and a real triplet-form questionnaire are provided; the plots appear to help identify misfitting trait scores at the item and test levels. TCC plots in which negative statements are recoded as positive are also proposed.
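For readers unfamiliar with the building blocks, the standard unidimensional 2PL curves that such expected ICC/TCC plots reduce to can be sketched as follows. This is not the Rank-2PL forced-choice model itself (which is multidimensional and statement-pair based), only the generic 2PL form, with a the discrimination and b the difficulty parameter; the TCC is the expected total score, i.e. the sum of the item ICCs.

```python
from math import exp

def icc_2pl(theta, a, b):
    """2PL item characteristic curve: P(correct | trait level theta),
    with discrimination a and difficulty b."""
    return 1.0 / (1.0 + exp(-a * (theta - b)))

def tcc(theta, items):
    """Test characteristic curve: expected total score at theta,
    the sum of ICCs over (a, b) item-parameter pairs."""
    return sum(icc_2pl(theta, a, b) for a, b in items)
```

Plotting `tcc` over a grid of theta values gives the test-level curve against which observed trait scores can be compared for misfit.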
