Application of Conditional Means for Diagnostic Scoring
Keywords: model-based diagnostic assessments; diagnostic scoring; complex sum scores

Abstract
In educational assessment, demand for diagnostic information
from test results has prompted the development of model-based
diagnostic assessments. To determine student mastery of specific skills,
a number of scoring approaches, including subscore reporting and
probabilistic scoring solutions, have been developed to score diagnostic
assessments. Although each approach has a unique set of limitations,
these approaches are, nevertheless, often used in diagnostic scoring,
whereas an alternative approach, Complex Sum Scores (CSS), has not
received much attention yet. With the process of developing modelbased diagnostic assessments becoming increasingly complex, we revisit
the CSS and demonstrate two applications of the CSS in the
development of diagnostic assessments. Two applications include: (a)
illustrating and validating skills within the model, and (b) partial
mastery scoring using model-based distractors. By demonstrating the
two applications, we aim to show how model-based diagnostic
assessments can be developed and scored using the CSS scoring
approach, the results of which can be used by teachers to inform
teaching and learning.
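
The article's CSS procedure itself is not reproduced on this page, but the core idea named in the title can be stated briefly: an examinee's estimate for skill k is the conditional mean of the mastery indicator given an observed (complex) sum score, E[alpha_k | S = s] = P(alpha_k = 1 | S = s). The Python fragment below is a minimal sketch of that conditioning under an assumed one-skill, DINA-like simulation; the variable names and parameter values (n_examinees, the 0.85/0.25 correct-response rates, and so on) are illustrative assumptions, not the authors' specification.

```python
import numpy as np

# Minimal sketch of conditional-mean scoring for one skill.
# Assumption: a simulated one-skill, DINA-like setup in which masters
# answer each item correctly with probability 0.85 (slip = 0.15) and
# non-masters with probability 0.25 (guess = 0.25).
rng = np.random.default_rng(0)
n_examinees, n_items = 5000, 6

mastery = rng.random(n_examinees) < 0.5             # latent mastery (0/1)
p_correct = np.where(mastery[:, None], 0.85, 0.25)  # per-item success rates
responses = rng.random((n_examinees, n_items)) < p_correct
sum_score = responses.sum(axis=1)                   # observed sum score S

# Conditional mean: estimate P(mastery = 1 | S = s) at each score point.
for s in range(n_items + 1):
    at_s = sum_score == s
    if at_s.any():
        print(f"S = {s}: P(mastery | S) ~ {mastery[at_s].mean():.2f}")
```

In the approach the abstract describes, the simple sum score would presumably be replaced by a model-based complex sum score, with the same conditioning extended to incorrect options to support distractor-based partial-mastery scoring.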
License
Copyright (c) 2015 Hollis Lai, Mark J. Gierl, Oksana Babenko

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.