Polytomous Item Explanatory IRT Models with Random Item Effects: An Application to Carbon Cycle Assessment Data

Similar documents
Extensions and Applications of Item Explanatory Models to Polytomous Data in Item Response Theory. Jinho Kim

Modeling differences in itemposition effects in the PISA 2009 reading assessment within and between schools

Monte Carlo Simulations for Rasch Model Tests

Whats beyond Concerto: An introduction to the R package catr. Session 4: Overview of polytomous IRT models

IRT Model Selection Methods for Polytomous Items

IRT models and mixed models: Theory and lmer practice

Chapter 3 Models for polytomous data

On the Construction of Adjacent Categories Latent Trait Models from Binary Variables, Motivating Processes and the Interpretation of Parameters

Bayesian Nonparametric Rasch Modeling: Methods and Software

Statistical and psychometric methods for measurement: Scale development and validation

A Marginal Maximum Likelihood Procedure for an IRT Model with Single-Peaked Response Functions

Comparison between conditional and marginal maximum likelihood for a class of item response models

PIRLS 2016 Achievement Scaling Methodology 1

UCLA Department of Statistics Papers

Modeling rater effects using a combination of Generalizability Theory and IRT

Bayesian and frequentist cross-validation methods for explanatory item response models. Daniel C. Furr

HANNEKE GEERLINGS AND CEES A.W. GLAS WIM J. VAN DER LINDEN MODELING RULE-BASED ITEM GENERATION. 1. Introduction

Item Response Theory (IRT) Analysis of Item Sets

What is an Ordinal Latent Trait Model?

New Developments for Extended Rasch Modeling in R

Anders Skrondal. Norwegian Institute of Public Health London School of Hygiene and Tropical Medicine. Based on joint work with Sophia Rabe-Hesketh

Basic IRT Concepts, Models, and Assumptions

36-720: The Rasch Model

Using Multilevel Models to Analyze Single-Case Design Data

Log-linear multidimensional Rasch model for capture-recapture

Extended Rasch Modeling: The R Package erm

Lesson 7: Item response theory models (part 2)

Joint Modeling of Longitudinal Item Response Data and Survival

DIC, AIC, BIC, PPL, MSPE Residuals Predictive residuals

Center for Advanced Studies in Measurement and Assessment. CASMA Research Report

Dimensionality Assessment: Additional Methods

Correlation and Linear Regression

General structural model Part 2: Categorical variables and beyond. Psychology 588: Covariance structure and factor models

Comparing IRT with Other Models

Pairwise Parameter Estimation in Rasch Models

Estimating Multilevel Linear Models as Structural Equation Models

Summer School in Applied Psychometric Principles. Peterhouse College 13 th to 17 th September 2010

Equating Tests Under The Nominal Response Model Frank B. Baker

36-309/749 Experimental Design for Behavioral and Social Sciences. Dec 1, 2015 Lecture 11: Mixed Models (HLMs)

Module 11: Linear Regression. Rebecca C. Steorts

Exam Applied Statistical Regression. Good Luck!

Statistical and psychometric methods for measurement: G Theory, DIF, & Linking

Alexina Mason. Department of Epidemiology and Biostatistics Imperial College, London. 16 February 2010

Subject-specific observed profiles of log(fev1) vs age First 50 subjects in Six Cities Study

Validation of Cognitive Structures: A Structural Equation Modeling Approach

APPENDICES TO Protest Movements and Citizen Discontent. Appendix A: Question Wordings

Multi-level Models: Idea

Impact of serial correlation structures on random effect misspecification with the linear mixed model.

ESTIMATION OF IRT PARAMETERS OVER A SMALL SAMPLE. BOOTSTRAPPING OF THE ITEM RESPONSES. Dimitar Atanasov

Linear Regression Models

Using Model Selection and Prior Specification to Improve Regime-switching Asset Simulations

Plausible Values for Latent Variables Using Mplus

Local response dependence and the Rasch factor model

The application and empirical comparison of item. parameters of Classical Test Theory and Partial Credit. Model of Rasch in performance assessments

Item-Focussed Trees for the Detection of Differential Item Functioning in Partial Credit Models

A White Paper on Scaling PARCC Assessments: Some Considerations and a Synthetic Data Example

Conditional maximum likelihood estimation in polytomous Rasch models using SAS

Generalized Linear Models for Non-Normal Data

LSAC RESEARCH REPORT SERIES. Law School Admission Council Research Report March 2008

Beyond classical statistics: Different approaches to evaluating marking reliability

LECTURE 6. Introduction to Econometrics. Hypothesis testing & Goodness of fit

Variance component models part I

A Note on Item Restscore Association in Rasch Models

A Study of Statistical Power and Type I Errors in Testing a Factor Analytic. Model for Group Differences in Regression Intercepts

Topic 17 - Single Factor Analysis of Variance. Outline. One-way ANOVA. The Data / Notation. One way ANOVA Cell means model Factor effects model

Measurement Invariance (MI) in CFA and Differential Item Functioning (DIF) in IRT/IFA

Determining Changes in Welfare Distributions at the Micro-level: Updating Poverty Maps By Chris Elbers, Jean O. Lanjouw, and Peter Lanjouw 1

Introduction and Background to Multilevel Analysis

WU Weiterbildung. Linear Mixed Models

Center for Advanced Studies in Measurement and Assessment. CASMA Research Report

Lecture notes I: Measurement invariance 1

First Year Examination Department of Statistics, University of Florida

Overview of Talk. Motivating Example: Antisocial Behavior (AB) Raw Data

Application of Item Response Theory Models for Intensive Longitudinal Data

PREDICTING THE DISTRIBUTION OF A GOODNESS-OF-FIT STATISTIC APPROPRIATE FOR USE WITH PERFORMANCE-BASED ASSESSMENTS. Mary A. Hansen

SCORING TESTS WITH DICHOTOMOUS AND POLYTOMOUS ITEMS CIGDEM ALAGOZ. (Under the Direction of Seock-Ho Kim) ABSTRACT

Contents. Part I: Fundamentals of Bayesian Inference 1

Part 6: Multivariate Normal and Linear Models

Estimating a Piecewise Growth Model with Longitudinal Data that Contains Individual Mobility across Clusters

NELS 88. Latent Response Variable Formulation Versus Probability Curve Formulation

Covariates of the Rating Process in Hierarchical Models for Multiple Ratings of Test Items

Specifying Latent Curve and Other Growth Models Using Mplus. (Revised )

Robust Backtesting Tests for Value-at-Risk Models

A Cautionary Note on Estimating the Reliability of a Mastery Test with the Beta-Binomial Model

An Introduction to the DA-T Gibbs Sampler for the Two-Parameter Logistic (2PL) Model and Beyond

Bayesian Estimation of Prediction Error and Variable Selection in Linear Regression

What Rasch did: the mathematical underpinnings of the Rasch model. Alex McKee, PhD. 9th Annual UK Rasch User Group Meeting, 20/03/2015

Confirmatory Factor Analysis: Model comparison, respecification, and more. Psychology 588: Covariance structure and factor models

Center for Advanced Studies in Measurement and Assessment. CASMA Research Report. A Multinomial Error Model for Tests with Polytomous Items

Centre for Education Research and Practice

Ensemble Rasch Models

Neural networks (not in book)

Development and Calibration of an Item Response Model. that Incorporates Response Time

A Nonlinear Mixed Model Framework for Item Response Theory

Fixed Effects, Invariance, and Spatial Variation in Intergenerational Mobility

Bayesian Analysis of Latent Variable Models using Mplus

SAS Syntax and Output for Data Manipulation: CLDP 944 Example 3a page 1

Serial Correlation. Edps/Psych/Stat 587. Carolyn J. Anderson. Fall Department of Educational Psychology

Goodness-of-Fit Tests for the Ordinal Response Models with Misspecified Links

UNIVERSITY OF TORONTO Faculty of Arts and Science

Transcription:

Polytomous Item Explanatory IRT Models with Random Item Effects: An Application to Carbon Cycle Assessment Data Jinho Kim and Mark Wilson University of California, Berkeley Presented on April 11, 2018 at the 2018 IOMW, New York City, New York

Research Support and Disclaimer The project described was supported by Grant Number 0815993 from the National Science Foundation (NSF). Any ideas, opinions, findings, conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the NSF.

OUTLINE 1. Empirical Example & Motivation 2. Two Item Explanatory Extensions of the Partial Credit Model 3. Polytomous Item Explanatory Models with Random Item Errors 4. Empirical Study: An application to the Carbon Cycle assessment data 5. Conclusion and Discussion 3

RESEARCH GOALS How can we explain or predict the polytomous item difficulties by observed item properties? Considering the uncertainty in explanation or prediction, how can we add random item errors to account for residual variation in the polytomous item difficulties? How can we develop polytomous item explanatory models with random item errors and apply them to Carbon Cycle assessment data? 4 4

CONSTRUCTION OF CARBON CYCLE ITEMS The item 01 (BODYTEMP) consists of cellular respiration process, energy progress variable, and multiple choice with explanation item format. Q1. [BODYTEMP] Your body produces heat to maintain its normal temperature. Where does the heat mainly come from? Please choose the ONE answer that you think is best and explain why you think that the answer you chose is better than the others. If you think some of the other answers are also partially right, please explain why you think so. A. The heat mainly comes from sunlight. B. The heat mainly comes from the clothes you are wearing. C. The heat mainly comes from the foods you eat. D. The heat mainly comes from your body when you are exercising. 5 5

CONSTRUCTION OF CARBON CYCLE ITEMS The item 01 (BODYTEMP) consists of cellular respiration process, energy progress variable, and multiple choice with explanation item format. Q1. [BODYTEMP] Your body produces heat to maintain its normal temperature. Where does the heat mainly come from? Please choose the ONE answer that you think is best and explain why you think that the answer you chose is better than the others. If you think some of the other answers are also partially right, please explain why you think so. A. The heat mainly comes from sunlight. B. The heat mainly comes from the clothes you are wearing. C. The heat mainly comes from the foods you eat. D. The heat mainly comes from your body when you are exercising. 6 6

CONSTRUCTION OF CARBON CYCLE ITEMS The item 01 (BODYTEMP) consists of cellular respiration process, energy progress variable, and multiple choice with explanation item format. Q1. [BODYTEMP] Your body produces heat to maintain its normal temperature. Where does the heat mainly come from? Please choose the ONE answer that you think is best and explain why you think that the answer you chose is better than the others. If you think some of the other answers are also partially right, please explain why you think so. A. The heat mainly comes from sunlight. B. The heat mainly comes from the clothes you are wearing. C. The heat mainly comes from the foods you eat. D. The heat mainly comes from your body when you are exercising. 7 7

CONSTRUCTION OF CARBON CYCLE ITEMS 13 items were polytomously scored with three achievement levels of ordered-categories based on the scoring rubric. The items can be reconstructed by combining the three item properties: (1) carbon cycling process, (2) progress variable, (3) item format Num. Item Process Progress Format Cellular Photosyn Digestion / Large Micro Energy Mass Multiple Yes.No Name Respiration -thesis Biosynthesis Scale Scale Choice Explain ( CR ) ( PS ) ( DB ) ( LS ) ( MS ) (EN ) ( MA ) ( MC ) (YN) 01 BODYTEMP 1 0 0 0 0 1 0 1 0 02 CARBCYC2 1 0 0 1 0 0 0 0 1 03 CARBPLNT 0 1 0 0 1 0 0 0 1 ~ 12 PLNTGRWTH 0 1 0 0 0 0 1 0 1 13 THINGTREE 0 1 0 0 0 0 1 0 1 8

MOTIVATION LLTM+e approach as a random item explanatory IRT model The Linear Logistic Test Model (LLTM; Fischer, 1973) decomposes the difficulties of specific items into linear combinations of elementary components such as item properties. The Linear Logistic Test Model with item error (LLTM+e; Janssen et al., 2004) can consider the uncertainty in explanation and enhance the precision of estimation of the item difficulties by accounting for residual variation. LLTM+e approach to polytomous data Extensions of the LLTM+e approach to polytomous data is more complicated and demanding than to dichotomous data. (1) item parameterization with item properties in the polytomous models is complicated, (2) adding random item errors to the polytomous models is also complicated, (3) allowing random effects on both item and person sides makes for crossed random effects, which is demanding for estimation. 9

Item Explanatory Extensions of the Partial Credit Model 1. Partial Credit Model 2. Many-Facet Rasch Model (MFRM) with a Decomposition of the Item Location Parameters 3. Linear Partial Credit Model (LPCM) with a Decomposition of the Step Difficulty Parameters 10

Partial Credit Model with two different item parameterizations PCM with onefold item parameterization (original parameterization) ln Pr y pi = m θ p 2 = θ p δ im, m = 1,, M i, δ i0 = 0, and θ p ~N 0, σ θ Pr y pi = m 1 θ p δ im is a step difficulty parameter for scoring m rather than m 1 on item i PCM with twofold item parameterization (ConQuest parameterization) ln Pr y pi = m θ p M = θ Pr y pi = m 1 θ p β i + τ im, m = 1,, M i, τ i0 = 0, and σ i m=1 τim = 0 p β i is an item location parameter for item i, interpreted as the overall item difficulty, τ im is a step deviation parameter for the m-th step within item i 11

Item Explanatory Extensions with two different item parameterizations Polytomous Rasch Model Partial Credit Model Item Parameterization δ im = β i + τ im (two fold) δ im (one fold) Target Item Difficulty Overall Item Difficulty β i Step Difficulty δ im Relevant Item Explanatory Approach Item Explanatory Model with Fixed Item Effects Many-Facet Rasch Model (MFRM; Linacre, 1989) with a Decomposition of the Item Location Parameters = γ k x ik + τ im δ im K k=0 M σ i m=1 τim = 0 Linear Partial Credit Model (LPCM; Fischer and Ponocny, 1994) with a Decomposition of the Step Difficulty Parameters δ im K = ω km x ik k=0 Item Property Effects item specific property effects item-by-step specific property effects 12

Polytomous Item Explanatory Models with Random Item Errors 1. Many-Facet Rasch Model (MFRM) with Overall Item Random Errors 2. Linear Partial Credit Model (LPCM) with Item-by-step Random Errors 3. Linear Partial Credit Model (LPCM) with Step Specific Random Errors 13

Polytomous Item Explanatory Models with Random Item Errors Relevant Approach Many-Facet Rasch Model (MFRM) Linear Partial Credit Model (LPCM) Target Item Difficulty Overall Item Difficulty β i Step Difficulty δ im MFRM with Overall Item Random Errors K δ im = γ k x ik + ε i + τ im k=0 ε i ~ N 0, σ 2 M ε, σ i m=1 τim = 0 LPCM with Item-by-step Random Errors LPCM with Step Specific Random Errors K δ im = ω km x ik + ε im k=0 ε im ~ N 0, σ ε 2 K δ im = ω km x ik + ξ im k=0 ξ i ξ i1,, ξ im ~ MVN 0, Σ 14

Empirical Study: An application to the Carbon Cycle assessment data 1. Many-Facet Rasch Model (MFRM) with Overall Item Random Errors 2. Linear Partial Credit Model (LPCM) with Item-by-step Random Errors 3. Linear Partial Credit Model (LPCM) with Step Specific Random Errors 15

Performance of the MFRM with Overall Item Random Errors PCM MFRM (Fixed Item Effects) MFRM with Overall Item Random Errors Variance 2 σ θ 1.18 0.80 1.15 σ2 ε 1.25 D 10933.10 11858.17 10947.72 Goodness-offit AIC 10987.10 11900.17 10991.72 BIC 11123.55 12006.30 11102.90 DIC 11747.50 12613.20 11757.00 Parameter K 27 21 22 Residual variance for overall item difficulties Correlation of δ im PCM MFRM (Fixed Item Effects) MFRM with Overall Item Random Errors PCM 1 0.91 1 16

Item Property Effects in the MFRM with Overall Item Random Errors How does the item format property affect the overall item difficulties? Item Property Process Property (Ref: Digestion/Biosynthesis) Progress Property (Ref: Mass) Item Format Property (Ref: Yes/No Explain) Item Property Effects on the Overall Item Difficulties Coefficient (Posterior Mean) Posterior SD γ 0 (Intercept) 0.66 0.90 γ CR (Cellular Respiration) 0.60 1.04 γ PS (Photosynthesis) -0.25 0.90 γ LS (Large Scale) -1.74 1.45 γ MS (Micro Scale) -0.79 0.91 γ EN (Energy) -0.65 1.03 γ MC (Multiple Choice Explain) 0.28 1.09 17

Performance of the LPCM with Item-by-step Random Errors PCM LPCM (Fixed Item Effects) LPCM with Item-bystep Random Errors Variance σ θ 2 1.18 0.79 1.17 Goodness-offit 2 σ ε 1.20 D 10933.10 12072.30 10937.86 AIC 10987.10 12102.30 10969.86 BIC 11123.55 12178.10 11050.72 DIC 11747.50 12819.50 11749.30 Residual variance for step difficulties Parameter K 27 15 16 Correlation of δ im PCM LPCM (Fixed Item Effects) LPCM with Item-by-step Random Errors PCM 1 0.91 1 18

Item Property Effects in the LPCM with Item-by-step Random Errors How does the item format property affect the step difficulties for each step? Item Property Process Property (Ref: Digestion/Biosynthesis) Progress Property (Ref: Mass) Item Format Property (Ref: Yes/No Explain) Item Property Effects on the Step Difficulties Coefficient (Posterior Mean) Posterior SD ω 01 (Intercept for the 1 st step) -1.03 0.86 ω 02 (Intercept for the 2 nd step) 2.22 0.84 ω CR1 (Cellular Respiration for the 1 st step) 0.87 1.04 ω CR2 (Cellular Respiration for the 2 nd step) 0.41 1.00 ω PS1 (Photosynthesis for the 1 st step) 0.38 0.90 ω PS2 (Photosynthesis for the 2 nd step) -0.78 0.89 ω LS1 (Large Scale for the 1 st step) -3.86 1.50 ω LS2 (Large Scale for the 2 nd step) 0.25 1.47 ω MS1 (Micro Scale for the 1 st step) -0.89 0.84 ω MS2 (Micro Scale for the 2 nd step) -0.58 0.86 ω EN1 (Energy for the 1 st step) -0.58 1.01 ω EN2 (Energy for the 2 nd step) -0.72 1.00 ω MC1 (Multiple Choice for the 1 st step) 0.66 1.09 ω MC2 (Multiple Choice for the 2 nd step) -0.06 1.08 19

Performance of the LPCM with Step Specific Random Errors PCM LPCM (Fixed Item Effects) LPCM with Step Specific Random Errors Variance 2 σ θ 1.18 0.79 1.17 2 σ δ1 1.55 2 σ δ2 0.91 σ δ12 0.79 D 10933.10 12072.30 10937.64 Goodness-offit AIC 10987.10 12102.30 10973.64 BIC 11123.55 12178.10 11064.60 DIC 11747.50 12819.50 11748.90 Parameter K 27 15 18 Residual variance for - the 1 st step - the 2 nd step - covariance Correlation of δ im PCM LPCM (Fixed Item Effects) LPCM with Step Specific Random Errors PCM 1 0.91 1 20

Item Property Effects in the LPCM with Step Specific Random Errors How does the item format property affect the step difficulties for each step? Item Property Process Property (Ref: Digestion/Biosynthesis) Progress Property (Ref: Mass) Item Format Property (Ref: Yes/No Explain) Item Property Effects on the Step Difficulties Coefficient (Posterior Mean) Posterior SD ω 01 (Intercept for the 1 st step) -0.97 0.95 ω 02 (Intercept for the 2 nd step) 2.25 0.77 ω CR1 (Cellular Respiration for the 1 st step) 0.80 1.15 ω CR2 (Cellular Respiration for the 2 nd step) 0.35 0.89 ω PS1 (Photosynthesis for the 1 st step) 0.33 0.97 ω PS2 (Photosynthesis for the 2 nd step) -0.80 0.78 ω LS1 (Large Scale for the 1 st step) -3.81 1.69 ω LS2 (Large Scale for the 2 nd step) 0.40 1.27 ω MS1 (Micro Scale for the 1 st step) -0.89 0.96 ω MS2 (Micro Scale for the 2 nd step) -0.57 0.75 ω EN1 (Energy for the 1 st step) -0.64 1.16 ω EN2 (Energy for the 2 nd step) -0.75 0.88 ω MC1 (Multiple Choice for the 1 st step) 0.70 1.19 ω MC2 (Multiple Choice for the 2 nd step) -0.05 0.92 21

CONCLUSION AND DISCUSSION We found that Random item effect models performed better than fixed item effect models. In the MFRM with overall item random errors, we could see how item properties have an effect on the overall item difficulties. In the LPCM with item-by-step random errors and the LPCM with step specific random errors, we could see how item properties have an effect on the step difficulties for each step, but variance of the residuals was differently estimated between the two LPCM variations. Potential value of the proposed models Can predict item locations or step difficulties of the newly developed items in a more scientific and systematic rather than intuitive manner. Can enhance the precision of estimation of the polytomous item difficulties. Can be a methodological foundation for advanced measurement models. 22

REFERENCE Linacre, J. M. (1989). Multi-facet Rasch measurement. Chicago: MESA Press. Masters, G. N. (1982). A Rasch model for partial credit scoring. Psychometrika, 47(2), 149-174. Fischer, G. H. (1973). The linear logistic test model as an instrument in educational research. Acta Psychologi ca, 37, 359-374. Fischer, G. H., & Ponocny, I. (1994). An extension of the partial credit model with an application to the measu rement of change. Psychometrika, 59(2), 177-192. Janssen, R., Schepers, J., & Peres, D. (2004). Models with item and item group predictors. In De Boeck, P. & Wilson, M. (Eds.), Explanatory item response models: A generalized linear and nonlinear approach (pp. 189-212). New York: Springer-Verlag. Spiegelhalter, D., Thomas, A., & Best, N. (2003). WinBUGS version 1.4 [Computer program]. UK: MRC Bios tatistics Unit, Institute of Public Health, Robinson Way, Cambridge CB2 2SR. Van den Noortgate, W., De Boeck, P., & Meulders, M. (2003). Cross-classification multilevel logistic models in psychometrics. Journal of Educational and Behavioral Statistics, 28(4), 369-386. 23

Thank you! potatopaul@berkeley.edu