What's beyond Concerto: An introduction to the R package catR. Session 4: Overview of polytomous IRT models


The Psychometrics Centre, Cambridge, June 10th, 2014

2 Outline:
1. Introduction
2. General notations
3. Difference models
4. Divide-by-total models
5. Summary

3 1. Introduction
- The current version of catR is version 3.0 (available online).
- Up to version 2.6, only dichotomous IRT models were embedded in catR.
- Version 3.0 now holds most known polytomous IRT models.
- Goals of this session:
  1. To briefly describe the general framework for polytomous IRT models
  2. To set up the notations and IRT models that are embedded in catR

4 2. General notations
- With dichotomous items, only two responses are possible; with polytomous items, more than two responses are allowed.
- For item j, let g_j + 1 be the number of possible responses; responses are coded as k ∈ {0, 1, ..., g_j}.
- Setting g_j = 1 yields a dichotomous item.
- Item responses can be ordinal (e.g., "never", "sometimes", "often", "always") or nominal (e.g., color, political affiliation, etc.).
- p_j is the set of item parameters (depending on the model).
- P_jk(θ) = Pr(X_j = k | θ, p_j) is the probability of answering response k to item j, given proficiency θ and the item parameters.

5 2. General notations
- Two main categories of polytomous IRT models (Thissen & Steinberg, 1986):
  1. Difference models: probabilities P_jk(θ) are set as differences between cumulative probabilities P*_jk(θ) = Pr(X_j ≥ k | θ, p_j).
  2. Divide-by-total models: probabilities P_jk(θ) are set as ratios of values divided by the sum of these values:
     P_jk(θ) = t_jk / Σ_{l=0}^{g_j} t_jl
- The notations to be used come from Embretson and Reise (2000).

6 2. General notations
- Best-known difference models:
  - Graded response model (GRM; Samejima, 1969)
  - Modified graded response model (MGRM; Muraki, 1990)
- Best-known divide-by-total models:
  - Partial credit model (PCM; Masters, 1982)
  - Generalized partial credit model (GPCM; Muraki, 1992)
  - Rating scale model (RSM; Andrich, 1978)
  - Nominal response model (NRM; Bock, 1972)
- All these models are available in catR.

7 3. Difference models
- With difference models, probabilities P_jk(θ) are set as differences between cumulative probabilities P*_jk(θ) = Pr(X_j ≥ k | θ, p_j), that is, the probability of selecting a response in {k, k+1, ..., g_j}.
- Graded response model (GRM; Samejima, 1969):
  P*_jk(θ) = exp[α_j(θ − β_jk)] / (1 + exp[α_j(θ − β_jk)])
  with P*_j0(θ) = 1 and P_{j,g_j}(θ) = P*_{j,g_j}(θ).
- α_j is the discrimination (slope) parameter and the β_jk are the threshold (intercept) parameters of the cumulative probabilities.
- Only g_j thresholds β_j1, ..., β_{j,g_j} are necessary to distinguish the g_j + 1 response categories.

8 3. Difference models
- Response category probabilities P_jk(θ) are recovered as follows:
  P_jk(θ) = P*_jk(θ) − P*_{j,k+1}(θ)  (for 0 < k < g_j)
  P_j0(θ) = 1 − P*_j1(θ)
  P_{j,g_j}(θ) = P*_{j,g_j}(θ)
- Illustration with 4 response categories (i.e., g_j = 3), α_j = 1.5, β_j1 = −0.5, β_j2 = 0.5, and β_j3 = 1.2.

9 3. Difference models
[Figure: cumulative probabilities P*_jk(θ) for k = 1, 2, 3, plotted against θ ∈ (−6, 6)]

10 3. Difference models
[Figure: response category probabilities P_jk(θ) for k = 0, 1, 2, 3, plotted against θ ∈ (−6, 6)]
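The difference construction above can be checked numerically. Below is a minimal Python sketch (independent of catR; the helper names are hypothetical) of the GRM cumulative and category probabilities, using the slides' illustration values under the assumption that the first threshold is β_j1 = −0.5:

```python
import math

def grm_cumulative(theta, alpha, beta):
    """Cumulative probability P*_jk(theta) = Pr(X_j >= k) under the GRM."""
    z = alpha * (theta - beta)
    return math.exp(z) / (1.0 + math.exp(z))

def grm_category_probs(theta, alpha, betas):
    """Category probabilities P_jk(theta), k = 0..g_j, from the cumulative ones."""
    # P*_j0 = 1 by convention, and the cumulative beyond the last category is 0,
    # so P_j0 = 1 - P*_j1 and P_{j,g_j} = P*_{j,g_j}.
    cum = [1.0] + [grm_cumulative(theta, alpha, b) for b in betas] + [0.0]
    return [cum[k] - cum[k + 1] for k in range(len(betas) + 1)]

# Illustration values from the slides (beta_j1 = -0.5 assumed):
probs = grm_category_probs(0.0, 1.5, [-0.5, 0.5, 1.2])
print([round(p, 3) for p in probs])
print(sum(probs))  # the four category probabilities sum to 1
```

With ordered thresholds and α_j > 0 the cumulative probabilities decrease in k, so every category probability is positive.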

11 3. Difference models
- With the GRM, threshold parameters β_jk may vary across items, so items may not have the same number of response categories g_j.
- Modified graded response model (MGRM; Muraki, 1990): a modification of the GRM to allow for common thresholds across all items.
- With the MGRM, the thresholds β_jk are split into two components: b_j (general intercept parameter for item j) and c_k (threshold parameter between categories k and k+1, common to all items).
- Cumulative probability under the MGRM:
  P*_jk(θ) = exp[α_j(θ − b_j + c_k)] / (1 + exp[α_j(θ − b_j + c_k)])

12 3. Difference models
- With the MGRM, all items have the same number of categories (since the c_k parameters are equal across items).
- The MGRM is applicable to Likert scales with the same response categories across items, such as Never - Rarely - Sometimes - Often - Always.
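The split of β_jk into b_j and c_k can be made explicit: the MGRM is a GRM whose thresholds are constrained to β_jk = b_j − c_k. A minimal Python sketch (independent of catR; all numeric values are hypothetical):

```python
import math

def logistic(z):
    """Standard logistic function."""
    return math.exp(z) / (1.0 + math.exp(z))

def mgrm_cumulative(theta, alpha, b, c_k):
    """Cumulative probability under the MGRM: exp[a(theta - b + c_k)] / (1 + ...)."""
    return logistic(alpha * (theta - b + c_k))

# The MGRM equals a GRM with item thresholds beta_jk = b_j - c_k:
theta, alpha, b, c = 0.3, 1.2, 0.4, [1.0, 0.0, -0.8]
for c_k in c:
    grm_equivalent = logistic(alpha * (theta - (b - c_k)))
    assert abs(mgrm_cumulative(theta, alpha, b, c_k) - grm_equivalent) < 1e-12
print("MGRM == GRM with beta_jk = b_j - c_k")
```

Because the c_k are shared, estimating one b_j per item plus one common set of c_k is more parsimonious than a full GRM.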

13 4. Divide-by-total models
- With divide-by-total models, probabilities P_jk(θ) are set directly as ratios of values divided by the sum of these values (across response categories).
- Partial credit model (PCM; Masters, 1982):
  P_jk(θ) = exp[Σ_{t=0}^{k} (θ − δ_jt)] / Σ_{r=0}^{g_j} exp[Σ_{t=0}^{r} (θ − δ_jt)]
  with Σ_{t=0}^{0} (θ − δ_jt) = 0, or equivalently exp[Σ_{t=0}^{0} (θ − δ_jt)] = 1.

14 4. Divide-by-total models
- Detailed equations with three response categories (i.e., g_j = 2 and k ∈ {0, 1, 2}):
  Σ_{r=0}^{g_j} exp[Σ_{t=0}^{r} (θ − δ_jt)] = Σ_{r=0}^{2} exp[Σ_{t=0}^{r} (θ − δ_jt)]
  = 1 + exp(θ − δ_j1) + exp(θ − δ_j1 + θ − δ_j2)
- The denominator is then equal to
  Σ_{r=0}^{g_j} exp[Σ_{t=0}^{r} (θ − δ_jt)] = 1 + exp(θ − δ_j1) + exp(2θ − δ_j1 − δ_j2)

15 4. Divide-by-total models
- Response category probabilities are then
  P_j0(θ) = 1 / [1 + exp(θ − δ_j1) + exp(2θ − δ_j1 − δ_j2)]
  P_j1(θ) = exp(θ − δ_j1) / [1 + exp(θ − δ_j1) + exp(2θ − δ_j1 − δ_j2)]
  P_j2(θ) = exp(2θ − δ_j1 − δ_j2) / [1 + exp(θ − δ_j1) + exp(2θ − δ_j1 − δ_j2)]
- Idea: to allow for partial credit in the item response (e.g., false - incomplete - almost correct - correct responses).
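The general PCM formula and the hand-expanded three-category denominator above can be cross-checked numerically. A minimal Python sketch (independent of catR; the parameter values are hypothetical):

```python
import math

def pcm_probs(theta, deltas):
    """PCM category probabilities; deltas = [delta_j1, ..., delta_jg]."""
    # Cumulative sums sum_{t=1}^{k} (theta - delta_jt), with the k = 0 sum defined as 0.
    sums = [0.0]
    for d in deltas:
        sums.append(sums[-1] + (theta - d))
    denom = sum(math.exp(s) for s in sums)
    return [math.exp(s) / denom for s in sums]

# Cross-check against the hand-expanded denominator for g_j = 2:
theta, d1, d2 = 0.7, -0.2, 0.9
probs = pcm_probs(theta, [d1, d2])
denom = 1 + math.exp(theta - d1) + math.exp(2 * theta - d1 - d2)
assert abs(probs[0] - 1 / denom) < 1e-12
assert abs(probs[1] - math.exp(theta - d1) / denom) < 1e-12
assert abs(probs[2] - math.exp(2 * theta - d1 - d2) / denom) < 1e-12
print([round(p, 3) for p in probs])
```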

16 4. Divide-by-total models
- Only g_j thresholds (δ_j1, ..., δ_{j,g_j}) are necessary.
- Illustration with g_j = 3 (i.e., four response categories) and (δ_j1, δ_j2, δ_j3) = (−1, 1, 0.5).

17 4. Divide-by-total models
[Figure: PCM response category probabilities P_jk(θ) for k = 0, 1, 2, 3, with thresholds δ_j1 = −1, δ_j2 = 1, δ_j3 = 0.5, plotted against θ ∈ (−6, 6)]

18 4. Divide-by-total models
- Generalized partial credit model (GPCM; Muraki, 1992): extends the PCM by introducing a discrimination parameter α_j for the item:
  P_jk(θ) = exp[Σ_{t=0}^{k} α_j(θ − δ_jt)] / Σ_{r=0}^{g_j} exp[Σ_{t=0}^{r} α_j(θ − δ_jt)]
  with Σ_{t=0}^{0} α_j(θ − δ_jt) = 0.
- Illustration with the same parameters as the PCM, i.e., (δ_j1, δ_j2, δ_j3) = (−1, 1, 0.5), and with α_j = 1.5.

19 4. Divide-by-total models
[Figure: PCM response category probabilities P_jk(θ) for k = 0, 1, 2, 3, plotted against θ ∈ (−6, 6)]

20 4. Divide-by-total models
[Figure: GPCM response category probabilities P_jk(θ) for k = 0, 1, 2, 3, plotted against θ ∈ (−6, 6)]
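Since the GPCM only rescales each step by α_j, setting α_j = 1 recovers the PCM, which gives an easy numeric check. A minimal Python sketch (independent of catR; the thresholds follow the slides' illustration, assuming δ_j1 = −1):

```python
import math

def gpcm_probs(theta, alpha, deltas):
    """GPCM category probabilities P_jk(theta); alpha is the item discrimination."""
    # Cumulative sums sum_{t=1}^{k} alpha*(theta - delta_jt), with the k = 0 sum = 0.
    sums = [0.0]
    for d in deltas:
        sums.append(sums[-1] + alpha * (theta - d))
    denom = sum(math.exp(s) for s in sums)
    return [math.exp(s) / denom for s in sums]

deltas = [-1.0, 1.0, 0.5]           # thresholds from the PCM/GPCM illustration
pcm = gpcm_probs(0.0, 1.0, deltas)  # alpha_j = 1: the GPCM reduces to the PCM
gpcm = gpcm_probs(0.0, 1.5, deltas)
print([round(p, 3) for p in pcm])
print([round(p, 3) for p in gpcm])
```

A larger α_j sharpens all category curves at once, which is what the PCM-versus-GPCM figures illustrate.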

21 4. Divide-by-total models
- The PCM allows for different thresholds δ_jt across items, and also different numbers of response categories.
- Rating scale model (RSM; Andrich, 1978): a modification of the PCM to allow equal thresholds across items.
- With the RSM, the thresholds δ_jt are split into two components: λ_j (general intercept parameter) and δ_t (threshold parameter common to all items):
  P_jk(θ) = exp(Σ_{t=0}^{k} [θ − (λ_j + δ_t)]) / Σ_{r=0}^{g_j} exp(Σ_{t=0}^{r} [θ − (λ_j + δ_t)])
  with Σ_{t=0}^{0} [θ − (λ_j + δ_t)] = 0.
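The RSM is just a PCM with the constraint δ_jt = λ_j + δ_t, so its probabilities follow the same divide-by-total recipe. A minimal Python sketch (independent of catR; all numbers hypothetical):

```python
import math

def rsm_probs(theta, lam, deltas):
    """RSM category probabilities: a PCM with item steps lambda_j + delta_t."""
    # Cumulative sums sum_{t=1}^{k} [theta - (lambda_j + delta_t)], k = 0 sum = 0.
    sums = [0.0]
    for d in deltas:
        sums.append(sums[-1] + (theta - (lam + d)))
    denom = sum(math.exp(s) for s in sums)
    return [math.exp(s) / denom for s in sums]

# Hypothetical values: one item intercept, one shared set of thresholds.
probs = rsm_probs(0.2, -0.3, [-1.0, 0.0, 1.0])
print([round(p, 3) for p in probs])
print(sum(probs))  # category probabilities sum to 1
```

Only λ_j varies between items; the δ_t are estimated once for the whole scale, which is why the RSM suits instruments where every item uses the same response categories.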

22 4. Divide-by-total models
- Finally, the nominal response model (NRM; Bock, 1972) specifies response category probabilities as follows:
  P_jk(θ) = exp(α_jk θ + c_jk) / Σ_{r=0}^{g_j} exp(α_jr θ + c_jr)
  with α_j0 θ + c_j0 = 0.
- Illustration with four categories (i.e., g_j = 3) and (α_j1, c_j1) = (1, 1), (α_j2, c_j2) = (1.2, 0.5), (α_j3, c_j3) = (0.8, 0).

23 4. Divide-by-total models
[Figure: NRM response category probabilities P_jk(θ) for k = 0, 1, 2, 3, plotted against θ ∈ (−6, 6)]
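The NRM is a softmax over the linear scores α_jk θ + c_jk, with the baseline category fixed at zero (α_j0 = c_j0 = 0). A minimal Python sketch (independent of catR) using the slides' illustration values:

```python
import math

def nrm_probs(theta, alphas, cs):
    """NRM category probabilities: softmax over alpha_jk*theta + c_jk, baseline = 0."""
    # Category 0 is the reference: its linear score is fixed at 0.
    zs = [0.0] + [a * theta + c for a, c in zip(alphas, cs)]
    denom = sum(math.exp(z) for z in zs)
    return [math.exp(z) / denom for z in zs]

# Slide illustration: four categories (g_j = 3).
probs = nrm_probs(0.5, [1.0, 1.2, 0.8], [1.0, 0.5, 0.0])
print([round(p, 3) for p in probs])
print(sum(probs))  # category probabilities sum to 1
```

No ordering of categories is assumed, which is what makes the NRM suitable for nominal responses such as distractor choices.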

24 5. Summary
- Polytomous IRT models extend dichotomous IRT models by allowing more than two possible responses.
- Response categories can be the same across all items, or they may vary.
- Polytomous IRT models: GRM, MGRM, PCM, GPCM, RSM, NRM, ...
- The MGRM and RSM are convenient only if all items share the same response categories (as they use common threshold parameters for all items).
- Extensions of calibration algorithms from dichotomous IRT models to the polytomous framework exist.

25 5. Summary
- Each model can be fully described with a specific set of parameters for each item j:
  (α_j, β_j1, ..., β_{j,g_j}) for the GRM
  (α_j, b_j, c_1, ..., c_g) for the MGRM
  (δ_j1, ..., δ_{j,g_j}) for the PCM
  (α_j, δ_j1, ..., δ_{j,g_j}) for the GPCM
  (λ_j, δ_1, ..., δ_g) for the RSM
  (α_j1, c_j1, ..., α_{j,g_j}, c_{j,g_j}) for the NRM
- These orderings of parameters are used in catR...

26 References
Andrich, D. (1978). A rating formulation for ordered response categories. Psychometrika, 43, 561-573.
Bock, R. D. (1972). Estimating item parameters and latent ability when responses are scored in two or more nominal categories. Psychometrika, 37, 29-51.
Embretson, S. E., & Reise, S. P. (2000). Item response theory for psychologists. Mahwah, NJ: Lawrence Erlbaum Associates.
Masters, G. N. (1982). A Rasch model for partial credit scoring. Psychometrika, 47, 149-174.
Muraki, E. (1990). Fitting a polytomous item response model to Likert-type data. Applied Psychological Measurement, 14, 59-71.

27 References
Muraki, E. (1992). A generalized partial credit model: Application of an EM algorithm. Applied Psychological Measurement, 16, 159-176.
Samejima, F. (1969). Estimation of latent ability using a response pattern of graded scores. Psychometrika Monograph Supplement, No. 17.
Thissen, D., & Steinberg, L. (1986). A taxonomy of item response models. Psychometrika, 51, 567-577.