

Multidimensional Item Response Theory
Lecture #12, ICPSR Item Response Theory Workshop

Overview
  Basics of MIRT: assumptions, models, applications
  Guidance about estimating MIRT

MULTIDIMENSIONAL ITEM RESPONSE THEORY (MIRT) MODELS

Why MIRT?
  Many of the more sophisticated approaches to IRT (including MIRT) exist because of problems with certain kinds of test data.
  By "problems" we basically mean one of two things:
    The data don't meet the assumptions of the IRT model (looking at it from the more common educational measurement point of view).
    Or theory posits the existence of more than one trait.

Multidimensional IRT (MIRT)
  Unidimensionality is reasonable for many educational tests, but not for all measures, for example:
    Personality measures
    Achievement tests with multiple areas
  If violations of unidimensionality are found, the options are:
    (1) delete items
    (2) assume model robustness and carry on
    (3) form clusters and scale separately (e.g., SATs)
    (4) use a MIRT model

Goals of MIRT
  Represent people with a set of ability scores, one for each trait or dimension believed to be generating the data.
  Represent each test item by a set of item parameters relating each trait to the item.

Types of MIRT Models
  Compensatory: high ability on one dimension can compensate for low ability on other dimensions.
    Much more common
    Additive in the logit
  Non-compensatory: no such compensation; a high probability of success requires high ability on all dimensions.
    Infrequently used, often difficult to estimate
    Non-additive (think latent variable interactions)

Difference in Models
  Compensatory: linear terms are summed; a high theta on one or more dimensions can save you from low theta(s) on other dimensions.
  Non-compensatory: linear terms are multiplied; theta must be high on all dimensions to maintain a high probability of success.
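
The contrast between the two model types can be sketched numerically. The sketch below is illustrative only: the compensatory function sums the a_k * theta_k terms inside one logit, as the slides describe, while the non-compensatory function assumes a common product form (one logistic term per dimension, with a hypothetical per-dimension difficulty b_k), which the slides do not spell out. All parameter values are made up for the example.

```python
import math

def logistic(z):
    """Standard logistic function."""
    return 1.0 / (1.0 + math.exp(-z))

def p_compensatory(theta, a, d):
    """Compensatory model: the a_k * theta_k terms are summed in a single logit."""
    z = sum(a_k * t_k for a_k, t_k in zip(a, theta)) + d
    return logistic(z)

def p_noncompensatory(theta, a, b):
    """Non-compensatory model (assumed product form): one logistic term per
    dimension, multiplied together, so a low theta on any dimension pulls
    the whole probability down."""
    p = 1.0
    for a_k, t_k, b_k in zip(a, theta, b):
        p *= logistic(a_k * (t_k - b_k))
    return p

a = [1.0, 1.0]            # hypothetical slopes
theta_mixed = [3.0, -3.0]  # high on dimension 1, low on dimension 2
theta_high = [3.0, 3.0]    # high on both dimensions

comp_mixed = p_compensatory(theta_mixed, a, d=0.0)
noncomp_mixed = p_noncompensatory(theta_mixed, a, b=[0.0, 0.0])
comp_high = p_compensatory(theta_high, a, d=0.0)
noncomp_high = p_noncompensatory(theta_high, a, b=[0.0, 0.0])

print(f"mixed thetas: compensatory={comp_mixed:.3f}, non-compensatory={noncomp_mixed:.3f}")
print(f"high thetas:  compensatory={comp_high:.3f}, non-compensatory={noncomp_high:.3f}")
```

With mixed thetas the compensatory probability sits at 0.5 because the high and low traits cancel in the logit, while the product form stays near zero; only when both thetas are high do the two models agree on a high probability.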

MIRT Applications
  Multi-trait feedback in education: one potentially useful application is using MIRT models to derive scores on subscales within a test.
  Personality assessment: tests are usually designed to measure more than one (possibly correlated) latent trait at one time.

Different MIRT Approaches
  Main difference: the specification of the non-linear link in the MIRT model.
  Normal ogive MIRT (McDonald): uses the cumulative normal distribution function to model the probability of success on an item.
  Logistic MIRT (Reckase): uses the more familiar logistic function to model the probability of success on an item.

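The two link functions, written for an item's linear predictor z:

$$P_{\text{logistic}}(X = 1 \mid \boldsymbol{\theta}) = \frac{\exp(1.7z)}{1 + \exp(1.7z)}, \qquad P_{\text{normal}}(X = 1 \mid \boldsymbol{\theta}) = \Phi(z)$$

  The 1.7 in the logistic equation is what makes the a-parameters from both models have equivalent metrics: $|P_{\text{logistic}} - P_{\text{normal}}| < 0.01$ for all $\boldsymbol{\theta}$.

Slope/Intercept Form
  McDonald presents the normal ogive MIRT model in slope-intercept form:

$$P(X = 1 \mid \boldsymbol{\theta}) = \Phi(\mathbf{a}^T \boldsymbol{\theta} + d)$$

  MIRT may be thought of as a special case of item factor analysis.
  The purpose of the normal ogive approach to MIRT in slope-intercept form is to make this clear.
  MIRT is basically a non-linear factor analysis, where the slope parameters are analogous to factor loadings.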
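
The claim that the 1.7 scaling keeps the logistic and normal ogive curves within 0.01 of each other can be checked numerically. This is a minimal sketch using only the Python standard library; the grid range and spacing are arbitrary choices, and the function names are made up for the example.

```python
import math

def normal_cdf(z):
    """Cumulative standard normal distribution, via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def logistic_cdf(z, scale=1.7):
    """Logistic function with a scaling constant applied to the predictor."""
    return 1.0 / (1.0 + math.exp(-scale * z))

# Evaluate both curves on a fine grid of the linear predictor z in [-6, 6].
grid = [i / 1000.0 for i in range(-6000, 6001)]

max_diff_scaled = max(abs(logistic_cdf(z) - normal_cdf(z)) for z in grid)
max_diff_unscaled = max(abs(logistic_cdf(z, scale=1.0) - normal_cdf(z)) for z in grid)

print(f"max |logistic - normal| with 1.7 scaling: {max_diff_scaled:.4f}")
print(f"max |logistic - normal| without scaling:  {max_diff_unscaled:.4f}")
```

Without the constant, the two curves differ by more than 0.1 at their widest point, which is why slopes estimated under the two links would otherwise sit on different metrics.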

Logistic Approach
  Reckase presents the logistic MIRT model in slope-intercept form:

$$P(X = 1 \mid \boldsymbol{\theta}) = c + (1 - c)\,\frac{\exp(\mathbf{a}^T \boldsymbol{\theta} + d)}{1 + \exp(\mathbf{a}^T \boldsymbol{\theta} + d)}$$

  $\mathbf{a}^T$ is a transposed vector of discrimination parameters, one for each dimension.
  d is the intercept parameter, related to difficulty.
  c is the guessing parameter, same as always.
  $\boldsymbol{\theta}$ is the vector of person latent traits.

Assumptions
  Monotonically increasing relationship between trait and probability of endorsement or correct response.
  The function relating trait and item response is smooth, meaning derivatives are defined (makes the model both functional and estimable).
  Local independence of item responses (conditional on $\boldsymbol{\theta}$, a vector of latent traits).
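
As a rough illustration of the logistic slope-intercept model and the monotonicity assumption, the sketch below evaluates the response probability from a, d, c, and theta. The function name, the optional `scale` argument (set it to 1.7 to match the normal ogive metric), and the item parameter values are assumptions made for this example, not part of the lecture.

```python
import math

def mirt_probability(theta, a, d, c=0.0, scale=1.0):
    """Compensatory logistic MIRT probability in slope-intercept form.

    theta: list of latent traits; a: list of slopes (one per dimension);
    d: intercept; c: lower asymptote (guessing); scale: optional constant.
    """
    z = scale * (sum(a_k * t_k for a_k, t_k in zip(a, theta)) + d)
    return c + (1.0 - c) / (1.0 + math.exp(-z))

# Hypothetical two-dimensional item.
a = [1.2, 0.8]
d = -0.5
c = 0.2

p_origin = mirt_probability([0.0, 0.0], a, d, c)

# Monotonicity check: hold theta_2 fixed and increase theta_1 from -4 to 4.
probs = [mirt_probability([t / 10.0, 0.0], a, d, c) for t in range(-40, 41)]
is_monotone = all(p_lo < p_hi for p_lo, p_hi in zip(probs, probs[1:]))

print(f"P(correct) at theta = (0, 0): {p_origin:.4f}")
print(f"increasing in theta_1: {is_monotone}")
```

Because every slope is positive here, raising either trait can only raise the probability, which is the monotonicity the assumptions slide asks for.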

Person Parameters
  $\boldsymbol{\theta}$ is a vector of latent traits, one for each dimension.
  The number of dimensions is based on theory.
    Don't do exploratory analyses (although a lot of people in education think that is MIRT).
  The dimensions are associated: covariances (correlations) are estimated.

Discrimination
  a is a vector of item slopes, one for each dimension measured by an item.
  Same interpretation as in the unidimensional IRT model.
  Unless there is simple structure, an item will be more discriminating for combinations of dimensions than for a single dimension.
  Discriminating power of an item for the most discriminating combination of dimensions (where all thetas are close to b):

$$\text{MDISC} = \sqrt{\sum_{k} a_k^2}$$
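
The multidimensional discrimination index can be computed directly from the slope vector. The sketch below assumes the usual definition, the square root of the sum of squared slopes, and uses made-up slope values; it also shows the point about simple structure, where the index collapses to the single nonzero slope.

```python
import math

def mdisc(a):
    """Multidimensional discrimination: Euclidean length of the slope vector."""
    return math.sqrt(sum(a_k ** 2 for a_k in a))

# Hypothetical items.
item_both = [1.5, 1.5]     # loads on both dimensions
item_uneven = [0.5, 1.5]   # loads mostly on dimension 2
item_simple = [1.5, 0.0]   # simple structure: loads on dimension 1 only

for label, a in [("both", item_both), ("uneven", item_uneven), ("simple", item_simple)]:
    print(f"{label:7s} a={a}  MDISC={mdisc(a):.4f}  largest single slope={max(a):.1f}")
```

For the items that load on both dimensions, the index exceeds every individual slope, which matches the statement that such items discriminate best along a combination of dimensions.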

Multidimensional Item Difficulty
  The value d from the model is an intercept. It is like -ab in the unidimensional model, except that there is more than one a parameter (still just one b, though).
  The equivalent to the b parameter is:

$$\text{MDIFF} = \frac{-d}{\sqrt{\sum_{k} a_k^2}}$$

Lower Asymptote
  The c parameter has the same interpretation as in unidimensional models.
  It is now the lower asymptote in a multidimensional space.
  c = probability of success for an examinee who is low on all thetas.
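
A short sketch of both ideas, under stated assumptions: multidimensional difficulty is taken to be -d divided by the length of the slope vector, and the lower asymptote is checked by plugging very low thetas into the logistic slope-intercept model. Function names and parameter values are hypothetical.

```python
import math

def mdiff(a, d):
    """Multidimensional difficulty: -d divided by the length of the slope vector."""
    return -d / math.sqrt(sum(a_k ** 2 for a_k in a))

def mirt_probability(theta, a, d, c=0.0):
    """Compensatory logistic MIRT probability in slope-intercept form."""
    z = sum(a_k * t_k for a_k, t_k in zip(a, theta)) + d
    return c + (1.0 - c) / (1.0 + math.exp(-z))

# Unidimensional sanity check: with a = 2.0 and b = 0.5, d = -a * b = -1.0,
# so the multidimensional difficulty should recover b.
b_recovered = mdiff([2.0], -1.0)

# Hypothetical two-dimensional item.
difficulty_2d = mdiff([1.5, 1.5], -2.0)

# Lower asymptote: an examinee very low on every theta succeeds with probability ~c.
p_low = mirt_probability([-10.0, -10.0], [1.5, 1.5], 0.0, c=0.2)

print(f"recovered b (1-d case): {b_recovered:.2f}")
print(f"MDIFF for a=(1.5, 1.5), d=-2.0: {difficulty_2d:.4f}")
print(f"P(success) with all thetas at -10 and c=0.2: {p_low:.4f}")
```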

Other Statistics
  The Test Characteristic Curve is now thought of as a Test Characteristic Surface, but it is computed in the same way (i.e., by summing ICCs).
  Test information is also computed in the same way; the function becomes a surface, not just a curve.

Re-parameterized Model
  Reckase (1985, 1997)
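
The test characteristic surface can be sketched as the sum of item response probabilities at a given point in the trait space. The three items below are hypothetical, and the item function assumes the logistic slope-intercept form with slopes a, intercept d, and lower asymptote c.

```python
import math

def item_prob(theta, a, d, c=0.0):
    """Compensatory logistic MIRT probability in slope-intercept form."""
    z = sum(a_k * t_k for a_k, t_k in zip(a, theta)) + d
    return c + (1.0 - c) / (1.0 + math.exp(-z))

def characteristic_surface(theta, items):
    """Expected number-correct score at theta: the sum of the item surfaces."""
    return sum(item_prob(theta, a, d, c) for a, d, c in items)

# Hypothetical three-item test: (slopes, intercept, lower asymptote).
items = [
    ([1.5, 0.0], 0.0, 0.0),   # measures dimension 1 only
    ([0.0, 1.5], 0.0, 0.0),   # measures dimension 2 only
    ([1.0, 1.0], -1.0, 0.2),  # measures both, with guessing
]

score_at_origin = characteristic_surface([0.0, 0.0], items)
score_low = characteristic_surface([-8.0, -8.0], items)
score_high = characteristic_surface([8.0, 8.0], items)

print(f"expected score at (0, 0):   {score_at_origin:.4f}")
print(f"expected score at (-8, -8): {score_low:.4f}")
print(f"expected score at (8, 8):   {score_high:.4f}")
```

The surface is bounded below by the sum of the c parameters and above by the number of items, just as a unidimensional test characteristic curve is.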

The result is an Item Characteristic Surface, instead of a Curve, with these parameters:
  1 theta per dimension
  1 a per dimension
  1 b
  1 c
This is a 2-d MIRT model solution; any more dimensions and you couldn't graph it!

Four example surfaces are shown, with these item parameters:
  a1 = 1.5, a2 = 1.5, b = 1.0, c = 0.0
  a1 = 0.5, a2 = 1.5, b = 1.0, c = 0.0
  a1 = 1.5, a2 = 1.5, b = 0.0, c = 0.0
  a1 = 1.5, a2 = 1.5, b = 1.0, c = 0.2
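
The example surfaces can be approximated without a plotting tool by tabulating the probability on a grid of theta values. This sketch makes an assumption the transcript does not confirm: it treats b as the multidimensional difficulty and converts it to an intercept with d = -b * sqrt(a1^2 + a2^2). The function names and the grid are hypothetical.

```python
import math

def surface_value(theta, a, b, c):
    """Item characteristic surface at theta for slopes a, difficulty b, asymptote c.

    Assumption: b is converted to an intercept as d = -b * sqrt(sum of a_k^2).
    """
    d = -b * math.sqrt(sum(a_k ** 2 for a_k in a))
    z = sum(a_k * t_k for a_k, t_k in zip(a, theta)) + d
    return c + (1.0 - c) / (1.0 + math.exp(-z))

def surface_grid(a, b, c, grid):
    """Rows indexed by theta_1, columns by theta_2."""
    return [[surface_value([t1, t2], a, b, c) for t2 in grid] for t1 in grid]

grid = [-3.0, -2.0, -1.0, 0.0, 1.0, 2.0, 3.0]

# First example from the slides: a1 = 1.5, a2 = 1.5, b = 1.0, c = 0.0.
surface = surface_grid([1.5, 1.5], 1.0, 0.0, grid)

print("theta_1 \\ theta_2 " + " ".join(f"{t2:6.1f}" for t2 in grid))
for t1, row in zip(grid, surface):
    print(f"{t1:17.1f} " + " ".join(f"{p:6.3f}" for p in row))
```

Changing the arguments to the other three parameter sets shows the same effects the figures illustrate: unequal slopes tilt the surface toward one axis, a lower b shifts it toward the origin, and a nonzero c lifts the floor.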

Excel Spreadsheet Demo
  Show Excel spreadsheet containing a MIRT item.
  Specify different item parameters and determine how changes affect the resulting graph.

ESTIMATION OF MIRT MODELS

MIRT Model Estimation
  MIRT model estimation via marginal maximum likelihood (MML) ranges from difficult to impossible, depending on:
    The size of your data
    The number of thetas
  The issue is with the marginalization (integration): the number of quadrature points needed grows exponentially with the number of thetas.
  Furthermore, the sample sizes needed for good convergence are likely to be well beyond reasonable.
  So approximations must be used: limited-information estimation, or MCMC with the ability to tighten priors for misbehaving parameters.

MIRT Software: Mplus
  Mplus is the most comprehensive package to estimate MIRT models.
  The default estimator is a limited-information estimator (weighted least squares, mean and variance adjusted; WLSMV), more commonly known as diagonally weighted least squares.
  MML is available; integration can be by Monte Carlo (speeds the estimator).
  MCMC is also available, but may not be as effective yet.
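
The exponential growth in quadrature points under MML can be made concrete with a small count. The sketch assumes a full product grid with the same number of nodes on every dimension; the choice of 15 nodes per dimension is an arbitrary illustration, not a recommendation from the lecture.

```python
from itertools import product

def quadrature_grid_size(points_per_dimension, n_dimensions):
    """Number of nodes in a full product grid over all latent dimensions."""
    return points_per_dimension ** n_dimensions

POINTS = 15  # assumed number of quadrature nodes per dimension

sizes = {n_dim: quadrature_grid_size(POINTS, n_dim) for n_dim in range(1, 7)}
for n_dim, size in sizes.items():
    print(f"{n_dim} theta(s): {size:>10,d} quadrature points")

# Building the grid explicitly for two dimensions confirms the count.
nodes = [-3.5 + 0.5 * i for i in range(POINTS)]  # equally spaced nodes from -3.5 to 3.5
grid_2d = list(product(nodes, repeat=2))
print(f"explicit 2-d grid has {len(grid_2d)} points")
```

Every likelihood evaluation must visit each node for each response pattern, so going from two to five thetas multiplies the work by several thousand, which is why limited-information or simulation-based estimators become attractive.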

MIRT Software: NOHARM
  Normal Ogive Harmonic Analysis (Robust Method): NOHARM
  Fits the normal ogive MIRT model.
  Authors: McDonald & Fraser (1988); they can be contacted for a copy of the software.
  Provides multidimensional item parameter estimates, but does not provide multidimensional person parameter estimates.
  One of the original programs.
  Uses limited information (but differently from Mplus).

MIRT Software: TESTFACT
  Part of the SSI suite of IRT software.
  Performs factor analysis as part of testing dimensionality; fits MIRT models.
  Person and item parameters provided.
  Used for exploratory analyses: BAD!

CONCLUDING REMARKS
  MIRT (and IRT in general) can be thought of as simply a non-linear factor analysis.
    More on this later in the afternoon.
  MIRT can be used for tests where the unidimensionality assumption is not viable, either by theory or by empirical findings.
  MIRT is an extension of CFA with multiple factors, just for categorical data.

Up Next
  Model-based diagnostic assessment
  Comparing IRT with factor analytic methods
  Conclusions, discussion, questions
  Farewells!