Comparing IRT with Other Models


Comparing IRT with Other Models
Lecture #14, ICPSR Item Response Theory Workshop

Lecture Overview
This final set of slides describes a parallel between IRT and another commonly used method for measurement: factor analysis. The slides provide a basis for comparing the two methods, including when each method is appropriate.

MORE COMPARING IRT WITH CFA

Introduction
Consider a score on an item that is not categorical; rather, consider the score to be continuous. For simplicity, call the item X_i. Likert-type item scores are often treated as continuous; other examples of continuous item types include reaction times and many physiological measurements. Our goal is to model an examinee's response behavior on item X_i using a latent variable (in IRT, typically called θ). To distinguish the two approaches, we will use F as the latent variable for factor analysis.

The Spearman Single Factor Model
To relate an examinee's level of the common factor to their performance on an item, we introduce the Spearman single factor model:

X_is = μ_i + λ_i F_s + E_is

where:
- X_is is the score of person s on the i-th item
- F_s is person s's factor score
- E_is is the unique or idiosyncratic part of the item: the amount by which person s's score on item i is shifted from its model-predicted value
- μ_i is the overall mean for item i, allowing for differing item difficulties
- λ_i is the factor loading of item i; we will discuss the factor loading quite a bit
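To make the model concrete, here is a minimal Python simulation sketch of the Spearman model. All numbers (intercepts, loadings, unique standard deviations) are illustrative, not from the lecture.

```python
# A minimal simulation sketch of the Spearman single-factor model:
# X_is = mu_i + lambda_i * F_s + E_is  (all parameter values illustrative)
import numpy as np

rng = np.random.default_rng(seed=1)

n_persons = 10_000
mu = np.array([3.0, 2.5, 3.2, 2.8, 3.1])    # item intercepts mu_i
lam = np.array([0.9, 0.7, 0.8, 0.5, 0.6])   # factor loadings lambda_i
psi = np.array([0.6, 0.8, 0.7, 0.9, 0.85])  # unique SDs (sqrt of unique variance)

F = rng.standard_normal(n_persons)                   # standardized factor: mean 0, var 1
E = rng.standard_normal((n_persons, len(mu))) * psi  # unique parts, mean zero
X = mu + np.outer(F, lam) + E                        # item scores

print(np.round(X.mean(axis=0), 2))  # approximates mu: the model-predicted item means
```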

Factor Loadings
The term factor loading has a long history in psychology: it is the extent to which an item is "loaded onto" the factor. Some items load more highly onto the factor than others, and the factor loadings of the items reveal much about a test's structure, i.e., the mapping of items onto factors.

More on Factor Loadings
The factor loading is similar to a regression weight: it represents the amount of change in the item per one-unit increase in the factor score. It measures how sensitively each item functions as an indicator of the common factor F; items with relatively large loadings are better indicators of F than items with relatively small loadings. In this sense, the factor loading is a measure of the discriminating power of the item: how well the item discriminates between examinees with low and high values of F.

Single Factor Model Specifics
We need to define a few more things about our factor model:
- The unique component E_is is independent of the common factor F_s; independence means that Cov(E, F) = Corr(E, F) = 0
- The unique components of any two items i and j are independent: Cov(E_is, E_js) = Corr(E_is, E_js) = 0
- The mean of the unique component is zero

More Specifics
As in IRT, we also have to set the scale for F: we must pick its mean and variance. For most of our purposes, it serves us well to think of F as a standardized measure, with a mean of zero and a standard deviation/variance of one. Standardized factors are less common in factor analysis (CFA or SEM), where the variance of the factor must be fixed when additional modeling features are added; actually, the same is true in IRT.

What Does the Common Factor Model Say About Our Items?
What does the model predict about our items, marginally? What is the model-predicted item mean? What is the model-predicted item variance? Why are these important?

Model Predicted Item Mean
The mean of an item under the single factor model:

E(X_is) = E(μ_i + λ_i F_s + E_is)
        = E(μ_i) + E(λ_i F_s) + E(E_is)
        = μ_i + λ_i E(F_s) + E(E_is)
        = μ_i + λ_i · 0 + 0
        = μ_i

Item Mean Is Trivial
The factor model says that the item mean should equal the item mean parameter. Generally, we are not concerned with this quantity because it is only marginal information: it tells us nothing about how the item measures the common factor.

Model Predicted Item Variance
The variance of an item under the single factor model:

Var(X_is) = Var(μ_i + λ_i F_s + E_is)
          = Var(λ_i F_s + E_is)
          = Var(λ_i F_s) + Var(E_is) + 2 λ_i Cov(F_s, E_is)   [covariance term is zero by independence]
          = λ_i² Var(F_s) + Var(E_is)                         [we typically set Var(F_s) to one]
          = λ_i² + ψ_i²

We define ψ_i² = Var(E_is) to be the unique variance of the item.

Model Predicted Item Covariances
The covariance of a pair of items under the single factor model:

Cov(X_is, X_js) = Cov(μ_i + λ_i F_s + E_is, μ_j + λ_j F_s + E_js)
               = Cov(λ_i F_s + E_is, λ_j F_s + E_js)
               = Cov(λ_i F_s, λ_j F_s) + Cov(λ_i F_s, E_js) + Cov(λ_j F_s, E_is) + Cov(E_is, E_js)
               = λ_i λ_j Cov(F_s, F_s)
               = λ_i λ_j

The covariance of a variable with itself is its variance, and the variance of F is set to one, so Cov(F_s, F_s) = Var(F_s) = 1. The remaining covariance terms vanish by independence of the unique components.

Extrapolating to the Covariance Matrix
We have seen the model-predicted variance for each item and the model-predicted covariance for each pair of items. The model-predicted covariance matrix therefore has λ_i² + ψ_i² on the diagonal and λ_i λ_j in the off-diagonal cells; in matrix form, Σ = ΛΛ' + Ψ, where Λ holds the loadings and Ψ is the diagonal matrix of unique variances.

Item Information Under the Factor Model
The item information function under the single factor model is constant: the ratio of common to unique variance, λ_i²/ψ_i². Item information under the factor model is not a function of the latent trait. This is a key distinction between item information in the factor model and in IRT, and it is a consequence of the difference in data type; a nuance of categorical data is that the mean and the variance are related.
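A small Python sketch of the model-implied covariance matrix, with illustrative loadings and unique variances; the flat information formula λ_i²/ψ_i² follows the text's description but is written out here as an assumption.

```python
# A sketch of the model-implied covariance matrix under the single-factor
# model: Sigma = Lambda Lambda' + Psi, i.e. lambda_i^2 + psi_i^2 on the
# diagonal and lambda_i * lambda_j off the diagonal. Values are illustrative.
import numpy as np

lam = np.array([0.9, 0.7, 0.8, 0.5])   # factor loadings
psi2 = np.array([0.4, 0.6, 0.5, 0.8])  # unique variances

Sigma = np.outer(lam, lam) + np.diag(psi2)
print(np.round(Sigma, 2))

# Flat item "information" under this model (assumed formula per the text):
# the common-to-unique variance ratio, constant across the latent trait.
info = lam**2 / psi2
print(np.round(info, 2))
```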

Item Information Functions
[Figure: FA information functions (flat across the trait range) versus IRT information functions (peaked at different trait levels)]

Implications of FA Item Information
The test information function in FA is flat: regardless of a person's location on the scale, the standard error of their estimate is the same. CAT algorithms therefore do not make much sense with FA, because all items are equally informative across the scale and the items with the highest information would always be selected.

Comparing FA with IRT
FA and IRT have much in common: both provide a statistical model for response behavior as a function of a latent trait (or set of latent traits). IRT parameterizations obscure the commonalities between the models; to demonstrate, let's rephrase the 2PL model.

IRT Model in Slope/Intercept Form
Begin with the original 2PL model:

P(X_is = 1 | θ_s) = exp[1.7 a_i (θ_s − b_i)] / (1 + exp[1.7 a_i (θ_s − b_i)])

Then convert to the log odds of the probability of a correct response:

ln[P / (1 − P)] = 1.7 a_i (θ_s − b_i)

IRT Model in Slope/Intercept Form
Finally, multiply through the equation:

ln[P / (1 − P)] = −1.7 a_i b_i + 1.7 a_i θ_s

Now we can reconfigure the terms into FA analogs, an intercept and a loading:

ln[P / (1 − P)] = μ_i + λ_i θ_s, where μ_i = −1.7 a_i b_i and λ_i = 1.7 a_i

IRT vs. FA
Many IRT models are categorical versions of the factor analysis or structural equation model. The difference in model properties (like information) is due to the link function used for the data. A link function is the function applied to the left-hand side of the previous equation: the IRT models we have discussed usually use a logistic link function (or an ogive), whereas FA models use an identity link function (identity = no transformation at all).
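A quick Python check, with made-up a and b, that the slope/intercept form traces exactly the same curve as the original 2PL:

```python
# The 2PL in discrimination/difficulty form vs. slope/intercept (FA-analog)
# form: logit P = 1.7*a*(theta - b) = (-1.7*a*b) + (1.7*a)*theta
import numpy as np

a, b = 1.2, 0.5                     # illustrative 2PL parameters
theta = np.linspace(-3, 3, 7)

p_irt = 1 / (1 + np.exp(-1.7 * a * (theta - b)))  # original 2PL form
mu_i = -1.7 * a * b                               # FA-analog intercept
lam_i = 1.7 * a                                   # FA-analog loading (slope)
p_fa = 1 / (1 + np.exp(-(mu_i + lam_i * theta)))  # slope/intercept form

print(np.allclose(p_irt, p_fa))     # True: the two forms are the same model
```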

IRT vs. FA, continued
FA is appropriate for data that follow continuous distributions; IRT is appropriate for data that follow the corresponding categorical distribution (binary variables use the binomial with a logistic link; polytomous variables use the multinomial). The question to ask is at what point categorical data become continuous. If you think hard about it, all data are categorical; the question is how many categories.

Other Link Functions: Item Factor Analysis
Just as in categorical data analysis, other link functions exist, and their use results in models with IRT-like properties. One of the more prevalent is the probit or normal-ogive link: the cumulative distribution function of a standard normal variable. The use of the normal-ogive link dates to Lord (1952); such models are more commonly referred to as item factor models. Setting the scaling constant to 1.7 in a logistic IRT model approximates this function.

Item Factor Analysis
An alternative parameterization of the model, the common factor parameterization, is in terms of underlying quantitative "response tendencies." Each binary item has associated with it an underlying quantitative response tendency X*_is and a threshold value τ_i, such that:

If X*_is ≥ τ_i, then X_is = 1
If X*_is < τ_i, then X_is = 0

Underlying Response Model
The underlying response tendencies X*_is are then used with a factor analysis model, say the Spearman single factor model (mean omitted; see below):

X*_is = λ_i F_s + E_is

with uncorrelated unique parts. For model identification, we impose a scale on each X*_is so that it is standardized: mean zero (hence no μ_i), and variance one, so λ_i² + ψ_i² = 1.

Building on the Previous Model
F_s and E_is each have a normal distribution, so each X*_is also has a normal distribution. This leads to, for an item i:

P(X_is = 1 | F_s) = Φ[(λ_i F_s − τ_i) / ψ_i]

A larger λ_i means larger discriminating power; the larger the τ_i, the more difficult the item.

Similarities
We can relate the item factor analysis parameterization to the IRT parameterization:

Quantity               Item Factor Analysis → IRT
Item discrimination    a_i = λ_i / √(1 − λ_i²)
Item difficulty        b_i = τ_i / λ_i
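A Python sketch of these conversions (the usual normal-ogive relations, with illustrative λ and τ), verifying that both parameterizations give the same response probability:

```python
# Converting item factor analysis parameters (loading lambda, threshold tau)
# to normal-ogive IRT parameters (a, b), assuming a standardized latent
# response so that psi^2 = 1 - lambda^2. Values are illustrative.
import numpy as np
from scipy.stats import norm

lam, tau = 0.7, 0.4                # illustrative IFA loading and threshold

a = lam / np.sqrt(1 - lam**2)      # IRT discrimination: grows with the loading
b = tau / lam                      # IRT difficulty: grows with the threshold

theta = 0.8
p_ifa = norm.cdf((lam * theta - tau) / np.sqrt(1 - lam**2))
p_irt = norm.cdf(a * (theta - b))
print(np.isclose(p_ifa, p_irt))    # True: same normal-ogive response curve
```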

So Why Use One Parameterization over the Other?
The common factor parameters are most useful in a preliminary examination of the structure of the data: many people are experienced with factor loadings, and we can use established factor-analytic criteria for judging their sizes. The response function parameterizations are useful in applications of a fitted model because they generally simplify computations, and differing estimation routines can be employed.

Interpretation of Item Factor Parameters
One can interpret the common factor parameters in relation to classical item analysis. The factor loading of the item, λ_i, is the product-moment correlation between X*_i and F, which is the biserial correlation between the binary X_i and F. The product of the factor loadings of any pair of items i and j gives the model estimate of the tetrachoric correlation between the items: ρ*_ij = λ_i λ_j.

GENERALIZING BEYOND CATEGORICAL AND CONTINUOUS DATA

Welcome to the Family: Generalized Linear Models
Generalized linear models extend general linear models to non-normal error terms by transforming the outcome into some kind of continuous quantity to work with. Many kinds of non-normally distributed outcomes have a generalized linear model to go with them:
- Binary (dichotomous)
- Ordered categorical (ordinal)
- Unordered categorical (nominal); these two are often, and inconsistently, called multinomial
- Censored (piled up and cut off at one end, left or right)
- Counts (discrete, non-negative values)
- Counts with zero issues (too many zeros, or none)

3 Parts of a Generalized Linear Model
- Link function (the main difference from the GLM): how a non-normal outcome gets transformed into something more continuous (unbounded) that we can predict. For outcomes that are already normal, general linear models are just a special case with an identity link function (Y × 1).
- Model for the means ("structural model"): how predictors linearly relate to the transformed outcome, e.g., transformed Y = β_0 + β_1 X_s + β_2 Z_s.
- Model for the variance ("sampling/stochastic model"): if the errors aren't normal and homoscedastic, then what are they? We have a family of alternative distributions at our disposal that map onto what the distribution of errors could look like.

Model Parts for Binary Outcomes: 2 Choices, Logit vs. Probit
Two alternative link functions (see the sketch below):
- Logit link: transformed Y = ln[p / (1 − p)]; Y is 0/1, but logit(Y) goes from −∞ to +∞
- Probit link: transformed Y = Φ⁻¹(p), the z-score below which the observed proportion of the standard normal curve falls; Y is 0/1, but probit(Y) goes from −∞ to +∞
Same model for the means: main effects and interactions of predictors as desired (there is no analog to odds-ratio coefficients in probit, however).
Two alternative models for the variance:
- Logit: error variance fixed at the known value π²/3, or about 3.29 (the standard logistic distribution)
- Probit: error variance fixed at the known value 1 (the standard normal distribution)
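As referenced above, a short Python sketch contrasting the two links; the probabilities are arbitrary examples:

```python
# Logit vs. probit link functions: both map a probability in (0, 1)
# onto the whole real line.
import numpy as np
from scipy.stats import norm

p = np.array([0.05, 0.25, 0.50, 0.75, 0.95])  # example probabilities

logit = np.log(p / (1 - p))  # logit link: log-odds, -inf to +inf
probit = norm.ppf(p)         # probit link: z-score below which proportion p falls

print(np.round(logit, 2))
print(np.round(probit, 2))   # roughly logit/1.7, hence the 1.7 scaling constant
```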

Ordered Categorical Outcomes
One option: the cumulative logit model, called the graded response model in IRT. It assumes ordinal categories and models the logit of each category versus all lower/higher categories via submodels (3 categories → 2 submodels: 0 vs. 1-or-2, and 0-or-1 vs. 2; see the sketch below). Each submodel gets a separate threshold (= −intercept), but the effects of predictors are assumed the same across submodels → the proportional odds assumption, which is testable in some software (e.g., Mplus, NLMIXED). In Mplus, this is done with the CATEGORICAL ARE option.

Another option: the adjacent category logit model, called the partial credit model in IRT. It does not assume order across all categories (only adjacent ones) and models the logit of sequential categories only, via submodels (3 categories → 2 submodels: 0 vs. 1, and 1 vs. 2). Each submodel gets a separate threshold (= −intercept), and the effects of predictors are still assumed the same across adjacent-category submodels → a proportional odds assumption, testable in some software (e.g., NLMIXED). This model is currently not available in Mplus.
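As referenced above, a minimal Python sketch of cumulative logit (graded response) category probabilities for a 3-category item, with illustrative parameters:

```python
# Category probabilities for one 3-category item under the cumulative
# logit (graded response) model; parameter values are illustrative.
import numpy as np

def grm_probs(theta, a, b1, b2):
    """P(Y = 0, 1, 2) via differences of cumulative curves."""
    # One common slope `a` across submodels mirrors the proportional
    # odds assumption described above.
    p_ge1 = 1 / (1 + np.exp(-a * (theta - b1)))  # submodel 1: P(Y >= 1), 0 vs. 1-or-2
    p_ge2 = 1 / (1 + np.exp(-a * (theta - b2)))  # submodel 2: P(Y >= 2), 0-or-1 vs. 2
    return np.array([1 - p_ge1, p_ge1 - p_ge2, p_ge2])

probs = grm_probs(theta=0.0, a=1.5, b1=-1.0, b2=1.0)
print(np.round(probs, 3), probs.sum())  # three probabilities summing to 1
```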

Unordered Categorical Outcomes: the Nominal Model
Compare each category against a reference category using a binary logit model, referred to as the baseline category logit. You end up with multiple logistic submodels, up to #categories − 1 (2 submodels for 3 categories, 3 for 4 categories, etc.). Intercepts/thresholds and slopes for the effects of predictors (factor loadings) are estimated separately within each binary submodel; effects for the missing contrast can be obtained by subtraction. Effects are interpreted as "given that the response is one of these two categories, which has the higher probability?" Model comparisons proceed as in logistic regression, and one can also test whether outcome categories can be collapsed. In Mplus, this is done with the NOMINAL ARE option.

Censored ("Tobit") Outcomes
For outcomes with ceiling or floor effects: they can be right-censored and/or left-censored, and also inflated or not (inflation = a binary variable in which 1 = censored, 0 = not censored); see the sketch below. The model assumes an unobserved continuous distribution in place of the part that is missing. In Mplus, use CENSORED ARE with options:

CENSORED ARE y1 (a) y2 (b) y3 (ai) y4 (bi);

- y1 is censored from above (right); y2 is censored from below (left)
- y3 is censored from above (right) and has an inflation variable (inflated: y3#1)
- y4 is censored from below (left) and has an inflation variable (inflated: y4#1)

So we can predict the distributions of y1-y4, as well as whether or not y3 and y4 are censored ("inflation") as separate outcomes:
y3 ON x;    → x predicts the value of Y if at the censoring point or above
y3#1 ON x;  → x predicts whether Y is censored (1) or not (0)
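As referenced above, a minimal Python sketch of a left-censored outcome, with an arbitrary normal latent variable, showing the pile-up at the floor and the inflation indicator:

```python
# What a censored (tobit-style) outcome looks like: an underlying continuous
# variable piles up at the censoring point. Distribution values illustrative.
import numpy as np

rng = np.random.default_rng(seed=2)
y_star = rng.normal(loc=2.0, scale=3.0, size=10_000)  # unobserved continuous Y*

y = np.maximum(y_star, 0.0)              # observed Y, left-censored at 0 (floor effect)
censored = (y_star <= 0.0).astype(int)   # inflation indicator: 1 = censored, 0 = not

print(round(censored.mean(), 3))         # proportion piled up at the floor (~0.25 here)
print(round(y[censored == 0].mean(), 2)) # mean of the uncensored part
```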

A Family of Options in Mplus for Count Outcomes (COUNT ARE)
Counts are non-negative integer, unbounded responses, e.g., "how many cigarettes did you smoke this week?" Poisson and negative binomial models share the same link, count Y → ln(Y) (which keeps the predicted count positive):

LN(Y_is) = μ_i + λ_i F_s + e_is   (the model has intercepts and loadings)

The residuals follow one of two distributions:
- Poisson distribution, in which the rate k = mean = variance
- Negative binomial distribution, which adds a scaling or overdispersion parameter α that allows the variance to be bigger than the mean: variance = k(1 + kα)
The Poisson is nested within the negative binomial (one can test whether α > 0); see the sketch below.
COUNT ARE y1 (p) y2 (nb);  → y1 is Poisson; y2 is negative binomial

Issues with Zeros in Count Data
No zeros possible → zero-truncated negative binomial, e.g., "how many days were you in the hospital?" (has to be > 0):
COUNT ARE y1 (nbt);
Too many zeros → zero-inflated Poisson or negative binomial, e.g., "# cigarettes smoked" when asked of non-smokers too:
COUNT ARE y2 (pi) y3 (nbi);  → refer to the inflation variable as y2#1 or y3#1
Zero inflation tries to distinguish two kinds of zeros:
- Structural zeros: "would never do it"; inflation is predicted as the logit of being a structural zero
- Expected zeros: "could do it, just didn't" (part of the regular count); counts with expected zeros are predicted by the Poisson or negative binomial
The Poisson or negative binomial without inflation is nested within the model with inflation (and the Poisson is nested within the negative binomial).
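As referenced above, a Python sketch of the Poisson-vs-negative-binomial variance contrast, using the standard gamma-mixture construction of the negative binomial; the mean and α are illustrative:

```python
# Poisson (mean = variance) vs. negative binomial (variance = mean + alpha*mean^2)
# counts; alpha is the overdispersion parameter described above.
import numpy as np

rng = np.random.default_rng(seed=3)
mean, alpha = 4.0, 0.5

y_pois = rng.poisson(lam=mean, size=100_000)

# Negative binomial as a gamma-mixed Poisson: gamma with shape 1/alpha and
# scale alpha*mean has mean `mean` and variance alpha*mean^2, which yields
# the stated count variance, mean*(1 + mean*alpha).
lam_mix = rng.gamma(shape=1 / alpha, scale=alpha * mean, size=100_000)
y_nb = rng.poisson(lam=lam_mix)

print(round(y_pois.mean(), 2), round(y_pois.var(), 2))  # ~4, ~4
print(round(y_nb.mean(), 2), round(y_nb.var(), 2))      # ~4, ~12 = 4*(1 + 4*0.5)
```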

Issues with Zeros in Count Data, continued
Other, more direct ways of dealing with too many zeros split the distribution into "0 or not" and "if not 0, how much?" (see the sketch below):

Negative binomial hurdle (or zero-altered negative binomial):
COUNT ARE y1 (nbh);
- "0 or not": predicted by the logit of being a 0 ("0" is the higher category)
- "How much": predicted by a zero-truncated negative binomial

Two-part model, using the Mplus DATA TWOPART: command:
NAMES ARE y1-y4;       ! list outcomes to be split into 2 parts
CUTPOINT IS 0;         ! where to split the observed outcomes
BINARY ARE b1-b4;      ! create names for the "0 or not" part
CONTINUOUS ARE c1-c4;  ! create names for the "how much" part
TRANSFORM IS LOG;      ! transformation of the continuous part
- "0 or not": predicted by the logit of being NOT 0 ("something" is the 1)
- "How much": predicted by a transformed (e.g., log) normal distribution
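As referenced above, a Python sketch of the two-part split itself (the data-preparation step that DATA TWOPART automates), with simulated zero-heavy counts:

```python
# Splitting a zero-heavy outcome into a binary "0 or not" part and a
# logged "how much" part; the simulated data are illustrative.
import numpy as np

rng = np.random.default_rng(seed=4)
# zero-heavy counts: a Poisson count zeroed out for a random half of people
y = rng.poisson(lam=2.0, size=12) * rng.binomial(1, 0.5, size=12)

b = (y > 0).astype(int)      # binary part: anything (1) vs. nothing (0)
c = np.full(y.shape, np.nan) # continuous part: missing when y == 0
c[y > 0] = np.log(y[y > 0])  # log of the positive counts otherwise

print(y)
print(b)
print(np.round(c, 2))
```

CONCLUDING REMARKS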

Wrapping Up
When fitting latent factor models (or when just predicting observed outcomes from observed predictors), you have many options for non-normal distributions:
- CFA: continuous outcomes with normal residuals; X → Y is linear
- IRT and IFA: categorical or ordinal outcomes with Bernoulli/multinomial residuals; X → transformed Y is linear, while X → original Y is nonlinear
- Censored: continuous outcomes that shut off; X → Y is linear, and the model tries to predict what would happen if Y kept going
- Count family: non-negative integer outcomes; X → LN(Y) is linear; residuals can be Poisson (where mean = variance) or negative binomial (where variance > mean), and either can be zero-inflated or zero-truncated. Hurdle or two-part models may be a more direct way to predict and interpret excess zeros (predict "zero or not" and "how much" rather than two kinds of zeros).