A Markov chain Monte Carlo approach to confirmatory item factor analysis
Michael C. Edwards
The Ohio State University
An MCMC approach to CIFA
Overview
- Motivating examples
- Intro to Item Response Theory (IRT)
- Review of IRT estimation history
- Current estimation challenges
- Markov chain Monte Carlo (MCMC)
- MCMC applied to IRT
- Some results
Motivating Examples
- SAT
- Diagnostic scores
- Quality of life measurement
- Dimensionality
- Scoring
Item Response Theory
IRT is a collection of latent variable models that explain the process by which people respond to items in terms of item and person parameters.
2-Parameter Normal Ogive Model
- One of the most widely used IRT models
- Appropriate for dichotomous item responses when guessing is not present
[Figure: item characteristic curve, P(item response) from 0 to 1 plotted against theta from -3 to 3]
P(y_j = 1 | θ) = Φ[a_j(θ − b_j)]
P(y_j = 1 | θ) = Φ[a_j θ + d_j]
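The two parameterizations on this slide are equivalent (with d_j = −a_j b_j). A minimal sketch in Python (standard library only; the parameter values are illustrative, not taken from any example in this talk):

```python
import math

def normal_cdf(x):
    """Standard normal CDF, Phi(x), via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def p_2pno_difficulty(theta, a, b):
    """2PNO, discrimination/difficulty form: P(y_j = 1 | theta) = Phi[a_j (theta - b_j)]."""
    return normal_cdf(a * (theta - b))

def p_2pno_intercept(theta, a, d):
    """2PNO, slope-intercept form: Phi[a_j theta + d_j], where d_j = -a_j b_j."""
    return normal_cdf(a * theta + d)
```

At b_j = θ the probability is exactly 0.5, and larger a_j makes the curve steeper around that point.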
Samejima's Graded Model
- Widely used in psychology
- Appropriate for ordered categorical data
[Figure: category response curves, P(item response) from 0 to 1 plotted against theta from -3 to 3]
P(y_j = c | θ) = Φ[a_j(θ − b_jc)] − Φ[a_j(θ − b_j,c+1)]
P(y_j = c | θ) = Φ[a_j θ + (d_j + o_jc)] − Φ[a_j θ + (d_j + o_j,c+1)]
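The graded model takes differences of adjacent cumulative curves, with the cumulative probability fixed at 1 below the first boundary and 0 above the last. A minimal sketch (Python, standard library only; boundary values are illustrative):

```python
import math

def normal_cdf(x):
    """Standard normal CDF, Phi(x)."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def grm_probs(theta, a, b):
    """Graded-model category probabilities for one item.

    b is the increasing list of category boundaries b_j1 < ... < b_j,C-1.
    P(y_j = c) = Phi[a_j (theta - b_jc)] - Phi[a_j (theta - b_j,c+1)],
    with the cumulative curve padded by 1 below the first boundary and
    0 above the last.
    """
    cum = [1.0] + [normal_cdf(a * (theta - bc)) for bc in b] + [0.0]
    return [cum[c] - cum[c + 1] for c in range(len(cum) - 1)]
```

Because the boundaries are ordered, the differences are guaranteed non-negative and sum to 1 across categories.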
IRT Parameter Estimation
- Heuristic estimation: Lord (1952), Lord & Novick (1968)
- Joint maximum likelihood: Lord (1953), Rasch (1960), Birnbaum (1968)
- Maximum marginal likelihood: Bock & Lieberman (1970)
- Maximum marginal likelihood with an EM algorithm: Bock & Aitkin (1981)
- Bayes modal estimation with an EM algorithm: Mislevy (1986)
Potential Uses
- Item analysis
- Scale development
- Scoring
- Linking and equating
- Computerized adaptive testing (CAT)
Moving to Multiple Dimensions
- TESTFACT
  - Exploratory item factor analysis for dichotomous models
  - Bi-factor model
  - Uses MML-EM and different methods of numerical integration
- SEM approaches
  - WLS, DWLS, UBN, etc.
Old Problems, New Solutions
- Curse of dimensionality
- How to handle high-dimensional integration?
Markov Chain Monte Carlo
- MCMC estimation can be thought of as Monte Carlo integration using Markov chains
- Monte Carlo integration works by simulating samples from a target distribution and then computing sample averages to replace expectations
- The simulated values from the target distribution are generated by constructing a Markov chain with the target as its stationary distribution
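The Monte Carlo integration idea can be illustrated in a few lines (a Python sketch, not the dissertation's actual code): E[θ²] under a standard normal is exactly 1, and an average over simulated draws recovers it.

```python
import random

random.seed(1)

# Monte Carlo integration: replace E[g(theta)] with the average of g over
# simulated draws from the target distribution. Here the target is a
# standard normal and g(theta) = theta^2, so the exact answer is 1.
draws = [random.gauss(0.0, 1.0) for _ in range(100_000)]
estimate = sum(t * t for t in draws) / len(draws)
```

The catch, of course, is that for a posterior distribution there is no built-in sampler like `random.gauss`; that is where the Markov chain comes in.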
How to Simulate from the Target Distribution?
- Metropolis-Hastings
- Gibbs sampling
- Data-augmented Gibbs sampling (DAG)
- Metropolis-Hastings within Gibbs (MHwG)
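As a minimal sketch of the first of these, here is a random-walk Metropolis-Hastings sampler (my own illustration in Python, targeting a standard normal so the answer is checkable; real IRT posteriors replace `log_target`):

```python
import math
import random

random.seed(7)

def log_target(x):
    """Unnormalized log density of the target; here a standard normal."""
    return -0.5 * x * x

def random_walk_metropolis(n_draws, step=1.0, start=0.0):
    """Metropolis-Hastings with a symmetric normal proposal.

    Propose x' ~ N(x, step); accept with probability
    min(1, target(x') / target(x)). The accepted states form a Markov
    chain whose stationary distribution is the target.
    """
    x = start
    chain = []
    for _ in range(n_draws):
        proposal = random.gauss(x, step)
        if math.log(random.random()) < log_target(proposal) - log_target(x):
            x = proposal  # accept; otherwise the chain stays put
        chain.append(x)
    return chain

chain = random_walk_metropolis(50_000)
mean = sum(chain) / len(chain)
```

Because the proposal is symmetric, the Hastings correction cancels and only the ratio of target densities matters.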
MCMC and IRT
- Albert (1992): first published application of MCMC to IRT
- Patz & Junker (1996): MHwG for IRT models
- Early forays into MCMC for MIRT:
  - Béguin & Glas (2001); Segall (2002)
  - Shi & Lee (1998); Arminger & Muthén (1998)
Dissertation Research An MCMC Approach to Confirmatory Item Factor Analysis
Comparing MCMC Approaches
- DA-Gibbs? MHwG? Combinations?

MCMC with the GRM:
Authors        | a    | d    | o
A & C (93)     | DAG  | DAG  | DAG
Cowles (96)    | DAG  | DAG  | MHwG
J & A (99)     | DAG  | DAG  | MHwG
Fox (04)       | DAG  | -    | MHwG
W et al. (02)  | MHwG | MHwG | MHwG

MCMC with the 3PNO:
Authors        | a    | d    | g
P & J (99)     | MHwG | MHwG | MHwG
W et al. (00)  | MHwG | MHwG | MHwG
B & G (01)     | DAG  | DAG  | DAG
Segall (02)    | DAG  | DAG  | DAG
Sahu (02)      | DAG  | DAG  | DAG
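Several of the approaches in the table rely on Albert & Chib-style data augmentation. As a toy illustration (my own sketch, not code from any of these papers), here is a data-augmented Gibbs sampler for the simplest probit case, an intercept-only model P(y = 1) = Φ(d); the latent z step and the conjugate normal draw are the same two moves used in the DAG entries above:

```python
import random
from statistics import NormalDist

random.seed(11)
std_normal = NormalDist()

def truncated_normal(mean, lower=None, upper=None):
    """Draw from N(mean, 1) truncated below at `lower` or above at `upper`."""
    lo = 0.0 if lower is None else std_normal.cdf(lower - mean)
    hi = 1.0 if upper is None else std_normal.cdf(upper - mean)
    u = lo + random.random() * (hi - lo)
    u = min(max(u, 1e-12), 1.0 - 1e-12)  # keep inv_cdf's argument in (0, 1)
    return mean + std_normal.inv_cdf(u)

def gibbs_probit_intercept(y, n_iter=1000, burn_in=200):
    """Data-augmented Gibbs for P(y = 1) = Phi(d), with a N(0, 1) prior on d.

    Step 1: draw latent z_i ~ N(d, 1), truncated positive when y_i = 1
            and negative when y_i = 0.
    Step 2: draw d from its conjugate normal posterior given z.
    """
    n = len(y)
    d = 0.0
    keep = []
    for it in range(n_iter):
        z = [truncated_normal(d, lower=0.0) if yi == 1
             else truncated_normal(d, upper=0.0) for yi in y]
        post_var = 1.0 / (n + 1.0)       # prior precision 1 plus n data points
        post_mean = sum(z) * post_var
        d = random.gauss(post_mean, post_var ** 0.5)
        if it >= burn_in:
            keep.append(d)
    return keep

# Simulate data with d = 0.5, then recover it from the posterior draws.
d_true = 0.5
y = [1 if random.gauss(0.0, 1.0) < d_true else 0 for _ in range(300)]
draws = gibbs_probit_intercept(y)
d_hat = sum(draws) / len(draws)
```

The appeal of data augmentation is visible even here: given z, the draw for d is a closed-form normal, with no tuning of proposal distributions.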
Performance Components
Examine:
- Parameter recovery
- Autocorrelations and effective sample size
- Mixing of chains
- Time per cycle
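Autocorrelation and effective sample size can be computed directly from a chain. A minimal sketch (Python; it uses one common truncation rule, stopping at the first non-positive autocorrelation, which is a convention rather than the only choice):

```python
def autocorr(chain, lag):
    """Sample autocorrelation of the chain at the given lag."""
    n = len(chain)
    mean = sum(chain) / n
    var = sum((x - mean) ** 2 for x in chain) / n
    cov = sum((chain[t] - mean) * (chain[t + lag] - mean)
              for t in range(n - lag)) / n
    return cov / var

def effective_sample_size(chain, max_lag=200):
    """ESS = n / (1 + 2 * sum of autocorrelations), truncating the sum at
    the first non-positive lag. A highly autocorrelated chain carries far
    less information per cycle than its raw length suggests."""
    total = 0.0
    for lag in range(1, max_lag + 1):
        rho = autocorr(chain, lag)
        if rho <= 0.0:
            break
        total += rho
    return len(chain) / (1.0 + 2.0 * total)
```

For independent draws the ESS is close to the chain length; for a sticky chain (e.g., an AR(1) process with coefficient 0.9) it is a small fraction of it.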
Next Steps
- Make it work, make it fast, make it pretty
  - R too slow; move to C++
- Multiple correlated factors with independent clustering
- Cross-loadings
- Mixed item types
Example 1
[Path diagram: four correlated factors, F1-F4, each measured by its own cluster of items]
[Figure: trace plots and posterior histograms for Item 1's slope, intercept, and offsets (o2, o3) across 60,000 MCMC cycles]
Example 1 Results (RMSE)

          N = 2000                N = 500
      MCMC  Multilog  WLS     MCMC  Multilog  WLS
a     0.05  0.05      0.06    0.09  0.11      0.27
d     0.08  0.10      0.10    0.10  0.12      0.18
o2    0.04  0.05      0.06    0.10  0.10      0.18
o3    0.07  0.08      0.10    0.15  0.16      0.23
r     0.02  -         0.03    0.03  -         0.09
Example 2
[Path diagram: bifactor model with a general factor G and specific factors S1 and S2]
Example 3
[Path diagram: two correlated factors, F1 and F2]
Example 3 - Slopes
[Scatter plot: difference between estimated and generating slopes vs. generating slope (0.00 to 2.50), with Item 3 highlighted as the largest outlier]
Example 3 - Intercepts
[Scatter plot: difference between estimated and generating intercepts vs. generating intercept (-6.00 to 6.00), with Item 3 again highlighted]
Example 3 - Lower Asymptotes
[Scatter plot: difference between estimated and generating lower asymptotes vs. generating lower asymptote (0.00 to 0.40)]
Example 4
[Path diagram: two correlated primary factors (F1, F2) with specific factors S1-S4]
Example 4 - Reflection
[Figure: trace plots and histograms for the specific slopes of Items 40, 41, and 42 across 100,000 cycles, showing the slopes reflecting between positive and negative values]
Example 4 - Correlation Troubles
[Figure: trace plots and histograms for factor correlations R(2,1), R(5,2), and R(6,2) across 100,000 cycles]
Example 4 - Slopes
[Scatter plot: difference between estimated and generating slopes vs. generating slope (0.00 to 2.50)]
Example 4 - Intercepts
[Scatter plot: difference between estimated and generating intercepts vs. generating intercept (-5.00 to 4.00)]
Example 4 - Lower Asymptotes
[Scatter plot: difference between estimated and generating lower asymptotes vs. generating lower asymptote (0.00 to 0.45)]
Example 5
[Path diagram: eight correlated factors, F1-F8]
Example 5 - Slopes
[Scatter plot: difference between estimated and generating slopes vs. generating slope (0.60 to 1.50)]
Example 5 - Intercepts
[Scatter plot: difference between estimated and generating intercepts vs. generating intercept (-3.00 to 3.00)]
Example 5 - Lower Asymptotes
[Scatter plot: difference between estimated and generating lower asymptotes vs. generating lower asymptote (0.10 to 0.30)]
Beta Weight = 0
[Scatter plot: difference between estimated and generating lower asymptotes vs. generating lower asymptote (0.10 to 0.30), with the beta prior weight set to 0]
Example 5 - Correlations
[Scatter plot: difference between estimated and generating correlations vs. generating correlation (0.00 to 1.00)]
Example 5 Continued: More Uniform R
[Scatter plot: difference between estimated and generating correlations vs. generating correlation (0.20 to 0.80)]
5 Most Poorly Recovered Items
[Figure: generated vs. estimated item response curves for Items 1, 9, 38, 54, and 65, probability plotted against theta from -4 to 4]
Future Directions
- Software dissemination
- Projects
  - MCMC variant comparison
  - Confirmatory item factor analysis and applications
  - MCMC vs. MML-EM, WLS, etc.
- Extensions
  - Structural model
  - Model fit
Concluding Thoughts
- Item factor analysis is a useful method for the social sciences
- The IRT framework provides an advantageous platform for item factor analysis
- MCMC can be used to estimate parameters for more complex IRT models
Acknowledgements
- Dave Thissen
- Dissertation committee: Ken Bollen, Patrick Curran, Andrea Hussong, & Bud MacCallum
- Jean-Paul Fox
- Psychometric Society (& judges)
Thank You.
edwards.134@osu.edu
http://faculty.psy.ohio-state.edu/edwards/