The Impact of Model Misspecification in Clustered and Continuous Growth Modeling


Daniel J. Bauer
Odum Institute for Research in Social Science
The University of North Carolina at Chapel Hill

Patrick J. Curran
L.L. Thurstone Psychometric Laboratory, Department of Psychology
The University of North Carolina at Chapel Hill

A key focus of developmental science is the study of individual differences in developmental trajectories. Two types of models are often used for this purpose:

- Latent curve models (LCM) estimate quantitative variations in continuously distributed trajectory parameters.
- Growth mixture models (GMM) estimate latent classes of individuals whose trajectories differ in qualitatively important ways.

The relationship between these models can be clarified by writing the density function for the GMM as

$$g(y \mid \pi, \theta) = \sum_{k=1}^{K} \pi_k f_k(y \mid \theta_k)$$

where $\pi_k$ represents the proportion of cases from class $k$, and $\theta_k$ are the growth parameters that define the trajectory $f_k$ for each class $k$. This model reduces to the LCM when $K = 1$.

This poster investigates several critical issues raised by these alternative models:

- Can data screening guide the selection of an LCM or GMM model?
- Can model misspecification be detected using traditional fit statistics?
- How will model misspecification affect the estimation and interpretation of the model parameters?

Method

Design Factors

- The true number of growth functions in the population (K = 1 or 3)
- The model estimated for the data (LCM or GMM)
- The type of growth model fit to the data (Unconditional or Conditional)

Data Generation

- Multiclass (K = 3) and Single Class (K = 1) data were generated with Mplus.
  o The Multiclass data sets both combined an equal number of cases from each of the three trajectories displayed in Figure 1.
  o The Single Class data were generated from the trajectory of only one of these groups.
- Each set included 6 cases so the results would reflect asymptotic behavior.
- Two time-invariant predictors (X1-X2) were generated so that their relations to the trajectory parameters would vary across classes.

Results

Data Screening

- Histograms of Y1-Y4 are presented in Figure 2 for the Multiclass data.
  o The distributions appear normal at all time points except Y4.
- Histograms of individual parameter estimates (generated by case-wise OLS) for the Multiclass data are presented in Figure 3.
  o The distributions appear normal.
- Scatterplots of Y1-Y4 and the OLS parameters (not shown) also appeared bivariate normal.

Model Fit Diagnostics

- Good fit was obtained for both correct and misspecified LCM models.
  o LCM of Single Class data (K = 1)
    - Unconditional Model: χ²() = ., p = .6; RMSEA = .
    - Conditional Model: χ²() = .4, p = .7; RMSEA = .
  o LCM of Multiclass data (K = 3)
    - Unconditional Model: χ²() = ., p = .7; RMSEA = .
    - Conditional Model: χ²() = .68, p = .64; RMSEA = .
- GMM models (K = 2 & 3) fit to the Single Class data consistently iterated to a solution in which the extra groups had estimated sample sizes of 0.
- Fit statistics (BIC and AIC) for the GMM models fit to the Multiclass data are presented in Figure 4.
  o BIC & AIC pointed to fewer than the true 3 groups for the unconditional models.
  o Discrimination of the groups is improved with the inclusion of predictors.

Impacts of Misspecification

- Incorrectly specifying too many groups in a GMM analysis had no adverse effects on parameter estimates, as the extra groups had zero estimated members.
- Figure 5 presents the mean trajectories estimated for the Multiclass data when too few groups were specified.
  o The LCM (K = 1) model essentially averaged the three trajectory classes.
  o The GMM (K = 2) model essentially merged two of the three groups.
- In the conditional models, the effects of the predictors on the growth factors were similarly averaged as the groups were combined.

Conclusions

- Can data screening guide the selection of an LCM or GMM model?
  o Neither the distributions of the observed variables nor those of the growth parameters conveyed the presence of multiple trajectory classes.
  o Data screening may be more informative as the trajectory classes become more distinctive and the variance within classes decreases.
- Can model misspecification be detected using traditional fit statistics?
  o Misspecifying an LCM model for Multiclass data was not detectable using traditional model fit statistics.
  o BIC & AIC did not always lead to selection of the correct number of groups.
    - Including predictors in the model helped to discriminate the groups.
- How will model misspecification affect the estimation and interpretation of the model parameters?
  o Estimating a GMM with more classes than necessary had no harmful effects, and pointed toward simplifying the model to the correct form.
    - The effect of this type of misspecification would probably be greater under less ideal conditions (e.g., when single class data are skewed).
  o When too few groups were specified, the most similar groups collapsed together, obscuring their distinctive trajectories and relations to predictors.
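The data-screening conclusion can be illustrated with a small sketch of the mixture density g(y) = sum over k of pi_k f_k(y | theta_k). This is a simplified illustration, not the analysis run for the poster: it assumes univariate normal component densities in place of the growth-model densities, and the means, standard deviations, and function names (`normal_pdf`, `mixture_density`, `count_modes`) are hypothetical values and helpers chosen only for the example. It counts the modes of a three-class mixture when the classes overlap and when they are far apart.

```python
import math


def normal_pdf(y, mean, sd):
    """Density of a univariate normal distribution evaluated at y."""
    z = (y - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def mixture_density(y, proportions, means, sds):
    """g(y) = sum_k pi_k * f_k(y | theta_k), with normal f_k (an assumption)."""
    if abs(sum(proportions) - 1.0) > 1e-9:
        raise ValueError("class proportions must sum to 1")
    return sum(p * normal_pdf(y, m, s)
               for p, m, s in zip(proportions, means, sds))


def count_modes(proportions, means, sds, lo=-10.0, hi=10.0, steps=2000):
    """Count strict local maxima of the mixture density on an even grid."""
    grid = [lo + (hi - lo) * i / steps for i in range(steps + 1)]
    dens = [mixture_density(y, proportions, means, sds) for y in grid]
    return sum(1 for i in range(1, steps)
               if dens[i - 1] < dens[i] > dens[i + 1])


# Three equally sized classes, as in the Multiclass design (values hypothetical).
equal_thirds = [1 / 3, 1 / 3, 1 / 3]

# Overlapping classes: class means are close relative to within-class spread.
overlapping_modes = count_modes(equal_thirds, [-1.0, 0.0, 1.0], [1.5, 1.5, 1.5])

# Distinct classes: class means are far apart relative to within-class spread.
separated_modes = count_modes(equal_thirds, [-6.0, 0.0, 6.0], [1.0, 1.0, 1.0])

print(overlapping_modes, separated_modes)
```

Under these assumed values the overlapping mixture has a single mode, so a histogram of such data would give no visual hint of three classes, while the well-separated mixture shows three modes. Passing a single class with proportion 1 to `mixture_density` returns the ordinary normal density, mirroring the reduction of the GMM to the LCM when K = 1.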

Figure 1. Trajectories used for Data Generation. [Plot of the dependent variable (Y) against Time, by group.]

Figure 2. Univariate Distributions of Y1-Y4 for Multiclass Data. [Histograms, one panel per time point: Y1, Y2, Y3, Y4.]

Figure 3. Distributions of Growth Parameters for Multiclass Data. [Histograms, three panels: Intercept, Linear Factor, Quadratic Factor.]

Figure 4. Fit Statistics for GMM Models. [Two panels: Unconditional GMM and Conditional GMM; AIC and BIC fit statistic values plotted against the number of classes estimated.]

Figure 5. Misspecified Model Implied Trajectories for Multiclass Data. [Two panels: LCM (K = 1) of Multiclass Data and GMM (K = 2) of Multiclass Data; dependent variable (Y) plotted against Time, by group.]