How to use Stata's sem with small samples? New corrections for the L.R. χ² statistics and fit indices

Meeting of the German Stata User Group at the University of Konstanz, June 22nd, 2018

"All models are wrong, but some are useful." (George E. P. Box)

Dr. Wolfgang Langer
Martin-Luther-Universität Halle-Wittenberg, Institut für Soziologie
Assistant Professeur Associé, Université du Luxembourg

Contents
- What is the problem?
- What are the solutions for it?
- What do we know from Monte Carlo simulation studies?
- How to implement it in Stata?
- Empirical example of Islamophobia in Germany 2016
- Conclusions

What is the problem?
In empirical research, more and more people estimate their SEMs using a small sample (n < 100), for example in psychology, marketing or business research. When working with small samples we are confronted with a severe problem:
- The traditional likelihood-ratio χ² goodness-of-fit test and all fit indices based on it tend to over-reject acceptable models. They are too strict!
- This is caused by the poor approximation of the χ² test statistic to the noncentral χ² distribution.

What are the solutions for it?
Several correction procedures have been developed to improve the approximation of the L.R. χ² test statistic (T_ML) to the noncentral χ² distribution:
- The Bartlett correction
- The Yuan correction
- The Swain correction

The Bartlett correction
Bartlett (1937, 1950, 1954) developed a small-sample correction to test the exact fit of exploratory factor models estimated by ML:

$$ b = 1 - \frac{4k + 2p + 5}{6n} $$

Bartlett-corrected statistic: $T_{MLb} = b \cdot T_{ML}$

Legend:
- k: number of latent variables (factors)
- p: number of observed variables (indicators)
- n: sample size N − 1
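As a rough numerical illustration of the Bartlett factor, the following Python sketch evaluates b = 1 − (4k + 2p + 5)/(6n). It is not part of the presented ado-file. The function name is hypothetical, n = N − 1 is assumed, and the inputs (k = 3 factors, p = 12 indicators, N = 84, T_ML = 67.723) are borrowed from the empirical example later in the talk purely for illustration.

```python
def bartlett_factor(k: int, p: int, n_obs: int) -> float:
    """Bartlett scaling factor b = 1 - (4k + 2p + 5) / (6n), assuming n = N - 1."""
    n = n_obs - 1
    return 1.0 - (4 * k + 2 * p + 5) / (6.0 * n)


# Illustrative inputs (assumptions): 3 factors, 12 indicators, N = 84.
b = bartlett_factor(k=3, p=12, n_obs=84)
t_ml = 67.723                  # uncorrected likelihood-ratio chi-square
t_ml_bartlett = b * t_ml       # Bartlett-corrected statistic T_MLb
print(f"b = {b:.4f}, corrected T_ML = {t_ml_bartlett:.3f}")
```

Because b is below 1 for any finite sample, the corrected statistic is always smaller than the uncorrected one.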

The Yuan correction
Yuan (2005) proposed an "ad hoc" simplification of a Bartlett-like correction formula developed by Wakaki, Eguchi & Fujikoshi (1990) for covariance structure models:

$$ y = 1 - \frac{2k + 2p + 7}{6n} $$

Yuan-corrected statistic: $T_{MLy} = y \cdot T_{ML}$
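To show how the Yuan factor y = 1 − (2k + 2p + 7)/(6n) behaves as the sample grows, here is a small Python sketch. The function name and the chosen sample sizes are hypothetical, n = N − 1 is assumed, and k = 3 and p = 12 are illustrative values matching the empirical example of the talk.

```python
def yuan_factor(k: int, p: int, n_obs: int) -> float:
    """Yuan scaling factor y = 1 - (2k + 2p + 7) / (6n), assuming n = N - 1."""
    n = n_obs - 1
    return 1.0 - (2 * k + 2 * p + 7) / (6.0 * n)


# The correction matters most in small samples and fades out as N grows.
for n_obs in (50, 84, 200, 1000):
    y = yuan_factor(k=3, p=12, n_obs=n_obs)
    print(f"N = {n_obs:4d}  y = {y:.4f}")
```

The factor approaches 1 with increasing N, so the corrected and uncorrected statistics converge in large samples.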

The Swain correction
Swain (1975) proposed the following correction of the test statistic T_ML:

$$ s = 1 - \frac{p\,(2p^2 + 3p - 1) - q\,(2q^2 + 3q - 1)}{12\,d\,n}, \qquad \text{with } q = \frac{\sqrt{1 + 4p(p+1) - 8d} - 1}{2} $$

Swain-corrected statistic: $T_{MLs} = s \cdot T_{ML}$

Legend:
- p: number of observed variables (indicators)
- d: degrees of freedom of the actual model
- n: sample size N − 1
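The Swain factor can be checked by hand. The Python sketch below implements the formula from this slide with n = N − 1 and plugs in the values of the empirical example (p = 12 indicators, d = 51 degrees of freedom, N = 84, T_ML = 67.723). It is a re-computation for illustration, not the code of swain.ado or swain_gof.ado, and the function name is hypothetical.

```python
import math


def swain_factor(p: int, d: int, n_obs: int) -> float:
    """Swain scaling factor s for p indicators, d degrees of freedom, N cases."""
    n = n_obs - 1
    q = (math.sqrt(1 + 4 * p * (p + 1) - 8 * d) - 1) / 2
    numerator = p * (2 * p**2 + 3 * p - 1) - q * (2 * q**2 + 3 * q - 1)
    return 1 - numerator / (12 * d * n)


s = swain_factor(p=12, d=51, n_obs=84)
t_ml = 67.723              # chi2_ms(51) as reported by estat gof (rounded)
t_ml_swain = s * t_ml      # Swain-corrected statistic T_MLs
print(f"s = {s:.4f}, Swain-corrected T_ML = {t_ml_swain:.3f}")
```

With these inputs the factor agrees with the "Swain correction factor" printed by swain_gof to four decimals. The corrected chi-square differs from the printed one only in the fourth decimal because T_ML enters here in rounded form.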

What do we know from Monte Carlo studies?
Many Monte Carlo simulation studies with small samples have evaluated the corrections shown above. They systematically vary
- violations of the multivariate normality assumption
- sample size
- number of indicators
- extent of model misspecification

What do we know...?
- Fouladi (2000) and Nevitt & Hancock (2004) recommended the Bartlett correction of T_ML for normal data. For non-normally distributed data they proposed the Bartlett correction of the Satorra-Bentler adjusted T_ML.
- Herzog, Boomsma & Reinecke (2007) and Herzog & Boomsma (2009/2013) showed that
  - both the Bartlett and the Yuan correction overestimate the type I error rate when sample size decreases
  - the Swain correction is the winner for small sample sizes and large models with many indicators: it reduces the type I error rate to a large extent and works down to a ratio of sample size to estimated parameters of 2:1

What do we know...?
Herzog, Boomsma & Reinecke (2007) and Herzog & Boomsma (2009/2013) also developed and tested a modified version of the Tucker-Lewis Index (TLI or NNFI) that uses the Swain-rescaled T_ML for the target model and the usual T_ML for the baseline model.
- It clearly outperforms the TLI calculated by standard programs like Mplus, EQS and LISREL.
- It correctly reports the misspecification of the SEM.
- They also recommended this correction for the Comparative Fit Index developed by Bentler (1990) and for Steiger's Root Mean Squared Error of Approximation (RMSEA).

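Swain corrected Tucker-Lewis Index
Formulas (bs: baseline vs. saturated model, ms: actual model vs. saturated model, sbT: Satorra-Bentler corrected statistic)

For normally distributed data:

$$ TLI = \frac{\dfrac{T_{MLbs}}{df_{bs}} - \dfrac{s \cdot T_{MLms}}{df_{ms}}}{\dfrac{T_{MLbs}}{df_{bs}} - 1} $$

For non-normally distributed data:

$$ TLI = \frac{\dfrac{sbT_{MLbs}}{df_{bs}} - \dfrac{s \cdot sbT_{MLms}}{df_{ms}}}{\dfrac{sbT_{MLbs}}{df_{bs}} - 1} $$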
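A short Python sketch may help to read the TLI formula: the baseline ratio T_bs/df_bs stays uncorrected and only the target-model ratio is multiplied by the Swain factor s. The function name is hypothetical. The inputs are the normal-theory values of the empirical example (chi2_bs(66) = 264.488, chi2_ms(51) = 67.723) and the factor s = 0.9391 as printed by swain_gof.

```python
def swain_tli(t_bs: float, df_bs: int, t_ms: float, df_ms: int, s: float = 1.0) -> float:
    """TLI with a Swain-rescaled target-model statistic; s = 1 gives the usual TLI."""
    ratio_bs = t_bs / df_bs          # baseline model: uncorrected
    ratio_ms = s * t_ms / df_ms      # target model: Swain-rescaled
    return (ratio_bs - ratio_ms) / (ratio_bs - 1.0)


tli_plain = swain_tli(264.488, 66, 67.723, 51)            # no correction
tli_swain = swain_tli(264.488, 66, 67.723, 51, s=0.9391)  # Swain-corrected
print(f"TLI = {tli_plain:.3f}, Swain-corrected TLI = {tli_swain:.4f}")
```

Passing the Satorra-Bentler statistics instead of the normal-theory ones gives the variant for non-normally distributed data.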

Swain corrected Comparative-Fit Index
Formulas

For normally distributed data:

$$ CFI = \frac{(T_{MLbs} - df_{bs}) - (s \cdot T_{MLms} - df_{ms})}{T_{MLbs} - df_{bs}} $$

For non-normally distributed data:

$$ CFI = \frac{(sbT_{MLbs} - df_{bs}) - (s \cdot sbT_{MLms} - df_{ms})}{sbT_{MLbs} - df_{bs}} $$
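The CFI formula compares the noncentrality estimate of the target model (s·T_ms − df_ms) with that of the baseline model (T_bs − df_bs). The following Python sketch follows the slide literally, without the truncation at zero that many programs apply. The function name is hypothetical, and the inputs are the normal-theory values of the empirical example with s = 0.9391 as printed by swain_gof.

```python
def swain_cfi(t_bs: float, df_bs: int, t_ms: float, df_ms: int, s: float = 1.0) -> float:
    """CFI with a Swain-rescaled target-model statistic; s = 1 gives the usual CFI."""
    ncp_bs = t_bs - df_bs            # baseline noncentrality estimate
    ncp_ms = s * t_ms - df_ms        # target-model noncentrality estimate
    return (ncp_bs - ncp_ms) / ncp_bs


cfi_plain = swain_cfi(264.488, 66, 67.723, 51)
cfi_swain = swain_cfi(264.488, 66, 67.723, 51, s=0.9391)
print(f"CFI = {cfi_plain:.3f}, Swain-corrected CFI = {cfi_swain:.4f}")
```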

Swain corrected RMSEA
Formulas

For normally distributed data:

$$ RMSEA = \sqrt{\frac{s \cdot T_{MLms} - df_{ms}}{n \cdot df_{ms}}} $$

For non-normally distributed data:

$$ RMSEA = \sqrt{\frac{s \cdot sbT_{MLms} - df_{ms}}{n \cdot df_{ms}}} $$
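The RMSEA formula can be sketched in Python as follows. The function name is hypothetical, and the truncation at zero is an added safeguard that does not appear on the slide. One observation from the numbers of the empirical example: both the RMSEA reported by estat gof and the Swain-corrected RMSEA are reproduced when the denominator uses the full sample size 84. This is inferred from the reported values, not from the ado-file's source.

```python
import math


def swain_rmsea(t_ms: float, df_ms: int, n: int, s: float = 1.0) -> float:
    """RMSEA from a Swain-rescaled statistic; s = 1 gives the usual RMSEA."""
    excess = max(s * t_ms - df_ms, 0.0)   # assumption: truncate at zero
    return math.sqrt(excess / (n * df_ms))


# Inputs from the empirical example; n = 84 is an assumption (see above).
rmsea_plain = swain_rmsea(67.723, 51, n=84)
rmsea_swain = swain_rmsea(67.723, 51, n=84, s=0.9391)
print(f"RMSEA = {rmsea_plain:.3f}, Swain-corrected RMSEA = {rmsea_swain:.4f}")
```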

How to implement it in Stata?
- In 2013, John Antonakis and Nicolas Bastardoz, both from the University of Lausanne, Switzerland, published their swain.ado, which calculates only the Swain-corrected T_ML value for the comparison of the actual vs. the saturated model.
- I have modified this ado-file so that it now calculates the Swain-corrected T_ML, TLI, CFI and RMSEA
  - under the assumption of multivariate normality (Jöreskog 1970, p. 239)
  - under violation of the multivariate normality assumption (non-normally distributed data), using the Satorra-Bentler-corrected T_ML
- All calculated scalars are displayed and returned in r-containers.

Empirical example of Islamophobia
SEM explaining Islamophobia in West Germany 2016
- 5% sample of the German General Social Survey 2016, subsample west: n = 84
- Presentation of the indicators used
- Test of multivariate normality of the observed indicators (mvtest in Stata)
- Estimated results from sembuilder
- Results of estat gof, stats(all)
- Output of my swain_gof.ado

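SEM to explain Islamophobia
[Path diagram: latent factors SES (indicators id2, educ2, incc), Authoritu (indicators lp1, lp2) and Islamophob (indicators mm1, mm2r, mm3, mm4, mm5r, mm6), the single indicator pa1, and the error terms of all endogenous variables]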

Used indicators
Factor SES: socio-economic status
- id2: self-rating of social class, underclass to upper class [1;5]
- educ2: educational degree, without degree to grammar school [1;5]
- incc: income class (quintiles) [1;5]
Factor Authoritu: authoritarian submission
- lp1: We should be thankful to leaders who tell us what to do [1;7]
- lp2: It is good for a child to learn to obey its parents [1;7]
Single indicator pa1: left-right self-rating
- 1 (left) to 10 (right)

Used indicators
Factor Islamophobia: six items [1;7]
- mm1: The religious practice of Islam should be restricted in Germany
- mm2r: Islam does not belong to Germany
- mm3: The presence of Muslims leads to conflicts
- mm4: The Islamic communities should be supervised by the state
- mm5r: I object to having an Islamic mayor in my town
- mm6: There are a lot of religious fanatics in the Islamic community

Test of multivariate normality (n = 84)

. mvtest normality mm1 mm2r mm3 mm4 mm5r mm6 lp1 lp2 pa1 id2 educ2 incc, uni stats(all)

Test for univariate normality (joint test)

  Variable   Pr(Skewness)   Pr(Kurtosis)   adj chi2(2)   Prob>chi2
  mm1           0.4475         0.0000         44.17        0.0000
  mm2r          .86            .432            6.91        0.0317
  mm3           .46            .4              7.89        0.0194
  mm4           .112           .2             13.43        0.0012
  mm5r          0.4737         0.0000           .            .
  mm6           0.5839         0.0000         24.28        0.0000
  lp1           .137           .827            5.47        0.0648
  lp2           0.0000         .174           19.23        0.0001
  pa1           .28            .934            4.83        0.0893
  id2           0.9191         0.4762          0.53        0.7685
  educ2         .255           .142            9.47        0.0088
  incc          .878           0.0000         20.93        0.0000

Test for multivariate normality

  Mardia mskewness = 31.04157   chi2(364) = 452.56    Prob>chi2 = 0.0011
  Mardia mkurtosis = 173.1796   chi2(1)   = 1.677     Prob>chi2 = 0.1954
  Henze-Zirkler    = 1.34168    chi2(1)   = 4.558     Prob>chi2 = 0.0000
  Doornik-Hansen                chi2(24)  = 118.558   Prob>chi2 = 0.0000

Except for id2, all indicators violate the assumption of univariate normality! Taken together, they violate the assumption of multivariate normality!
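The Mardia kurtosis line of this output can be checked by hand. Under multivariate normality, the kurtosis measure of p variables has expectation p(p + 2) and asymptotic variance 8p(p + 2)/N, so the squared z-score is referred to a χ² distribution with one degree of freedom. The Python sketch below applies this textbook form of the test to the reported mkurtosis of 173.1796 with p = 12 and N = 84. The function name is hypothetical, and the sketch makes no claim about how mvtest computes the statistic internally.

```python
import math


def mardia_kurtosis_test(b2: float, p: int, n_obs: int) -> tuple[float, float]:
    """Return (chi2, p-value) of the asymptotic Mardia kurtosis test."""
    expected = p * (p + 2)                           # E[b2] under normality
    z = (b2 - expected) / math.sqrt(8.0 * expected / n_obs)
    chi2 = z * z                                     # chi-square with 1 df
    p_value = math.erfc(abs(z) / math.sqrt(2.0))     # P(chi2(1) > z^2)
    return chi2, p_value


chi2, p_value = mardia_kurtosis_test(b2=173.1796, p=12, n_obs=84)
print(f"chi2(1) = {chi2:.3f}, Prob>chi2 = {p_value:.4f}")
```

The kurtosis test alone does not reject multivariate normality here. The rejection rests on the skewness, Henze-Zirkler and Doornik-Hansen tests.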

Standardized solution of the SEM with Satorra-Bentler corrections: vce(sbentler)
[Path diagram with standardized coefficients and error variances]
- Sample size: N = 84
- R² (Islamophobia) = 0.3716
- R² (Authoritu) = 0.7463
- TLI_SB = 0.897
- CFI_SB = 0.921
- RMSEA_SB = 0.059
- N:t = 84:27 ≈ 3:1
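The ratio N:t on this slide relates the sample size to the number t of freely estimated parameters. For a covariance structure without a mean structure (an assumption of this sketch), t follows from the number of indicators p and the model degrees of freedom as t = p(p + 1)/2 − df. The function name below is hypothetical.

```python
def free_parameters(p: int, df: int) -> int:
    """Free parameters of a covariance structure model: p(p+1)/2 moments minus df."""
    return p * (p + 1) // 2 - df


t = free_parameters(p=12, df=51)   # 12 indicators, 51 degrees of freedom
ratio = 84 / t                     # sample size per estimated parameter
print(f"t = {t}, N:t = {ratio:.2f}:1")
```

With roughly three cases per parameter, the example stays above the 2:1 ratio down to which the Swain correction was found to work in the simulation studies cited in the talk.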

Goodness of fit statistics: estat gof

  Fit statistic            Value    Description
  Likelihood ratio
    chi2_ms(51)           67.723    model vs. saturated
    p > chi2               0.058
    chi2_bs(66)          264.488    baseline vs. saturated
    p > chi2               0.000
  Satorra-Bentler
    chi2sb_ms(51)         65.876
    p > chi2               0.079
    chi2sb_bs(66)        253.809
    p > chi2               0.000
  Population error
    RMSEA                  0.062    Root mean squared error of approximation
    90% CI, lower bound    0.000
            upper bound    0.099
    pclose                 0.292    Probability RMSEA <= 0.05
  Satorra-Bentler
    RMSEA_SB               0.059    Root mean squared error of approximation
  Baseline comparison
    CFI                    0.916    Comparative fit index
    TLI                    0.891    Tucker-Lewis index
  Satorra-Bentler
    CFI_SB                 0.921    Comparative fit index
    TLI_SB                 0.897    Tucker-Lewis index
  Size of residuals
    SRMR                   0.083    Standardized root mean squared residual
    CD                     0.884    Coefficient of determination

Output of my swain_gof.ado

. swain_gof

  Swain correction factor = 0.9391
  Swain corrected chi-square = 63.597864
  p-value of Swain corrected chi-square = 0.1108

  Satorra-Bentler-corrected statistics:
  Swain-Satorra-Bentler corrected chi-square = 61.863491
  p-value of Swain-Satorra-Bentler corrected chi-square = 0.1417

  Fit indices under assumption of multivariate normal distribution
  Swain-corrected Tucker-Lewis-Index = 0.9179
  Swain-corrected Comparative-Fit-Index = 0.9365
  Swain-correct RMSEA = 0.0542

  Fit indices under violation of multivariate normal distribution
  Swain-Satorra-Bentler-corrected Tucker-Lewis-Index = 0.9251
  Swain-Satorra-Bentler-corrected Comparative-Fit-Index = 0.9422
  Swain-Satorra-Bentler-correct RMSEA = 0.0504

r-containers of the swain_gof.ado
The swain_gof.ado returns the following r-containers (values shown to four decimals):

. return list

scalars:
  r(swain_rmsea_sb) = 0.0504
  r(swain_cfi_sb)   = 0.9422
  r(swain_tli_sb)   = 0.9251
  r(swain_rmsea)    = 0.0542
  r(swain_cfi)      = 0.9365
  r(swain_tli)      = 0.9179
  r(swain_sb_p)     = 0.1417
  r(swain_chi_sb)   = 61.8635
  r(swain_corr)     = 0.9391
  r(swain_chi)      = 63.5979
  r(swain_p)        = 0.1108
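The Satorra-Bentler-based scalars can be re-derived from the estat gof output as a plausibility check. The Python sketch below is an independent re-computation, not the code of swain_gof.ado, and all function names are hypothetical. It assumes chi2sb_ms(51) = 65.876, a Satorra-Bentler baseline chi-square of 253.809 (the reading that reproduces the reported CFI_SB and TLI_SB), p = 12 indicators, N = 84, and the full sample size in the RMSEA denominator.

```python
import math


def swain_factor(p: int, d: int, n_obs: int) -> float:
    """Swain scaling factor with n = N - 1."""
    q = (math.sqrt(1 + 4 * p * (p + 1) - 8 * d) - 1) / 2
    numerator = p * (2 * p**2 + 3 * p - 1) - q * (2 * q**2 + 3 * q - 1)
    return 1 - numerator / (12 * d * (n_obs - 1))


# Inputs read from estat gof (the baseline value is an assumption, see lead-in).
sb_ms, df_ms = 65.876, 51
sb_bs, df_bs = 253.809, 66
n_obs = 84

s = swain_factor(p=12, d=df_ms, n_obs=n_obs)
chi_sb = s * sb_ms                                          # r(swain_chi_sb)
tli_sb = (sb_bs / df_bs - chi_sb / df_ms) / (sb_bs / df_bs - 1)
cfi_sb = 1 - (chi_sb - df_ms) / (sb_bs - df_bs)
rmsea_sb = math.sqrt(max(chi_sb - df_ms, 0.0) / (n_obs * df_ms))

print(f"chi2 = {chi_sb:.3f}, TLI = {tli_sb:.4f}, "
      f"CFI = {cfi_sb:.4f}, RMSEA = {rmsea_sb:.4f}")
```

Small deviations in the last printed digit can arise because the inputs enter rounded to three decimals.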

What do we see?
The multivariate normality assumption is violated. Therefore we look at Stata's Satorra-Bentler (SB) corrected statistics:
- SB T_ML (Stata) = 65.876, df = 51, p = 0.079; Swain SB T_ML = 61.863, df = 51, p = 0.142
- SB TLI (Stata) = 0.897; Swain SB TLI = 0.925
- SB CFI (Stata) = 0.921; Swain SB CFI = 0.942
- SB RMSEA (Stata) = 0.059; Swain SB RMSEA = 0.050
The Swain correction reduces the SB T_ML statistic. Therefore all fit indices improve!

Conclusions
- The Monte Carlo studies presented have demonstrated the advantage of the Swain correction for SEMs with small samples and many indicators. It works down to a ratio of sample size to estimated parameters of 2:1.
- My swain_gof.ado easily calculates the Swain-corrected T_ML statistic and the fit indices TLI, CFI and RMSEA based on it
  - under the assumption of multivariate normality
  - under violation of multivariate normality
- Therefore I recommend my swain_gof.ado to assess the fit of SEMs using small samples.

Closing words
Thank you for your attention. Do you have any questions?

Contact
Affiliation:
Dr. Wolfgang Langer
University of Halle, Institute of Sociology
D-06099 Halle (Saale)
Email: wolfgang.langer@soziologie.uni-halle.de

References
- Antonakis, J. & Bastardoz, N. (2013): Swain: Stata module to correct the SEM chi-square overidentification test in small sample sizes or complex models. http://econpapers.repec.org/software/bocbocode/s457617.htm
- Bentler, P. M. & Chou, C.-P. (1987): Practical issues in structural equation modeling. Sociological Methods & Research, 16 (1), 78-117
- Bentler, P. M. & Yuan, K.-H. (1999): Structural equation modeling with small samples: Test statistics. Multivariate Behavioral Research, 34 (2), 181-197
- Bentler, P. M. (1990): Comparative fit indexes in structural equation models. Psychological Bulletin, 107, 238-246
- Boomsma, A. & Herzog, W. (2013): R function swain. Correcting structural equation model fit statistics and indexes under small-sample and/or large-model conditions. University of Groningen, The Netherlands, WHU Germany (http://www.gmw.rug.nl/~boomsma/swain.r)
- Curran, P. J., Bollen, K. A., Paxton, P., Kirby, J. & Chen, F. N. (2002): The noncentral chi-square distribution in misspecified structural equation models: Finite sample results from a Monte Carlo simulation. Multivariate Behavioral Research, 37 (1), 1-36
- Fouladi, R. T. (2000): Performance of modified test statistics in covariance and correlation structure analysis under conditions of multivariate nonnormality. Structural Equation Modeling, 7 (3), 356-410
- GESIS - Leibniz-Institut für Sozialwissenschaften (2017): Allgemeine Bevölkerungsumfrage der Sozialwissenschaften ALLBUS 2016. GESIS Datenarchiv, Köln. ZA5250 Datenfile Version 2.1.0, doi:10.4232/1.12796
- Herzog, W., Boomsma, A. & Reinecke, S. (2007): The model-size effect on traditional and modified tests of covariance structures. Structural Equation Modeling, 14 (3), 361-390
- Herzog, W. & Boomsma, A. (2009): Small-sample robust estimators of noncentrality-based and incremental model fit. Structural Equation Modeling, 16 (1), 1-27
- Jöreskog, K. G. (1970): A general method for analysis of covariance structures. Biometrika, 57 (2), 239-251
- Nevitt, J. & Hancock, G. R. (2004): Evaluating small sample approaches for model test statistics in structural equation modeling. Multivariate Behavioral Research, 39 (3), 439-478
- Satorra, A. & Bentler, P. M. (1994): Corrections to test statistics and standard errors in covariance structure analysis. In A. von Eye & C. C. Clogg (eds.), Latent variables analysis: Applications for developmental research (pp. 399-419). Newbury Park, CA: Sage
- Swain, A. J. (1975): Analysis of parametric structures for variance matrices (doctoral thesis). University of Adelaide, Adelaide
- Wakaki, H., Eguchi, S. & Fujikoshi, Y. (1990): A class of tests for a general covariance structure. Journal of Multivariate Analysis, 32, 313-325
- Yuan, K.-H. (2005): Fit indices versus test statistics. Multivariate Behavioral Research, 40 (1), 115-148

Appendix

Assumption of multivariate normality 1
Karl G. Jöreskog (1970) formulated this assumption in his article "A general method for analysis of covariance structures", Biometrika, 57 (2), pp. 239-251:

"It is assumed that observations on a set of variables have a multivariate normal distribution with a general parametric form of the mean vector and the variance-covariance parameters. Any parameter of the model may be fixed, free or constrained to be equal to other parameters. The free and constrained parameters are estimated by maximum likelihood." (p. 239)

Assumption of multivariate normality 2
Jöreskog reduced the whole data matrix X to the first and second moments of the observed variables, ignoring the third and fourth moments, that is, their skewness and kurtosis. Therefore he needed a strict assumption about their distribution. That is why he introduced the multivariate normality assumption for the observed variables.

Items measuring Islamophobia
[Questionnaire excerpt (GESIS 2017, Liste 54); item polarity: mm1 (+), mm2r (-), mm3 (+), mm4 (+), mm5r (-), mm6 (+)]

Items measuring authoritarian submission
[Questionnaire excerpt (GESIS 2017, Liste 34); items lp1 and lp2]

Left-right self-rating
[Questionnaire excerpt (GESIS 2017, Liste 46); item pa1]