SHOPPING FOR EFFICIENT CONFIDENCE INTERVALS IN STRUCTURAL EQUATION MODELS. Donna Mohr and Yong Xu. University of North Florida


Authors' Note: Parts of this work were incorporated in Yong Xu's Master's thesis at the University of North Florida. Correspondence concerning this article should be directed to Donna Mohr, Department of Mathematics and Statistics, University of North Florida, 4567 St. Johns Bluff Road, S., Jacksonville, FL 32256 (email: dmohr@unf.edu; phone: (904) 620-2884; FAX: (904) 620-288) or to Yong Xu, Department of Statistics and Mathematics, Shandong Economic University, China (email: Xuyong007@hotmail.com).

Abstract

Most users of Structural Equation Models are aware that Wald-type standard errors for parameter estimates can vary remarkably depending on the arbitrary choice of how the scale is identified. When the focus is on H0: coefficient = 0, tests based on the likelihood ratio are invariant to the scale identification. However, a simple example shows that confidence intervals based on the likelihood ratio still display different relative precisions depending on the choice of scale identification. Is it legitimate for the researcher to shop for a scale identification that gives the best relative precision for a certain parameter? A series of examples suggests that shopping has very little negative impact on coverage probabilities and can yield substantially tighter confidence intervals.

Keywords: SEM, log-likelihood, latent variables

Introduction

When fitting a Structural Equation Model (SEM), the user has numerous seemingly arbitrary choices for how to fix the scaling. Typically, we set certain parameters, such as a coefficient that relates a manifest variable to its underlying latent variable, equal to 1. Users who have explored several alternative ways of fixing the scale are familiar with seeing the t-tests for the underlying latent regression coefficients vary in what is sometimes a dramatic fashion. Gonzalez and Griffin (2001) have given an excellent explanation of why these Wald-type tests are sensitive to the scale identification. As an alternative when testing hypotheses of the form H0: coefficient = 0, they recommend Chi-squared tests based on the log-likelihood function, because these are invariant with respect to the choice of scale identification.

In a growing number of applications in psychology, economics, and biology, researchers are interested in the actual values of the latent regression coefficients. Examples of this are found in the application of SEM to longitudinal data, as in work by McArdle and Hamagami (2001). In this paper, our focus is on the behavior of confidence intervals formed using the log-likelihood function, in the manner recommended by Neale and Miller (1997) and used in the computer package Mx. We will first show by example that these confidence intervals have relative errors which are sensitive to the scale identification. Then we explore the effects of shopping, that is, of fitting many different scale identifications and choosing the one which gives the best relative error. In the examples where we have tried this experiment, we have seen that there is little penalty (in the form of smaller-than-nominal coverage probabilities) and

potentially large savings in accuracy. This suggests that shopping is a legitimate strategy for finding scale specifications that carry information in an efficient manner.

Confidence Intervals Based on the Profile Log-Likelihood

Neale and Miller (1997) provide a description of how confidence intervals are computed from profile log-likelihood functions. In the final paragraphs of that article, they comment that though these intervals are preserved by transformations of the parameter, they are not invariant with respect to the scale identification. Since this method of constructing confidence intervals is central to this paper, we will briefly review it in the context of an example.

Lattin, Carroll and Green (2003, Chapter 10) give several examples of SEM. We will focus on Problem 10.5, itself taken from Fredricks and Dossett (1983), which we will refer to as the FD example. The path diagram is shown in Figure 1. For the moment, let us assume that our emphasis is on estimating the latent regression coefficient γ1. There are at least 27 different ways to make the scale identification, even if we do not attempt to fix any of the specific variances for the manifest variables. For the moment, we will compare only two of these:

Specification 1: Set Var(F1) = Var(D2) = Var(D3) = 1
Specification 2: Set λX1 = λY1 = λY3 = 1

Here, the λ are the coefficients connecting a manifest variable to an underlying latent variable. A data set of 236 observations was simulated using this path model and parameters, shown in Table 1, like those estimated from the textbook's correlation matrix. The results

below are based on Maximum Likelihood Estimates (MLE) fit to the sample covariance matrix for this simulated data, shown in Table 2. The key quantity for inference is

F = n { ln|Σ| + tr(S Σ^-1) - ln|S| - p } = -2 { log-likelihood - constant }

where S and Σ are the sample and predicted covariance matrices (of dimension p x p). Under Specification 1, the MLE was γ̂1 = 1.322 and F = 15.1873. To form the 95% confidence interval, we search on either side of γ̂1 for two new values, γ̂1,L and γ̂1,U, such that the F computed by fixing γ1 at these values and re-maximizing the remaining parameters equals 19.0288 = 15.1873 + 3.8415. The value 3.8415 is the 95th percentile of a Chi-squared distribution with 1 df. Since we re-maximize the other parameters each time we try a new γ1, the resulting log-likelihoods are called profile log-likelihoods. Figure 2a shows the values of F created by varying γ1. The resulting confidence interval is γ1 in (1.134, 1.52).

Figure 2b shows the construction of the confidence interval for γ1 using Specification 2. Note that the MLE has changed due to the new choice of scale, so it is hard to directly compare the two graphs. To compare the two confidence intervals, we will focus on their relative error = (γ̂1,U - γ̂1,L)/γ̂1, which amounts to rescaling each interval so it is centered at 1.0. The results are shown in Figure 3. Now we clearly see that the interval produced by Specification 1 is far more accurate, in the sense of having much smaller relative error.

Shopping for a Good Scale Identification

Given that the accuracy of the confidence interval is sensitive to the scale identification, this raises the possibility that the user can shop. That is, the user may try

many different scale identifications until finding one where the estimated confidence interval is as short (relative to the parameter estimate) as possible. Analogous to a multiple comparison problem, we might reasonably wonder whether such a strategy has an actual coverage probability less than the stated confidence level. We used a simulation based on the FD example to examine this possibility.

In the simulation, 2500 data sets were generated from the FD model, using the parameter values given in Table 1. Each data set had 236 observations, as in the original example. The MLE were obtained, and then 95% confidence intervals were generated separately for γ1, γ2, and β using the profile log-likelihoods under 27 different scale specifications. All programs were in double-precision FORTRAN, using IMSL routines DUMING and DZBREN for the minimization tasks. For each of the scale specifications, we track the percentage of times the confidence intervals contained the true value of the corresponding parameter, and the mean relative error of the interval.

The coverage probabilities and mean relative errors are summarized in Table 3. Under every specification, the actual coverage probability was very close to the nominal 95%. However, the mean relative errors for parameter γ1 varied tremendously, from .666 to .965. By contrast, the relative errors for γ2 are far more stable. Note (from Table 1) that γ2 is much closer to 0 than γ1, and it seems that in some neighborhood of 0 the relative errors are less sensitive to the scale identification.

The surprise comes in the behavior of the confidence interval chosen from the best identification, that is, by scanning across the calculated confidence intervals for all 27 specifications and reporting the specification with the shortest interval. These are summarized in Table 3, in the row labeled "shortest interval". Of course, the mean relative error is slightly

smaller than the mean of any one method of specification. The surprise is that the actual coverage probability is still very close to the 95% level.

Shopping for Several Parameters Simultaneously

Normally we have several parameters for which we need confidence intervals. We will suppose that we are interested in all three latent regression coefficients γ1, γ2, and β. Usually there will not be a single specification that happens to be best for all the parameters of interest. In that case, we will have to compromise. Denote the estimated relative error in the confidence interval for parameter j under specification s as RE(j, s). Let BestRE(j) be the smallest relative error found after scanning all the specifications. This will normally come from a different s for each of the parameters of interest. We will choose the single specification that minimizes

ORE(s) = Σ_j RE(j, s) / BestRE(j)

where the summation is over all parameters j of interest to the researcher.

The results of following this strategy in the FD simulations are shown at the bottom of Table 3, in the line labeled "compromise specification". The confidence intervals based on this compromise are almost as short as the ones chosen by optimizing each parameter separately. Unlike the intervals obtained by following separate strategies for each parameter, these are all comparable, in the sense that they are all based on the same scale identification. A true multiple comparison problem arises as one considers the probability that all three of the true parameter values lie within their corresponding confidence intervals. In the simulation, this probability was 85.4%, indicating that Bonferroni-style adjustments may be appropriate for those concerned with the family-wise confidence level.
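The compromise rule above can be sketched in a few lines of numpy. The RE values below are hypothetical, chosen only to illustrate the arithmetic; they are not taken from Table 3:

```python
import numpy as np

# Hypothetical relative errors RE[j, s] for 3 parameters (rows: gamma1,
# gamma2, beta) under 3 candidate scale specifications (columns).
RE = np.array([
    [0.80, 0.67, 0.96],   # gamma1
    [1.81, 1.53, 1.96],   # gamma2
    [1.07, 1.07, 1.05],   # beta
])

best_re = RE.min(axis=1)                   # BestRE(j): best over all specs
ORE = (RE / best_re[:, None]).sum(axis=0)  # ORE(s) = sum_j RE(j,s)/BestRE(j)
s_star = int(ORE.argmin())                 # index of compromise specification
print(s_star)                              # prints 1
```

The compromise specification is the one whose intervals are, in aggregate, closest to the best achievable relative error for each parameter; here the middle column wins because it is best for two of the three parameters and barely worse for the third.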

Conclusion

We have reproduced these results in several other examples. One (Example 10.6 from Lattin, Carroll, and Green) had two latent regression coefficients, and another (Example 10.2) had one coefficient. While examples do not yield universal truths, there is a consistent pattern in which the confidence intervals chosen post hoc for having the shortest relative length do a good job of maintaining the required confidence level.

Unfortunately, there seems to be no way to guess in advance which specification will yield the best results. In some simulations, the best results were found by fixing the variances of the disturbances in the underlying latent variables (as in the FD example). In others, one particular set of manifest variable coefficients seemed to be most advantageous. The driving factor seems to be the size of the variance: those variables (whether manifest or latent) with the largest variation are the ones that need to be anchored in some fashion. Naturally, this is not something one would know in advance of fitting the model.

Users of SEMs may have been unaware of the advantages of shopping, or they may have been ashamed to admit they shop, feeling that this was akin to fishing for the hypothesis that will (finally) yield a significant p-value. But the work of Gonzalez and Griffin (2001) shows us that various identifications carry information in more or less efficient ways for a particular subset of the parameter vector. Given that confidence levels seem to be maintained, the potential advantages in precision vindicate shopping as a legitimate way of screening for an efficient identification.

References

Fredricks, A. J., & Dossett, D. L. (1983). Attitude-behavior relations: A comparison of the Fishbein-Ajzen and the Bentler-Speckart models. Journal of Personality and Social Psychology, 45, 501-512.

Gonzalez, R., & Griffin, D. (2001). Testing parameters in structural equation modeling: Every one matters. Psychological Methods, 6, 258-269.

Lattin, J., Carroll, J. D., & Green, P. E. (2003). Analyzing multivariate data. Pacific Grove, CA: Brooks/Cole Thomson Learning.

McArdle, J. J., & Hamagami, F. (2001). Latent difference score structural models for linear dynamic analyses with incomplete longitudinal data. In L. M. Collins & A. G. Sayer (Eds.), New methods for the analysis of change (pp. 139-175). Washington, DC: American Psychological Association.

Neale, M. C., & Miller, M. B. (1997). The use of likelihood-based confidence intervals in genetic models. Behavior Genetics, 27, 113-120.

Table. True Parameter Values used in Simulations from FD model. The λ represent the coefficients between the manifest variables and the underlying latent variable. The e are the error terms in the manifest variables X,...,Y4.. γ =.7008 7. λx 3. Var(e)=.334 2. γ 2 =.390 8. λ X 2 =.089 4. Var(e2)=.34584 3. β =.49 9. λy 5. Var(e3)=.55685 4. Var(F)=.42346 0. λ Y 2 =.992 6. Var(e4)=.57887 5. Var(D)=.42346. λy 3 7. Var(e5)=.50643 6. Var(D2)=.2803 2. λ Y 4 =.9748 8. Var(e6)=.42226 0

Table 2. Sample covariance matrix for a simulated data set from the FD model.

        X1      X2      Y1      Y2      Y3      Y4
X1    1.897    .488    .2520   .98     .462    .672
X2             .837    .47     .584    .053    .03
Y1                    1.02     .7029   .2649   .2856
Y2                             .9994   .856    .250
Y3                                     .9558   .394
Y4                                             .969
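The discrepancy function F from the text, and the one-degree-of-freedom search that produces the profile-likelihood limits, can be sketched as follows. The `ml_fit_F` function is generic; the interval search is illustrated on a normal mean, where re-maximizing the remaining parameter (σ²) has a closed form, rather than on the SEM itself:

```python
import numpy as np

def ml_fit_F(S, Sigma, n):
    """ML discrepancy F = n * ( ln|Sigma| + tr(S Sigma^-1) - ln|S| - p )."""
    p = S.shape[0]
    _, ld_sigma = np.linalg.slogdet(Sigma)
    _, ld_s = np.linalg.slogdet(S)
    return n * (ld_sigma + np.trace(S @ np.linalg.inv(Sigma)) - ld_s - p)

# Profile-likelihood CI on a toy model: normal mean with sigma^2 profiled out.
rng = np.random.default_rng(0)
x = rng.normal(5.0, 2.0, size=236)
n = x.size

def profile_dev(mu):
    # -2 log-likelihood at mu with sigma^2 set to its conditional MLE;
    # additive constants drop out of the difference taken below
    return n * np.log(np.sum((x - mu) ** 2) / n)

mu_hat = x.mean()
target = profile_dev(mu_hat) + 3.8415  # 95th percentile of chi-square, 1 df

def find_root(lo, hi, tol=1e-10):
    # bisection for profile_dev(mu) == target on a bracketing interval
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if (profile_dev(mid) > target) == (profile_dev(lo) > target):
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

lower = find_root(mu_hat - 5.0, mu_hat)  # deviance is high far from mu_hat
upper = find_root(mu_hat, mu_hat + 5.0)
```

In the SEM setting, `profile_dev` is replaced by re-minimizing F over all remaining free parameters with the coefficient of interest held fixed, exactly as described in the text; the bisection on the +3.8415 target is the same.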

Table 3. Coverage probabilities and mean relative errors obtained from simulation. Parameter identifications correspond to the numbering in Table 1.

                                   γ1                  γ2                  β
Specification (IDs of       Coverage  Mean Rel  Coverage  Mean Rel  Coverage  Mean Rel
fixed parameters)           Prob.(%)  Error     Prob.(%)  Error     Prob.(%)  Error
 1. (7,9,11)                  94.5     .806       94.3     1.81       94.9     1.07
 2. (7,10,11)                 94.9     .809       94.3     1.81       94.7     1.07
 3. (8,9,12)                  95.0     .837       94.7     2.02       94.5     1.09
 4. (8,9,11)                  95.0     .837       94.2     1.96       94.9     1.07
 5. (4,9,11)                  94.5     .666       94.2     1.53       94.9     1.07
 6. (4,5,6)                   95.0     .795       94.7     1.86       94.3     1.21
 7. (4,5,11)                  95.0     .795       94.2     1.53       94.7     1.05
 8. (4,5,12)                  95.0     .795       94.9     1.59       94.6     1.06
 9. (4,6,9)                   94.5     .666       94.7     1.85       94.8     1.23
10. (4,9,12)                  94.5     .666       94.9     1.59       94.5     1.09
11. (4,6,10)                  95.0     .669       94.7     1.85       94.7     1.23
12. (4,10,11)                 95.0     .669       94.2     1.53       94.7     1.07
13. (4,10,12)                 95.0     .669       94.9     1.59       94.5     1.08
14. (5,6,7)                   94.8     .919       94.7     2.12       94.3     1.20
15. (5,7,11)                  94.8     .919       94.3     1.81       94.7     1.05
16. (5,7,12)                  94.8     .919       95.0     1.87       94.6     1.06
17. (6,7,9)                   94.5     .806       94.7     2.12       94.8     1.23
18. (7,9,12)                  94.5     .806       95.0     1.87       94.5     1.09
19. (6,7,10)                  94.9     .809       94.7     2.12       94.7     1.23
20. (7,10,12)                 94.9     .809       95.0     1.87       94.5     1.08
21. (5,6,8)                   95.1     .965       94.7     2.28       94.3     1.21
22. (5,8,11)                  95.1     .965       94.2     1.96       94.7     1.05
23. (5,8,12)                  95.1     .965       94.7     2.02       94.6     1.06
24. (6,8,9)                   95.0     .837       94.7     2.28       94.8     1.23
25. (6,8,10)                  95.0     .840       94.7     2.28       94.7     1.23
26. (8,10,11)                 95.0     .840       94.2     1.96       94.7     1.07
27. (8,10,12)                 95.0     .840       94.7     2.02       94.5     1.08
shortest interval             94.7     .646       94.6     1.44       94.1     1.00
compromise specification      94.5     .648       94.5     1.48       94.5     1.03
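The coverage probabilities in Table 3 are Monte Carlo proportions over repeated simulated data sets. A minimal sketch of that estimator, again using the normal-mean toy model with its closed-form profile-likelihood interval rather than the SEM (the model, parameter values, and replication count here are illustrative only):

```python
import numpy as np

rng = np.random.default_rng(1)
true_mu, true_sigma, n, reps = 5.0, 2.0, 236, 2000
chi2_95 = 3.8415   # 95th percentile of chi-square with 1 df

hits = 0
for _ in range(reps):
    x = rng.normal(true_mu, true_sigma, size=n)
    mu_hat = x.mean()
    s2_mle = np.sum((x - mu_hat) ** 2) / n
    # closed-form 95% profile-likelihood interval for the mean
    half = np.sqrt(s2_mle * (np.exp(chi2_95 / n) - 1.0))
    hits += (mu_hat - half) <= true_mu <= (mu_hat + half)

coverage = hits / reps   # proportion of intervals covering the true value
```

In the paper's simulation the same bookkeeping is done per scale specification, with each interval coming from the profile of the SEM fit function F.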

Figure. Path Diagram for FD data. Latent variables are F, F2, and F3. D2 and D3 are disturbance terms in the endogenous latent variables. Y2 Y γ F2 D2 β D3 F F3 γ 2 X X2 Y3 Y4 3

Figure 2. (A) Confidence interval for γ1 constructed from the profile log-likelihood using Specification 1, and (B) using Specification 2.

Figure 3. Confidence intervals for γ1 rescaled relative to γ̂1.