Module 3: Latent Variable Statistical Models
As explained in Module 2, measurement error in a predictor variable will result in misleading slope coefficients, and measurement error in the response variable will result in inflated standard errors. These problems can be reduced by using latent variable statistical models in which the measurement models described in Module 2 are integrated into any of the statistical models described in Module 1. Statistical models can be specified in terms of true scores (from a strictly parallel, parallel, or tau-equivalent model) or factor scores (from a congeneric or factor analysis model). True scores and factor scores will be referred to as latent variable scores.

There are several types of analyses that benefit from an analysis of latent variable scores. In a GLM where x1 is the predictor variable of primary interest and one or more confounding variables have been included in the model, if the confounding variables are measured with error, their confounding effects will be only partially removed from the relation between x1 and y. If the confounding variables are represented by latent variables, then the effects of the confounding variables can be more effectively removed from the relation between x1 and y. If two or more predictor variables measure highly similar attributes, multicollinearity problems can be avoided by using those predictor variables as indicators of a single latent variable. If two or more response variables measure highly similar attributes, the model will contain fewer path coefficients if those response variables are used as indicators of a single latent variable.

An analysis of indirect effects is another type of analysis where analyzing latent variables is preferred to analyzing variables that are measured with error. Consider the path model illustrated below.

[Path diagram: x1 → y1 (β11) and y1 → y2 (γ12), with prediction errors e1 and e2 on y1 and y2]
Measurement error in x1 attenuates β11, and measurement error in y1 attenuates γ12. If ρx1 and ρy1 are the reliabilities of x1 and y1, then the indirect effect β11γ12 is attenuated by a factor of ρx1ρy1. For instance, if both reliabilities equal .5, then the indirect effect would be attenuated by a factor of .5(.5) = .25. Furthermore, measurement error in y1 and y2 will inflate the standard errors of both path coefficients, which in turn will inflate the standard error of the indirect effect.

Latent variable statistical models are also attractive in applications where the factor scores in a congeneric or factor analysis model represent a better approximation to the psychological construct under investigation than could be obtained from a single measurement of the construct. For instance, if spatial ability is an important variable in a statistical model, it could be assessed using a single test such as the y1 = Card Rotation Test, y2 = Hidden Figures Test, y3 = Gestalt Picture Completion Test, or y4 = Surface Development Test. However, each of these tests assesses only a particular aspect of spatial ability, and it could be argued that the factor scores in a congeneric model for y1, y2, y3, and y4 represent a more meaningful and complete representation of spatial ability.

A more general notational scheme is needed for latent variable statistical models in which some latent variables are predictor variables and some latent variables are response variables. Latent predictor variables are represented by ξ, and latent response variables are represented by η. The indicators of ξ are represented by x, and the indicators of η are represented by y. The unique factors (or measurement errors) are represented by δ for ξ and by ε for η. The factor loadings for η are represented by λy and the factor loadings for ξ are represented by λx. Several basic types of latent variable statistical models are described below.
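The attenuation arithmetic described above is easy to verify numerically. The sketch below uses illustrative path and reliability values (not values from any real study) to show how the two reliability biases multiply in an indirect effect.

```python
# Attenuation of an indirect effect by measurement error.
# With reliabilities rho_x1 and rho_y1, the estimated indirect
# effect b11*g12 is attenuated by the factor rho_x1 * rho_y1.

def attenuated_indirect_effect(b11, g12, rho_x1, rho_y1):
    """Return the attenuated indirect effect and the attenuation factor."""
    factor = rho_x1 * rho_y1
    return b11 * g12 * factor, factor

# Illustrative values: true paths of .6 and .4, both reliabilities .5
effect, factor = attenuated_indirect_effect(0.6, 0.4, 0.5, 0.5)
print(factor)   # 0.25, not 0.5: the two biases multiply
print(effect)   # 0.06 instead of the true indirect effect of 0.24
```

Note how halving the reliability of both variables quarters, rather than halves, the estimated indirect effect.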
A path diagram and the lavaan code are given for each example.

Latent Variable Regression Model

As already stated, response variable measurement error inflates the standard errors of the slope estimates, and predictor variable measurement error can attenuate or exaggerate the slope estimates depending on the pattern of correlations among the predictor variables. Furthermore, when the measurements of a response variable and a predictor variable are obtained using a common method (e.g., both are self-report measures or both are 5-point Likert scale measures), the strength of the relation between the response variable and predictor variable can be exaggerated due to common-method variance (CMV). Suppose a
sample of employees is asked to self-report their level of commitment to the organization and also to self-report their level of job performance. Some employees will overstate their true levels of commitment and job performance while other employees will understate them, and this will exaggerate the estimated correlation between organizational commitment and job performance. If CMV is a potential concern, each attribute can be measured using two or more measurement methods. A path diagram for a simple linear regression model where the predictor variable and the response variable have been measured using the same three measurement methods is illustrated below. In this example, assume that x1 and y1 have been measured using the same method, x2 and y2 have been measured using the same method, and x3 and y3 have been measured using the same method. This model includes covariances among the three pairs of unique factors that have a common measurement method. The estimate of β1 could be substantially exaggerated if these covariances are not included in the model.

[Path diagram for Model 1: ξ1 → η1 (β1) with prediction error e1; x1, x2, x3 are indicators of ξ1 (loadings 1, λx2, λx3; unique factors δ1, δ2, δ3) and y1, y2, y3 are indicators of η1 (loadings 1, λy2, λy3; unique factors ε1, ε2, ε3), with covariances σδ1ε1, σδ2ε2, σδ3ε3 between the same-method unique factors]

The lavaan model specification for Model 1 is given below.

reg.model <- ' ksi =~ 1*x1 + x2 + x3
               eta =~ 1*y1 + y2 + y3
               eta ~ ksi
               x1 ~~ y1
               x2 ~~ y2
               x3 ~~ y3 '
The ksi =~ 1*x1 + x2 + x3 command defines ξ1 and constrains the factor loading for x1 to equal 1. The eta =~ 1*y1 + y2 + y3 command defines η1 and constrains the factor loading for y1 to equal 1. The eta ~ ksi command defines the simple linear regression model with η1 as the response variable and ξ1 as the predictor variable. The x1 ~~ y1, x2 ~~ y2, and x3 ~~ y3 commands specify the covariances among the pairs of measurements that used a common method of measurement.

ANCOVA Model with Latent Covariates

An ANCOVA model in a nonexperimental design that includes one or more confounding variables as covariates can remove the linear confounding effects of the covariates and provide an estimate of the treatment effect that more closely approximates the causal effect of treatment. However, if any of the covariates are measured with error, then the confounding effects are only partially removed and the estimated effect of treatment can be misleading. The path diagram of a 2-group ANCOVA model with two latent covariates is shown below, where x5 is a dummy variable that codes group membership (group 1 = Treatment 1 and group 2 = Treatment 2). The β3 coefficient describes the difference in the two population treatment means after controlling for differences in the latent covariates (ξ1 and ξ2). The variance of e represents the within-group error variance.

[Path diagram for Model 2: y regressed on x5 (β3), ξ1 (β1), and ξ2 (β2), with prediction error e; x1 and x2 are indicators of ξ1 (unique factors δ1, δ2) and x3 and x4 are indicators of ξ2 (unique factors δ3, δ4), with all factor loadings fixed to 1]
The lavaan model specification for Model 2 is given below.

ancova.model <- ' ksi1 =~ 1*x1 + 1*x2
                  ksi2 =~ 1*x3 + 1*x4
                  y ~ b3*x5 + b2*ksi2 + b1*ksi1
                  ksi1 ~~ ksi2
                  ksi1 ~~ x5
                  ksi2 ~~ x5 '

The ksi1 =~ 1*x1 + 1*x2 command defines ξ1 and constrains the two factor loadings for x1 and x2 to equal 1. The ksi2 =~ 1*x3 + 1*x4 command defines ξ2 and constrains the two factor loadings for x3 and x4 to equal 1. The ksi1 ~~ ksi2, ksi1 ~~ x5, and ksi2 ~~ x5 commands specify the covariances among x5, ξ1, and ξ2. The variances of δ1, δ2, δ3, and δ4 are unconstrained in this example to define a tau-equivalent measurement model for x1 and x2 and a tau-equivalent measurement model for x3 and x4. Parallel measurement models could be defined by imposing an equality constraint on the variances of δ1 and δ2 and an equality constraint on the variances of δ3 and δ4 by adding the commands x1 ~~ var1*x1, x2 ~~ var1*x2, x3 ~~ var2*x3, and x4 ~~ var2*x4.

MANOVA with Latent Response Variables

As explained in Module 1, a one-way MANOVA can be used to test the null hypothesis that the population means of all r response variables are equal across all levels of the independent variable. This test does not provide useful scientific information because the null hypothesis is known to be false in virtually every application. Useful scientific or practical information can be obtained by computing Bonferroni confidence intervals for all pairwise group differences of means and for all r response variables. There are r[k(k − 1)/2] pairwise comparisons in a k-group design, but analyzing and reporting all these results could be intractable unless k and r are both small. Furthermore, a Bonferroni adjustment for so many comparisons could produce uselessly wide confidence intervals. If the r response variables represent congeneric indicators of q factors, then only q[k(k − 1)/2] pairwise comparisons need to be examined.
More importantly, the q factors might have greater psychological meaning than any of the r individual response variables.
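To see how quickly the comparison count grows, the sketch below counts the pairwise comparisons with and without latent response variables, using arbitrary illustrative values of k, r, and q.

```python
# Number of pairwise mean comparisons in a k-group design:
# r[k(k - 1)/2] for r observed response variables versus
# q[k(k - 1)/2] when the r variables are indicators of q factors.

def n_pairwise(k, n_vars):
    """Pairwise group comparisons, summed over all variables."""
    return n_vars * k * (k - 1) // 2

k, r, q = 4, 10, 2          # illustrative values
print(n_pairwise(k, r))     # 60 Bonferroni comparisons on observed variables
print(n_pairwise(k, q))     # 12 comparisons on the latent variables
```

With a Bonferroni adjustment split over 12 rather than 60 intervals, each interval can be substantially narrower.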
A path diagram of a 3-group MANOVA with q = 2 sets of congeneric measures is shown below, where x1 and x2 are dummy variables that code group membership.

[Path diagram for Model 3: η1 regressed on x1 (β11) and x2 (β21) with prediction error e1; η2 regressed on x1 (β12) and x2 (β22) with prediction error e2; y1, y2, y3 are congeneric indicators of η1 (loadings 1, λy2, λy3; measurement errors ε1, ε2, ε3) and y4, y5, y6, y7 are congeneric indicators of η2 (loadings 1, λy5, λy6, λy7; measurement errors ε4, ε5, ε6, ε7)]

In the above model with dummy coding (where xj = 1 if group = j, 0 otherwise), β11 describes the difference in the means of η1 for levels 1 and 3 of the independent variable, and β21 describes the difference in the means of η1 for levels 2 and 3 of the independent variable. Likewise, β12 describes the difference in the means of η2 for levels 1 and 3 of the independent variable, and β22 describes the difference in the means of η2 for levels 2 and 3 of the independent variable. The mean differences for levels 1 and 2 are equal to β11 − β21 and β12 − β22. The lavaan model definition for Model 3 is given below.
manova.model <- ' eta1 =~ 1*y1 + lamy2*y2 + lamy3*y3
                  eta2 =~ 1*y4 + lamy5*y5 + lamy6*y6 + lamy7*y7
                  eta1 ~ b11*x1 + b21*x2
                  eta2 ~ b12*x1 + b22*x2
                  eta1 ~~ eta2
                  b31 := b11 - b21
                  b32 := b12 - b22 '

The eta1 =~ 1*y1 + lamy2*y2 + lamy3*y3 and eta2 =~ 1*y4 + lamy5*y5 + lamy6*y6 + lamy7*y7 commands define the two factors, where the first loading in each set of congeneric measures is set equal to 1 to identify each measurement model. The eta1 ~ b11*x1 + b21*x2 and eta2 ~ b12*x1 + b22*x2 commands define the one-way MANOVA model. The eta1 ~~ eta2 command defines the covariance between the latent variable prediction errors. The b31 := b11 - b21 and b32 := b12 - b22 commands define new parameters that describe the mean differences for levels 1 and 2 of the independent variable for the two latent response variables.

Latent Variable Path Model

An example of a latent variable path model is shown below. In this model, x4 and x5 are assumed to be tau-equivalent measures, y4 and y5 are assumed to be tau-equivalent measures, x1, x2, and x3 are assumed to be congeneric measures, and y1, y2, and y3 are assumed to be congeneric measures. In this model, β11, β22, and γ12 are assumed to be meaningfully large, while β12 and β21 are assumed to be small and constrained to equal 0. The two latent predictor variables (ξ1 and ξ2) are assumed to be correlated. The correlation between e1 and e2 is assumed to be small in this example and has been constrained to equal 0. Note that the zero-constrained parameters do not appear in the path diagram. In this model, the direct effects (β11, β22, γ12) and the indirect effect (β11γ12) are not attenuated by measurement error in x1, x2, x3, x4, x5 and y1, y2, y3. In addition, measurement error in y4 and y5 will not inflate the standard errors of the direct and indirect effects.
[Path diagram for Model 4: ξ1 → η1 (β11), η1 → η2 (γ12), and ξ2 → η2 (β22); ξ1 and ξ2 are correlated (σ12); x1, x2, x3 are congeneric indicators of ξ1 (loadings 1, λx2, λx3; unique factors δ1, δ2, δ3), x4 and x5 are tau-equivalent indicators of ξ2 (unique factors δ4, δ5), y1, y2, y3 are congeneric indicators of η1 (loadings 1, λy2, λy3; measurement errors ε1, ε2, ε3), and y4 and y5 are tau-equivalent indicators of η2 (measurement errors ε4, ε5); e1 and e2 are the prediction errors of η1 and η2]

The lavaan model specification for Model 4 is given below.

path.model <- ' ksi1 =~ 1*x1 + lamx2*x2 + lamx3*x3
                ksi2 =~ 1*x4 + 1*x5
                eta1 =~ 1*y1 + lamy2*y2 + lamy3*y3
                eta2 =~ 1*y4 + 1*y5
                eta1 ~ b11*ksi1
                eta2 ~ b22*ksi2 + g12*eta1
                ind := b11*g12
                ksi1 ~~ ksi2 '

fit <- sem(path.model, data = mydata)

The ksi2 =~ 1*x4 + 1*x5 command defines ξ2 and constrains the two factor loadings for x4 and x5 to equal 1. The eta2 =~ 1*y4 + 1*y5 command defines η2 and constrains the two factor loadings for y4 and y5 to equal 1. The ksi1 =~ 1*x1 + lamx2*x2 + lamx3*x3 and eta1 =~ 1*y1 + lamy2*y2 + lamy3*y3 commands each define a congeneric measurement model. The ind := b11*g12 command defines the indirect effect of ξ1 on η2.
The ksi1 ~~ ksi2 command specifies the covariance between ξ1 and ξ2. The variances of δ4 and δ5 and the variances of ε4 and ε5 have not been constrained in the above model specification, which defines a tau-equivalent measurement model for x4 and x5 and a tau-equivalent measurement model for y4 and y5. Parallel measurement models could be defined by imposing one equality constraint on the variances of δ4 and δ5 and another equality constraint on the variances of ε4 and ε5. These equality constraints can be specified by adding the commands x4 ~~ var1*x4, x5 ~~ var1*x5, y4 ~~ var2*y4, and y5 ~~ var2*y5. The covariance between e1 and e2 has been constrained to equal 0 in Model 4, but this constraint could be removed by adding the command eta1 ~~ eta2 to the model specification. The β12 = 0 constraint could be removed by changing eta2 ~ b22*ksi2 + g12*eta1 to eta2 ~ b22*ksi2 + g12*eta1 + b12*ksi1. The β21 = 0 constraint could be removed by changing eta1 ~ b11*ksi1 to eta1 ~ b11*ksi1 + b21*ksi2. However, only two of these three constraints can be removed because otherwise the model will not be identified.

Latent Growth Curve Model

In a longitudinal study, suppose each participant (i = 1 to n) is measured at the same set of k time points (e.g., Jan, Feb, Mar, Apr). In the simplest case, the purpose of the study is to assess the linear change in the response variable over time. In this simple case, the statistical model for one randomly selected participant can be expressed as

y_it = b0i + b1i t + e_it    (3.1)

where b0i is the y-intercept for participant i, b1i is the slope of the line relating time to y for participant i, and t is the time point value (e.g., t = 1, 2, 3, 4). Given that the n participants are assumed to be a random sample from some population, it follows that the b0i and b1i values are a random sample from a population of person-level y-intercept and slope values. Equation 3.1 is called a level-1 model.
In the same way that a statistical model describes a random sample of y scores, statistical models can be used to describe a random sample of b0i and b1i values. The statistical models for b0i and b1i are called level-2 models. The following level-2 models for b0i and b1i are the simplest type because they have no predictor variables.
b0i = β00 + u0i    (3.2a)
b1i = β10 + u1i    (3.2b)

where u0i and u1i are the parameter prediction errors for the random values of b0i and b1i, respectively. These parameter prediction errors are usually assumed to be correlated with each other but are assumed to be uncorrelated with the level-1 prediction errors (e_it). The variance of u0i describes the variability of the person-level y-intercepts and the variance of u1i describes the variability of the person-level slopes in the population.

A path diagram of a latent growth curve model is illustrated below (Model 5) for the case of four equally-spaced time points. Note that the factor loadings for the intercept factor (η0) are all set equal to 1 and the factor loadings for the slope factor (η1) are set equal to 0, 1, 2, and 3. Setting the slope factor loadings to 0, 1, ..., k − 1 is called baseline centering. It is necessary to constrain the y-intercepts for y1, y2, ..., yk to zero in order to estimate β00 and β10. With baseline centering, β00 describes the population mean y score at baseline. The population mean of the person-level slopes relating time to y is described by β10. With unequally-spaced time points, such as 1, 2, 5, and 10, the slope factor loadings could be set to 0, 1, 4, and 9.

[Path diagram for Model 5: intercept factor η0 (loadings 1, 1, 1, 1) and slope factor η1 (loadings 0, 1, 2, 3) predict y1 through y4 (measurement errors ε1 through ε4); η0 and η1 have means β00 and β10 and prediction errors u0 and u1]
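The level-1 and level-2 equations can be made concrete with a small simulation. The sketch below uses purely illustrative population values (not estimates from any real data) to generate person-level intercepts and slopes from Equations 3.2a and 3.2b and then the repeated measures from Equation 3.1.

```python
import random

random.seed(1)

# Illustrative population parameters (assumptions for this sketch)
beta00, beta10 = 50.0, 2.0          # mean intercept and mean slope
sd_u0, sd_u1, sd_e = 5.0, 0.5, 1.0  # SDs of u0i, u1i, and e_it

n, times = 5000, [0, 1, 2, 3]       # baseline centering: t = 0, 1, 2, 3

data = []
for i in range(n):
    b0 = beta00 + random.gauss(0, sd_u0)   # level-2 model for b0i (Eq. 3.2a)
    b1 = beta10 + random.gauss(0, sd_u1)   # level-2 model for b1i (Eq. 3.2b)
    row = [b0 + b1 * t + random.gauss(0, sd_e) for t in times]  # level-1
    data.append(row)

# With baseline centering, the mean score at t = 0 estimates beta00 and
# the mean per-unit change estimates beta10.
mean_baseline = sum(row[0] for row in data) / n
mean_change = sum((row[3] - row[0]) / 3 for row in data) / n
print(round(mean_baseline, 2), round(mean_change, 2))
```

In a large sample the two summaries recover the assumed β00 = 50 and β10 = 2, which is what the intercept and slope factor means estimate in Model 5.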
The lavaan model specification for Model 5 is given below. The growth function works like the sem function, but the growth function is more convenient for latent growth curve models because it automatically specifies the intercepts (β00 and β10) for the intercept factor and the slope factor, and the y-intercepts for y1, y2, ..., yk are automatically constrained to equal 0.

growth.model <- ' inter =~ 1*y1 + 1*y2 + 1*y3 + 1*y4
                  slope =~ 0*y1 + 1*y2 + 2*y3 + 3*y4 '

fit <- growth(growth.model, data = mydata)
parameterestimates(fit, ci = T, level = .95)

Some of the variability in b0i and b1i could be explained by one or more predictor variables. Suppose that b0i and b1i are believed to be related to just one predictor variable x2. We can now specify the following level-2 models for b0i and b1i.

b0i = β00 + β01 x2i + u0i    (3.3a)
b1i = β10 + β11 x2i + u1i    (3.3b)

A predictor variable in a level-2 model is referred to as a time-invariant covariate because it is measured at a single point in time, usually at or before the first time period. For instance, suppose y in Model 5 represents self-esteem measured from a sample of students at four points in time (e.g., grades 3, 4, 5, and 6). A measure of extroversion at grade 3 could be used as a time-invariant predictor of self-esteem. Demographic variables such as gender, mother's education, or number of siblings are a few other examples of time-invariant covariates. The level-2 models can have zero, one, or more time-invariant covariates. The covariates for b0i are usually, but not necessarily, the same as the covariates for b1i. The lavaan model specification for a latent growth model with one time-invariant covariate (gender) is given below.

growth.model <- ' inter =~ 1*y1 + 1*y2 + 1*y3 + 1*y4
                  slope =~ 0*y1 + 1*y2 + 2*y3 + 3*y4
                  inter ~ gender
                  slope ~ gender '
In some applications, the level-1 model will include one or more predictor variables that are measured at each time period. This type of predictor variable is referred to as a time-varying covariate. Consider again the example where self-esteem is measured in grades 3, 4, 5, and 6. If academic performance is also measured each year, and we believe that self-esteem in year t is related to academic performance in year t, then the level-1 model could be expressed as

y_it = b0i + b1i t + b2i x1it + e_it    (3.4)

where x1it is an academic performance score for student i in year t. A level-1 model can have zero, one, or more time-varying covariates. The lavaan model specification for one time-invariant covariate (gender) and one time-varying covariate (self-esteem) is given below.

growth.model <- ' inter =~ 1*y1 + 1*y2 + 1*y3 + 1*y4
                  slope =~ 0*y1 + 1*y2 + 2*y3 + 3*y4
                  y1 ~ selfesteem1
                  y2 ~ selfesteem2
                  y3 ~ selfesteem3
                  y4 ~ selfesteem4
                  inter ~ gender
                  slope ~ gender '

The following approximate 100(1 − α)% confidence interval for σ²β0 and σ²β1 provides important information about the person-level variability in the intercept and slope factors

exp[ln(σ²βj) ± zα/2 √var{ln(σ²βj)}]    (3.5)

where var{ln(σ²βj)} is the squared standard error of ln(σ²βj). Square roots of the endpoints of Equation 3.5 give a confidence interval for the standard deviation of the intercept or slope random variable. The computation of Equation 3.5 can be simplified by letting lavaan compute the confidence interval for ln(σ²βj) and then exponentiating the endpoints by hand to obtain a confidence interval for σ²βj.
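The exponentiate-by-hand step can be sketched numerically. The values below (a log-variance estimate and its standard error) are illustrative only, standing in for what lavaan would report for a defined log-variance parameter.

```python
import math

# Illustrative values: ln(variance) estimate and its standard error,
# as lavaan would report for a defined parameter such as log(varslope).
log_var, se = math.log(0.40), 0.25
z = 1.96  # z_{alpha/2} for a 95% interval

# CI for ln(variance), then exponentiate the endpoints (Equation 3.5)
lo, hi = math.exp(log_var - z * se), math.exp(log_var + z * se)
print(round(lo, 3), round(hi, 3))                # CI for the variance
print(round(lo ** 0.5, 3), round(hi ** 0.5, 3))  # CI for the SD
```

Note that the resulting interval is not symmetric around the variance estimate of 0.40, which is expected for an interval constructed on the log scale.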
growth.model <- ' inter =~ 1*y1 + 1*y2 + 1*y3 + 1*y4
                  slope =~ 0*y1 + 1*y2 + 2*y3 + 3*y4
                  inter ~~ varinter*inter
                  slope ~~ varslope*slope
                  logvarinter := log(varinter)
                  logvarslope := log(varslope) '

fit <- growth(growth.model, data = mydata)
parameterestimates(fit, ci = T, level = .95)

The level-1 and level-2 models can be analyzed using mixed model statistical programs that do not require the same set of time periods for each participant. For example, mixed model programs allow one participant to be measured on occasions 1, 2, 4, and 6, a second participant to be measured on occasions 3, 5, 9, and 10, a third participant to be measured on occasions 1 and 7, and so on. However, suppose one or more of the predictor variables in the level-1 or level-2 models are latent variables. Then the mixed model programs are of no use and a latent growth curve model is required. A latent growth curve model (e.g., Model 5) can also be part of a more complex model where the intercept and slope factors are predictors of other observed or latent variables; this type of analysis is not possible using mixed model programs. The confidence intervals for σ²β0 and σ²β1 computed in mixed model programs assume the person-level intercept and slope coefficients are normally distributed in the population. This normality assumption can be relaxed in a latent growth curve analysis using optional robust standard errors.

Multiple-Group Latent Variable Models

A k-group design can be represented in a GLM by including k − 1 indicator variables as predictor variables in the model along with any quantitative predictor variables of y. Consider the simplest case of k = 2 groups with one quantitative predictor of y. Using dummy coding, the following model includes one quantitative predictor variable (x1), one dummy-coded variable (x2) to code the 2-group design, and the product of x1 and x2 to code the interaction between x1 and x2

y_i = β0 + β1 x1i + β2 x2i + β3 (x1i x2i) + e_i.    (3.6)

Alternatively, the above model can be represented by specifying two models, one for each of the two groups, as shown below
y_1i = β10 + β11 x11i + e_1i    (3.7a)
y_2i = β20 + β21 x21i + e_2i    (3.7b)

where the first subscript indicates group membership (1 or 2). It can be shown that the parameters of Equation 3.6 can be expressed in terms of the parameters of Equations 3.7a and 3.7b. Specifically, it can be shown that β2 = β10 − β20 and β3 = β11 − β21. Equations 3.7a and 3.7b are sometimes preferred to Equation 3.6 when the interaction effect is expected to be non-trivial and the researcher anticipates an examination of simple slopes, which are the β11 and β21 coefficients in Equations 3.7a and 3.7b. More importantly, if any of the quantitative predictor variables are latent variables, Equation 3.6 is of no use because it is not possible to compute the product of a dummy variable with a latent (unmeasured) variable.

Programs like lavaan can be used to analyze latent variable models in which participants have been classified (e.g., male vs. female) or randomly assigned (e.g., treatment 1 vs. treatment 2 vs. treatment 3) into k ≥ 2 groups. Thus, the k groups can represent the levels of a classification factor or a treatment factor. The k groups could also represent the combinations of two or more classification or treatment factors. In multiple-group studies, a model can be specified within each group. The model within each group could be any of the statistical models described in Module 1, any of the measurement or confirmatory factor analysis models described in Module 2, or any of the latent variable statistical models that have been described up to this point in Module 3. Interesting research questions can be addressed by comparing or combining unstandardized or standardized parameters (e.g., slopes, factor loadings, indirect effects, total effects, reliability coefficients, factor correlations, unique factor variances, means) from the multiple groups. Multiple-group measurement models can be used to assess measurement invariance.
Strict measurement invariance across two or more groups assumes equal factor loadings, equal intercepts, and equal unique error variances across groups, although the loadings, intercepts, and error variances may differ within groups. A path diagram for a 2-group study with a strictly parallel measurement model for each group is shown below.
[Path diagram for Model 6: within each group k (k = 1, 2), a single factor η has strictly parallel indicators y1, y2, y3 with common loading λy1k, common intercept μ1k, and measurement errors ε1k, ε2k, ε3k]

The following approximate 100(1 − α)% confidence interval for λyj1 − λyj2 can be used to assess the similarity of the population factor loading λyj in two study populations

λyj1 − λyj2 ± zα/2 √[var(λyj1) + var(λyj2)]    (3.8)

where var(λyjk) is the squared standard error of λyjk and √[var(λyj1) + var(λyj2)] is the estimated standard error of λyj1 − λyj2. Equation 3.8 can be used for unstandardized or standardized factor loadings, although a confidence interval for a difference in standardized loadings is usually easier to interpret. The following approximate 100(1 − α)% confidence interval for μj1 − μj2 can be used to assess the similarity of the population y-intercepts (means) in two study populations

μj1 − μj2 ± zα/2 √[var(μj1) + var(μj2)]    (3.9)
where var(μjk) is the squared standard error of μjk and √[var(μj1) + var(μj2)] is the estimated standard error of μj1 − μj2. An approximate 100(1 − α)% confidence interval for σ²ε1/σ²ε2 can be used to assess the similarity of the unique error variances in two study populations

exp[ln(σ²ε1/σ²ε2) ± zα/2 √(var{ln(σ²ε1)} + var{ln(σ²ε2)})]    (3.10)

where var{ln(σ²εk)} is the squared standard error of ln(σ²εk). The square roots of the endpoints of Equation 3.10 give a confidence interval for the ratio of unique factor standard deviations, which is easier to interpret than a ratio of variances. The lavaan model specification and multiple-group sem function for Model 6 are given below.

parallel.model <- ' eta =~ c(lam1, lam2)*y1 + c(lam1, lam2)*y2 + c(lam1, lam2)*y3
                    y1 ~~ c(var1, var2)*y1
                    y2 ~~ c(var1, var2)*y2
                    y3 ~~ c(var1, var2)*y3
                    lamdiff := lam1 - lam2
                    logratio := log(var1/var2) '

fit <- sem(parallel.model, data = mydata, std.lv = T, group = "group")
parameterestimates(fit, ci = T, level = .95)

The eta =~ c(lam1, lam2)*y1 + c(lam1, lam2)*y2 + c(lam1, lam2)*y3 command defines η with equality-constrained factor loadings within each group but not across groups. The y1 ~~ c(var1, var2)*y1, y2 ~~ c(var1, var2)*y2, and y3 ~~ c(var1, var2)*y3 commands constrain the error variances to be equal within each group but not across groups. The data file contains four variables named y1, y2, y3, and group. The lamdiff := lam1 - lam2 command creates a new parameter called lamdiff that is the difference in the common factor loading in the two groups. The logratio := log(var1/var2) command creates a new parameter called logratio, which is the natural logarithm of the ratio of the common error variances in the two groups (var1 and var2). The parameterestimates command will compute a confidence interval for ln(σ²ε1/σ²ε2), and the endpoints of this interval can be exponentiated by hand to give a confidence interval for σ²ε1/σ²ε2.
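Equation 3.8 (and, with means substituted, Equation 3.9) can also be computed by hand from per-group output. The sketch below uses illustrative loading estimates and standard errors, not values from any real fit.

```python
import math

def diff_ci(est1, se1, est2, se2, z=1.96):
    """Equations 3.8/3.9: CI for a difference of two independent estimates."""
    diff = est1 - est2
    se_diff = math.sqrt(se1 ** 2 + se2 ** 2)  # sqrt of summed squared SEs
    return diff - z * se_diff, diff + z * se_diff

# Illustrative standardized loadings and standard errors from two groups
lo, hi = diff_ci(0.80, 0.05, 0.65, 0.06)
print(round(lo, 3), round(hi, 3))
```

Here the interval barely includes 0, so the loading difference of .15 could not be declared meaningfully large at the 95% level.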
The above code allows the loadings, intercepts, and error variances to differ across the two groups. To constrain only the loadings to be equal across groups, the group.equal option in the sem function could be used as shown below.

fit <- sem(parallel.model, data = mydata, std.lv = T, group = "group",
           group.equal = "loadings")

The following code will equality-constrain the loadings, intercepts, and error variances across groups.

fit <- sem(parallel.model, data = mydata, std.lv = T, group = "group",
           group.equal = c("loadings", "intercepts", "residuals"))

[Path diagram for Model 7: within each group k (k = 1, 2), y is regressed on ξ1 (β1k) and ξ2 (β2k) with prediction error ek; x1 and x2 are tau-equivalent indicators of ξ1 (common loading λx1k; unique factors δ1k, δ2k), x3 and x4 are tau-equivalent indicators of ξ2 (common loading λx2k; unique factors δ3k, δ4k), and ξ1 and ξ2 are correlated (ρ12k)]

The path diagram for a 2-group GLM with latent predictor variables is shown above (Model 7). In this example, the two latent predictor variables are assumed to each have two tau-equivalent indicator variables. The two slope coefficients within each group (β1j and β2j) are the simple slopes, and the differences in simple slopes (β11 − β12 and β21 − β22) describe the Group × ξ1 and Group × ξ2 interactions, respectively.
The following approximate 100(1 − α)% confidence interval for a difference in slope parameters (e.g., βj1 − βj2) can be used to assess the similarity of the population slope parameters in two study populations

βj1 − βj2 ± zα/2 √[var(βj1) + var(βj2)]    (3.11)

where var(βjk) is the squared standard error of βjk and √[var(βj1) + var(βj2)] is the estimated standard error of βj1 − βj2. The slope estimates and their standard errors in Equation 3.11 can be replaced with standardized slope estimates and their standard errors.

Let ρ1 represent the population correlation between two factors, two prediction errors, or two measurement errors that is estimated in group 1, and let ρ2 represent the corresponding population correlation that is estimated from group 2. Let ρ̂1 and ρ̂2 denote the estimates of ρ1 and ρ2, respectively. Let L1 and U1 denote the lower and upper 100(1 − α)% interval estimates computed from group 1 using Equation 2.17 (Module 2), and let L2 and U2 denote the lower and upper 100(1 − α)% interval estimates computed from group 2 using Equation 2.17 (Module 2). Approximate lower and upper 100(1 − α)% interval estimates for ρ1 − ρ2 are

L = ρ̂1 − ρ̂2 − √[(ρ̂1 − L1)² + (ρ̂2 − U2)²]    (3.12a)
U = ρ̂1 − ρ̂2 + √[(ρ̂1 − U1)² + (ρ̂2 − L2)²]    (3.12b)

The lavaan model specification and multiple-group sem function for Model 7 are shown below. A group difference in the slope parameters is defined for each of the two predictor variables, and lavaan will then compute Equation 3.11 for each difference. A group.equal = "regressions" option could be added to the sem function to equality-constrain the slope coefficients across groups.

twogroupglm.model <- ' ksi1 =~ c(lamx11, lamx12)*x1 + c(lamx11, lamx12)*x2
                       ksi2 =~ c(lamx21, lamx22)*x3 + c(lamx21, lamx22)*x4
                       y ~ c(b11, b12)*ksi1 + c(b21, b22)*ksi2
                       b1diff := b11 - b12
                       b2diff := b21 - b22 '

fit <- sem(twogroupglm.model, data = mydata, std.lv = T, group = "group")
parameterestimates(fit)
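Equations 3.12a and 3.12b are computed by hand from the two within-group interval estimates. The sketch below uses illustrative correlation estimates and 95% interval endpoints, not values from any real analysis.

```python
import math

def corr_diff_ci(r1, l1, u1, r2, l2, u2):
    """Equations 3.12a and 3.12b: approximate CI for rho1 - rho2,
    built from the two within-group estimates and interval endpoints."""
    lower = r1 - r2 - math.sqrt((r1 - l1) ** 2 + (r2 - u2) ** 2)
    upper = r1 - r2 + math.sqrt((r1 - u1) ** 2 + (r2 - l2) ** 2)
    return lower, upper

# Illustrative estimates and 95% interval endpoints from the two groups
lo, hi = corr_diff_ci(0.60, 0.45, 0.72, 0.40, 0.22, 0.56)
print(round(lo, 3), round(hi, 3))
```

Unlike Equation 3.11, this construction lets the interval for the difference inherit the asymmetry of the two within-group correlation intervals.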
Model Assessment

All theoretically important included paths (slopes, factor loadings, correlations) in a latent variable model should describe meaningfully large relations. Furthermore, all excluded paths should represent small or unimportant relations. As a general recommendation for parameters that have been included in the model, standardized slope coefficients should be greater than .25 in absolute value (larger is better) and standardized factor loadings should be greater than .4 in absolute value (larger is better). Theoretically important correlations among factors or observed variables should also be greater than .25 in absolute value. Standardized factor loadings are equal to Pearson correlations in factor models with a single factor or multiple uncorrelated factors. A standardized slope for a particular response variable is equal to a Pearson correlation if there is only one predictor of that response variable or if the multiple predictor variables are uncorrelated. Confidence intervals for standardized slopes, standardized factor loadings, and correlations can be used to assess the magnitude of the parameters. Specifically, a 95% confidence interval should be completely outside the −.25 to .25 range for a standardized slope or correlation and completely outside the −.4 to .4 range for a standardized factor loading. Ideally, 95% Bonferroni confidence intervals for the included parameters will indicate that all included parameters are meaningfully large.

Model modification indices are useful in assessing model misspecification. Each index is a one degree of freedom chi-square test statistic for the null hypothesis that the parameter constraint is correct. Model modification indices for factor loadings should be examined first.
The omitted factor loading with the largest modification index can be added to the model, and if the 95% confidence interval for this factor loading is completely contained within the -.4 to .4 range (smaller is better), then the researcher could argue that constraining this loading to zero is justifiable. If the 95% confidence interval for the standardized factor loading with the largest modification index is completely contained within the -.4 to .4 range, it is likely that all other factor loadings that were constrained to zero are also small. If the 95% confidence interval for this loading is completely outside the -.4 to .4 range, then this loading should be retained in the model and all model parameters need to be re-estimated. The factor loading with the largest modification index in the revised model should also be assessed. If the confidence interval for a standardized factor loading includes -.4 or .4, the statistical results are inconclusive and
the researcher must decide to include or exclude that factor loading based on non-statistical criteria. After the excluded factor loadings have been assessed, the excluded slope parameters should be examined. The slope parameter with the largest modification index should be added to the model. If the 95% confidence interval for this standardized slope is completely contained within the -.25 to .25 range, then the researcher could argue that constraining this slope to zero is justifiable and it is likely that all other excluded slope parameters are also small. If the 95% confidence interval for this standardized slope parameter is completely outside the -.25 to .25 range, then this slope parameter should be included in the model and the excluded slope with the largest modification index in the revised model should be examined. If the confidence interval for a standardized slope includes -.25 or .25, the statistical results are inconclusive and the researcher must decide to include or exclude that slope parameter based on non-statistical criteria. When assessing parameter similarity across groups in a multi-group design, the confidence intervals for differences of factor loadings, slope parameters, or means will ideally be acceptably narrow and include 0. Confidence intervals for ratios of standard deviations should be acceptably narrow and include 1. If the confidence interval for the difference in parameters is completely contained within a -h to h interval, where h represents an acceptably small difference in parameter values, this would provide convincing evidence of parameter similarity. The value of h will depend on the parameter and the application. It is usually easier to specify h for a standardized factor loading or a standardized slope. 
For instance, if the confidence interval for a difference in standardized slopes or standardized factor loadings is completely contained within a -.1 to .1 interval, this would indicate that the standardized parameter values are very similar in the two study populations. However, a small value of h, such as .1 for a standardized slope or factor loading, will require a large sample because large samples are needed to obtain narrow confidence intervals. If several constrained parameters are unconstrained after an exploratory examination of the modification indices, the p-value and confidence interval for a particular parameter in the final model can be misleading. At a minimum, all exploratory modifications should be described in the research report. Ideally, the final model will be reanalyzed in a new
random sample. If the researcher has access to a large random sample, the sample can be randomly divided into two samples with the exploratory analysis performed on the first sample and a confirmatory analysis performed in the second (validation) sample. Only the results in the validation sample should be reported. Chi-square GOF tests can be used to assess the path models in Module 1 and all of the models in Modules 2 and 3. The GOF test is a test of the null hypothesis that all constraints (e.g., all excluded parameters equal 0 and all equality-constrained parameters are equal) on the model are correct. The GOF test is routinely misinterpreted. Researchers incorrectly interpret a p-value greater than .05 as evidence that the model is correct. In fact, the null hypothesis is almost never correct in any real application, and the p-value can exceed .05 in small samples even if the model is badly misspecified. In large samples, the p-value for a GOF test can be much less than .05 in models that are only trivially misspecified. Chi-square model comparison tests are also very popular. In multiple group designs, one model might allow corresponding parameter values to differ across groups and another model constrains these parameters to be equal across groups. The chi-square model comparison test in this example is a test of the null hypothesis that all corresponding parameters are equal across groups. This test is routinely misinterpreted. Researchers will interpret a p-value greater than .05 as evidence that all corresponding parameters are equal across groups, and if the p-value is less than .05, researchers often conclude that the parameters differ meaningfully across groups. In fact, a p-value greater than .05 does not imply the parameters are equal, and a p-value less than .05 does not imply that the parameter values are meaningfully different. 
Confidence intervals for differences or ratios of parameters, rather than model comparison tests, are needed to determine if the corresponding parameter values are similar or dissimilar across groups. Model comparison tests are also used to compare a model that includes all of the theoretically specified path parameters with a second model that omits all of these parameters. If the p-value for the chi-square model comparison test is less than .05, researchers incorrectly interpret this result as evidence that the model with the theoretically specified paths is correct or acceptable. Despite the serious limitations of the GOF and model comparison tests, most social science journals expect authors to report the results of a GOF test and possibly the results of a model comparison test. The recommendation
here is to supplement a GOF or model comparison test with appropriate confidence interval results.

Equivalent Models

Equivalent models are models that have identical GOF test statistic and fit index values with identical degrees of freedom. For instance, the six models shown below (with error terms omitted) for three variables (a, b, c) are all equivalent models with df = 1. Additional equivalent models can be specified by replacing a one-headed arrow in any of these models with a two-headed arrow.

a → b → c (Model 9a)
a → c → b (Model 9b)
b → a → c (Model 9c)
b → c → a (Model 9d)
c → a → b (Model 9e)
c → b → a (Model 9f)

When presenting the results for a proposed model, it is important to acknowledge the existence of equivalent models because different equivalent models can have substantially different interpretations and causal implications. Some equivalent models can be ruled out based on theory or logic. In applications where two or more plausible theories are represented by equivalent models, the alternative models should be acknowledged when presenting the results of the proposed model.
Assumptions

The GOF tests, model comparison tests, and all confidence intervals assume: 1) random sampling, 2) independence among the n participants, and 3) the observed random variables have an approximate multivariate normal distribution in the study population. The standard errors for path parameters, factor loadings, and correlations are sensitive primarily to the kurtosis of the observed variables. The standard errors will be too small with leptokurtic distributions and too large with platykurtic distributions. Since confidence interval results provide the best way to assess a model, leptokurtosis is more serious than platykurtosis because the confidence intervals will be misleadingly narrow with leptokurtic distributions. As noted in Module 2, if the normality assumption for any particular observed variable has been violated, it might be possible to reduce skewness and kurtosis by transforming that variable. Data transformations might also help reduce nonlinearity and heteroscedasticity. If remedial measures cannot remove excess kurtosis, confidence intervals should be computed using robust or bootstrap standard errors. The recommendation to have a sample size of at least 100 when using ULS estimation and robust standard errors in measurement models and confirmatory factor analysis models also applies to general latent variable statistical models. For indirect and total effects, which can have highly nonnormal sampling distributions, bootstrap confidence intervals based on ULS estimates are recommended. For GOF tests and fit indices, the mean adjusted (Satorra-Bentler) test statistic based on ML estimates is recommended.

Sample Size Recommendations

There are two completely separate issues regarding sample size requirements for the tests and confidence intervals presented in Modules 1, 2, and 3. One issue is the sample size required for a test or confidence interval to perform properly. 
A 95% confidence interval for some parameter is said to perform properly in a sample of size n if about 95% of the confidence intervals computed from all possible samples of size n would contain the parameter value. A hypothesis test with α = .05 in a sample of size n is said to perform properly if the null hypothesis would be rejected in about 5% of all possible samples of size n, assuming the null hypothesis is true. All of the tests and confidence intervals for the GLM and MGLM are small-sample methods and will perform properly in small samples if their assumptions (e.g., random sampling, independence among participants,
prediction error normality) have been satisfied. Tests and confidence intervals for latent variable models are large-sample methods that cannot be expected to perform properly in small samples. If the observed variables are platykurtic or at most moderately leptokurtic, confidence intervals based on ULS estimates with robust standard errors should be acceptable with sample sizes of at least 100. Confidence intervals based on ML estimates with robust standard errors can require larger sample sizes (e.g., 200 or more), especially if the model contains many parameters to be estimated. The sample size needed to obtain acceptably narrow confidence intervals is a completely different issue. If the confidence intervals are too wide, the researcher will not be able to provide convincing evidence that the population factor loadings or slope parameters that have been included in the model are meaningfully large. Narrow confidence intervals are also needed to show that factor loadings and slope parameters that have been excluded from the model are small or unimportant. Large sample sizes are usually needed to obtain acceptably narrow confidence intervals, possibly much larger than the minimum sample size needed for a robust test or confidence interval to perform properly. Sample size formulas to achieve a desired confidence interval width for latent variable model parameters are not useful because they require accurate planning values of unknown population variances and covariances. A more practical approach is to use a sample size that would produce an acceptably narrow confidence interval for a Pearson correlation (ρ yx ) between any two observed variables, because the estimated slopes and factor loadings are functions of the sample correlations. 
The required sample size to estimate ρ yx with 100(1 - α)% confidence and a desired confidence interval width equal to w is approximately

n = 4(1 - ρ yx ²)²(z α/2 /w)² + 3 (3.13)

where ρ yx is a planning value of the Pearson correlation between observed variables y and x. Equation 3.13 could be used to obtain a rough approximation to the sample size needed to show that certain factor loadings or slope parameters are small. Small factor loadings or slope parameters imply that certain correlations are small, and the planning value of ρ yx could then be set to 0. For instance, to obtain a 95% confidence interval for ρ yx that has a width of .2,
Equation 3.13 gives a sample size requirement of 388, which is substantially greater than the recommended minimum sample size requirement of 200 (assuming at most moderate leptokurtosis) for ULS estimates with robust standard errors and the mean adjusted (Satorra-Bentler) GOF test with ML estimates. If the sample can be obtained in two stages, the number of participants to sample in the second stage and add to the first-stage sample to achieve the desired confidence interval width (w) for a specific parameter is approximately equal to

n 2 = n 1 [(w 1 /w)² - 1] (3.14)

where n 1 is the size of the first-stage sample and w 1 is the width (upper limit minus lower limit) of a confidence interval for that parameter obtained in the first-stage sample. A second-stage sample of size n 2 is taken from the same study population and combined with the first-stage sample. The parameters of the latent variable model are then estimated from the combined sample of size n 1 + n 2. The precision of a confidence interval for a standard deviation or a ratio of standard deviations is best described by the upper limit to lower limit ratio rather than the difference. Let r 1 denote the upper limit to lower limit ratio for a standard deviation or a ratio of standard deviations in a first-stage sample of size n 1. The number of participants that should be sampled in the second stage and added to the first-stage sample to achieve a desired upper limit to lower limit ratio (r) is approximately equal to

n 2 = n 1 [{ln(r 1 )/ln(r)}² - 1]. (3.15)
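The planning formulas in Equations 3.13-3.15 involve only simple arithmetic and can be scripted directly. The sketch below (Python for illustration; the function names are mine) reproduces the n = 388 example for a width-.2 interval with a planning correlation of 0 and applies the two second-stage formulas to hypothetical first-stage results. Sample sizes are rounded up to the next whole number, which is consistent with the 388 reported above (4(1.96/.2)² + 3 ≈ 387.1).

```python
import math

def n_for_corr_width(rho, w, z=1.959964):
    # Equation 3.13: sample size to estimate a Pearson correlation with
    # a 95% CI of width w, given planning value rho
    return math.ceil(4 * (1 - rho**2)**2 * (z / w)**2 + 3)

def n2_for_width(n1, w1, w):
    # Equation 3.14: second-stage n to shrink a first-stage CI of width
    # w1 (obtained from n1 participants) down to a desired width w
    return math.ceil(n1 * ((w1 / w)**2 - 1))

def n2_for_ratio(n1, r1, r):
    # Equation 3.15: second-stage n to shrink a first-stage upper/lower
    # limit ratio r1 down to a desired ratio r
    return math.ceil(n1 * ((math.log(r1) / math.log(r))**2 - 1))

print(n_for_corr_width(0.0, 0.2))    # 388, matching the example in the text
print(n2_for_width(100, 0.4, 0.2))   # 300 additional participants
print(n2_for_ratio(100, 2.0, 1.5))   # 193 additional participants
```

Note that halving a confidence interval width requires roughly quadrupling the total sample size, which is exactly what Equation 3.14 produces (n 2 = 3n 1).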
Wooldridge, Introductory Econometrics, 4th ed. Chapter 15: Instrumental variables and two stage least squares Many economic models involve endogeneity: that is, a theoretical relationship does not fit
More informationStep 2: Select Analyze, Mixed Models, and Linear.
Example 1a. 20 employees were given a mood questionnaire on Monday, Wednesday and again on Friday. The data will be first be analyzed using a Covariance Pattern model. Step 1: Copy Example1.sav data file
More informationMultiple Regression. More Hypothesis Testing. More Hypothesis Testing The big question: What we really want to know: What we actually know: We know:
Multiple Regression Ψ320 Ainsworth More Hypothesis Testing What we really want to know: Is the relationship in the population we have selected between X & Y strong enough that we can use the relationship
More informationStructural Equation Modeling
Chapter 11 Structural Equation Modeling Hans Baumgartner and Bert Weijters Hans Baumgartner, Smeal College of Business, The Pennsylvania State University, University Park, PA 16802, USA, E-mail: jxb14@psu.edu.
More informationLecture 5: ANOVA and Correlation
Lecture 5: ANOVA and Correlation Ani Manichaikul amanicha@jhsph.edu 23 April 2007 1 / 62 Comparing Multiple Groups Continous data: comparing means Analysis of variance Binary data: comparing proportions
More informationWhat is in the Book: Outline
Estimating and Testing Latent Interactions: Advancements in Theories and Practical Applications Herbert W Marsh Oford University Zhonglin Wen South China Normal University Hong Kong Eaminations Authority
More informationChapter 3 ANALYSIS OF RESPONSE PROFILES
Chapter 3 ANALYSIS OF RESPONSE PROFILES 78 31 Introduction In this chapter we present a method for analysing longitudinal data that imposes minimal structure or restrictions on the mean responses over
More informationpsyc3010 lecture 2 factorial between-ps ANOVA I: omnibus tests
psyc3010 lecture 2 factorial between-ps ANOVA I: omnibus tests last lecture: introduction to factorial designs next lecture: factorial between-ps ANOVA II: (effect sizes and follow-up tests) 1 general
More informationAdvanced Structural Equations Models I
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike License. Your use of this material constitutes acceptance of that license and the conditions of use of materials on this
More informationChapter 7 Student Lecture Notes 7-1
Chapter 7 Student Lecture Notes 7- Chapter Goals QM353: Business Statistics Chapter 7 Multiple Regression Analysis and Model Building After completing this chapter, you should be able to: Explain model
More informationSEM 2: Structural Equation Modeling
SEM 2: Structural Equation Modeling Week 1 - Causal modeling and SEM Sacha Epskamp 18-04-2017 Course Overview Mondays: Lecture Wednesdays: Unstructured practicals Three assignments First two 20% of final
More informationGeneral structural model Part 2: Categorical variables and beyond. Psychology 588: Covariance structure and factor models
General structural model Part 2: Categorical variables and beyond Psychology 588: Covariance structure and factor models Categorical variables 2 Conventional (linear) SEM assumes continuous observed variables
More informationUsing Mplus individual residual plots for. diagnostics and model evaluation in SEM
Using Mplus individual residual plots for diagnostics and model evaluation in SEM Tihomir Asparouhov and Bengt Muthén Mplus Web Notes: No. 20 October 31, 2017 1 Introduction A variety of plots are available
More informationApplied Quantitative Methods II
Applied Quantitative Methods II Lecture 4: OLS and Statistics revision Klára Kaĺıšková Klára Kaĺıšková AQM II - Lecture 4 VŠE, SS 2016/17 1 / 68 Outline 1 Econometric analysis Properties of an estimator
More informationAlternatives to Difference Scores: Polynomial Regression and Response Surface Methodology. Jeffrey R. Edwards University of North Carolina
Alternatives to Difference Scores: Polynomial Regression and Response Surface Methodology Jeffrey R. Edwards University of North Carolina 1 Outline I. Types of Difference Scores II. Questions Difference
More informationMixed- Model Analysis of Variance. Sohad Murrar & Markus Brauer. University of Wisconsin- Madison. Target Word Count: Actual Word Count: 2755
Mixed- Model Analysis of Variance Sohad Murrar & Markus Brauer University of Wisconsin- Madison The SAGE Encyclopedia of Educational Research, Measurement and Evaluation Target Word Count: 3000 - Actual
More informationEconomics 471: Econometrics Department of Economics, Finance and Legal Studies University of Alabama
Economics 471: Econometrics Department of Economics, Finance and Legal Studies University of Alabama Course Packet The purpose of this packet is to show you one particular dataset and how it is used in
More informationModule 1. Study Populations
Module 1 Study Populations A study population is a clearly defined collection of people, animals, plants, or objects. In social and behavioral research, a study population usually consists of a specific
More informationComparing Change Scores with Lagged Dependent Variables in Models of the Effects of Parents Actions to Modify Children's Problem Behavior
Comparing Change Scores with Lagged Dependent Variables in Models of the Effects of Parents Actions to Modify Children's Problem Behavior David R. Johnson Department of Sociology and Haskell Sie Department
More informationIntroduction to Statistical Analysis
Introduction to Statistical Analysis Changyu Shen Richard A. and Susan F. Smith Center for Outcomes Research in Cardiology Beth Israel Deaconess Medical Center Harvard Medical School Objectives Descriptive
More informationConsequences of measurement error. Psychology 588: Covariance structure and factor models
Consequences of measurement error Psychology 588: Covariance structure and factor models Scaling indeterminacy of latent variables Scale of a latent variable is arbitrary and determined by a convention
More informationRecent Advances in the Field of Trade Theory and Policy Analysis Using Micro-Level Data
Recent Advances in the Field of Trade Theory and Policy Analysis Using Micro-Level Data July 2012 Bangkok, Thailand Cosimo Beverelli (World Trade Organization) 1 Content a) Classical regression model b)
More informationComparing IRT with Other Models
Comparing IRT with Other Models Lecture #14 ICPSR Item Response Theory Workshop Lecture #14: 1of 45 Lecture Overview The final set of slides will describe a parallel between IRT and another commonly used
More informationAssessing the relation between language comprehension and performance in general chemistry. Appendices
Assessing the relation between language comprehension and performance in general chemistry Daniel T. Pyburn a, Samuel Pazicni* a, Victor A. Benassi b, and Elizabeth E. Tappin c a Department of Chemistry,
More informationADVANCED C. MEASUREMENT INVARIANCE SEM REX B KLINE CONCORDIA
ADVANCED SEM C. MEASUREMENT INVARIANCE REX B KLINE CONCORDIA C C2 multiple model 2 data sets simultaneous C3 multiple 2 populations 2 occasions 2 methods C4 multiple unstandardized constrain to equal fit
More informationCourse Introduction and Overview Descriptive Statistics Conceptualizations of Variance Review of the General Linear Model
Course Introduction and Overview Descriptive Statistics Conceptualizations of Variance Review of the General Linear Model PSYC 943 (930): Fundamentals of Multivariate Modeling Lecture 1: August 22, 2012
More informationInferences About the Difference Between Two Means
7 Inferences About the Difference Between Two Means Chapter Outline 7.1 New Concepts 7.1.1 Independent Versus Dependent Samples 7.1. Hypotheses 7. Inferences About Two Independent Means 7..1 Independent
More informationTrendlines Simple Linear Regression Multiple Linear Regression Systematic Model Building Practical Issues
Trendlines Simple Linear Regression Multiple Linear Regression Systematic Model Building Practical Issues Overfitting Categorical Variables Interaction Terms Non-linear Terms Linear Logarithmic y = a +
More informationDESIGNING EXPERIMENTS AND ANALYZING DATA A Model Comparison Perspective
DESIGNING EXPERIMENTS AND ANALYZING DATA A Model Comparison Perspective Second Edition Scott E. Maxwell Uniuersity of Notre Dame Harold D. Delaney Uniuersity of New Mexico J,t{,.?; LAWRENCE ERLBAUM ASSOCIATES,
More informationLongitudinal Data Analysis Using SAS Paul D. Allison, Ph.D. Upcoming Seminar: October 13-14, 2017, Boston, Massachusetts
Longitudinal Data Analysis Using SAS Paul D. Allison, Ph.D. Upcoming Seminar: October 13-14, 217, Boston, Massachusetts Outline 1. Opportunities and challenges of panel data. a. Data requirements b. Control
More information