SEM 2: Structural Equation Modeling Week 2 - Causality and equivalent models Sacha Epskamp 15-05-2018
Covariance Algebra

Let Var(x) denote the variance of x and Cov(x, y) the covariance between x and y. The following rules can be derived:

Var(x) = Cov(x, x)
Cov(x, α) = 0
Cov(x, y) = Cov(y, x)
Cov(αx, βy) = αβ Cov(x, y)
Cov(x + y, z) = Cov(x, z) + Cov(y, z)

where α and β are constants (parameters) and x, y, and z are random variables.
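These scalar rules can be checked numerically. A minimal Python sketch on simulated data (sample covariances satisfy the same identities exactly, up to floating point; the variables and constants are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1000
x = rng.normal(size=n)
y = 0.5 * x + rng.normal(size=n)
z = rng.normal(size=n)
alpha, beta = 2.0, -3.0

def cov(u, v):
    """Sample covariance between two variables."""
    return np.cov(u, v)[0, 1]

assert np.isclose(np.var(x, ddof=1), cov(x, x))          # Var(x) = Cov(x, x)
assert np.isclose(cov(x, np.full(n, alpha)), 0.0)        # Cov(x, alpha) = 0
assert np.isclose(cov(x, y), cov(y, x))                  # Cov(x, y) = Cov(y, x)
assert np.isclose(cov(alpha * x, beta * y),
                  alpha * beta * cov(x, y))              # Cov(ax, by) = ab Cov(x, y)
assert np.isclose(cov(x + y, z), cov(x, z) + cov(y, z))  # bilinearity
```

Note these identities hold exactly for sample covariances (bilinearity of the sample covariance), not only in expectation.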
Matrix Covariance Algebra

Let Var(x) denote the variance–covariance matrix of vector x and Cov(x, y) the covariance matrix between x and y. Then the following rules can be derived:

Var(x) = Cov(x, x)
Cov(Ax, By) = A Cov(x, y) Bᵀ
Var(Bx) = B Var(x) Bᵀ
Cov(x + y, z) = Cov(x, z) + Cov(y, z)

where A and B are constant (parameter) matrices.
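The matrix versions can be verified the same way. A sketch with arbitrary made-up matrices A and B and simulated random vectors (rows of X and Y are observations):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 1000
X = rng.normal(size=(n, 3))                                # observations of vector x
Y = X @ rng.normal(size=(3, 2)) + rng.normal(size=(n, 2))  # observations of vector y
A = rng.normal(size=(2, 3))
B = rng.normal(size=(4, 2))

def ccov(U, V):
    """Sample cross-covariance matrix Cov(u, v)."""
    Uc, Vc = U - U.mean(axis=0), V - V.mean(axis=0)
    return Uc.T @ Vc / (len(U) - 1)

# Cov(Ax, By) = A Cov(x, y) B^T
assert np.allclose(ccov(X @ A.T, Y @ B.T), A @ ccov(X, Y) @ B.T)
# Var(By) = B Var(y) B^T
assert np.allclose(ccov(Y @ B.T, Y @ B.T), B @ ccov(Y, Y) @ B.T)
```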
Path analysis

[Path diagram: x → y1 → y2, with effects β1 and β2 and residual variances θ1 and θ2.] x is exogenous, and both y1 and y2 are endogenous. θ1 is the variance of ε1.

Causal model for y2:

y_i2 = β2 y_i1 + ε_i2
y_i2 = β2 (β1 x_i + ε_i1) + ε_i2
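Applying the covariance rules to this causal model gives the implied covariance between x and y2 (a worked step, assuming, as usual, that the residuals are uncorrelated with x):

```latex
\begin{aligned}
\mathrm{Cov}(x, y_2) &= \mathrm{Cov}\!\left(x,\; \beta_2\beta_1 x + \beta_2\varepsilon_1 + \varepsilon_2\right)\\
&= \beta_2\beta_1\,\mathrm{Var}(x) + \beta_2\,\mathrm{Cov}(x, \varepsilon_1) + \mathrm{Cov}(x, \varepsilon_2)\\
&= \beta_1\beta_2\,\mathrm{Var}(x).
\end{aligned}
```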
Tracing rules

Compound paths between F and G: F–D–B–B–E–G, F–C–A–B–E–G, and F–D–A–B–E–G.

Cov(F, G) = fc Var(b) dg + fbhdg + eahgd
SEM model:

Σ = Λ (I − B)⁻¹ Ψ (I − B)⁻ᵀ Λᵀ + Θ

Simply the CFA model with one extra matrix: B, encoding regression parameters. Element β_ij encodes the effect of variable j on variable i (note: this is the opposite of how a directed network is normally encoded). The same identification rules as in CFA apply:

Latent variables must be scaled by setting one factor loading or the (residual) variance to 1
The model must have at least 0 degrees of freedom
[Path diagram: η1 → η2 → η3 (effects β21, β32; variances ψ11, ψ22, ψ33), with indicators y1–y6 (loadings 1, λ21; 1, λ42; 1, λ63) and residual variances θ11–θ66.]

Λ = [1 0 0; λ21 0 0; 0 1 0; 0 λ42 0; 0 0 1; 0 0 λ63],
Ψ = [ψ11 0 0; 0 ψ22 0; 0 0 ψ33],
B = [0 0 0; β21 0 0; 0 β32 0]

Θ diagonal as usual.
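The model-implied covariance matrix for this model can be computed directly from the SEM formula Σ = Λ(I − B)⁻¹Ψ(I − B)⁻ᵀΛᵀ + Θ. A numerical sketch (all parameter values here are made up for illustration):

```python
import numpy as np

# Illustrative (made-up) parameter values:
l21, l42, l63 = 0.8, 0.7, 0.9
b21, b32 = 0.5, 0.4

Lambda = np.array([[1,   0,   0],
                   [l21, 0,   0],
                   [0,   1,   0],
                   [0,   l42, 0],
                   [0,   0,   1],
                   [0,   0,   l63]], dtype=float)
Psi = np.diag([1.0, 1.0, 1.0])
B = np.array([[0,   0,   0],
              [b21, 0,   0],
              [0,   b32, 0]], dtype=float)
Theta = np.diag([0.5] * 6)

# Sigma = Lambda (I - B)^{-1} Psi (I - B)^{-T} Lambda^T + Theta
IminB_inv = np.linalg.inv(np.eye(3) - B)
Var_eta = IminB_inv @ Psi @ IminB_inv.T   # implied latent covariance matrix
Sigma = Lambda @ Var_eta @ Lambda.T + Theta

assert np.allclose(Sigma, Sigma.T)        # Sigma is symmetric
assert np.isclose(Sigma[0, 0], 1.0 + 0.5) # Var(y1) = psi11 + theta11
assert np.isclose(Sigma[0, 2], b21)       # Cov(y1, y3) = Cov(eta1, eta2) = beta21 * psi11
```

The last two assertions match what the tracing rules give by hand for this chain model.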
[Same model as the previous slide.] Lavaan model (using sem()):

eta1 =~ y1 + y2
eta2 =~ y3 + y4
eta3 =~ y5 + y6
eta2 ~ eta1
eta3 ~ eta2
Causality

Given the following causal statement: Rain causes the grass to become wet. Which of the statements below are plausible?

If it does not rain, the grass does not become wet
If it rains, grass always becomes wet
If it rains, grass is more likely to be wet than if it doesn't rain
If the grass is wet, it must be / have been raining
If we see that the grass is wet, we can predict it might have been raining
If we make the grass wet, we can predict it might have been raining
Causal relationships should be probabilistic: rain increases the probability that a random field of grass will be wet. Causal relations should be framed in terms of interventions on a model: making the grass wet does not change the probability that it rains. Pearl noted that statistics has no words to express that "A causes B":

Structural equations
Graphical models to portray causal hypotheses / structural equations
Rain → Grass wet

Implies:

Observing that it rains makes it more likely that the grass is wet: P(grass is wet | See(raining)) > P(grass is wet)
Observing that the grass is wet makes it more likely that it rains: P(raining | See(grass is wet)) > P(raining)
But making the grass wet does not make it more likely that it rains (we know, after all, why the grass is wet): P(raining | Do(grass is wet)) = P(raining)

Unfortunately, in observational data (especially without temporal ordering), we can only investigate what happens if we see one variable (conditioning)...
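The see/do distinction can be made concrete with a small Monte Carlo sketch. All probabilities below are made up for illustration: rain occurs with probability 0.3, and rain makes the grass wet with probability 0.9 (versus 0.1 without rain). Conditioning on seeing wet grass raises the probability of rain; intervening to make the grass wet leaves it untouched:

```python
import numpy as np

rng = np.random.default_rng(42)
n = 200_000
rain = rng.random(n) < 0.3                 # P(rain) = 0.3 (assumed)
wet = np.where(rain,
               rng.random(n) < 0.9,        # P(wet | rain) = 0.9 (assumed)
               rng.random(n) < 0.1)        # P(wet | no rain) = 0.1 (assumed)

p_rain = rain.mean()
p_rain_see_wet = rain[wet].mean()          # See(grass is wet): condition on wet

# Do(grass is wet): overwrite wet for every case; rain itself is untouched
wet_do = np.ones(n, dtype=bool)
p_rain_do_wet = rain[wet_do].mean()

assert p_rain_see_wet > p_rain             # seeing wet grass -> rain more likely
assert np.isclose(p_rain_do_wet, p_rain)   # making grass wet changes nothing
```

Analytically, P(raining | See(wet)) = 0.27/0.34 ≈ 0.79 here, versus P(raining) = 0.3 = P(raining | Do(wet)).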
The causal hypothesis Rain → Grass wet allows for the testable hypothesis:

P(grass is wet | See(raining)) > P(grass is wet)

That is, rain and grass being wet should be associated (correlated). However... correlation does not imply causation: one association between two variables is saturated and always fits the data. Solution: more variables and more advanced causal models imply more testable hypotheses (conditional independence relations). These more advanced models can be drawn as directed acyclic graphs (DAGs). In multivariate normal data, DAGs can be parameterized as SEMs (bivariate relationships can be replaced with a latent common cause).
Directed Acyclic Graphs
Building blocks of a DAG

Common cause: A ← B → C. Example: a disease (B) causes two symptoms (A and C).
Chain: A → B → C. Example: insomnia (A) causes fatigue (B), which in turn causes concentration problems (C).
Collider: A → B ← C. Example: difficulty of the class (A) and motivation of the student (C) both cause the grade on a test (B).
To determine whether two variables (e.g., B and F) are conditionally independent given a third variable (e.g., C) or a set of variables:

List all paths between the two variables (ignoring edge direction)
For each path, check whether it is blocked. A path is blocked if the conditioning set contains the middle node of a chain or common-cause structure on that path, or if the path contains a collider whose middle node (the common effect) is not conditioned on, nor is any effect of that common effect

If all paths are blocked, the two variables are d-separated and thus conditionally independent
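These blocking rules can be illustrated by simulation: in a chain, conditioning on the middle node removes the association, while in a collider it creates one. A Python sketch computing partial correlations via regression residuals (the effect sizes are made up):

```python
import numpy as np

rng = np.random.default_rng(7)
n = 100_000

def pcor(x, y, z):
    """Partial correlation of x and y given z, via residuals of
    linear regressions on z (valid under multivariate normality)."""
    Z = np.column_stack([np.ones(len(z)), z])
    rx = x - Z @ np.linalg.lstsq(Z, x, rcond=None)[0]
    ry = y - Z @ np.linalg.lstsq(Z, y, rcond=None)[0]
    return np.corrcoef(rx, ry)[0, 1]

# Chain A -> B -> C: conditioning on B blocks the path
A = rng.normal(size=n)
B = 0.7 * A + rng.normal(size=n)
C = 0.7 * B + rng.normal(size=n)
assert abs(np.corrcoef(A, C)[0, 1]) > 0.2    # marginally dependent
assert abs(pcor(A, C, B)) < 0.02             # d-separated given B

# Collider A -> B <- C: conditioning on B opens the path
A2, C2 = rng.normal(size=n), rng.normal(size=n)
B2 = 0.7 * A2 + 0.7 * C2 + rng.normal(size=n)
assert abs(np.corrcoef(A2, C2)[0, 1]) < 0.02  # marginally independent
assert abs(pcor(A2, C2, B2)) > 0.2            # dependent given the collider
```

The common-cause structure behaves like the chain: conditioning on the middle node blocks the path.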
Example implied conditional independencies: A ⊥ B, A ⊥ D | C, B ⊥ G | {C, E}, ... Testing this causal model involves testing whether all these conditional independence relations hold.
[One-factor model: η1 (variance ψ11) with loadings 1, λ21, λ31 on y1, y2, y3 and residual variances θ11, θ22, θ33.]

Local independence: y1 ⊥ y2 | η1
If multivariate normality holds, then the Schur complement shows that any partial covariance can be expressed solely in terms of variances and covariances:

Cov(Yi, Yj | X = x) = Cov(Yi, Yj) − Cov(Yi, X) Var(X)⁻¹ Cov(X, Yj)

Thus, a specific structure of the correlation matrix also implies a model for all possible partial correlations. If we know Σ, we know everything we can about the relationships between variables. As a result, fitting a SEM model equals simultaneously testing all conditional independence relationships implied by the model!
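A numerical sketch of the Schur-complement formula, using an assumed 3 × 3 covariance matrix for (Y1, Y2, X) (the values are made up):

```python
import numpy as np

# Assumed covariance matrix of (Y1, Y2, X):
Sigma = np.array([[1.0, 0.5, 0.4],
                  [0.5, 1.0, 0.6],
                  [0.4, 0.6, 1.0]])
cov_yy = Sigma[0, 1]      # Cov(Y1, Y2)
cov_yx = Sigma[:2, 2]     # Cov(Y, X)
var_x = Sigma[2, 2]       # Var(X)

# Cov(Y1, Y2 | X) = Cov(Y1, Y2) - Cov(Y1, X) Var(X)^{-1} Cov(X, Y2)
partial = cov_yy - cov_yx[0] * (1 / var_x) * cov_yx[1]

# Same quantity from the full Schur-complement block
# Cov(Y | X) = Sigma_yy - Sigma_yx Sigma_xx^{-1} Sigma_xy:
cond_block = Sigma[:2, :2] - np.outer(cov_yx, cov_yx) / var_x
assert np.isclose(partial, cond_block[0, 1])
assert np.isclose(partial, 0.5 - 0.4 * 0.6)   # = 0.26
```

So the partial covariance is a deterministic function of Σ, which is why a model for Σ is simultaneously a model for every partial (co)variance.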
However, if this model fits: A → B → C

Then so do these: A ← B ← C and A ← B → C

Because these models imply the same conditional independence relationships and are therefore equivalent.
Equivalent Models

Two models with the same (observed/latent) variables are equivalent if:

The models imply exactly the same conditional independence relationships
The models fit exactly equally well on all datasets
The models have the same number of degrees of freedom

Equivalent models cannot be distinguished in statistical ways. All identified saturated models are equivalent! Adding more latent variables can lead to an infinite number of equivalent models.
Which two of these models are equivalent?
library("lavaan")

ModA <- '
F =~ y1 + y2 + y3 + y4
y3 ~~ y4
'
fitA <- sem(ModA, Data, sample.nobs = 500)

ModB <- '
F1 =~ y1 + y2
F2 =~ y3 + y4
F1 ~~ F2
'
fitB <- sem(ModB, Data, sample.nobs = 200)

ModC <- '
F =~ y1 + y2 + y3 + y4
y2 ~~ y3
'
fitC <- sem(ModC, Data, sample.nobs = 200)
lavInspect(fitA, "sigma")
##       y1    y2    y3    y4
## y1 1.904
## y2 0.993 2.063
## y3 0.915 0.927 2.078
## y4 0.933 0.946 0.876 1.802

lavInspect(fitB, "sigma")
##       y1    y2    y3    y4
## y1 1.904
## y2 0.993 2.063
## y3 0.915 0.927 2.078
## y4 0.933 0.946 0.876 1.802

Models A and B produce the same model-implied covariance matrix.
lavInspect(fitA, "sigma")
##       y1    y2    y3    y4
## y1 1.904
## y2 0.993 2.063
## y3 0.915 0.927 2.078
## y4 0.933 0.946 0.876 1.802

lavInspect(fitC, "sigma")
##       y1    y2    y3    y4
## y1 1.904
## y2 0.998 2.063
## y3 0.924 0.908 2.078
## y4 0.921 0.954 0.883 1.802

The model-implied covariance matrix from Model C is different.
[Two models: a collider A → B ← C with a covariance between A and C, and a one-factor model with η measured by A, B, and C.] Equivalent models or not?

Both are saturated models and thus equivalent. However, under certain specifications the collider can imply a negative partial correlation, which can only be obtained in the one-factor model using impossible negative variances!
collider <- '
B ~ 0.5*A + 0.5*C
A ~~ -0.1*C
'
Data <- simulateData(collider)

factor <- 'f =~ A + B + C'
fit <- sem(factor, Data)
## Warning in lav_object_post_check(object): lavaan WARNING: some estimated
## lv variances are negative
parameterEstimates(fit)
##   lhs op rhs    est    se      z pvalue ci.lower ci.upper
## 1   f =~   A  1.000 0.000     NA     NA    1.000    1.000
## 2   f =~   B -4.390 2.273 -1.932  0.053   -8.844    0.065
## 3   f =~   C  0.837 0.140  5.961  0.000    0.562    1.112
## 4   A ~~   A  1.263 0.105 11.978  0.000    1.056    1.469
## 5   B ~~   B  3.770 1.347  2.798  0.005    1.129    6.410
## 6   C ~~   C  1.018 0.080 12.660  0.000    0.860    1.175
## 7   f ~~   f -0.120 0.060 -2.014  0.044   -0.238   -0.003
Replacement rule: let X and Y be two variables with residuals εX and εY. The effect X → Y may be replaced with εX ↔ εY (and vice versa) if:

The predictors (causes) of Y are the same as, or include, those of X
Neither X nor Y causes any predictor of X or Y

In a fully connected (saturated) exogenous block (no incoming effects), any relation may be changed in direction or changed to a residual covariance.
X 3 and X 4 have the same predictors
Saturated exogenous block
All equivalent models: [five path diagrams on A, B, and C, related to one another by the replacement rule]
What if we have no theory? Could we exploratively find this SEM model (a DAG)?
Equivalent Models
Causal models imply a set of conditional independence relationships that can be tested
SEM is a powerful technique to test such a causal model in one step
However, many equivalent models can fit the data equally well
Be careful in explorative model modification!
Be skeptical when interpreting SEM models (e.g., near-saturated models do not prove a causal theory)
The poor identification of directed graphical models led recent researchers (e.g., me) to use undirected graphical models instead: A — B — C indicates A ⊥ C | B without troublesome causal interpretation and equivalent models
More on this next week!