Evaluating sensitivity of parameters of interest to measurement invariance using the EPC-interest
Department of Methodology and Statistics, Tilburg University
Working Group Structural Equation Modeling, 26-27.02.2015, FU Berlin
Conclusion
This talk discusses the "EPC-interest".
EPC-interest is like SEM's expected parameter change ("EPC-self"), but instead of measuring the change in the restricted parameter, it measures the change in the parameter of interest;
Introduced for measurement invariance evaluation in continuous data SEM by Oberski (2014) and for categorical data by Oberski et al. (forthcoming);
Implemented in Latent GOLD 5.0, experimental version of, working on main branch.
The problem of measurement invariance
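Measurement invariance holds:
\[ p(\mathbf{y}) = \sum_{\xi} p(\xi \mid x) \prod_{j=1}^{J} p(y_j \mid \xi) \]
Measurement invariance is violated:
\[ p(\mathbf{y}) = \sum_{\xi} p(\xi \mid x) \prod_{j=1}^{J} p(y_j \mid \xi, x) \]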
Measurement invariance: the problem
Problem of measurement invariance: we want to know γ, but δ ≠ 0 might bias this parameter of interest.
Preceding solutions to the problem
1. Selecting one indicator as a reference indicator ("anchor item") a priori;
   + Don't need further restrictions;
   - No way of testing reference indicator.
   E.g.: Setting loadings to 1 in each group, "alignment method".
2. Imposing a strong prior on differential functioning (Muthén & Asparouhov, 2012)
   + Attractive data-driven solution when prior is neither too informative nor too weak;
   - More research needed to figure out when prior is neither too informative nor too weak.
3. Test null hypothesis of full or partial invariance.
   + When a high-powered test of full measurement invariance is not rejected, it may be safe to simply continue without the need for further modeling;
   - Rarely happens in practice.
   + Partial invariance looks at size and significance of δ, but not all big δ's are important, nor are small ones necessarily unimportant. So it does not guarantee that the parameter of interest is free of measurement differences (Oberski, 2014).
4. Sensitivity analysis: allow partial violations when they matter.
Measurement invariance: the problem and a solution
Problem of measurement invariance: we want to know γ, but δ ≠ 0 might bias this parameter of interest.
Solution: Use EPC-interest: the expected change in γ when freeing δ.
If EPC-interest is big (e.g. can change sign), incorporate δ; if EPC-interest is small, ignore it.
EPC-interest
EPC-interest, also for categorical data
\[ \text{EPC-interest} = P \left( \frac{\partial \theta}{\partial \delta'} \right) (\hat{\delta} - \delta) = -P \hat{H}_{\theta\theta}^{-1} \hat{H}_{\theta\delta} \, \text{EPC-self} = \hat{\gamma}_a - \hat{\gamma} + O(\psi'\psi), \]
where P selects the parameters of interest γ from the parameter vector θ, H is a Hessian, and O(ψ'ψ) is an approximation term depending on the overall amount of misspecification (parameter differences).
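To make the matrix expression concrete, here is a minimal Python sketch on a toy quadratic objective, where the approximation term vanishes and EPC-interest equals the actual change exactly. The matrix H, the vector b, and all variable names are made up for illustration; they are not taken from the talk or from any fitted model.

```python
import numpy as np

# Hypothetical toy objective f(v) = 0.5 * v' H v - b' v with v = (theta_1, theta_2, delta).
# H plays the role of the Hessian; theta_1 is the parameter of interest (gamma).
H = np.array([[2.0, 0.3, 0.8],
              [0.3, 1.5, 0.2],
              [0.8, 0.2, 1.0]])
b = np.array([1.0, 0.5, 0.7])

free, fixed = [0, 1], [2]            # delta is restricted to 0 in the fitted model
H_tt = H[np.ix_(free, free)]         # H_{theta theta}
H_td = H[np.ix_(free, fixed)]        # H_{theta delta}
H_dd = H[np.ix_(fixed, fixed)]       # H_{delta delta}
P = np.array([[1.0, 0.0]])           # selects gamma = theta_1 from theta

# Restricted estimate (delta = 0) and the score with respect to delta at that point.
theta_restricted = np.linalg.solve(H_tt, b[free])
score_delta = H_td.T @ theta_restricted - b[fixed]

# EPC-self: expected change in delta itself when the restriction is freed.
schur = H_dd - H_td.T @ np.linalg.solve(H_tt, H_td)
epc_self = -np.linalg.solve(schur, score_delta)

# EPC-interest = -P H_tt^{-1} H_td EPC-self
epc_interest = -P @ np.linalg.solve(H_tt, H_td) @ epc_self

# Compare with the actual change in gamma after freeing delta.
full = np.linalg.solve(H, b)
actual_change = full[0] - theta_restricted[0]
print(epc_interest[0], actual_change)
```

In a real likelihood the objective is not quadratic, so the two numbers differ by the O(ψ'ψ) term.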
Simulation: how good is the approximation?
Setup: $P(Y_j = 1 \mid x) = [1 + \exp(-x)]^{-1}$, with $j \in \{2, 3, 4\}$, and structural model $x = \gamma z + \epsilon$ with $\gamma = 1$ and $\epsilon \sim N(0, 1)$.
We then introduced a violation of measurement invariance for the first indicator, $P(Y_1 = 1 \mid x, z) = [1 + \exp(-x - \delta z)]^{-1}$.
Nine conditions varied sample size, $n \in \{250, 500, 1000\}$, and the size of the invariance violation: δ = 0 (no violation), 0.5 (moderate), or 1 (extreme).
Data were generated using R 3.1.2 and analyzed using Latent GOLD 5.0.0.14161.
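The data-generating process described above can be sketched in Python as follows. This is only an illustration, not the R code used for the talk: the distribution of the covariate z is not stated on the slide, so a binary 0/1 covariate is assumed here, and the function names `inv_logit` and `simulate` are hypothetical.

```python
import numpy as np

def inv_logit(eta):
    """Logistic response function [1 + exp(-eta)]^(-1)."""
    return 1.0 / (1.0 + np.exp(-eta))

def simulate(n, delta, gamma=1.0, seed=0):
    """Draw one data set with four binary indicators of a latent x."""
    rng = np.random.default_rng(seed)
    z = rng.integers(0, 2, size=n).astype(float)   # ASSUMPTION: binary covariate
    x = gamma * z + rng.normal(size=n)             # structural model x = gamma*z + eps
    u = rng.random((n, 4))
    # Indicator 1 violates invariance through delta*z; indicators 2-4 are invariant.
    eta = np.column_stack([x + delta * z, x, x, x])
    y = (u < inv_logit(eta)).astype(int)
    return z, x, y

# The nine conditions: three sample sizes crossed with three violation sizes.
conditions = [(n, d) for n in (250, 500, 1000) for d in (0.0, 0.5, 1.0)]
z, x, y = simulate(n=500, delta=1.0)
print(len(conditions), y.shape, y.mean(axis=0))
```

Each simulated data set would then be analyzed under full measurement invariance (δ fixed to 0) to obtain the γ estimate and the EPC-interest.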
n           250      250      250      500      500      500      1000
True δ      0        0.5      1        0        0.5      1        0
Est. γ̂      1.010    1.151    1.353    0.980    1.152    1.330    1.013
Bias γ̂     -0.010   -0.151   -0.353    0.020   -0.152   -0.330   -0.013
EPC-int.    0.003   -0.166   -0.494   -0.001   -0.180   -0.486    0.004

Table: Simulation study of EPC-interest. Shown is the average point estimate for the γ parameter of interest under full measurement invariance ("Est."), its difference from the true value γ = 1 ("Bias"), and the average EPC-interest.
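One way to read the table is to compare EPC-interest with the bias column, since both describe how far the estimate under full invariance is from where it should be. The short Python sketch below does this arithmetic on the seven columns that are shown; the variable names are illustrative, and the seventh column is taken to be the n = 1000, δ = 0 condition.

```python
import numpy as np

# Values copied from the simulation table (seven columns shown).
true_delta = np.array([0.0, 0.5, 1.0, 0.0, 0.5, 1.0, 0.0])
est = np.array([1.010, 1.151, 1.353, 0.980, 1.152, 1.330, 1.013])
bias = np.array([-0.010, -0.151, -0.353, 0.020, -0.152, -0.330, -0.013])
epc_interest = np.array([0.003, -0.166, -0.494, -0.001, -0.180, -0.486, 0.004])

# Estimate predicted by EPC-interest after freeing delta (ideal value: gamma = 1).
predicted = est + epc_interest

# Gap between EPC-interest and the bias it is meant to approximate.
gap = epc_interest - bias

for d, p, g in zip(true_delta, predicted, gap):
    print(f"delta={d:3.1f}  predicted gamma={p:6.3f}  EPC-int. minus bias={g:7.3f}")
```

In these numbers the gap is small for δ = 0 and δ = 0.5 and largest for the extreme violation δ = 1, which fits the description of EPC-interest as an approximation that depends on the overall amount of misspecification.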
Example with categorical data See
Figure: Graphical representation of the multilevel latent class regression model for (post)materialism measured by three partial ranking tasks. Observed variables are shown in rectangles while unobserved ("latent") variables are shown in ellipses.
Multilevel latent class model w/ covariates for rankings
\[ L(\theta) = P(A_1, A_2, B_1, B_2, C_1, C_2 \mid Z_1, Z_2) = \prod_{c=1}^{C} \sum_{G} P(G_c) \prod_{i=1}^{n_c} \sum_{X} P(X_{ic} \mid Z_{1ic}, Z_{2ic}, G_c) \, P(A_{1ic}, A_{2ic} \mid X_{ic}) \, P(B_{1ic}, B_{2ic} \mid X_{ic}) \, P(C_{1ic}, C_{2ic} \mid X_{ic}), \]
Goal: estimate γ (especially its sign).
Possible problem: Violations of scalar and metric measurement invariance (DIF), parameterized respectively as τ and λ.
Solution: See if these matter for the sign of γ.
Table: Full invariance multilevel latent class model: EPC-interest values.

                 Estimates          EPC-interest (λ_jkxg)
                 Est.     s.e.      Task 1   Task 2   Task 3
Class 1 GDP     -0.035   (0.007)    0.073    0.252    0.005
Class 2 GDP     -0.198   (0.012)   -0.163   -0.058    0.002
Class 1 Women    0.013   (0.001)   -0.003    0.029    0.002
Class 2 Women   -0.037   (0.001)   -0.006   -0.013    0.002

Free "loadings" for task 1 and task 2.
Table: Partially invariant multilevel latent class model: EPC-interest values.

                 Estimates          EPC-interest (λ_jkxg)
                 Est.     s.e.      Task 1   Task 2   Task 3
Class 1 GDP     -0.127   (0.008)                      0.097
Class 2 GDP      0.057   (0.011)                      0.161
Class 1 Women    0.008   (0.001)                      0.001
Class 2 Women    0.020   (0.001)                      0.007
What has been gained by using EPC-interest:
I am fairly confident here that there truly is "approximate measurement invariance", in the sense that any violations of measurement invariance do not bias the primary conclusions. I think attaining this goal is the main purpose of model fit evaluation.
Conclusion
What is the problem?
We do latent variable modeling with a goal in mind. But the latent variable model might be misspecified.
The appropriate question: "will that affect my goal?"
The actual question: "do the data fit the model in the population" (LR) or "are the model and the data far apart relative to model complexity" (RMSEA etc.)
What is the solution?
Evaluate directly what effect possible misspecifications have on the goal of the analysis.
References
Oberski, D. (2014). Evaluating sensitivity of parameters of interest to measurement invariance in latent variable models. Political Analysis, 22(1):45-60.
Oberski, D. and Vermunt, J. (2013). A model-based approach to goodness-of-fit evaluation in item response theory. Measurement: Interdisciplinary Research & Perspectives, 11:117-122.
Oberski, D., Vermunt, J., and Moors, G. (forthcoming). Evaluating measurement invariance in categorical data latent variable models with the EPC-interest.
Vermunt, J. K. and Magidson, J. (2013). Technical guide for Latent GOLD 5.0: Basic and advanced. Statistical Innovations Inc., Belmont, MA.
SEM regression coefficient example
Two problems with invariance testing

                                  Misspecified invariance model fit
Conclusions                       "Good" fit                                    "Bad" fit
Unaffected by misspecification    (1)                                           (2) Overparameterization or unnecessarily discarded item, group, or scale.
Affected by misspecification      (3) Non-invariance invalidates conclusions.   (4)
SEM regression coefficient example
European Sociological Review 2008, 24(5), 583-599
SEM regression coefficient example
[Figure: regression coefficients of Conservation and Self-transcendence by country (Poland, Czech Republic, Greece, United Kingdom, Slovenia, Belgium, France, Portugal, Finland, Hungary, Norway, Spain, Ireland, Germany, Netherlands, Switzerland, Austria, Denmark, Sweden); horizontal axis: regression coefficient; labels: ALLOW, NOCOND.]
SEM regression coefficient example
EPC-interest statistics of at least 0.1 in absolute value with respect to the latent variable regression coefficients.

Metric invariance (loading) restriction "Conditions Work skills" in...
                          Slovenia   France   Hungary   Ireland
EPC-interest w.r.t.:
Conditions
  Self-transcendence       -0.073    -0.092    -0.067     0.073
  Conservation              0.144     0.139     0.123    -0.113
SEPC-self                   0.610     0.692     0.759    -0.514
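The 0.1 cutoff used for this table can be applied mechanically. The following Python sketch flags the entries whose absolute EPC-interest reaches the threshold, using the values shown above; the dictionary layout and names are illustrative only.

```python
THRESHOLD = 0.1  # cutoff stated on the slide

# EPC-interest values copied from the table, by regression coefficient and country.
epc_interest = {
    "Self-transcendence": {"Slovenia": -0.073, "France": -0.092,
                           "Hungary": -0.067, "Ireland": 0.073},
    "Conservation": {"Slovenia": 0.144, "France": 0.139,
                     "Hungary": 0.123, "Ireland": -0.113},
}

# Keep the countries whose absolute EPC-interest is at least the threshold.
flagged = {
    coefficient: [country for country, value in by_country.items()
                  if abs(value) >= THRESHOLD]
    for coefficient, by_country in epc_interest.items()
}
print(flagged)
```

With these numbers, all four countries are flagged for the Conservation coefficient and none for Self-transcendence.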
SEM regression coefficient example
What has been gained by using EPC-interest
Full metric invariance model: "close fit";
EPC-interest still detects threats to cross-country comparisons of regression coefficients;
MI and EPC-self do not detect these particular misspecifications;
MI and EPC-self detect other misspecifications;
Looking at EPC-interest reveals that these do not affect the cross-country comparisons of regression coefficients.
Example
Goal: Estimate gender differences in "valuing Stimulation":
(1) Very much like me; (2) Like me; (3) Somewhat like me; (4) A little like me; (5) Not like me; (6) Not like me at all.
(S)he looks for adventures and likes to take risks. (S)he wants to have an exciting life.
(S)he likes surprises and is always looking for new things to do. (S)he thinks it is important to do lots of different things in life.
Tool: Structural Equation Model for European Social Survey data (n = 18519 men and 16740 women).
(Original study by Schwarz et al. 2005)
Assume: [path diagram]   But really(?): [path diagram]
What difference does it make for the goal: true gender differences in values? (re-analysis of data by Oberski 2014)
[Figure: latent mean difference estimate ± 2 s.e. for each "human value" factor (AC, PO, ST, SD, HE, CO, TR, SE, UN, BE), ranging from "men value more" to "women value more", under two models: scalar invariance and free intercept 'Adventure'.]
PROBLEM
The original authors found that the conditional independence model fit the data "approximately" (p. 1013)...
"Chi-square deteriorated significantly, χ²(19) = 3313, p < .001, but CFI did not change. Change in chi-square is highly sensitive with large sample sizes and complex models. The other indices suggested that scalar invariance might be accepted (CFI = .88, RMSEA = .04, CI = .039-.040, PCLOSE = 1.0)."
... but unfortunately this "acceptable" misspecification could reverse their conclusions!
EPC-interest applied to Stimulation example
After fitting the full scalar invariance model,
Effect size estimate of sex difference in Stimulation is +0.214 (s.e. 0.0139).
But EPC-interest of equal "Adventure" item intercept is -0.243.
So EPC-interest suggests the conclusion can be reversed by freeing a misspecified scalar invariance restriction.
Actual change when freeing this intercept is very close to EPC-interest: -0.235.
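The sign reversal follows from simple arithmetic on the numbers above, which the following Python sketch spells out. The helper name `sign_flips` is hypothetical; the three input values are those reported on the slide.

```python
estimate = 0.214        # sex difference in Stimulation under full scalar invariance
epc_interest = -0.243   # EPC-interest of the equal "Adventure" intercept restriction
actual_change = -0.235  # change observed after actually freeing that intercept

def sign_flips(before, after):
    """True when the estimate changes sign between the two models."""
    return before * after < 0

predicted = estimate + epc_interest   # estimate predicted by EPC-interest
refitted = estimate + actual_change   # estimate implied by the actual change

print(f"predicted: {predicted:+.3f}, flips sign: {sign_flips(estimate, predicted)}")
print(f"refitted:  {refitted:+.3f}, flips sign: {sign_flips(estimate, refitted)}")
print(f"approximation error: {epc_interest - actual_change:+.3f}")
```

Both the predicted and the refitted estimate fall just below zero, so the direction of the gender difference depends on whether this single intercept restriction is imposed.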