Unbelievably fast estimation of multilevel structural equation models


1 Intro Relational Simulation Appendix References Unbelievably fast estimation of multilevel structural equation models Joshua N. Pritikin Department of Psychology University of Virginia Spring 2016 Joshua N. Pritikin University of Virginia 1/58

2 Abstract
The challenge of quick optimization of multilevel structural equation models (SEM) will be introduced. To provide context, multilevel SEM will be compared with the mixed model. Rampart, a novel method to simplify the multilevel SEM likelihood, will be introduced. It is inspired by the fact that the multivariate normal density is transparent to orthogonal rotation. Assumptions and limitations of Rampart will be discussed.

3 Acknowledgment
This research was aided by
Timo von Oertzen & Steve Boker (University of Virginia)
Tim Brick (Pennsylvania State University)
OpenMx development team

4 Structural equation models

5 Multilevel structural equation models

6 A hypothetical example

7 Student

8 Teacher

9 School

10 Estimation time?
How long will this model take to estimate?

11 Roadmap
What is hard about multilevel?
mixed model + SEM = Relational SEM
Rampart (a novel method that favorably transforms the problem)

12 Direct sum
$$B_1 \oplus B_2 = \begin{pmatrix} B_1 & 0 \\ 0 & B_2 \end{pmatrix}, \qquad \bigoplus_{i=1}^{k} B_i = \begin{pmatrix} B_1 & & 0 \\ & \ddots & \\ 0 & & B_k \end{pmatrix}$$
$\oplus$ stacks matrices in a block-diagonal arrangement.
Here we assume a nested multilevel structure.
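The block-diagonal stacking can be sketched in a few lines of base R. The helper name `direct_sum` and the example blocks are hypothetical and chosen only for illustration; packages such as Matrix offer sparse equivalents.

```r
# Direct sum: place each block on the diagonal and leave zeros elsewhere.
# direct_sum is a hypothetical helper, not part of OpenMx or lme4.
direct_sum <- function(...) {
  blocks <- list(...)
  out <- matrix(0, sum(sapply(blocks, nrow)), sum(sapply(blocks, ncol)))
  r <- 0
  cc <- 0
  for (b in blocks) {
    out[r + seq_len(nrow(b)), cc + seq_len(ncol(b))] <- b
    r <- r + nrow(b)
    cc <- cc + ncol(b)
  }
  out
}

B1 <- matrix(c(2, 1, 1, 2), 2, 2)   # made-up 2 x 2 block
B2 <- matrix(5, 1, 1)               # made-up 1 x 1 block
B <- direct_sum(B1, B2, B1)         # 5 x 5 block-diagonal matrix
```

Because units in different blocks share no covariance, every entry outside the diagonal blocks stays zero, which is what makes the nested structure sparse.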

13 Covariance
Suppose we build a covariance model $S$ for a particular student. A classroom of $s$ students will have covariance matrix
$$T = \begin{pmatrix} T_{1,1} & T_{1,2} \\ T_{2,1} & \bigoplus_{i=1}^{s} S_i \end{pmatrix}$$

14 Sparseness pattern

15 Bottleneck in model evaluation
Covariance matrix can become very large
Inverting the covariance is expensive, $O(N^3)$
Sparse matrix operations help, but not enough
Conclusion: impractical without cleverness

16 Structural equation model
What is a SEM?
It's similar to regression. But how?

17 Structural equation model, 1st moment
Regression is $y = X\beta + e$
SEM is $y = X\beta + e$
For $y$ observations, $X$ covariates/predictors, $\beta$ constant coefficients, and $e$ residuals

18 Structural equation model, 2nd moment
Regression: $y = X\beta + e$ with $e \sim N(\cdot, \sigma^2 I)$
SEM: $y = X\beta + e$ with $(X, e) \sim N(\cdot, \Sigma(\theta))$
For $y$ observations, $X$ covariates/predictors, $\beta$ constant coefficients, $e$ residuals, variance $\sigma^2$, parameters $\theta$, and covariance $\Sigma$

19 Mixed model
lme4 (Bates, Mächler, Bolker, & Walker, 2015)
Multilevel or crossed regression
Fast
How does it work?

20 Mixed model, 1st moment
$$Y = \underbrace{X\beta}_{\text{constant}} + \underbrace{Zu}_{\text{varying}} + e$$
column vector of observations $Y$
covariates $X$ associated with constant coefficients $\beta$
covariates $Z$ associated with varying coefficients $u$
column vector of residuals $e$
$$E\begin{pmatrix} u \\ e \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix}$$

21 Mixed model, 2nd moment
$$Y = \underbrace{X\beta}_{\text{constant}} + \underbrace{Zu}_{\text{varying}} + e$$
$$\operatorname{Cov}\begin{pmatrix} u \\ e \end{pmatrix} = \begin{pmatrix} G & 0 \\ 0 & R \end{pmatrix}$$

22 Mixed model, unconditional distribution
$Y = X\beta + e$ where $e \sim N(0, ZGZ^T + R)$
The formulation I showed earlier is actually conditional on a particular realization of the varying coefficients $u$.
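To make the marginal covariance $ZGZ^T + R$ concrete, here is a minimal base-R sketch for a single varying intercept. The design (two classrooms of three students) and the variance values are made up for illustration and do not come from the talk.

```r
# Hypothetical design: 2 classrooms with 3 students each,
# one varying intercept per classroom.
group <- factor(rep(c("a", "b"), each = 3))
Z <- model.matrix(~ 0 + group)   # 6 x 2 indicator matrix for the varying intercepts
G <- diag(0.5, 2)                # covariance of the varying coefficients u (assumed)
R <- diag(1.0, 6)                # covariance of the residuals e (assumed)

# Marginal covariance of Y after integrating out u.
V <- Z %*% G %*% t(Z) + R
```

Students in the same classroom share covariance 0.5, while students in different classrooms have covariance 0, so `V` is block diagonal with one block per classroom.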

23 Mixed model, example
lmer(gpa ~ 1 + (1 | school) + (1 | school:teacher), ...)
Intercept-only model with 3 levels
The expression after the vertical bar indicates the partitioning design (like conditional probability)
4 free parameters are estimated: the grand intercept; 2 variances, one for each varying coefficient; and the residual variance
Efficient

24 How does this look in SEM?

25 What is ⋈?
Let $R$ and $S$ be tables (or data frames) that contain rows.
$$R \bowtie_{(F)} S = \{\, r \cup s \mid r \in R \wedge s \in S \wedge F(r \cup s) \,\}$$
where $F$ is a boolean valued function. Without loss of generality, here $F$ is whether primary and foreign keys match. We will omit $F$ and write $\bowtie_{(k)}$ where $k$ is the name of the key.

26 What is ⋈? Example

Employee
Employee  Dept
Harry     Sales
Sally     Finance
George    Finance
Harriet   Sales

Manager
Dept        Manager
Sales       George
Finance     Harriet
Production  Charles

Employee ⋈(Dept) Manager
Employee  Dept     Manager
Harry     Sales    George
Sally     Finance  Harriet
George    Finance  Harriet
Harriet   Sales    George
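In R, the same kind of key-matching join is available through base `merge`. The sketch below uses the two input tables as listed on the slide; the data frame names `employee` and `manager` are chosen here for illustration.

```r
employee <- data.frame(
  Employee = c("Harry", "Sally", "George", "Harriet"),
  Dept     = c("Sales", "Finance", "Finance", "Sales"))
manager <- data.frame(
  Dept    = c("Sales", "Finance", "Production"),
  Manager = c("George", "Harriet", "Charles"))

# Inner join on the key Dept: keep every pairing of rows whose keys match.
joined <- merge(employee, manager, by = "Dept")

# Swapping the operands gives the same rows (up to row and column order).
joined_swapped <- merge(manager, employee, by = "Dept")
```

Production has no matching employee, so Charles drops out of the result. That the two calls yield the same set of rows illustrates that the join commutes.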

27 Which is easier to understand?

28 A relational schema

29 Autoregressive tree (pedigree)

30 Summary: ⋈ vs conditional probability
Data is joined (⋈)
Conditional probability is an ingredient in multilevel models, not data
Join commutes, conditional probability doesn't

31 Mixed model
Limitations:
missing data: row-wise deletion
only the lowest level unit has observations
multivariate (more than one outcome) is very awkward

32 Sufficient statistic approach
Suppose we have data of $N$ independent observations of $K$-variate units. Let $\mu$ and $\Sigma$ be the model expected mean vector and covariance matrix, respectively. Let $m$ and $S$ be the mean vector and covariance matrix of the data, respectively.
$$-2 \log L(\text{data} \mid \theta) = N\left(K \log(2\pi) + \log|\Sigma| + \operatorname{tr}(\Sigma^{-1} S) + \mu^T \Sigma^{-1} (\mu - 2m)\right)$$
This form is exact when $S$ is the raw second-moment matrix $\frac{1}{N}\sum_i x_i x_i^T$; with a mean-centered $S$ the last term becomes $(m - \mu)^T \Sigma^{-1} (m - \mu)$.
Maximum covariance dimension is $K$.
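The sufficient-statistic form can be checked numerically against row-by-row evaluation. This is a base-R sketch on simulated data with made-up values for $\mu$ and $\Sigma$, and it assumes that $S$ is the raw second-moment matrix $X^T X / N$, under which the formula above holds exactly.

```r
set.seed(1)
N <- 50
K <- 3
X <- matrix(rnorm(N * K), N, K)   # hypothetical data, one row per unit
mu <- c(0.1, -0.2, 0.3)           # model-implied mean (made up)
Sigma <- diag(K) + 0.3            # model-implied covariance (made up)
Sinv <- solve(Sigma)

# Row-wise evaluation: sum -2 log density over all N rows.
dev_rows <- sum(apply(X, 1, function(x) {
  d <- x - mu
  K * log(2 * pi) + log(det(Sigma)) + drop(t(d) %*% Sinv %*% d)
}))

# Sufficient-statistic evaluation: only m and S are needed.
m <- colMeans(X)
S <- crossprod(X) / N             # raw second moment (assumption, see lead-in)
dev_suff <- N * (K * log(2 * pi) + log(det(Sigma)) +
                 sum(diag(Sinv %*% S)) +
                 drop(t(mu) %*% Sinv %*% (mu - 2 * m)))

agree <- isTRUE(all.equal(dev_rows, dev_suff))
```

Both routes give the same deviance, but the second touches only a $K \times K$ matrix regardless of $N$.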

33 Sparseness pattern

34 Uncompressed likelihood
Suppose we have $N$ observations consisting of data vector $x$. Let $\mu$ and $\Sigma$ be the model expected mean vector and covariance matrix, respectively.
$$-2 \log L(\text{data} \mid \theta) = N \log(2\pi) + \log|\Sigma| + (\mu - x)^T \Sigma^{-1} (\mu - x)$$
Maximum covariance dimension is $N$.

35 Covariance becomes very large. What to do?
SEMs are specified using the RAM parameterization:
$$\mu = F (I - A)^{-1} M$$
$$\Sigma = F (I - A)^{-1} S (I - A)^{-T} F^T$$
What are $A$, $S$, $F$, and $M$ used for?
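As a small illustration of the RAM formulas, the following base-R sketch builds the matrices for a hypothetical one-factor model with three indicators. It follows the usual RAM convention (A holds one-headed paths, S two-headed paths, F filters out the latent variables, M holds means); the numeric values are made up.

```r
# Variable order: x1, x2, x3 (manifest), f (latent). All values are illustrative.
lambda <- c(1, 0.8, 0.6)            # factor loadings
A <- matrix(0, 4, 4)
A[1:3, 4] <- lambda                 # regressions of the indicators on f
S <- diag(c(0.5, 0.5, 0.5, 2))      # residual variances and factor variance
Fm <- cbind(diag(3), 0)             # filter: keep the 3 manifest variables
M <- c(1, 2, 3, 0)                  # intercepts; latent mean fixed at 0

IA <- solve(diag(4) - A)            # (I - A)^{-1}
mu <- drop(Fm %*% IA %*% M)
Sigma <- Fm %*% IA %*% S %*% t(IA) %*% t(Fm)
```

For this model the result reduces to the familiar factor-analysis covariance: factor variance times the outer product of the loadings, plus the diagonal residual variances.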

36 RAM's A matrix

37 RAM's A matrix
[Figure: A matrix with rows and columns labeled T, S1, S2, S3, S4]

38 Orthogonal or axis rotation
[Figure: plot with axes x and y]

39 Call forth the sublime orthogonal rotation
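The reason an orthogonal rotation is harmless can be written out in two lines. This is a standard property of the multivariate normal distribution rather than anything specific to the talk; $Q$ denotes any orthogonal matrix and $y$ the rotated data.

```latex
% Let x ~ N(mu, Sigma) and y = Q^T x with Q^T Q = I.
\begin{align*}
y &\sim N\!\left(Q^T \mu,\; Q^T \Sigma Q\right), \\
\left|Q^T \Sigma Q\right| &= |\Sigma|, \\
(y - Q^T \mu)^T \left(Q^T \Sigma Q\right)^{-1} (y - Q^T \mu)
  &= (x - \mu)^T \Sigma^{-1} (x - \mu).
\end{align*}
```

The determinant and the quadratic form are both unchanged, so the likelihood of the rotated data under the rotated model equals the likelihood of the original data under the original model.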

40 Rampart transformed

41 Rampart transformed
[Figure: matrix with rows and columns labeled T, S1, S2, S3, S4]

42 Rampart transformed

43 Rampart algebraically
Upper level has $K$ outgoing regressions
Lower level has $M$ incoming regressions with data $D$
Find an orthogonal matrix $Q \in \mathbb{R}^{M \times M}$ such that the lower $M - K$ rows of $Q^T A$ are zero
Define new model $A'$ as the first $K$ rows of $Q^T A$
Define new lower level dataset $D' = Q^T D$
Proceed with optimization as usual
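The steps above can be sketched in base R for the simplest case, $K = 1$: one teacher-level varying intercept loading on $M$ students. This is an illustration of the rotation idea only, not the OpenMx implementation. The variance values and data are made up, the means are taken as zero for brevity, and the sketch uses the convention in which $Q^T A$ has zero lower rows.

```r
M <- 4                               # students in one classroom (hypothetical)
a <- matrix(1, M, 1)                 # loadings of the M students on the teacher, K = 1
Q <- qr.Q(qr(a), complete = TRUE)    # M x M orthogonal matrix from a QR decomposition

rotated <- t(Q) %*% a                # only the first K rows are non-zero

tau2 <- 0.7                          # teacher-level variance (made up)
sigma2 <- 1.2                        # student-level residual variance (made up)
V <- tau2 * a %*% t(a) + sigma2 * diag(M)   # dense covariance of the classroom
Vrot <- t(Q) %*% V %*% Q                     # diagonal after rotation

# -2 log likelihood of a zero-mean multivariate normal.
neg2ll <- function(x, V) {
  length(x) * log(2 * pi) + log(det(V)) + drop(t(x) %*% solve(V) %*% x)
}

D <- c(0.3, -1.1, 0.8, 2.0)          # made-up student data
Drot <- drop(t(Q) %*% D)             # rotated lower-level dataset
same_fit <- isTRUE(all.equal(neg2ll(D, V), neg2ll(Drot, Vrot)))
```

After rotation only the first row still depends on the teacher, so the remaining $M - 1$ rows can be evaluated as independent units and the large dense covariance never needs to be inverted.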

44 Rampart geometry
Use QR decomposition to scale to an orthogonal rotation

45 Rampart, apply recursively

46 Rampart historical lineage
Who done it?
summer of 2012: idea conceived by Timo von Oertzen, Steven M. Boker, and Timothy R. Brick, inspired by von Oertzen and Hackett (submitted)
spring 2013: prototyped in OpenMx (by me)
completed predissertation on a different topic
spring 2016: optimized implementation in OpenMx (again by me)

47 Does it work?

48 Conditions
11 parameters per level, 2 between level regressions (total 35)
1st student indicator was set to missing with 20% probability
2 sets of true parameters ($\theta_1$ and $\theta_2$)
Parameter $\theta_1$: 7 schools, 38 teachers, and 293 students
Parameter $\theta_2$: 7 schools, 37 teachers, and 296 students
200 Monte Carlo replications for each condition (Algorithm $\times$ $\theta$)

49 Monte Carlo bias and variance
[Table: columns θ, replications, method (rampart, regular), bias, σ]

50 Scatterplot of deviance at the maximum likelihood
[Figure: scatterplot with axes regular and rampart, panels 1 and 2]

51 Seconds required per replication
[Figure: histograms of count by seconds for rampart and regular]

52 OpenMx is a free and open source extension to the R statistical environment. Software and support available at
Questions?

53 Appendix
Some extra slides follow

54 Terminology
Historically, coefficients that help predict all observations are called "fixed effects" whereas the other type of coefficient has been called a "random effect." This terminology is unfortunate. In the statistical literature, there are at least five definitions of these phrases, all of which differ from each other (Gelman, 2005). Moreover, in computer science, the term "random" is usually associated with draws from a uniform random number generator and is not synonymous with "stochastic," which does not presuppose a particular distribution. Here we follow Gelman (2005) and use the terms constant and varying.

55 Example, lme4
lmer(Reaction ~ Days + (Days | Subject), sleepstudy)

56 Example, OpenMx (part 1)
    library(OpenMx)
    library(lme4)  # provides the sleepstudy data

    # Upper level: one row per subject, with a varying intercept and slope
    bySubj <- mxModel(
      model="bySubj", type="RAM",
      latentVars=c("slope", "intercept"),
      mxData(data.frame(Subject=unique(sleepstudy$Subject)),
             type="raw", primaryKey = "Subject"),
      mxPath(c("intercept", "slope"), arrows=2, values=1),
      mxPath("intercept", "slope", arrows=2, values=.25, labels="cov1"))

57 Example, OpenMx (part 2)
    # Lower level: one row per observation, joined to bySubj on Subject
    ss <- mxModel(
      model="sleep", type="RAM", bySubj,
      manifestVars="Reaction", latentVars = "Days",
      mxData(sleepstudy, type="raw", sort=FALSE),
      mxPath("one", "Reaction", arrows=1, free=TRUE),
      mxPath("one", "Days", arrows=1, free=FALSE, labels="data.Days"),
      mxPath("Days", "Reaction", arrows=1, free=TRUE),
      mxPath("Reaction", arrows=2, values=1),
      mxPath(paste0('bySubj.', c('intercept','slope')),
             'Reaction', arrows=1, free=FALSE, values=c(1,NA),
             labels=c(NA, "data.Days"), joinKey="Subject"))

58 References
Bates, D., Mächler, M., Bolker, B., & Walker, S. (2015). Fitting linear mixed-effects models using lme4. Journal of Statistical Software, 67(1). doi: /jss.v067.i01
Boker, S. M., Brick, T. R., Pritikin, J. N., Wang, Y., von Oertzen, T., Brown, D., ... Neale, M. C. (2015). Maintained individual data distributed likelihood estimation. Multivariate Behavioral Research, 50(6).
Gelman, A. (2005). Analysis of variance: why it is more important than ever. The Annals of Statistics, 33(1).
Neale, M. C., Hunter, M. D., Pritikin, J. N., Zahery, M., Brick, T. R., Kirkpatrick, R., ... Boker, S. M. (in press). OpenMx 2.0: Extended structural equation and statistical modeling. Psychometrika. doi: /s
Pritikin, J. N. (2016). A computational note on the application of the Supplemented EM algorithm to item response models.
Pritikin, J. N., & Schmidt, K. M. (in press). Model builder for Item Factor Analysis with OpenMx. R Journal.
von Oertzen, T., & Hackett, D. C. (submitted). Pre-processing for efficient maximum likelihood estimation in structural equation models with fixed loadings.
