An Efficient State Space Approach to Estimate Univariate and Multivariate Multilevel Regression Models

Similar documents
Estimating Multilevel Linear Models as Structural Equation Models

Multilevel Modeling: A Second Course

Specifying Latent Curve and Other Growth Models Using Mplus. (Revised )

Multilevel Structural Equation Modeling

An Introduction to Multilevel Models. PSYC 943 (930): Fundamentals of Multivariate Modeling Lecture 25: December 7, 2012

MODEL IMPLIED INSTRUMENTAL VARIABLE ESTIMATION FOR MULTILEVEL CONFIRMATORY FACTOR ANALYSIS. Michael L. Giordano

Modeling Heterogeneity in Indirect Effects: Multilevel Structural Equation Modeling Strategies. Emily Fall

Testing and Interpreting Interaction Effects in Multilevel Models

Introduction to Within-Person Analysis and RM ANOVA

Introduction to Random Effects of Time and Model Estimation

Review of CLDP 944: Multilevel Models for Longitudinal Data

Centering Predictor and Mediator Variables in Multilevel and Time-Series Models

CHAPTER 9 EXAMPLES: MULTILEVEL MODELING WITH COMPLEX SURVEY DATA

General structural model Part 1: Covariance structure and identification. Psychology 588: Covariance structure and factor models

Model Assumptions; Predicting Heterogeneity of Variance

From Micro to Macro: Multilevel modelling with group-level outcomes

Consequences of measurement error. Psychology 588: Covariance structure and factor models

Goals for the Morning

Generalized Linear Models for Non-Normal Data

MULTILEVEL IMPUTATION 1

Computationally Efficient Estimation of Multilevel High-Dimensional Latent Variable Models

INTRODUCTION TO MULTILEVEL MODELLING FOR REPEATED MEASURES DATA. Belfast 9 th June to 10 th June, 2011

Time Metric in Latent Difference Score Models. Holly P. O Rourke

Nesting and Equivalence Testing

Latent variable interactions

Bayesian Analysis of Latent Variable Models using Mplus

Elements of Multivariate Time Series Analysis

SEM for those who think they don t care

WHAT IS STRUCTURAL EQUATION MODELING (SEM)?

Thursday Morning. Growth Modelling in Mplus. Using a set of repeated continuous measures of bodyweight

Multilevel Models in Matrix Form. Lecture 7 July 27, 2011 Advanced Multivariate Statistical Methods ICPSR Summer Session #2

State-space Model. Eduardo Rossi University of Pavia. November Rossi State-space Model Fin. Econometrics / 53

Path Analysis. PRE 906: Structural Equation Modeling Lecture #5 February 18, PRE 906, SEM: Lecture 5 - Path Analysis

Non-linear mixed models in the analysis of mediated longitudinal data with binary outcomes

State-space Model. Eduardo Rossi University of Pavia. November Rossi State-space Model Financial Econometrics / 49

Exploring Cultural Differences with Structural Equation Modelling

Testing Main Effects and Interactions in Latent Curve Analysis

How to run the RI CLPM with Mplus By Ellen Hamaker March 21, 2018

Supplemental material to accompany Preacher and Hayes (2008)

CHAPTER 3. SPECIALIZED EXTENSIONS

Structural Equation Modeling An Econometrician s Introduction

A Introduction to Matrix Algebra and the Multivariate Normal Distribution

Applied Econometrics (QEM)

Describing Within-Person Fluctuation over Time using Alternative Covariance Structures

Introduction and Background to Multilevel Analysis

7 Day 3: Time Varying Parameter Models

Structural Equation Modeling and Confirmatory Factor Analysis. Types of Variables

X t = a t + r t, (7.1)

Introduction to Structural Equation Modeling

Linear, Generalized Linear, and Mixed-Effects Models in R. Linear and Generalized Linear Models in R Topics

Review of Unconditional Multilevel Models for Longitudinal Data

A (Brief) Introduction to Crossed Random Effects Models for Repeated Measures Data

Ruth E. Mathiowetz. Chapel Hill 2010

Describing Change over Time: Adding Linear Trends

Estimation: Problems & Solutions

Describing Within-Person Change over Time

Latent Variable Centering of Predictors and Mediators in Multilevel and Time-Series Models

Latent Variable Models for Binary Data. Suppose that for a given vector of explanatory variables x, the latent

Online Appendices for: Modeling Latent Growth With Multiple Indicators: A Comparison of Three Approaches

Multivariate ARMA Processes

Richard N. Jones, Sc.D. HSPH Kresge G2 October 5, 2011

Dynamic Structural Equation Modeling of Intensive Longitudinal Data Using Mplus Version 8

Review of Multilevel Models for Longitudinal Data

Modeling growth in dyads at the group level

Timevarying VARs. Wouter J. Den Haan London School of Economics. c Wouter J. Den Haan

What is in the Book: Outline

Estimation: Problems & Solutions

Introduction to Matrix Algebra and the Multivariate Normal Distribution

FIT INDEX SENSITIVITY IN MULTILEVEL STRUCTURAL EQUATION MODELING. Aaron Boulton

Mixed-effects Polynomial Regression Models chapter 5

PACKAGE LMest FOR LATENT MARKOV ANALYSIS

Multilevel modelling of fish abundance in streams

Modelling heterogeneous variance-covariance components in two-level multilevel models with application to school effects educational research

1 Teaching notes on structural VARs.

The Impact of Model Misspecification in Clustered and Continuous Growth Modeling

A note on structured means analysis for a single group. André Beauducel 1. October 3 rd, 2015

Time Series Analysis. James D. Hamilton PRINCETON UNIVERSITY PRESS PRINCETON, NEW JERSEY

SRMR in Mplus. Tihomir Asparouhov and Bengt Muthén. May 2, 2018

VAR Model. (k-variate) VAR(p) model (in the Reduced Form): Y t-2. Y t-1 = A + B 1. Y t + B 2. Y t-p. + ε t. + + B p. where:

Analysis of Incomplete Non-Normal Longitudinal Lipid Data

Part I State space models

A Non-parametric bootstrap for multilevel models

Outline. Overview of Issues. Spatial Regression. Luc Anselin

An Introduction to Path Analysis

Weighted Least Squares

Biostatistics Workshop Longitudinal Data Analysis. Session 4 GARRETT FITZMAURICE

Models for Clustered Data

Models for Clustered Data

IMPACT OF NOT FULLY ADDRESSING CROSS-CLASSIFIED MULTILEVEL STRUCTURE IN TESTING MEASUREMENT INVARIANCE AND CONDUCTING MULTILEVEL MIXTURE MODELING

Moderation 調節 = 交互作用

Factor Analysis & Structural Equation Models. CS185 Human Computer Interaction

The fixed- and random-effects parts of the mixed-model are specified in the MODEL and RANDOM

Covariance Models (*) X i : (n i p) design matrix for fixed effects β : (p 1) regression coefficient for fixed effects

Dyadic Data Analysis. Richard Gonzalez University of Michigan. September 9, 2010

Multi-level Models: Idea

Time-Invariant Predictors in Longitudinal Models

TAKEHOME FINAL EXAM e iω e 2iω e iω e 2iω

Comparing Change Scores with Lagged Dependent Variables in Models of the Effects of Parents Actions to Modify Children's Problem Behavior

Chapter 1 Introduction. What are longitudinal and panel data? Benefits and drawbacks of longitudinal data Longitudinal data models Historical notes

An Introduction to Mplus and Path Analysis

Transcription:

An Efficient State Space Approach to Estimate Univariate and Multivariate Multilevel Regression Models Fei Gu Kristopher J. Preacher Wei Wu 05/21/2013

Overview Introduction: estimate MLM as SEM (Bauer, 2003; Curran, 2003; Mehta & Neale, 2005) Multiple-subject state space model (SSM) Examples of the SSM approach to 1. Univariate MLM: random intercept and random slope model 2. Univariate MLM: random intercept longitudinal model 3. Multivariate MLM: random intercepts model Summary, Discussions, and Conclusions

Introduction The SEM approach to MLMs: transforming data set structure (long format to wide format; e.g.,bauer, 2003; Curran, 2003) Equivalence between the SEM and the SSM can give rise to the state space approach conforming the SEM approach Limitations: computationally inefficient; data management nightmare (Curran, 2003, p. 565)

Introduction The state space approach in the time series literature: Icaza & Jones, 1999; Jones, 1993 Mentioned in educational and psychological literature briefly by Ho, Shumway, & Ombao, 2006, pp. 153-154

State space formulation y x, ~ MVN (0, ), i i i i i x, ~ MVN (0, ). i i1 i i i y i : p-variate data vector η i : q-variate latent state vector x i : g-variate vector containing exogenous variables ε i : p 1 vector containing measurement errors ζ i : q 1 vector containing latent state noises subscript, i, (i = 1, 2,, T) denotes the time point

Parameter estimation Kalman filter (Kalman, 1960) x P i i1 i1 i1 i P i i1 i1 i1 e y y y ( x ) D i i i i1 i i i1 i i P i i1 K P D 1 i i i1 i K e P D 1 i i i i1 i i i i1 i i1 i i P P K D K P P D P. 1 i i i i1 i i i i i1 i i1 i i i1 Prediction Error Decomposition (PED; Schweppe, 1965) function: 1 T 1 PED plog(2 ) log Di e id i ei. 2 i1 e

Multiple-subject SSM A second subscript, j, (j = 1, 2,, J) indexing different units is added to both the measurement and transition equations: y x, ~ MVN (0, ), j j j j x, ~ MVN (0, ). j j i1, j j j Total PED function: 1 J T 1 total PED pj log(2 ) log D e D e. 2 j1 i1

Example 1: univariate MLM (random intercept and random slope model) Level-1: Level-2: Combined equation: y 2 ~ 0, N 0j 1 jx, w u 0 j 00 01 1 j 0 j, 1 j 10 11w1 j u1 j u0 j 0 00 ~ MVN, u 0 1 j 10 11 y w x w x u u x 00 01 1 j 10 11 1 j 0 j 1 j.

Matrix form of the MLM Write the combined equation in Laird & Ware (1982) form: 00 01 u0 j y 1 w1j x w1j x 1 x. u 10 1 j 11 Rearrange the terms to match the measurement equation of the SSM: 1 u 0 j w 1 1 j y x 00 01 10 11. u 1 j x w1 jx

Transition equation and the state space form for the MLM Within a cluster/group, we have the transition equation: u0j 1 0u0j. u 0 1 u 1j 1j The state space form is obtained by define 1 u 0 j w 1j y y,, x,, 0, u 1 j x w1 jx x 2 j 0, j 1, j 00 01 10 11, j, 0 1 0 0 0 0 0 0 j, j, j 0 0 1 0 0 0 0, and j. 0 0

Initializing the KF To capture the random effects from the level-2 units: 0 00 0 0, j and P0 0, j 0 10 11 Ready to estimate the MLM using a simulated data set: 50 level-2 units; unbalanced design with 12 30 level-1 units in each cluster (Poisson(20))

Example 1 results True MLwiN PROC MIXED PROC IML Parameter value Point SE Point SE Point SE γ 00 10 10.144.704 10.146.704 10.146.704 γ 01 8 7.679.866 7.679.866 7.679.866 γ 10 5 5.224.611 5.226.611 5.226.611 γ 11 4 3.675.752 3.674.752 3.674.752 2 1 1.063.049 1.063.049 1.063.049 τ 00 10 8.182 1.689 8.187 1.678 8.188 1.678 τ 10 4 2.796 1.106 2.800 1.104 2.800 1.104 τ 11 8 6.341 1.266 6.339 1.269 6.339 1.270-2*loglik 3636.527 3636.530 3636.530

Computational inefficiency Estimators specifically designed for MLMs involve matrix operations at the cluster level. The dimension of the covariance matrix to be inverted is determined by the number of level- 1 units within each level-2 unit.

Computational efficiency In the KF, D i (or D ) is the only matrix that needs to be inverted in the (multiple-subject) Kalman recursion. The dimension of D i (or D ) corresponds to that of the data vector, p. In this example, D i (or D ) reduces to a scalar, rendering the computations very efficient.

Empirical comparison 50 clusters, balanced designs

Example 2: univariate MLM (random intercept longitudinal model) The longitudinal model (Icaza & Jones, 1999): 2 i1, j + u, where u ~ N 0, u, The state space form specification: 2 y j +, where j ~ N 0,, ~ N 0,, y y,, x 1,, 0, j 2 j 0, j 1 1, j, j, 2 0 0 0 0 u j, j, j, and j, 0 0 1 0 0 0 0

Initializing the KF Initialized with for all j to capture the random effect and the innovation variance of the AR(1) process. Note 2 u 0 2 and P 1 0 0 0 0, j 0 0, j 2 2 u Var( ) Var( i 1, j )+Var( u ) Var( ) Var( i 1, j). 2 1

Example 2 results 100 subjects, 20 observations, balanced design True PROC MIXED PROC IML Parameter value Point SE Point SE ϕ 0.5.451.066.459.066 μ 1.969.112.971.112 2 u 2 0.75.873.118.684.135 0.25.257.116.270.113 τ 1 1.139.178 1.143.179-2*loglik 5904.340 5904.718

Empirical comparison 50 subjects, balanced design

Example 3: multivariate MLM (random intercepts model) The combined matrix form: y1 100 u1 0 j 1 1 0 11 y2 2 00 u20 j 2, where 2 ~ MVN 0, 21 22. y3 3 00 u3 0 j 3 3 0 31 32 33 The random effect vector: u1 0 j 0 u2 0 j ~ MVN 0,. u3 0 j 0 Kamata, Bauer, & Miyazaki (2008, Equations 10.8a 10.8g)

State space form Re-write the combined matrix form: Then we have y1 1 0 0 u1 0 j 100 1 y2 0 1 0 u2 0 j 2 00 1 2. y3 0 0 1 u3 3 3 0 j 00 y1 u1 0 j 1 0 y y2, u2 0 j, x 1, 2, 0, y3 u3 3 0 0 j 0 1 0 0 100 11 0, 0 1 0, 2, 21 22, 0 0 0 1 3 00 31 32 33 0 1 0 0 0 0 0, 0 1 0, 0, and 0 0. 0 0 0 1 0 0 0 0 j j j 00 j j j j j

Initializing the KF Initialized with for all j. 0 0 0, j 0 and P 0 0, j 0

Example 3 results MLwiN Mplus PROC MIXED PROC IML Parameter Point SE Point SE Point SE Point SE γ1 00 3.677.072 3.677.072 3.677.072 3.677.072 γ2 00 3.717.072 3.717.072 3.717.072 3.717.072 γ3 00 3.721.072 3.721.072 3.721.072 3.721.072 τ.130.038.129.037.130.037.130.037 δ 11 3.036.070 3.036.071 3.036.071 3.036.071 δ 12 1.206.054 1.206.054 1.206.054 1.206.054 δ 22 3.096.072 3.095.072 3.096.072 3.096.072 δ 31 1.328.054 1.328.054 1.328.054 1.328.054 δ 32 1.290.054 1.290.054 1.290.054 1.290.054 δ 33 2.966.069 2.966.069 2.966.069 2.966.069-2*loglik 42658.679 42657.388 42658.678 42658.678

Summary The results from the state space approach are identical to those from the MLM and SEM software, except that The estimated variance of the innovation in the AR(1) process is biased from both PROC MIXED and PROC IML. The state space approach is more efficient because matrix operation is implemented at each observation, instead of at the cluster level.

Discussions Software: SAS/IML, R, MATLAB, C++, C, FORTRAN Further extension to multilevel CFA and multilevel SEM

Conclusions In general, the solution to representing the multilevel models using state space forms is to include the random effects in the latent state vector; and then, initialize the Kalman recursion by properly specifying the latent state vector (η 0 0,j ) and its associated covariance matrix (P 0 0,j ) to subsume the parameters at the cluster level.