Estimating a Piecewise Growth Model with Longitudinal Data that Contains Individual Mobility across Clusters

Estimating a Piecewise Growth Model with Longitudinal Data that Contains Individual Mobility across Clusters Audrey J. Leroux Georgia State University

Piecewise Growth Model (PGM)
PGMs are beneficial for potentially nonlinear data because they break up curvilinear growth trajectories into separate linear components. This modeling approach is useful for comparing growth rates during two or more different time periods. For example:
- Longitudinal data before treatment as well as during treatment
- Longitudinal data during treatment as well as follow-up data after treatment
- Etc.

Three-Level PGM
Three-level PGMs are used to model the clustering of individuals, such as when students are nested within schools, classrooms, districts, etc. in educational research.

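Baseline Three-Level PGM
Level 1:
$$Y_{tij} = \pi_{0ij} + \pi_{1ij}\,\mathrm{Time1}_{tij} + \pi_{2ij}\,\mathrm{Time2}_{tij} + e_{tij}, \qquad e_{tij} \sim N(0, \sigma_e^2)$$
where Time1_tij and Time2_tij are coded to represent piecewise growth. Predictors can, of course, be included.
Level 2:
$$\pi_{0ij} = \beta_{00j} + r_{0ij}, \qquad \pi_{1ij} = \beta_{10j} + r_{1ij}, \qquad \pi_{2ij} = \beta_{20j} + r_{2ij}$$
$$\begin{bmatrix} r_{0ij} \\ r_{1ij} \\ r_{2ij} \end{bmatrix} \sim MVN\left( \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix}, \begin{bmatrix} \tau_{r0} & & \\ \tau_{r0,r1} & \tau_{r1} & \\ \tau_{r0,r2} & \tau_{r1,r2} & \tau_{r2} \end{bmatrix} \right)$$
Level 3:
$$\beta_{00j} = \gamma_{000} + u_{00j}, \qquad \beta_{10j} = \gamma_{100} + u_{10j}, \qquad \beta_{20j} = \gamma_{200} + u_{20j}$$
$$\begin{bmatrix} u_{00j} \\ u_{10j} \\ u_{20j} \end{bmatrix} \sim MVN\left( \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix}, \begin{bmatrix} \tau_{u00} & & \\ \tau_{u00,u10} & \tau_{u10} & \\ \tau_{u00,u20} & \tau_{u10,u20} & \tau_{u20} \end{bmatrix} \right)$$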

Three-Level PGM
To estimate the model presented here, with initial status and two slopes varying across both individuals and clusters, a minimum of four time-points is required for identification: one more time-point than the number of growth parameters (three). With fewer time-points, the slopes can instead be modeled as fixed, or constraints can be placed on the level-1 residual variance (see McCoach, O'Connell, Reis, & Levitt, 2006; Palardy, 2010).

Three-Level PGM
Three-level PGMs utilized in previous research have either assumed pure clustering of individuals across time or removed individuals who changed clusters from the analysis. In reality, however, individual mobility across clusters is frequently encountered in longitudinal studies. Incorrect model specification in the presence of cluster mobility negatively impacts parameter estimates (Chung & Beretvas, 2012; Grady, 2010; Grady & Beretvas, 2010; Leroux, 2014; Leroux & Beretvas, 2017a, 2017b; Luo & Kwok, 2009, 2012; Meyers & Beretvas, 2006). Generally, it leads to inaccurate estimates of the between-clusters variance components and of the standard errors of the fixed effects.

Longitudinal Data with Mobile Students
[Table: schools attended (Sch. 1 through Sch. 5) by students A through E at each time-point: Fall K, Spring K, Spring 1st, Spring 3rd, Spring 5th.]

Multiple Membership Data
[Figure: students A through V linked to the elementary schools (ES1 through ES8) they attended.]
Some units of a lower-level classification are members of more than one higher-level classification.

Multiple Membership Random Effects Model (MMREM)
Models the contribution to the outcome, Y, of each level-2 unit of which the level-1 unit is a member. For example, for student i who is a member of a set of elementary schools {j}, the unconditional model's L1 equation is:
$$Y_{i\{j\}} = \beta_{0\{j\}} + r_{i\{j\}}$$
At L2:
$$\beta_{0\{j\}} = \gamma_{00} + \sum_{h \in \{j\}} w_{ih} u_{0h}$$

MMREM
Single equation:
$$Y_{i\{j\}} = \gamma_{00} + \sum_{h \in \{j\}} w_{ih} u_{0h} + r_{i\{j\}}, \qquad r_{i\{j\}} \sim N(0, \sigma^2), \quad u_{0h} \sim N(0, \tau_{u00})$$
where the user specifies the weights to represent the hypothesized contribution of each L2 unit (here, elementary school). For each student i:
$$\sum_{h \in \{j\}} w_{ih} = 1$$

MMREM
For non-mobile student B (attending ES1):
$$Y_{B\{ES1\}} = \gamma_{00} + u_{0,ES1} + r_{B\{ES1\}}$$
For mobile student A, attending ES1 and ES2:
$$Y_{A\{ES1,ES2\}} = \gamma_{00} + 0.5\,u_{0,ES1} + 0.5\,u_{0,ES2} + r_{A\{ES1,ES2\}}$$
For mobile student Q, attending ES6, ES7, and ES8:
$$Y_{Q\{ES6,ES7,ES8\}} = \gamma_{00} + \tfrac{1}{3}u_{0,ES6} + \tfrac{1}{3}u_{0,ES7} + \tfrac{1}{3}u_{0,ES8} + r_{Q\{ES6,ES7,ES8\}}$$
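The weighted sum in these examples can be sketched in a few lines of Python. This is only an illustration, not the author's code: the function name `mmrem_outcome`, the value of `gamma00`, and the school effects in `u0` are hypothetical numbers chosen to make the arithmetic easy to follow.

```python
import math


def mmrem_outcome(gamma00, weights, school_effects, residual=0.0):
    """Return gamma00 + sum_h(w_ih * u_0h) + r for one student.

    weights: dict mapping school label -> weight w_ih (must sum to 1).
    school_effects: dict mapping school label -> random effect u_0h.
    """
    if not math.isclose(sum(weights.values()), 1.0):
        raise ValueError("weights must sum to 1 for each student")
    school_part = sum(w * school_effects[h] for h, w in weights.items())
    return gamma00 + school_part + residual


# Hypothetical values, for illustration only (not estimates from the study)
gamma00 = 50.0
u0 = {"ES1": 2.0, "ES2": -1.0, "ES6": 3.0, "ES7": 0.0, "ES8": -3.0}

# Non-mobile student B: a single school with weight 1
y_B = mmrem_outcome(gamma00, {"ES1": 1.0}, u0)
# Mobile student A: two schools, equal weights
y_A = mmrem_outcome(gamma00, {"ES1": 0.5, "ES2": 0.5}, u0)
# Mobile student Q: three schools, equal weights
y_Q = mmrem_outcome(gamma00, {"ES6": 1 / 3, "ES7": 1 / 3, "ES8": 1 / 3}, u0)
```

The check that the weights sum to 1 mirrors the constraint stated for each student i; unequal weights (for example, proportional to time spent in each school) can be passed in the same way.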

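Conditional MMREM
Can include L1 and L2 predictors. At L1:
$$Y_{i\{j\}} = \beta_{0\{j\}} + \beta_{1\{j\}} X_{i\{j\}} + r_{i\{j\}}$$
And at L2:
$$\beta_{0\{j\}} = \gamma_{00} + \sum_{h \in \{j\}} w_{ih}\left(\gamma_{01} Z_h + u_{0h}\right)$$
$$\beta_{1\{j\}} = \gamma_{10} + \sum_{h \in \{j\}} w_{ih}\left(\gamma_{11} Z_h + u_{1h}\right)$$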

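Conditional MMREM
The following multivariate normal distribution is assumed for the level-2 residuals:
$$\begin{bmatrix} u_{0\{j\}} \\ u_{1\{j\}} \end{bmatrix} \sim N\left( \begin{bmatrix} 0 \\ 0 \end{bmatrix}, \begin{bmatrix} \tau_{u00} & \\ \tau_{u10} & \tau_{u11} \end{bmatrix} \right)$$
Note: The contribution of each elementary school's Z (for mobile students) is modeled and weighted in the same way as the schools' effects (the u's).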

Conditional MMREM
School type is coded 1 = private and 0 = public. For mobile student A, attending private ES1 and public ES2:
$$Y_{A\{ES1,ES2\}} = \gamma_{00} + \gamma_{01}\left(0.5 \cdot 1 + 0.5 \cdot 0\right) + 0.5\,u_{0,ES1} + 0.5\,u_{0,ES2} + r_{A\{ES1,ES2\}}$$
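As a worked step (added here for illustration, using only the weights and codes given for student A), the weighted school-type predictor reduces to a single number, so the private-school effect enters at half strength:

$$\sum_{h \in \{ES1, ES2\}} w_{Ah} Z_h = 0.5(1) + 0.5(0) = 0.5$$

$$Y_{A\{ES1,ES2\}} = \gamma_{00} + 0.5\,\gamma_{01} + 0.5\,u_{0,ES1} + 0.5\,u_{0,ES2} + r_{A\{ES1,ES2\}}$$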

Cross-Classified Multiple Membership Longitudinal Data
[Figure: students A through V linked both to the school attended at Time 1 (four schools) and to the school or schools attended after Time 1 (four schools). Source: Grady & Beretvas (2010).]


Purpose of Current Study
The current study proposes a three-level PGM to handle mobile students who change schools (clusters) during the period of data collection. The proposed cross-classified multiple membership PGM (CCMM-PGM) will be derived, justified, and explained using a real dataset.

Purpose of Current Study
This extension is of particular importance when repeated measures over time are captured for students within schools, because there is a high probability that a substantial proportion of students will change schools during a study's time period.
- 38.5% of people aged 5-17 years moved between 2005 and 2010 (Ihrke & Faber, 2012)
- 5% of those between the ages 5-17 relocated within the same county
- From 2012 to 2013, 1% of people between the ages 5-17 years old moved
- 69% of those moves occurred within the same county (U.S. Census Bureau, 2013)
- 13% of students changed schools 4 or more times between kindergarten and 8th grade (U.S. Government Accountability Office, 2010)

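Baseline CCMM-PGM
Subscripts j_1 and {j_2} index the first school and the set of subsequent schools attended by a student.
Level 1:
$$Y_{ti(j_1,\{j_2\})} = \pi_{0i(j_1,\{j_2\})} + \pi_{1i(j_1,\{j_2\})}\,\mathrm{TIME1}_{ti(j_1,\{j_2\})} + \pi_{2i(j_1,\{j_2\})}\,\mathrm{TIME2}_{ti(j_1,\{j_2\})} + e_{ti(j_1,\{j_2\})}$$
Level 2:
$$\pi_{0i(j_1,\{j_2\})} = \beta_{00(j_1,\{j_2\})} + r_{0i(j_1,\{j_2\})}$$
$$\pi_{1i(j_1,\{j_2\})} = \beta_{10(j_1,\{j_2\})} + r_{1i(j_1,\{j_2\})}$$
$$\pi_{2i(j_1,\{j_2\})} = \beta_{20(j_1,\{j_2\})} + r_{2i(j_1,\{j_2\})}$$
Level 3, intercept (initial status), with no subsequent-school residual:
$$\beta_{00(j_1,\{j_2\})} = \gamma_{0000} + u_{00j_10}$$
Level 3, 1st slope:
$$\beta_{10(j_1,\{j_2\})} = \gamma_{1000} + u_{10j_10} + \sum_{h \in \{j_2\}} w_{1tih}\,u_{100h}$$
Level 3, 2nd slope:
$$\beta_{20(j_1,\{j_2\})} = \gamma_{2000} + u_{20j_10} + \sum_{h \in \{j_2\}} w_{2tih}\,u_{200h}$$
Note that there are two different weights because each slope is associated with different time-points. The first and subsequent schools are cross-classified.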
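To make the level-3 structure concrete, the sketch below assembles the three school-level coefficients for one student: the intercept receives only the first-school residual, while each slope adds a weighted sum of subsequent-school residuals. It is an illustrative sketch, not the author's implementation; the function names, dictionary keys, and all numeric values are hypothetical.

```python
import math


def weighted_effect(weights, effects):
    """Weighted sum of subsequent-school residuals; weights must sum to 1."""
    if not math.isclose(sum(weights.values()), 1.0):
        raise ValueError("weights must sum to 1")
    return sum(w * effects[h] for h, w in weights.items())


def level3_coefficients(gamma, u_first, w_slope1, w_slope2,
                        u_sub_slope1, u_sub_slope2):
    """Return (beta00, beta10, beta20) for one student.

    gamma, u_first: dicts keyed by "intercept", "slope1", "slope2".
    w_slope1, w_slope2: weights over subsequent schools for each slope.
    u_sub_slope1, u_sub_slope2: subsequent-school residuals for each slope.
    """
    # Initial status: first-school residual only, no subsequent-school term
    beta00 = gamma["intercept"] + u_first["intercept"]
    # Each slope: first-school residual plus weighted subsequent-school residuals
    beta10 = (gamma["slope1"] + u_first["slope1"]
              + weighted_effect(w_slope1, u_sub_slope1))
    beta20 = (gamma["slope2"] + u_first["slope2"]
              + weighted_effect(w_slope2, u_sub_slope2))
    return beta00, beta10, beta20


# Hypothetical values, for illustration only
gamma = {"intercept": 20.0, "slope1": 10.0, "slope2": 5.0}
u_first = {"intercept": 1.0, "slope1": 0.5, "slope2": -0.5}
betas = level3_coefficients(
    gamma, u_first,
    w_slope1={"S1": 0.5, "S2": 0.5},
    w_slope2={"S1": 0.25, "S2": 0.75},
    u_sub_slope1={"S1": 1.0, "S2": -1.0},
    u_sub_slope2={"S1": 0.4, "S2": 0.8},
)
```

Keeping separate weight dictionaries for the two slopes reflects the note that each slope is associated with different time-points.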

Data
ECLS-K data were used, with time nested within students nested within schools.
- Multiple membership structure arises because some students switched elementary schools over the course of data collection
- Time-points: fall of kindergarten and springs of kindergarten, 1st, 3rd, and 5th grade
- Outcome: math IRT-scaled scores
- Growth rates from K to 1st grade appeared faster than those from 1st to 5th grade
- Predictors: gender (1 = female; 0 = male) and school type (1 = private; 0 = public)
- 10,906 students (9.8% mobile) from 970 schools

Descriptive Statistics

Outcome variable                          Name    M       SD      N
Math achievement in Fall Kindergarten     Y_1ij   6.69    9.0     9,74
Math achievement in Spring Kindergarten   Y_2ij   37.17   11.95   10,664
Math achievement in Spring 1st Grade      Y_3ij   6.6     17.96   10,803
Math achievement in Spring 3rd Grade      Y_4ij   99.73   4.47    10,764
Math achievement in Spring 5th Grade      Y_5ij   14.05   4.66    10,801

Level-2 variable (FEMALE_ij)              Percentage   N
Female student                            49.78%       5,429
Male student                              50.22%       5,477

Level-3 variable (PRIVATE_j)              Percentage   N
Private school                            23.51%       228
Public school                             76.49%       742

Coding of Time Variables

Variable     Fall K   Spring K   Spring 1st   Spring 3rd   Spring 5th
Time1_tij    0        0.5        1.5          1.5          1.5
Time2_tij    0        0          0            2            4

Interpretation of the πs:
- π_0ij: status in Fall K
- π_1ij: growth rate in period 1
- π_2ij: growth rate in period 2

Exploratory analyses suggested a two-piece growth model because growth rates from kindergarten through 1st grade appeared faster than those from 1st through 5th grade.
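One common way to generate a two-piece coding like this is to cap elapsed time at the knot for the first variable and to count time past the knot for the second. The sketch below assumes elapsed time in years since fall of kindergarten (0, 0.5, 1.5, 3.5, 5.5) and a knot at spring of 1st grade (1.5); those elapsed values and the function name `piecewise_codes` are assumptions made for illustration.

```python
def piecewise_codes(elapsed_years, knot):
    """Return (time1, time2) codes for a two-piece linear growth model.

    time1 increases until the knot and is then held constant;
    time2 is zero until the knot and then counts time since the knot.
    """
    time1 = [min(t, knot) for t in elapsed_years]
    time2 = [max(t - knot, 0.0) for t in elapsed_years]
    return time1, time2


# Assumed elapsed years: Fall K, Spring K, Spring 1st, Spring 3rd, Spring 5th
elapsed = [0.0, 0.5, 1.5, 3.5, 5.5]
time1, time2 = piecewise_codes(elapsed, knot=1.5)
```

With this coding the intercept is the expected score in Fall K, the first slope is the yearly growth rate through spring of 1st grade, and the second slope is the yearly growth rate afterward.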

Analyses
Baseline and conditional versions of the following models were estimated:
- CCMM-PGM: appropriately took student mobility into account
- First school-PGM: ignored mobility by modeling only the effect of the first school attended
- Delete-PGM: ignored mobility by deleting students who changed schools
Weights are based on the proportion of time-points at which a student was associated with a school.

Coding Schemes for Weights

         Schools attended                                        Weights (K to 1st)   Weights (1st to 5th)
Student  Fall K  Spring K  Spring 1st  Spring 3rd  Spring 5th   1st Sch.  2nd Sch.   1st Sch.  2nd Sch.  3rd Sch.  4th Sch.
A        S1      S1        S1          S1          S1           1         0          1         0         0         0
B        S1      S1        S1          S1          S2           1         0          3/4       1/4       0         0
C        S1      S1        S1          S2          S2           1         0          1/2       1/2       0         0
D        S1      S1        S2          S2          S2           1/2       1/2        1/4       3/4       0         0
E        S1      S2        S2          S2          S2           1         0          1         0         0         0
F        S1      S2        S2          S2          S3           1         0          3/4       1/4       0         0
G        S1      S2        S2          S3          S3           1         0          1/2       1/2       0         0
H        S1      S2        S3          S3          S3           1/2       1/2        1/4       3/4       0         0
I        S1      S2        S3          S3          S4           1/2       1/2        1/4       1/2       1/4       0
J        S1      S2        S3          S4          S4           1/2       1/2        1/4       1/4       1/2       0
K        S1      S2        S3          S4          S5           1/2       1/2        1/4       1/4       1/4       1/4
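The rows of this table are consistent with a simple rule: drop the Fall K school, then take the proportion of time-points spent in each school, using Spring K and Spring 1st for the first slope and all four post-baseline time-points for the second slope. The sketch below encodes that reading of the table; it is an interpretation for illustration, and the function names are hypothetical.

```python
from collections import Counter
from fractions import Fraction


def membership_weights(schools):
    """Proportion of the given time-points spent in each school."""
    counts = Counter(schools)
    n = len(schools)
    return {school: Fraction(count, n) for school, count in counts.items()}


def slope_weights(history):
    """Weights for both slopes from a five-wave school history.

    history: schools at Fall K, Spring K, Spring 1st, Spring 3rd, Spring 5th.
    """
    subsequent = history[1:]  # schools attended after the Fall K baseline
    w_first_slope = membership_weights(subsequent[:2])   # Spring K, Spring 1st
    w_second_slope = membership_weights(subsequent)      # Spring K to Spring 5th
    return w_first_slope, w_second_slope


# Students D and I from the table
w1_D, w2_D = slope_weights(["S1", "S1", "S2", "S2", "S2"])
w1_I, w2_I = slope_weights(["S1", "S2", "S3", "S3", "S4"])
```

Using `Fraction` keeps the weights exact (1/4, 3/4, and so on), so they can be compared directly with the table and always sum to exactly 1.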

Estimation
- Models were fit in R with MCMC estimation, using rjags to interface with Just Another Gibbs Sampler (JAGS).
- Non-informative normal priors were used for the fixed-effects parameters and inverse-Wishart distributions for the covariance matrices.
- A burn-in period of 5,000 iterations was followed by an additional 50,000 iterations.
- Parameter and SE estimates were compared, as was model fit using the deviance information criterion (DIC; Spiegelhalter, Best, Carlin, & van der Linde, 2002).
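For readers unfamiliar with the DIC, the sketch below shows the standard definition: the posterior mean deviance plus a penalty equal to the mean deviance minus the deviance at the posterior mean of the parameters. It is a generic Python illustration with made-up numbers, not the R/JAGS code used for these analyses, and software implementations may estimate the penalty term differently.

```python
from statistics import mean


def dic_from_deviance(deviance_draws, deviance_at_posterior_mean):
    """Return (DIC, pD) from MCMC deviance draws.

    deviance_draws: deviance evaluated at each retained MCMC draw.
    deviance_at_posterior_mean: deviance evaluated at the posterior mean
        of the parameters.
    """
    d_bar = mean(deviance_draws)              # posterior mean deviance
    p_d = d_bar - deviance_at_posterior_mean  # effective number of parameters
    return d_bar + p_d, p_d


# Made-up deviance values, for illustration only
dic, p_d = dic_from_deviance([10.0, 12.0, 14.0], 11.0)
```

Smaller DIC values indicate better expected fit after penalizing complexity, which is how competing models fit to the same data are compared.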

Baseline Fixed Effects

                            CCMM-PGM                   First School-PGM           Delete-PGM
Parameter                   Coeff.    Est. (SE)        Coeff.    Est. (SE)        Coeff.    Est. (SE)
Model for initial status
  Intercept                 γ_0000    5.313 (0.171)    γ_000     5.307 (0.171)    γ_000     5.441 (0.194)
Model for 1st slope
  Intercept                 γ_1000    5.480 (0.150)    γ_100     5.488 (0.146)    γ_100     5.44 (0.171)
Model for 2nd slope
  Intercept                 γ_2000    15.499 (0.066)   γ_200     15.500 (0.066)   γ_200     15.544 (0.078)
DIC                                   451,380.3                  453,361.6                  305,458.7

Baseline Random Effects

                                   CCMM-PGM                      First School-PGM          Delete-PGM
Parameter                          Coeff.       Est. (SE)        Coeff.   Est. (SE)        Coeff.   Est. (SE)
Level-1 variance between
  Measures                         σ²           6.453 (0.631)    σ²       6.479 (0.66)     σ²       63.345 (0.706)
Initial status variance between
  Students                         τ_r0         3.443 (1.057)    τ_r0     3.376 (1.00)     τ_r0     3.911 (1.73)
  1st schools                      τ_u0j1       19.386 (1.69)    τ_u0     19.356 (1.88)    τ_u0     19.844 (1.495)
1st slope variance between
  Students                         τ_r1         39.81 (1.13)     τ_r1     39.339 (1.113)   τ_r1     37.989 (1.79)
  1st schools                      τ_u1j1       8.663 (1.139)    τ_u1     11.66 (0.97)     τ_u1     1.457 (1.13)
  Subsequent schools               τ_u1{j2}     3.567 (0.990)
2nd slope variance between
  Students                         τ_r2         7.761 (0.14)     τ_r2     7.86 (0.4)       τ_r2     7.043 (0.54)
  1st schools                      τ_u2j1       1.861 (0.349)    τ_u2     .585 (0.184)     τ_u2     .870 (0.49)
  Subsequent schools               τ_u2{j2}     0.935 (0.414)

Conditional Fixed Effects

                            CCMM-PGM                   First School-PGM           Delete-PGM
Parameter                   Coeff.    Est. (SE)        Coeff.    Est. (SE)        Coeff.    Est. (SE)
Model for initial status
  Intercept                 γ_0000    4.55 (0.18)      γ_000     4.59 (0.168)     γ_000     4.81 (0.04)
  FEMALE                    γ_0100    0.07 (0.176)     γ_010     0.08 (0.187)     γ_010     0.340 (0.16)
  Sch1_PRIVATE              γ_0010    5.057 (0.385)    γ_001     5.08 (0.374)     γ_001     5.459 (0.43)
Model for 1st slope
  Intercept                 γ_1000    5.37 (0.163)     γ_100     5.46 (0.156)     γ_100     5.140 (0.197)
  FEMALE                    γ_1100    1.556 (0.186)    γ_110     1.56 (0.185)     γ_110     1.651 (0.7)
  Sch1_PRIVATE              γ_1010    1.743 (1.59)     γ_101     0.955 (0.340)    γ_101     1.85 (0.403)
  SubSch_PRIVATE            γ_1001    0.800 (1.569)
Model for 2nd slope
  Intercept                 γ_2000    15.446 (0.07)    γ_200     15.447 (0.073)   γ_200     15.511 (0.086)
  FEMALE                    γ_2100    0.699 (0.078)    γ_210     0.707 (0.077)    γ_210     0.635 (0.096)
  Sch1_PRIVATE              γ_2010    1.601 (0.407)    γ_201     0.84 (0.168)     γ_201     0.116 (0.180)
  SubSch_PRIVATE            γ_2001    1.466 (0.45)
DIC                                   454,180.8                  456,478.6                  306,754.3

Conditional Random Effects

                                   CCMM-PGM                      First School-PGM          Delete-PGM
Parameter                          Coeff.       Est. (SE)        Coeff.   Est. (SE)        Coeff.   Est. (SE)
Level-1 variance between
  Measures                         σ²           6.407 (0.657)    σ²       6.413 (0.657)    σ²       63.87 (0.711)
Initial status variance between
  Students                         τ_r0         3.443 (0.97)     τ_r0     3.48 (1.043)     τ_r0     33.049 (1.95)
  1st schools                      τ_u0j1       15. (1.041)      τ_u0     15.165 (1.08)    τ_u0     14.809 (1.15)
1st slope variance between
  Students                         τ_r1         38.769 (1.107)   τ_r1     38.861 (1.08)    τ_r1     37.3 (1.337)
  1st schools                      τ_u1j1       8.958 (1.089)    τ_u1     11.388 (0.908)   τ_u1     1.05 (1.111)
  Subsequent schools               τ_u1{j2}     .974 (0.93)
2nd slope variance between
  Students                         τ_r2         7.638 (0.17)     τ_r2     7.699 (0.0)      τ_r2     6.960 (0.55)
  1st schools                      τ_u2j1       1.86 (0.353)     τ_u2     .56 (0.190)      τ_u2     .851 (0.40)
  Subsequent schools               τ_u2{j2}     0.97 (0.395)

Implications
Ignoring mobility could lead to inaccurate conclusions about:
- The intercept and slope estimates in a three-level PGM, if mobile cases are deleted
- The impact of cluster-level predictors (regardless of whether mobile individuals are deleted or ignored)
- The impact of both level-2 and level-3 predictors, if mobile cases are deleted

Implications
- Researchers using a PGM that ignores multiple membership data should be careful when making inferences about the nature of variability in growth rates.
- For the delete-PGM, the SE estimates of the other variances were larger, which could lead to erroneous conclusions about random effects if mobile individuals were removed from the analysis.
- The CCMM-PGM fit the data better than the first school-PGM.
- Because of these findings, a simulation study will be conducted this summer, so stay tuned.

Thank you! aleroux@gsu.edu