Estimating a Piecewise Growth Model with Longitudinal Data that Contains Individual Mobility across Clusters

Estimating a Piecewise Growth Model with Longitudinal Data that Contains Individual Mobility across Clusters Audrey J. Leroux Georgia State University

Piecewise Growth Model (PGM) PGMs are beneficial for potentially nonlinear data, because they break up curvilinear growth trajectories into separate linear components This modeling approach is useful when wanting to compare growth rates during two or more different time periods. For example: Compare growth rates during two or more different time periods Longitudinal data before treatment as well as during treatment Longitudinal data during treatment as well as follow-up data after treatment Etc.

Three-Level PGM Three-level PGMs are used to model the clustering of individuals, such as when students are nested within schools, classrooms, districts, etc. in educational research. 3

Baseline Three-Level PGM Level 1: Y tij = π 0ij + π 1ij Time 1tij + π ij Time tij + e tij, e tij ~ N 0, σ e where Time 1tij and Time tij are coded to represent piecewise growth Of course, can include predictors Level : Level 3: π 0ij = β 00j + r 0ij π 1ij = β 10j + r 1ij π ij = β 0j + r ij, β 00j = γ 000 + u 00j β 10j = γ 100 + u 10j, β 0j = γ 00 + u 0j u 00j u 10j u 0j r 0ij r 1ij r ij ~ MVN ~ MVN 0 0 0, 0 0 0, τ u00 τ u00,u10 τ r0 τ r0,r1 τ r1 τ r0,r τ r1,r τ r τ u10 τ u00,u0 τ u10,u0 τ u0 4

Three-Level PGM To estimate this particular model presented with initial status and two slopes varying across both individuals and clusters, a minimum of four time-points must be included for identification of the model. One more time-point than the number of growth parameters (three). Otherwise, modeling the slopes as fixed or constraints on the level-1 residual variance can be placed if utilizing fewer time-points (see McCoach, O'Connell, Reis, & Levitt, 006; Palardy, 010). 5

Three-Level PGM Three-level PGMs utilized in previous research have assumed pure clustering of individuals across time or removed individuals from the analysis who changed clusters. BUT in reality, individual mobility across clusters is frequently encountered in longitudinal studies. Incorrect model specification in the presence of cluster mobility negatively impacts parameter estimates (Chung & Beretvas, 01; Grady, 010; Grady & Beretvas, 010; Leroux, 014; Leroux & Beretvas, 017a, Leroux & Beretvas, 017b; Luo & Kwok, 009; Luo & Kwok, 01; Meyers & Beretvas, 006). Generally, leads to inaccurate estimates of between-clusters variance components and standard errors of the fixed effects. 6

Longitudinal Data with Mobile Students Student Fall K Spring K Spring 1 st Spring 3 rd Spring 5 th Sch. 1 Sch. 1 Sch. Sch. 1 Sch. Sch. 3 Sch. 1 Sch. Sch. 3 Sch. 4 Sch. 1 Sch. Sch. 3 Sch. 4 Sch. 5 A B C D E 7

Multiple Membership Data Elementary ES1 ES ES3 ES4 ES5 ES6 ES7 ES8 School Student A B C D E F G H I J K L M N O P Q R S T U V Some units of a lower-level classification are members of more than one higherlevel classification. 8

Multiple Membership Random Effects Model (MMREM) Models the contribution to the outcome, Y, of each level- unit of which the level- 1 unit is a member. E.g., For student i who is a member of a set of ESs {j}, the unconditional model s L1 equation is: Y i j = β 0 j + r i j At L: β 0 j = γ 00 + h j w ih u 0h 9

MMREM Single equation: Y i j = γ 00 + h j w ih u 0h + r i j r i j ~ N 0, σ and u 0h ~ N 0, τ u00 where the user specifies the weights to represent the hypothesized contribution of each L unit (here, elementary school) For each student i: h j w ih = 1 10

MMREM For non-mobile student B (attending ES1): Y B ES1 = γ 00 + u 0 ES1 + r B ES1 For mobile student A, attending ES1 and ES: Y A ES1,ES = γ 00 + 0.5u 0 ES1 + 0.5u 0 ES + r A ES1,ES For mobile student Q, attending ES6, ES7, and ES8: Y Q ES6,ES7,ES8 = γ 00 + 1 3 u 0 ES6 + 1 3 u 0 ES7 + 1 3 u 0 ES8 + r Q ES6,ES7,ES8 11

Conditional MMREM Can include L1 and L predictors. At L1: And at L: Y i j = β 0 j + β 1 j X i j + r i j β 0 j = γ 00 + β 1 j = γ 10 + h j h j w ih γ 01 Z h + u 0h w ih γ 11 Z h + u 1h 1

Conditional MMREM The following multivariate normal distribution is assumed for the level- residuals: u 0 j u 1 j ~ N 0 0, τ u00 τ u10 τ u11 Note: Contribution of each ES s Z (for mobile students) is modeled and weighted in the same way as are schools effects (the u s). 13

Conditional MMREM Private (coded 1) Public (coded 0) For mobile student A, attending private ES1 and public ES: Y A ES1,ES = γ 00 + γ 01 0.5 1 + 0.5 0 + 0.5u 0 ES1 + 0.5u 0 ES + r A ES1,ES 14

Cross-Classified Multiple Membership Longitudinal Data School in 1ES1 1ES 1ES3 1ES4 Time 1 Student A B C D E F G H I J K L M N O P Q R S T U V School after +ES1 +ES +ES3 +ES4 Time 1 Grady & Beretvas (010) 15

Purpose of Current Study The current study proposes a three-level PGM to handle mobile students who change schools (clusters) during the period of data collection. The proposed cross-classified multiple membership PGM (CCMM-PGM) will be derived, justified, and explained using a real dataset. 17

Purpose of Current Study This extension is of particular importance when repeated measures over time are captured for students within schools because there is a high probability that at least some substantial proportion of students change schools during a study s time period. 38.5% of people aged 5-17 years moved within 005 to 010 (Ihrke & Faber, 01) 5% of those between the ages 5-17 relocated within the same county From 01 to 013, 1% of people between the ages 5-17 years old moved 69% of those moves occurred within the same county (U.S. Census Bureau, 013) 13% of students changed schools 4 or more times between kindergarten and 8th grade (U.S. Government accounting office, 010) 18

Baseline CCMM-PGM Level 1: Y ti j1, j = π 0i j1, j + π 1i j1, j TIME 1ti j 1, j + π i j1, j TIME ti j 1, j + e ti j1, j Level : Intercept (initial status) π 0i j1, j = β 00 j1, j + r 0i j1, j π 1i j1, j = β 10 j1, j + r 1i j1, j π i j1, j = β 0 j1, j + r i j1, j β 00 j1, j = γ 0000 + u 00j1 0 Subscripts j 1 and {j } index the first and set of subsequent schools attended by a student. No subsequent school residual for initial status Level 3: 1 st Slope β 10 j1, j = γ 1000 + u 10j1 0 + w 1tih u 100h h j β 0 j1, j = γ 000 + u 0j1 0 + w tih u 00h h j Note two different weights because each slope is associated with different time-points nd Slope Cross-classification of first and subsequent schools 19

Data ECLS-K data were used with time nested within students nested within schools. Multiple membership structure due to some students switching elementary schools across the course of data collection Time-points: Fall of kindergarten and springs of kindergarten, 1 st, 3 rd, and 5 th grade Outcome: Math IRT-scaled scores Growth rates from K 1 st grade appeared faster than those from 1 st 5 th grade Gender (1 = female; 0 = male) and school type (1 = private; 0 = public) 10,906 students (9.8% mobile) from 970 schools 0

Descriptive Statistics Variable Name M SD N Outcome Math achievement in Fall Kindergarten Y 1ij 6.69 9.0 9,74 Math achievement in Spring Kindergarten Y ij 37.17 11.95 10,664 Math achievement in Spring 1 st Grade Y 3ij 6.6 17.96 10,803 Math achievement in Spring 3 rd Grade Y 4ij 99.73 4.47 10,764 Math achievement in Spring 5 th Grade Y 5ij 14.05 4.66 10,801 Variable Name Percentage N Level- variable Female student 49.78% 5,49 FEMALE Male student ij 50.% 5,477 Level-3 variable Private school 3.51% 8 PRIVATE Public school j 76.49% 74 1

Coding of Time Variables Grades Fall K Spring K Spring 1 st Spring 3 rd Spring 5 th Time 1tij 0 0.5 1.5 1.5 1.5 Interpretation of πs π 0ij status in Fall K π 1ij growth rate period 1 Time tij 0 0 0 4 π ij growth rate period Exploratory analyses suggested a two-piece growth model because growth rates from kindergarten through 1 st grade appeared faster than those from 1 st through 5 th grade

Analyses Baseline and conditional versions of the following models were estimated: CCMM-PGM: appropriately took into account student mobility First school-pgm: ignored mobility by only modeling effect of the first school attended Delete-PGM: ignored mobility by deleting students who changed schools Weights are based on the proportion of time-points a student was associated with a school. 3

Coding Schemes for Weights Student Fall K Spring K Schools Weights (K 1 st ) Weights (1 st 5 th ) Spring Spring Spring 1 st 1 st 3 rd 5 th School nd School 1 st School nd School 3 rd School A S1 S1 S1 S1 S1 1 0 1 0 0 0 B S1 S1 S1 S1 S 1 0 3/4 1/4 0 0 C S1 S1 S1 S S 1 0 1/ 1/ 0 0 D S1 S1 S S S 1/ 1/ 1/4 3/4 0 0 E S1 S S S S 1 0 1 0 0 0 F S1 S S S S3 1 0 3/4 1/4 0 0 G S1 S S S3 S3 1 0 1/ 1/ 0 0 H S1 S S3 S3 S3 1/ 1/ 1/4 3/4 0 0 I S1 S S3 S3 S4 1/ 1/ 1/4 1/ 1/4 0 J S1 S S3 S4 S4 1/ 1/ 1/4 1/4 1/ 0 4 th School K S1 S S3 S4 S5 1/ 1/ 1/4 1/4 1/4 1/4 4

Estimation Models were fit using R with MCMC estimation using Rjags to interface with Just Another Gibbs Sampler (JAGS). Non-informative normal priors were used for fixed effects parameters and inverse-wishart distributions for the covariance matrices. Burn-in period of 5,000 iterations and an additional 50,000 iterations Parameter and SE estimates were compared, as well as model fit using the deviance information criterion value (DIC; Spiegelhalter, Best, Carlin, & van der Linde, 00). 5

Baseline Fixed Effects Estimating Model CCMM-PGM 1 First School-PGM 1 Delete-PGM Parameter Coeff. Est. (SE) Coeff. Est. (SE) Coeff. Est. (SE) Model for initial status Intercept γ 0000 5.313 (0.171) γ 000 5.307 (0.171) γ 000 5.441 (0.194) Model for 1 st slope Intercept γ 1000 5.480 (0.150) γ 100 5.488 (0.146) γ 100 5.44 (0.171) Model for nd slope Intercept γ 000 15.499 (0.066) γ 00 15.500 (0.066) γ 00 15.544 (0.078) DIC 451,380.3 453,361.6 305,458.7 6

Baseline Random Effects Estimating Model CCMM-PGM First School-PGM Delete-PGM Parameter Coeff. Est. (SE) Coeff. Est. (SE) Coeff. Est. (SE) Level-1 variance between Measures σ 6.453 (0.631) σ 6.479 (0.66) σ 63.345 (0.706) Initial status variance between Students 1 st schools τ r0 τ u0j1 3.443 19.386 (1.057) (1.69) τ r0 τ u0 3.376 19.356 (1.00) (1.88) τ r0 τ u0 3.911 19.844 (1.73) (1.495) 1 st slope variance between Students 1 st schools Subsequent schools τ r1 τ u1j1 τ u1 j 39.81 8.663 3.567 (1.13) (1.139) (0.990) τ r1 τ u1 39.339 11.66 (1.113) (0.97) τ r1 τ u1 37.989 1.457 (1.79) (1.13) nd slope variance between Students 1 st schools Subsequent schools τ r τ uj1 τ u j 7.761 1.861 0.935 (0.14) (0.349) (0.414) τ r τ u 7.86.585 (0.4) (0.184) τ r τ u 7.043.870 (0.54) (0.49) 7

Conditional Fixed Effects Estimating Model CCMM-PGM 1 First School-PGM 1 Delete-PGM Parameter Coeff. Est. (SE) Coeff. Est. (SE) Coeff. Est. (SE) Model for initial status Intercept FEMALE Sch1_PRIVATE Model for 1 st slope Intercept FEMALE Sch1_PRIVATE SubSch_PRIVATE Model for nd slope Intercept FEMALE Sch1_PRIVATE SubSch_PRIVATE γ 0000 γ 0100 γ 0010 γ 1000 γ 1100 γ 1010 γ 1001 γ 000 γ 100 γ 010 γ 001 4.55 0.07 5.057 5.37 1.556 1.743 0.800 15.446 0.699 1.601 1.466 (0.18) (0.176) (0.385) (0.163) (0.186) (1.59) (1.569) (0.07) (0.078) (0.407) (0.45) γ 000 γ 010 γ 001 γ 100 γ 110 γ 101 γ 00 γ 10 γ 01 4.59 0.08 5.08 5.46 1.56 0.955 15.447 0.707 0.84 (0.168) (0.187) (0.374) (0.156) (0.185) (0.340) (0.073) (0.077) (0.168) γ 000 γ 010 γ 001 γ 100 γ 110 γ 101 γ 00 γ 10 γ 01 4.81 0.340 5.459 5.140 1.651 1.85 15.511 0.635 0.116 DIC 454,180.8 456,478.6 306,754.3 (0.04) (0.16) (0.43) (0.197) (0.7) (0.403) (0.086) (0.096) (0.180) 8

Conditional Random Effects Estimating Model CCMM-PGM First School-PGM Delete-PGM Parameter Coeff. Est. (SE) Coeff. Est. (SE) Coeff. Est. (SE) Level-1 variance between Measures σ 6.407 (0.657) σ 6.413 (0.657) σ 63.87 (0.711) Initial status variance between Students 1 st schools τ r0 τ u0j1 3.443 15. (0.97) (1.041) τ r0 τ u0 3.48 15.165 (1.043) (1.08) τ r0 τ u0 33.049 14.809 (1.95) (1.15) 1 st slope variance between Students 1 st schools Subsequent schools τ r1 τ u1j1 τ u1 j 38.769 8.958.974 (1.107) (1.089) (0.93) τ r1 τ u1 38.861 11.388 (1.08) (0.908) τ r1 τ u1 37.3 1.05 (1.337) (1.111) nd slope variance between Students 1 st schools Subsequent schools τ r τ uj1 τ u j 7.638 1.86 0.97 (0.17) (0.353) (0.395) τ r τ u 7.699.56 (0.0) (0.190) τ r τ u 6.960.851 (0.55) (0.40) 9

Implications Ignoring mobility could lead to inaccurate conclusions about: The intercept and slope estimates in a three-level PGM if one were to delete mobile cases The impact of cluster-level predictors (regardless if you delete or ignore mobile individuals) The impact of both level- and level-3 predictors if one were to delete mobile cases. 30

Implications Researchers using a PGM ignoring multiple membership data should be careful when making inferences about the nature of variability in growth rates. For the delete-pgm, the SE estimates of the other variances were larger, which could then lead to erroneous conclusions about random effects if mobile individuals were removed from analysis. The CCMM-PGM fit to the data better than the first school-pgm. Because of these findings, a simulation study will be conducted this summer, so stay tuned 31

Thank you! aleroux@gsu.edu 3