Models for Clustered Data
1 Models for Clustered Data Edps/Psych/Soc 589 Carolyn J. Anderson, Department of Educational Psychology. © Board of Trustees, University of Illinois, Spring 2019
2 Outline Notation NELS88 data Fixed Effects ANOVA Random Effects ANOVA Multiple Regression HLM and the general linear mixed model Reading: Snijders & Bosker, chapter 3 CJ Anderson (Illinois) Hierarchical Linear Models/Multilevel Analysis Spring / 64
3 Notation Snijders & Bosker's notation: j = index for groups (macro-level units), j = 1, …, N, where N = number of groups; i = index for individuals (micro-level units), i = 1, …, n_j, where n_j = number of individuals in group j.
4 Notation (continued) Y_ij = response or dependent variable; y_ij is an observed value of the response/dependent variable. x_ij = explanatory variable. n_+ = total number of level-1 observations, n_+ = ∑_{j=1}^{N} n_j. This differs from Snijders & Bosker (they use M).
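A tiny sketch of this notation in Python; the group sizes below are made-up illustrative counts, not the NELS88 data:

```python
# Hypothetical group sizes n_j for N = 3 groups (illustrative only).
n_j = {1: 23, 2: 20, 3: 17}    # group j -> number of level-1 units in group j

N = len(n_j)                   # number of groups (macro-level units)
n_plus = sum(n_j.values())     # n_+ = sum over j of n_j, total level-1 observations
print(N, n_plus)
```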
5 NELS88 National Education Longitudinal Study: conducted by the National Center for Education Statistics of the U.S. Department of Education. These data constitute the first in a series of longitudinal measurements of students starting in 8th grade. These data were collected Spring 1988.
6 NELS88: The Data for 10 Schools
7 Descriptive Statistics Math scores. Total sample: ∑_j n_j = n_+ = 260, Ȳ = 51.30, s = 11.14. By school: table of sch_id, n_j, mean, std dev, min, max.
8 Another Look at the Data (figure: mean math scores ± 1 standard deviation for the N = 10 schools)
9 Review of Fixed Effects ANOVA The model: Y_ij = μ + α_j + R_ij, where μ is the overall mean (constant), α_j is the group j effect (constant), and R_ij is the residual or error (random), R_ij ~ N(0, σ²) and independent. The model accounts for all group differences; interest lies only in these N groups.
10 Example: NELS88 using 10 schools, with math scores as the response variable. ANOVA summary table from SAS/GLM and MIXED (method=reml), with columns Source, df, Sum of Squares, Mean Square, F, p: Model (school), p < .0001; Error (residual); Corrected total.
11 Example (continued) Only from SAS/MIXED, maximum likelihood estimation (Effect, Num df, Den df, F value, p): school, p < .0001. Expected mean squares (Source, df, Sum of squares, Mean square, Expected mean square, Error term): sch_id, Var(Res) + Q(sch_id), error term MS(Res); Residual (Res), Var(Res).
12 Linear Regression Formulation of fixed effects ANOVA: Y_ij = μ_j + R_ij = μ + α_j + R_ij = β_0 + β_1 x_1ij + β_2 x_2ij + ⋯ + β_(N−1) x_(N−1),ij + R_ij, where x_kij = 1 if j = k, and 0 otherwise (i.e., dummy codes).
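The dummy-code formulation can be illustrated with a small sketch. The data below are hypothetical (three toy groups, group 3 as the reference category); the point is only that the dummy-coded regression reproduces the group means:

```python
# Toy illustration of Y_ij = beta_0 + beta_1 x_1ij + beta_2 x_2ij + R_ij,
# where x_kij = 1 if observation i is in group k, else 0.
# The scores below are made up, not the NELS88 data.
y = {1: [50.0, 54.0], 2: [47.0, 49.0], 3: [40.0, 44.0]}  # y[j] = scores in group j

means = {j: sum(v) / len(v) for j, v in y.items()}   # group means mu_j
beta0 = means[3]                                     # intercept = reference-group mean
betas = {k: means[k] - beta0 for k in (1, 2)}        # beta_k = mu_k - mu_reference

def fitted(j):
    """Fitted value for any observation in group j, built from dummy codes."""
    x = {k: int(j == k) for k in (1, 2)}             # dummy codes x_kij
    return beta0 + sum(betas[k] * x[k] for k in (1, 2))

print([fitted(j) for j in (1, 2, 3)])  # recovers the three group means
```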
13 Linear Regression in Matrix Notation

(Y_11, Y_21, …, Y_23,1, Y_1,2, Y_2,2, …, Y_20,2, …, Y_1,10, …, Y_20,10)′ = X (β_0, β_1, β_2, …, β_9)′ + (R_11, R_21, …, R_23,1, R_1,2, R_2,2, …, R_20,2, …, R_1,10, …, R_20,10)′

where X is the matrix of dummy codes from the previous slide.
14 Linear Regression in Matrix Notation

(Y_1′, Y_2′, …, Y_10′)′ = (X_1′, X_2′, …, X_10′)′ β + (R_1′, R_2′, …, R_10′)′

i.e., Y = Xβ + R
15 Fixed Effects Model (continued) Y_ij = β_0 + β_1 x_1ij + β_2 x_2ij + ⋯ + β_(N−1) x_(N−1)ij + R_ij. The parameters β_0, β_1, …, β_(N−1) are considered fixed. The R_ij's are random: R_ij ~ N(0, σ²) and independent. Therefore, Y_ij ~ N(β_0 + β_1 x_1ij + ⋯ + β_(N−1) x_(N−1)ij, σ²).
16 In Matrix Terms The model: Y = Xβ + R, where β is an (N × 1) vector of fixed parameters and R is an (n_+ × 1) vector of random variables: R ~ N_{n_+}(0, σ² I), so Y ~ N_{n_+}(Xβ, σ² I). The covariance matrix σ² I is diagonal, with σ² in every diagonal cell and 0 everywhere else.
17 Estimated model parameters, σ² and β. From the ANOVA table from SAS: σ̂² = 72.34. Solution for fixed effects, with columns Effect, sch_id, Estimate, Standard Error, df, t, Pr > |t|: the intercept (p < .0001) followed by the ten sch_id effects.
18 Estimated model parameters, σ² and β. From the table from R: σ̂² = 72.34. Are these different from SAS? Coefficients, with columns Estimate, Std. Error, t value, Pr(>|t|): the intercept (p < 2e-16) followed by the id effects.
19 Random Effects ANOVA Situation: the groups are a random sample from some larger population of groups, and you want to make inferences about this population. Data are hierarchically structured, with no explanatory variables.
20 Random Effects ANOVA: The Model Y_ij = μ (fixed) + U_j + R_ij (stochastic), where μ is the overall mean (constant); U_j is the group j effect (random), U_j ~ N(0, τ²) and independent; R_ij is the within-group residual or error (random), R_ij ~ N(0, σ²) and independent; and U_j and R_ij are independent, i.e., cov(U_j, R_ij) = 0.
21 Notes Regarding Random Effects ANOVA If τ² > 0, group differences exist. Instead of estimating N parameters for groups, we estimate a single parameter, τ². The variance of Y_ij is (τ² + σ²). Random effects ANOVA can be represented as a linear regression model with a random intercept.
22 Random Effects ANOVA As Linear Regression Model (no explanatory variables): Y_ij = β_0j + R_ij. The intercept: β_0j = γ + U_j, where γ is constant (the overall mean) and U_j is random, N(0, τ²); so β_0j is random with β_0j ~ N(γ, τ²). The U_j are level-2 residuals, the R_ij are level-1 residuals, and they are independent. This is the "empty" or "null" HLM.
23 Random Effects ANOVA: NELS88 Using SAS/MIXED with maximum likelihood estimation (Parameter, Estimate, std error): fixed effect γ; variance parameters: intercept τ², residual σ². R does not give the std error, but it gives Std Dev = √τ².
24 A Closer Look (Intuitive) Y_ij = β_0j + R_ij = γ + U_j + R_ij. Y_ij is random due to U_j and R_ij. Since U_j and R_ij are independent and normally distributed: (i) Y_ij ~ N(γ, (τ² + σ²)); (ii) between groups, cov(Y_ij, Y_i′j′) = 0 for j ≠ j′; e.g., Y_ij = γ + U_j + R_ij and Y_i′j′ = γ + U_j′ + R_i′j′ share no random terms.
25 A Closer Look (continued) Y_ij = β_0j + R_ij = γ + U_j + R_ij. Within groups, observations are dependent: Y_ij = γ + U_j + R_ij and Y_i′j = γ + U_j + R_i′j, where γ is fixed; U_j is random but takes the same value for both i and i′; and R_ij and R_i′j are random and take different values for i and i′.
26 Within-Group Dependency (formal)

cov(Y_ij, Y_i′j) ≡ E[(Y_ij − E(Y_ij))(Y_i′j − E(Y_i′j))]
= E[((γ + U_j + R_ij) − γ)((γ + U_j + R_i′j) − γ)]
= E[(U_j + R_ij)(U_j + R_i′j)]
= E[U_j² + U_j R_i′j + R_ij U_j + R_ij R_i′j]
= E[U_j²] + E[U_j R_i′j] + E[R_ij U_j] + E[R_ij R_i′j]
= E[U_j²] = E[(U_j − 0)²] ≡ var(U_j) = τ²
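A Monte Carlo sketch can confirm this derivation numerically; the values of γ, τ², and σ² below are arbitrary illustrative choices, not estimates from the NELS88 data:

```python
import random

# Simulate Y_ij = gamma + U_j + R_ij with U_j ~ N(0, tau2), R_ij ~ N(0, sigma2):
# the within-group covariance cov(Y_ij, Y_i'j) should be close to tau2,
# and the total variance close to tau2 + sigma2.
random.seed(1)
gamma, tau2, sigma2 = 50.0, 4.0, 9.0
J = 40_000  # number of simulated groups, two observations per group

y1, y2 = [], []
for _ in range(J):
    u = random.gauss(0.0, tau2 ** 0.5)                 # shared group effect U_j
    y1.append(gamma + u + random.gauss(0.0, sigma2 ** 0.5))
    y2.append(gamma + u + random.gauss(0.0, sigma2 ** 0.5))

m1 = sum(y1) / J
m2 = sum(y2) / J
cov_within = sum((a - m1) * (b - m2) for a, b in zip(y1, y2)) / J
var_total = sum((a - m1) ** 2 for a in y1) / J
print(round(cov_within, 2), round(var_total, 2))  # approx tau2 and tau2 + sigma2
```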
27 Intra-class Correlation Measure of within-group dependency:

ρ_I ≡ corr(Y_ij, Y_i′j) = cov(Y_ij, Y_i′j) / √(var(Y_ij) var(Y_i′j))
= τ² / (√(τ² + σ²) √(τ² + σ²))
= τ² / (τ² + σ²)
28 Intra-class Correlation: ρ_I = τ² / (τ² + σ²), assuming within-group dependency is the same in all groups. If τ² = 0, then we don't need HLM. It can also be used as a measure of how much variance is accounted for by grouping; e.g., in our NELS88 example, ρ̂_I = .30.
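The intraclass correlation is a one-line computation; the variance components below are made-up illustrative values, not the NELS88 estimates:

```python
# Intraclass correlation rho_I = tau2 / (tau2 + sigma2) for hypothetical
# variance components (illustrative only).
tau2 = 25.0    # between-group variance, var(U_j)
sigma2 = 75.0  # within-group (residual) variance, var(R_ij)

rho_I = tau2 / (tau2 + sigma2)
print(rho_I)  # 0.25: a quarter of the variance in Y lies between groups
```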
29 Matrix representation of Y_ij = γ + U_j + R_ij

(Y_11, Y_21, …, Y_23,1, Y_1,2, Y_2,2, …, Y_20,2, …, Y_1,10, …, Y_20,10)′ = 1γ + (U_1, U_1, …, U_1, U_2, U_2, …, U_2, …, U_10, …, U_10)′ + (R_11, R_21, …, R_23,1, R_1,2, R_2,2, …, R_20,2, …, R_1,10, …, R_20,10)′

where each U_j is repeated n_j times.
30 Matrix representation (continued)

(Y_1′, Y_2′, …, Y_10′)′ = (X_1′, X_2′, …, X_10′)′ β + (U_1′, U_2′, …, U_10′)′ + (R_1′, R_2′, …, R_10′)′

i.e., Y = Xβ + U + R
31 Matrix Representation (continued) For group j, the distribution of Y_j is Y_j ~ N_{n_j}(X_j β, Σ_j), where X_j = 1_j is an (n_j × 1) column vector with elements equal to one, and β = γ. The covariance matrix for the jth group is the (n_j × n_j) symmetric matrix Σ_j = σ² I_j + τ² 1_j 1_j′.
32 Covariance Matrix

Σ_j (n_j × n_j) = σ² I_j + τ² 1_j 1_j′ =

[ (τ² + σ²)   τ²          ⋯   τ²          ]
[ τ²          (τ² + σ²)   ⋯   τ²          ]
[ ⋮                       ⋱   ⋮           ]
[ τ²          τ²          ⋯   (τ² + σ²)   ]
33 Covariance Matrix (continued) Notes: Σ_j = σ² I_j + τ² 1_j 1_j′ has compound symmetry. The only difference between groups is the size of this matrix (i.e., (n_j × n_j)). For simplicity, drop the subscript from Σ.
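Constructing Σ_j is a one-liner; τ², σ², and n_j below are illustrative choices:

```python
# Build Sigma_j = sigma2 * I + tau2 * 1 1' for a group of size n_j,
# using hypothetical variance components (pure-Python nested lists).
tau2, sigma2, n_j = 4.0, 9.0, 5

Sigma_j = [[tau2 + (sigma2 if i == k else 0.0) for k in range(n_j)]
           for i in range(n_j)]

# Compound symmetry: every diagonal cell is tau2 + sigma2,
# every off-diagonal cell is tau2.
print(Sigma_j[0][0], Sigma_j[0][1])
```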
34 Covariance Matrix: Y For the whole data vector Y, the covariance matrix is of size (n_+ × n_+) and is block diagonal:

[ Σ (n_1 × n_1)   0               ⋯   0             ]
[ 0               Σ (n_2 × n_2)   ⋯   0             ]
[ ⋮                               ⋱   ⋮             ]
[ 0               0               ⋯   Σ (n_N × n_N) ]
35 Summary: Random Effects ANOVA The model isn't too interesting: there are no explanatory variables, and if we add variables to the regression, we only get a shift in the intercept.
36 Multiple Regression Intercepts and slopes as outcomes. Two stages. Stage 1: fit a multiple regression model to each group (level-1 analysis), i.e., summarize groups by their regression parameters. Stage 2: use the estimated intercept and slope coefficients as outcomes (level-2 analysis), i.e., analyze the summary statistics.
37 NELS88: Overall Regression Before we do the two-stage analysis: linear regression of math on time spent doing homework (figure: fitted math scores vs. time spent doing homework).
38 Individual Schools Data
39 Individual Schools Regressions (figure: separate regression of math scores on time spent doing homework for each school)
40 Two Stage Example: NELS88 Stage 1: within-group regressions. Y_ij = student's math score; x_ij = time spent doing homework. Proposition: the more time spent doing homework, the higher the math score. Appropriate model: (math)_ij = β_0j + β_1j (homew)_ij + R_ij, i.e., Y_ij = β_0j + β_1j x_ij + R_ij, where R_ij ~ N(0, σ²).
41 Stage One Model fit to each of the 10 NELS88 schools, with columns School ID, β̂_0j, β̂_1j, R.
42 Stage One (continued) Summary statistics for the estimates, with columns Parameter, n, mean, std dev, se, variance, for β̂_0j and β̂_1j.
43 Stage Two: NELS88 We use the estimated regression parameters as response variables and try to model them. We'll start with the intercepts. Example of a simple model for the β̂_0j's: β̂_0j = γ_0 + U_0j.
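The two-stage procedure can be sketched end to end in a few lines. The three "schools" below are made-up toy data (homework hours x, math score y), not the NELS88 values, and `ols()` is a hypothetical helper implementing simple-regression least squares:

```python
# Toy two-stage analysis: stage 1 fits a within-group regression per school;
# stage 2 fits the empty model beta0_hat_j = gamma0 + U_0j to the intercepts,
# whose OLS estimate of gamma0 is just the mean of the estimated intercepts.
data = {
    1: [(1.0, 48.0), (2.0, 52.0), (3.0, 55.0)],
    2: [(1.0, 42.0), (2.0, 47.0), (4.0, 53.0)],
    3: [(0.0, 45.0), (2.0, 49.0), (3.0, 50.0)],
}

def ols(points):
    """Simple-regression least squares: return (intercept b0, slope b1)."""
    n = len(points)
    xbar = sum(x for x, _ in points) / n
    ybar = sum(y for _, y in points) / n
    sxy = sum((x - xbar) * (y - ybar) for x, y in points)
    sxx = sum((x - xbar) ** 2 for x, _ in points)
    b1 = sxy / sxx
    return ybar - b1 * xbar, b1

# Stage 1: per-school intercepts and slopes.
stage1 = {j: ols(pts) for j, pts in data.items()}

# Stage 2 (empty model for the intercepts).
gamma0_hat = sum(b0 for b0, _ in stage1.values()) / len(stage1)
print(round(gamma0_hat, 3))
```

Note what this illustrates about the later critique: each school is reduced to two numbers, and the stage-2 mean ignores how many observations produced each β̂_0j.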
44 Stage Two: Intercepts The REG Procedure, Model: MODEL1, Dependent Variable: alpha. Analysis of Variance, with columns Source, DF, Sum of Squares, Mean Square, F, Pr > F: Model (0 df, SS = 0), Error, Corrected Total.
45 Stage Two: Intercepts Root MSE, R-Square, Dependent Mean, Adj R-Sq, Coeff Var. Parameter estimates, with columns Variable, DF, Estimate, Standard Error, t, Pr > |t|: Intercept, p < .0001. These are descriptive statistics.
46 A little more complex model Model for the level-2 analysis: β̂_0j = γ_00 + γ_01 (mean SES_j) + U_0j = γ_00 + γ_01 Z_j + U_0j, where U_0j ~ N(0, τ_0²).
47 SAS PROC REG
48 SAS PROC REG (intercepts) The REG Procedure, Model: MODEL2, Dependent Variable: alpha. Analysis of Variance, with columns Source, DF, Sum of Squares, Mean Square, F, Pr > F: Model, Error, Corrected Total.
49 SAS PROC REG (intercepts) Root MSE, R-Square, Dependent Mean, Adj R-Sq, Coeff Var. Parameter Estimates, with columns Variable, DF, Estimate, Standard Error, t, Pr > |t|: Intercept (p < .0001), meanses. Conclusions? Note regarding p-value decimal places.
50 Stage Two: Slopes If the estimated slopes differ over groups, the effect of time spent doing homework on math scores depends on which school a student is in. Simplest case (no explanatory variable), the model: β̂_1j = γ_10 + U_1j, where U_1j ~ N(0, τ_1²).
51 Slopes: No Explanatory Variables The REG Procedure, Model: MODEL1, Dependent Variable: beta. Analysis of Variance, with columns Source, DF, Sum of Squares, Mean Square, F, Pr > F: Model (0 df, SS = 0), Error, Corrected Total.
52 Slopes: No Explanatory Variables Root MSE, R-Square, Dependent Mean, Adj R-Sq, Coeff Var. Parameter Estimates, with columns Variable, DF, Estimate, Standard Error, t, Pr > |t|: Intercept. Descriptive statistics: see page 41.
53 Slopes: More Complex Model Proposition: the higher the SES, the smaller the effect of time spent doing homework on math achievement. Reasoning: schools with higher mean SES are in wealthier areas, have more money for schools, and are better schools. Diagram: Z moderating the x → y relationship.
54 SAS PROC REG
55 Slopes: SAS PROC REG Output The model: β̂_1j = γ_10 + γ_11 (mean SES)_j + U_1j, where U_1j ~ N(0, τ_1²). Dependent Variable: beta. Analysis of Variance, with columns Source, DF, Sum of Squares, Mean Square, F, Pr > F: Model, Error, Corrected Total.
56 Slopes: SAS PROC REG Output Root MSE, Dependent Mean, R-Square, Adj R-Sq. Parameter estimates, with columns Variable, DF, Estimate, Standard Error, t, Pr > |t|: Intercept, meanses. Conclusions? Problems with this analysis?
57 Possible Improvements To the two-stage analysis of NELS88: center SES; use multivariate (multiple) regression so that the intercepts and slopes are modeled simultaneously, which would give an estimate of τ_01, the covariance between U_0j and U_1j.
58 Serious problems remain Information is lost because regression coefficients are used to summarize groups. The number of observations per group is not taken into account when analyzing the estimated regression coefficients (in the second stage). Random variability is introduced by using estimated regression coefficients. How can model fit to the data be assessed? Standard errors (level 2) are estimated incorrectly.
59 A Simple Hierarchical Linear Model Level 1: Y_ij = β_0j + β_1j x_ij + R_ij, where R_ij ~ N(0, σ²) and independent; more compactly, Y_j = X_j β_j + R_j, where Y_j ~ N(X_j β_j, σ² I), i.i.d.
60 HLM: Level 2

Level 2:
β_0j = γ_00 + U_0j
β_1j = γ_10 + U_1j

where (U_0j, U_1j)′ ~ N( (0, 0)′, T ) with T = [ τ_0²  τ_01 ; τ_01  τ_1² ], i.i.d.; or more compactly, β_j = Z_j Γ + U_j, where U_j ~ N(0, T), i.i.d.
61 Linear Mixed Model Put the models for β_0j and β_1j into the level-1 model:

Y_ij = β_0j + β_1j x_ij + R_ij
     = (γ_00 + U_0j) + (γ_10 + U_1j) x_ij + R_ij
     = γ_00 + γ_10 x_ij  [fixed part]  +  U_0j + U_1j x_ij + R_ij  [random part]
62 Linear Mixed Model (continued) Given all the assumptions of the hierarchical model,

Y_ij ~ N( (γ_00 + γ_10 x_ij), var(Y_ij) ), i.i.d., where
var(Y_ij) = (τ_0² + 2 τ_01 x_ij + τ_1² x_ij²) + σ²

This is the Marginal Model for Y_ij.
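The marginal variance is easy to evaluate directly; the variance components below are illustrative, not NELS88 estimates. The point is that with a random slope, var(Y_ij) changes with x_ij:

```python
# Marginal variance of Y_ij in the linear mixed model,
# var(Y_ij) = tau0^2 + 2*tau01*x + tau1^2*x^2 + sigma2,
# evaluated at a few x values using hypothetical components.
tau0_sq, tau01, tau1_sq, sigma2 = 20.0, -1.5, 0.5, 60.0

def var_y(x):
    return tau0_sq + 2.0 * tau01 * x + tau1_sq * x ** 2 + sigma2

print([var_y(x) for x in (0.0, 1.0, 4.0)])  # variance depends on x
```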
63 In Matrix Form The model: Y_j = X_j Γ + Z_j U_j + R_j, where Y_j ~ N_{n_j}(X_j Γ, V_j), i.i.d., and the covariance matrix equals V_j = Z_j T Z_j′ + σ² I; later, for longitudinal data, V_j = Z_j T Z_j′ + Σ_j.
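Building V_j = Z_j T Z_j′ + σ² I for one group can be sketched in pure Python. The x values and variance components are illustrative, and `matmul` is a hypothetical helper:

```python
# Construct V_j = Z_j T Z_j' + sigma2 * I for one group of size 3.
xs = [0.0, 1.0, 3.0]                      # level-1 covariate values in group j
tau0_sq, tau01, tau1_sq, sigma2 = 20.0, -1.5, 0.5, 60.0
T = [[tau0_sq, tau01], [tau01, tau1_sq]]  # covariance of (U_0j, U_1j)

Z = [[1.0, x] for x in xs]                # random-effects design matrix Z_j

def matmul(A, B):
    """Plain list-of-lists matrix product A @ B."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

Zt = [list(col) for col in zip(*Z)]       # Z_j'
V = matmul(matmul(Z, T), Zt)              # Z_j T Z_j'
for i in range(len(xs)):                  # add sigma2 * I to the diagonal
    V[i][i] += sigma2

# Element (i, k) equals tau0^2 + tau01*(x_i + x_k) + tau1^2 * x_i * x_k,
# plus sigma2 on the diagonal, matching the scalar var(Y_ij) formula.
print(V[0][0], V[0][1])
```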
64 The Marginal Model The distinction between the hierarchical formulation (and linear mixed model) and the marginal model is important with respect to interpretation, estimation, and inference; but more on this later.
STAT763: Applied Regression Analysis Multiple linear regression 4.4 Hypothesis testing Chunsheng Ma E-mail: cma@math.wichita.edu 4.4.1 Significance of regression Null hypothesis (Test whether all β j =
More informationSAS Commands. General Plan. Output. Construct scatterplot / interaction plot. Run full model
Topic 23 - Unequal Replication Data Model Outline - Fall 2013 Parameter Estimates Inference Topic 23 2 Example Page 954 Data for Two Factor ANOVA Y is the response variable Factor A has levels i = 1, 2,...,
More informationMultivariate Regression (Chapter 10)
Multivariate Regression (Chapter 10) This week we ll cover multivariate regression and maybe a bit of canonical correlation. Today we ll mostly review univariate multivariate regression. With multivariate
More informationANALYSES OF NCGS DATA FOR ALCOHOL STATUS CATEGORIES 1 22:46 Sunday, March 2, 2003
ANALYSES OF NCGS DATA FOR ALCOHOL STATUS CATEGORIES 1 22:46 Sunday, March 2, 2003 The MEANS Procedure DRINKING STATUS=1 Analysis Variable : TRIGL N Mean Std Dev Minimum Maximum 164 151.6219512 95.3801744
More informationSTAT 350: Summer Semester Midterm 1: Solutions
Name: Student Number: STAT 350: Summer Semester 2008 Midterm 1: Solutions 9 June 2008 Instructor: Richard Lockhart Instructions: This is an open book test. You may use notes, text, other books and a calculator.
More informationChapter 12 - Lecture 2 Inferences about regression coefficient
Chapter 12 - Lecture 2 Inferences about regression coefficient April 19th, 2010 Facts about slope Test Statistic Confidence interval Hypothesis testing Test using ANOVA Table Facts about slope In previous
More informationNotation Survey. Paul E. Johnson 1 2 MLM 1 / Department of Political Science
MLM 1 / 52 Notation Survey Paul E. Johnson 1 2 1 Department of Political Science 2 Center for Research Methods and Data Analysis, University of Kansas 2015 MLM 2 / 52 Outline 1 Raudenbush and Bryk 2 Snijders
More informationMaster s Written Examination - Solution
Master s Written Examination - Solution Spring 204 Problem Stat 40 Suppose X and X 2 have the joint pdf f X,X 2 (x, x 2 ) = 2e (x +x 2 ), 0 < x < x 2
More informationStatistical Inference. Part IV. Statistical Inference
Part IV Statistical Inference As of Oct 5, 2017 Sampling Distributions of the OLS Estimator 1 Statistical Inference Sampling Distributions of the OLS Estimator Testing Against One-Sided Alternatives Two-Sided
More informationEconometrics I KS. Module 1: Bivariate Linear Regression. Alexander Ahammer. This version: March 12, 2018
Econometrics I KS Module 1: Bivariate Linear Regression Alexander Ahammer Department of Economics Johannes Kepler University of Linz This version: March 12, 2018 Alexander Ahammer (JKU) Module 1: Bivariate
More informationAnalysis of Covariance
Analysis of Covariance (ANCOVA) Bruce A Craig Department of Statistics Purdue University STAT 514 Topic 10 1 When to Use ANCOVA In experiment, there is a nuisance factor x that is 1 Correlated with y 2
More informationLinear models and their mathematical foundations: Simple linear regression
Linear models and their mathematical foundations: Simple linear regression Steffen Unkel Department of Medical Statistics University Medical Center Göttingen, Germany Winter term 2018/19 1/21 Introduction
More informationChapter 2: simple regression model
Chapter 2: simple regression model Goal: understand how to estimate and more importantly interpret the simple regression Reading: chapter 2 of the textbook Advice: this chapter is foundation of econometrics.
More informationAnswer Keys to Homework#10
Answer Keys to Homework#10 Problem 1 Use either restricted or unrestricted mixed models. Problem 2 (a) First, the respective means for the 8 level combinations are listed in the following table A B C Mean
More informationInference for Regression
Inference for Regression Section 9.4 Cathy Poliak, Ph.D. cathy@math.uh.edu Office in Fleming 11c Department of Mathematics University of Houston Lecture 13b - 3339 Cathy Poliak, Ph.D. cathy@math.uh.edu
More informationCOMPREHENSIVE WRITTEN EXAMINATION, PAPER III FRIDAY AUGUST 26, 2005, 9:00 A.M. 1:00 P.M. STATISTICS 174 QUESTION
COMPREHENSIVE WRITTEN EXAMINATION, PAPER III FRIDAY AUGUST 26, 2005, 9:00 A.M. 1:00 P.M. STATISTICS 174 QUESTION Answer all parts. Closed book, calculators allowed. It is important to show all working,
More informationStat 579: Generalized Linear Models and Extensions
Stat 579: Generalized Linear Models and Extensions Linear Mixed Models for Longitudinal Data Yan Lu April, 2018, week 15 1 / 38 Data structure t1 t2 tn i 1st subject y 11 y 12 y 1n1 Experimental 2nd subject
More informationHomework 2: Simple Linear Regression
STAT 4385 Applied Regression Analysis Homework : Simple Linear Regression (Simple Linear Regression) Thirty (n = 30) College graduates who have recently entered the job market. For each student, the CGPA
More informationSTAT 512 MidTerm I (2/21/2013) Spring 2013 INSTRUCTIONS
STAT 512 MidTerm I (2/21/2013) Spring 2013 Name: Key INSTRUCTIONS 1. This exam is open book/open notes. All papers (but no electronic devices except for calculators) are allowed. 2. There are 5 pages in
More informationCAS MA575 Linear Models
CAS MA575 Linear Models Boston University, Fall 2013 Midterm Exam (Correction) Instructor: Cedric Ginestet Date: 22 Oct 2013. Maximal Score: 200pts. Please Note: You will only be graded on work and answers
More informationF9 F10: Autocorrelation
F9 F10: Autocorrelation Feng Li Department of Statistics, Stockholm University Introduction In the classic regression model we assume cov(u i, u j x i, x k ) = E(u i, u j ) = 0 What if we break the assumption?
More informationStatistical Inference with Regression Analysis
Introductory Applied Econometrics EEP/IAS 118 Spring 2015 Steven Buck Lecture #13 Statistical Inference with Regression Analysis Next we turn to calculating confidence intervals and hypothesis testing
More informationRecent Advances in the Field of Trade Theory and Policy Analysis Using Micro-Level Data
Recent Advances in the Field of Trade Theory and Policy Analysis Using Micro-Level Data July 2012 Bangkok, Thailand Cosimo Beverelli (World Trade Organization) 1 Content a) Classical regression model b)
More informationECON3150/4150 Spring 2016
ECON3150/4150 Spring 2016 Lecture 6 Multiple regression model Siv-Elisabeth Skjelbred University of Oslo February 5th Last updated: February 3, 2016 1 / 49 Outline Multiple linear regression model and
More informationIntroduction to the Analysis of Hierarchical and Longitudinal Data
Introduction to the Analysis of Hierarchical and Longitudinal Data Georges Monette, York University with Ye Sun SPIDA June 7, 2004 1 Graphical overview of selected concepts Nature of hierarchical models
More informationHierarchical Generalized Linear Models. ERSH 8990 REMS Seminar on HLM Last Lecture!
Hierarchical Generalized Linear Models ERSH 8990 REMS Seminar on HLM Last Lecture! Hierarchical Generalized Linear Models Introduction to generalized models Models for binary outcomes Interpreting parameter
More informationSection 9c. Propensity scores. Controlling for bias & confounding in observational studies
Section 9c Propensity scores Controlling for bias & confounding in observational studies 1 Logistic regression and propensity scores Consider comparing an outcome in two treatment groups: A vs B. In a
More informationLecture 4 Multiple linear regression
Lecture 4 Multiple linear regression BIOST 515 January 15, 2004 Outline 1 Motivation for the multiple regression model Multiple regression in matrix notation Least squares estimation of model parameters
More informationTopic 20: Single Factor Analysis of Variance
Topic 20: Single Factor Analysis of Variance Outline Single factor Analysis of Variance One set of treatments Cell means model Factor effects model Link to linear regression using indicator explanatory
More informationdf=degrees of freedom = n - 1
One sample t-test test of the mean Assumptions: Independent, random samples Approximately normal distribution (from intro class: σ is unknown, need to calculate and use s (sample standard deviation)) Hypotheses:
More informationHomoskedasticity. Var (u X) = σ 2. (23)
Homoskedasticity How big is the difference between the OLS estimator and the true parameter? To answer this question, we make an additional assumption called homoskedasticity: Var (u X) = σ 2. (23) This
More informationSimple Regression Model Setup Estimation Inference Prediction. Model Diagnostic. Multiple Regression. Model Setup and Estimation.
Statistical Computation Math 475 Jimin Ding Department of Mathematics Washington University in St. Louis www.math.wustl.edu/ jmding/math475/index.html October 10, 2013 Ridge Part IV October 10, 2013 1
More informationLecture 10 Multiple Linear Regression
Lecture 10 Multiple Linear Regression STAT 512 Spring 2011 Background Reading KNNL: 6.1-6.5 10-1 Topic Overview Multiple Linear Regression Model 10-2 Data for Multiple Regression Y i is the response variable
More informationLecture 3 Linear random intercept models
Lecture 3 Linear random intercept models Example: Weight of Guinea Pigs Body weights of 48 pigs in 9 successive weeks of follow-up (Table 3.1 DLZ) The response is measures at n different times, or under
More informationSAS Syntax and Output for Data Manipulation:
CLP 944 Example 5 page 1 Practice with Fixed and Random Effects of Time in Modeling Within-Person Change The models for this example come from Hoffman (2015) chapter 5. We will be examining the extent
More informationIntroduction and Background to Multilevel Analysis
Introduction and Background to Multilevel Analysis Dr. J. Kyle Roberts Southern Methodist University Simmons School of Education and Human Development Department of Teaching and Learning Background and
More informationLecture 2. The Simple Linear Regression Model: Matrix Approach
Lecture 2 The Simple Linear Regression Model: Matrix Approach Matrix algebra Matrix representation of simple linear regression model 1 Vectors and Matrices Where it is necessary to consider a distribution
More informationIntroduction to Within-Person Analysis and RM ANOVA
Introduction to Within-Person Analysis and RM ANOVA Today s Class: From between-person to within-person ANOVAs for longitudinal data Variance model comparisons using 2 LL CLP 944: Lecture 3 1 The Two Sides
More informationLinear Regression. In this problem sheet, we consider the problem of linear regression with p predictors and one intercept,
Linear Regression In this problem sheet, we consider the problem of linear regression with p predictors and one intercept, y = Xβ + ɛ, where y t = (y 1,..., y n ) is the column vector of target values,
More informationWeighted Least Squares
Weighted Least Squares The standard linear model assumes that Var(ε i ) = σ 2 for i = 1,..., n. As we have seen, however, there are instances where Var(Y X = x i ) = Var(ε i ) = σ2 w i. Here w 1,..., w
More informationChapter 2 Inferences in Simple Linear Regression
STAT 525 SPRING 2018 Chapter 2 Inferences in Simple Linear Regression Professor Min Zhang Testing for Linear Relationship Term β 1 X i defines linear relationship Will then test H 0 : β 1 = 0 Test requires
More informationDesigning Multilevel Models Using SPSS 11.5 Mixed Model. John Painter, Ph.D.
Designing Multilevel Models Using SPSS 11.5 Mixed Model John Painter, Ph.D. Jordan Institute for Families School of Social Work University of North Carolina at Chapel Hill 1 Creating Multilevel Models
More informationMultivariate Regression: Part I
Topic 1 Multivariate Regression: Part I ARE/ECN 240 A Graduate Econometrics Professor: Òscar Jordà Outline of this topic Statement of the objective: we want to explain the behavior of one variable as a
More informationMS&E 226: Small Data
MS&E 226: Small Data Lecture 15: Examples of hypothesis tests (v5) Ramesh Johari ramesh.johari@stanford.edu 1 / 32 The recipe 2 / 32 The hypothesis testing recipe In this lecture we repeatedly apply the
More informationMATH 644: Regression Analysis Methods
MATH 644: Regression Analysis Methods FINAL EXAM Fall, 2012 INSTRUCTIONS TO STUDENTS: 1. This test contains SIX questions. It comprises ELEVEN printed pages. 2. Answer ALL questions for a total of 100
More information