Multilevel Modeling: A Second Course


Kristopher Preacher, Ph.D.
Upcoming Seminar: February 2-3, 2017, Ft. Myers, Florida

What this workshop will accomplish: I will review the basics of multilevel modeling (MLM) to bring everyone up to speed. I will then discuss some more advanced ideas and techniques and show how to implement them in Mplus. MSEM will then be explored in depth, with an emphasis on practical application using Mplus. By the end, you will be able to develop and run your own models by adapting and combining the example syntax provided in the online workshop materials.

What this workshop isn't: This is not a basic Mplus workshop. We will cover some Mplus basics, but not everything; only those features that I have found the most useful for implementing MSEM. This is also not an introductory SEM or MLM course. I will cover basic topics as they pertain directly to multilevel SEM, but if you are a complete newcomer to either SEM or MLM, I recommend that you supplement this workshop with books or workshops dedicated expressly to those topics.

Review of Multilevel Modeling (MLM): Background

Key assumptions of (single-level) multiple linear regression (MLR):
- Correct specification of the relationship between the IVs and the DV
- Inclusion of the important IVs in the model
- Perfect reliability of the IVs
- Constant variance of errors* (homoscedasticity)
- Independence of errors*
- Normality of errors*

*Errors are unobserved population quantities; residuals are the corresponding observable sample quantities. If the relationship between the IVs and the DV is correctly specified, then e represents random error:

\[ e_i \overset{iid}{\sim} N(0, \sigma^2_e) \]

Clustered data commonly occur when cases are nested in ways that enhance the probability of similar responses. Examples:

  Level 1             nested within  Level 2
  students                           classrooms
  corporations                       nations
  spouses                            couples
  voters                             districts
  repeated measures                  individuals

Clustered data violate the assumption of independence, often leading to biased standard errors.

Ignoring violations of assumptions caused by nesting means we operate under the assumption that we have more information than we really do: in clustered data, sampling units no longer yield unique information. Nesting can be a nuisance from a statistical point of view, but the dependence can also be a source of great substantive interest.
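The information loss caused by dependence can be quantified with the standard design-effect formula, deff = 1 + (m - 1) x ICC, where m is the cluster size. This formula is not in the slides; it is a common companion result, and the function names below are invented for illustration.

```python
def design_effect(cluster_size, icc):
    """Kish design effect: factor by which clustering inflates the
    sampling variance of a mean, relative to a simple random sample."""
    return 1 + (cluster_size - 1) * icc

def effective_n(n_total, cluster_size, icc):
    """Effective sample size after accounting for clustering."""
    return n_total / design_effect(cluster_size, icc)

# 1,000 students in classrooms of 25 with a modest ICC of .10 carry
# roughly the information of only ~294 independent students.
deff = design_effect(25, 0.10)
n_eff = effective_n(1000, 25, 0.10)
```

With an ICC of zero (or clusters of size 1), the design effect is 1 and clustering costs nothing; as either grows, the effective sample size shrinks quickly.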

Multilevel modeling (MLM) is an elaboration of multiple regression designed for use with clustered data. It is also known as hierarchical linear modeling (HLM), random coefficient modeling, contextual analysis, mixed linear modeling, and mixed effects modeling, and is very popular in psychology, education, organizational research, and public health.

MLM treats clusters as if they are sampled from a larger population of clusters, enhancing the generalizability of results. In MLM, regression intercepts and/or slopes are assumed to have a particular distribution across clusters, summarized by a limited set of parameters (e.g., mean and variance). MLM extends single-level regression by treating these intercepts and slopes as dependent variables in their own right. In MLM, the dependent variable (\(y_{ij}\)) is always measured at the lowest level, Level 1.

For example, whereas the single-level regression model might assume that all cases come from a population with the same intercept \(\beta_0\):

\[ y_i = \beta_0 + \beta_1 x_i + e_i, \quad e_i \sim N(0, \sigma^2_e) \]

...MLM permits individual level-2 units to have their own intercepts \(\beta_{0j}\). Rather than estimate each one individually, we assume a distributional form for \(\beta_{0j}\), as we do for \(e_{ij}\):

\[ y_{ij} = \beta_{0j} + \beta_1 x_{ij} + e_{ij}, \quad e_{ij} \sim N(0, \sigma^2_e), \quad \beta_{0j} \sim N(\gamma_{00}, \tau_{00}) \]

Advantages of MLM:
- Parsimony. Compared to a fixed effects approach (including J - 1 separate level-2 effects of cluster identifiers using dummy codes), we simply estimate a mean and a variance for an effect across the population of level-2 units.
- Generalizability. Because the effect is treated as random rather than fixed, we can generalize results to the population of level-2 units rather than only to those sampled.
- Appropriateness. Multilevel models often conform more closely to theoretical predictions than do other approaches.
- Flexibility. Rather than treating clusters as nuisances that violate assumptions, by modeling nestedness we can examine both level-1 and level-2 effects.

Review of Multilevel Modeling (MLM): Equations

There are two popular ways to represent MLM in equation form:
- Matrix expression: The model is expressed for a typical level-2 unit (cluster), with the level-1 units assembled into a vector.
- Scalar expression: The model is expressed for a typical level-1 unit and a typical level-2 unit.

We will use the scalar expression here.

Scalar expression

Level-1 equation:

\[ y_{ij} = \beta_{0j} + \beta_{1j} x_{1ij} + \beta_{2j} x_{2ij} + e_{ij} \]

Level-2 equations:

\[ \beta_{0j} = \gamma_{00} + \gamma_{01} w_{1j} + u_{0j} \]
\[ \beta_{1j} = \gamma_{10} + \gamma_{11} w_{1j} + u_{1j} \]
\[ \beta_{2j} = \gamma_{20} + \gamma_{21} w_{1j} + u_{2j} \]

Combined (reduced) form:

\[ y_{ij} = \gamma_{00} + \gamma_{01} w_{1j} + \gamma_{10} x_{1ij} + \gamma_{11} w_{1j} x_{1ij} + \gamma_{20} x_{2ij} + \gamma_{21} w_{1j} x_{2ij} + u_{0j} + u_{1j} x_{1ij} + u_{2j} x_{2ij} + e_{ij} \]

Using scalar expressions, the relationship of MLM to multiple linear regression (MLR) becomes clear: MLM can be thought of as a form of MLR in which the intercept and slopes are potentially random. We assume these coefficients are normally distributed, and estimate the means, variances, and covariances of their joint distribution as parameters:

\[ \begin{bmatrix} u_{0j} \\ u_{1j} \\ u_{2j} \end{bmatrix} \sim MVN\left( \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix}, \begin{bmatrix} \tau_{00} & & \\ \tau_{10} & \tau_{11} & \\ \tau_{20} & \tau_{21} & \tau_{22} \end{bmatrix} \right), \qquad e_{ij} \sim N(0, \sigma^2_e) \]
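The substitution that turns the level-1 and level-2 equations into the combined form can be checked numerically: compute y once by plugging the level-2 equations into the level-1 equation, and once from the reduced form, and confirm the two routes agree. The gamma values below are arbitrary illustrations.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical fixed-effect values, purely for illustration
g = {"00": 1.0, "01": 0.5, "10": 2.0, "11": -0.3, "20": 0.7, "21": 1.1}

def level2(w1, u0, u1, u2):
    """Level-2 equations: each level-1 coefficient is itself an outcome."""
    b0 = g["00"] + g["01"] * w1 + u0
    b1 = g["10"] + g["11"] * w1 + u1
    b2 = g["20"] + g["21"] * w1 + u2
    return b0, b1, b2

def level1(b0, b1, b2, x1, x2, e):
    """Level-1 equation."""
    return b0 + b1 * x1 + b2 * x2 + e

def reduced(w1, x1, x2, u0, u1, u2, e):
    """Combined (reduced-form) equation."""
    return (g["00"] + g["01"] * w1 + g["10"] * x1 + g["11"] * w1 * x1
            + g["20"] * x2 + g["21"] * w1 * x2
            + u0 + u1 * x1 + u2 * x2 + e)

# The two routes give the same prediction for any inputs
w1, x1, x2, u0, u1, u2, e = rng.normal(size=7)
print(np.isclose(level1(*level2(w1, u0, u1, u2), x1, x2, e),
                 reduced(w1, x1, x2, u0, u1, u2, e)))  # True
```

Note where each product term in the reduced form comes from: every gamma-by-w term multiplied by an x is a cross-level interaction created purely by substitution.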

The simplest multilevel model: Random effects ANOVA

Level-1 model:
\[ y_{ij} = \beta_{0j} + e_{ij}, \quad e_{ij} \sim N(0, \sigma^2_e) \]

Level-2 model:
\[ \beta_{0j} = \gamma_{00} + u_{0j}, \quad u_{0j} \sim N(0, \tau_{00}) \]

The reduced-form equation for the random effects ANOVA, with fixed component \(\gamma_{00}\) and random components \(u_{0j}\) and \(e_{ij}\):
\[ y_{ij} = \gamma_{00} + u_{0j} + e_{ij} \]

The between- and within-cluster variances sum to the observed variance:
\[ \hat{\tau}_{00} + \hat{\sigma}^2_e = \hat{\sigma}^2_y \]

\(\beta_{0j}\) is treated as a random coefficient because (a) we are interested not in individual intercepts but rather in their mean (\(\gamma_{00}\)) and variance (\(\tau_{00}\)), and (b) we want to generalize results beyond the particular groups in our study.

The intraclass correlation (ICC) is the proportion of variance in y that is across level-2 units:
\[ ICC = \frac{\hat{\tau}_{00}}{\hat{\tau}_{00} + \hat{\sigma}^2_e} \]

The ICC is sometimes used to decide whether MLM would be worthwhile.
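The ICC formula translates directly into code. A minimal sketch; the variance values in the example are made up for illustration.

```python
def icc(tau00, sigma2_e):
    """Intraclass correlation: the proportion of outcome variance
    that lies between level-2 units."""
    return tau00 / (tau00 + sigma2_e)

# Example: between-cluster variance 4, within-cluster variance 16
print(icc(4.0, 16.0))  # 0.2
```

An ICC of 0.2 means a fifth of the variance in y is attributable to cluster membership; even ICCs of .05 or so can badly distort standard errors when clusters are large.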

We can expand the random effects ANOVA model to include level-1 predictors. The result is a random effects ANOVA model with covariates (i.e., RANCOVA):

\[ y_{ij} = \beta_{0j} + \beta_{1j} x_{ij} + e_{ij} \]
\[ \beta_{0j} = \gamma_{00} + u_{0j} \]
\[ \beta_{1j} = \gamma_{10} \]

The effect of x is constrained to be the same across all level-2 units; in other words, \(\tau_{11} = 0\) (where \(\tau_{11}\) is the slope variance). The intercept variance (\(\tau_{00}\)), on the other hand, is allowed to be nonzero. The reduced-form equation is:

\[ y_{ij} = \gamma_{00} + \gamma_{10} x_{ij} + u_{0j} + e_{ij} \]

The level-1 variance \(\sigma^2_e\) is now the residual variance after adjusting for the predictor x. The level-2 variance \(\tau_{00}\) is the group-level residual variance.

We could expand the RANCOVA model to include more level-1 predictors or level-2 predictors. For example:

\[ y_{ij} = \beta_{0j} + \beta_{1j} x_{1ij} + \beta_{2j} x_{2ij} + \beta_{3j} x_{3ij} + e_{ij} \]
\[ \beta_{0j} = \gamma_{00} + \gamma_{01} w_{1j} + \gamma_{02} w_{2j} + u_{0j} \]
\[ \beta_{1j} = \gamma_{10}; \qquad \beta_{2j} = \gamma_{20}; \qquad \beta_{3j} = \gamma_{30} \]
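To see that the RANCOVA reduced form is just a regression with a cluster-specific error component, we can simulate it and recover the parameters by the method of moments. This is a plain-NumPy sketch with invented true values; in practice one would fit the model in Mplus or similar MLM software, which also produces correct standard errors.

```python
import numpy as np

rng = np.random.default_rng(42)
J, n = 200, 25                        # clusters and cluster size (illustrative)
gamma00, gamma10 = 3.0, 1.5           # true fixed effects (assumed)
tau00, sigma2_e = 2.0, 1.0            # true variance components (assumed)

u0 = rng.normal(0.0, np.sqrt(tau00), J)          # random intercepts u_0j
x = rng.normal(size=(J, n))
e = rng.normal(0.0, np.sqrt(sigma2_e), (J, n))
y = gamma00 + gamma10 * x + u0[:, None] + e      # RANCOVA reduced form

# Pooled OLS gives consistent point estimates of the fixed effects here;
# only its standard errors would be wrong under clustering.
X = np.column_stack([np.ones(J * n), x.ravel()])
g00_hat, g10_hat = np.linalg.lstsq(X, y.ravel(), rcond=None)[0]

# Moment estimates of the variance components: the average within-cluster
# residual variance estimates sigma2_e, and the variance of cluster-mean
# residuals estimates tau00 + sigma2_e / n.
resid = y - g00_hat - g10_hat * x
sigma2_hat = resid.var(axis=1, ddof=1).mean()
tau00_hat = resid.mean(axis=1).var(ddof=1) - sigma2_hat / n
```

The estimates land near the assumed values (3.0, 1.5, 1.0, 2.0), illustrating that \(\tau_{00}\) is exactly the "group-level residual variance" described above.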

In principle, we can have any number of level-1 predictors and any mix of fixed and random coefficients. For example:

\[ y_{ij} = \beta_{0j} + \beta_{1j} x_{1ij} + \beta_{2j} x_{2ij} + \beta_{3j} x_{3ij} + \beta_{4j} x_{4ij} + \beta_{5j} x_{5ij} + \beta_{6j} x_{6ij} + \beta_{7j} x_{7ij} + e_{ij} \]
\[ \beta_{kj} = \gamma_{k0} + u_{kj}, \quad k = 0, \ldots, 6; \qquad \beta_{7j} = \gamma_{70} \]
\[ e_{ij} \sim N(0, \sigma^2_e), \qquad \begin{bmatrix} u_{0j} \\ \vdots \\ u_{6j} \end{bmatrix} \sim MVN(\mathbf{0}, \mathbf{T}) \]

where T is the 7 x 7 covariance matrix with elements \(\tau_{00}\) through \(\tau_{66}\). All these slopes have the same interpretations as in single-level regression.

Random effects can exist at any level except the highest level. For example, level-1 slopes can vary randomly across level-2 units, but in a two-level model there are no level-3 units for level-2 slopes to vary across:

\[ y_{ij} = \beta_{0j} + \beta_{1j} x_{ij} + e_{ij} \]
\[ \beta_{0j} = \gamma_{00} + \gamma_{01} w_{j} + u_{0j} \]
\[ \beta_{1j} = \gamma_{10} + \gamma_{11} w_{j} + u_{1j} \]

Model building strategies

There are three common strategies for model fitting in MLM:
- A priori specification. Decide on the model based exclusively on theory.
- Build-up strategy. Add predictors and random effects one or a few at a time, and retain them if they are significant and/or meaningfully large.
- Tear-down strategy. Start with a maximally complex model, and remove predictors and random effects that are not significant or large.

A priori specification is the most scientifically defensible; the build-up strategy is perhaps the most common in practice.

For example, using the build-up strategy, we might progress through these stages, each time examining the parameter estimates, statistical significance, differences in model fit, and changes in explained variance:

1. \( y_{ij} = \beta_{0j} + e_{ij}; \quad \beta_{0j} = \gamma_{00} + u_{0j} \)
2. \( y_{ij} = \beta_{0j} + \beta_{1j} x_{1ij} + \beta_{2j} x_{2ij} + e_{ij}; \quad \beta_{0j} = \gamma_{00} + u_{0j}; \; \beta_{1j} = \gamma_{10}; \; \beta_{2j} = \gamma_{20} \)
3. Same level-1 model; \( \beta_{0j} = \gamma_{00} + u_{0j}; \; \beta_{1j} = \gamma_{10} + u_{1j}; \; \beta_{2j} = \gamma_{20} + u_{2j} \)
4. Same level-1 model; \( \beta_{0j} = \gamma_{00} + \gamma_{01} w_{1j} + \gamma_{02} w_{2j} + u_{0j}; \; \beta_{1j} = \gamma_{10} + \gamma_{11} w_{1j} + u_{1j}; \; \beta_{2j} = \gamma_{20} + \gamma_{21} w_{1j} + \gamma_{22} w_{2j} + u_{2j} \)

Expanding the number of random coefficients rapidly expands the size of the tau (T) matrix, increasing the risk of estimation problems and lowering parsimony:

- Random intercept only: \( u_{0j} \sim N(0, \tau_{00}) \)
- Random intercept and slope: \( \begin{bmatrix} u_{0j} \\ u_{1j} \end{bmatrix} \sim MVN\left(\mathbf{0}, \begin{bmatrix} \tau_{00} & \\ \tau_{10} & \tau_{11} \end{bmatrix}\right) \)
- Random intercept and 2 slopes: a 3 x 3 T matrix with elements \(\tau_{00}\) through \(\tau_{22}\)
- Random intercept and 4 slopes: a 5 x 5 T matrix with elements \(\tau_{00}\) through \(\tau_{44}\)

Centering refers to subtracting a value from raw data; this value is usually the grand mean, the group mean, or a reference value of particular interest. Reasons why centering is used:
- to facilitate interpretation
- to avoid multicollinearity
- to separate effects into within- and between-cluster components

Centering is widely recommended in a variety of situations.
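The growth of T can be counted exactly: with q random coefficients, the symmetric T matrix contains q(q + 1)/2 unique variances and covariances. A one-line helper (the function name is invented):

```python
def n_tau_params(n_random_effects):
    """Number of unique variances and covariances in the tau (T)
    matrix for q random coefficients: q * (q + 1) / 2."""
    q = n_random_effects
    return q * (q + 1) // 2

# Random intercept only; intercept + 1 slope; + 2 slopes; + 4 slopes:
print([n_tau_params(q) for q in (1, 2, 3, 5)])  # [1, 3, 6, 15]
```

Going from a random intercept to a random intercept plus four random slopes multiplies the number of (co)variance parameters by fifteen, which is why estimation problems appear so quickly.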

Centering: Consequences for parameter estimation

Intercepts are interpreted as the predicted value of y where all predictors = 0. In models with no higher-order terms (squared terms, products, etc.), centering changes only the intercept, leaving slopes unaffected. In models with higher-order terms, centering affects all terms except the highest-order terms.

Grand mean centering: subtracting the grand mean \(\bar{x}_{..}\) from all x scores, yielding \(x_{ij} - \bar{x}_{..}\). A centered score of 0 represents the average level-1 unit's \(x_{ij}\) score. \(\gamma_{00}\) now represents the mean of the predicted ys at the grand mean of \(x_{ij}\), and \(\tau_{00}\) is now the variance of the predicted ys at the grand mean of \(x_{ij}\).
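Grand mean centering is a one-line operation. A minimal sketch; the small data vector is invented for illustration.

```python
import numpy as np

x = np.array([4.0, 6.0, 5.0, 9.0, 7.0, 5.0])

x_grand = x - x.mean()   # grand-mean-centered scores
# The centered scores have mean zero, so "x = 0" now refers to the
# average level-1 unit rather than to a raw score of zero.
print(x.mean())          # 6.0
print(x_grand.mean())    # 0.0
```

Because the same constant is subtracted from every score, grand mean centering leaves all variances and covariances of x untouched; only the meaning of the zero point (and hence the intercept) changes.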

An example model with a random intercept and random slope, using grand mean centering:

\[ y_{ij} = \beta_{0j} + \beta_{1j} (x_{ij} - \bar{x}_{..}) + e_{ij} \]
\[ \beta_{0j} = \gamma_{00} + u_{0j}; \qquad \beta_{1j} = \gamma_{10} + u_{1j} \]
\[ y_{ij} = \gamma_{00} + \gamma_{10} (x_{ij} - \bar{x}_{..}) + u_{1j} (x_{ij} - \bar{x}_{..}) + u_{0j} + e_{ij} \]

Group mean centering: subtracting the group mean from all xs in each group, yielding \(x_{ij} - \bar{x}_{.j}\). \(\bar{x}_{.j}\) represents each level-2 unit's average \(x_{ij}\) score; thus \(x_{ij} - \bar{x}_{.j}\) has no level-2 variance. \(\gamma_{00}\) still represents the mean of the predicted ys at the grand mean of \(x_{ij}\), and \(\tau_{00}\) is the variance of the predicted ys at the group means of \(x_{ij}\).
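Group mean centering subtracts a different constant in each cluster, so, unlike grand mean centering, it removes all between-cluster variance from x. A minimal sketch with invented data:

```python
import numpy as np

g = np.array([1, 1, 1, 2, 2, 2])               # cluster IDs (illustrative)
x = np.array([4.0, 6.0, 5.0, 9.0, 7.0, 5.0])

# Each observation's own group mean, then the group-mean-centered score
group_mean = np.array([x[g == j].mean() for j in g])
x_within = x - group_mean

# Group 1's mean is 5.0 and group 2's is 7.0; after centering, every
# cluster's mean is 0, so x_within carries no level-2 variance.
```

Note that x_within and group_mean together contain exactly the information in x, which is what makes "adding the mean back" as a level-2 predictor possible.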

The same example model with a random intercept and random slope, using group mean centering:

\[ y_{ij} = \beta_{0j} + \beta_{1j} (x_{ij} - \bar{x}_{.j}) + e_{ij} \]
\[ \beta_{0j} = \gamma_{00} + u_{0j}; \qquad \beta_{1j} = \gamma_{10} + u_{1j} \]
\[ y_{ij} = \gamma_{00} + \gamma_{10} (x_{ij} - \bar{x}_{.j}) + u_{1j} (x_{ij} - \bar{x}_{.j}) + u_{0j} + e_{ij} \]

Adding the mean back: group mean centering while also using the group means as level-2 predictors lets us separate \(x_{ij}\), and thus \(x_{ij}\)'s effect, into within and between components:

\[ y_{ij} = \beta_{0j} + \beta_{1j} (x_{ij} - \bar{x}_{.j}) + e_{ij} \]
\[ \beta_{0j} = \gamma_{00} + \gamma_{01} \bar{x}_{.j} + u_{0j}; \qquad \beta_{1j} = \gamma_{10} \]
\[ y_{ij} = \gamma_{00} + \gamma_{10} (x_{ij} - \bar{x}_{.j}) + \gamma_{01} \bar{x}_{.j} + u_{0j} + e_{ij} \]

Here \(\gamma_{10}\) is the within effect and \(\gamma_{01}\) is the between effect. However, it has been shown that the estimated effect of \(\bar{x}_{.j}\) is biased toward that of \((x_{ij} - \bar{x}_{.j})\), especially with small clusters and low ICC (Lüdtke et al., 2008).

Centering y

In principle, we could split y into between and within components as well. A level-1 variable can be thought of as the sum of between-cluster and within-cluster components:

\[ y_{ij} = \bar{y}_{.j} + (y_{ij} - \bar{y}_{.j}) \]

...which seems a bit silly at first, but it is a meaningful split: \((y_{ij} - \bar{y}_{.j})\) varies only within clusters, whereas \(\bar{y}_{.j}\) varies only between clusters.

Consider individual differences in student achievement within vs. between classrooms: \((y_{ij} - \bar{y}_{.j})\) is a student's achievement relative to his or her peers, whereas \(\bar{y}_{.j}\) corresponds to the classroom average achievement. We will see that it can be interesting to apply different models to \((y_{ij} - \bar{y}_{.j})\) and \(\bar{y}_{.j}\).
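The between/within split of y can be verified numerically. The classroom scores below are invented; the sketch also checks the standard sum-of-squares identity that the split implies (a well-known result, though not stated on the slide).

```python
import numpy as np

g = np.array([1, 1, 1, 2, 2, 2])                     # classroom IDs
y = np.array([70.0, 80.0, 90.0, 55.0, 60.0, 65.0])   # achievement scores

grand = y.mean()
y_between = np.array([y[g == j].mean() for j in g])  # classroom means
y_within = y - y_between                             # score relative to peers

# The identity y_ij = ybar_j + (y_ij - ybar_j) reconstructs y exactly,
# and the components are orthogonal: total SS = between SS + within SS.
ss_total = ((y - grand) ** 2).sum()
ss_split = ((y_between - grand) ** 2).sum() + (y_within ** 2).sum()
print(ss_total, ss_split)   # 850.0 850.0
```

The orthogonality of the two pieces is what makes it coherent to model \((y_{ij} - \bar{y}_{.j})\) and \(\bar{y}_{.j}\) separately.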