An Introduction to Multilevel Models. PSYC 943 (930): Fundamentals of Multivariate Modeling Lecture 25: December 7, 2012

Today's Class
- Concepts in Longitudinal Modeling
- Between-Person vs. +Within-Person Models
- Repeated Measures ANOVA as MLM
- Introduction to Multilevel Models
- Fixed and Random Effects of Time and Persons
- Families of Models for Change

Dimensions for Organizing Models
- What kind of outcome variable?
  - Normal/continuous? General linear models (GLM)
  - Non-normal/categorical? Generalized linear models
- What kind of predictors? (names relevant only within GLM)
  - Continuous predictors? Regression
  - Categorical predictors? ANOVA
- How many dimensions of sampling are in your data?
  - How many ways do your observations differ from each other?
  - What kind of dependency or correlation is in your data?
- Multilevel models are used to quantify and predict dependency in an outcome related to different dimensions of sampling.

What is a Multilevel Model (MLM)?
- Same as other terms you have heard of:
  - General Linear Mixed Model (if you are from statistics): Mixed = Fixed and Random effects
  - Random Coefficients Model (also if you are from statistics): Random coefficients = Random effects
  - Hierarchical Linear Model (if you are from education): not the same as hierarchical regression
- Special cases of MLM:
  - Random Effects ANOVA or Repeated Measures ANOVA
  - (Latent) Growth Curve Model (where "latent" implies SEM)
  - Within-Person Variation Model (e.g., for daily diary data)
  - Clustered/Nested Observations Model (e.g., for kids in schools)
  - Cross-Classified Models (e.g., "value-added" models)

The Two Sides of a Model
- Model for the Means:
  - Aka Fixed Effects, Structural Part of Model
  - What you are used to caring about for testing hypotheses
  - How the expected outcome for a given observation varies as a function of values on predictor variables
  - Fixed effects are always specified as a function of known things
- Model for the Variances ("piles" of variance):
  - Aka Random Effects and Residuals, Stochastic Part of Model
  - What you are used to making assumptions about instead
  - Terms that allow model residuals to be related across observations (persons, groups, time, etc.) operate as a function of sampling
  - Random effects and residuals are unknown things ("error")
  - This is the primary way that MLM differs from the GLM

Data Requirements for Longitudinal Modeling
- Multiple measures from the same person!
  - 2 is the minimum, but just 2 can lead to problems:
    - Only 1 kind of change is observable (1 difference)
    - Can't directly model interindividual differences in change, because real change cannot be distinguished from measurement error
- More data is better (with diminishing returns)
  - More occasions: better description of the form of change
  - More people: better estimates of individual differences in change; better prediction of those individual differences

2 Types of Within-Person Variation
- Within-Person Change: systematic change
  - Magnitude or direction of change can be different across people
  - Growth curve models
  - Time is meaningfully sampled
- Within-Person Fluctuation: no systematic change
  - Outcome just varies/fluctuates over time (e.g., emotion, stress)
  - Time is just a way to get lots of data per person
[Figure: three panels plotted against Time, labeled WP Change, WP Change, and WP Fluctuation]

Levels of Inference in Longitudinal Multilevel Modeling
- Between-Person (BP) Relationships:
  - Level-2, INTER-individual differences, time-invariant
  - All longitudinal studies begin as cross-sectional studies
- Within-Person (WP) Relationships:
  - Level-1, INTRA-individual variation, time-varying
  - Only longitudinal studies can provide this extra information
- Longitudinal studies allow examination of both types of relationships simultaneously (and their interactions)
- Any variable measured over time usually has both of these sources of variation: BP and WP

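An Empty Between-Person Model
- $y_i = \beta_0 + e_i$
- Filling in values: $32 = 90 + (-58)$, where 32 is the observed Y, 90 is the predicted Y (the model for the means), and -58 is the error
- Error variance: $\frac{\sum (y - y_{pred})^2}{N - 1}$
[Figure: histogram of Y, axis from 20 to 140; Mean = 89.55, Std. Dev. = 15.114, N = 1,334]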

Between-Person Model: One Continuous Predictor
- $y_i = \beta_0 + \beta_1 X_i + e_i$
- Empty Model: $32 = 90 + (-58)$
- Ability (X) Model: $32 = 29 + 2(9) + (-15)$
- Error variance: $\frac{\sum (y - y_{pred})^2}{N - 2}$
[Figure: scatterplot of Y (20 to 140) against the predictor X, Ability (10 to 40), with the predicted Y from the model for the means]
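The two error-variance formulas differ only in the denominator: N minus the number of fixed effects in the model for the means. The following is a minimal Python sketch of that idea. The `x` and `y` values and the function names are illustrative assumptions, not the lecture's data; only the two "filling in values" checks at the end use numbers from the slides.

```python
def residual_variance(y, y_pred, n_fixed):
    """Sum of squared errors divided by (N - number of fixed effects)."""
    ss = sum((obs - pred) ** 2 for obs, pred in zip(y, y_pred))
    return ss / (len(y) - n_fixed)


def fit_line(x, y):
    """Ordinary least squares intercept and slope for a single predictor."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


# Hypothetical data (NOT from the lecture).
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

# Empty model: the grand mean is the prediction for everyone (1 fixed effect).
grand_mean = sum(y) / len(y)
var_empty = residual_variance(y, [grand_mean] * len(y), n_fixed=1)

# One-predictor model: intercept + slope (2 fixed effects).
b0, b1 = fit_line(x, y)
var_predictor = residual_variance(y, [b0 + b1 * xi for xi in x], n_fixed=2)

# The slide's worked values: observed = model for the means + error.
empty_check = 90 + (-58)
ability_check = 29 + 2 * 9 + (-15)

print(var_empty, var_predictor, empty_check, ability_check)
```

In this made-up example the single pile of residual variance gets smaller once the predictor is added, which is the sense in which a fixed effect explains variance.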

A More General Linear Model for Between-Person Analysis
$y_i = \beta_0 + \beta_1 X_i + \beta_2 Z_i + \beta_3 X_i Z_i + e_i$
Model for the Means (Fixed Effects):
- Each person's expected (predicted) outcome is a function of his/her values on x and z (and here, of their interaction, too)
- Even though the grand mean is no longer the best guess (best model) for each person's outcome, we still call it the "model for the means" because each person gets a conditional mean as his or her best guess (same predicted Y given same predictor values)

A More General Linear Model for Between-Person Analysis
$y_i = \beta_0 + \beta_1 X_i + \beta_2 Z_i + \beta_3 X_i Z_i + e_i$
Model for the Variance (Residuals and Random Effects):
- $e_i \sim NID(0, \sigma_e^2)$
- In English: $e_i$ is a random variable with a mean of 0 and some estimated variance, and is normally distributed
- ONE pile of variance in leftover Y after accounting for predictors
- We don't care about the individual e's; we care about their variance
- What makes this model for the variance "Between-Person"? $e_i$ is measured only once per person (as indicated by the i subscript)

Adding Within-Person Variance to the Model for the Variances
[Figure: full sample distribution of Y (axis from 20 to 140; Mean = 89.55, Std. Dev. = 15.114, N = 1,334), with 3 people at 5 occasions each]

Empty +Within-Person Model
- Start off with the mean of Y as the best guess for any value: Grand Mean = Fixed Intercept
- Can make a better guess by taking advantage of repeated observations: Person Mean (Random Intercept)
[Figure: Y values, axis from 20 to 140]

Division of Error in Between Person Model for Variances
- $e_{ti}$ represents all Y variance
[Figure: one person's residuals $e_{1i}$, $e_{2i}$, $e_{3i}$, $e_{4i}$, $e_{5i}$ across 5 occasions]

Division of Error in +Within Person Model for Variances
- $U_{0i}$ represents Between-Person Y variance
- $e_{ti}$ represents Within-Person Y variance
- $U_{0i}$ also represents dependency (covariance) due to mean differences in Y across persons
[Figure: one person's $U_{0i}$ and residuals $e_{1i}$, $e_{2i}$, $e_{3i}$, $e_{4i}$, $e_{5i}$ across 5 occasions]

Division of Error in +Within Person Model for Variances
[Figure: one person's outcome (0 to 12) plotted against Time (1 to 5), showing $U_{0i}$ and the residuals $e_{1i}$ through $e_{5i}$]

Empty +Within-Person Model
Variance of Y has 2 sources:
- Between-Person Variance: differences from GRAND mean; INTER-individual differences
- Within-Person Variance: differences from OWN mean; INTRA-individual differences
Now we have 2 piles of variance in Y to predict.

Between-Person vs. +Within-Person Empty Models
- Empty Between-Person Model (1 time point): $y_i = \beta_0 + e_i$
  - $\beta_0$ = fixed intercept = grand mean
  - $e_i$ = residual deviation from GRAND mean
- Empty +Within-Person Model (>1 time points): $y_{ti} = \beta_0 + U_{0i} + e_{ti}$
  - $\beta_0$ = fixed intercept = grand mean
  - $U_{0i}$ = random intercept = individual deviation from GRAND mean
  - $e_{ti}$ = time-specific residual deviation from OWN mean
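The empty +within-person equation can be illustrated descriptively by splitting each observation into the grand mean, a person-mean deviation, and a deviation from that person's own mean. The sketch below does this for made-up data from 3 people at 5 occasions; the data and the `decompose` helper are hypothetical. Note the assumption: these are raw observed deviations, whereas a fitted multilevel model would estimate the variance of $U_{0i}$ rather than simply take person means.

```python
def decompose(data):
    """Split each y_ti into grand mean + person deviation + residual.

    `data` is a list of persons, each a list of repeated observations.
    Returns (grand_mean, person_deviations, residuals).
    """
    all_values = [y for person in data for y in person]
    grand_mean = sum(all_values) / len(all_values)
    person_deviations = []
    residuals = []
    for person in data:
        person_mean = sum(person) / len(person)
        # Analogue of U_0i: deviation of the person's OWN mean from the GRAND mean.
        person_deviations.append(person_mean - grand_mean)
        # Analogue of e_ti: deviation of each occasion from the person's OWN mean.
        residuals.append([y - person_mean for y in person])
    return grand_mean, person_deviations, residuals


# Hypothetical data: 3 people, 5 occasions each (NOT from the lecture).
data = [
    [8, 10, 9, 11, 12],
    [20, 19, 21, 22, 18],
    [30, 31, 29, 28, 32],
]

grand_mean, u0, residuals = decompose(data)

# Rebuild every observation from its three pieces.
rebuilt = [[grand_mean + u + e for e in res] for u, res in zip(u0, residuals)]

print(grand_mean, u0)
```

Each person's residuals sum to zero by construction, so all of the level differences between people are carried by the person deviations.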

The Model for the Variances
- All models have at least an e: 1 pile of variance
- 2 main reasons to care about what else should go into the model for the variances:
  - Validity of the tests of the predictors depends on having the "right" model for the variances (where "right" means "least wrong")
    - Estimates will usually be OK, because they come from the model for the means
    - Standard errors (and thus p-values) can be compromised
  - The sources of variation that exist in your outcome will dictate what kinds of predictors will be useful
    - For example, in longitudinal data: Between-Person variation needs Between-Person predictors; Within-Person variation needs Within-Person predictors

Categorizing Familiar Models by Their Models for the Variances
- Multiple Regression, Between-Person ANOVA: 1 PILE
  - $y_i = (\beta_0 + \beta_1 X_i + \beta_2 Z_i) + e_i$
  - $e_i$: ONE residual, assumed uncorrelated with equal variance across observations (here, just persons); BP variation
- Repeated Measures, Within-Person ANOVA: 2 PILES
  - $y_{ti} = (\beta_0 + \beta_1 X_i + \beta_2 Z_i) + U_{0i} + e_{ti}$
  - $U_{0i}$: a random intercept for differences in person means, assumed uncorrelated with equal variance across persons; BP variation = $\tau^2_{U_0}$
  - $e_{ti}$: a residual that represents remaining time-to-time variation, usually assumed uncorrelated with equal variance across observations (now, persons and time); WP variation = $\sigma^2_e$

ANOVA for longitudinal data?
- There are 3 possible "kinds" of ANOVAs we could use: Between-Persons/Groups, Univariate RM, and Multivariate RM
- NONE OF THEM ALLOW:
  - Missing occasions (they do listwise deletion instead)
  - Time-varying predictors
- Each includes the same model for the means with respect to time: all possible mean differences (so 4 parameters to get to 4 means)
  - "Saturated means model": $\beta_0 + \beta_1(T_1) + \beta_2(T_2) + \beta_3(T_3)$
  - The Time variable must be balanced and discrete in ANOVA!
- Each kind of ANOVA differs in what it says about the correlation in the data from the same person in the model for the variances
  - i.e., how it handles dependency due to persons, or what it says the variance and covariance of the $y_{ti}$ residuals should look like
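A saturated means model simply re-expresses the occasion means as one intercept plus mean differences. The sketch below assumes that $T_1$ to $T_3$ are 0/1 indicators for the first three occasions, so the fourth occasion is the reference; the slide does not state the coding, and the means and function names are hypothetical.

```python
def saturated_betas(occasion_means):
    """Betas for dummy-coded time with the LAST occasion as reference (an assumption)."""
    reference = occasion_means[-1]
    # beta_0 is the reference mean; every other beta is a difference from it.
    return [reference] + [m - reference for m in occasion_means[:-1]]


def means_from_betas(betas):
    """Recover every occasion mean from the saturated-model betas."""
    reference = betas[0]
    return [reference + b for b in betas[1:]] + [reference]


# Hypothetical means at 4 occasions (NOT from the lecture).
occasion_means = [10.0, 12.0, 15.0, 16.0]

betas = saturated_betas(occasion_means)
recovered = means_from_betas(betas)

print(betas, recovered)
```

Four means go in and four parameters come out, which is why the slides describe the saturated model as a description of the means rather than a parsimonious model of change.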

ANOVA for longitudinal data?
1. Between-Groups Regression or ANOVA: BP variance only (1 pile of $e_{ti}$ only)
$$\begin{bmatrix} \sigma_e^2 & 0 & 0 & 0 \\ 0 & \sigma_e^2 & 0 & 0 \\ 0 & 0 & \sigma_e^2 & 0 \\ 0 & 0 & 0 & \sigma_e^2 \end{bmatrix}$$
- Assumes NO RELATIONSHIP WHATSOEVER among observations from the same person (or across persons)
  - Dependency? What dependency?
  - e.g., 4 occasions * 100 persons would be 400 independent observations
- Will usually be VERY WRONG for longitudinal data
  - BP effects are tested against the wrong df; WP effects are tested against the wrong df and the wrong variance, which leads to messed-up SEs and therefore messed-up p-values
  - Will also be wrong for clustered data, although perhaps less so (because the correlation among persons from the same group in clustered data is not as strong as the correlation among occasions from the same person in longitudinal data)

ANOVA for longitudinal data?
2. (a) Univariate Repeated Measures ANOVA: $Var(e_{ti}) + Var(U_{0i})$
- Assumes a CONSTANT RELATIONSHIP OVER TIME among observations from the same person: Compound Symmetry
- Observations from the same person are correlated because of constant person mean differences (via $U_{0i}$) only: 1 kind of person dependency
$$\begin{bmatrix} \sigma_e^2 + \tau^2_{U_0} & \tau^2_{U_0} & \tau^2_{U_0} & \tau^2_{U_0} \\ \tau^2_{U_0} & \sigma_e^2 + \tau^2_{U_0} & \tau^2_{U_0} & \tau^2_{U_0} \\ \tau^2_{U_0} & \tau^2_{U_0} & \sigma_e^2 + \tau^2_{U_0} & \tau^2_{U_0} \\ \tau^2_{U_0} & \tau^2_{U_0} & \tau^2_{U_0} & \sigma_e^2 + \tau^2_{U_0} \end{bmatrix}$$
- Will usually be SOMEWHAT WRONG for longitudinal data
  - If people change at different rates, the variances and covariances of the outcome over time have to change, too
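The compound symmetry pattern follows directly from the two piles: every diagonal element is the sum of both variances, and every off-diagonal element is the random intercept variance alone. The sketch below builds that matrix in plain Python. The variance values (6 and 4) and the function names are made up for illustration.

```python
def compound_symmetry(tau2_u0, sigma2_e, n_occasions):
    """Total covariance matrix implied by a random intercept plus residual."""
    return [
        [tau2_u0 + sigma2_e if row == col else tau2_u0 for col in range(n_occasions)]
        for row in range(n_occasions)
    ]


def implied_correlation(v, row, col):
    """Correlation between two occasions implied by covariance matrix v."""
    return v[row][col] / (v[row][row] * v[col][col]) ** 0.5


# Hypothetical variance components (NOT from the lecture).
V = compound_symmetry(tau2_u0=6.0, sigma2_e=4.0, n_occasions=4)

# Under compound symmetry every pair of occasions has the same correlation.
correlations = [
    implied_correlation(V, r, c) for r in range(4) for c in range(4) if r != c
]

for row in V:
    print(row)
print(correlations[0])
```

The same variance at every occasion and the same correlation between every pair of occasions is the restrictive part: nothing in this structure can change over time.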

ANOVA for longitudinal data?
2. (b) Univariate RM ANOVA with sphericity corrections
- Based on $\varepsilon$, which indexes how far off sphericity the data are (ranges 0-1, 1 = spherical)
- Applies an overall correction for model df based on estimated $\varepsilon$
- Corrections for sphericity do not really solve the problem
3. Multivariate Repeated Measures ANOVA
- Assumes nothing: all variances and covariances are estimated separately (Unstructured)
- It's not a model, it IS the data!
$$\begin{bmatrix} \sigma^2_{11} & \sigma_{12} & \sigma_{13} & \sigma_{14} \\ \sigma_{21} & \sigma^2_{22} & \sigma_{23} & \sigma_{24} \\ \sigma_{31} & \sigma_{32} & \sigma^2_{33} & \sigma_{34} \\ \sigma_{41} & \sigma_{42} & \sigma_{43} & \sigma^2_{44} \end{bmatrix}$$
- Because it can never be wrong, an unstructured model can be useful for complete longitudinal data with few occasions
- Becomes hard to estimate very quickly with many occasions
- Parameters needed = (#occasions * [#occasions + 1]) / 2
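The parameter-count formula makes it easy to see how quickly an unstructured matrix grows compared with compound symmetry, which always needs just two variance parameters (the random intercept variance and the residual variance). A small sketch, with illustrative function and constant names:

```python
def unstructured_params(n_occasions):
    """Unique variances and covariances in an n x n unstructured matrix."""
    return n_occasions * (n_occasions + 1) // 2


# Compound symmetry: one random intercept variance + one residual variance.
COMPOUND_SYMMETRY_PARAMS = 2

counts = {n: unstructured_params(n) for n in (4, 6, 10)}

for n, count in counts.items():
    print(f"{n} occasions: UN needs {count}, CS needs {COMPOUND_SYMMETRY_PARAMS}")
```

With 4 occasions the unstructured model needs 10 parameters; with 10 occasions it needs 55.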

Example Data
[Figure: individual observed trajectories (N = 101, n = 6) for Number Match Size 3, RT by Session; y-axis is Mean RT (500 to 4500), x-axis is Session (1 to 6)]

Summary: ANOVA approaches for longitudinal data are "one size fits most":
- Saturated Model for the Means (balanced time required)
  - All possible mean differences
  - Unparsimonious, but best-fitting (is a description, not a model)
- 3 kinds of Models for the Variances (complete data required)
  - $e_{ti}$ only = Between-Person/Group ANOVA: assumes independent data
  - $U_{0i}$ and $e_{ti}$ = Compound Symmetry (CS) = Univ. RM ANOVA: requires sphericity, which rarely holds in longitudinal data
  - All possible var. and covar. = Unstructured (UN) = Multiv. RM ANOVA: unparsimonious; is a description, not a model
- MLM will give us more flexibility in both parts of the model:
  - Fixed effects that predict the pattern of means
  - Random intercepts and slopes and/or alternative covariance structures that predict the pattern of variation and covariation over time

Empty Longitudinal Multilevel Model: Review of Terminology
Variance of Y has 2 sources:
- Level 2 Random Intercept Variance (of $U_{0i}$):
  - Between-Person Variance ($\tau^2_{U_0}$)
  - Difference from GRAND mean
  - INTER-Individual Differences
- Level 1 Residual Variance (of $e_{ti}$):
  - Within-Person Variance ($\sigma^2_e$)
  - Difference from OWN mean
  - INTRA-Individual Differences
[Figure: Y values, axis from 20 to 140]

Empty* Multilevel Model
- General Linear Model: $y_i = \beta_0 + e_i$
- Multilevel Model:
  - Level 1: $y_{ti} = \beta_{0i} + e_{ti}$
  - Level 2: $\beta_{0i} = \gamma_{00} + U_{0i}$, where $\gamma_{00}$ is the sample grand mean intercept and $U_{0i}$ is the individual intercept deviation
  - Composite equation: $y_{ti} = \gamma_{00} + U_{0i} + e_{ti}$
- 3 Model Parameters:
  - Model for the Means: 1 Fixed Effect, $\gamma_{00}$, the fixed intercept
  - Model for the Variance: 1 Random Effect (intercept), $U_{0i}$, the person-specific deviation (mean = 0, variance = $\tau^2_{U_0}$)
  - Model for the Variance: 1 Residual Error, $e_{ti}$, the time-specific deviation (mean = 0, variance = $\sigma^2_e$)
* To be more clear, I call this an "empty means, random intercept" model

Empty Multilevel Model: Useful Descriptive Statistic ICC
Intraclass Correlation (ICC):
$$ICC = \frac{\text{Between-Person Variance}}{\text{Between-Person Variance} + \text{Within-Person Variance}} = \frac{\text{Intercept Variance}}{\text{Intercept Variance} + \text{Residual Variance}} = \frac{\tau^2_{U_0}}{\tau^2_{U_0} + \sigma^2_e}$$
- ICC = Proportion of total variance that is between persons
- ICC = Average correlation among occasions
- ICC is a standardized way of expressing how much we need to worry about dependency due to person mean differences (i.e., ICC is an effect size for constant person dependency)

$$ICC = \frac{\text{Between-Person Variance}}{\text{Between-Person Variance} + \text{Within-Person Variance}}$$
- Counter-intuitive: Between-Person Variance is in the numerator, but the ICC is the correlation over time!
- When between-person variance is large relative to within-person variance: large ICC, large correlation over time
- When within-person variance is large relative to between-person variance: small ICC, small correlation over time
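As a rough illustration of the ICC, the sketch below estimates the two variance piles from balanced data using the classical one-way ANOVA (method-of-moments) estimators and then forms the ratio. This is an assumption made for simplicity: multilevel software would normally estimate these variances by ML or REML, and the data and the `anova_icc` helper here are hypothetical.

```python
def anova_icc(data):
    """ICC from balanced data via one-way ANOVA mean squares.

    `data` is a list of persons, each with the same number of occasions.
    Returns (between-person variance, within-person variance, ICC).
    """
    n_persons = len(data)
    n_occasions = len(data[0])
    person_means = [sum(person) / n_occasions for person in data]
    grand_mean = sum(person_means) / n_persons

    ms_between = (
        n_occasions
        * sum((m - grand_mean) ** 2 for m in person_means)
        / (n_persons - 1)
    )
    ms_within = sum(
        (y - m) ** 2 for person, m in zip(data, person_means) for y in person
    ) / (n_persons * (n_occasions - 1))

    # Person means also contain residual noise, so it is subtracted out.
    # (This moment estimate can go negative in small or noisy samples.)
    tau2_u0 = (ms_between - ms_within) / n_occasions
    sigma2_e = ms_within
    return tau2_u0, sigma2_e, tau2_u0 / (tau2_u0 + sigma2_e)


# Hypothetical data: 3 people, 5 occasions each (NOT from the lecture).
data = [
    [8, 10, 9, 11, 12],
    [20, 19, 21, 22, 18],
    [30, 31, 29, 28, 32],
]

tau2_u0, sigma2_e, icc = anova_icc(data)
print(tau2_u0, sigma2_e, icc)
```

In this made-up example people differ far more from each other than from themselves over time, so nearly all of the variance is between persons and the ICC is close to 1.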

More New Vocabulary: Labels for 3 Types of Models
- Empty Model
  - "Empty means, random intercept" model
  - Just fixed intercept, random intercept variance, residual variance
  - First baseline model for everything to follow
  - Used to compute an ICC
- Unconditional (Growth) Model is up next
  - Effects related to time, but no other predictors yet
  - Second baseline model for everything to follow
- Conditional (Growth) Model is coming next semester
  - With predictors besides time
  - May end up with more than 1 depending on research questions

Extending the Multilevel Model: 2 Questions for Effects of Time
1. Is there an effect of time on average?
  - Is the average line not flat?
  - Significant Fixed Effect of Time
2. Does the average effect of time vary across individuals?
  - Does each person need their own line?
  - Significant Random Effect of Time

Fixed and Random Effects of Time
[Figure: four panels labeled No Fixed, No Random; Yes Fixed, No Random; No Fixed, Yes Random; Yes Fixed, Yes Random]

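Random Linear Time Model
- Multilevel Model:
  - Level 1: $y_{ti} = \beta_{0i} + \beta_{1i} Time_{ti} + e_{ti}$
  - Level 2: $\beta_{0i} = \gamma_{00} + U_{0i}$, where $\gamma_{00}$ is the sample (grand mean) intercept and $U_{0i}$ is the individual intercept deviation
  - Level 2: $\beta_{1i} = \gamma_{10} + U_{1i}$, where $\gamma_{10}$ is the sample (grand mean) slope and $U_{1i}$ is the individual slope deviation
  - Composite Model: $y_{ti} = (\gamma_{00} + U_{0i}) + (\gamma_{10} + U_{1i}) Time_{ti} + e_{ti}$, where the first parenthesis is $\beta_{0i}$ and the second is $\beta_{1i}$
- 6 Model Parameters:
  - 2 Fixed Effects: $\gamma_{00}$ and $\gamma_{10}$
  - 2 Random Effects (+1 covariance): variances of $U_{0i}$ and $U_{1i}$ ($\tau^2_{U_0}$, $\tau^2_{U_1}$), and the covariance of $U_{0i}$ and $U_{1i}$ ($\tau_{U_{01}}$)
  - 1 Residual Variance of $e_{ti}$ ($\sigma^2_e$)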

Fixed & Random Effects of Time
$y_{ti} = (\gamma_{00} + U_{0i}) + (\gamma_{10} + U_{1i}) Time_{ti} + e_{ti}$
- $\gamma_{00}$ = Fixed Intercept; $U_{0i}$ = Random Intercept Deviation
- $\gamma_{10}$ = Fixed Slope; $U_{1i}$ = Random Slope Deviation
- $e_{ti}$ = error for person i at time t
- 6 Model Parameters:
  - 2 Fixed Effects: $\gamma_{00}$ Intercept, $\gamma_{10}$ Slope
  - 2 (+1) Random Effects: $U_{0i}$ Intercept Variance, $U_{1i}$ Slope Variance, Int-Slope Covariance
  - 1 $e_{ti}$ Residual Variance
[Figure: Outcome (4 to 32) against Time (0 to 3), showing the mean line (Linear (Mean)) and person P2's line (Linear (P2)); annotated values are $\gamma_{00} = 10$, $\gamma_{10} = 6$, $U_{0i} = -4$, $U_{1i} = +2$, $e_{ti} = -1$]
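The annotated values in the figure can be plugged into the composite equation to trace the average line and person P2's own line. The sketch below uses the slide's values for the fixed and random effects. The helper name is illustrative, and applying the residual of -1 at time 3 is an assumption, since the transcript does not say which occasion that residual belongs to.

```python
def predicted_outcome(time, gamma00, gamma10, u0i=0.0, u1i=0.0):
    """(gamma00 + U0i) + (gamma10 + U1i) * time, without the residual."""
    return (gamma00 + u0i) + (gamma10 + u1i) * time


# Values annotated on the slide's figure.
GAMMA00 = 10.0   # fixed intercept
GAMMA10 = 6.0    # fixed slope
U0_P2 = -4.0     # P2's random intercept deviation
U1_P2 = 2.0      # P2's random slope deviation
E_TI = -1.0      # one residual for P2

times = [0, 1, 2, 3]

# Average line: random deviations set to zero.
mean_line = [predicted_outcome(t, GAMMA00, GAMMA10) for t in times]

# P2's own line: intercept 10 - 4 = 6, slope 6 + 2 = 8.
p2_line = [predicted_outcome(t, GAMMA00, GAMMA10, U0_P2, U1_P2) for t in times]

# ASSUMPTION: the residual is applied at time 3 purely for illustration.
p2_observed_at_3 = p2_line[3] + E_TI

print(mean_line, p2_line, p2_observed_at_3)
```

P2 starts below the average line but has a steeper slope, so the two lines cross at time 2.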

Unbalanced Time
- Different time measurements across persons? OK!
- Rounding time can lead to incorrect estimates! Code time as exactly as possible.
- MLM uses each complete time point for each person.
[Figure: individual trajectories over Time (0 to 16); the red predicted slope will probably be made steeper because it is based on less data, though the term for this is "shrinkage"]
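Shrinkage can be illustrated in the simpler random intercept case, where the weight given to a person's own mean has a well-known closed form: the random intercept variance divided by that variance plus the residual variance over the number of occasions. The sketch below is an assumption-laden illustration, not the lecture's slope example: it treats the variance components as known, the numbers are made up, and the function names are hypothetical.

```python
def shrinkage_weight(tau2_u0, sigma2_e, n_occasions):
    """Weight on a person's own mean in an empty random intercept model."""
    return tau2_u0 / (tau2_u0 + sigma2_e / n_occasions)


def shrunken_person_mean(person_mean, grand_mean, weight):
    """Pull the person's mean toward the grand mean by (1 - weight)."""
    return grand_mean + weight * (person_mean - grand_mean)


# Hypothetical variance components (NOT from the lecture).
TAU2_U0 = 4.0
SIGMA2_E = 8.0

weight_few = shrinkage_weight(TAU2_U0, SIGMA2_E, n_occasions=2)
weight_many = shrinkage_weight(TAU2_U0, SIGMA2_E, n_occasions=8)

# Same observed person mean (30) and grand mean (20), different amounts of data.
estimate_few = shrunken_person_mean(30.0, 20.0, weight_few)
estimate_many = shrunken_person_mean(30.0, 20.0, weight_many)

print(weight_few, weight_many, estimate_few, estimate_many)
```

The person with only 2 occasions is pulled further toward the grand mean than the person with 8, which is the same logic by which a person-specific slope based on less data is moved toward the average slope.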

Longitudinal Data: Modeling Means and Variances
We have two tasks in describing the effects of "time":
1. Choose a Model for the Means
  - What kind of change in the outcome do we have on average?
  - What kind of and how many parameters do we need to represent that change as parsimoniously but accurately as possible?
2. Choose a Model for the Variances
  - What kind of pattern do the variances and covariances of the outcome show over time due to individual differences in change?
  - What kind of and how many parameters do we need to represent that pattern as parsimoniously but accurately as possible?

Big Picture Modeling Framework: Choices for Modeling Means
What kind of change on average (in the means)?
- "Empty" refers to a model for the means with no predictors (just a fixed intercept for the grand mean outcome over time)
- "Saturated" refers to a model for the means with all possible means estimated (#parameters = #occasions)
  - THIS IS ANOVA
  - Is a DESCRIPTION of the means, not a predictive MODEL
Continuum from parsimony to good fit:
- Empty Means (parsimony end): predicts NO change over time; 1 Fixed Effect
- In-between options: polynomial slopes, piecewise slopes, nonlinear models (even if time is unbalanced!)
- Saturated Means (good-fit end): reproduces the mean at each occasion; # Fixed Effects = # Occasions

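Choices for Modeling Variance
The partitioning of variance into piles:
- Level 2 = BP: G matrix of random effects variances/covariances
- Level 1 = WP: R matrix of residual variances/covariances
- G and R combine to create the V matrix of total variances/covariances
- Many flexible options allow the variances and covariances to vary in a time-dependent way that better matches the actual data
Continuum from parsimony to good fit:
- Compound Symmetry (CS), the parsimony end (Univariate RM ANOVA):
$$\begin{bmatrix} \sigma_e^2 + \tau^2_{U_0} & \tau^2_{U_0} & \tau^2_{U_0} & \tau^2_{U_0} \\ \tau^2_{U_0} & \sigma_e^2 + \tau^2_{U_0} & \tau^2_{U_0} & \tau^2_{U_0} \\ \tau^2_{U_0} & \tau^2_{U_0} & \sigma_e^2 + \tau^2_{U_0} & \tau^2_{U_0} \\ \tau^2_{U_0} & \tau^2_{U_0} & \tau^2_{U_0} & \sigma_e^2 + \tau^2_{U_0} \end{bmatrix}$$
- MLM uses G and R to predict something in-between, even if time is unbalanced!
- Unstructured (UN), the good-fit end (Multivariate RM ANOVA):
$$\begin{bmatrix} \sigma^2_{11} & \sigma_{12} & \sigma_{13} & \sigma_{14} \\ \sigma_{21} & \sigma^2_{22} & \sigma_{23} & \sigma_{24} \\ \sigma_{31} & \sigma_{32} & \sigma^2_{33} & \sigma_{34} \\ \sigma_{41} & \sigma_{42} & \sigma_{43} & \sigma^2_{44} \end{bmatrix}$$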
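One standard way to write how G and R combine into V is $V = Z G Z^T + R$, where each row of Z holds a 1 (for the random intercept) and that occasion's time value (for the random slope). The design matrix Z is not named in the slides, so treat it, the helper functions, and the variance values below as assumptions made for illustration. R is taken to be the residual variance times an identity matrix.

```python
def matmul(a, b):
    """Plain-Python matrix product of two lists of rows."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def transpose(a):
    return [list(row) for row in zip(*a)]


def total_covariance(times, g, sigma2_e):
    """V = Z G Z' + R for a random intercept and random linear slope."""
    z = [[1.0, float(t)] for t in times]          # columns: intercept, time
    zgz = matmul(matmul(z, g), transpose(z))
    n = len(times)
    # R = sigma2_e * identity: add the residual variance on the diagonal only.
    return [
        [zgz[r][c] + (sigma2_e if r == c else 0.0) for c in range(n)]
        for r in range(n)
    ]


# Hypothetical G: intercept variance 4, slope variance 2, covariance 1.
G = [[4.0, 1.0],
     [1.0, 2.0]]

V = total_covariance(times=[0, 1, 2, 3], g=G, sigma2_e=3.0)

for row in V:
    print(row)
```

Unlike compound symmetry, the diagonal of V here grows over time, because people who change at different rates spread further apart at later occasions.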

The Point of MLM: Dependency
- A common description of the purpose of MLM is that it "addresses" or "handles" correlated (dependent) data
- But where does this correlation come from? 3 places (here, an example with health as an outcome):
  1. Mean differences across persons
    - Some people are just healthier than others (at every time point)
  2. Differences in effects of predictors across persons
    - Does time affect health more in some persons than others?
    - Does daily stress affect health more in some persons than others?
  3. Non-constant within-person correlation for unknown reasons
    - Occasions closer together may just be more related
    - More likely for outcomes that fluctuate (rather than change) over time

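2-Level Models for the Variances
Where does the correlation or dependency go? Into a new random effects variance component ("pile" of variance):
- Starting point: Original Residual Variance $\sigma^2_e$
- Adding a random intercept splits it into:
  - Level-1 Residual Variance $\sigma^2_e$ (Within-Person Differences)
  - Level-2 Random Intercept Variance $\tau^2_{U_0}$ (Between-Person Differences)
- Adding a random linear slope gives:
  - Level-1 Residual Variance $\sigma^2_e$ (Within-Person Differences)
  - Level-2 Random Linear Slope Variance $\tau^2_{U_1}$ and Level-2 Random Intercept Variance $\tau^2_{U_0}$ (Between-Person Differences), plus their covariance $\tau_{U_{01}}$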

Summary: The Point of MLM
- All we've done so far is carve up our total variance into up to 3 piles:
  - BP (error) variance around the intercept
  - BP (error) variance around the slope
  - WP (error) residual variance
  - (The slope and residual piles are one pile of "error variance" in RM ANOVA)
- But making piles does not make error variance go away
  - Level 1 (one source of) Within-Person Variation, the Residual Variance ($\sigma^2_e$): gets accounted for by time-level predictors
  - Level 2 (two sources of) Between-Person Variation, the BP Int Variance ($\tau^2_{U_0}$) and BP Slope Variance ($\tau^2_{U_1}$), with covariance $\tau_{U_{01}}$: gets accounted for by person-level predictors
- FIXED effects make variance go away (explain variance). RANDOM effects just make a new pile of variance.

What can MLM do for you?
1. Model dependency across observations
  - Longitudinal, clustered, and/or cross-classified data? No problem!
  - Tailor your model of correlation over time and person to your data
2. Include categorical or continuous predictors at any level
  - Time-varying, person-level, group-level predictors for each variance
  - Explore reasons for dependency, don't just control for dependency
3. Does not require the same data structure for each person
  - Unbalanced or missing data? No problem!
4. You already know how (even if you don't know it yet)!
  - SPSS Mixed, SAS Mixed, Stata, Mplus, R, HLM
  - What's an intercept? What's a slope? What's a pile of variance?