A brief introduction to mixed models


A brief introduction to mixed models University of Gothenburg Gothenburg April 6, 2017

Outline

An introduction to mixed models based on a few examples:
- Definition of standard mixed models.
- Parameter estimation.
- Confidence intervals for parameters.
- Models with nested factors.
- Model comparison.
- Model diagnostics.

We also show how mixed models can be estimated using lme4. The lecture is mainly based on:
1. Bates et al., Fitting Linear Mixed-Effects Models Using lme4, Journal of Statistical Software, 67:1, 2015.
2. Douglas M. Bates, lme4: Mixed-effects modeling with R.

An illustrative example. Study taken from Belenky et al. (2003), based on 18 subjects. On day 0 the subjects had their normal amount of sleep; they were then restricted to 3 hours of sleep per night. Each day, the average reaction time in milliseconds (ms) on a series of tests was measured for each subject.
An example of a mixed model

Linear regression. It seems like a linear model would fit the data: Y_ij = beta_1 + t_ij beta_2 + eps_ij, with eps_ij ~ N(0, sigma^2). Collecting the measurements from subject i into a vector Y_i: Y_i = X_i beta + eps_i, eps_i ~ N(0, sigma^2 I).
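Written out in the notation above (added here for reference), the rows of the design matrix X_i stack an intercept column and the day covariate:

```latex
Y_i = \begin{pmatrix} Y_{i1} \\ \vdots \\ Y_{in_i} \end{pmatrix}, \qquad
X_i = \begin{pmatrix} 1 & t_{i1} \\ \vdots & \vdots \\ 1 & t_{in_i} \end{pmatrix}, \qquad
\beta = \begin{pmatrix} \beta_1 \\ \beta_2 \end{pmatrix}
```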

Linear regression in R

library("lme4")
data("sleepstudy")
str(sleepstudy)
'data.frame': 180 obs. of 3 variables:
 $ Reaction: num 250 259 251 321 357 ...
 $ Days    : num 0 1 2 3 4 5 6 7 8 9 ...
 $ Subject : Factor w/ 18 levels ...

mod <- lm(Reaction ~ Days, sleepstudy)
plot(sleepstudy$Days, sleepstudy$Reaction,
     xlab = "Days", ylab = "Response time (ms)")
lines(sleepstudy$Days, mod$fitted.values)

People need different amounts of sleep. The data are shown separately for each subject, with an individual linear regression for each subject.

A random effect model for the data. For the linear regression (fixed effect model) we had Y_i = X_i beta + eps_i. For a random effect model we instead have Y_i = X_i beta + X_i b_i + eps_i, where b_i ~ N(0, Sigma). Here beta contains the common average regression coefficients and b_i the individual deviations from the mean. The matrix Sigma determines the variances of the deviations from the mean, as well as the correlations between the different random effects.

Mixed models in R. We formulate the random effect model using the formula specification as Reaction ~ Days + (Days | Subject). We can fit the model using lme4 as

fm1 <- lmer(Reaction ~ Days + (Days | Subject), sleepstudy)

Some results:

Random effects:
 Groups   Name        Std.Dev. Corr
 Subject  (Intercept) 24.740
          Days         5.922   0.07
 Residual             25.592
Fixed Effects:
(Intercept)         Days
     251.41        10.47

Mixed models. For more complicated problems, we may need to use both fixed effects and random effects. A general mixed model can be written as Y_i = X_i beta + Z_i b_i + eps_i, where the model matrix X_i and the random effects matrix Z_i may be different. We could for example have a model with a random intercept but a common slope. In this case, Z_i is only the first column of X_i, and b_i ~ N(0, sigma_b^2). In the formula specification: Reaction ~ Days + (1 | Subject).
The general definition
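For the random-intercept model just mentioned (spelled out here for reference), Z_i is the first column of X_i, so per observation the model reads:

```latex
Z_i = \begin{pmatrix} 1 \\ \vdots \\ 1 \end{pmatrix}, \qquad
Y_{ij} = \beta_1 + b_i + t_{ij}\beta_2 + \varepsilon_{ij}, \qquad b_i \sim N(0, \sigma_b^2)
```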

A second example. The Penicillin data from Davies and Goldsmith (1972), used to assess the variability between samples of penicillin by the B. subtilis method. In this test, plates (Petri dishes) with bulk-inoculated nutrient agar medium have six small pots at equal distances. A few drops of the penicillin solutions are placed in the pots, and the whole plate is placed in an incubator for a given time. Penicillin diffuses from the pots into the agar, and the diameter of the affected zone is related to the concentration of penicillin. We have 24 plates, each with 6 pots, giving in total 144 observations. The variation in the diameter is associated with the plates and with the samples (pots).


The model for the Penicillin data. We use two random effects: one for plate that accounts for the between-plate variability, and one for sample that accounts for the between-sample variability. We can write the model as Y_ij = alpha + b_i + b_j + eps_ij, where i denotes plate, j denotes sample, and b_i ~ N(0, sigma_plate^2), b_j ~ N(0, sigma_sample^2), eps_ij ~ N(0, sigma^2). We assume that the two random effects are independent, and can therefore specify the model as

f = diameter ~ 1 + (1 | plate) + (1 | sample)

To assume independence between slope and intercept in a linear model, one can use the double-bar syntax: Reaction ~ Days + (Days || Subject).

Estimating the model in R

f = diameter ~ 1 + (1 | plate) + (1 | sample)
fm2 <- lmer(f, Penicillin)

Random effects:
 Groups   Name        Variance Std.Dev.
 plate    (Intercept) 0.71691  0.84671
 sample   (Intercept) 3.73097  1.93157
 Residual             0.30241  0.54992
Number of obs: 144, groups: plate, 24; sample, 6

Fixed effects:
            Estimate Std. Error t value
(Intercept)  22.9722     0.8086   28.41

The sample-to-sample variability has the greatest contribution, then the plate-to-plate variability, and finally the residual variability that cannot be attributed to either the sample or the plate.

Maximum likelihood estimation. Standard mixed models, such as those in lme4, assume Gaussian distributions for all components. The model parameters we need to estimate are: the fixed effects beta; the parameters theta in the covariance matrix of the random effects; and the measurement noise variance sigma^2. The parameters are estimated by numerical maximization of the likelihood of the observed data y, L(theta, beta, sigma | y). By default, lme4 does not do ML estimation but restricted (or residual) ML (REML) estimation.
Parameter estimation
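Since all components are Gaussian, each Y_i is marginally Gaussian as well, which gives the likelihood an explicit form (a standard expression, not shown on the slides, written in the notation above with Sigma_theta the random-effect covariance):

```latex
Y_i \sim N\!\left(X_i\beta,\; V_i\right), \qquad V_i = Z_i \Sigma_\theta Z_i^\top + \sigma^2 I,
```
```latex
L(\theta,\beta,\sigma \mid y) = \prod_i (2\pi)^{-n_i/2}\, \lvert V_i \rvert^{-1/2}
\exp\!\Big(-\tfrac12 (y_i - X_i\beta)^\top V_i^{-1} (y_i - X_i\beta)\Big)
```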

Restricted ML estimation. Estimation is done as follows: for a fixed value of theta, we can maximize the likelihood L(theta, beta, sigma | y) analytically with respect to beta and sigma, to obtain the estimates beta_hat(theta) and sigma_hat(theta). The function L_P(theta | y) = L(theta, beta_hat(theta), sigma_hat(theta) | y) is called a profiled likelihood. L_P(theta | y) is then maximized numerically to obtain the parameter estimates (e.g. using the L-BFGS method). There are good reasons for doing this: it decreases computation time, and ML estimates of variances are known to be biased while REML estimates tend to have less bias. Turn REML off by using REML = FALSE in the lmer function.
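One common way to state the REML criterion (a standard identity, added here for reference) is as the likelihood with the fixed effects integrated out:

```latex
L_R(\theta, \sigma \mid y) = \int L(\theta, \beta, \sigma \mid y) \, d\beta
```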

Confidence intervals for parameters. lme4 uses confidence intervals based on the profile likelihood, as follows: Let L_hat be the maximized likelihood, and let L_p denote the best fit with the parameter of interest fixed at the value p. Define the likelihood ratio test statistic on the deviance scale, D(p) = 2(log(L_hat) - log(L_p)), and define xi(p) as the signed square root transformation of D(p) (taking the sign of p - p_hat). xi(p) is compared to quantiles of a standard normal distribution: for example, a 95% profile deviance confidence interval for p is given by the values for which -1.960 < xi(p) < 1.960.
Confidence intervals

Confidence intervals in R

fm1 <- lmer(Reaction ~ Days + (Days | Subject), sleepstudy)
pr1 <- profile(fm1)
confint(pr1)
                   2.5 %     97.5 %
.sig01       14.3815048  37.715996
.sig02       -0.4815007   0.684986
.sig03        3.8011641   8.753383
.sigma       22.8982669  28.857997
(Intercept) 237.6806955 265.129515
Days          7.3586533  13.575919

Profile zeta plots: xyplot(pr1)

Interpreting the profile zeta plot. One can think of these curves as showing the distribution of the estimator. If a curve is close to a straight line, the likelihood is quadratic close to the maximum, and a Gaussian approximation works fine for inference. The curves for the fixed effects are sigmoidal, which means that the distribution is over-dispersed compared to a Gaussian (for regression it is a t-distribution). The curves for the variances are skewed (for regression it is a chi-squared distribution).

Assessing the random effects. It is often interesting to look at the posterior distribution pi(b | y) of the random effects: for example, what is the effect of the sleep deprivation for a specific subject? The posterior is Gaussian, and we can use the posterior mean as a point estimate. The posterior mean and variance for each subject are computed using the ranef function, and can be plotted using the dotplot function:

rr1 <- ranef(fm1, condVar = TRUE)
dotplot(rr1, scales = list(x = list(relation = "free")))

Prediction intervals


Nested factors. In many situations, one has so-called nested factors that need to be dealt with in a slightly different way. To illustrate this, we use data from Davies and Goldsmith (1972) on deliveries of a chemical paste product contained in casks. As a routine, three casks selected at random from each delivery were sampled. Ten of the delivery batches were sampled at random, and two analytical tests were carried out on each of the 30 samples. The factors are now batch (1-10) and cask (1-3). Thus, it would be tempting to fit a model strength ~ 1 + (1 | cask) + (1 | batch) to account for the cask effect and the batch effect. This is wrong!
Models with nested factors

Nested factors. The problem is that the casks are sampled at random, so cask 1 in batch 1 has nothing to do with cask 1 in batch 2. Thus, the effect of cask only makes sense within each batch: the factors are nested. The model we need to specify is Y_ijk = alpha + b_i + b_ij + eps_ijk, where b_i is the batch effect, b_ij is the cask effect, and eps_ijk is the measurement error. Note that this model requires repeated measurements for each cask (we have 2).

Specification of nested factors. To specify a model with nested factors, we can use

strength ~ 1 + (1 | batch) + (1 | batch:cask)

which can also be written as

strength ~ 1 + (1 | batch/cask)

Alternatively, we can pre-compute the nested factor b_ij and append it to the dataset, as

Pastes <- within(Pastes, sample <- factor(batch:cask))

and then specify the model as

strength ~ 1 + (1 | batch) + (1 | sample)

f = strength ~ 1 + (1 | batch) + (1 | batch:cask)
fm3 <- lmer(f, Pastes, REML = FALSE)
fm3
Linear mixed model fit by maximum likelihood ['lmerMod']
Formula: strength ~ 1 + (1 | batch) + (1 | batch:cask)
   Data: Pastes
     AIC      BIC    logLik deviance df.resid
255.9945 264.3718 -123.9972 247.9945       56
Random effects:
 Groups     Name        Std.Dev.
 batch:cask (Intercept) 2.9041
 batch      (Intercept) 1.0951
 Residual               0.8234
Number of obs: 60, groups: batch:cask, 30; batch, 10
Fixed Effects:
(Intercept)
      60.05

A reduced model. The batch effect is much smaller than the sample effect, and the profile confidence interval for the batch standard deviation reaches zero:

pr3 <- profile(fm3)
confint(pr3)
                 2.5 %    97.5 %
.sig01       2.1579337  4.053589
.sig02       0.0000000  2.946591
.sigma       0.6520234  1.085448
(Intercept) 58.6636504 61.443016

Can we get away with a simpler model, without the batch effect?

f = strength ~ 1 + (1 | batch:cask)
fm4 <- lmer(f, Pastes, REML = FALSE)
Model selection

Model selection. We want to test H_0: sigma_batch^2 = 0 against H_1: sigma_batch^2 > 0. Alternatively, we can say that we want to test the full model against the reduced model. We can compare nested models using a likelihood ratio test, using the fact that the LRT statistic is approximately chi-squared distributed. This test is a classical ANOVA analysis, implemented in the anova function of lme4.

anova(fm3, fm4)
Data: Pastes
Models:
fm4: strength ~ 1 + (1 | batch:cask)
fm3: strength ~ 1 + (1 | batch) + (1 | batch:cask)
    Df    AIC    BIC logLik deviance  Chisq Chi Df Pr(>Chisq)
fm4  3 254.40 260.69 -124.2   248.40
fm3  4 255.99 264.37 -124.0   247.99 0.4072      1     0.5234
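A caveat worth noting (a standard result, not from the slides): since sigma_batch^2 = 0 lies on the boundary of the parameter space, the asymptotic null distribution of the LRT statistic is not chi-squared with one degree of freedom but an equal mixture, which makes the naive p-value conservative:

```latex
D \;\overset{H_0}{\sim}\; \tfrac12 \chi^2_0 + \tfrac12 \chi^2_1
\quad\Rightarrow\quad
p_{\text{mixture}} = \tfrac12\, p_{\chi^2_1}
```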

More precise model checks. One should be a bit careful with LR tests, since they are based on asymptotics. The absence of analytical results for the null distributions of parameter estimates in complex situations is a long-standing problem in mixed-model inference. If a greatly increased computation time is acceptable, one can estimate confidence intervals by parametric bootstrapping. This is done by simulating from the model and refitting the model parameters. The function bootMer implements this.
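As a sketch (the nsim value here is an illustrative choice; larger values give more stable intervals at higher cost), the parametric bootstrap can be invoked either through confint or directly through bootMer:

```r
library("lme4")
data("Pastes")
fm3 <- lmer(strength ~ 1 + (1 | batch) + (1 | batch:cask),
            Pastes, REML = FALSE)

## Parametric bootstrap confidence intervals for all parameters
ci_boot <- confint(fm3, method = "boot", nsim = 200)

## bootMer bootstraps an arbitrary statistic of the fit,
## here the fixed-effect estimates
bb <- bootMer(fm3, FUN = fixef, nsim = 200)
```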

Model diagnostics. As always, we should check that the assumptions of the model hold before using it to draw conclusions. Some common diagnostic plots are:

plot(fm1, type = c("p", "smooth"))  # residual plot
qqmath(fm1, id = 0.05)              # qq-plot

Further topics
- Longitudinal data analysis. The regression example was the simplest example of a longitudinal model; we often need to deal with temporal correlation using stochastic processes.
- Generalized mixed models. Needed if we, for example, want to look at count data.
- Non-Gaussian mixed models. What do we do if we have skewed random effect distributions?
- Bayesian mixed models. In Bayesian models, there is no need to separate random effects from fixed effects; they just have different priors! A popular package for fitting Bayesian mixed models is R-INLA.