Introduction to Mixed Models in R

1 Introduction to Mixed Models in R Galin Jones School of Statistics University of Minnesota galin March 2011

2 Second in a Series Sponsored by Quantitative Methods Collaborative. Previous workshop An Introduction to R was given on 3/7 by Professor Sanford Weisberg. Upcoming talk on 4/11 by Chris Winship of Harvard University on causal inference.

3 Goals Brief review of first workshop. Definition of mixed models and why they may be useful. Using mixed models in R through two simple case studies.

4 Review R is free software and can be downloaded. A package is a collection of functions designed for specific tasks. For example, the package lme4 fits many mixed models. When you install R on your computer you get the base distribution. This includes many useful packages, but others, such as lme4, you need to install separately. The R search engine is extremely useful. Getting data into R (or any other software package) can be challenging. Many R packages include built-in data sets and we will use two of these today.

5 Mixed Models Mixed models are a large and complex topic; we will only just get started with them today. There are many varieties of mixed models: linear mixed models (LMM), nonlinear mixed models (NLMM), and generalized linear mixed models (GLMM). Our focus will be on linear mixed models. Much more discussion of this material can be found in the following books: "Extending the Linear Model with R" by Julian Faraway, and "Mixed-Effects Models in S and S-PLUS" by José Pinheiro and Douglas Bates.

6 Factors, Effects and Treatments Suppose apple slices are treated with five preservative compounds (A, B, C, D, E) with the goal of extending shelf life. Response: shelf life. Factor: preservative. Treatments = levels of a factor: A, B, C, D, E. Effect: impact of a compound on shelf life. $\mu$ = population mean shelf life; $\mu_A$ = population mean shelf life with treatment A; effect of A = $\mu_A - \mu$.
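As a small numerical illustration of these definitions, the sketch below estimates treatment effects as "treatment mean minus overall mean". The shelf-life values and the object names (`shelf`, `mu.trt`, `trt.effects`) are made up for illustration and are not from the workshop; the design is assumed balanced, so the overall mean equals the mean of the treatment means.

```r
# Hypothetical shelf-life data (days): 3 apple slices per preservative
shelf <- data.frame(
  preservative = factor(rep(c("A", "B", "C", "D", "E"), each = 3)),
  life = c(10, 12, 11,  14, 15, 13,  9, 8, 10,  12, 12, 12,  16, 17, 15)
)

# Estimate of mu: the overall mean shelf life
mu.hat <- mean(shelf$life)

# Estimates of mu_A, ..., mu_E: the mean shelf life under each treatment
mu.trt <- with(shelf, tapply(life, preservative, mean))

# Estimated effect of each treatment: mu_trt - mu
trt.effects <- mu.trt - mu.hat
round(trt.effects, 2)
```

In a balanced design the estimated effects sum to zero, which is a quick check that they are deviations from the overall mean.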

7 Fixed or Random? Factor effects are either fixed or random. Fixed: The levels in the study represent all levels of interest Random: The levels in the study represent only a sample of the levels of interest. Mixed models have both fixed and random effects. In our example, preservative is fixed since A, B, C, D, E are the only levels of interest.

8 Blocking Block: a group of units formed so that units within the group are as homogeneous as possible. Blocking reduces the effects of variation among experimental units. Treatments are randomly assigned to units within blocks. Let's add to the example: suppose 10 individual fruit are randomly chosen from a population of fruit and the 5 preservatives are randomly assigned to 5 portions of each fruit. Preservative is a fixed effect. Fruit is a (random) block effect.
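The randomization described here can be sketched in a few lines of base R. This is only an illustration: the seed, the object name `design`, and the portion numbering are assumptions, not part of the workshop. Each fruit is a block, and the five preservatives are shuffled independently within each fruit.

```r
set.seed(2011)  # arbitrary seed, only so the layout is reproducible

preservatives <- c("A", "B", "C", "D", "E")
n.fruit <- 10

# One row per portion; sample() permutes the 5 treatments within each fruit
design <- data.frame(
  fruit        = factor(rep(seq_len(n.fruit), each = length(preservatives))),
  portion      = rep(seq_along(preservatives), times = n.fruit),
  preservative = unlist(lapply(seq_len(n.fruit),
                               function(i) sample(preservatives)))
)

head(design, 10)

# Every preservative should appear exactly once in every block (fruit)
table(design$fruit, design$preservative)
```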

9 Mixed models Mixed models contain both fixed and random effects. This has several ramifications. Using random effects broadens the scope of inference: inferences can be made on a statistical basis to the population from which the levels of the random factor have been drawn. Dependence is naturally incorporated in the model: observations that share the same level of a random effect are modeled as correlated. Using random factors often gives more accurate estimates. Sophisticated estimation and fitting methods must be used.

10 Rail Data The Rail data is a built-in R data set. It can be found in the MEMSS and nlme packages. The commands
> library(nlme)
> data(Rail)
> ?Rail
will make the data available to us and produce a description.

11 Rail Data Evaluation of Stress in Railway Rails Description The Rail data frame has 18 rows and 2 columns. Format This data frame contains the following columns: Rail an ordered factor identifying the rail on which the measurement was made. travel a numeric vector giving the travel time for ultrasonic head-waves in the rail (nanoseconds). The value given is the original travel time minus 36,100 nanoseconds.

12 Rail Data Summary Six rails were chosen from a group of rails. Each rail was tested 3 times. Measured the time it takes an ultrasonic wave to travel the length of a rail. Rail is a factor (fixed or random?) and travel is the response.

13 Rail Data
> Rail
Grouped Data: travel ~ 1 | Rail
(listing of the 18 rows, with columns Rail and travel)

14 Rail Data
> summary(Rail)
(summary output: each of the six rails has 3 observations; the minimum, quartiles, mean and maximum of travel are printed alongside)
> with(Rail, tapply(travel, Rail, mean))
> pdf(file="railplot1.pdf")
> with(Rail, plot(travel, Rail, xlab="travel time"))
> dev.off()

15 Rail Data [Figure: Rail plotted against travel time]

16 Rail Data Between-rail variability is greater than within-rail variability. Within-rail variability is not constant. Mean travel time appears different for some rails. We clearly need to account for the classification factor (Rail) in the analysis.

17 Rail Data Rail as a fixed effect:
$$y_{ij} = \beta_i + e_{ij}, \quad i = 1, \dots, 6, \quad j = 1, 2, 3$$
where $y_{ij}$ is the observed travel time for observation $j$ on rail $i$, $\beta_i$ is the population mean travel time of rail $i$, and the $e_{ij}$ are independent and identically normally distributed with mean 0 and variance $\sigma^2$, that is, $e_{ij} \overset{iid}{\sim} N(0, \sigma^2)$. This assumption means $y_{ij} \overset{ind}{\sim} N(\beta_i, \sigma^2)$.
> # Rail as a fixed effect
> r1.lm <- lm(travel ~ Rail - 1, data=Rail)
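Because this model has one mean parameter per rail and no intercept, its least-squares estimates are simply the per-rail sample means. The sketch below checks that on the Rail data; the object names `fit` and `rail.means` are illustrative.

```r
library(nlme)  # provides the Rail data set

# Cell-means model: one coefficient per rail, no intercept
fit <- lm(travel ~ Rail - 1, data = Rail)

# Sample mean of travel for each rail, in the order of the factor levels
rail.means <- with(Rail, tapply(travel, Rail, mean))

# The two columns agree row by row
cbind(coefficient = coef(fit), sample.mean = rail.means)
```

The residual degrees of freedom are 18 observations minus 6 estimated means, that is 12, which is what `df.residual(fit)` returns.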

18 Rail Data
> r1.lm
Call: lm(formula = travel ~ Rail - 1, data = Rail)
Coefficients: Rail2 Rail5 Rail1 Rail6 Rail3 Rail4
This means that the ordered estimates of the $\beta_i$ are $\hat\beta_2 = 31.67, \dots, \hat\beta_4 = 96.00$.

19 Rail Data
> summary(r1.lm)
(coefficient table with columns Estimate, Std. Error, t value and Pr(>|t|), one row per rail, the rows shown flagged ***; residual standard error on 12 degrees of freedom; F-statistic on 6 and 12 DF, p-value: 2.971e-15)
Note that, because the model is fit without an intercept, the F-statistic (on 6 and 12 DF) and its p-value test whether all six rail means are zero; a test for differences between the rail effects would have 5 and 12 DF. Also, the estimate of $\sigma^2$ is based on 12 degrees of freedom.

20 Rail Data
> library(lattice)
> with(Rail, bwplot(Rail ~ residuals(r1.lm)))
[Figure: box plots of residuals(r1.lm) for each rail]

21 Rail Data The fixed effects model gives a good summary of the data, but the main interest is in the population of rails. Also, we don't believe the model is accurate, since the 3 observations on each rail are clearly not independent. Rail as a random effect:
$$y_{ij} = \beta + b_i + e_{ij}$$
where $y_{ij}$ is the observed travel time for observation $j$ on rail $i$, $\beta$ is the population mean travel time, and $b_i$ is the deviation from $\beta$ for the $i$th rail.

22 Rail Data $e_{ij}$ is the deviation for observation $j$ on rail $i$ from the mean travel time for rail $i$, with $b_i \overset{iid}{\sim} N(0, \sigma_b^2)$ and $e_{ij} \overset{iid}{\sim} N(0, \sigma^2)$. Our assumptions imply that observations on the same rail are correlated; in fact,
$$\mathrm{corr} = \frac{\sigma_b^2}{\sigma_b^2 + \sigma^2}.$$
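The correlation formula follows in a few lines from the model, under the usual additional assumption (not spelled out on the slide) that the $b_i$ are independent of the $e_{ij}$. For two different observations $j \ne k$ on the same rail $i$:

```latex
\begin{align*}
\operatorname{Var}(y_{ij}) &= \operatorname{Var}(b_i) + \operatorname{Var}(e_{ij})
  = \sigma_b^2 + \sigma^2, \\
\operatorname{Cov}(y_{ij}, y_{ik}) &= \operatorname{Cov}(b_i + e_{ij},\; b_i + e_{ik})
  = \operatorname{Var}(b_i) = \sigma_b^2, \\
\operatorname{corr}(y_{ij}, y_{ik}) &= \frac{\operatorname{Cov}(y_{ij}, y_{ik})}
  {\sqrt{\operatorname{Var}(y_{ij})\,\operatorname{Var}(y_{ik})}}
  = \frac{\sigma_b^2}{\sigma_b^2 + \sigma^2}.
\end{align*}
```

Observations on different rails share no random term, so their covariance is zero.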

23 Rail Data
> # Rail as a random effect
> library(lme4)
> r2.lme <- lmer(travel ~ 1 + (1 | Rail), REML=FALSE, data=Rail)
This notation takes some getting used to. Specifically, (1 | Rail) means that there is a single random effect which is constant within each level of the grouping variable Rail.

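24 Rail Data
> summary(r2.lme)
Linear mixed model fit by maximum likelihood
Formula: travel ~ 1 + (1 | Rail)
Random effects: a table with columns Groups, Name, Variance, Std.Dev. and rows Rail (Intercept) and Residual
Number of obs: 18, groups: Rail, 6
Our best guesses at $\sigma_b^2$ and $\sigma^2$ are the entries $\hat\sigma_b^2$ and $\hat\sigma^2$ of the Variance column, and the estimated correlation between observations on a rail is $\widehat{\mathrm{corr}} = \hat\sigma_b^2 / (\hat\sigma_b^2 + \hat\sigma^2)$.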
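Rather than copying the variances out of the printed summary, the estimated correlation can be computed from the fitted object. The sketch below refits the same random-intercept model and uses `VarCorr()` from lme4; the object names `fit`, `vc`, `sigma2.b`, `sigma2` and `icc` are illustrative.

```r
library(nlme)  # Rail data
library(lme4)  # lmer(), VarCorr()

# Random-intercept model fit by maximum likelihood, as on the slide
fit <- lmer(travel ~ 1 + (1 | Rail), data = Rail, REML = FALSE)

# Variance components as a data frame: one row per group, variance in 'vcov'
vc <- as.data.frame(VarCorr(fit))
sigma2.b <- vc$vcov[vc$grp == "Rail"]      # between-rail variance
sigma2   <- vc$vcov[vc$grp == "Residual"]  # within-rail (residual) variance

# Estimated correlation between two observations on the same rail
icc <- sigma2.b / (sigma2.b + sigma2)
icc
```

A value close to 1 says that almost all of the variability in travel time is between rails rather than within a rail, in line with the earlier plot of the data.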

25 Rail Data The Fixed effects table of the summary (columns Estimate, Std. Error, t value) gives the estimate $\hat\beta$ of $\beta$ in its (Intercept) row. For a new rail drawn from the population this is our best guess at travel time, while the predicted values for a new observation on each of the same 6 rails are
> fixef(r2.lme) + ranef(r2.lme)$Rail
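The per-rail predictions can be compared with the raw per-rail means to see how the random-effects model pulls each rail toward the overall estimate. This is a sketch with illustrative object names (`fit`, `beta.hat`, `pred`, `raw`); it also checks that `coef()` returns the same fixed-plus-random sums.

```r
library(nlme)  # Rail data
library(lme4)  # lmer(), fixef(), ranef()

fit <- lmer(travel ~ 1 + (1 | Rail), data = Rail, REML = FALSE)

# Population-level estimate: best guess for a new, unseen rail
beta.hat <- fixef(fit)[["(Intercept)"]]

# Predictions for the six observed rails: beta.hat plus each rail's deviation
pred <- beta.hat + ranef(fit)$Rail[["(Intercept)"]]

# Raw per-rail sample means, in the same (factor level) order
raw <- as.vector(with(Rail, tapply(travel, Rail, mean)))

data.frame(rail = levels(Rail$Rail), raw.mean = raw, predicted = pred)

# coef() gives the same per-rail values directly
all.equal(pred, coef(fit)$Rail[["(Intercept)"]])
```

Each predicted value lies between the raw rail mean and `beta.hat`; with only 3 observations per rail but a very large between-rail variance, the amount of shrinkage is small here.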

26 Rail Data
> qqnorm(resid(r2.lme), main="")
[Figure: normal Q-Q plot of the residuals, sample quantiles against theoretical quantiles]

27 Rail Data
> plot(fitted(r2.lme), resid(r2.lme), xlab="fitted", ylab="residuals")
[Figure: residuals against fitted values]

28 Multilevel Models Multilevel models are special cases of mixed models, are useful for data with a hierarchical structure, and can be implemented in a variety of R packages.

29 Joint Schools Project
> library(faraway)
> data(jsp)
> ?jsp
Description: Example Dataset from "Practical Regression and Anova"
Format: See for yourself
Source: See Reference
References: Reference details may be found in "Practical Regression and Anova" by Julian Faraway

30 Joint Schools Project
> str(jsp)
'data.frame': 3236 obs. of 9 variables:
$ school : Factor w/ 49 levels "1","2","3","4",..:
$ class : Factor w/ 4 levels "1","2","3","4":
$ gender : Factor w/ 2 levels "boy","girl":
$ social : Factor w/ 9 levels "1","2","3","4",..:
$ raven : num
$ id : Factor w/ 1192 levels "1","2","3","4",..:
$ english: num
$ math : num
$ year : num

31 Joint Schools Project
> summary(jsp)
(summary output for school, class, gender, social, raven, id, english, math and year; among the values that can be read: english ranges from 0.00 to 98.00 with median 54.00 and mean 52.49, math ranges from 1.00 to 40.00 with median 28.00 and mean 26.66, and the maximum of raven is 36.00)

32 Joint Schools Project
> # Subset data to focus on year == 2
> jsp.y2 <- jsp[jsp$year==2,]
> plot(jitter(math) ~ jitter(raven), xlab="Raven Score", ylab="Math Score", data=jsp.y2)
> boxplot(math ~ social, xlab="Social Class", ylab="Math Score", data=jsp.y2)
> boxplot(math ~ gender, xlab="Gender", ylab="Math Score", data=jsp.y2)

33 Joint Schools Project [Figure: math score against Raven score] There is clearly correlation between math score and Raven score.

34 Joint Schools Project [Figure: box plots of math score by social class] There are differences in math scores between the levels of social class.

35 Joint Schools Project [Figure: box plots of math score by gender (boy, girl)] Math scores are not different between the levels of gender.

36 Joint Schools Project Center raven since we would otherwise be comparing to zero.
> jsp.y2$ctrraven <- jsp.y2$raven - mean(jsp.y2$raven)
> jsp1.lme <- lmer(math ~ ctrraven*social*gender + (1 | school) + (1 | school:class), data=jsp.y2)
> qqnorm(resid(jsp1.lme), main="")
> plot(resid(jsp1.lme) ~ fitted(jsp1.lme), xlab="fitted", ylab="residuals")
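The effect of centering can be seen in an ordinary regression: the slope is unchanged, but the intercept moves from "expected response at a score of zero" to "expected response at the average score". The sketch below uses simulated data; the variable names and the numbers used to generate `x` and `y` are arbitrary assumptions, not the jsp data.

```r
set.seed(1)  # arbitrary seed

# Hypothetical test scores on a scale that never gets near zero
x <- runif(100, min = 4, max = 36)
y <- 5 + 0.6 * x + rnorm(100, sd = 3)

fit.raw <- lm(y ~ x)               # intercept = fitted y at x = 0
x.ctr   <- x - mean(x)
fit.ctr <- lm(y ~ x.ctr)           # intercept = fitted y at the mean of x

rbind(raw = coef(fit.raw), centered = coef(fit.ctr))

# With a centered predictor the intercept is the sample mean of y
all.equal(unname(coef(fit.ctr)[1]), mean(y))
```

In a model with interactions, as here, centering also means the main effects of the other factors are evaluated at the average Raven score rather than at a score of zero.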

37 Joint Schools Project [Figure: normal Q-Q plot of the residuals, sample quantiles against theoretical quantiles] There isn't much of concern here.

38 Joint Schools Project [Figure: residuals against fitted values] There is some evidence of non-constant variance. Transformation?

39 Joint Schools Project
> anova(jsp1.lme)
Analysis of Variance Table
(columns Df, Sum Sq, Mean Sq, F value; rows ctrraven, social, gender, ctrraven:social, ctrraven:gender, social:gender, ctrraven:social:gender)

40 Joint Schools Project The p-values associated with the F-statistics are approximate and can be too small when the number of cases is small. However, this data set is probably large enough to overcome this limitation. The parametric bootstrap can be used if this is a concern.
> nrow(jsp.y2) - sum(anova(jsp1.lme)[,1]) - 1
[1] 917
> round(pf(anova(jsp1.lme)[,4], anova(jsp1.lme)[,1], 917, lower.tail=FALSE), 4)
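One way the parametric bootstrap could be carried out is sketched below for the question of whether gender is needed at all. This is an assumption about how to set it up, not the workshop's own code: it compares two maximum-likelihood fits with a likelihood ratio statistic, simulates responses from the smaller model with `simulate()`, and refits both models with `refit()` from lme4. The object names (`null`, `alt`, `lrt.obs`, `lrt.sim`, `p.boot`), the seed, and the small number of simulations `nsim` are illustrative; a real analysis would use many more simulations.

```r
library(lme4)
library(faraway)  # jsp data

data(jsp)
jsp.y2 <- jsp[jsp$year == 2, ]
jsp.y2$ctrraven <- jsp.y2$raven - mean(jsp.y2$raven)

# Fit both models by ML so their likelihoods are comparable
null <- lmer(math ~ ctrraven * social + (1 | school) + (1 | school:class),
             data = jsp.y2, REML = FALSE)
alt  <- lmer(math ~ ctrraven * social * gender + (1 | school) + (1 | school:class),
             data = jsp.y2, REML = FALSE)

# Observed likelihood ratio statistic
lrt.obs <- as.numeric(2 * (logLik(alt) - logLik(null)))

set.seed(123)  # arbitrary seed
nsim <- 50     # kept small so the sketch runs quickly
lrt.sim <- numeric(nsim)
for (i in seq_len(nsim)) {
  y.sim <- simulate(null)[[1]]  # new response generated under the null model
  lrt.sim[i] <- as.numeric(2 * (logLik(refit(alt, y.sim)) -
                                logLik(refit(null, y.sim))))
}

# Bootstrap p-value: share of simulated statistics at least as large as observed
p.boot <- mean(lrt.sim >= lrt.obs)
p.boot
```

The refits may print convergence or singular-fit messages for some simulated data sets; these do not stop the loop.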

41 Joint Schools Project
> # Remove gender
> jsp2.lme <- lmer(math ~ ctrraven*social + (1 | school) + (1 | school:class), data=jsp.y2)
> qqnorm(resid(jsp2.lme), main="")
> plot(resid(jsp2.lme) ~ fitted(jsp2.lme), xlab="fitted", ylab="residuals")
> qqnorm(ranef(jsp2.lme)$school[[1]], main="School Effects")
> qqnorm(ranef(jsp2.lme)$"school:class"[[1]], main="Class Effects")

42 Joint Schools Project [Figure: normal Q-Q plot of the residuals, sample quantiles against theoretical quantiles] Still not much of concern.

43 Joint Schools Project [Figure: residuals against fitted values] Constant variance is still a problem.

44 Joint Schools Project [Figure: normal Q-Q plot of the school effects]

45 Joint Schools Project [Figure: normal Q-Q plot of the class effects]

46 Joint Schools Project
> anova(jsp2.lme)
Analysis of Variance Table
(columns Df, Sum Sq, Mean Sq, F value; rows ctrraven, social, ctrraven:social)
> nrow(jsp.y2) - sum(anova(jsp2.lme)[,1]) - 1
[1] 935
> round(pf(anova(jsp2.lme)[,4], anova(jsp2.lme)[,1], 935, lower.tail=FALSE), 4)

47 Joint Schools Project
> sch.effects <- ranef(jsp2.lme)$school[[1]]
> summary(sch.effects)
(output: Min., 1st Qu., Median, Mean, 3rd Qu., Max. of the school effects)
> raw.sch.effects <- coef(lm(math ~ school - 1, jsp.y2))
> raw.sch.effects <- raw.sch.effects - mean(raw.sch.effects)
> plot(raw.sch.effects, sch.effects)
> sint <- c(9,14,29)
> text(raw.sch.effects[sint], sch.effects[sint]+0.2, c("9","15","30"))

48 Joint Schools Project [Figure: sch.effects against raw.sch.effects, with schools 9, 15 and 30 labeled]

Correlated Data: Linear Mixed Models with Random Intercepts

Correlated Data: Linear Mixed Models with Random Intercepts 1 Correlated Data: Linear Mixed Models with Random Intercepts Mixed Effects Models This lecture introduces linear mixed effects models. Linear mixed models are a type of regression model, which generalise

More information

Introduction and Background to Multilevel Analysis

Introduction and Background to Multilevel Analysis Introduction and Background to Multilevel Analysis Dr. J. Kyle Roberts Southern Methodist University Simmons School of Education and Human Development Department of Teaching and Learning Background and

More information

#Alternatively we could fit a model where the rail values are levels of a factor with fixed effects

#Alternatively we could fit a model where the rail values are levels of a factor with fixed effects examples-lme.r Tue Nov 25 12:32:20 2008 1 library(nlme) # The following data shows the results of tests carried over 6 rails. The response # indicated the time needed for a an ultrasonic wave to travel

More information

A brief introduction to mixed models

A brief introduction to mixed models A brief introduction to mixed models University of Gothenburg Gothenburg April 6, 2017 Outline An introduction to mixed models based on a few examples: Definition of standard mixed models. Parameter estimation.

More information

The First Thing You Ever Do When Receive a Set of Data Is

The First Thing You Ever Do When Receive a Set of Data Is The First Thing You Ever Do When Receive a Set of Data Is Understand the goal of the study What are the objectives of the study? What would the person like to see from the data? Understand the methodology

More information

Simple Linear Regression

Simple Linear Regression Simple Linear Regression MATH 282A Introduction to Computational Statistics University of California, San Diego Instructor: Ery Arias-Castro http://math.ucsd.edu/ eariasca/math282a.html MATH 282A University

More information

Randomized Block Designs with Replicates

Randomized Block Designs with Replicates LMM 021 Randomized Block ANOVA with Replicates 1 ORIGIN := 0 Randomized Block Designs with Replicates prepared by Wm Stein Randomized Block Designs with Replicates extends the use of one or more random

More information

Pumpkin Example: Flaws in Diagnostics: Correcting Models

Pumpkin Example: Flaws in Diagnostics: Correcting Models Math 3080. Treibergs Pumpkin Example: Flaws in Diagnostics: Correcting Models Name: Example March, 204 From Levine Ramsey & Smidt, Applied Statistics for Engineers and Scientists, Prentice Hall, Upper

More information

22s:152 Applied Linear Regression. Chapter 8: 1-Way Analysis of Variance (ANOVA) 2-Way Analysis of Variance (ANOVA)

22s:152 Applied Linear Regression. Chapter 8: 1-Way Analysis of Variance (ANOVA) 2-Way Analysis of Variance (ANOVA) 22s:152 Applied Linear Regression Chapter 8: 1-Way Analysis of Variance (ANOVA) 2-Way Analysis of Variance (ANOVA) We now consider an analysis with only categorical predictors (i.e. all predictors are

More information

MS&E 226: Small Data

MS&E 226: Small Data MS&E 226: Small Data Lecture 15: Examples of hypothesis tests (v5) Ramesh Johari ramesh.johari@stanford.edu 1 / 32 The recipe 2 / 32 The hypothesis testing recipe In this lecture we repeatedly apply the

More information

Models for Clustered Data

Models for Clustered Data Models for Clustered Data Edps/Psych/Soc 589 Carolyn J Anderson Department of Educational Psychology c Board of Trustees, University of Illinois Spring 2019 Outline Notation NELS88 data Fixed Effects ANOVA

More information

General Linear Statistical Models - Part III

General Linear Statistical Models - Part III General Linear Statistical Models - Part III Statistics 135 Autumn 2005 Copyright c 2005 by Mark E. Irwin Interaction Models Lets examine two models involving Weight and Domestic in the cars93 dataset.

More information

Models for Clustered Data

Models for Clustered Data Models for Clustered Data Edps/Psych/Stat 587 Carolyn J Anderson Department of Educational Psychology c Board of Trustees, University of Illinois Fall 2017 Outline Notation NELS88 data Fixed Effects ANOVA

More information

Regression. Marc H. Mehlman University of New Haven

Regression. Marc H. Mehlman University of New Haven Regression Marc H. Mehlman marcmehlman@yahoo.com University of New Haven the statistician knows that in nature there never was a normal distribution, there never was a straight line, yet with normal and

More information

Workshop 7.4a: Single factor ANOVA

Workshop 7.4a: Single factor ANOVA -1- Workshop 7.4a: Single factor ANOVA Murray Logan November 23, 2016 Table of contents 1 Revision 1 2 Anova Parameterization 2 3 Partitioning of variance (ANOVA) 10 4 Worked Examples 13 1. Revision 1.1.

More information

Beam Example: Identifying Influential Observations using the Hat Matrix

Beam Example: Identifying Influential Observations using the Hat Matrix Math 3080. Treibergs Beam Example: Identifying Influential Observations using the Hat Matrix Name: Example March 22, 204 This R c program explores influential observations and their detection using the

More information

UNIVERSITY OF MASSACHUSETTS. Department of Mathematics and Statistics. Basic Exam - Applied Statistics. Tuesday, January 17, 2017

UNIVERSITY OF MASSACHUSETTS. Department of Mathematics and Statistics. Basic Exam - Applied Statistics. Tuesday, January 17, 2017 UNIVERSITY OF MASSACHUSETTS Department of Mathematics and Statistics Basic Exam - Applied Statistics Tuesday, January 17, 2017 Work all problems 60 points are needed to pass at the Masters Level and 75

More information

Value Added Modeling

Value Added Modeling Value Added Modeling Dr. J. Kyle Roberts Southern Methodist University Simmons School of Education and Human Development Department of Teaching and Learning Background for VAMs Recall from previous lectures

More information

Ch 2: Simple Linear Regression

Ch 2: Simple Linear Regression Ch 2: Simple Linear Regression 1. Simple Linear Regression Model A simple regression model with a single regressor x is y = β 0 + β 1 x + ɛ, where we assume that the error ɛ is independent random component

More information

WU Weiterbildung. Linear Mixed Models

WU Weiterbildung. Linear Mixed Models Linear Mixed Effects Models WU Weiterbildung SLIDE 1 Outline 1 Estimation: ML vs. REML 2 Special Models On Two Levels Mixed ANOVA Or Random ANOVA Random Intercept Model Random Coefficients Model Intercept-and-Slopes-as-Outcomes

More information

STAT 3022 Spring 2007

STAT 3022 Spring 2007 Simple Linear Regression Example These commands reproduce what we did in class. You should enter these in R and see what they do. Start by typing > set.seed(42) to reset the random number generator so

More information

MATH 644: Regression Analysis Methods

MATH 644: Regression Analysis Methods MATH 644: Regression Analysis Methods FINAL EXAM Fall, 2012 INSTRUCTIONS TO STUDENTS: 1. This test contains SIX questions. It comprises ELEVEN printed pages. 2. Answer ALL questions for a total of 100

More information

20. REML Estimation of Variance Components. Copyright c 2018 (Iowa State University) 20. Statistics / 36

20. REML Estimation of Variance Components. Copyright c 2018 (Iowa State University) 20. Statistics / 36 20. REML Estimation of Variance Components Copyright c 2018 (Iowa State University) 20. Statistics 510 1 / 36 Consider the General Linear Model y = Xβ + ɛ, where ɛ N(0, Σ) and Σ is an n n positive definite

More information

De-mystifying random effects models

De-mystifying random effects models De-mystifying random effects models Peter J Diggle Lecture 4, Leahurst, October 2012 Linear regression input variable x factor, covariate, explanatory variable,... output variable y response, end-point,

More information

SCHOOL OF MATHEMATICS AND STATISTICS

SCHOOL OF MATHEMATICS AND STATISTICS RESTRICTED OPEN BOOK EXAMINATION (Not to be removed from the examination hall) Data provided: Statistics Tables by H.R. Neave MAS5052 SCHOOL OF MATHEMATICS AND STATISTICS Basic Statistics Spring Semester

More information

Stat 579: Generalized Linear Models and Extensions

Stat 579: Generalized Linear Models and Extensions Stat 579: Generalized Linear Models and Extensions Linear Mixed Models for Longitudinal Data Yan Lu April, 2018, week 14 1 / 64 Data structure and Model t1 t2 tn i 1st subject y 11 y 12 y 1n1 2nd subject

More information

Example: Poisondata. 22s:152 Applied Linear Regression. Chapter 8: ANOVA

Example: Poisondata. 22s:152 Applied Linear Regression. Chapter 8: ANOVA s:5 Applied Linear Regression Chapter 8: ANOVA Two-way ANOVA Used to compare populations means when the populations are classified by two factors (or categorical variables) For example sex and occupation

More information

1 Use of indicator random variables. (Chapter 8)

1 Use of indicator random variables. (Chapter 8) 1 Use of indicator random variables. (Chapter 8) let I(A) = 1 if the event A occurs, and I(A) = 0 otherwise. I(A) is referred to as the indicator of the event A. The notation I A is often used. 1 2 Fitting

More information

A Brief and Friendly Introduction to Mixed-Effects Models in Linguistics

A Brief and Friendly Introduction to Mixed-Effects Models in Linguistics A Brief and Friendly Introduction to Mixed-Effects Models in Linguistics Cluster-specific parameters ( random effects ) Σb Parameters governing inter-cluster variability b1 b2 bm x11 x1n1 x21 x2n2 xm1

More information

Lecture 1: Linear Models and Applications

Lecture 1: Linear Models and Applications Lecture 1: Linear Models and Applications Claudia Czado TU München c (Claudia Czado, TU Munich) ZFS/IMS Göttingen 2004 0 Overview Introduction to linear models Exploratory data analysis (EDA) Estimation

More information

Class: Taylor. January 12, Story time: Dan Willingham, the Cog Psyc. Willingham: Professor of cognitive psychology at Harvard

Class: Taylor. January 12, Story time: Dan Willingham, the Cog Psyc. Willingham: Professor of cognitive psychology at Harvard Class: Taylor January 12, 2011 (pdf version) Story time: Dan Willingham, the Cog Psyc Willingham: Professor of cognitive psychology at Harvard Why students don t like school We know lots about psychology

More information

Simple Linear Regression

Simple Linear Regression Simple Linear Regression In simple linear regression we are concerned about the relationship between two variables, X and Y. There are two components to such a relationship. 1. The strength of the relationship.

More information

Statistics - Lecture Three. Linear Models. Charlotte Wickham 1.

Statistics - Lecture Three. Linear Models. Charlotte Wickham   1. Statistics - Lecture Three Charlotte Wickham wickham@stat.berkeley.edu http://www.stat.berkeley.edu/~wickham/ Linear Models 1. The Theory 2. Practical Use 3. How to do it in R 4. An example 5. Extensions

More information

STAT 215 Confidence and Prediction Intervals in Regression

STAT 215 Confidence and Prediction Intervals in Regression STAT 215 Confidence and Prediction Intervals in Regression Colin Reimer Dawson Oberlin College 24 October 2016 Outline Regression Slope Inference Partitioning Variability Prediction Intervals Reminder:

More information

STAT 510 Final Exam Spring 2015

STAT 510 Final Exam Spring 2015 STAT 510 Final Exam Spring 2015 Instructions: The is a closed-notes, closed-book exam No calculator or electronic device of any kind may be used Use nothing but a pen or pencil Please write your name and

More information

22s:152 Applied Linear Regression. Take random samples from each of m populations.

22s:152 Applied Linear Regression. Take random samples from each of m populations. 22s:152 Applied Linear Regression Chapter 8: ANOVA NOTE: We will meet in the lab on Monday October 10. One-way ANOVA Focuses on testing for differences among group means. Take random samples from each

More information

Chapter 1 Linear Regression with One Predictor

Chapter 1 Linear Regression with One Predictor STAT 525 FALL 2018 Chapter 1 Linear Regression with One Predictor Professor Min Zhang Goals of Regression Analysis Serve three purposes Describes an association between X and Y In some applications, the

More information

Information. Hierarchical Models - Statistical Methods. References. Outline

Information. Hierarchical Models - Statistical Methods. References. Outline Information Hierarchical Models - Statistical Methods Sarah Filippi 1 University of Oxford Hilary Term 2015 Webpage: http://www.stats.ox.ac.uk/~filippi/msc_ hierarchicalmodels_2015.html Lectures: Week

More information

Regression on Faithful with Section 9.3 content

Regression on Faithful with Section 9.3 content Regression on Faithful with Section 9.3 content The faithful data frame contains 272 obervational units with variables waiting and eruptions measuring, in minutes, the amount of wait time between eruptions,

More information

Recall that a measure of fit is the sum of squared residuals: where. The F-test statistic may be written as:

Recall that a measure of fit is the sum of squared residuals: where. The F-test statistic may be written as: 1 Joint hypotheses The null and alternative hypotheses can usually be interpreted as a restricted model ( ) and an model ( ). In our example: Note that if the model fits significantly better than the restricted

More information

Topic 17 - Single Factor Analysis of Variance. Outline. One-way ANOVA. The Data / Notation. One way ANOVA Cell means model Factor effects model

Topic 17 - Single Factor Analysis of Variance. Outline. One-way ANOVA. The Data / Notation. One way ANOVA Cell means model Factor effects model Topic 17 - Single Factor Analysis of Variance - Fall 2013 One way ANOVA Cell means model Factor effects model Outline Topic 17 2 One-way ANOVA Response variable Y is continuous Explanatory variable is

More information

Hierarchical Linear Models (HLM) Using R Package nlme. Interpretation. 2 = ( x 2) u 0j. e ij

Hierarchical Linear Models (HLM) Using R Package nlme. Interpretation. 2 = ( x 2) u 0j. e ij Hierarchical Linear Models (HLM) Using R Package nlme Interpretation I. The Null Model Level 1 (student level) model is mathach ij = β 0j + e ij Level 2 (school level) model is β 0j = γ 00 + u 0j Combined

More information

1 Introduction. 2 Example

1 Introduction. 2 Example Statistics: Multilevel modelling Richard Buxton. 2008. Introduction Multilevel modelling is an approach that can be used to handle clustered or grouped data. Suppose we are trying to discover some of the

More information

22s:152 Applied Linear Regression. There are a couple commonly used models for a one-way ANOVA with m groups. Chapter 8: ANOVA

22s:152 Applied Linear Regression. There are a couple commonly used models for a one-way ANOVA with m groups. Chapter 8: ANOVA 22s:152 Applied Linear Regression Chapter 8: ANOVA NOTE: We will meet in the lab on Monday October 10. One-way ANOVA Focuses on testing for differences among group means. Take random samples from each

More information

22s:152 Applied Linear Regression. 1-way ANOVA visual:

22s:152 Applied Linear Regression. 1-way ANOVA visual: 22s:152 Applied Linear Regression 1-way ANOVA visual: Chapter 8: 1-Way Analysis of Variance (ANOVA) 2-Way Analysis of Variance (ANOVA) 0.00 0.05 0.10 0.15 0.20 0.25 0.30 0.35 Y We now consider an analysis

More information

Module 03 Lecture 14 Inferential Statistics ANOVA and TOI

Module 03 Lecture 14 Inferential Statistics ANOVA and TOI Introduction of Data Analytics Prof. Nandan Sudarsanam and Prof. B Ravindran Department of Management Studies and Department of Computer Science and Engineering Indian Institute of Technology, Madras Module

More information

R 2 and F -Tests and ANOVA

R 2 and F -Tests and ANOVA R 2 and F -Tests and ANOVA December 6, 2018 1 Partition of Sums of Squares The distance from any point y i in a collection of data, to the mean of the data ȳ, is the deviation, written as y i ȳ. Definition.

More information

Lab 3 A Quick Introduction to Multiple Linear Regression Psychology The Multiple Linear Regression Model

Lab 3 A Quick Introduction to Multiple Linear Regression Psychology The Multiple Linear Regression Model Lab 3 A Quick Introduction to Multiple Linear Regression Psychology 310 Instructions.Work through the lab, saving the output as you go. You will be submitting your assignment as an R Markdown document.

More information

Random Coefficients Model Examples

Random Coefficients Model Examples Random Coefficients Model Examples STAT:5201 Week 15 - Lecture 2 1 / 26 Each subject (or experimental unit) has multiple measurements (this could be over time, or it could be multiple measurements on a

More information

Stat 5303 (Oehlert): Randomized Complete Blocks 1

Stat 5303 (Oehlert): Randomized Complete Blocks 1 Stat 5303 (Oehlert): Randomized Complete Blocks 1 > library(stat5303libs);library(cfcdae);library(lme4) > immer Loc Var Y1 Y2 1 UF M 81.0 80.7 2 UF S 105.4 82.3 3 UF V 119.7 80.4 4 UF T 109.7 87.2 5 UF

More information

Random Intercept Models

Random Intercept Models Random Intercept Models Edps/Psych/Soc 589 Carolyn J. Anderson Department of Educational Psychology c Board of Trustees, University of Illinois Spring 2019 Outline A very simple case of a random intercept

More information

Lecture 10. Factorial experiments (2-way ANOVA etc)

Lecture 10. Factorial experiments (2-way ANOVA etc) Lecture 10. Factorial experiments (2-way ANOVA etc) Jesper Rydén Matematiska institutionen, Uppsala universitet jesper@math.uu.se Regression and Analysis of Variance autumn 2014 A factorial experiment

More information

Lecture 18: Simple Linear Regression


