Analysis of Covariance

Similar documents
STAT 705 Chapters 22: Analysis of Covariance

STAT 506: Randomized complete block designs

Stat 705: Completely randomized and complete block designs

Analysis of Covariance

Chapter 20 : Two factor studies one case per treatment Chapter 21: Randomized complete block designs

Analysis of Covariance. The following example illustrates a case where the covariate is affected by the treatments.

STAT 705 Chapter 19: Two-way ANOVA

Multiple comparisons The problem with the one-pair-at-a-time approach is its error rate.

VIII. ANCOVA. A. Introduction

STAT 705 Chapter 16: One-way ANOVA

Nested Designs & Random Effects

Chapter 8: Regression Models with Qualitative Predictors

STAT 705 Chapters 23 and 24: Two factors, unequal sample sizes; multi-factor ANOVA

Regression Models for Quantitative and Qualitative Predictors: An Overview

STAT 525 Fall Final exam. Tuesday December 14, 2010

Chapter 12: Linear regression II

Sections 2.3, 2.4. Timothy Hanson. Department of Statistics, University of South Carolina. Stat 770: Categorical Data Analysis 1 / 21

Blocks are formed by grouping EUs in what way? How are experimental units randomized to treatments?

Regression With a Categorical Independent Variable

Comparing Nested Models

Response Surface Methods

SAS Program Part 1: proc import datafile="y:\iowa_classes\stat_5201_design\examples\2-23_drillspeed_feed\mont_5-7.csv" out=ds dbms=csv replace; run;

y = µj n + β 1 b β b b b + α 1 t α a t a + e

Lec 5: Factorial Experiment

BIOL 933!! Lab 10!! Fall Topic 13: Covariance Analysis

Multivariate analysis of variance and covariance

Sections 7.1, 7.2, 7.4, & 7.6

Regression With a Categorical Independent Variable

STATS DOESN T SUCK! ~ CHAPTER 16

Multiple Predictor Variables: ANOVA

Categorical Predictor Variables

Lecture 9: Factorial Design Montgomery: chapter 5

STAT22200 Spring 2014 Chapter 8A

STAT 705 Chapter 19: Two-way ANOVA

WU Weiterbildung. Linear Mixed Models

Two-Way Factorial Designs

Two-factor studies. STAT 525 Chapter 19 and 20. Professor Olga Vitek

Stat 579: Generalized Linear Models and Extensions

Formula for the t-test

BIOSTATISTICAL METHODS

Analysis of Variance

Topic 13. Analysis of Covariance (ANCOVA) [ST&D chapter 17] 13.1 Introduction Review of regression concepts

More than one treatment: factorials

Regression With a Categorical Independent Variable

Factorial designs. Experiments

Lecture 10. Factorial experiments (2-way ANOVA etc)

Stat 217 Final Exam. Name: May 1, 2002

Analysis of Covariance

ANOVA Situation The F Statistic Multiple Comparisons. 1-Way ANOVA MATH 143. Department of Mathematics and Statistics Calvin College

Outline Topic 21 - Two Factor ANOVA

Business Statistics. Lecture 10: Correlation and Linear Regression

Univariate analysis. Simple and Multiple Regression. Univariate analysis. Simple Regression How best to summarise the data?

Lec 1: An Introduction to ANOVA

Fitting a regression model

Overview. 4.1 Tables and Graphs for the Relationship Between Two Variables. 4.2 Introduction to Correlation. 4.3 Introduction to Regression 3.

STAT Final Practice Problems

Well-developed and understood properties

Stat 502 Design and Analysis of Experiments General Linear Model

Experimental Design and Data Analysis for Biologists

Topic 28: Unequal Replication in Two-Way ANOVA

Factorial Treatment Structure: Part I. Lukas Meier, Seminar für Statistik

11. Linear Mixed-Effects Models. Copyright c 2018 Dan Nettleton (Iowa State University) 11. Statistics / 49

Notes on Maxwell & Delaney

General Linear Models. with General Linear Hypothesis Tests and Likelihood Ratio Tests

A Significance Test for the Lasso

Joint Probability Distributions

STAT 704 Sections IRLS and Bootstrap

Unbalanced Data in Factorials Types I, II, III SS Part 1

8/04/2011. last lecture: correlation and regression next lecture: standard MR & hierarchical MR (MR = multiple regression)

Regression With a Categorical Independent Variable: Mean Comparisons

Topic 13. Analysis of Covariance (ANCOVA) - Part II [ST&D Ch. 17]

Stat/F&W Ecol/Hort 572 Review Points Ané, Spring 2010

Two or more categorical predictors. 2.1 Two fixed effects

Confidence Intervals, Testing and ANOVA Summary

Example. Multiple Regression. Review of ANOVA & Simple Regression /749 Experimental Design for Behavioral and Social Sciences

1. An article on peanut butter in Consumer reports reported the following scores for various brands

Analysis of Variance. Read Chapter 14 and Sections to review one-way ANOVA.

L21: Chapter 12: Linear regression

BIOS 2083: Linear Models

Lecture 2. The Simple Linear Regression Model: Matrix Approach

PLS205!! Lab 9!! March 6, Topic 13: Covariance Analysis

BIOS 6649: Handout Exercise Solution

Multiple Linear Regression for the Supervisor Data

STAT 430 (Fall 2017): Tutorial 8

1. (Problem 3.4 in OLRT)

Conjugate Analysis for the Linear Model

Rule of Thumb Think beyond simple ANOVA when a factor is time or dose think ANCOVA.

1. (Rao example 11.15) A study measures oxygen demand (y) (on a log scale) and five explanatory variables (see below). Data are available as

REPEATED MEASURES. Copyright c 2012 (Iowa State University) Statistics / 29

3. Factorial Experiments (Ch.5. Factorial Experiments)

Chapter 1. Modeling Basics

Statistics GIDP Ph.D. Qualifying Exam Methodology

Review of the General Linear Model

STA441: Spring Multiple Regression. This slide show is a free open source document. See the last slide for copyright information.

Lecture 11: Nested and Split-Plot Designs

1-Way ANOVA MATH 143. Spring Department of Mathematics and Statistics Calvin College

Review: Second Half of Course Stat 704: Data Analysis I, Fall 2014

Chapter 4: Randomized Blocks and Latin Squares

Unit 6: Orthogonal Designs Theory, Randomized Complete Block Designs, and Latin Squares

Least Squares Regression

Transcription:

Analysis of Covariance Timothy Hanson Department of Statistics, University of South Carolina Stat 506: Introduction to Experimental Design 1 / 11

ANalysis of COVAriance Add a continuous predictor to an ANOVA model = ANCOVA. Mix continuous and discrete predictors. Useful for testing treatment effects in presence of continuous predictor(s) that may explain much variability. Continuous predictor may be concomitant (supplemental, uncontrolled) or controlled (e.g. drug dose in mg). Concomitant variable should be unaffected by treatments; i.e. they should be independent. They are often measured before study takes place. Examples: prestudy attitude, age, SES, aptitude, baseline outcomes (e.g. seizure rate). Often the same types of variables one might block on in a RCBD. 2 / 11

Simplest ANCOVA model One treatment and one covariate that enters model linearly. Have i = 1,..., r treatment levels and j = 1,..., n i observations within level i. Model is y ij = µ + τ i + γx ij + ɛ ij. This gives r parallel regression lines, one for each treatment level (a picture helps). Fixing x, the mean difference between group i and group j is µ + τ i + γx (µ + τ j + γx) = τ i τ j. Can get from lsmeans, pairwise, etc. 3 / 11

ANOVA table For the simple ANCOVA model, the ANOVA table will have a row for the concomitant variable and another row for the treatment effects. The p-values test H 0 : γ = 0 (concomitant variable not important) and H 0 : τ 1 = = τ r = 0 (no treatment differences). 4 / 11

Cracker sales CRD where N = 15 stores were randomly assigned one of three promotion treatment levels: 1 i = 1 sampling of product by customers in store and regular shelf space, 2 i = 2 additional shelf space, 3 i = 3 special display shelves at ends of aisle in addition to regular shelf space. y ij is number of cases sold during the promotional period. x ij is number of cases sold during the previous (non-promotional) period. Model fit in R is y ij = µ + τ i + γx ij + ɛ ij. 5 / 11

Cracker sales in SAS library(cfcdae); library(lsmeans); library(car) treatment=factor(c(1,1,1,1,1,2,2,2,2,2,3,3,3,3,3)) cases =c(38,39,36,45,33,43,38,38,27,34,24,32,31,21,28) preceding=c(21,26,22,28,19,34,26,29,18,25,23,29,30,16,29) d=data.frame(cases,preceding,treatment) plot(cases~preceding,pch=19,col=c("green","blue","red")[treatment]) legend(17,45,legend=c("1","2","3"),col=c("green","blue","red"),pch=19) f=lm(cases~preceding+treatment) Anova(f,type=3) lsmeans(f,"treatment") pairs(lsmeans(f,"treatment")) pairwise(f,treatment) library(hh) # has a nice function ancova(cases~preceding+treatment,data=d) 6 / 11

Checking for non-constant slopes The assumption of parallel slopes should be checked, via plots and/or Type III tests. A model that allows for slopes to change with treatment is y ij = [µ + τ j ] + [γ + γ j ]x ij + ɛ ij. f2=lm(cases~preceding*treatment) Anova(f2,type=3) # p=0.4 so additive model okay Diagnostics? 7 / 11

Generalizations Basic model is y ij = µ + τ i + γx ij + ɛ ij. Response mean is linear function of x for each treatment group: parallel lines. i = 1,..., r levels of one treatment modeled. τ i τ j gives mean treatment differences for a given level of x. Similar, but simpler than a RCBD with x chopped up into categories like age group. Just treat age as continuous. Increased efficiency if age really is linear. Nonlinear mean, e.g. y ij = µ + τ i + γ 1 x ij + γ 2 x 2 ij + ɛ ij. Mean response is parallel curves in x, one for each treatment level. Might be necessary if e ij vs ŷ ij shows a parabolic (or otherwise nonlinear) shape. τ i τ j again gives mean treatment differences for a given level of x. 8 / 11

Generalizations More factors, e.g. y ijk = µ + α i + β j + (αβ) ij + γx ijk + ɛ ijk. Here i = 1,..., a levels of A, j = 1,..., b levels of B, and k = 1,..., n ij replicates in A = i and B = j. If this fits, should see approximately parallel curves in scatterplot stratified by (i, j). If H 0 : (αβ) ij = 0 then analysis simplifies; can look at differences in main effects. Pairwise difference, e.g. β 3 β 1 do not change with either i or x. More concomitant variables, e.g. y ijk = µ + τ i + γ 1 x i1k + γ 2 x i2k + ɛ ijk where x ijk is variable j on kth subject with treatment i. Mean response is parallel surfaces in (x 1, x 2 ). Here we are assuming parallel planes, one for each level of i. 9 / 11

Salable flowers Factor A is flower variety: i = 1 LP, i = 2 WB. Factor B is moisture level: j = 1 low, j = 2 high. N = 24 plots total; n ij = 6 replications of each pairing (i, j). y ijk is number of flowers horticulturist can sell. x ijk is plot size; expect γ > 0. Model is y ijk = µ + α i + β j + (αβ) ij + γx ijk + ɛ ijk. CRD with factorial treatment structure. 10 / 11

Salable flowers in SAS variety= factor(c(1,1,1,1,1,1,2,2,2,2,2,2,1,1,1,1,1,1,2,2,2,2,2,2)) moisture=factor(c(1,1,1,1,1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2,2,2,2,2)) yield= c(98,60,77,80,95,64,55,60,75,65,87,78,71,80,86,82,46,55,76,68,43,47,62,70) plotsize=c(15, 4, 7, 9,14, 5, 4, 5, 8, 7,13,11,10,12,14,13, 2, 3,11,10, 2, 3, 7, 9) d=data.frame(yield,plotsize,variety,moisture) plot(yield~plotsize,col=rep(1:4,each=6),main="yield by plotsize & variety:moisture",pch=19) legend(3,90,legend=c("1:1","2:1","1:2","2:2"),col=1:4,pch=19) f1=lm(yield~plotsize+variety*moisture,data=d) Anova(f,type=3) f2=lm(yield~plotsize+variety+moisture,data=d) pairs(lsmeans(f2,"variety")) pairs(lsmeans(f2,"moisture")) Let s look at diagnostics... 11 / 11