VIII. ANCOVA. A. Introduction


ANCOVA stands for Analysis of Covariance. In most experiments and observational studies, additional information on each experimental unit is available beyond the factors under direct control or of interest.

The additional information consists of covariates (continuous or categorical) thought to also influence or be associated with the outcome. Controlling for these covariates can often increase the precision of the group mean comparisons, primarily by decreasing the...

In blocking designs, additional information is already being used. How?

However, blocking on covariates known or likely to be important is sometimes not feasible. There may be too few experimental units (EUs) with a particular covariate value (or range of values) to form a block. In most medical studies, age, weight, gender, health history, and other variables are known to be associated with most health outcomes, but we can't block on all those covariates unless the sample size is huge.

Sufficient information on the covariate may not be available in advance of the experiment to form blocks.

Example:
outcome = wrist pain each day for 7 days
experimental factors = desk type, keyboard type
covariate =

Covariates are sometimes called control variables or concomitant variables. We will begin with a model that assumes the association of the covariate with the outcome is the same for all treatment groups.

B. One-Way ANOVA with One Covariate

EXAMPLE: Exercise & Oxygen Ventilation

Researchers in exercise physiology want to know whether 12 weeks of an outdoor jogging regimen is better preparation for a cardiovascular health treadmill test than 12 weeks of a step aerobics regimen.

Twelve subjects with a sedentary lifestyle were recruited and had their cardiovascular health tested. Six each were randomly assigned to the two training regimens. Age was also recorded. Twelve weeks later, cardiovascular health was tested again, and the change from baseline was computed.


Model:

$$Y_{ij} = \mu_i + \beta (X_{ij} - \bar{X}_{..}) + e_{ij}, \qquad i = 1, \ldots, t, \quad j = 1, \ldots, r$$

$$e_{ij} \overset{iid}{\sim} N(0, \sigma_e^2), \qquad \mu_i = \mu + \alpha_i = \text{treatment group means}$$

Note that $\beta$ has no subscripts and that treatment and the covariate $X_{ij}$ do not interact. This is sometimes called the parallel-lines model. Why? This is also called the separable intercepts model. Why?
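As a rough illustration of fitting this model, the sketch below estimates the common slope as the pooled within-group cross-product divided by the pooled within-group sum of squares of the covariate. The data, the group labels `jog` and `step`, and all variable names are invented for illustration and are not the study's data.

```python
# Toy data invented for illustration only (NOT the study's data):
# covariate x (think: age) and outcome y for t = 2 groups, r = 4 each.
groups = {
    "jog":  {"x": [20, 30, 40, 50], "y": [10, 8, 7, 3]},
    "step": {"x": [30, 40, 50, 60], "y": [12, 11, 8, 7]},
}

def mean(values):
    return sum(values) / len(values)

# Pooled within-group sum of squares (x) and cross-products (x, y).
e_xx = e_xy = 0.0
for g in groups.values():
    xbar, ybar = mean(g["x"]), mean(g["y"])
    e_xx += sum((x - xbar) ** 2 for x in g["x"])
    e_xy += sum((x - xbar) * (y - ybar) for x, y in zip(g["x"], g["y"]))

beta_hat = e_xy / e_xx  # one slope shared by every group

# Each group's fitted line uses the same slope but its own intercept.
intercepts = {
    name: mean(g["y"]) - beta_hat * mean(g["x"])
    for name, g in groups.items()
}
print(beta_hat, intercepts)
```

With these toy numbers the two fitted lines share the slope `beta_hat` and differ only in their intercepts, which is one way to see where the model's two names come from.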

Tests: Because we are now including a continuous covariate, there are no easy formulas for sums of squares. SSTreatment and SSCovariate are not orthogonal! Tests are constructed using the regression approach. Treatment and covariate are each adjusted for the other.
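A minimal sketch of the regression approach, assuming a single covariate and made-up data: each effect is tested by comparing the error sum of squares of the full model with that of a reduced model that drops the effect. All names and numbers here are hypothetical, and only the F statistics are computed (no p-values).

```python
# Toy data invented for illustration only (NOT the study's data).
groups = {
    "jog":  {"x": [20, 30, 40, 50], "y": [10, 8, 7, 3]},
    "step": {"x": [30, 40, 50, 60], "y": [12, 11, 8, 7]},
}

def mean(values):
    return sum(values) / len(values)

def sums(xs, ys):
    """Return (Sxx, Sxy, Syy) computed about the means of xs and ys."""
    xbar, ybar = mean(xs), mean(ys)
    sxx = sum((x - xbar) ** 2 for x in xs)
    sxy = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys))
    syy = sum((y - ybar) ** 2 for y in ys)
    return sxx, sxy, syy

t = len(groups)
n = sum(len(g["y"]) for g in groups.values())

# Within-group (error) sums, pooled over groups.
e_xx = e_xy = e_yy = 0.0
for g in groups.values():
    a, b, c = sums(g["x"], g["y"])
    e_xx, e_xy, e_yy = e_xx + a, e_xy + b, e_yy + c

# Total sums, ignoring group labels.
all_x = [x for g in groups.values() for x in g["x"]]
all_y = [y for g in groups.values() for y in g["y"]]
s_xx, s_xy, s_yy = sums(all_x, all_y)

sse_full = e_yy - e_xy ** 2 / e_xx    # treatment + covariate
sse_no_trt = s_yy - s_xy ** 2 / s_xx  # covariate only (a single line)
sse_no_cov = e_yy                     # treatment only (one-way ANOVA)

mse_full = sse_full / (n - t - 1)
f_trt = (sse_no_trt - sse_full) / (t - 1) / mse_full  # treatment adjusted for covariate
f_cov = (sse_no_cov - sse_full) / 1 / mse_full        # covariate adjusted for treatment
print(f_trt, f_cov)
```

Each F statistic has error degrees of freedom `n - t - 1`, because the full model estimates `t` group means plus one slope.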

Estimation: How do we now compute estimated treatment means? We might want to somehow adjust them for the covariate effect. $\bar{Y}_{1.}$ is the estimated outcome in group 1, but group 1 has mean covariate $\bar{X}_{1.}$. $\bar{Y}_{2.}$ is the estimated outcome in group 2, but group 2 has mean covariate $\bar{X}_{2.}$.

To make $\bar{Y}_{1.}$ comparable to $\bar{Y}_{2.}$, we must adjust them so that they equal what we would have observed if both groups had had the same mean covariate value. We adjust to $\bar{X}_{..}$.

Unadjusted: $\hat{\mu}_i = \bar{Y}_{i.}$

Adjusted: $\hat{\mu}_i^{adj} = \bar{Y}_{ia} = \bar{Y}_{i.} - \hat{\beta} (\bar{X}_{i.} - \bar{X}_{..})$
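The adjustment can be sketched numerically as follows. The data and names are invented for illustration. The helper `adjusted_means` is hypothetical and defaults to the overall covariate mean, with an optional argument for adjusting to some other covariate value.

```python
# Toy data invented for illustration only (NOT the study's data).
groups = {
    "jog":  {"x": [20, 30, 40, 50], "y": [10, 8, 7, 3]},
    "step": {"x": [30, 40, 50, 60], "y": [12, 11, 8, 7]},
}

def mean(values):
    return sum(values) / len(values)

# Common slope from the pooled within-group sums.
e_xx = e_xy = 0.0
for g in groups.values():
    xbar, ybar = mean(g["x"]), mean(g["y"])
    e_xx += sum((x - xbar) ** 2 for x in g["x"])
    e_xy += sum((x - xbar) * (y - ybar) for x, y in zip(g["x"], g["y"]))
beta_hat = e_xy / e_xx

grand_xbar = mean([x for g in groups.values() for x in g["x"]])

def adjusted_means(x0=grand_xbar):
    """Group means shifted along the common slope to covariate value x0."""
    return {
        name: mean(g["y"]) - beta_hat * (mean(g["x"]) - x0)
        for name, g in groups.items()
    }

unadjusted = {name: mean(g["y"]) for name, g in groups.items()}
adjusted = adjusted_means()
print(unadjusted)
print(adjusted)
```

In this toy example the group with the lower mean covariate is pulled down and the other group is pulled up, so the adjusted difference is larger than the unadjusted one.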

Where does this come from? Start with $E[Y_{ij}] = \mu_i + \beta (X_{ij} - \bar{X}_{..})$.
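One way to fill in the steps, sketched under the model's assumptions (errors with mean zero and a slope common to all groups): average the expectation over the units in group $i$, solve for $\mu_i$, and plug in the estimates.

```latex
\begin{align*}
E[Y_{ij}] &= \mu_i + \beta (X_{ij} - \bar{X}_{..}) \\
E[\bar{Y}_{i.}] &= \mu_i + \beta (\bar{X}_{i.} - \bar{X}_{..})
  && \text{(average over } j = 1, \ldots, r\text{)} \\
\mu_i &= E[\bar{Y}_{i.}] - \beta (\bar{X}_{i.} - \bar{X}_{..}) \\
\hat{\mu}_i^{adj} &= \bar{Y}_{i.} - \hat{\beta} (\bar{X}_{i.} - \bar{X}_{..})
  && \text{(replace unknowns by estimates)}
\end{align*}
```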

This computation is equivalent to what LSMEANS does in SAS. What is the interpretation of these estimated means?


Note that the adjusted means could be adjusted to any covariate value, not just to the average observed covariate value.

What is an advantage of using covariate adjusted means? What is a disadvantage of using covariate adjusted means?

For contrasts or confidence intervals,

$$Var[\hat{\mu}_i^{adj}] = MSE \left[ \frac{1}{r} + \frac{(\bar{X}_{i.} - \bar{X}_{..})^2}{\sum_{i=1}^{t} \sum_{j=1}^{r} (X_{ij} - \bar{X}_{i.})^2} \right]$$

Notice that it will change across $i$ values!!
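A small numerical sketch of this variance, using invented data and hypothetical names: MSE comes from the parallel-lines fit with `n - t - 1` error degrees of freedom, and the second term grows with the distance between a group's covariate mean and the overall covariate mean.

```python
import math

# Toy data invented for illustration only (NOT the study's data).
groups = {
    "jog":  {"x": [20, 30, 40, 50], "y": [10, 8, 7, 3]},
    "step": {"x": [30, 40, 50, 60], "y": [12, 11, 8, 7]},
}

def mean(values):
    return sum(values) / len(values)

t = len(groups)
n = sum(len(g["y"]) for g in groups.values())

# Pooled within-group sums of squares and cross-products.
e_xx = e_xy = e_yy = 0.0
for g in groups.values():
    xbar, ybar = mean(g["x"]), mean(g["y"])
    e_xx += sum((x - xbar) ** 2 for x in g["x"])
    e_xy += sum((x - xbar) * (y - ybar) for x, y in zip(g["x"], g["y"]))
    e_yy += sum((y - ybar) ** 2 for y in g["y"])

mse = (e_yy - e_xy ** 2 / e_xx) / (n - t - 1)
grand_xbar = mean([x for g in groups.values() for x in g["x"]])

# Variance and standard error of each covariate-adjusted mean.
var_adj = {
    name: mse * (1 / len(g["x"]) + (mean(g["x"]) - grand_xbar) ** 2 / e_xx)
    for name, g in groups.items()
}
se_adj = {name: math.sqrt(v) for name, v in var_adj.items()}
print(var_adj, se_adj)
```

With only two equal-sized groups, both covariate means sit the same distance from the overall mean, so the two variances coincide in this toy case. With three or more groups they generally differ.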

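Efficiency: Was using a covariate worth it?

$$RE = \frac{MSE(\text{Reduced model})}{MSE(\text{Full model}) \left( 1 + \dfrac{r \sum_{i=1}^{t} (\bar{X}_{i.} - \bar{X}_{..})^2}{\sum_{i=1}^{t} \sum_{j=1}^{r} (X_{ij} - \bar{X}_{i.})^2} \right)}$$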
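The relative efficiency can be sketched numerically as below. This assumes that the reduced model is the one-way ANOVA without the covariate and that the full model is the parallel-lines ANCOVA. The data and all names are invented for illustration.

```python
# Toy data invented for illustration only (NOT the study's data).
groups = {
    "jog":  {"x": [20, 30, 40, 50], "y": [10, 8, 7, 3]},
    "step": {"x": [30, 40, 50, 60], "y": [12, 11, 8, 7]},
}

def mean(values):
    return sum(values) / len(values)

t = len(groups)
n = sum(len(g["y"]) for g in groups.values())
grand_xbar = mean([x for g in groups.values() for x in g["x"]])

# Pooled within-group sums of squares and cross-products.
e_xx = e_xy = e_yy = 0.0
for g in groups.values():
    xbar, ybar = mean(g["x"]), mean(g["y"])
    e_xx += sum((x - xbar) ** 2 for x in g["x"])
    e_xy += sum((x - xbar) * (y - ybar) for x, y in zip(g["x"], g["y"]))
    e_yy += sum((y - ybar) ** 2 for y in g["y"])

# Between-group sum of squares of the covariate: r * sum of squared
# deviations of the group covariate means from the overall mean.
t_xx = sum(
    len(g["x"]) * (mean(g["x"]) - grand_xbar) ** 2 for g in groups.values()
)

mse_reduced = e_yy / (n - t)                       # assumed: no covariate
mse_full = (e_yy - e_xy ** 2 / e_xx) / (n - t - 1)  # assumed: with covariate

rel_eff = mse_reduced / (mse_full * (1 + t_xx / e_xx))
print(rel_eff)
```

A value well above 1 suggests the covariate bought a real gain in precision. The toy data were built with a strong covariate effect, so the ratio here is large by construction.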

Diagnostics: As before, plus we need to check whether
(a) treatments affected the covariate
(b) Y vs X is really linear
(c) $\beta$ is the same for all treatments.

(a) Verify how the study was carried out. Is it physically possible that a treatment could affect a covariate value? Was the covariate collected before the treatments were applied? If the covariate value was affected by a treatment, then ANCOVA is not appropriate.

(b) Plot $Y_{ij}$ vs $X_{ij}$ and look for an approximately linear scatter of points. Also plot residuals vs. $X_{ij}$ and look for no systematic pattern in the scatter of points. If both plots look as described, the linearity assumption is reasonable. If the linear trend does not seem appropriate, then consider transforming $X_{ij}$ and/or $Y_{ij}$, or consider a more complex ANCOVA model with, e.g., linear and quadratic effects for the covariate.

(c) Plot $Y_{ij}$ vs $X_{ij}$ and superimpose a loess smooth separately for each treatment group. If they are approximately parallel, then the ANCOVA model is appropriate. If not, then we have an interaction between treatment and covariate.

C. ANCOVA with Interaction

If the covariate effect is not the same for all treatments, then we have a separate-slopes model. This allows us to, in effect, carry out a separate regression for each treatment group.

Model:

$$Y_{ij} = \mu_i + \beta_i (X_{ij} - \bar{X}_{..}) + e_{ij}, \qquad i = 1, \ldots, t, \quad j = 1, \ldots, r$$

$$e_{ij} \overset{iid}{\sim} N(0, \sigma_e^2)$$

This is identical to including a treatment by covariate interaction.

Tests: What is the first test to carry out?

$H_0: \beta_1 = \beta_2 = \cdots = \beta_t$

Full model:
Reduced model:
df Full model =
df Reduced model =

This is sometimes called a test of homogeneity of covariate effects or coefficients.
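One possible way to compute this test by hand is sketched below. It assumes the comparison is between the separate-slopes fit (one regression per group) and the common-slope fit, and it uses invented data with hypothetical names. Only the F statistic and its degrees of freedom are computed.

```python
# Toy data invented for illustration only (NOT the study's data).
groups = {
    "jog":  {"x": [20, 30, 40, 50], "y": [10, 8, 7, 3]},
    "step": {"x": [30, 40, 50, 60], "y": [12, 11, 8, 7]},
}

def mean(values):
    return sum(values) / len(values)

t = len(groups)
n = sum(len(g["y"]) for g in groups.values())

e_xx = e_xy = e_yy = 0.0  # pooled within-group sums
sse_separate = 0.0        # error SS when each group has its own slope
for g in groups.values():
    xbar, ybar = mean(g["x"]), mean(g["y"])
    sxx = sum((x - xbar) ** 2 for x in g["x"])
    sxy = sum((x - xbar) * (y - ybar) for x, y in zip(g["x"], g["y"]))
    syy = sum((y - ybar) ** 2 for y in g["y"])
    sse_separate += syy - sxy ** 2 / sxx
    e_xx, e_xy, e_yy = e_xx + sxx, e_xy + sxy, e_yy + syy

sse_common = e_yy - e_xy ** 2 / e_xx  # error SS with one shared slope

df_separate = n - 2 * t  # t intercepts and t slopes estimated
df_common = n - t - 1    # t intercepts and 1 slope estimated

f_slopes = ((sse_common - sse_separate) / (df_common - df_separate)) / (
    sse_separate / df_separate
)
print(f_slopes, (df_common - df_separate, df_separate))
```

A small F statistic, as in this toy case, gives no evidence against a common slope.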

If this is significant, then we proceed directly to mean comparisons among treatment groups at specific values of the covariate.

Coding:
LSMEANS trt / AT variable list = value list;
EX: LSMEANS trt / AT age = 25;
EX: LSMEANS trt / AT MEANS; (SAS default)
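To show what a comparison at a specific covariate value means under the separate-slopes model, the sketch below evaluates each group's own fitted line at a chosen value `x0`. It is a hand computation in the spirit of the AT option, not SAS output. The data, the helper `mean_at`, and the choice `x0 = 40` are all invented for illustration.

```python
# Toy data invented for illustration only (NOT the study's data).
groups = {
    "jog":  {"x": [20, 30, 40, 50], "y": [10, 8, 7, 3]},
    "step": {"x": [30, 40, 50, 60], "y": [12, 11, 8, 7]},
}

def mean(values):
    return sum(values) / len(values)

def mean_at(g, x0):
    """Fitted mean of one group at covariate value x0, using that group's own slope."""
    xbar, ybar = mean(g["x"]), mean(g["y"])
    sxx = sum((x - xbar) ** 2 for x in g["x"])
    sxy = sum((x - xbar) * (y - ybar) for x, y in zip(g["x"], g["y"]))
    slope = sxy / sxx
    return ybar + slope * (x0 - xbar)

x0 = 40  # chosen inside the observed covariate range of both groups
means_at_x0 = {name: mean_at(g, x0) for name, g in groups.items()}
print(means_at_x0)
```

Because the slopes differ between groups, the estimated difference between groups depends on `x0`, which is why the value should be chosen deliberately and kept within the observed covariate range.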

D. Generalizations

The analysis of any experimental design or observational study with factors can be turned into an ANCOVA by adding adjustment for one or more covariates. All testing should be done with Type III SS or general linear F-tests.

Before any analyses are carried out, careful consideration should be given to whether covariate by factor interactions should be included or not. LSMEANS by default adjusts treatment group means to the average observed value of each covariate, unless you specify otherwise with the AT option.

E. Caveats

Is ANCOVA better than blocking on the covariate? If a large number of covariates affect the outcome, then blocking on all of them is not feasible. Blocking requires dividing the covariate values into groups, and each group then forms one block. This grouping throws away some of the information in the covariate, and the boundaries between groups may be arbitrary.

Is blocking on a covariate better than ANCOVA? Randomization can result in unbalanced covariate values across the treatment groups, especially in small studies. Blocking imposes balance. Blocking is advantageous when the variability due to blocks is large. Blocking maintains the orthogonality of covariate and treatment effects.

In either analysis, it can be tempting to extrapolate beyond the range of observed covariate values or blocking groups. This is never a good idea!!