Analysis of Covariance


Using categorical and continuous predictor variables

Example
An experiment is set up to look at the effects of watering on oak seedling establishment.
Three levels of watering: (no additional), (three times a week), ( times a week)

Results
No significant effect of watering.

[Chart: mean seedlings by watering treatment (TREAT). Each error bar is constructed using 1 standard error from the mean.]

Least Squares Means Table
Level    Least Sq Mean    Std Error    Mean
         64.              5.668994     64.
         6.1486           5.668994     6.14
         8.1486           5.668994     8.14

Analysis of Variance (sources: Model, Error, C. Total)
DF: 18
Sum of Squares: 891.14, 4.851, 416.514
Mean Square: 445.85, 19.14
F Ratio: .49
Prob > F: .1118

Addition of a covariate: proximity to bushes
Perhaps a proxy for grazing pressure.

There is a spread of distances for each treatment.
[Figure; axes: TREAT, distance from bush (meters)]

Compare the effect of treatments with and without accounting for distance to bushes.
[Figure; panels by TREAT]

De-trend Data
Account for the effect of the covariate.
[Figure; y-axis: Adjusted Mean (SEM)]

Compare models with and without covariate
No covariate: SEM=5 [y-axis: Seedlings (SEM)]
With covariate: SEM= [y-axis: Adjusted Mean (SEM)]

Formally - ANCOVA
The objective of an analysis of covariance is to compare the treatment means after adjusting for differences among the treatments that are attributable to the covariate. The analysis joins the regression model with the analysis of variance model.

Combinations of Categorical and Continuous Factors

Calculation of Adjusted Means in ANCOVA
[Figure: Y against X with a fitted line for each group, marking each group's Y mean at its own x mean and its adjusted Y mean at the overall X mean]
Adjusted Y means are based on the overall X mean, not the x means for each group.
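The adjustment can be written in symbols. This is a sketch in assumed notation, since the slide gives only the figure: $\bar{y}_i$ and $\bar{x}_i$ are the Y and X means of group $i$, $\bar{x}$ is the overall X mean, and $\beta$ is the common within-group slope of Y on X. Each group mean is moved along the fitted line to the overall X mean:

```latex
\bar{y}_{i,\text{adjusted}} = \bar{y}_i - \beta\,(\bar{x}_i - \bar{x})
```

A group whose covariate mean sits above the overall mean is therefore adjusted downward when the slope is positive, and upward when it is negative.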

Linear model

$$y_{ij} = \mu + \alpha_i + \beta(x_{ij} - \bar{x}) + \varepsilon_{ij}$$

where
$\mu$ is the overall mean,
$\alpha_i$ is the effect of factor A ($\mu_i - \mu$),
$\beta$ is the combined regression coefficient, representing the pooling of the regression slopes of Y on X within each group,
$\varepsilon_{ij}$ is the unexplained variation.

Assumptions: Linearity
The relationship between Y and X must be linear or transformed to linear. If it is not, the $\beta$ (slope) term is meaningless and will lead to errors in the analysis.
[Figure: Y against X with a fitted line for each group and the adjusted Y means]
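The pooled slope and the adjusted means in the model above can be computed by hand. The following is a minimal sketch using only the Python standard library; the function names and the data are hypothetical (the numbers are invented for illustration and are not the oak seedling data). The pooled slope is taken as the within-group sum of cross-products divided by the within-group sum of squares of X.

```python
from statistics import mean


def pooled_slope(groups):
    """Pooled within-group slope of y on x.

    groups maps a group label to a list of (x, y) pairs.
    """
    sxy = 0.0  # within-group sum of cross-products
    sxx = 0.0  # within-group sum of squares of x
    for pairs in groups.values():
        x_bar = mean(x for x, _ in pairs)
        y_bar = mean(y for _, y in pairs)
        sxy += sum((x - x_bar) * (y - y_bar) for x, y in pairs)
        sxx += sum((x - x_bar) ** 2 for x, _ in pairs)
    return sxy / sxx


def adjusted_means(groups):
    """Group means of y moved to the overall mean of x."""
    beta = pooled_slope(groups)
    grand_x = mean(x for pairs in groups.values() for x, _ in pairs)
    result = {}
    for label, pairs in groups.items():
        x_bar = mean(x for x, _ in pairs)
        y_bar = mean(y for _, y in pairs)
        result[label] = y_bar - beta * (x_bar - grand_x)
    return result


# Hypothetical data: (covariate, response) pairs for two groups.
data = {
    "low": [(1.0, 12.0), (2.0, 14.0), (3.0, 16.0)],
    "high": [(3.0, 21.0), (4.0, 23.0), (5.0, 25.0)],
}

beta = pooled_slope(data)
adjusted = adjusted_means(data)
print(beta)      # pooled slope
print(adjusted)  # adjusted mean per group
```

In this invented example the raw group means are 14 and 23, but the groups also differ in their covariate means (2 and 4). After adjustment to the overall covariate mean of 3, the means become 16 and 21, so part of the raw difference was due to the covariate.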

Assumptions: Covariate values similar across groups
The assumption is that the distributions of the covariate are similar across groups.
This ensures that the covariate is independent of group (treatment).
It also allows for a logical assumption of linearity throughout the range of the covariate.
[Figure: Y against X for two groups; annotation: "This could be possible"]

Assumptions: Covariate is fixed (without error)
Same assumption as for regression analysis.
Almost never true in biological systems.
If the assumptions of homogeneity of variances, similar ranges in covariate values and homogeneity of slopes are met, then there is no obvious increase in Type I or II error.

Assumptions: Homogeneity of Slopes
The assumption is that all slopes are the same.
This allows for pooling of groups to generate a common slope.
It also allows for logical partitioning of variance.
Could groups be compared without this assumption?
[Figure: Y against X for two groups]
Are groups simply comparable? NO!

Testing the Homogeneity of Slopes assumption (for 1 categorical variable and 1 covariate)
The interaction between the covariate and the categorical variable is the test of the homogeneity of slopes (HOS) assumption.
First test the HOS assumption as part of the FULL model.
If slopes are homogeneous (no significant interaction effect), then run the REDUCED model (leave out the interaction between the categorical variable and the covariate).
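The full-versus-reduced comparison can be sketched as an F ratio computed from residual sums of squares. This is an illustration under stated assumptions, not the procedure of any particular package: the function names and data are hypothetical, the full model fits a separate line to each group, and the reduced model fits separate intercepts with one pooled slope.

```python
def within_sums(pairs):
    """Return (Sxx, Sxy, Syy) about the group's own means."""
    n = len(pairs)
    x_bar = sum(x for x, _ in pairs) / n
    y_bar = sum(y for _, y in pairs) / n
    sxx = sum((x - x_bar) ** 2 for x, _ in pairs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in pairs)
    syy = sum((y - y_bar) ** 2 for _, y in pairs)
    return sxx, sxy, syy


def slopes_f_test(groups):
    """F ratio for the covariate-by-group interaction.

    Returns (F, numerator df, denominator df).
    """
    sums = [within_sums(pairs) for pairs in groups.values()]
    p = len(sums)
    n_total = sum(len(pairs) for pairs in groups.values())

    # Full model: every group has its own slope.
    sse_full = sum(syy - sxy ** 2 / sxx for sxx, sxy, syy in sums)

    # Reduced model: one pooled slope shared by all groups.
    sxx_w = sum(s[0] for s in sums)
    sxy_w = sum(s[1] for s in sums)
    syy_w = sum(s[2] for s in sums)
    sse_reduced = syy_w - sxy_w ** 2 / sxx_w

    df_num = p - 1            # extra slopes in the full model
    df_den = n_total - 2 * p  # residual df of the full model
    f_ratio = ((sse_reduced - sse_full) / df_num) / (sse_full / df_den)
    return f_ratio, df_num, df_den


# Hypothetical data: (covariate, response) pairs for two groups.
example = {
    "a": [(1, 2), (2, 4), (3, 5), (4, 7)],
    "b": [(1, 3), (2, 4), (3, 6), (4, 7)],
}

f_ratio, df_num, df_den = slopes_f_test(example)
print(f_ratio, df_num, df_den)
```

A small F ratio, as in this invented example, gives no evidence against a common slope, so the reduced model would be run next. Turning the ratio into a p-value needs the F distribution, for instance `scipy.stats.f.sf(f_ratio, df_num, df_den)` if SciPy is available.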

General scheme for testing ANCOVA
A = Categorical, B = Covariate

Full model
Source   df                 df denominator   F
A        df_A = p-1         N/A              N/A
B        df_B = 1           N/A              N/A
AB       df_AB = (p-1)      df_Residual      MS_AB / MS_Residual

If the interaction between A and B is not significant, indicating homogeneity of slopes, then run the reduced model.

Reduced model
Source   df                 df denominator   F
A        df_A = p-1         df_Residual      MS_A / MS_Residual
B        df_B = 1           df_Residual      MS_B / MS_Residual

General scheme for testing ANCOVA
A = Categorical, B = Categorical, C = Covariate

Source   df                     df denominator   F
A        df_A = p-1
B        df_B = q-1
C        df_C = 1
AB       df_AB = (p-1)(q-1)
AC       df_AC = (p-1)
BC       df_BC = (q-1)
ABC      df_ABC = (p-1)(q-1)    df_Residual      MS_ABC / MS_Residual   (candidate to drop)

If ABC is not significant, then drop it and run the reduced model.

Source   df                     df denominator   F
A        df_A = p-1
B        df_B = q-1
C        df_C = 1
AB       df_AB = (p-1)(q-1)
AC       df_AC = (p-1)          df_Residual      MS_AC / MS_Residual    (candidate to drop)
BC       df_BC = (q-1)          df_Residual      MS_BC / MS_Residual    (candidate to drop)

If either AC or BC is not significant, then drop it and run the reduced model.

If the other interaction involving the covariate is not significant, then drop it and run the reduced model.
A = Categorical, B = Categorical, C = Covariate

Source   df                     df denominator   F
A        df_A = p-1
B        df_B = q-1
C        df_C = 1
AB       df_AB = (p-1)(q-1)
AC       df_AC = (p-1)          df_Residual      MS_AC / MS_Residual    (candidate to drop)

In the end, if all interactions involving the covariate can be dropped, then you are left with the fully reduced model.

Source   df                     df denominator   F
A        df_A = p-1             df_Residual      MS_A / MS_Residual
B        df_B = q-1             df_Residual      MS_B / MS_Residual
C        df_C = 1               df_Residual      MS_C / MS_Residual
AB       df_AB = (p-1)(q-1)     df_Residual      MS_AB / MS_Residual

Back to Example
An experiment is set up to look at the effects of watering on oak seedling establishment.
Three levels of watering: (no additional), (three times a week), ( times a week)
ANCOVA: seedlings, water and distance from bushes

There is a spread of distances for each treatment.
[Figure; x-axis: distance from bush (meters)]

Test Full model
Fitted line for each watering level:
Y () = 4.1 + 9.95*X
Y () = 4.1 + 9.8*X
Y () = .9 + 1.9*X

Analysis of Variance (sources: Model, Error, C. Total)
DF: 5, 15
Sum of Squares: 1.6, 41.8688, 416.514
Mean Square: 4.541, .591
F Ratio: 6.91
Prob > F: <.1*

Effect Tests (the last source is the interaction, *)
Nparm: 1
DF: 1
Sum of Squares: .9864, .644, .9818
F Ratio: 4.95, 98.969, .18
Prob > F: .55*, <.1*, .984

The tests for the main effects are meaningless in the full model.
The interaction is not significant: homogeneity of slopes assumption met.

Run Reduced Model

Analysis of Variance (sources: Model, Error, C. Total)
DF: 1
Sum of Squares: 11.8, 414.86, 416.514
Mean Square: 1.4, 4.
F Ratio: .4
Prob > F: <.1*

Effect Tests
Nparm: 1
DF: 1
Sum of Squares: 5.186, 8.65
F Ratio: 5.68, 115.5599
Prob > F: .1*, <.1*

Both treatment and distance to bushes are significant.
[Figure: least squares means by treatment (TREAT), ANOVA versus ANCOVA]

Test for Linear Trend (do seedling numbers increase linearly with watering regime?)
Contrast specification: -1, 0, 1

Contrast
SS, NumDF, DenDF: 1, 1
F Ratio: 11.188
Prob > F: .8*
[Figure: least squares means by treatment (TREAT)]
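A linear-trend contrast on ANCOVA adjusted means can be sketched as follows. This is an illustration under stated assumptions rather than a reproduction of the output above: the function name and data are hypothetical, the model is the reduced one (separate intercepts, one pooled slope), and the contrast variance includes the extra term that comes from estimating the slope.

```python
def ancova_contrast_f(groups, coeffs):
    """F ratio for a contrast among covariate-adjusted group means.

    groups maps a label to a list of (x, y) pairs;
    coeffs maps the same labels to contrast coefficients summing to 0.
    Returns (F, numerator df, denominator df).
    """
    if abs(sum(coeffs.values())) > 1e-12:
        raise ValueError("contrast coefficients must sum to zero")

    sxx = sxy = syy = 0.0
    x_bars, y_bars, sizes = {}, {}, {}
    for label, pairs in groups.items():
        n = len(pairs)
        x_bar = sum(x for x, _ in pairs) / n
        y_bar = sum(y for _, y in pairs) / n
        x_bars[label], y_bars[label], sizes[label] = x_bar, y_bar, n
        sxx += sum((x - x_bar) ** 2 for x, _ in pairs)
        sxy += sum((x - x_bar) * (y - y_bar) for x, y in pairs)
        syy += sum((y - y_bar) ** 2 for _, y in pairs)

    beta = sxy / sxx                     # pooled within-group slope
    df_den = sum(sizes.values()) - len(groups) - 1
    mse = (syy - sxy ** 2 / sxx) / df_den

    # The overall x mean cancels because the coefficients sum to zero.
    estimate = sum(c * (y_bars[g] - beta * x_bars[g]) for g, c in coeffs.items())
    x_contrast = sum(c * x_bars[g] for g, c in coeffs.items())
    var_factor = sum(c ** 2 / sizes[g] for g, c in coeffs.items())
    var_factor += x_contrast ** 2 / sxx  # cost of estimating the slope

    ss_contrast = estimate ** 2 / var_factor
    return ss_contrast / mse, 1, df_den


# Hypothetical data: (distance, seedlings) pairs for three watering levels.
plots = {
    "none": [(1, 2), (2, 3), (3, 4)],
    "mid": [(1, 4), (2, 4), (3, 7)],
    "high": [(1, 6), (2, 8), (3, 7)],
}
linear = {"none": -1, "mid": 0, "high": 1}

f_trend, df1, df2 = ancova_contrast_f(plots, linear)
print(f_trend, df1, df2)
```

The coefficients -1, 0, 1 compare the lowest and highest watering levels, which tests for a linear trend across three ordered groups. In this invented data set every group has the same covariate mean, so the slope term adds nothing to the contrast variance; with unequal covariate means it would.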