Lecture 3. Experiments with a Single Factor: ANOVA Montgomery 3-1 through 3-3

Similar documents
Lecture 3. Experiments with a Single Factor: ANOVA Montgomery 3.1 through 3.3

Comparison of a Population Means

Single Factor Experiments

Lecture 4. Checking Model Adequacy

SAS Procedures Inference about the Line ffl model statement in proc reg has many options ffl To construct confidence intervals use alpha=, clm, cli, c

Lecture 3. Experiments with a Single Factor: ANOVA Montgomery Sections 3.1 through 3.5 and Section

Lecture 2: Basic Concepts and Simple Comparative Experiments Montgomery: Chapter 2

Introduction to Design and Analysis of Experiments with the SAS System (Stat 7010 Lecture Notes)

Topic 17 - Single Factor Analysis of Variance. Outline. One-way ANOVA. The Data / Notation. One way ANOVA Cell means model Factor effects model

Lecture 7: Latin Square and Related Design

SAS Commands. General Plan. Output. Construct scatterplot / interaction plot. Run full model

Lecture 5: Comparing Treatment Means Montgomery: Section 3-5

EXST7015: Estimating tree weights from other morphometric variables Raw data print

Chapter 8 (More on Assumptions for the Simple Linear Regression)

Lecture 10: 2 k Factorial Design Montgomery: Chapter 6

unadjusted model for baseline cholesterol 22:31 Monday, April 19,

This is a Randomized Block Design (RBD) with a single factor treatment arrangement (2 levels) which are fixed.

Lecture 4. Random Effects in Completely Randomized Design

Topic 25 - One-Way Random Effects Models. Outline. Random Effects vs Fixed Effects. Data for One-way Random Effects Model. One-way Random effects

Answer Keys to Homework#10

Assignment 9 Answer Keys

BE640 Intermediate Biostatistics 2. Regression and Correlation. Simple Linear Regression Software: SAS. Emergency Calls to the New York Auto Club

Chapter 11. Analysis of Variance (One-Way)

5.3 Three-Stage Nested Design Example

Lecture 11: Simple Linear Regression

Outline. Topic 20 - Diagnostics and Remedies. Residuals. Overview. Diagnostics Plots Residual checks Formal Tests. STAT Fall 2013

Lecture 7: Latin Squares and Related Designs

Lecture 12: 2 k Factorial Design Montgomery: Chapter 6

Outline Topic 21 - Two Factor ANOVA

Topic 20: Single Factor Analysis of Variance

Lecture 9: Factorial Design Montgomery: chapter 5

Two-factor studies. STAT 525 Chapter 19 and 20. Professor Olga Vitek

One-way ANOVA (Single-Factor CRD)

Lecture 10: Experiments with Random Effects

Chapter 1 Linear Regression with One Predictor

Analysis of Variance

Handout 1: Predicting GPA from SAT

Outline. Analysis of Variance. Acknowledgements. Comparison of 2 or more groups. Comparison of serveral groups

Topic 2. Chapter 3: Diagnostics and Remedial Measures

Analysis of variance and regression. April 17, Contents Comparison of several groups One-way ANOVA. Two-way ANOVA Interaction Model checking

1) Answer the following questions as true (T) or false (F) by circling the appropriate letter.

Outline. Analysis of Variance. Comparison of 2 or more groups. Acknowledgements. Comparison of serveral groups

Odor attraction CRD Page 1

Biological Applications of ANOVA - Examples and Readings

Analysis of variance. April 16, Contents Comparison of several groups

Analysis of variance. April 16, 2009

22s:152 Applied Linear Regression. Take random samples from each of m populations.

Statistics 512: Applied Linear Models. Topic 9

COMPREHENSIVE WRITTEN EXAMINATION, PAPER III FRIDAY AUGUST 26, 2005, 9:00 A.M. 1:00 P.M. STATISTICS 174 QUESTION

Analysis of Covariance

One-Way Analysis of Variance (ANOVA) There are two key differences regarding the explanatory variable X.

Topic 29: Three-Way ANOVA

dm'log;clear;output;clear'; options ps=512 ls=99 nocenter nodate nonumber nolabel FORMCHAR=" = -/\<>*"; ODS LISTING;

EXST Regression Techniques Page 1. We can also test the hypothesis H :" œ 0 versus H :"

Overview Scatter Plot Example

22s:152 Applied Linear Regression. There are a couple commonly used models for a one-way ANOVA with m groups. Chapter 8: ANOVA

Week 7.1--IES 612-STA STA doc

ANALYSIS OF VARIANCE OF BALANCED DAIRY SCIENCE DATA USING SAS

Ch 2: Simple Linear Regression

PLS205 Lab 2 January 15, Laboratory Topic 3

Summary of Chapter 7 (Sections ) and Chapter 8 (Section 8.1)

Assessing Model Adequacy

Topic 32: Two-Way Mixed Effects Model

STOR 455 STATISTICAL METHODS I

Outline. Topic 19 - Inference. The Cell Means Model. Estimates. Inference for Means Differences in cell means Contrasts. STAT Fall 2013

I i=1 1 I(J 1) j=1 (Y ij Ȳi ) 2. j=1 (Y j Ȳ )2 ] = 2n( is the two-sample t-test statistic.

Analysis of Variance Bios 662

Example: Four levels of herbicide strength in an experiment on dry weight of treated plants.

PLS205 Lab 6 February 13, Laboratory Topic 9

STAT 401A - Statistical Methods for Research Workers

Descriptions of post-hoc tests

Lecture notes on Regression & SAS example demonstration

Outline. Topic 22 - Interaction in Two Factor ANOVA. Interaction Not Significant. General Plan

Course Information Text:

Chapter 2 Inferences in Simple Linear Regression

Topic 28: Unequal Replication in Two-Way ANOVA

Lecture 7 Randomized Complete Block Design (RCBD) [ST&D sections (except 9.6) and section 15.8]

Linear Combinations. Comparison of treatment means. Bruce A Craig. Department of Statistics Purdue University. STAT 514 Topic 6 1

Chapter 8 Quantitative and Qualitative Predictors

Stat 427/527: Advanced Data Analysis I

Unbalanced Designs Mechanics. Estimate of σ 2 becomes weighted average of treatment combination sample variances.

Assignment 6 Answer Keys

What is Experimental Design?

Statistics 512: Applied Linear Models. Topic 1

Topic 6. Two-way designs: Randomized Complete Block Design [ST&D Chapter 9 sections 9.1 to 9.7 (except 9.6) and section 15.8]

Ch 3: Multiple Linear Regression

Power Analysis for One-Way ANOVA

data proc sort proc corr run proc reg run proc glm run proc glm run proc glm run proc reg CONMAIN CONINT run proc reg DUMMAIN DUMINT run proc reg

Analysis of Variance

The legacy of Sir Ronald A. Fisher. Fisher s three fundamental principles: local control, replication, and randomization.

1-Way ANOVA MATH 143. Spring Department of Mathematics and Statistics Calvin College

Orthogonal contrasts for a 2x2 factorial design Example p130

Asymptotic Statistics-VI. Changliang Zou

Chapter 12. Analysis of variance

One-way ANOVA Model Assumptions

Failure Time of System due to the Hot Electron Effect

A Little Stats Won t Hurt You

Lecture 11: Blocking and Confounding in 2 k design

3. Design Experiments and Variance Analysis

ANOVA Situation The F Statistic Multiple Comparisons. 1-Way ANOVA MATH 143. Department of Mathematics and Statistics Calvin College

Transcription:

Lecture 3. Experiments with a Single Factor: ANOVA Montgomery 3-1 through 3-3 Page 1

Tensile Strength Experiment Investigate the tensile strength of a new synthetic fiber. The factor is the weight percent of cotton used in the blend of the materials for the fiber and it has five levels. weight tensile strength of cotton 1 2 3 4 5 total average 15 7 7 11 15 9 49 9.8 20 12 17 12 18 18 77 15.4 25 14 18 18 19 19 88 17.6 30 19 25 22 19 23 108 21.6 35 7 10 11 15 11 54 10.8 Page 2

Data Layout for Single-Factor Experiments treatment observations totals averages 1 y 11 y 12 y 1n y 1. ȳ 1. 2 y 21 y 22 y 2n y 2. ȳ 2.... a y a1 y a2 y an y a. ȳ a.... Page 3

Analysis of Variance Interested in comparing several treatments. Could do numerous two-sample t-tests but this approach does not test equality of all treatments at once. ANOVA provides method of joint inference. Statistical Model: y ij = µ + τ i + ɛ ij i =1, 2...a j =1, 2,...n i µ - grand mean; τ i - ith treatment effect; ɛ ij N(0,σ 2 ) - error a Constraint: i=1 τ i =0. Hypotheses: H 0 : τ 1 = τ 2 =... = τ a =0 vs H 1 : τ i 0for at least one i Page 4

Partitioning y ij Notation y i. = n i j=1 y ij y i. = y i. /n i (treatment sample mean, or row mean) y.. = y ij y.. = y.. /N (grand sample mean) Decomposition of y ij : y ij = y.. +(y i. y.. )+(y ij y i. ) Estimates for parameters: So y ij as ˆµ +ˆτ i +ˆɛ ij. ˆµ = y.. It can be verified that a i=1 n iˆτ i =0 ni j=1 ˆɛ ij =0for all i ˆτ i =(y i. y.. ) ˆɛ ij = y ij y i. ( residual ) Page 5

Partitioning the Sum of Squares Recall y ij y.. =ˆτ i +ˆɛ ij =(y i. y.. )+(y ij y i. ) Can show i j (y ij y.. ) 2 = i n i(y i. y.. ) 2 + i j (y ij y i. ) 2 = i n iˆτ i 2 + i j ˆɛ2 ij. Total SS = Treatment SS + Error SS SS T =SS Treatments +SS E Derive test statistics for testing H 0 : look at SS Treatments = n iˆτ i 2 = n i (y i. y.. ) 2 Small if ˆτ i s are small If large then reject H 0, but how large is large? Standardize to account for inherent variability Page 6

Test Statistic F 0 = SS Treatments/(a 1) SS E /(N a) Mean squares: = ni (y i. y.. ) 2 /(a 1) (yij y i. ) 2 /(N a) MS Treatments = SS Treatments /(a 1); MS E = SS E /(N a) Can show that the expected values are E(MS E )=σ 2 E(MS Treatment )=σ 2 + n i τ 2 i /(a 1) Hence, MS E is always a good estimator for σ 2, that is, ˆσ 2 =MS E. Under H 0,MS Treatment is also an estimator for σ 2. F 0 is the ratio of the mean squares, under H 0, it should be around 1. Large F 0 implies deviation from H 0. What is the sampling distribution of F 0 under H 0? Page 7

Sampling Distribution for F 0 under H 0 Based on model: y ij N(µ + τ i,σ 2 ) and y i. N(µ + τ i,σ 2 /n i ) j (y ij y i. ) 2 /σ 2 χ 2 n i 1 Therefore: SS E / σ 2 = i j (y ij y i. ) 2 /σ 2 χ 2 N a Under H 0 : SS Treatment /σ 2 = i n i(y i. y.. ) 2 /σ 2 χ 2 a 1 SS E and SS Treatment are independent. So F 0 = SS Treatment/σ 2 (a 1) SS E /σ 2 (N a) = χ2 (a 1)/(a 1) χ 2 (N a)/(n a) F a 1,N a F -distribution with numerator df a 1 and denominator df N a. Page 8

Analysis of Variance (ANOVA) Table Source of Sum of Degrees of Mean F 0 Variation Squares Freedom Square Between SS Treatment a 1 MS Treatment F 0 Within SS E N a MS E Total SS T N 1 If balanced: SS T = y 2 ij y2../n ; SS Treatment = 1 n y 2 i. y 2../N SS E =SS T -SS Treatment If unbalanced: SS T = y 2 ij y2../n ; SS Treatment = y 2 i. n i y 2../N SS E =SS T -SS Treatment Decision Rule: IfF 0 >F α,a 1,N a then reject H 0 Page 9

Connection to Two-Sample t-test a=2 Consider the square of the t-test statistic t 2 0 = (ȳ 1. ȳ 2. ) 2 (s pool 1/n1 +1/n 2 ) 2 = n 1 n 2 n 1 +n 2 (ȳ 1. ȳ 2. ) 2 s 2 pool = n 1 n 2 n 1 +n 2 [(ȳ 1. ȳ.. ) (ȳ 2. ȳ.. )] 2 s 2 pool = n 1 n 2 n 1 +n 2 (ȳ 1. ȳ.. ) 2 + n 1n 2 n 1 +n 2 (ȳ 2. ȳ.. ) 2 2n 1n 2 n 1 +n 2 (ȳ 1. ȳ.. )(ȳ 2. ȳ.. ) s 2 pool Page 10

Consider ȳ.. = 1 n 1 + n 2 ( n 1 j=1 y 1j + n 2 j=1 y 2j )= n 1 n 1 + n 2 ȳ 1. + n 2 n 1 + n 2 ȳ 2. One has t 2 0 = [n 1 (ȳ 1. ȳ.. ) 2 + n 2 (ȳ 2. ȳ.. ) 2 ]/(2 1) [ n 1 j=1 (y 1j ȳ 1. ) 2 + n 2 j=1 (y 2j ȳ 2. ) 2 ]/(n 1 + n 2 2) = MS Treatment MS E When a =2, t 2 0 = F 0 When a =2, F -test and two-sample two-sided test are equivalent. Page 11

Example Twelve lambs are randomly assigned to three different diets. The weight gain (in two weeks) is recorded. Is there a difference between the diets? SS T = 2274 156 2 /12 = 246 Diet 1 8 16 9 Diet 2 9 16 21 11 18 Diet 3 15 10 17 6 yij = 156 and y 2 ij = 2274 y 1. =33, y 2. =75, and y 3. =48 n 1 =3, n 2 =5, n 3 =4and N =12 SS Treatment = (33 2 /3+75 2 /5+48 2 /4) 156 2 /12 = 36 SS E = 246 36 = 210 F 0 = (36/2)/(210/9) = 0.77 P-value > 0.25 Page 12

Using SAS (lambs.sas) option nocenter ps=65 ls=80; data lambs; input diet wtgain; cards; 1 8 1 16 1 9 2 9 2 16 2 21 2 11 2 18 3 15 3 10 3 17 3 6 ; Page 13

symbol1 bwidth=5 i=box; axis1 offset=(5); proc gplot; plot wtgain*diet / frame haxis=axis1; proc glm; class diet; model wtgain=diet; output out=diag r=res p=pred; proc gplot; plot res*diet /frame haxis=axis1; proc sort; by pred; symbol1 v=circle i=sm50; proc gplot; plot res*pred / haxis=axis1; run; Page 14

SAS Output The GLM Procedure Dependent Variable: wtgain Sum of Source DF Squares Mean Square F Value Pr > F Model 2 36.0000000 18.0000000 0.77 0.4907 Error 9 210.0000000 23.3333333 Corrected Total 11 246.0000000 R-Square Coeff Var Root MSE wtgain Mean 0.146341 37.15738 4.830459 13.00000 Source DF Type I SS Mean Square F Value Pr > F diet 2 36.00000000 18.00000000 0.77 0.4907 Source DF Type III SS Mean Square F Value Pr > F diet 2 36.00000000 18.00000000 0.77 0.4907 Page 15

Side-by-Side Boxplot Page 16

Residual Plot Page 17

Tensile Strength Experiment options ls=80 ps=60 nocenter; goptions device=win target=winprtm rotate=landscape ftext=swiss hsize=8.0in vsize=6.0in htext=1.5 htitle=1.5 hpos=60 vpos=60 horigin=0.5in vorigin=0.5in; data one; infile c:\saswork\data\tensile.dat ; input percent strength time; title1 Tensile Strength Example ; proc print data=one; run; symbol1 v=circle i=none; title1 Plot of Strength vs Percent Blend ; proc gplot data=one; plot strength*percent/frame; run; proc boxplot; Page 18

plot strength*percent/boxstyle=skeletal pctldef=4; proc glm; class percent; model strength=percent; output out=oneres p=pred r=res; run; proc sort; by pred; symbol1 v=circle i=sm50; title1 Residual Plot ; proc gplot; plot res*pred/frame; run; proc univariate data=oneres normal; var res; qqplot res / normal (L=1 mu=est sigma=est); histogram res / normal; run; symbol1 v=circle i=none; title1 Plot of residuals vs time ; proc gplot; plot res*time / vref=0 vaxis=-6 to 6 by 1; run; Page 19

Tensile Strength Example Obs percent strength time 1 15 7 15 2 15 7 19 3 15 15 25 4 15 11 12 5 15 9 6 6 20 12 8 7 20 17 14 8 20 12 1 9 20 18 11 : : : : 16 30 19 22 17 30 25 5 18 30 22 2 19 30 19 24 20 30 23 10 21 35 7 17 22 35 10 21 23 35 11 4 24 35 15 16 25 35 11 23 Page 20

Scatter Plot Page 21

Side-by-Side Plot Page 22

The GLM Procedure Dependent Variable: strength Sum of Source DF Squares Mean Square F Value Pr > F Model 4 475.7600000 118.9400000 14.76 <.0001 Error 20 161.2000000 8.0600000 Corrected Total 24 636.9600000 R-Square Coeff Var Root MSE strength Mean 0.746923 18.87642 2.839014 15.04000 Source DF Type I SS Mean Square F Value Pr > F percent 4 475.7600000 118.9400000 14.76 <.0001 Source DF Type III SS Mean Square F Value Pr > F percent 4 475.7600000 118.9400000 14.76 <.0001 Page 23

Residual Plot Page 24

The UNIVARIATE Procedure Variable: res Moments N 25 Sum Weights 25 Mean 0 Sum Observations 0 Std Deviation 2.59165327 Variance 6.71666667 Skewness 0.11239681 Kurtosis -0.8683604 Uncorrected SS 161.2 Corrected SS 161.2 Coeff Variation. Std Error Mean 0.51833065 Tests for Normality Test --Statistic--- -----p Value------ Shapiro-Wilk W 0.943868 Pr < W 0.1818 Kolmogorov-Smirnov D 0.162123 Pr > D 0.0885 Cramer-von Mises W-Sq 0.080455 Pr > W-Sq 0.2026 Anderson-Darling A-Sq 0.518572 Pr > A-Sq 0.1775 Page 25

Histogram of Residuals Page 26

QQ Plot Page 27

Time Serial Plot Page 28