Comparison of a Population Means

Similar documents
Lecture 3. Experiments with a Single Factor: ANOVA Montgomery 3-1 through 3-3

Lecture 3. Experiments with a Single Factor: ANOVA Montgomery 3.1 through 3.3

Single Factor Experiments

Lecture 4. Checking Model Adequacy

SAS Procedures Inference about the Line ffl model statement in proc reg has many options ffl To construct confidence intervals use alpha=, clm, cli, c

Lecture 3. Experiments with a Single Factor: ANOVA Montgomery Sections 3.1 through 3.5 and Section

Topic 17 - Single Factor Analysis of Variance. Outline. One-way ANOVA. The Data / Notation. One way ANOVA Cell means model Factor effects model

unadjusted model for baseline cholesterol 22:31 Monday, April 19,

EXST7015: Estimating tree weights from other morphometric variables Raw data print

Chapter 8 (More on Assumptions for the Simple Linear Regression)

Lecture 7: Latin Square and Related Design

Introduction to Design and Analysis of Experiments with the SAS System (Stat 7010 Lecture Notes)

This is a Randomized Block Design (RBD) with a single factor treatment arrangement (2 levels) which are fixed.

Lecture 5: Comparing Treatment Means Montgomery: Section 3-5

SAS Commands. General Plan. Output. Construct scatterplot / interaction plot. Run full model

Lecture 2: Basic Concepts and Simple Comparative Experiments Montgomery: Chapter 2

5.3 Three-Stage Nested Design Example

Lecture 4. Random Effects in Completely Randomized Design

Lecture 10: 2 k Factorial Design Montgomery: Chapter 6

Handout 1: Predicting GPA from SAT

Outline. Topic 20 - Diagnostics and Remedies. Residuals. Overview. Diagnostics Plots Residual checks Formal Tests. STAT Fall 2013

BE640 Intermediate Biostatistics 2. Regression and Correlation. Simple Linear Regression Software: SAS. Emergency Calls to the New York Auto Club

Topic 25 - One-Way Random Effects Models. Outline. Random Effects vs Fixed Effects. Data for One-way Random Effects Model. One-way Random effects

Topic 20: Single Factor Analysis of Variance

Two-factor studies. STAT 525 Chapter 19 and 20. Professor Olga Vitek

Lecture 11: Simple Linear Regression

Outline Topic 21 - Two Factor ANOVA

Chapter 1 Linear Regression with One Predictor

Odor attraction CRD Page 1

Assessing Model Adequacy

Chapter 11. Analysis of Variance (One-Way)

Lecture 7: Latin Squares and Related Designs

Topic 2. Chapter 3: Diagnostics and Remedial Measures

One-way ANOVA (Single-Factor CRD)

Biological Applications of ANOVA - Examples and Readings

Lecture 12: 2 k Factorial Design Montgomery: Chapter 6

Analysis of Covariance

Overview Scatter Plot Example

dm'log;clear;output;clear'; options ps=512 ls=99 nocenter nodate nonumber nolabel FORMCHAR=" = -/\<>*"; ODS LISTING;

Lecture 9: Factorial Design Montgomery: chapter 5

Answer Keys to Homework#10

Assignment 9 Answer Keys

Linear Combinations. Comparison of treatment means. Bruce A Craig. Department of Statistics Purdue University. STAT 514 Topic 6 1

Analysis of Variance

Analysis of variance and regression. April 17, Contents Comparison of several groups One-way ANOVA. Two-way ANOVA Interaction Model checking

Week 7.1--IES 612-STA STA doc

Chapter 2 Inferences in Simple Linear Regression

Lecture 10: Experiments with Random Effects

Ch 2: Simple Linear Regression

Outline. Analysis of Variance. Acknowledgements. Comparison of 2 or more groups. Comparison of serveral groups

One-Way Analysis of Variance (ANOVA) There are two key differences regarding the explanatory variable X.

Analysis of variance. April 16, Contents Comparison of several groups

Analysis of variance. April 16, 2009

Outline. Analysis of Variance. Comparison of 2 or more groups. Acknowledgements. Comparison of serveral groups

Failure Time of System due to the Hot Electron Effect

A Little Stats Won t Hurt You

1) Answer the following questions as true (T) or false (F) by circling the appropriate letter.

EXST Regression Techniques Page 1. We can also test the hypothesis H :" œ 0 versus H :"

Booklet of Code and Output for STAC32 Final Exam

COMPREHENSIVE WRITTEN EXAMINATION, PAPER III FRIDAY AUGUST 26, 2005, 9:00 A.M. 1:00 P.M. STATISTICS 174 QUESTION

Power Analysis for One-Way ANOVA

Topic 29: Three-Way ANOVA

Outline. Topic 22 - Interaction in Two Factor ANOVA. Interaction Not Significant. General Plan

Assignment 6 Answer Keys

STOR 455 STATISTICAL METHODS I

Statistics 512: Applied Linear Models. Topic 1

Course Information Text:

Outline. Topic 19 - Inference. The Cell Means Model. Estimates. Inference for Means Differences in cell means Contrasts. STAT Fall 2013

Lecture notes on Regression & SAS example demonstration

Chapter 12. Analysis of variance

Outline. Review regression diagnostics Remedial measures Weighted regression Ridge regression Robust regression Bootstrapping

Descriptions of post-hoc tests

ANALYSIS OF VARIANCE OF BALANCED DAIRY SCIENCE DATA USING SAS

Topic 32: Two-Way Mixed Effects Model

Lecture 7 Randomized Complete Block Design (RCBD) [ST&D sections (except 9.6) and section 15.8]

Topic 6. Two-way designs: Randomized Complete Block Design [ST&D Chapter 9 sections 9.1 to 9.7 (except 9.6) and section 15.8]

Analysis of Variance Bios 662

22s:152 Applied Linear Regression. Take random samples from each of m populations.

PLS205 Lab 2 January 15, Laboratory Topic 3

data proc sort proc corr run proc reg run proc glm run proc glm run proc glm run proc reg CONMAIN CONINT run proc reg DUMMAIN DUMINT run proc reg

The Random Effects Model Introduction

PLS205 Lab 6 February 13, Laboratory Topic 9

Topic 28: Unequal Replication in Two-Way ANOVA

Statistics 512: Applied Linear Models. Topic 9

Lecture 1 Linear Regression with One Predictor Variable.p2

STAT 115:Experimental Designs

Analysis of Variance

STAT 705 Chapter 16: One-way ANOVA

EXST 7015 Fall 2014 Lab 11: Randomized Block Design and Nested Design

Summary of Chapter 7 (Sections ) and Chapter 8 (Section 8.1)

Chapter 6 Multiple Regression

K. Model Diagnostics. residuals ˆɛ ij = Y ij ˆµ i N = Y ij Ȳ i semi-studentized residuals ω ij = ˆɛ ij. studentized deleted residuals ɛ ij =

Chapter 11 : State SAT scores for 1982 Data Listing

13. The Cochran-Satterthwaite Approximation for Linear Combinations of Mean Squares

STAT 401A - Statistical Methods for Research Workers

Chapter 16. Nonparametric Tests

Multicollinearity Exercise

22s:152 Applied Linear Regression. There are a couple commonly used models for a one-way ANOVA with m groups. Chapter 8: ANOVA

Asymptotic Statistics-VI. Changliang Zou

IES 612/STA 4-573/STA Winter 2008 Week 1--IES 612-STA STA doc

Transcription:

Analysis of Variance Interested in comparing Several treatments Several levels of one treatment Comparison of a Population Means Could do numerous two-sample t-tests but... ANOVA provides method of joint inference Design of Experiments - Montgomery Section 3-1 through 3-3 µ - grand mean y ij = µ + τ i + ɛ ij τ i - ith treatment effect { i =1, 2...a j =1, 2,...n i ɛ ij N(0,σ 2 ) - error component H 0 : τ 1 = τ 2 =... = τ a =0 H 1 : τ i 0 for at least one i 4 4-1 Notation Partitioning y ij ni y i. = yij yi. = yi./ni (treatment sample mean) j=1 y.. = y ij y.. = y../n (grand sample mean) Decomposition of y ij : Rewrite y ij as y.. +(y i. y.. )+(y ij y i. ) Canthenexpressy ij as ˆµ +ˆτ i +ˆɛ ij where ˆµ = y.. ˆτ i =(y i. y.. ) ˆɛ ij = y ij y i. Built in parameter estimate restrictions niˆτ i =0 Partitioning the Sum of Squares Recall y ij y.. =(y i. y.. )+(y ij y i. ) Can show (yij y.. ) 2 = n i (y i. y.. ) 2 + (y ij y i. ) 2 Total SS = Treatment SS + error SS SS T =SS Treatments +SS E To test H 0,lookat n iˆτ i 2 = n i (y i. y.. ) 2 Small if ˆτ i s 0 If large then reject H 0, but how large is large? Standardize to account for inherent variability ˆɛij =0foralli F 0 = ni (y i. y.. ) 2 /(a 1) ave square between trts = (yij y i. ) 2 /(N a) ave square within trts 4-2 4-3

Why use F 0? From general sum of squares result, can show 1. SS E /(N a) is unbiased estimate of σ 2 Test Statistic Distribution Assume y ij N(µ + τ i,σ 2 ) and independent y i. N(µ + τ i,σ 2 /n i ) (yij y i. ) 2 /σ 2 χ 2 ni 1 2. Under H 0,SS Treatment /(a 1) is also unbiased estimate (yij y i. ) 2 /σ 2 χ 2 N a Call these estimates Mean Squares Can show, in general, the expected values are E(MS E )=σ 2 E(MS Treatment )=σ 2 + n iτi 2 /(a 1) Under H 0 : y i. N(µ, σ 2 /n i ) ni (y i. y.. ) 2 /σ 2 χ 2 a 1 If we can show independence, can use F-distribution Use ratio of mean squares as test statistic If statistic deviates from one, then reject H 0 What is test statistic distribution? Cochran s Thm Since SS T with N 1 degrees of freedom is partitioned into SS Treatment and SS E and (a 1) + (N a) =N 1, then the two sum of squares are independent Use F with a 1andN a degrees of freedom 4-4 4-5 Similarity With t-test Consider the square of the t-statistic (n 1 = n 2 = n) Analysis of Variance Table Source of Sum of Degrees of Mean F 0 Variation Squares Freedom Square Between SS Treatment a 1 MS Treatment F 0 Within SS E N a MS E Total SS T N 1 If F 0 >F α,a 1,N a then reject H 0 If P(F a 1,N a >F 0 ) <αthen reject H 0 t 2 0 = = = = ( y 1. y 2. ) 2 S p 2/n ( ) 2 (y 1. y..) (y 2. y..) S p 2/n ( ) 2 2(y 1. y..) S p 2/n ( ) 4(y 1. y..) 2 S 2 p (2/n) = 2((y1. y..)2 +(y 2. y..) 2 ) Sp 2 (2/n) = n((y1. y..)2 +(y 2. y..) 2 ) Sp 2 = MS(Between) MS(Within) = MS Treatment MS E When a =2,t 2 0 = F 0 F-test gives identical results as t-test H A : 4-6 4-7

Example Twelve lambs are randomly assigned to three different diets. The weight gain (in two weeks) is recorded. Is there a difference among the diets? Diet 1 Diet 2 Diet 3 8 9 15 16 16 10 9 21 17 11 6 18 yij = 156 and yij 2 = 2274 y 1. = 33, y 2. = 75, and y 3. =48 n 1 =3,n 2 =5,n 3 = 4 and N =12 SS T = 2274 156 2 /12 = 246 SS Treatment = (33 2 /3+75 2 /5+48 2 /4) 156 2 /12 = 36 SS E = 246 36 = 210 F 0 = (36/2)/(210/9) = 0.77 P-value > 0.25 (DNR) option nocenter ps=65 ls=80; data lambs; input diet wtgain; cards; 1 8 1 16 1 9 2 9 2 16 2 21 2 11 2 18 3 15 3 10 3 17 3 6 ; Using SAS (lambs.sas) symbol1 bwidth=5 i=box; axis1 offset=(5); proc gplot; plot wtgain*diet / frame haxis=axis1; proc glm; class diet; model wtgain=diet; output out=diag r=res p=pred; proc gplot; plot res*diet /frame haxis=axis1; proc sort; by pred; symbol1 v=circle i=sm50; proc gplot; plot res*pred / haxis=axis1; run; 4-8 4-9 Log File (.log) 393 option nocenter ps=65 ls=80; 395 data lambs; 396 input diet wtgain; 397 cards; NOTE: The data set WORK.LAMBS has 12 observations and 2 variables. NOTE: DATA statement used: real time 0.10 seconds 412 symbol1 bwidth=5 i=box; 413 axis1 offset=(5); 414 proc gplot; 415 plot wtgain*diet / frame haxis=axis1; NOTE: There were 12 observations read from the data set WORK.LAMBS. NOTE: PROCEDURE GPLOT used: real time 2.57 seconds 418 proc glm; 419 class diet; 420 model wtgain=diet; 421 output out=diag r=res p=pred; 422 run; NOTE: There were 12 observations read from the data set WORK.LAMBS. NOTE: The data set WORK.DIAG has 12 observations and 4 variables. NOTE: PROCEDURE GLM used: real time 0.44 seconds The GLM Procedure Class Level Information Class Levels Values diet 3 1 2 3 Number of observations 12 The GLM Procedure Dependent Variable: wtgain Output File (.lst) Sum of Source DF Squares Mean Square F Value Pr > F Model 2 36.0000000 18.0000000 0.77 0.4907 Error 9 210.0000000 23.3333333 Corrected Total 11 246.0000000 R-Square Coeff Var Root MSE wtgain Mean 0.146341 37.15738 4.830459 13.00000 Source DF Type I SS Mean Square F Value Pr > F diet 2 36.00000000 18.00000000 0.77 0.4907 Source DF Type III SS Mean Square F Value Pr > F diet 2 36.00000000 18.00000000 0.77 0.4907 4-10 4-11

Plots Plots 4-12 4-13 Handout Example options ls=80 ps=60 nocenter; goptions device=win target=winprtm rotate=landscape ftext=swiss hsize=8.0in vsize=6.0in htext=1.5 htitle=1.5 hpos=60 vpos=60 horigin=0.5in vorigin=0.5in; data one; infile c:\saswork\data\tensile.dat ; input percent strength time; title1 Chapter 3 Example ; proc print data=one; run; symbol1 v=circle i=none; title1 Plot of Strength vs Percent Blend ; proc gplot data=one; plot strength*percent/frame; run; proc boxplot; plot strength*percent/boxstyle=skeletal; proc glm; class percent; model strength=percent; output out=oneres p=pred r=res; run; proc sort; by pred; symbol1 v=circle i=sm50; title1 Residual Plot ; proc gplot; plot res*pred/frame; run; proc univariate data=oneres; var res; qqplot res / normal (L=1 mu=est sigma=est); histogram res / normal; run; symbol1 v=circle i=none; title1 Plot of residuals vs time ; proc gplot; plot res*time / vref=0 vaxis=-6 to 6 by 1; run; 4-14 Chapter 3 Example Obs percent strength time 1 15 7 15 2 15 7 19 3 15 15 25 4 15 11 12 5 15 9 6 6 20 12 8 7 20 17 14 8 20 12 1 9 20 18 11 10 20 18 3 11 25 14 18 12 25 18 13 13 25 18 20 14 25 19 7 15 25 19 9 16 30 19 22 17 30 25 5 18 30 22 2 19 30 19 24 20 30 23 10 21 35 7 17 22 35 10 21 23 35 11 4 24 35 15 16 25 35 11 23 4-15

4-16 4-17 The GLM Procedure Dependent Variable: strength Sum of Source DF Squares Mean Square F Value Pr > F Model 4 475.7600000 118.9400000 14.76 <.0001 Error 20 161.2000000 8.0600000 Corrected Total 24 636.9600000 R-Square Coeff Var Root MSE strength Mean 0.746923 18.87642 2.839014 15.04000 Source DF Type I SS Mean Square F Value Pr > F percent 4 475.7600000 118.9400000 14.76 <.0001 Source DF Type III SS Mean Square F Value Pr > F percent 4 475.7600000 118.9400000 14.76 <.0001 4-18 4-19

The UNIVARIATE Procedure Variable: res Moments N 25 Sum Weights 25 Mean 0 Sum Observations 0 Std Deviation 2.59165327 Variance 6.71666667 Skewness 0.11239681 Kurtosis -0.8683604 Uncorrected SS 161.2 Corrected SS 161.2 Coeff Variation. Std Error Mean 0.51833065 Basic Statistical Measures Location Variability Mean 0.00000 Std Deviation 2.59165 Median 0.40000 Variance 6.71667 Mode -3.40000 Range 9.00000 Interquartile Range 4.00000 Tests for Location: Mu0=0 Test -Statistic- -----p Value------ Student s t t 0 Pr > t 1.0000 Sign M 2.5 Pr >= M 0.4244 Signed Rank S 0.5 Pr >= S 0.9896 Fitted Distribution for res Parameters for Normal Distribution Parameter Symbol Estimate Mean Mu 0 Std Dev Sigma 2.591653 Goodness-of-Fit Tests for Normal Distribution Test ---Statistic---- -----p Value----- Kolmogorov-Smirnov D 0.16212279 Pr > D 0.088 Cramer-von Mises W-Sq 0.08045523 Pr > W-Sq 0.203 Anderson-Darling A-Sq 0.51857191 Pr > A-Sq 0.177 4-20 4-21 4-22 4-23