Unit 12: Analysis of Single Factor Experiments

Unit 12: Analysis of Single Factor Experiments Statistics 571: Statistical Methods Ramón V. León 7/16/2004 Unit 12 - Stat 571 - Ramón V. León 1

Introduction
Chapter 8: how to compare two treatments.
Chapter 12: how to compare more than two treatments, limited to a single treatment factor.
Example of a single factor experiment: compare the flight distances of three types of golf balls that differ in the shape of their dimples: circular, fat elliptical, and thin elliptical.
Treatment factor: type of ball.
Factor levels: circular, fat elliptical, and thin elliptical.
Treatments: circular, fat elliptical, and thin elliptical.
How would an experiment with more than one treatment factor look?

Experimental Designs
                           Independent Samples             Dependent Samples
Two Treatments             Independent Samples Design      Matched Pair Design
More Than Two Treatments   Completely Randomized Design    Randomized Block Design

Completely Randomized Design
A random sample is drawn at each of six molding stations.
Runs should be made in random order to protect against a time trend.
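The advice about random run order can be sketched in a few lines of Python. This is only an illustration: the function name, the number of runs per station, and the seed are assumptions, not part of the original experiment.

```python
import random

def randomized_run_order(stations, runs_per_station, seed=None):
    """Return a shuffled list of station labels, one entry per run."""
    rng = random.Random(seed)  # a seed makes the order reproducible
    runs = [s for s in stations for _ in range(runs_per_station)]
    rng.shuffle(runs)          # random order guards against a time trend
    return runs

# Hypothetical plan: six stations, four runs each
order = randomized_run_order(range(1, 7), runs_per_station=4, seed=571)
print(order)
```

Each station still gets the same number of runs; only the time order is randomized.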

Completely Randomized Design: Notation
If the sample sizes are equal, the design is balanced; otherwise the design is unbalanced.
Total sample size: $N = \sum_{i=1}^{a} n_i$

Completely Randomized Design: Comments
In a CRD the experimental units are randomly assigned to the treatments.
Similar data also arise in observational studies, where the units are not assigned to the different groups by the investigator.
Stronger conclusions are possible with experimental data.

Completely Randomized Design: Data Inspection
[Figure: JMP data table, with a variable marked as nominal.]

CRD: Side-by-Side Box Plots
[Figure: side-by-side box plots of Weights by Station, stations 1 to 6.]
Station 5 has two outliers.
Stations 4, 5, and 6, which are supplied by feeder 2, have a higher average as a group than stations 1, 2, and 3, which are supplied by feeder 1. Is this difference real or the result of sampling variation?

CRD Model and Estimation
Model assumption: the data on the i-th treatment are a random sample from an $N(\mu_i, \sigma^2)$ population:
$Y_{ij} = \mu_i + \varepsilon_{ij} \quad (i = 1, 2, \ldots, a;\ j = 1, 2, \ldots, n_i)$
where the $\varepsilon_{ij}$ are independent and identically distributed (i.i.d.) $N(0, \sigma^2)$ random errors.

CRD Model and Estimation
The treatment means $\mu_i$ and the error variance $\sigma^2$ are unknown parameters. The primary interest is in comparing the means.
Frequently, we write $\mu_i = \mu + \tau_i$, where $\mu$ is the "grand mean", defined as the weighted average of the $\mu_i$:
$\mu = \frac{\sum_{i=1}^{a} n_i \mu_i}{\sum_{i=1}^{a} n_i} \quad \left(= \frac{\sum_{i=1}^{a} \mu_i}{a} \text{ if the } n_i = n \text{ are equal}\right)$
and $\tau_i = \mu_i - \mu$ is the deviation of the i-th treatment mean from this grand mean.
We refer to $\tau_i$ as the i-th treatment effect.

CRD Model and Estimation
Alternative formulation of the model:
$Y_{ij} = \mu + \tau_i + \varepsilon_{ij} \quad (i = 1, 2, \ldots, a;\ j = 1, 2, \ldots, n_i)$
The $\tau_i$ are subject to the constraint
$\sum_{i=1}^{a} n_i \tau_i = 0 \quad \left(\text{which reduces to } \sum_{i=1}^{a} \tau_i = 0 \text{ if the } n_i = n \text{ are equal}\right)$
So there are only $a - 1$ linearly independent $\tau_i$'s.
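A small numerical check of the grand mean and the constraint on the treatment effects may help. The treatment means and sample sizes below are made up for illustration; the function name is hypothetical.

```python
def treatment_effects(means, sizes):
    """Grand mean (weighted by sample size) and effects tau_i = mu_i - mu."""
    N = sum(sizes)
    grand = sum(n * m for n, m in zip(sizes, means)) / N
    effects = [m - grand for m in means]
    return grand, effects

# Hypothetical unbalanced design with a = 3 treatments
means = [10.0, 12.0, 17.0]
sizes = [5, 3, 2]

grand, effects = treatment_effects(means, sizes)
# The weighted sum of the effects is zero by construction
constraint = sum(n * t for n, t in zip(sizes, effects))
print(grand, effects, constraint)
```

Because of the constraint, knowing any $a - 1$ of the effects determines the last one.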

CRD Parameter Estimates
$\hat{\sigma}^2 = s^2$: measure of common experimental error.

ANOVA in JMP's Fit Model Platform
Note that the Station variable is nominal.

CRD Parameter Estimates
[Figure: JMP parameter estimates, annotated with $\hat{\mu}$, $\hat{\tau}_1, \ldots, \hat{\tau}_5$, and $s^2$.]
How do we find the value of $\hat{\tau}_6$?

Relationship to Dummy Variable Regression
$z_i = 1$ if station $i$, $-1$ if station 6, and $0$ otherwise $(i = 1, 2, \ldots, 5)$
$y = 51.57 + 0.09 z_1 - 0.23 z_2 - 0.33 z_3 + 0.05 z_4 + 0.13 z_5 + \varepsilon$
$y = \hat{\mu} + \hat{\tau}_1 z_1 + \hat{\tau}_2 z_2 + \hat{\tau}_3 z_3 + \hat{\tau}_4 z_4 + \hat{\tau}_5 z_5 + \varepsilon$
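The coding of the $z_i$ also shows how the sixth effect can be recovered: with equal sample sizes the effects sum to zero, so $\hat{\tau}_6$ is minus the sum of the other five. The sketch below assumes a balanced design and assumes that the two coefficients whose signs were lost in the transcript (0.23 and 0.33) are negative.

```python
mu_hat = 51.57
# Coefficients of z_1..z_5; the negative signs are an assumption
tau_hat = [0.09, -0.23, -0.33, 0.05, 0.13]

# Sum-to-zero constraint (balanced case): tau_6 = -(tau_1 + ... + tau_5)
tau_6 = -sum(tau_hat)

def fitted_mean(station):
    """Fitted mean for a station under the 1 / -1 / 0 coding of z_1..z_5."""
    if station == 6:
        z = [-1] * 5  # station 6 is coded -1 on every z_i
    else:
        z = [1 if i == station else 0 for i in range(1, 6)]
    return mu_hat + sum(t * zi for t, zi in zip(tau_hat, z))

print(round(tau_6, 2), round(fitted_mean(6), 2))
```

Under these assumptions, the fitted mean for station 6 equals $\hat{\mu} + \hat{\tau}_6$, just as the fitted mean for station $i \le 5$ equals $\hat{\mu} + \hat{\tau}_i$.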

CRD Parameter Estimates
[Figure: JMP output, annotated with $s^2$.]

CRD: $(1-\alpha)$-level Confidence Interval for $\mu_i$
$\bar{y}_i - t_{N-a,\alpha/2} \frac{s}{\sqrt{n_i}} \le \mu_i \le \bar{y}_i + t_{N-a,\alpha/2} \frac{s}{\sqrt{n_i}}$
However, usually we are more interested in comparing the $\mu_i$ with each other than in estimating them separately.
Fit Y by X: [Figure: JMP output; the standard error of each mean is $s/\sqrt{n_i}$.]
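As a rough sketch of the interval above, the following computes the pooled standard deviation $s$ and the interval for one treatment mean. The data are invented, and the critical value $t_{6, 0.025} = 2.447$ is taken from a t table rather than computed.

```python
import math

def pooled_sd(groups):
    """Pooled standard deviation s = sqrt(SSE / (N - a))."""
    N = sum(len(g) for g in groups)
    a = len(groups)
    sse = sum((y - sum(g) / len(g)) ** 2 for g in groups for y in g)
    return math.sqrt(sse / (N - a))

def mean_ci(group, s, t_crit):
    """Confidence interval for one treatment mean using the pooled s."""
    ybar = sum(group) / len(group)
    half_width = t_crit * s / math.sqrt(len(group))
    return ybar - half_width, ybar + half_width

# Hypothetical data: a = 3 treatments, n = 3 each, so N - a = 6
groups = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [5.0, 6.0, 7.0]]
s = pooled_sd(groups)
T_CRIT = 2.447  # t_{6, 0.025} from a t table (assumed input)
print(s, mean_ci(groups[0], s, T_CRIT))
```

With equal sample sizes every interval has the same half-width, because $s$ is pooled across all treatments.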

Mean Diamonds in JMP
Why do all the diamonds have the same height?

Analysis of Variance
Homogeneity hypothesis:
$H_0: \mu_1 = \mu_2 = \cdots = \mu_a$ vs. $H_1$: not all the $\mu_i$ are equal.
Equivalently, $H_0: \tau_1 = \tau_2 = \cdots = \tau_a = 0$ vs. $H_1$: at least some $\tau_i \ne 0$.
Note: SSA = treatment sum of squares.
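The F statistic for the homogeneity hypothesis can be computed directly from the sums of squares. The sketch below uses invented data and a hypothetical function name; it does not reproduce the JMP output.

```python
def one_way_anova(groups):
    """Return SSA, SSE, their degrees of freedom, and F = MSA / MSE."""
    a = len(groups)
    N = sum(len(g) for g in groups)
    grand = sum(sum(g) for g in groups) / N
    means = [sum(g) / len(g) for g in groups]
    # Treatment sum of squares: between-group variation
    ssa = sum(len(g) * (m - grand) ** 2 for g, m in zip(groups, means))
    # Error sum of squares: within-group variation
    sse = sum((y - m) ** 2 for g, m in zip(groups, means) for y in g)
    f_stat = (ssa / (a - 1)) / (sse / (N - a))
    return {"SSA": ssa, "SSE": sse, "df_A": a - 1, "df_E": N - a, "F": f_stat}

# Hypothetical data for a = 3 treatments
table = one_way_anova([[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [5.0, 6.0, 7.0]])
print(table)
```

$H_0$ is rejected at level $\alpha$ when F exceeds the upper $\alpha$ critical point of the F distribution with $a - 1$ and $N - a$ degrees of freedom.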

ANOVA in JMP
Wrong ANOVA table (Model: $Y = \beta_0 + \beta_1\,\mathrm{Station} + \varepsilon$, which treats Station as a continuous variable): note that the SS has the wrong number of degrees of freedom.
Correct ANOVA table (Model: $Y = \mu + \tau_1 z_1 + \tau_2 z_2 + \tau_3 z_3 + \tau_4 z_4 + \tau_5 z_5 + \varepsilon$).

Model Diagnostics: Residuals versus Fitted Values
Part of the Fit Model output.
Residuals: $e_{ij} = y_{ij} - \bar{y}_i$
This plot checks the assumption of constant error variance $\sigma^2$.
A cone shape in this plot would suggest a log transformation of the response.

Model Diagnostics: Assumption of Equal Variances (More Formal Tests)

Model Diagnostics: Residuals versus Row (Time?) Order
Fit Model platform: a time pattern here would be confounded with a station effect.
The JMP table should be in the random order in which the data are supposed to have been collected.

Model Diagnostics: Normal Plot of Residuals
Strong indication that the errors are normally distributed.

Multiple Comparison of Means
If $H_0: \mu_1 = \cdots = \mu_a$ is rejected, all that we can say is that the treatment means are not all equal. The F-test does not pinpoint which treatment means are significantly different from each other.
We could test all pairwise equality hypotheses $H_{0ij}: \mu_i = \mu_j$.
Reject $H_{0ij}$ if
$|t_{ij}| = \frac{|\bar{y}_i - \bar{y}_j|}{s\sqrt{1/n_i + 1/n_j}} > t_{N-a,\alpha/2}$
or, equivalently, if
$|\bar{y}_i - \bar{y}_j| > t_{N-a,\alpha/2}\, s \sqrt{1/n_i + 1/n_j} = \mathrm{LSD}$ (least significant difference).
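The LSD criterion is a one-line computation once $s$ and the critical value are known. In this sketch the sample means, $s = 1$, the sample sizes, and the tabled value $t_{6, 0.025} = 2.447$ are illustrative assumptions.

```python
import math

def lsd(s, n_i, n_j, t_crit):
    """Least significant difference for comparing treatments i and j."""
    return t_crit * s * math.sqrt(1 / n_i + 1 / n_j)

def lsd_significant(ybar_i, ybar_j, s, n_i, n_j, t_crit):
    """True if the two sample means differ by more than the LSD."""
    return abs(ybar_i - ybar_j) > lsd(s, n_i, n_j, t_crit)

# Hypothetical values: pooled s = 1, n = 3 per group, t_{6, 0.025} = 2.447
threshold = lsd(1.0, 3, 3, 2.447)
print(round(threshold, 3))
print(lsd_significant(2.0, 3.0, 1.0, 3, 3, 2.447))  # difference of 1
print(lsd_significant(2.0, 6.0, 1.0, 3, 3, 2.447))  # difference of 4
```

Each such test controls the error rate for one pair only, not for the whole family of pairs.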

Pairwise Equality Hypotheses
Since each of the 15 pairwise tests has level α, the type I error probability of declaring at least one pairwise difference falsely significant will exceed α.
Familywise error rate (FWE): FWE = P{reject at least one true null hypothesis}.
If all six means are actually equal in the plastic container example, FWE = 0.350 when each LSD test is done at the 0.05 level.
Fisher's protected LSD method: use the LSD method only after the F-test rejects. (This method is not recommended today.)
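The count of 15 tests and the size of the problem can be checked quickly. The two reference values computed below are not the slide's FWE of 0.350: one treats the tests as if they were independent, the other is the Bonferroni upper bound. The pairwise tests share sample means and the pooled $s$, so they are not independent, which is presumably why the actual figure is lower. Function names are hypothetical.

```python
from math import comb

def n_pairs(a):
    """Number of pairwise comparisons among a treatments."""
    return comb(a, 2)

def fwe_if_independent(m, alpha):
    """FWE for m tests at level alpha IF the tests were independent."""
    return 1 - (1 - alpha) ** m

def bonferroni_bound(m, alpha):
    """Upper bound on the FWE that holds under any dependence."""
    return min(1.0, m * alpha)

m = n_pairs(6)
print(m, fwe_if_independent(m, 0.05), bonferroni_bound(m, 0.05))
```

Either way, the familywise rate is far above the nominal 0.05 of a single test.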

LSD Method in JMP: Overlap Marks
If the overlap marks overlap, the two means are not significantly different according to the LSD criterion.

LSD Method in JMP
Fit Y by X JMP platform:

Tukey Method
Recommended method: FWE = α if the sample sizes are equal, and the method is slightly conservative (i.e., the actual FWE is less than α) when the sample sizes are unequal.

This report shows the ranked differences, from highest to lowest, with a confidence interval band overlaid on the plot. Confidence intervals that do not fully contain their corresponding bar are significantly different from each other.

Tukey Method: Confidence Intervals
This is a way of constructing 100(1-α)% simultaneous confidence intervals (SCIs) for all pairwise differences of means.

Tukey Method: Confidence Intervals
Compare to the Minitab output at the bottom of Figure 12.6 of your textbook. How would you get the top output in that figure?

Dunnett Method for Comparisons with a Control

Dunnett Method in JMP

Hsu Method for Comparison with the Best

Box Plots for Teaching Method
[Figure: box plots of Test Score by Method (Case, Equation, Formula, Unitary Analysis).]

Hsu Method in JMP
(Explanation on the next page.)

Hsu Method in JMP
The Unitary method is best.
Can't tell which is the worst method.

Randomized Block Design
Blocking helps to reduce the experimental error variation caused by differences among the experimental units by grouping them into homogeneous sets (called blocks).
Treatments are randomly assigned within each block.

Randomized Block Design Model: Fixed Block Effects
$Y_{ij} = \mu + \tau_i + \beta_j + \varepsilon_{ij} \quad (i = 1, \ldots, a;\ j = 1, \ldots, b)$
where the $\varepsilon_{ij}$ are i.i.d. $N(0, \sigma^2)$,
$\mu$ is called the grand mean,
$\tau_i$ is called the i-th treatment effect,
$\beta_j$ is called the j-th block effect.
$\sum_{i=1}^{a} \tau_i = 0$ and $\sum_{j=1}^{b} \beta_j = 0$, so there are $a - 1$ independent treatment effects and $b - 1$ independent block effects.

Mystery of Degrees of Freedom Explained
Counting the grand mean, there are $1 + (a - 1) + (b - 1) = a + b - 1$ unknown parameters. (This many degrees of freedom are needed to estimate these parameters.)
There are $N = ab$ observations (total degrees of freedom).
So there are $\nu = ab - (a + b - 1) = (a - 1)(b - 1)$ degrees of freedom for estimating the error variation (degrees of freedom for error).
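The counting argument can be verified mechanically for any number of treatments and blocks. The function name and the example values of a and b are illustrative only.

```python
def rbd_degrees_of_freedom(a, b):
    """Degrees of freedom bookkeeping for an RBD with a treatments, b blocks."""
    total = a * b                      # number of observations, N = ab
    params = 1 + (a - 1) + (b - 1)     # grand mean + treatment + block effects
    return {"total": total, "parameters": params, "error": total - params}

# Hypothetical design: 5 treatments in 3 blocks
df = rbd_degrees_of_freedom(a=5, b=3)
print(df)
```

The error entry always agrees with the product form $(a - 1)(b - 1)$.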

No Interactions Between Treatments and Blocks
The difference in mean responses between any two treatments is the same across all blocks:
$\mu_{ij} - \mu_{i'j} = (\mu + \tau_i + \beta_j) - (\mu + \tau_{i'} + \beta_j) = \tau_i - \tau_{i'}$
which is independent of the particular block $j$.
We say that there are no interactions between treatments and blocks.
Example: consider the treatments to be fertilizers and the blocks to be different fields. Then no interaction implies that the difference in mean yields between any two fertilizers is the same for all fields.
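A tiny numerical version of the fertilizer example makes the cancellation visible. All parameter values and labels below are invented for illustration.

```python
# Hypothetical additive model: mean yield = mu + tau_i + beta_j
mu = 50.0
tau = {"A": -2.0, "B": 0.5, "C": 1.5}                    # fertilizer effects
beta = {"field1": -3.0, "field2": 1.0, "field3": 2.0}    # field (block) effects

def cell_mean(fertilizer, field):
    """Mean response for one fertilizer in one field, with no interaction."""
    return mu + tau[fertilizer] + beta[field]

# Difference between fertilizers A and B, field by field
diffs = [cell_mean("A", f) - cell_mean("B", f) for f in beta]
print(diffs)
```

The block effect cancels in every field, so each difference equals $\tau_A - \tau_B$.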

RBD Example
Notice that the interest is in the differences among the positions. We assume that these differences are the same for all three batches except for random error; that is, we assume no interaction between batch and position.

JMP Analysis of Drip Loss Experiment
[Figure: JMP screenshot, annotated "Nominal".]

JMP Analysis of Drip Loss Experiment
Position and batch explain 86% of the variation in drip loss.
SSModel = SSTreatment + SSBlocks. This is true because we assume no interaction between treatment and block. (See next slide.)

JMP 4 Analysis of Drip Loss Experiment. III
These two tables were not the same in regression. They are equal here because the model is balanced.
Also, in regression the sum of the Type III sums of squares is not equal to the model sum of squares. This is true here only because the model is balanced.
The P-values show that there are significant position effects. We recommend ignoring the Block (Batch) test because it is not meaningful for the RBD.
(Type III) Model SS = 56.654971
Recall: the sum of the Type I sums of squares is always equal to the model sum of squares.

Drip Loss in Meat Loaves: Residual Plots
The predicted versus residual plot is part of the standard output of the Fit Model platform. The normal plot was obtained by saving the residuals and then going to the Distribution platform.

Tukey Method for the RBD
Use the Fit Model platform with both batch and position in the model. It is important that both variables be included.
Warning: don't use the Fit Y by X platform to do Tukey's test, as you will use the wrong number of degrees of freedom.

Tukey Method for the RBD

Tukey Method for the RBD

Mixed Effects Model for the RB Design
$Y_{ij} = \mu + \tau_i + \beta_j + \varepsilon_{ij} \quad (i = 1, \ldots, a;\ j = 1, \ldots, b)$
where the $\varepsilon_{ij}$ are i.i.d. $N(0, \sigma^2)$ and the $\beta_j$ are i.i.d. $N(0, \sigma_B^2)$, with the $\varepsilon_{ij}$ and the $\beta_j$ independent,
$\mu$ is called the grand mean,
$\tau_i$ is called the i-th treatment effect,
the $\beta_j$'s are called the block effects.
$\sum_{i=1}^{a} \tau_i = 0$, so there are $a - 1$ independent treatment effects.

Compare with the results in Section 12.4.5, Example 12.16 of your textbook.
The variability due to batches accounts for about 58.4% of the total variability in drip loss.
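A percentage of this kind is the share $\sigma_B^2 / (\sigma_B^2 + \sigma^2)$ of the block variance in the total. The sketch below uses the method-of-moments estimate $\hat{\sigma}_B^2 = (MS_{Blocks} - MSE)/a$, which follows from $E(MS_{Blocks}) = \sigma^2 + a\sigma_B^2$ in a balanced design. This is an assumption about the estimator: JMP may estimate the variance components differently, and the mean squares used here are invented, not the drip loss values.

```python
def block_variance_share(ms_blocks, ms_error, a):
    """Estimated share of total variance due to random block effects.

    Method-of-moments sketch for a balanced RBD with a treatments:
    sigma_B^2 is estimated by (MS_blocks - MS_error) / a, floored at 0.
    """
    sigma2_b = max(0.0, (ms_blocks - ms_error) / a)
    sigma2 = ms_error
    return sigma2_b / (sigma2_b + sigma2)

# Hypothetical mean squares with a = 4 treatments
share = block_variance_share(ms_blocks=10.0, ms_error=2.0, a=4)
print(share)
```

A share near zero would suggest that blocking removed little variation; a large share, as reported for the batches, suggests that blocking was worthwhile.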