Dr. Shalabh Department of Mathematics and Statistics Indian Institute of Technology Kanpur


Analysis of Variance and Design of Experiments-I
MODULE IX, LECTURE 38: EXERCISES

Example 1 (Completely randomized design)

Suppose there are four types of medicine which claim to control the body temperature of a patient having fever. Let us denote them as $M_1$, $M_2$, $M_3$ and $M_4$. Suppose there are 20 patients who are suffering from fever and have agreed to use the medicines. Our objective is to know whether there is any difference in the effects of the medicines. Note that the efficacy of a medicine also depends on the age of the person, but at present we ignore this aspect; we consider it later in the randomized block design. At present, we consider the simplest set up, the completely randomized design.

Suppose it is decided to give medicine $M_1$ to 4 patients, $M_2$ to 5 patients, $M_3$ to 6 patients and $M_4$ to the remaining 5 patients. Note that the number of patients to be given a specific medicine is decided at random. Moreover, which patient is given which medicine is decided randomly, so that other factors, e.g., age, do not affect the final conclusions. The medicines are administered and the observations on the number of hours of temperature control are recorded as follows:

| Medicine | Number of hours |
|---|---|
| $M_1$ | 6, 8, 4, 9 |
| $M_2$ | 7, 9, 6, 5, 4 |
| $M_3$ | 8, 10, 9, 12, 13, 5 |
| $M_4$ | 7, 5, 6, 6, 4 |

In the one way model

$$y_{ij} = \mu + \tau_i + \varepsilon_{ij},$$

$\tau_i$ denotes the effect of the $i$-th medicine. There are 4 types of medicine, so $i = 1, 2, 3, 4$; $y_{ij}$ denotes the number of hours of temperature control for the $j$-th patient who is given the $i$-th medicine. In our set up:

$M_1$: $y_{11} = 6,\ y_{12} = 8,\ y_{13} = 4,\ y_{14} = 9$
$M_2$: $y_{21} = 7,\ y_{22} = 9,\ y_{23} = 6,\ y_{24} = 5,\ y_{25} = 4$
$M_3$: $y_{31} = 8,\ y_{32} = 10,\ y_{33} = 9,\ y_{34} = 12,\ y_{35} = 13,\ y_{36} = 5$
$M_4$: $y_{41} = 7,\ y_{42} = 5,\ y_{43} = 6,\ y_{44} = 6,\ y_{45} = 4$

The null and alternative hypotheses under consideration are

$H_0: \tau_1 = \tau_2 = \tau_3 = \tau_4$
$H_1$: The effect of at least one pair of medicines is not the same.

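We now test this null hypothesis using the set up of one way analysis of variance. First we explain how to compute all the related terms:

$$n_1 = 4,\quad n_2 = 5,\quad n_3 = 6,\quad n_4 = 5,\qquad n = \sum_{i=1}^{4} n_i = 20.$$

The treatment means are

$$\bar{y}_{1o} = \frac{1}{n_1}\sum_{j=1}^{n_1} y_{1j} = \frac{6+8+4+9}{4} = 6.75$$
$$\bar{y}_{2o} = \frac{1}{n_2}\sum_{j=1}^{n_2} y_{2j} = \frac{7+9+6+5+4}{5} = 6.2$$
$$\bar{y}_{3o} = \frac{1}{n_3}\sum_{j=1}^{n_3} y_{3j} = \frac{8+10+9+12+13+5}{6} = 9.5$$
$$\bar{y}_{4o} = \frac{1}{n_4}\sum_{j=1}^{n_4} y_{4j} = \frac{7+5+6+6+4}{5} = 5.6$$

$G$ = sum of all the observations = 143, so the grand mean is

$$\bar{y}_{oo} = \frac{1}{n}\sum_{i=1}^{4}\sum_{j=1}^{n_i} y_{ij} = \frac{G}{n} = \frac{143}{20} = 7.15.$$

The estimates of the treatment effects are

$$\hat{\tau}_1 = \bar{y}_{1o} - \bar{y}_{oo} = -0.40$$
$$\hat{\tau}_2 = \bar{y}_{2o} - \bar{y}_{oo} = -0.95$$
$$\hat{\tau}_3 = \bar{y}_{3o} - \bar{y}_{oo} = 2.35$$
$$\hat{\tau}_4 = \bar{y}_{4o} - \bar{y}_{oo} = -1.55$$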
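The treatment means and effect estimates above can be checked with a few lines of Python. This is only a minimal sketch: the dictionary `data` holds the observations of Example 1 as read from the table, and the helper name `treatment_effects` is introduced here purely for illustration.

```python
# Observations of Example 1 (hours of temperature control), as read from the table.
data = {
    "M1": [6, 8, 4, 9],
    "M2": [7, 9, 6, 5, 4],
    "M3": [8, 10, 9, 12, 13, 5],
    "M4": [7, 5, 6, 6, 4],
}


def treatment_effects(groups):
    """Return (grand_mean, effects), where effects[label] = group mean - grand mean."""
    n = sum(len(values) for values in groups.values())
    grand_total = sum(sum(values) for values in groups.values())
    grand_mean = grand_total / n
    effects = {
        label: sum(values) / len(values) - grand_mean
        for label, values in groups.items()
    }
    return grand_mean, effects


grand_mean, effects = treatment_effects(data)
print("grand mean:", round(grand_mean, 2))
for label, tau_hat in effects.items():
    print(label, "effect estimate:", round(tau_hat, 2))
```

Because the group sizes are unequal, the estimates satisfy the weighted restriction $\sum_i n_i \hat{\tau}_i = 0$ rather than summing to zero unweighted.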

The total sum of squares is

$$TSS = \sum_{i=1}^{4}\sum_{j=1}^{n_i} y_{ij}^2 - \frac{G^2}{n} = (\text{sum of squares of all the observations}) - \frac{G^2}{n} = 1149 - \frac{143^2}{20} = 1149 - 1022.45 = 126.55.$$

The treatment totals are the totals of the observations obtained by giving a specific treatment. The treatment totals due to $\tau_1$, $\tau_2$, $\tau_3$ and $\tau_4$ are denoted as $T_1$, $T_2$, $T_3$ and $T_4$ respectively and are obtained as

$$T_1 = \sum_{j=1}^{n_1} y_{1j} = 6+8+4+9 = 27$$
$$T_2 = \sum_{j=1}^{n_2} y_{2j} = 7+9+6+5+4 = 31$$
$$T_3 = \sum_{j=1}^{n_3} y_{3j} = 8+10+9+12+13+5 = 57$$
$$T_4 = \sum_{j=1}^{n_4} y_{4j} = 7+5+6+6+4 = 28.$$

The sum of squares due to treatment is

$$SSTr = \sum_{i=1}^{4}\frac{T_i^2}{n_i} - \frac{G^2}{n} = \frac{T_1^2}{n_1} + \frac{T_2^2}{n_2} + \frac{T_3^2}{n_3} + \frac{T_4^2}{n_4} - \frac{G^2}{n} = 1072.75 - 1022.45 = 50.30.$$

The sum of squares due to error is

$$SSE = TSS - SSTr = 126.55 - 50.30 = 76.25.$$

The mean square due to treatment is

$$MSTr = \frac{SSTr}{4-1} = \frac{50.30}{3} = 16.77.$$

The mean square due to error is

$$MSE = \frac{SSE}{20-4} = \frac{76.25}{16} = 4.77.$$

The value of the F-statistic is

$$F = \frac{MSTr}{MSE} = 3.52.$$

The tabulated value of F at 3 and 16 degrees of freedom at 5% level of significance is $F_{tab}(3, 16) = 3.24$.

The analysis of variance table is now constructed as

| Source of variation | Degrees of freedom | Sum of squares | Mean squares | F-value |
|---|---|---|---|---|
| Medicines (treatments) | 4 - 1 = 3 | 50.30 | 16.77 | 3.52 |
| Error | 20 - 4 = 16 | 76.25 | 4.77 | |
| Total | 20 - 1 = 19 | 126.55 | | |

Since $F > F_{tab}$, $H_0$ is rejected at 5% level of significance. This means that, on the basis of the given sample, it can be concluded that the treatment effects are not all the same, i.e., the four medicines do not all have the same effect on the patients.

This conclusion poses the next question. When the null hypothesis is rejected, which of the treatment effects are responsible for the rejection? Are all the treatment effects different from each other, or do some of the treatments have the same effect? To answer this, we go for a multiple comparison test. Various multiple comparison tests are available. The conclusions based on different tests need not always be the same.
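As a cross-check on the one way analysis of variance, the sums of squares can be recomputed directly from the observations. The sketch below uses only the standard library; the function name `one_way_anova` and the dictionary keys are illustrative choices, and the data are the Example 1 observations as read from the table. The critical value is not computed here and would still be taken from an F table.

```python
# Observations of Example 1, grouped by medicine.
groups = [
    [6, 8, 4, 9],             # M1
    [7, 9, 6, 5, 4],          # M2
    [8, 10, 9, 12, 13, 5],    # M3
    [7, 5, 6, 6, 4],          # M4
]


def one_way_anova(groups):
    """One way ANOVA for possibly unequal group sizes."""
    n = sum(len(g) for g in groups)
    grand_total = sum(sum(g) for g in groups)
    correction = grand_total ** 2 / n                      # G^2 / n
    tss = sum(y * y for g in groups for y in g) - correction
    sstr = sum(sum(g) ** 2 / len(g) for g in groups) - correction
    sse = tss - sstr
    df_tr = len(groups) - 1
    df_e = n - len(groups)
    mstr = sstr / df_tr
    mse = sse / df_e
    return {
        "TSS": tss, "SSTr": sstr, "SSE": sse,
        "df_tr": df_tr, "df_e": df_e,
        "MSTr": mstr, "MSE": mse, "F": mstr / mse,
    }


result = one_way_anova(groups)
for name, value in result.items():
    print(name, round(value, 2))
```

The error sum of squares can also be obtained as the pooled within-group sum of squared deviations, which gives an independent check on the subtraction $TSS - SSTr$.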

Example 2 (Randomized block design)

We continue with the set up of Example 1, but now conduct the experiment in the set up of a randomized block design (RBD). Suppose there are four medicines, denoted as $M_1$, $M_2$, $M_3$ and $M_4$, which are to be tested on 20 patients to know their effect in controlling the body temperature of patients having fever. The effect of a medicine depends not only on its chemical composition but also on other factors. We consider here another important factor, the age of the patients. Based on age, we divide the patients into five age groups (the oldest being 50-65 years), which provide the five blocks denoted as $B_1$, $B_2$, $B_3$, $B_4$ and $B_5$ respectively. In each age group (or block) there are 4 patients, which is the same as the number of medicines. The four medicines are given to the four patients at random in each block, and the readings on the number of hours of fever control are recorded. The observations are compiled in the following table under the set up of an RBD.

| | Block 1 | Block 2 | Block 3 | Block 4 | Block 5 |
|---|---|---|---|---|---|
| Medicine 1 | 5 | 7 | 8 | 6 | 7 |
| Medicine 2 | 4 | 6 | 7 | 5 | 5 |
| Medicine 3 | 6 | 8 | 8 | 4 | 5 |
| Medicine 4 | 4 | 7 | 9 | 7 | 6 |

One can observe here that the observations depend not only on the type of medicine given but also on how the patients are allocated to the different blocks or, in simple words, how the blocks are constructed. The way in which the blocks are constructed introduces the block effect. Ideally, the construction of the blocks is not expected to affect the efficacy of the medicines. So, besides the equality of the treatment effects, it is also important to check whether the block effects are all the same.

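The computations involved are as follows:

| | Block 1 | Block 2 | Block 3 | Block 4 | Block 5 | Treatment totals | Means |
|---|---|---|---|---|---|---|---|
| $M_1$ | 5 | 7 | 8 | 6 | 7 | $T_1 = 33$ | $\bar{y}_{o1} = 6.6$ |
| $M_2$ | 4 | 6 | 7 | 5 | 5 | $T_2 = 27$ | $\bar{y}_{o2} = 5.4$ |
| $M_3$ | 6 | 8 | 8 | 4 | 5 | $T_3 = 31$ | $\bar{y}_{o3} = 6.2$ |
| $M_4$ | 4 | 7 | 9 | 7 | 6 | $T_4 = 33$ | $\bar{y}_{o4} = 6.6$ |
| Block totals | $B_1 = 19$ | $B_2 = 28$ | $B_3 = 32$ | $B_4 = 22$ | $B_5 = 23$ | $G = 124$ | $\bar{y}_{oo} = 6.2$ |

The model is

$$y_{ij} = \mu + \beta_i + \tau_j + \varepsilon_{ij}$$

where
* $y_{ij}$ is the number of hours of fever control when the $j$-th medicine is given to a patient in the $i$-th block,
* $\mu$ is the general mean effect,
* $\beta_i$ is the $i$-th block effect (effect of age), $i = 1, 2, 3, 4, 5$,
* $\tau_j$ is the $j$-th treatment effect (effect of medicine), $j = 1, 2, 3, 4$, as there are four medicines,
* $n = 20$.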

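The estimate of the general mean is

$$\hat{\mu} = \bar{y}_{oo} = \frac{1}{20}\sum_{i=1}^{5}\sum_{j=1}^{4} y_{ij} = \frac{124}{20} = 6.2.$$

The estimates of the block effects are

$$\hat{\beta}_i = \frac{1}{4}\sum_{j=1}^{4} y_{ij} - \bar{y}_{oo} = \bar{y}_{io} - \bar{y}_{oo}$$
$$\hat{\beta}_1 = 4.75 - 6.2 = -1.45$$
$$\hat{\beta}_2 = 7 - 6.2 = 0.8$$
$$\hat{\beta}_3 = 8 - 6.2 = 1.8$$
$$\hat{\beta}_4 = 5.5 - 6.2 = -0.7$$
$$\hat{\beta}_5 = 5.75 - 6.2 = -0.45$$

The estimates of the treatment effects are

$$\hat{\tau}_j = \frac{1}{5}\sum_{i=1}^{5} y_{ij} - \bar{y}_{oo} = \bar{y}_{oj} - \bar{y}_{oo}$$
$$\hat{\tau}_1 = 6.6 - 6.2 = 0.4$$
$$\hat{\tau}_2 = 5.4 - 6.2 = -0.8$$
$$\hat{\tau}_3 = 6.2 - 6.2 = 0$$
$$\hat{\tau}_4 = 6.6 - 6.2 = 0.4$$

$$\text{Correction factor} = \frac{G^2}{n} = \frac{124^2}{20} = 768.8.$$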
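The block and treatment effect estimates can be reproduced from the two-way table. In this sketch the list `hours` stores the RBD observations with one row per medicine and one column per block (an assumed layout chosen to mirror the table), and `rbd_effects` is a hypothetical helper name.

```python
# RBD observations: rows are medicines M1..M4, columns are blocks B1..B5.
hours = [
    [5, 7, 8, 6, 7],
    [4, 6, 7, 5, 5],
    [6, 8, 8, 4, 5],
    [4, 7, 9, 7, 6],
]


def rbd_effects(table):
    """Return (mu_hat, block_effects, treatment_effects) for a complete two-way table."""
    t = len(table)        # number of treatments (rows)
    b = len(table[0])     # number of blocks (columns)
    mu_hat = sum(map(sum, table)) / (t * b)
    block_effects = [
        sum(table[j][i] for j in range(t)) / t - mu_hat for i in range(b)
    ]
    treatment_effects = [sum(row) / b - mu_hat for row in table]
    return mu_hat, block_effects, treatment_effects


mu_hat, beta_hat, tau_hat = rbd_effects(hours)
print("mu:", round(mu_hat, 2))
print("block effects:", [round(x, 2) for x in beta_hat])
print("treatment effects:", [round(x, 2) for x in tau_hat])
```

In a complete block design both sets of estimates sum to zero, which is a quick sanity check on the arithmetic.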

The total sum of squares is

$$TSS = \sum_{i=1}^{5}\sum_{j=1}^{4} y_{ij}^2 - \frac{G^2}{n} = (\text{sum of squares of all the observations}) - \frac{G^2}{n} = 810 - 768.8 = 41.2.$$

The sum of squares due to blocks is

$$SSBl = \sum_{i=1}^{5}\frac{B_i^2}{4} - \frac{G^2}{n} = 795.5 - 768.8 = 26.7.$$

The sum of squares due to treatments is

$$SSTr = \sum_{j=1}^{4}\frac{T_j^2}{5} - \frac{G^2}{n} = 773.6 - 768.8 = 4.8.$$

The sum of squares due to error is

$$SSE = TSS - SSBl - SSTr = 41.2 - 26.7 - 4.8 = 9.7.$$

The mean square due to blocks is

$$MSBl = \frac{SSBl}{5-1} = \frac{26.7}{4} = 6.675.$$

The mean square due to treatment is

$$MSTr = \frac{SSTr}{4-1} = \frac{4.8}{3} = 1.6.$$

The mean square due to error is

$$MSE = \frac{SSE}{(5-1)(4-1)} = \frac{9.7}{12} = 0.808.$$

The F-statistic for testing $H_{0B}: \beta_1 = \beta_2 = \beta_3 = \beta_4 = \beta_5$ is

$$F_{bl} = \frac{MSBl}{MSE} = 8.26.$$

The critical value of F at 5% level of significance with 4 and 12 degrees of freedom is $F_{tab}(4, 12) = 3.26$. Based on the given set of data, the null hypothesis $H_{0B}$ is rejected as $F_{bl} > F_{tab}(4, 12)$.

The F-statistic for testing $H_{0T}: \tau_1 = \tau_2 = \tau_3 = \tau_4$ is

$$F_{tr} = \frac{MSTr}{MSE} = 1.98.$$

The critical value of F at 5% level of significance with 3 and 12 degrees of freedom is $F_{tab}(3, 12) = 3.49$.

Based on the given set of data, the null hypothesis $H_{0T}$ is not rejected as $F_{tr} < F_{tab}(3, 12)$. The analysis of variance table is compiled as follows:

| Source of variation | Degrees of freedom | Sum of squares | Mean squares | F-value |
|---|---|---|---|---|
| Blocks | 4 | 26.7 | 6.675 | 8.26 |
| Medicines | 3 | 4.8 | 1.6 | 1.98 |
| Error | 12 | 9.7 | 0.808 | |
| Total | 19 | 41.2 | | |

Looking at the conclusions drawn from this RBD arrangement, we observe the following on the basis of the given set of data. The null hypothesis corresponding to the medicines, i.e., the treatment effects, is not rejected. This means that the data do not show any difference in the effectiveness of the medicines. The null hypothesis about the block effects is rejected, i.e., the effect due to the age of the patients is not the same in all the blocks.

Now note that in Example 1 the age effect was not considered, and the conclusion was that the medicines do not all have the same effect. This conclusion is reversed in this example, where the age effect is incorporated. So the blocking factor plays an important role.
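The whole RBD analysis of variance can be recomputed from the table of observations. The following is a sketch under the same assumed layout as the table (rows are medicines, columns are blocks); `rbd_anova` and its dictionary keys are names introduced here for illustration, and the critical values are still read from an F table.

```python
# RBD observations: rows are medicines M1..M4, columns are blocks B1..B5.
obs = [
    [5, 7, 8, 6, 7],
    [4, 6, 7, 5, 5],
    [6, 8, 8, 4, 5],
    [4, 7, 9, 7, 6],
]


def rbd_anova(table):
    """ANOVA for a randomized block design with one observation per cell."""
    t = len(table)        # treatments
    b = len(table[0])     # blocks
    n = t * b
    grand_total = sum(map(sum, table))
    correction = grand_total ** 2 / n
    treatment_totals = [sum(row) for row in table]
    block_totals = [sum(table[j][i] for j in range(t)) for i in range(b)]
    tss = sum(y * y for row in table for y in row) - correction
    ssbl = sum(B * B for B in block_totals) / t - correction
    sstr = sum(T * T for T in treatment_totals) / b - correction
    sse = tss - ssbl - sstr
    mse = sse / ((b - 1) * (t - 1))
    return {
        "TSS": tss, "SSBl": ssbl, "SSTr": sstr, "SSE": sse,
        "F_bl": (ssbl / (b - 1)) / mse,
        "F_tr": (sstr / (t - 1)) / mse,
    }


anova = rbd_anova(obs)
for name, value in anova.items():
    print(name, round(value, 2))
```

Comparing `F_bl` with the tabulated value at 4 and 12 degrees of freedom and `F_tr` with the value at 3 and 12 degrees of freedom gives the two decisions stated above.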

Example 3 (Latin square design)

The mileage of a car is the number of kilometers it runs on one liter of petrol. The mileage depends on the type of petrol, the type of car and the driving habits of the driver. Suppose there are five varieties of petrol denoted as A, B, C, D, E; five different drivers denoted as $D_1$, $D_2$, $D_3$, $D_4$, $D_5$; and five cars denoted as car 1, car 2, car 3, car 4 and car 5. An experiment is conducted to know the effect of these factors, viz., petrol, driver and car, using a Latin square design. A Latin square of order $5 \times 5$ is chosen and its rows and columns are randomized. The resulting square is as follows:

A D B C E
D A C E B
E B A D C
B C E A D
C E D B A
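The defining property of a Latin square, and the fact that permuting whole rows and whole columns preserves it, can be illustrated in code. This is a sketch only: `is_latin_square` and `randomize_rows_and_columns` are hypothetical helper names, the starting square is an assumed cyclic square, and the seed is arbitrary, so the randomized result will generally not be the particular square shown above.

```python
import random


def is_latin_square(square):
    """True if every symbol occurs exactly once in each row and each column."""
    n = len(square)
    symbols = set(square[0])
    if len(symbols) != n:
        return False
    rows_ok = all(len(row) == n and set(row) == symbols for row in square)
    if not rows_ok:
        return False
    return all({square[i][j] for i in range(n)} == symbols for j in range(n))


def randomize_rows_and_columns(square, rng):
    """Permute whole rows and whole columns at random; the Latin property is kept."""
    n = len(square)
    row_order = list(range(n))
    col_order = list(range(n))
    rng.shuffle(row_order)
    rng.shuffle(col_order)
    return [[square[i][j] for j in col_order] for i in row_order]


letters = "ABCDE"
cyclic = [[letters[(i + j) % 5] for j in range(5)] for i in range(5)]
randomized = randomize_rows_and_columns(cyclic, random.Random(2024))

# The square used in the example, one string per car (row).
example = ["ADBCE", "DACEB", "EBADC", "BCEAD", "CEDBA"]
print(is_latin_square(cyclic), is_latin_square(randomized), is_latin_square(example))
```

Row and column permutations only reorder the cells within each column and row, so each petrol still meets each car and each driver exactly once.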

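Based on this Latin square, the experiment is conducted and the following data on the number of kilometers run on one liter of petrol are obtained:

| Cars | $D_1$ | $D_2$ | $D_3$ | $D_4$ | $D_5$ |
|---|---|---|---|---|---|
| Car 1 | A 27 | D 19 | B 12 | C 23 | E 21 |
| Car 2 | D 19 | A 26 | C 13 | E 24 | B 17 |
| Car 3 | E 21 | B 24 | A 22 | D 17 | C 15 |
| Car 4 | B 26 | C 18 | E 30 | A 28 | D 20 |
| Car 5 | C 18 | E 22 | D 19 | B 21 | A 21 |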

The interpretation of these data is as follows. The first cell has the value 27. This means that car 1 was driven by driver $D_1$ with one liter of petrol of type A and it ran 27 km. Similarly, the second entry in the first row tells that when car 1 was driven by driver $D_2$ using one liter of petrol of type D, it ran 19 km.

Now we conduct the analysis of the data as follows. The corresponding model is

$$y_{ijk} = \mu + \alpha_i + \beta_j + \tau_k + \varepsilon_{ijk},$$

where
$\mu$ is the general mean effect,
$\alpha_i$ is the main effect of the $i$-th row, i.e., the main effect of the $i$-th car, $i = 1, 2, 3, 4, 5$,
$\beta_j$ is the main effect of the $j$-th column, i.e., the main effect of the $j$-th driver, $j = 1, 2, 3, 4, 5$, and
$\tau_k$ is the main effect of the $k$-th treatment, i.e., the main effect of the $k$-th variety of petrol, $k = 1, 2, 3, 4, 5$.

The null hypotheses are

$H_{0R}: \alpha_1 = \alpha_2 = \alpha_3 = \alpha_4 = \alpha_5$,
$H_{0C}: \beta_1 = \beta_2 = \beta_3 = \beta_4 = \beta_5$,
$H_{0T}: \tau_1 = \tau_2 = \tau_3 = \tau_4 = \tau_5$.

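The observations are tabulated as follows:

| Cars | $D_1$ | $D_2$ | $D_3$ | $D_4$ | $D_5$ | Row totals |
|---|---|---|---|---|---|---|
| Car 1 | 27 | 19 | 12 | 23 | 21 | $R_1 = 102$ |
| Car 2 | 19 | 26 | 13 | 24 | 17 | $R_2 = 99$ |
| Car 3 | 21 | 24 | 22 | 17 | 15 | $R_3 = 99$ |
| Car 4 | 26 | 18 | 30 | 28 | 20 | $R_4 = 122$ |
| Car 5 | 18 | 22 | 19 | 21 | 21 | $R_5 = 101$ |
| Column totals | $C_1 = 111$ | $C_2 = 109$ | $C_3 = 96$ | $C_4 = 113$ | $C_5 = 94$ | Grand total $G = 523$ |

The row totals and column totals are given in this table. The treatment totals are obtained as follows. The treatment total due to A is the sum of all the observations obtained by the use of petrol A. It is given as

$$T_A = T_1 = 27 + 26 + 22 + 28 + 21 = 124.$$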

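Similarly, the treatment totals due to the use of petrols B, C, D and E are

$$T_B = T_2 = 12 + 17 + 24 + 26 + 21 = 100$$
$$T_C = T_3 = 23 + 13 + 15 + 18 + 18 = 87$$
$$T_D = T_4 = 19 + 19 + 17 + 20 + 19 = 94$$
$$T_E = T_5 = 21 + 24 + 21 + 30 + 22 = 118$$

With $n = 25$,

$$\text{Correction factor } (CF) = \frac{G^2}{n} = \frac{523 \times 523}{25} = 10941.16.$$

$$SSR = \text{Sum of squares due to rows (cars)} = \sum_{i=1}^{5}\frac{R_i^2}{5} - \frac{G^2}{n} = 11018.2 - 10941.16 = 77.04$$

$$SSC = \text{Sum of squares due to columns (drivers)} = \sum_{j=1}^{5}\frac{C_j^2}{5} - \frac{G^2}{n} = 11004.6 - 10941.16 = 63.44$$

$$SSTr = \text{Sum of squares due to treatments (petrol)} = \sum_{k=1}^{5}\frac{T_k^2}{5} - \frac{G^2}{n} = 11141 - 10941.16 = 199.84$$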

$$TSS = \text{Total sum of squares} = \sum_{i=1}^{5}\sum_{j=1}^{5}\sum_{k=1}^{5} y_{ijk}^2 - \frac{G^2}{n} = 11425 - 10941.16 = 483.84$$

where the sum runs over the 25 observed $(i, j, k)$ combinations only.

Sum of squares due to error:

$$SSE = TSS - SSR - SSC - SSTr = 483.84 - 77.04 - 63.44 - 199.84 = 143.52$$

$$\text{Mean square due to rows} = MSR = \frac{SSR}{5-1} = 19.26$$

$$\text{Mean square due to columns} = MSC = \frac{SSC}{5-1} = 15.86$$

$$\text{Mean square due to treatments} = MSTr = \frac{SSTr}{5-1} = 49.96$$

$$\text{Mean square due to error} = MSE = \frac{SSE}{(5-1)(5-2)} = 11.96$$

F-statistic for rows: $F_r = \dfrac{MSR}{MSE} = 1.61$

F-statistic for columns: $F_c = \dfrac{MSC}{MSE} = 1.33$

F-statistic for treatments: $F_{tr} = \dfrac{MSTr}{MSE} = 4.18$

The tabulated value of F with 4 and 12 degrees of freedom at 5% level of significance is $F_{tab}(4, 12) = 3.26$. These values can be compiled in the following analysis of variance table:

| Source of variation | Degrees of freedom | Sum of squares | Mean squares | F-value |
|---|---|---|---|---|
| Rows | 4 | 77.04 | 19.26 | 1.61 |
| Columns | 4 | 63.44 | 15.86 | 1.33 |
| Treatments | 4 | 199.84 | 49.96 | 4.18 |
| Error | 12 | 143.52 | 11.96 | |
| Total | 24 | 483.84 | | |

On the basis of the given data and the values of the F-statistics, we conclude the following.

Since $F_r < F_{tab}(4, 12)$, $H_{0R}: \alpha_1 = \alpha_2 = \alpha_3 = \alpha_4 = \alpha_5$ is not rejected. This means that the data do not show any difference among the cars in their effect on the mileage.

Since $F_c < F_{tab}(4, 12)$, $H_{0C}: \beta_1 = \beta_2 = \beta_3 = \beta_4 = \beta_5$ is not rejected. This means that the data do not show any difference among the drivers in their effect on the mileage.

Since $F_{tr} > F_{tab}(4, 12)$, $H_{0T}: \tau_1 = \tau_2 = \tau_3 = \tau_4 = \tau_5$ is rejected. This means that the varieties of petrol do not all have the same effect on the mileage.
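The Latin square analysis can be recomputed from the layout and the observations. The sketch below assumes the data are stored as two parallel $5 \times 5$ arrays, `layout` (petrol letters) and `km` (kilometers), with one row per car and one column per driver; `latin_square_anova` is an illustrative name, not a library function.

```python
# One row per car, one column per driver; letters give the petrol used in each cell.
layout = ["ADBCE", "DACEB", "EBADC", "BCEAD", "CEDBA"]
km = [
    [27, 19, 12, 23, 21],
    [19, 26, 13, 24, 17],
    [21, 24, 22, 17, 15],
    [26, 18, 30, 28, 20],
    [18, 22, 19, 21, 21],
]


def latin_square_anova(layout, y):
    """ANOVA for a v x v Latin square design."""
    v = len(y)
    n = v * v
    grand_total = sum(map(sum, y))
    correction = grand_total ** 2 / n
    row_totals = [sum(row) for row in y]
    col_totals = [sum(y[i][j] for i in range(v)) for j in range(v)]
    trt_totals = {}
    for i in range(v):
        for j in range(v):
            letter = layout[i][j]
            trt_totals[letter] = trt_totals.get(letter, 0) + y[i][j]
    tss = sum(x * x for row in y for x in row) - correction
    ssr = sum(t * t for t in row_totals) / v - correction
    ssc = sum(t * t for t in col_totals) / v - correction
    sstr = sum(t * t for t in trt_totals.values()) / v - correction
    sse = tss - ssr - ssc - sstr
    mse = sse / ((v - 1) * (v - 2))
    df = v - 1
    return {
        "TSS": tss, "SSR": ssr, "SSC": ssc, "SSTr": sstr, "SSE": sse,
        "F_r": (ssr / df) / mse,
        "F_c": (ssc / df) / mse,
        "F_tr": (sstr / df) / mse,
        "treatment_totals": trt_totals,
    }


lsd = latin_square_anova(layout, km)
for name in ("TSS", "SSR", "SSC", "SSTr", "SSE", "F_r", "F_c", "F_tr"):
    print(name, round(lsd[name], 2))
```

Each of the three F-statistics is compared with the same tabulated value, since rows, columns and treatments all have 4 degrees of freedom and share the 12 error degrees of freedom.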

Example 4 (Factorial experiment)

Suppose the rotations per minute (rpm) of an electric motor depend on four factors: voltage (A), current (B), temperature (C) and length of blades (D). Each factor has two levels, a lower level 1 and a higher level 2, denoted as follows:

Voltage (in volts): level 1 ($a_0$), level 2 ($a_1$)
Current (in amperes): level 1 ($b_0$), level 2 ($b_1$)
Temperature (in degrees centigrade): level 1 ($c_0$), level 2 ($c_1$)
Length of blades (in cm): level 1 ($d_0$), level 2 ($d_1$)

The experiment is conducted as a $2^4$ factorial in the set up of an RBD, and the rpm (in hundreds) is observed with four replications for each of the 16 treatment combinations (1), a, b, ab, c, ac, bc, abc, d, ad, bd, abd, cd, acd, bcd, abcd.

[Table: observed rpm (in hundreds) for each treatment combination in Replications 1 to 4]

The total rpm is obtained for each treatment combination by summing all the observations corresponding to that treatment combination over the four replications. For example, the total rpm due to a, denoted (a), is the sum of the four observations on treatment combination a, and the total rpm due to b, denoted (b), is the sum of the four observations on treatment combination b.

The total rpm are compiled in the following table.

[Table: observed rpm (in hundreds) in Replications 1 to 4 together with the total rpm (1), (a), (b), (ab), (c), (ac), (bc), (abc), (d), (ad), (bd), (abd), (cd), (acd), (bcd), (abcd)]

Now the various sums of squares can be computed using the Yates procedure. The treatment totals are written in standard order in column (1). Each of the columns (2) to (5) is obtained from the preceding column by writing first the sums of successive pairs of entries and then the differences (second entry minus first entry) of the same pairs. After these four steps, column (5) contains the grand total [M] followed by the effect totals [A], [B], [AB], [C], [AC], [BC], [ABC], [D], [AD], [BD], [ABD], [CD], [ACD], [BCD] and [ABCD].

[Table: Yates procedure calculations, with the treatment totals in column (1) and the successive steps in columns (2) to (5)]
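The Yates procedure is easy to express in code. The sketch below is a generic implementation for $2^k$ treatment totals given in standard order; the function names `yates` and `effect_labels` and the small $2^2$ set of totals are hypothetical and serve only to illustrate the steps, they are not the data of this example.

```python
def yates(totals):
    """Apply the Yates procedure to 2^k treatment totals in standard order.

    Returns the final column: the grand total followed by the effect totals.
    """
    n = len(totals)
    k = n.bit_length() - 1
    if n == 0 or 2 ** k != n:
        raise ValueError("the number of treatment totals must be a power of 2")
    column = list(totals)
    for _ in range(k):
        sums = [column[i] + column[i + 1] for i in range(0, n, 2)]
        diffs = [column[i + 1] - column[i] for i in range(0, n, 2)]
        column = sums + diffs
    return column


def effect_labels(k):
    """Labels of the final Yates column: 'M', 'A', 'B', 'AB', 'C', ..."""
    letters = "ABCDEFGH"[:k]
    labels = []
    for index in range(2 ** k):
        name = "".join(letters[b] for b in range(k) if index >> b & 1)
        labels.append(name or "M")
    return labels


# Hypothetical 2^2 totals in standard order: (1), a, b, ab.
demo_totals = [10, 14, 12, 20]
demo_result = dict(zip(effect_labels(2), yates(demo_totals)))
print(demo_result)

# Sum of squares of an effect from its total, for r replications of a 2^k design.
r, k = 2, 2
ss_A = demo_result["A"] ** 2 / (r * 2 ** k)
print(ss_A)
```

For the four-factor experiment the same function would be called with the 16 treatment totals in the order (1), a, b, ab, c, ..., abcd, and `effect_labels(4)` gives the matching labels.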

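The sums of squares are obtained by using

$$SS(\text{Effect}) = \frac{[\text{Effect}]^2}{2^4\, r},$$

where $r = 4$ is the number of replications. For example,

$$SS(A) = \frac{[A]^2}{2^4\, r}, \qquad SS(B) = \frac{[B]^2}{2^4\, r},$$

and so on. These sums of squares are compiled in the following analysis of variance table.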

[Table: Analysis of variance table for factorial experiment]

The tabulated value of the F-statistic at 5% level of significance with 1 and 45 degrees of freedom is $F_{tab}(1, 45) = 4.06$.

The F values for the null hypotheses corresponding to A, B, D, BC, AC, AD, BD, CD, ABC, ABD and ABCD are greater than $F_{tab}(1, 45) = 4.06$, so these null hypotheses are rejected. So these effects are significant.

The F values for the null hypotheses corresponding to C, AB, ACD and BCD are smaller than $F_{tab}(1, 45) = 4.06$, so these null hypotheses are not rejected. Thus these effects are not significant.