The Distribution of F

The Distribution of F

It can be shown that
$$F = \frac{SS_{\text{Treat}}/(t-1)}{SS_E/(N-t)} \sim F_{t-1,N-t,\lambda},$$
a noncentral F-distribution with $t-1$ and $N-t$ degrees of freedom and noncentrality parameter
$$\lambda = \sum_{i=1}^{t} n_i(\mu_i-\bar\mu)^2/\sigma^2 = \sum_{i=1}^{t}\sum_{j=1}^{n_i}(\mu_i-\bar\mu)^2/\sigma^2.$$
Under $H_0\colon \mu_1 = \ldots = \mu_t$ this becomes the (central) $F_{t-1,N-t}$ distribution. We reject $H_0$ whenever
$$F \ge F_{t-1,N-t}(1-\alpha) = \text{Fcrit} = \text{qf(1-alpha, t-1, N-t)},$$
which denotes the $(1-\alpha)$-quantile of the $F_{t-1,N-t}$ distribution.

Power function: $\beta(\lambda) = P\big(F \ge F_{t-1,N-t}(1-\alpha)\big)$ = 1 - pf(Fcrit, t-1, N-t, lambda).
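
This can be computed directly in R with qf and pf. A minimal sketch; the means, sample sizes, and sigma below are illustrative values, not from the notes:

alpha <- 0.05
mu    <- c(10, 10, 12)   # hypothetical treatment means
n     <- c(5, 5, 5)      # sample sizes per treatment
sigma <- 2               # common standard deviation
t.tr  <- length(mu); N <- sum(n)
mu.bar <- sum(n * mu) / N                        # weighted grand mean
lambda <- sum(n * (mu - mu.bar)^2) / sigma^2     # noncentrality parameter
Fcrit  <- qf(1 - alpha, t.tr - 1, N - t.tr)      # rejection threshold
1 - pf(Fcrit, t.tr - 1, N - t.tr, ncp = lambda)  # power beta(lambda)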

Discussion of Noncentrality Parameter λ

The power of the ANOVA F-test is a monotone increasing function of
$$\lambda = \sum_{i=1}^{t} n_i(\mu_i-\bar\mu)^2/\sigma^2 = N\sum_{i=1}^{t}(n_i/N)(\mu_i-\bar\mu)^2/\sigma^2 = N\,\sigma_\mu^2/\sigma^2 = N\cdot\frac{\text{between treatment variation}}{\text{within treatment variation}}.$$
Thus we consider the drivers in λ:
- λ increases as σ decreases (provided the $\mu_i$ are not all the same).
- The more difference there is between the treatment means $\mu_i$, the higher λ.
- Increasing the sample sizes will magnify $n_i(\mu_i-\bar\mu)^2$ (provided $\bar\mu$ is stable).

The sample sizes we can plan for. Later: we can reduce σ by blocking units into more homogeneous groups.
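
A small numerical sketch of these drivers (reusing the illustrative values above): doubling every $n_i$ doubles λ, while halving σ quadruples it.

lam <- function(n, mu, sigma) {
  mu.bar <- sum(n * mu) / sum(n)
  sum(n * (mu - mu.bar)^2) / sigma^2
}
mu <- c(10, 10, 12); n <- c(5, 5, 5); sigma <- 2
lam(n, mu, sigma)       # baseline lambda
lam(2 * n, mu, sigma)   # doubled sample sizes: lambda doubles
lam(n, mu, sigma / 2)   # halved sigma: lambda quadruples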

Optimal Allocation of Sample Sizes?

We have N experimental units available for testing the effects of t treatments, and suppose that N is a multiple of t, say N = r·t (r and t integers). It would seem best to use samples of equal size r for each of the t treatments, i.e., we would opt for a balanced design. That way we would not emphasize one treatment over any of the others. Is there some other optimality criterion that could be used as justification?

We may plan for a balanced design upfront, but then something goes wrong with a few observations and they have to be discarded from the analysis. Be careful that the deletion of observations does not bias any conclusions.

A Sample Size Allocation Rationale

We may be concerned with alternatives where all means but one are the same. Since we won't know upfront which mean sticks out, we would want to maximize the minimum power against all such contingencies. Max-Min Strategy!

If $\mu_1 = \mu + \Delta$ and $\mu_2 = \ldots = \mu_t = \mu$, then $\bar\mu = \mu + \Delta\, n_1/N$ and (algebra)
$$\lambda_1 = \sum_{i=1}^{t} n_i(\mu_i-\bar\mu)^2/\sigma^2 = \frac{N\Delta^2}{\sigma^2}\,\frac{n_1}{N}\Big(1-\frac{n_1}{N}\Big),$$
and similarly
$$\lambda_i = \frac{N\Delta^2}{\sigma^2}\,\frac{n_i}{N}\Big(1-\frac{n_i}{N}\Big)$$
for the other cases. It is easy to see now that for fixed σ
$$\max_{n_1,\ldots,n_t}\ \min_{1\le i\le t}\ \Big[\frac{N\Delta^2}{\sigma^2}\,\frac{n_i}{N}\Big(1-\frac{n_i}{N}\Big)\Big]$$
is achieved when $n_1 = \ldots = n_t$. That is because $\frac{n_i}{N}\big(1-\frac{n_i}{N}\big)$ increases for $n_i/N \le 1/2$.
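
A brute-force check of the max-min claim, as a sketch (N = 12, t = 3, and Δ = σ = 1 are illustrative choices):

N <- 12; delta <- 1; sigma <- 1
alloc <- expand.grid(n1 = 1:10, n2 = 1:10)   # candidate allocations
alloc$n3 <- N - alloc$n1 - alloc$n2
alloc <- alloc[alloc$n3 >= 1, ]
## minimum noncentrality over the t "one mean sticks out" alternatives
minlam <- apply(alloc, 1, function(a)
  min((N * delta^2 / sigma^2) * (a / N) * (1 - a / N)))
alloc[which.max(minlam), ]   # -> the balanced allocation n1 = n2 = n3 = 4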

Using sample.sizeANOVA (see web page)

Suppose we have t = 3 treatments and want to determine the sample size n per treatment to achieve power β(λ) = .9 for level α = .05. It is desired to do this for a λ = λ_i corresponding to the alternatives on the previous slide with Δ/σ = .5, i.e., with N = t·n
$$\lambda_i = \frac{N\Delta^2}{\sigma^2}\,\frac{n}{N}\Big(1-\frac{n}{N}\Big) = \frac{n\Delta^2}{\sigma^2}\Big(1-\frac{1}{t}\Big) = \frac{n\Delta^2}{\sigma^2}\,\frac{t-1}{t} = n\,\lambda_0.$$
$\lambda_0 = (\Delta^2/\sigma^2)\,(t-1)/t$ can be interpreted more generally as $\sum_{i=1}^{t}(\mu_i-\bar\mu)^2/\sigma^2$.

> sample.sizeANOVA()
> sample.sizeANOVA(nrange=30:100)
> sample.sizeANOVA(nrange=70:100, power0=.9)

produced the next 3 slides => n = 77.

sample.sizeANOVA I

sample.sizeANOVA = function (delta.per.sigma=.5, t.treat=3,
                             nrange=2:30, alpha=.05, power0=NULL)
{
# delta.per.sigma is the ratio of delta over sigma
# for which one wants to detect a delta shift in one
# mean while all other means stay the same.
# t.treat is the number of treatments. alpha is the
# desired significance level. nrange is a range of
# sample sizes over which the power will be calculated
# for that delta.per.sigma. power0 is an optional value
# for the target power that will be highlighted on the plot.
#-------------------------------------------------------------------
lambda0 = ((t.treat-1)/t.treat)*delta.per.sigma^2
power = NULL

sample.sizeANOVA II

for(n in nrange){
  N = n*t.treat
  Fcrit = qf(1-alpha, t.treat-1, N-t.treat)
  power = c(power, 1 - pf(Fcrit, t.treat-1, N-t.treat, n*lambda0))}
plot(nrange, power, type="l",
     xlab=paste("sample size n per each of t =", t.treat, " treatments"),
     ylab="", ylim=c(0,1))
mtext(expression(beta(lambda) == beta(n %*% lambda[0])), 2, 2.7)
abline(h=seq(0,1,.02), col="grey")
abline(v=nrange, col="grey")
lines(nrange, power, col="red")
title(substitute(delta/sigma == delta.per.sigma * "," ~ lambda[0] * " = " *
        sum((mu[i]-bar(mu))^2/sigma^2) * " = " * lambda0 * ", " * alpha == alpha1,
      list(lambda0=format(signif(lambda0,4)), alpha1=alpha,
           delta.per.sigma=delta.per.sigma)))

sample.sizeANOVA III

if(!is.null(power0)){
  abline(h=power0, col="blue")
  par(las=2)
  mtext(power0, 4, 0.2, at=power0, col="blue")
  par(las=0)}}

Sample Size Determination

[Figure: power curve β(λ) = β(n·λ0) against the sample size n per each of t = 3 treatments, n = 2,...,30; title: Δ/σ = 0.5, λ0 = Σ(µi − µ̄)²/σ² = 0.1667, α = 0.05.]

Sample Size Determination (increased n)

[Figure: the same power curve β(λ) = β(n·λ0) for n = 30,...,100; title: Δ/σ = 0.5, λ0 = Σ(µi − µ̄)²/σ² = 0.1667, α = 0.05.]

Sample Size Determination (magnified)

[Figure: the power curve magnified over n = 70,...,100 with the target power 0.9 marked; the curve reaches 0.9 at n = 77. Title: Δ/σ = 0.5, λ0 = Σ(µi − µ̄)²/σ² = 0.1667, α = 0.05.]

Coagulation Example

In order to understand blood coagulation behavior in relation to various diets, lab animals were given 4 different diets, and their subsequent blood draws were measured for their respective coagulation times in seconds. The lab animals were assigned randomly to the various diets. The results were as follows:

> ctime
 [1] 59 60 62 63 63 64 65 66 67 71 66 67 68 68 68 71 56 59
[19] 60 61 62 63 63 64
> diet
 [1] "A" "A" "A" "A" "B" "B" "B" "B" "B" "B" "C" "C" "C"
[14] "C" "C" "C" "D" "D" "D" "D" "D" "D" "D" "D"
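
To follow along, the data can be entered directly (values copied from the output above):

ctime <- c(59, 60, 62, 63,                  # diet A (n = 4)
           63, 64, 65, 66, 67, 71,          # diet B (n = 6)
           66, 67, 68, 68, 68, 71,          # diet C (n = 6)
           56, 59, 60, 61, 62, 63, 63, 64)  # diet D (n = 8)
diet <- rep(c("A", "B", "C", "D"), times = c(4, 6, 6, 8))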

Plot for Coagulation Example

[Figure: stripplot of coagulation time (sec) against diet, with horizontal mean lines; sample sizes n_A = 4, n_B = 6, n_C = 6, n_D = 8.]

ANOVA for Coagulation Example

Note that in the previous plot we used jitter(ctime) to plot ctime in the vertical direction and to plot its horizontal mean lines. This perturbs tied observations by a small random amount to make them more visible. For example, the mean lines for diets A and D would have been the same otherwise.

> anova(lm(ctime ~ as.factor(diet)))
Analysis of Variance Table

Response: ctime
                Df Sum Sq Mean Sq F value    Pr(>F)
as.factor(diet)  3  228.0    76.0  13.571 4.658e-05 ***
Residuals       20  112.0     5.6
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

lm for Coagulation Example

> out = lm(ctime ~ as.factor(diet))
> names(out)
 [1] "coefficients"  "residuals"     "effects"
 [4] "rank"          "fitted.values" "assign"
 [7] "qr"            "df.residual"   "contrasts"
[10] "xlevels"       "call"          "terms"
[13] "model"
> out$coefficients
      (Intercept) as.factor(diet)B as.factor(diet)C
     6.100000e+01     5.000000e+00     7.000000e+00
 as.factor(diet)D
    -1.095919e-14

Note that these are the estimates $\hat\mu_A$, $\hat\mu_B-\hat\mu_A$, $\hat\mu_C-\hat\mu_A$, $\hat\mu_D-\hat\mu_A$.

Residuals from lm for Coagulation Example

> out$residuals
            1             2             3             4
-2.000000e+00 -1.000000e+00  1.000000e+00  2.000000e+00
            5             6             7             8
-3.000000e+00 -2.000000e+00 -1.000000e+00  1.111849e-16
            9            10            11            12
 1.000000e+00  5.000000e+00 -2.000000e+00 -1.000000e+00
           13            14            15            16
-5.534852e-17 -5.534852e-17 -5.534852e-17  3.000000e+00
           17            18            19            20
-5.000000e+00 -2.000000e+00 -1.000000e+00 -1.663708e-16
           21            22            23            24
 1.000000e+00  2.000000e+00  2.000000e+00  3.000000e+00

Numbers such as -5.534852e-17 should be treated as 0 (computing quirks).

Fitted Values from lm for Coagulation Example

> out$fitted.values
 1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19
61 61 61 61 66 66 66 66 66 66 68 68 68 68 68 68 61 61 61
20 21 22 23 24
61 61 61 61 61

Comparing Treatment Means $\bar Y_{i\cdot}$

When the hypothesis $H_0\colon \mu_1 = \ldots = \mu_t$ is not rejected at level α, there is little purpose in looking closer at differences between the sample means $\bar Y_{i\cdot}$ for the various treatments. Any such perceived differences could easily have come about by simple random variation, even when the hypothesis is true. Why then read something into randomness? It is like reading tea leaves!

However, when the hypothesis is rejected, it is quite natural to ask in which way it was contradicted. The best indicators for any analysis as to how the means $\mu_i$ may be different are the sample or treatment means $\hat\mu_i = \bar Y_{i\cdot}$, $i = 1,\ldots,t$.

Confidence Intervals for $\mu_i$

A first step in understanding differences in the $\mu_i$ is to look at their estimates $\hat\mu_i = \bar Y_{i\cdot}$. We should do this in the context of the sampling variability of $\hat\mu_i$. In the past we addressed this via confidence intervals for $\mu_i$ based on $\hat\mu_i$. In any such confidence interval we can now use the pooled variance $s^2$ from all t samples, and not just the variance $s_i^2$ from the i-th sample, i.e., we get
$$\hat\mu_i \pm t_{N-t,\,1-\alpha/2}\,\frac{s}{\sqrt{n_i}}$$
as our 100(1−α)% confidence interval for $\mu_i$. This follows as before (exercise) from the independence of $\hat\mu_i$ and s, the fact that $(\hat\mu_i-\mu_i)/(\sigma/\sqrt{n_i}) \sim \mathcal N(0,1)$, and from $s^2/\sigma^2 \sim \chi^2_{N-t}/(N-t)$.

The validity of this improvement ($N-t$ degrees of freedom instead of $n_i-1$ when using $s^2$ in place of $s_i^2$) depends strongly on the assumption that the population variances $\sigma^2$ behind all t samples are the same, or at least approximately so.
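
A sketch of these pooled-s intervals for the coagulation data (assuming the ctime and diet vectors entered earlier):

alpha  <- 0.05
s      <- summary(lm(ctime ~ as.factor(diet)))$sigma  # pooled s = sqrt(MSE)
N      <- length(ctime); t.tr <- length(unique(diet))
mu.hat <- tapply(ctime, diet, mean)
n.i    <- tapply(ctime, diet, length)
tcrit  <- qt(1 - alpha / 2, N - t.tr)
cbind(lower = mu.hat - tcrit * s / sqrt(n.i),
      upper = mu.hat + tcrit * s / sqrt(n.i))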

Plots of Confidence Intervals for Coagulation Data

[Figure: confidence intervals for each diet mean (coagulation time, sec), for diets A-D, shown once using the pooled $s^2 = \sum_{i=1}^{t} s_i^2(n_i-1)/(N-t)$ and once using the individual $s_i^2$.]

Simultaneous Confidence Intervals

When constructing intervals of the type
$$\hat\mu_i \pm t_{1-\alpha/2}\,\frac{s}{\sqrt{n_i}} \quad\text{or}\quad \hat\mu_i \pm t_{1-\alpha/2}\,\frac{s_i}{\sqrt{n_i}}$$
for $i = 1,\ldots,t$, we should be aware that these intervals don't simultaneously cover their respective targets $\mu_i$ with probability 1−α. They do so individually. For example,
$$P\Big(\mu_i \in \hat\mu_i \pm t_{1-\alpha/2}\,\frac{s_i}{\sqrt{n_i}},\ i=1,\ldots,t\Big) = \prod_{i=1}^{t} P\Big(\mu_i \in \hat\mu_i \pm t_{1-\alpha/2}\,\frac{s_i}{\sqrt{n_i}}\Big) = (1-\alpha)^t < 1-\alpha.$$
Thus we should choose $\alpha^\star$ for the individual intervals to get $(1-\alpha^\star)^t = 1-\alpha$, i.e.,
$$\alpha^\star_t = 1-(1-\alpha)^{1/t} \approx \tilde\alpha_t = \alpha/t.$$
Some problem remains when using the common pooled estimate s. No independence!

$\alpha^\star_t = 1-(1-\alpha)^{1/t}$ or $\tilde\alpha_t = \alpha/t$

[Figure: the per-interval levels $\alpha^\star_t = 1-(1-\alpha)^{1/t}$ and $\tilde\alpha_t = \alpha/t$ plotted against the desired overall α (0 to 0.10), for t = 3, 5, 6, 10; the two adjustments lie very close together.]
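
The two adjustments can be compared directly; a small sketch of the quantities plotted above:

alpha <- 0.05
for (t in c(3, 5, 6, 10)) {
  exact <- 1 - (1 - alpha)^(1 / t)  # alpha*_t
  bonf  <- alpha / t                # alpha~_t
  cat(sprintf("t = %2d: alpha* = %.5f, alpha/t = %.5f\n", t, exact, bonf))
}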

Dealing with Dependence from Using Pooled s

When we use a common pooled estimate s for the standard deviation σ, the previous confidence intervals are no longer independent. However, it can be shown that
$$P\Big(\mu_i \in \hat\mu_i \pm t_{1-\alpha^\star/2}\,\frac{s}{\sqrt{n_i}},\ i=1,\ldots,t\Big) \ \ge\ \prod_{i=1}^{t} P\Big(\mu_i \in \hat\mu_i \pm t_{1-\alpha^\star/2}\,\frac{s}{\sqrt{n_i}}\Big) = (1-\alpha^\star)^t = 1-\alpha.$$
This comes from the positive dependence between the confidence intervals through s: if one interval is more (less) likely to cover its target $\mu_i$ due to s, so are the other intervals more (less) likely to cover their targets $\mu_j$. Using the same compensation as in the independence case thus lets us err on the conservative side, i.e., it gives us higher confidence than the targeted 1−α.

Boole's and Bonferroni's Inequality

For any m events $E_1,\ldots,E_m$, Boole's inequality states
$$P(E_1 \cup \ldots \cup E_m) \le P(E_1) + \ldots + P(E_m).$$
For any m events $E_1,\ldots,E_m$, Bonferroni's inequality states
$$P(E_1 \cap \ldots \cap E_m) \ge 1 - \sum_{i=1}^{m}\big(1-P(E_i)\big).$$
The statements are equivalent, since $P(E_1 \cap \ldots \cap E_m) = 1 - P(E_1^c \cup \ldots \cup E_m^c)$.

If $E_i$ denotes the i-th coverage event $\{\mu_i \in \hat\mu_i \pm t_{1-\alpha^\star/2}\, s/\sqrt{n_i}\}$ with $P(E_i) = 1-\alpha^\star$, then the simultaneous coverage probability is bounded from below as follows:
$$P\Big(\bigcap_{i=1}^{t} E_i\Big) \ge 1 - \sum_{i=1}^{t}\big(1-P(E_i)\big) = 1 - t\,\alpha^\star = 1-\alpha \quad\text{if}\quad \alpha^\star = \tilde\alpha_t = \alpha/t,$$
i.e., we can achieve at least 1−α coverage probability by choosing the individual coverage appropriately, namely $1-\alpha^\star = 1-\alpha/t$. Almost the same adjustment.

Contrasts

Any linear combination $C = \sum_{i=1}^{t} c_i\mu_i$ with $\sum_{i=1}^{t} c_i = 0$ is called a contrast. Note that $\sum_{i=1}^{t} c_i\mu = 0$, i.e., contrasts are zero under the hypothesis.

Suppose we have 4 treatments with respective means $\mu_1,\ldots,\mu_4$. We may be interested in contrasts of the following form: $C_{12} = \mu_1-\mu_2$ with $c = (c_1,\ldots,c_4) = (1,-1,0,0)$, and similarly for the other differences $C_{ij} = \mu_i-\mu_j$. There are $\binom{4}{2} = 6$ such contrasts.

Sometimes one of the treatments, say the first, is singled out as the control. We may then be interested in just the 3 contrasts $C_{12}$, $C_{13}$, and $C_{14}$, or we may be interested in $C_{1.234} = \mu_1 - (\mu_2+\mu_3+\mu_4)/3$ with $c = (1,-1/3,-1/3,-1/3)$.

Sometimes the first 2 treatments share something in common and so do the last 2. One might then try $C_{12.34} = (\mu_1+\mu_2)/2 - (\mu_3+\mu_4)/2$ with $c = (1/2,\,1/2,\,-1/2,\,-1/2)$.

Estimates and Confidence Intervals for Contrasts

A natural estimate of $C = \sum_{i=1}^{t} c_i\mu_i$ is $\hat C = \sum_{i=1}^{t} c_i\hat\mu_i = \sum_{i=1}^{t} c_i\bar Y_{i\cdot}$. We have
$$E(\hat C) = E\Big(\sum_{i=1}^{t} c_i\bar Y_{i\cdot}\Big) = \sum_{i=1}^{t} c_i E(\bar Y_{i\cdot}) = \sum_{i=1}^{t} c_i\mu_i = C$$
and
$$\mathrm{var}(\hat C) = \mathrm{var}\Big(\sum_{i=1}^{t} c_i\bar Y_{i\cdot}\Big) = \sum_{i=1}^{t} c_i^2\,\mathrm{var}(\bar Y_{i\cdot}) = \sum_{i=1}^{t} c_i^2\,\sigma^2/n_i.$$
Under the normality assumption for the $Y_{ij}$ we have
$$\frac{\hat C - C}{s\sqrt{\sum_{i=1}^{t} c_i^2/n_i}} \sim t_{N-t}, \quad\text{where } s^2 = \sum_{i=1}^{t}(n_i-1)\,s_i^2/(N-t) = MS_E.$$
$$\Rightarrow\quad \hat C \pm t_{N-t,\,1-\alpha/2}\; s\sqrt{\sum_{i=1}^{t} c_i^2/n_i}$$
is a 100(1−α)% confidence interval for C.

Testing $H_0\colon C = 0$

Based on the duality of testing and confidence intervals, we can test the hypothesis $H_0\colon C = 0$ by rejecting it whenever the previous confidence interval does not contain 0. Similarly, we test $H_0\colon C = C_0$ by rejecting whenever that interval does not contain $C_0$.

Another notation for this interval is $\hat C \pm t_{N-t,\,1-\alpha/2}\,SE(\hat C)$, where
$$SE(\hat C) = s\sqrt{\sum_{i=1}^{t} c_i^2/n_i}$$
is the standard error of $\hat C$, i.e., the estimate of the standard deviation of $\hat C$.
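
A sketch implementing these formulas; the function name contrast.ci and its interface are ours, not from the notes. The contrast vector cvec must follow the sorted group order (here A, B, C, D):

contrast.ci <- function(y, group, cvec, alpha = 0.05) {
  mu.hat <- tapply(y, group, mean)
  n.i    <- tapply(y, group, length)
  N      <- length(y); t.tr <- length(n.i)
  s      <- sqrt(sum((y - mu.hat[group])^2) / (N - t.tr))  # sqrt(MSE)
  Chat   <- sum(cvec * mu.hat)
  SE     <- s * sqrt(sum(cvec^2 / n.i))
  tcrit  <- qt(1 - alpha / 2, N - t.tr)
  c(estimate = Chat, lower = Chat - tcrit * SE, upper = Chat + tcrit * SE,
    t.stat = Chat / SE, p.value = 2 * pt(-abs(Chat / SE), N - t.tr))
}
## e.g. C = mu_A - (mu_B + mu_C + mu_D)/3 for the coagulation data:
contrast.ci(ctime, diet, c(1, -1/3, -1/3, -1/3))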

Paired Comparisons: Fisher's Protected LSD Method

After rejecting $H_0\colon \mu_1 = \ldots = \mu_t$ one is often interested in looking at all $\binom{t}{2}$ pairwise contrasts $C_{ij} = \mu_i-\mu_j$. The following procedure is referred to as Fisher's Protected Least Significant Difference (LSD) Method. It consists of possibly two stages:

1) Perform the α-level F-test for testing $H_0$. If $H_0$ is not rejected, stop.

2) If $H_0$ is rejected, form all $\binom{t}{2}$ (1−α)-level confidence intervals for $C_{ij} = \mu_i-\mu_j$:
$$\hat I_{ij} = \hat\mu_i - \hat\mu_j \pm t_{N-t,\,1-\alpha/2}\; s\sqrt{\frac{1}{n_i}+\frac{1}{n_j}}$$
and declare $\mu_i-\mu_j \ne 0$ for all pairs for which $\hat I_{ij}$ does not contain zero.
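
Base R covers stage 2 directly: pairwise.t.test with the (default) pooled SD and no p-value adjustment gives the unadjusted pairwise t-tests underlying these intervals; the stage-1 F-test protection is up to the analyst.

anova(lm(ctime ~ as.factor(diet)))                       # stage 1: overall F-test
pairwise.t.test(ctime, diet, p.adjust.method = "none")   # stage 2 (only if rejected)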

Comments on Fisher's Protected LSD Method

If $H_0$ is true, the chance of making any statements contradicting $H_0$ is at most α. This is the protected aspect of this procedure.

However, when $H_0$ is not true, there are many possible contingencies, some of which can give us a higher than desired chance of pronouncing a significant difference when in fact there is none. E.g., if all but one mean (say $\mu_1$) are equal and $\mu_1$ is far away from $\mu_2 = \ldots = \mu_t$, our chance of rejecting $H_0$ is almost 1. However, among the intervals for $\mu_i-\mu_j$, $2 \le i < j$, we may find a significantly higher than α proportion of cases with wrongly declared differences. This is due to the multiple comparison issue.

Pairwise Comparisons: Tukey-Kramer Method

The Tukey-Kramer method is based on the distribution of
$$Q_{t,f} = \max_{1\le i<j\le t}\frac{|Z_i-Z_j|}{s}, \quad\text{where } Z_1,\ldots,Z_t \overset{\text{i.i.d.}}{\sim} \mathcal N(0,1) \text{ and } f\,s^2 \sim \chi^2_f.$$
Its cdf and quantile function are given in R as ptukey(q, nmeans, df) and qtukey(p, nmeans, df), where nmeans = t is the number of means and df = f = N−t denotes the degrees of freedom in s.

Applying this to $Z_i = (\hat\mu_i-\mu_i)/(\sigma/\sqrt n)$ and assuming $n_1 = \ldots = n_t = n$, we get
$$\max_{i<j}\frac{\sqrt n\,\big|\hat\mu_i-\hat\mu_j-(\mu_i-\mu_j)\big|}{s} = \max_{i<j}\frac{\Big|\dfrac{\hat\mu_i-\mu_i}{\sigma/\sqrt n}-\dfrac{\hat\mu_j-\mu_j}{\sigma/\sqrt n}\Big|}{s/\sigma} \sim Q_{t,f}$$
$$\Rightarrow\quad P\big(\mu_i-\mu_j \in \hat\mu_i-\hat\mu_j \pm q_{t,f,1-\alpha}\,s/\sqrt n\ \ \forall\, i<j\big) = 1-\alpha,$$
i.e., simultaneous (1−α)-coverage confidence intervals. Here $P(Q_{t,f} \le q_{t,f,1-\alpha}) = 1-\alpha$, i.e., $q_{t,f,1-\alpha}$ = qtukey(1-alpha, t, f).

Tukey-Kramer Method: Unequal Sample Sizes

The simultaneous intervals for all pairwise mean differences were due to Tukey, but hampered by the requirement of equal sample sizes. This was addressed by Kramer in the following way: in the above confidence intervals, replace $1/\sqrt n = \sqrt{1/n}$ by $\sqrt{1/n^\star_{ij}}$, where $n^\star_{ij}$ is the harmonic mean of $n_i$ and $n_j$, i.e., $1/n^\star_{ij} = (1/n_i + 1/n_j)/2$. A different adjustment for each pair (i, j)!

It was possible to show that
$$P\Big(\mu_i-\mu_j \in \hat\mu_i-\hat\mu_j \pm q_{t,f,1-\alpha}\,s/\sqrt{n^\star_{ij}}\ \ \forall\, i<j\Big) \ \ge\ 1-\alpha,$$
i.e., simultaneous confidence intervals with coverage at least 1−α.

Tukey-Kramer Method for Coagulation Data

coag.tukey = function (alpha=.05)
{
  diets = unique(diet)
  mu.vec = NULL
  nvec = NULL
  mean.vec = NULL
  # assumes the observations are grouped by diet, as in these data
  for(i in 1:length(diets)){
    mu.vec = c(mu.vec, mean(ctime[diet==diets[i]]))
    nvec = c(nvec, length(ctime[diet==diets[i]]))
    mean.vec = c(mean.vec, rep(mu.vec[i], nvec[i]))
  }
  tr = length(nvec)
  N = sum(nvec)
  MSE = sum((ctime-mean.vec)^2/(N-tr))

Tukey-Kramer Method for Coagulation Data

  s = sqrt(MSE)
  intervals = NULL
  for(i in 1:3){
    for(j in (i+1):4){
      nijstar = 1/(.5*(1/nvec[i]+1/nvec[j]))   # harmonic mean of n_i and n_j
      qtk = qtukey(1-alpha, tr, N-tr)
      Diff = mu.vec[i]-mu.vec[j]
      lower = Diff - qtk*s/sqrt(nijstar)
      upper = Diff + qtk*s/sqrt(nijstar)
      intervals = rbind(intervals, c(lower, upper))
    }
  }
  intervals
}

Tukey-Kramer Results for Coagulation Data

> coag.tukey()
           [,1]       [,2]
[1,]  -9.275446 -0.7245544
[2,] -11.275446 -2.7245544
[3,]  -4.056044  4.0560438
[4,]  -5.824075  1.8240748
[5,]   1.422906  8.5770944
[6,]   3.422906 10.5770944

The rows correspond to µ1−µ2, µ1−µ3, µ1−µ4, µ2−µ3, µ2−µ4, µ3−µ4 (the order of the loops). Declare significant differences in µ1−µ2, µ1−µ3, µ2−µ4, and µ3−µ4.
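
These can be cross-checked against base R's TukeyHSD, which implements the same Tukey-Kramer intervals (note it reports differences in the opposite direction, e.g. B-A instead of µ1 − µ2):

TukeyHSD(aov(ctime ~ as.factor(diet)), conf.level = 0.95)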

Scheffé's Confidence Intervals for All Contrasts

Scheffé took the F-test for testing $H_0\colon \mu_1 = \ldots = \mu_t$ and converted it into a simultaneous coverage statement about confidence intervals for all contrasts $c = (c_1,\ldots,c_t)$:
$$P\Big(\sum_{i=1}^{t} c_i\mu_i \in \sum_{i=1}^{t} c_i\hat\mu_i \pm \big[(t-1)\,F_{t-1,N-t,1-\alpha}\big]^{1/2}\, s\,\Big(\sum_{i=1}^{t} c_i^2/n_i\Big)^{1/2}\ \text{ for all contrasts } c\Big) = 1-\alpha.$$
This is a coverage statement about an infinite number of contrasts, but it can be applied conservatively to just all pairwise contrasts. The resulting intervals tend to be quite conservative. But the method compares well with Bonferroni-type intervals if applied to many contrasts.
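
A sketch of the Scheffé interval for an arbitrary contrast vector (the name scheffe.ci is ours); for c = (1, -1, 0, 0) it should reproduce the Scheffé column of the comparison table on the next slide:

scheffe.ci <- function(y, group, cvec, alpha = 0.05) {
  mu.hat <- tapply(y, group, mean)
  n.i    <- tapply(y, group, length)
  N <- length(y); t.tr <- length(n.i)
  s <- sqrt(sum((y - mu.hat[group])^2) / (N - t.tr))        # sqrt(MSE)
  Chat <- sum(cvec * mu.hat)
  crit <- sqrt((t.tr - 1) * qf(1 - alpha, t.tr - 1, N - t.tr))
  half <- crit * s * sqrt(sum(cvec^2 / n.i))
  c(lower = Chat - half, upper = Chat + half)
}
scheffe.ci(ctime, diet, c(1, -1, 0, 0))   # interval for mu_1 - mu_2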

Pairwise Comparison Intervals for Coagulation Data

(simultaneous) 95%-intervals:

mean          Tukey-Kramer     Fisher's          Bonferroni        Scheffé's all
difference                     protected LSD     inequality        contrasts method
µ1 − µ2       -9.28   -0.72    -8.19   -1.81     -9.47   -0.53     -9.66   -0.34
µ1 − µ3      -11.28   -2.72   -10.19   -3.81    -11.47   -2.53    -11.66   -2.34
µ1 − µ4       -4.06    4.06    -3.02    3.02     -4.24    4.24     -4.42    4.42
µ2 − µ3       -5.82    1.82    -4.85    0.85     -6.00    2.00     -6.17    2.17
µ2 − µ4        1.42    8.58     2.33    7.67      1.26    8.74      1.10    8.90
µ3 − µ4        3.42   10.58     4.33    9.67      3.26   10.74      3.10   10.90

Declare significant differences in µ1−µ2, µ1−µ3, µ2−µ4, and µ3−µ4, using any of the four methods.

Simultaneous Paired Comparisons (95%)

[Figure: pairwise comparisons of means (coagulation data), 1−α = 0.95; the Tukey-Kramer, Fisher's protected LSD, Bonferroni, and Scheffé intervals plotted side by side for µ1−µ2, µ1−µ3, µ1−µ4, µ2−µ3, µ2−µ4, µ3−µ4.]

Simultaneous Paired Comparisons (99%)

[Figure: the same four sets of pairwise comparison intervals for the coagulation data at 1−α = 0.99.]

Orthogonal Contrasts

All $\binom{t}{2}$ pairwise comparisons for $\mu_i-\mu_j$ could be very many, and simultaneous intervals would become quite conservative. Since all these contrasts span a (t−1)-dimensional space, one should be able to capture all differences with just t−1 orthogonal contrasts:
$$C_1 = \sum_{i=1}^{t} c_{1i}\mu_i, \qquad C_2 = \sum_{i=1}^{t} c_{2i}\mu_i,$$
$$C_1 \perp C_2 \iff \sum_{i=1}^{t} c_{1i}c_{2i}/n_i = 0 \quad\text{(wrong def. in Montgomery p. 91)}$$
$$\iff \mathrm{cov}(\hat C_1,\hat C_2) = \sum_{i=1}^{t}\sum_{j=1}^{t} c_{1i}c_{2j}\,\mathrm{cov}(\hat\mu_i,\hat\mu_j) = \sum_{i=1}^{t} c_{1i}c_{2i}\,\sigma^2/n_i = 0,$$
i.e., $\hat C_1$ and $\hat C_2$ are independent, and simultaneous statements for $C_1, C_2, \ldots$ are easier to handle, just as before when making simultaneous intervals for $\mu_1,\ldots,\mu_t$ based on independent $\hat\mu_1,\ldots,\hat\mu_t$. The trick is to have meaningful or interpretable orthogonal contrasts.

Simultaneous Intervals for Orthogonal Contrasts

If $\hat C_1,\ldots,\hat C_k$ are any k orthogonal contrasts, they are independent. Just as we constructed simultaneous intervals for means based on independent mean estimates and the pooled standard deviation s, we can again construct contrast confidence intervals with simultaneous coverage probability 1−α by taking $1-\alpha^\star$ with
$$\alpha^\star = 1-(1-\alpha)^{1/k} \approx \alpha/k$$
as the coverage probability for the individual intervals.

An Orthogonal Contrast Example

Suppose we have t = 3 treatments of which the third is a control, i.e., we are familiar with its performance. Assume further that we have a balanced design, i.e., $n_1 = n_2 = n_3$. We could try the following t−1 = 2 orthogonal contrasts:
$$c_1 = (.5,\ .5,\ -1) \quad\text{and}\quad c_2 = (1,\ -1,\ 0).$$
Note that $C_1 = (\mu_1+\mu_2)/2 - \mu_3$ and $C_2 = \mu_1-\mu_2$, of which the first assesses how much the average mean of the two new treatments differs from the control mean and the second assesses the difference between the two new treatments. These are seemingly orthogonal issues.

Unbalanced Case of Previous Example

We have an unbalanced design, i.e., $n_1, n_2, n_3$ may be different. Then the following t−1 = 2 vectors
$$c_1 = \big(n_1/(n_1+n_2),\ n_2/(n_1+n_2),\ -1\big) \quad\text{and}\quad c_2 = (1,\ -1,\ 0)$$
are indeed contrast vectors:
$$n_1/(n_1+n_2) + n_2/(n_1+n_2) - 1 = 0 \quad\text{and}\quad 1 - 1 + 0 = 0,$$
and they are orthogonal:
$$\frac{n_1}{(n_1+n_2)\,n_1} - \frac{n_2}{(n_1+n_2)\,n_2} - \frac{1\cdot 0}{n_3} = 0.$$
$\Rightarrow C_1 = (n_1\mu_1+n_2\mu_2)/(n_1+n_2) - \mu_3 = \bar\mu_{12} - \mu_3$ and $C_2 = \mu_1-\mu_2$, of which the first assesses how much the weighted average mean of the two new treatments differs from the control mean and the second assesses the difference between the two new treatments. These are seemingly orthogonal issues.

Service Center Data

# of persons    # of calls
on call         processed per hour
2               1.7  2.7  2.5  1.9
3               4.5  3.5  4.7  5.4
4               4.7  4.8  5.6  5.1
5               6.3  5.2  6.6  4.9
7               6.3  5.7  6.1  6.1

[Figure: the number of calls processed per hour plotted against the number of people on call (2 to 7).]
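
Entered in R (values from the table):

persons <- rep(c(2, 3, 4, 5, 7), each = 4)   # persons on call per observation
calls   <- c(1.7, 2.7, 2.5, 1.9,
             4.5, 3.5, 4.7, 5.4,
             4.7, 4.8, 5.6, 5.1,
             6.3, 5.2, 6.6, 4.9,
             6.3, 5.7, 6.1, 6.1)
tapply(calls, persons, mean)                 # the five treatment means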

Service Center Data

Here we have a new type of treatment (number of persons on call), where the different treatment levels are scalar and not just qualitative. In such situations the following orthogonal contrasts are of practical interest:

                              c_i1   c_i2   c_i3   c_i4   c_i5
C1 = Σ_{j=1}^5 c_1j µ_j        -2     -1      0      1      2
C2 = Σ_{j=1}^5 c_2j µ_j         2     -1     -2     -1      2
C3 = Σ_{j=1}^5 c_3j µ_j        -1      2      0     -2      1
C4 = Σ_{j=1}^5 c_4j µ_j         1     -4      6     -4      1

For what kind of mean patterns in $\mu_1,\ldots,\mu_5$ would $C_i$, and consequently $\hat C_i$, be large?

Orthogonal Contrast Plots

[Figure: the four coefficient patterns $c_{i,j}$ plotted against j, one curve each for $C_1, C_2, C_3, C_4$; title: $C_i = \sum_{j=1}^{5} c_{i,j}\,\mu_j$ using $\mu_j = j$.]

Interpretation of Orthogonal Contrast Plots

The previous plot suggests that a pattern in the means $\mu_j$ in relation to $j = 1,\ldots,5$ that correlates most strongly with the corresponding pattern in the plot should yield a high value for the corresponding absolute contrast $|C_i|$. Thus a large value $|C_1|$ indicates a strong linear component in the mean pattern, a large value $|C_2|$ a strong quadratic component, a large value $|C_3|$ a strong cubic component, and a large value $|C_4|$ a strong quartic component. Typically, one hopes to rule out some (if not all) of the latter possibilities.

Simultaneous Contrast Intervals (for Service Center Data)

        95%                99%
C1   [ 6.27, 11.58]    [  5.53, 12.32]
C2   [-7.02, -0.73]    [ -7.89,  0.14]
C3   [-1.26,  4.06]    [ -1.99,  4.79]
C4   [-9.58,  4.48]    [-11.53,  6.43]

From these intervals one sees that $C_1$ and $C_2$ are significantly different from zero with 95% confidence, but $C_2$ not quite with 99% confidence. Hence there appears to be a sufficiently strong linear and a mildly quadratic component. The original data plot suggested this, and its strength is now assessed statistically.
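
A sketch that reproduces such intervals, assuming the per-interval level α* = 1 − (1 − α)^(1/4) from the orthogonal-contrast slide and the persons/calls vectors entered earlier:

cmat <- rbind(C1 = c(-2, -1,  0,  1, 2),   # linear
              C2 = c( 2, -1, -2, -1, 2),   # quadratic
              C3 = c(-1,  2,  0, -2, 1),   # cubic
              C4 = c( 1, -4,  6, -4, 1))   # quartic
mu.hat <- tapply(calls, persons, mean)
n.i    <- tapply(calls, persons, length)
N <- length(calls); t.tr <- 5; k <- nrow(cmat)
s <- sqrt(sum((calls - mu.hat[as.character(persons)])^2) / (N - t.tr))
simult <- function(alpha) {
  a.star <- 1 - (1 - alpha)^(1 / k)        # adjusted per-interval level
  tcrit  <- qt(1 - a.star / 2, N - t.tr)
  t(apply(cmat, 1, function(cv)
    sum(cv * mu.hat) + c(-1, 1) * tcrit * s * sqrt(sum(cv^2 / n.i))))
}
simult(0.05)   # 95% simultaneous intervals
simult(0.01)   # 99% simultaneous intervals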

Orthogonal Polynomial Contrast Vectors

The previous orthogonal contrasts for linear, quadratic, cubic, and quartic behavior were tailored to five treatments. How do we get similar contrast vectors when we have t treatments? R has a function contr.poly(t) that gives you orthogonal vectors representing the various polynomial components: linear, quadratic, ...

> round(contr.poly(7), 3)
         .L     .Q     .C     ^4     ^5     ^6
[1,] -0.567  0.546 -0.408  0.242 -0.109  0.033
[2,] -0.378  0.000  0.408 -0.564  0.436 -0.197
[3,] -0.189 -0.327  0.408  0.081 -0.546  0.493
[4,]  0.000 -0.436  0.000  0.483  0.000 -0.658
[5,]  0.189 -0.327 -0.408  0.081  0.546  0.493
[6,]  0.378  0.000 -0.408 -0.564 -0.436 -0.197
[7,]  0.567  0.546  0.408  0.242  0.109  0.033
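
As a usage sketch, these contrasts can be assigned to a factor so that the coefficient t-tests of a fit correspond to the linear, quadratic, ... components (reusing the service center vectors; note contr.poly treats the levels as equally spaced, while contr.poly(5, scores = c(2, 3, 4, 5, 7)) would respect the actual spacing):

persons.f <- factor(persons)
contrasts(persons.f) <- contr.poly(5)
summary(lm(calls ~ persons.f))   # coefficient rows .L, .Q, .C, ^4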

Orthogonal Polynomial Contrasts from contr.poly(7)

[Figure: the orthogonal polynomial contrast vectors (columns of contr.poly(7)) plotted against the level index i = 1,...,7.]