Solutions to Final STAT 421, Fall 2008


Fritz Scholz

1. (8) Two treatments A and B were randomly assigned to 8 subjects (4 subjects to each treatment) with the following responses: 0, 1, 3, 6 and 5, 7, 9, 10 for treatments A and B, respectively. Based on the randomization reference distribution find the p-value (corresponding to the above sets of observations) for the two-sided test of the hypothesis H0 of no treatment effect when rejecting for large values of X̄_B − X̄_A. Would you reject at level α = .05 = 1/20? (Hint: you don't need to compute the full randomization reference distribution! Find the most extreme value possible for X̄_B − X̄_A and reason from there to the next most extreme.)

The most extreme values for X̄_B − X̄_A arise when the 8 data values are split as 0, 1, 3, 5 and 6, 7, 9, 10 (i.e., smallest four versus largest four, or in reverse order). The next largest value of X̄_B − X̄_A arises when we interchange 5 and 6 in the above splits. This configuration was given as the observed situation. Thus there are 2 + 2 = 4 splits with |X̄_B − X̄_A| at least as large as (5 + 7 + 9 + 10)/4 − (0 + 1 + 3 + 6)/4. Since there are C(8, 4) = 70 possible splits of the 8 numbers into two samples of 4, it follows that the p-value of the observed split is 4/70 = 2/35 > 1/20 = .05, i.e., we should not reject at level .05. (An R sketch that enumerates all 70 splits is given after Problem 3.)

2. (6) In the context of the previous problem: why is the randomization reference distribution of X̄_B − X̄_A symmetric, and around which value?

The reference distribution is symmetric around zero, since for any split with positive (negative) X̄_B − X̄_A there is a corresponding split (with the X's under A exchanged against the X's under B, possible because m = n) with the same value of opposite sign.

3. (6) Complete the following vectors c = (c_1, ..., c_t) such that they become proper contrast vectors in a 1-way ANOVA with t = 6 treatment levels.

a) c1 = c(0, 0, 1, -.5, 0, ?)        ? = -.5
b) c2 = c(-1, 0, ?, -.5, 0, .5)      ? = 1
c) c3 = c(1/6, 0, 1/6, 0, ?, -1/2)   ? = 1/6
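The randomization argument in Problem 1 can be checked directly in R; the following is a minimal sketch (variable names are ad hoc, not from the exam) that enumerates all C(8, 4) = 70 splits and recomputes the two-sided p-value:

x <- c(0, 1, 3, 6, 5, 7, 9, 10)    # responses: A = first four, B = last four
splits <- combn(8, 4)              # each column picks the 4 subjects assigned to B
diffs <- apply(splits, 2, function(i) mean(x[i]) - mean(x[-i]))
obs <- mean(x[5:8]) - mean(x[1:4]) # observed Xbar_B - Xbar_A = 5.25
mean(abs(diffs) >= abs(obs))       # two-sided p-value: 4/70, about 0.0571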

4. (10) The Fmin.test uses which test statistic T to test which hypothesis H0, rejecting H0 when T is too ...? What are the crucial assumptions underlying the null distribution of T? How is the null distribution of T obtained? Which other test may be preferable when which one of these crucial assumptions is not satisfied?

T = min(s_1², ..., s_t²) / max(s_1², ..., s_t²) can be used to test the hypothesis H0: σ_1² = ... = σ_t², i.e., of equal variances across the t samples (still assuming equal variance within each sample). We reject H0 when T is too small, and the sampling distribution of T under H0 can be obtained by generating random samples from the standard normal distribution and computing T each time, i.e., we simulate the null distribution. However, the assumption of normality is important for this null distribution, since it may change otherwise. When the assumption of normality is not reasonable it is advisable to use the modified Levene test, which is more robust in that regard.

5. (5) If x = 1:5, what are the responses to the following two R commands?

> x[x > 3]
[1] 4 5
> mean(x[-c(1,3)])
[1] 3.666667

6. (5) What are the responses to the following two R commands?

> c(8,6,4,4)/c(2,2,1,2)
[1] 4 3 4 2
> c(8,6,4,4)/c(2,2)
[1] 4 3 2 2

7. (6) What is the output of the following R function

mystery.function = function(n){
  out = NULL
  for(k in 1:n){
    out = c(out, 1:k)
  }
  out
}

when you call mystery.function(3)?

[1] 1 1 2 1 2 3

8. (9) Assume that for three samples of respective sizes n_1 = 4, n_2 = 4 and n_3 = 8 you are given the pooled standard deviation estimate s = 4. What is the standard error (SE) of the contrast estimate Ĉ = (1/2) X̄_1. + (1/2) X̄_2. − X̄_3.? Also state which contrast is being estimated.

SE = s · sqrt((1/2)²/4 + (1/2)²/4 + (−1)²/8) = s · sqrt(1/16 + 1/16 + 1/8) = s · sqrt(1/4) = s/2 = 2, and the contrast being estimated is C = (1/2) µ_1 + (1/2) µ_2 − µ_3.
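The SE formula in Problem 8 is easy to verify in R; here is a small sketch using the numbers given in the problem (the object names are arbitrary):

s <- 4                    # pooled standard deviation estimate
n <- c(4, 4, 8)           # sample sizes n_1, n_2, n_3
cc <- c(1/2, 1/2, -1)     # contrast coefficients
s * sqrt(sum(cc^2 / n))   # SE of C-hat: 4 * sqrt(1/4) = 2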

9. (8) For the 2-sample Student-t based confidence interval for Δ = µ_Y − µ_X, show how you would convert a 95% confidence interval [A, B] = [.5, 3.5] into a 99% confidence interval, if I give you the value for the ratio t_{.995, m+n−2} / t_{.975, m+n−2} = qt(.995, m+n-2)/qt(.975, m+n-2) = 1.35. Would this change the conclusion when testing the hypothesis H0: Δ = 0 at the two levels α = .05 and α = .01? Justify your answer.

The 95% t-based interval is

Ȳ − X̄ ± t_{.975, m+n−2} · s · sqrt(1/m + 1/n) = (A + B)/2 ± (B − A)/2.

The 99% t-based interval is

Ȳ − X̄ ± t_{.995, m+n−2} · s · sqrt(1/m + 1/n) = (A + B)/2 ± (t_{.995, m+n−2} / t_{.975, m+n−2}) · (B − A)/2 = 2 ± 1.35 · 1.5 = 2 ± 2.025.

Since the first interval does not contain zero we would reject H0 at level α = .05, but at level α = .01 we would not reject H0 since the 99% interval contains zero.

10. (15) Give the definitions of the noncentral t, the noncentral χ², and the noncentral F distributions, respectively. State these definitions in terms of functions of more basic random variables (name the random variables together with the appropriate parameters and relationships to each other, and give the relevant function).

Let Z ~ N(δ, 1) and V ~ χ²_f (central chi-square with f degrees of freedom) be independent; then Z / sqrt(V/f) is said to have a noncentral t-distribution with f degrees of freedom and noncentrality parameter δ.

Let Z_i ~ N(µ_i, 1), i = 1, ..., f, be independent; then Z_1² + ... + Z_f² is said to have a noncentral χ² distribution with f degrees of freedom and noncentrality parameter λ = µ_1² + ... + µ_f².

Let V_1 have a noncentral χ² distribution with f_1 degrees of freedom and noncentrality parameter λ, and V_2 a central χ² distribution with f_2 degrees of freedom, with V_1 and V_2 independent; then (V_1/f_1)/(V_2/f_2) is said to have a noncentral F distribution with f_1 (numerator) and f_2 (denominator) degrees of freedom and with noncentrality parameter λ.

11. (8) In a 1-way ANOVA, comparing means for k = 3 groups, each with sample size 5, at significance level α = .05, what happens to the type II error probability when µ_1 = 1, µ_2 = 2 and µ_3 = 3 and a) the common σ for all k groups becomes very small? b) the common σ for all k groups becomes very large? (Think of the noncentrality parameter.)

The noncentrality parameter is λ = n Σ_i (µ_i − µ̄)²/σ² (here with n = 5 observations per group). Since the power β = β(λ) is strictly increasing to 1 as λ → ∞ and since the type II error probability is 1 − β(λ), we get 1 − β(λ) → 0 under a) and 1 − β(λ) → 1 − α under b), since then λ → 0, i.e., we approach the null distribution.
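The limiting behavior in Problem 11 can be made concrete with the noncentral F distribution in R; here is a small sketch (the helper name anova.power is made up for illustration):

anova.power <- function(sigma, mu = c(1, 2, 3), n = 5, alpha = 0.05) {
  k <- length(mu)
  lambda <- n * sum((mu - mean(mu))^2) / sigma^2   # noncentrality parameter
  crit <- qf(1 - alpha, k - 1, k * (n - 1))        # central F critical value
  1 - pf(crit, k - 1, k * (n - 1), ncp = lambda)   # power; type II error = 1 - power
}
anova.power(sigma = 0.1)   # sigma very small: power near 1, type II error near 0
anova.power(sigma = 100)   # sigma very large: power near alpha, type II error near 1 - alpha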

12. (9) In a 1-way ANOVA with t treatments and n observations per treatment indicate your answer for a)-c) by True, False or Undecidable (without additional information) using T, F, U. Assume Ȳ_i. = Σ_{j=1}^{n} Y_ij / n.

a) Σ_{j=1}^{n} (Y_ij − Ȳ_i.) = 0   (T)
b) Σ_{i=1}^{t} (Y_ij − Ȳ_i.) = 0   (U)
c) Σ_{j=1}^{n} Σ_{i=1}^{t} (Y_ij − Ȳ_i.) = 0   (T)

13. (6) Assume a 1-way ANOVA situation with t treatments. Answer True (T), False (F) or Undecidable (U) (without additional information).

a) If we multiply a contrast vector c = (c_1, ..., c_t) by 3, is the resulting vector again a contrast vector?

(T) since Σ_i c_i = 0 implies Σ_i 3c_i = 0, i.e., it is still a contrast vector.

b) If instead of multiplying by 3 we add 3 to the contrast vector c, do you get again a contrast vector?

(F) since Σ_i c_i = 0 implies Σ_i (c_i + 3) = 3t ≠ 0, i.e., it is no longer a contrast vector.

14. (18) The following table gives the responses Y from a 2-factor experiment with 3 observations per factor level combination. Give the 2 × 3 table of Ȳ_ij., i = 1, 2, j = 1, 2, 3 (leave answers in fractional form). Also give Ȳ_1.., Ȳ_2.., Ȳ_.1., Ȳ_.2., Ȳ_.3., and Ȳ_...

Data Table:

                        Factor 2
                level 1   level 2   level 3
Factor 1   1    1, 4, 3   2, 2, 4   3, 1, 2
           2    6, 1, 2   3, 2, 1   4, 5, 2

Table of Ȳ_ij.:

                        Factor 2
                level 1   level 2   level 3
Factor 1   1      8/3       8/3       6/3
           2      9/3       6/3      11/3

Ȳ_1.. = 22/9, Ȳ_2.. = 26/9, Ȳ_.1. = 17/6, Ȳ_.2. = 14/6, Ȳ_.3. = 17/6, Ȳ_... = 48/18 = 8/3.

15. (8) When comparing k = 5 independent normal samples with equal variances, we can compute confidence intervals for how many pairwise differences of means µ_i − µ_j, i < j? Using Bonferroni's method, how would you adjust the confidence levels of these individual confidence intervals so that they simultaneously cover their respective targets µ_i − µ_j with probability .90?

There are C(5, 2) = 10 possible pairwise differences. Thus the individual confidence levels need to be adjusted to 1 − α/10 = 1 − .10/10 = .99.
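The cell and marginal means of Problem 14 are quickly checked in R with tapply; the sketch below simply enters the data table row by row (the factor coding is one of several possible layouts):

y  <- c(1,4,3, 2,2,4, 3,1,2,  6,1,2, 3,2,1, 4,5,2)   # Factor 1 level 1 cells, then level 2
f1 <- factor(rep(1:2, each = 9))                     # Factor 1
f2 <- factor(rep(rep(1:3, each = 3), times = 2))     # Factor 2
tapply(y, list(f1, f2), mean)   # 2 x 3 table of Ybar_ij.
tapply(y, f1, mean)             # Ybar_1.. = 22/9, Ybar_2.. = 26/9
tapply(y, f2, mean)             # Ybar_.1. = 17/6, Ybar_.2. = 14/6, Ybar_.3. = 17/6
mean(y)                         # Ybar_... = 8/3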

16. (15) In a 2-way (2 factor) ANOVA rewrite the Sum of Squares Decomposition SS_T = SS_A + SS_B + SS_AB + SS_E in terms of the responses Y using the dot-bar notation, e.g., Ȳ_i.., etc.

With sums taken over all indices i, j, k (k indexing the replications within each factor level combination):

SS_T = Σ (Y_ijk − Ȳ_...)²
SS_A = Σ (Ȳ_i.. − Ȳ_...)² = Σ â_i²
SS_B = Σ (Ȳ_.j. − Ȳ_...)² = Σ b̂_j²
SS_AB = Σ (Ȳ_ij. − Ȳ_i.. − Ȳ_.j. + Ȳ_...)² = Σ ĉ_ij²
SS_E = Σ (Y_ijk − Ȳ_ij.)² = Σ ε̂_ijk²

so that SS_T = SS_A + SS_B + SS_AB + SS_E becomes

Σ (Y_ijk − Ȳ_...)² = Σ (Ȳ_i.. − Ȳ_...)² + Σ (Ȳ_.j. − Ȳ_...)² + Σ (Ȳ_ij. − Ȳ_i.. − Ȳ_.j. + Ȳ_...)² + Σ (Y_ijk − Ȳ_ij.)²

17. (6) What role (distinct in character) did SIR and FLUX play in the circuit board experiment?

SIR was the response variable and FLUX was the treatment or factor.

18. (10) The testing of which hypothesis is addressed by the modified Levene test and how does it use a 1-way ANOVA analysis?

The modified Levene test tests the hypothesis of equal variances across several samples by subtracting the respective sample median from each sample, taking the absolute values of these differences, and performing a 1-way ANOVA on these absolute differences.

19. (12) In the context of an experiment with two factors A and B with t_1 and t_2 levels, respectively, and with n replications of the response Y per factor level combination, what is the difference in the performed analyses in the following three commands?

anova(lm(Y ~ A*B)),  anova(lm(Y ~ A+B))  and  anova(lm(Y ~ A:B))

In the case of A*B a full model is fitted, consisting of an additive model and interactions, with appropriate constraints. In the case of A+B an additive model is fitted, and in the case of A:B a simple 1-way ANOVA model is fitted to all treatment level combinations of factors A and B, without any implicit additive and interaction structure.

20. (8) What type of constraints does R use when calculating and presenting the least squares estimates of the model coefficients in the 1-way and 2-way ANOVA? (answer in a short phrase)

R uses set-to-zero constraints.

21. (12) In the full 2-factor ANOVA model µ_ij = µ + a_i + b_j + c_ij (i = 1, ..., t_1, j = 1, ..., t_2, and with equal number n > 1 of replications per factor level combination) what is the null distribution of the F-test for testing H0: b_1 = ... = b_{t_2}? What is the null distribution of the F-test for testing H0: c_ij = 0 for all i = 1, ..., t_1, j = 1, ..., t_2?

The respective null distributions are F_{t_2−1, t_1·t_2·(n−1)} and F_{(t_1−1)(t_2−1), t_1·t_2·(n−1)}.
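The three analyses in Problem 19 can be put side by side on simulated data; the sketch below uses a hypothetical layout with t_1 = 2, t_2 = 3 and n = 4 replications (the degrees of freedom in the three ANOVA tables also illustrate Problem 21):

set.seed(1)
A <- factor(rep(1:2, each = 12))
B <- factor(rep(rep(1:3, each = 4), times = 2))
Y <- rnorm(24)         # placeholder response, just to run the fits
anova(lm(Y ~ A * B))   # main effects A (1 df), B (2 df) and interaction A:B (2 df)
anova(lm(Y ~ A + B))   # additive model: main effects A and B only
anova(lm(Y ~ A:B))     # 1-way ANOVA on the 6 factor level combinations (5 df)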

22. (10) In the one-way or two-way ANOVA what is the underlying assumption about data variability? If that assumption is violated what technique can correct for this? Give two distinct examples for applying the technique and explain how the technique was tuned to each situation.

It is assumed that the variance of all observations is constant. If the variance changes with the mean for each treatment (combination) then a variance stabilizing transform is often useful. One plots log(s_i) against log(µ̂_i) and hopes to see a strong linear pattern. Then one fits a straight line and its slope α, rounded to an integer or the nearest multiple of 1/2, is used via λ = 1 − α as the exponent in the power transformation of the data. Such transformed data will then have a more stable variance across all groups. When α = 1 the transformation should be the logarithm. We saw two examples: the insecticide example with a reciprocal transform and the hermit crab count data with a log-transform.
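The tuning step described above can be sketched in R; the data here are hypothetical simulated counts (not the insecticide or hermit crab data from class), used only to show the mechanics:

set.seed(2)
group <- factor(rep(1:5, each = 10))
mu    <- rep(c(2, 5, 10, 20, 40), each = 10)
y     <- rpois(50, lambda = mu)                # count data: variance grows with the mean
s     <- tapply(y, group, sd)
m     <- tapply(y, group, mean)
alpha <- coef(lm(log(s) ~ log(m)))[2]          # slope of log(s_i) against log(mean_i)
lambda <- 1 - round(2 * alpha) / 2             # 1 - alpha, with alpha rounded to nearest 1/2
lambda                                         # about 1/2 here: square-root transform
# analyze y^lambda (or log(y) when lambda = 0) instead of y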