Randomized Complete Block Designs

David Allen
University of Kentucky
February 23, 2016

1 Randomized Complete Block Design

There are many situations where it is impossible to use a completely randomized design. Suppose a researcher at an agricultural experiment station wants to conduct trials to compare six fertilization programs for wheat with regard to the nitrate content of the plants. It would be desirable for the recommendations to apply across the state. Because of variability in climate, soil type, fertility, etc., trials would be conducted at multiple locations. The trial at each location should be balanced with respect to treatments. Hence a separate randomization is done at each location.
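The separate randomization at each location can be sketched in a few lines of Python. This is only an illustration: the function name `randomize_rcbd`, its parameters, and the seed are hypothetical and do not come from the presentation. Each block (location) receives its own independent random ordering of the full set of treatments.

```python
import random

def randomize_rcbd(n_treatments, n_blocks, seed=None):
    """Return one independent random treatment order per block."""
    rng = random.Random(seed)
    layout = {}
    for block in range(1, n_blocks + 1):
        order = list(range(1, n_treatments + 1))
        rng.shuffle(order)  # separate randomization within each block
        layout[block] = order
    return layout

# Six fertilization programs at four locations (seed chosen arbitrarily).
layout = randomize_rcbd(n_treatments=6, n_blocks=4, seed=2016)
for block, order in layout.items():
    print(block, order)
```

Because every block contains each treatment exactly once, the design stays balanced with respect to treatments regardless of the random orders drawn.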


Data from an example in Kuehl [1] are shown in Table 1.

Block 1   Treatment      2      5      4      1      6      3
          Response   40.89  37.99  37.18  34.98  34.89  42.07
Block 2   Treatment      1      3      4      6      5      2
          Response   41.22  49.42  45.85  50.15  41.99  46.69
Block 3   Treatment      6      3      5      1      2      4
          Response   44.57  52.68  37.61  36.94  46.65  40.23
Block 4   Treatment      2      4      6      5      3      1
          Response   41.90  39.20  43.29  40.45  42.91  39.97

Table 1: Nitrate content of wheat plants

The Model

The model in scalar notation is

$$Y_{ij} = \mu_i + b_j + \varepsilon_{ij} \tag{1}$$

where $\mu_i$ is the mean of the $i$th treatment group, $i = 1, \ldots, t$, $b_j$ is a random component associated with the $j$th block, $j = 1, \ldots, b$, and $\varepsilon_{ij}$ is a random component specific to the $i$th treatment and $j$th block. The $b_j$ are independently distributed $N(0, \sigma_b^2)$, and the $\varepsilon_{ij}$ are independently distributed $N(0, \sigma^2)$. The $b_j$ and $\varepsilon_{ij}$ are jointly independent.
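To make the roles of the two random components concrete, here is a minimal simulation sketch of model (1). The function `simulate_rcbd`, the treatment means, the standard deviations, and the seed are all hypothetical values chosen for illustration. Note that one block effect is drawn per block and shared by every treatment in that block, while each cell gets its own error.

```python
import random

def simulate_rcbd(mu, n_blocks, sigma_b, sigma, seed=None):
    """Simulate Y[i][j] = mu[i] + b[j] + eps[i][j] for t treatments and b blocks."""
    rng = random.Random(seed)
    # One random effect per block, shared by every treatment in that block.
    block_effects = [rng.gauss(0.0, sigma_b) for _ in range(n_blocks)]
    # Independent error for each treatment-by-block cell.
    return [[m + b_j + rng.gauss(0.0, sigma) for b_j in block_effects] for m in mu]

# Hypothetical parameter values: four treatments, three blocks.
y = simulate_rcbd(mu=[5.0, 6.0, 7.0, 8.0], n_blocks=3, sigma_b=2.0, sigma=1.0, seed=1)
for i, row in enumerate(y, start=1):
    print(i, [round(v, 2) for v in row])
```

The shared block effect is what induces the positive covariance between treatment means.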

Study Objectives

The objectives of the study are

1. to test hypotheses, or place confidence intervals, on linear combinations of treatment means, and
2. to test the null hypothesis $\sigma_b^2 = 0$.

The test of the null hypothesis that the treatment means are all equal is commonly applied, but it is discussed in another presentation.

Learning Objectives

I am sure you have encountered the randomized complete block design before. However, when a linear combination of treatment means is not a contrast, there is no exact test or confidence interval. The Satterthwaite procedure is used to obtain approximate tests or confidence intervals.

Analysis by SAS

I'm thinking that an analysis by SAS might help set the stage for the more theoretical development later. SAS code invoking proc glimmix is supplied that addresses the objectives above for some specific examples. Note the model statement option ddfm = satterthwaite and, on the output, the fractional degrees of freedom associated with the confidence interval for the first treatment mean.

Exercise 1.1. Run SAS proc glimmix on the data in Table 1. State your conclusions relative to the study objectives listed under Study Objectives.

2 Analytic Analysis

For the randomized complete block design, estimates of the population means are just the sample means. This section derives the formula for the standard error of a linear combination of sample means. Let $b$ (with no subscript) denote the number of blocks in the design and $t$ the number of treatments.

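Variances and Covariances

Let $i$ and $j$ index two different treatments (in this slide $j$ is a treatment index, unlike in model (1), where it indexes blocks). In terms of model (1), the $i$th and $j$th sample means are

$$\bar{Y}_{i\cdot} = \mu_i + \bar{b} + \bar{\varepsilon}_{i\cdot}, \qquad \bar{Y}_{j\cdot} = \mu_j + \bar{b} + \bar{\varepsilon}_{j\cdot},$$

where $\bar{b}$ is the average of the $b$ block effects and $\bar{\varepsilon}_{i\cdot}$ is the average of the errors for treatment $i$. The variance of each mean is $\sigma_b^2/b + \sigma^2/b$. The covariance between the $i$th and $j$th means is

$$E\left[(\bar{b} + \bar{\varepsilon}_{i\cdot})(\bar{b} + \bar{\varepsilon}_{j\cdot})\right] = E\left[\bar{b}^2 + \bar{b}\,\bar{\varepsilon}_{j\cdot} + \bar{b}\,\bar{\varepsilon}_{i\cdot} + \bar{\varepsilon}_{i\cdot}\bar{\varepsilon}_{j\cdot}\right] = E\left[\bar{b}^2\right] = \frac{\sigma_b^2}{b}.$$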

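Variance of a Linear Combination

The variance of a linear combination is

$$\operatorname{Var}\left(\sum_i c_i \bar{Y}_{i\cdot}\right) = \left(\sum_i c_i^2\right)\frac{\sigma^2}{b} + \left(\sum_i c_i\right)^2 \frac{\sigma_b^2}{b}.$$

If $\sum_i c_i = 0$, the linear combination is called a contrast. For contrasts, $\sigma_b^2$ drops out of the variance and inference is straightforward.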
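The variance formula, $(\sum_i c_i^2)\,\sigma^2/b + (\sum_i c_i)^2\,\sigma_b^2/b$, can be checked numerically against the quadratic form built from the variances and covariances of the treatment means. The sketch below uses hypothetical function names and made-up variance components; it is a sanity check, not part of the original presentation.

```python
def var_linear_combination(c, sigma2_b, sigma2, n_blocks):
    """Closed form: (sum c_i^2) * sigma^2 / b + (sum c_i)^2 * sigma_b^2 / b."""
    sum_of_squares = sum(ci * ci for ci in c)
    square_of_sum = sum(c) ** 2
    return sum_of_squares * sigma2 / n_blocks + square_of_sum * sigma2_b / n_blocks

def var_from_covariance(c, sigma2_b, sigma2, n_blocks):
    """Same quantity as the quadratic form sum_i sum_j c_i c_j Cov(Ybar_i, Ybar_j)."""
    total = 0.0
    for i in range(len(c)):
        for j in range(len(c)):
            cov = sigma2_b / n_blocks            # shared block average
            if i == j:
                cov += sigma2 / n_blocks         # own error average
            total += c[i] * c[j] * cov
    return total

# Hypothetical variance components: sigma_b^2 = 5, sigma^2 = 2, b = 3 blocks.
contrast = [1, -1, 0, 0]   # coefficients sum to zero
single_mean = [1, 0, 0, 0]  # not a contrast
print(var_linear_combination(contrast, 5.0, 2.0, 3))
print(var_linear_combination(single_mean, 5.0, 2.0, 3))
print(var_from_covariance(single_mean, 5.0, 2.0, 3))
```

For the contrast the result does not change when `sigma2_b` changes, whereas for the single mean both variance components contribute, which is why the latter needs the Satterthwaite procedure.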

Inference on Non-Contrasts

If a linear combination is not a contrast, it may not be possible to construct a Student's t statistic. In such a case, it is common to use the Satterthwaite procedure described in the next section.

3 The Satterthwaite Procedure

Suppose there is a situation where an immediate estimate of $\operatorname{Var}(c^{t}\hat{\beta})$ does not exist. However, there are two independent sums of squares $SS_1$ and $SS_2$ with respective degrees of freedom $\nu_1$ and $\nu_2$. Furthermore, constants $c_1$ and $c_2$ are such that

$$E\left(c_1 SS_1/\nu_1 + c_2 SS_2/\nu_2\right) = \operatorname{Var}(c^{t}\hat{\beta}).$$

If we use $c_1 SS_1/\nu_1 + c_2 SS_2/\nu_2$ as an estimator of the variance, what value of the degrees of freedom should be used?

The "Pivotal" Quantity

The "pivotal quantity" for a confidence interval on $c^{t}\beta$ is

$$t = \frac{c^{t}\hat{\beta} - c^{t}\beta}{\sqrt{c_1 SS_1/\nu_1 + c_2 SS_2/\nu_2}}.$$

"Pivotal quantity" is in quotes because the distribution of $t$ is unknown if $c_1 \neq 0$ (more precisely, when both $c_1$ and $c_2$ are nonzero). Common practice is to use the Satterthwaite approximation [2]. This approximation may be thought of as synthesizing a mean square.

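Decomposing t

The approach is to approximate the distribution of $t$ by a t-distribution. That reduces the problem to finding the degrees of freedom of the approximating t-distribution. Write $\sigma_k^2 = E(SS_k/\nu_k)$ for $k = 1, 2$, and define

$$Z = \frac{c^{t}\hat{\beta} - c^{t}\beta}{\sqrt{c_1\sigma_1^2 + c_2\sigma_2^2}}$$

and

$$U = \frac{c_1\sigma_1^2}{\nu_1\,(c_1\sigma_1^2 + c_2\sigma_2^2)}\,\frac{SS_1}{\sigma_1^2} + \frac{c_2\sigma_2^2}{\nu_2\,(c_1\sigma_1^2 + c_2\sigma_2^2)}\,\frac{SS_2}{\sigma_2^2};$$

then $t = Z/\sqrt{U}$.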

The distribution of $Z$ is standard normal. It remains to approximate the distribution of $U$ by a chi-square divided by its degrees of freedom, i.e., there exists a $\nu$ such that

$$U \sim \chi^2(\nu)/\nu$$

is approximately satisfied.

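Degrees of freedom for approximating distribution

By "approximately satisfied" we mean that $U$ and $\chi^2(\nu)/\nu$ should have the same variance. Now

$$\operatorname{Var}(U) = \left[\frac{c_1\sigma_1^2}{\nu_1\,(c_1\sigma_1^2 + c_2\sigma_2^2)}\right]^2 2\nu_1 + \left[\frac{c_2\sigma_2^2}{\nu_2\,(c_1\sigma_1^2 + c_2\sigma_2^2)}\right]^2 2\nu_2 = 2\,\frac{c_1^2\sigma_1^4/\nu_1 + c_2^2\sigma_2^4/\nu_2}{(c_1\sigma_1^2 + c_2\sigma_2^2)^2} \tag{2}$$

and

$$\operatorname{Var}\left[\chi^2(\nu)/\nu\right] = \frac{2}{\nu}. \tag{3}$$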

Equating variances (2) and (3) and solving for $\nu$ gives

$$\nu = \frac{(c_1\sigma_1^2 + c_2\sigma_2^2)^2}{c_1^2\sigma_1^4/\nu_1 + c_2^2\sigma_2^4/\nu_2}.$$

Note that $\nu$ depends on unknown parameters; in practice the $\sigma_k^2$ are replaced by the corresponding mean squares.
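As a small illustration of the degrees-of-freedom formula, the sketch below plugs mean squares in place of the unknown $\sigma_k^2$, as described above. The function name `satterthwaite_df` and the numerical inputs are hypothetical, chosen only to show the arithmetic.

```python
def satterthwaite_df(c1, ms1, nu1, c2, ms2, nu2):
    """Approximate degrees of freedom for the synthesized mean square c1*MS1 + c2*MS2.

    ms1 and ms2 are the mean squares SS1/nu1 and SS2/nu2, used as plug-in
    estimates of sigma_1^2 and sigma_2^2.
    """
    numerator = (c1 * ms1 + c2 * ms2) ** 2
    denominator = (c1 * ms1) ** 2 / nu1 + (c2 * ms2) ** 2 / nu2
    return numerator / denominator

# Hypothetical inputs: two mean squares with 2 and 6 degrees of freedom.
nu = satterthwaite_df(c1=0.5, ms1=10.0, nu1=2, c2=0.5, ms2=2.0, nu2=6)
print(round(nu, 2))  # 36 / (25/2 + 1/6) = 54/19, about 2.84
```

The result is generally fractional. When one of the constants is zero the formula returns the degrees of freedom of the remaining mean square, which is the exact case.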

4 Numerical Analysis

In order to display the matrices involved, a situation with fewer blocks and fewer treatments is used. A tabular layout of the responses for four treatments and three blocks is

Treatment   Block 1    Block 2    Block 3    Mean
1           $Y_{11}$   $Y_{12}$   $Y_{13}$   $\bar{Y}_{1\cdot}$
2           $Y_{21}$   $Y_{22}$   $Y_{23}$   $\bar{Y}_{2\cdot}$
3           $Y_{31}$   $Y_{32}$   $Y_{33}$   $\bar{Y}_{3\cdot}$
4           $Y_{41}$   $Y_{42}$   $Y_{43}$   $\bar{Y}_{4\cdot}$

Simulated Data

Simulated data for this layout are

treat   block   response
1       1        4.92
1       2        5.36
1       3        1.78
2       1        2.96
2       2        6.79
2       3        4.63
3       1        6.86
3       2       10.02
3       3        4.55
4       1        8.43
4       2       10.30
4       3        8.13

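The Model in Matrix Notation

The model in matrix notation is $Y = X\beta + Zb + \varepsilon$:

$$
\begin{pmatrix} 4.92 \\ 5.36 \\ 1.78 \\ 2.96 \\ 6.79 \\ 4.63 \\ 6.86 \\ 10.02 \\ 4.55 \\ 8.43 \\ 10.30 \\ 8.13 \end{pmatrix}
=
\begin{pmatrix}
1 & 0 & 0 & 0 \\
1 & 0 & 0 & 0 \\
1 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 \\
0 & 1 & 0 & 0 \\
0 & 1 & 0 & 0 \\
0 & 0 & 1 & 0 \\
0 & 0 & 1 & 0 \\
0 & 0 & 1 & 0 \\
0 & 0 & 0 & 1 \\
0 & 0 & 0 & 1 \\
0 & 0 & 0 & 1
\end{pmatrix}
\begin{pmatrix} \mu_1 \\ \mu_2 \\ \mu_3 \\ \mu_4 \end{pmatrix}
+
\begin{pmatrix}
1 & 0 & 0 \\
0 & 1 & 0 \\
0 & 0 & 1 \\
1 & 0 & 0 \\
0 & 1 & 0 \\
0 & 0 & 1 \\
1 & 0 & 0 \\
0 & 1 & 0 \\
0 & 0 & 1 \\
1 & 0 & 0 \\
0 & 1 & 0 \\
0 & 0 & 1
\end{pmatrix}
\begin{pmatrix} b_1 \\ b_2 \\ b_3 \end{pmatrix}
+ \varepsilon
$$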
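The design matrices $X$ and $Z$ are just indicator (dummy) columns for treatment and block. A minimal sketch of how they could be built from the treat and block columns of the simulated data follows; the variable names are illustrative only.

```python
# Treatment and block labels in the row order of the simulated data.
treat = [1, 1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 4]
block = [1, 2, 3] * 4

# X: one indicator column per treatment (fixed effects mu_1..mu_4).
X = [[1 if tr == k else 0 for k in range(1, 5)] for tr in treat]
# Z: one indicator column per block (random effects b_1..b_3).
Z = [[1 if bl == k else 0 for k in range(1, 4)] for bl in block]

for x_row, z_row in zip(X, Z):
    print(x_row, z_row)
```

Every row of $X$ and of $Z$ contains exactly one 1, each column of $X$ sums to the number of blocks (3), and each column of $Z$ sums to the number of treatments (4).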

The Transformed Model

Elements of the transformed model are shown below. The first four columns are the transformed $X$, the next three the transformed $Z$, and the last the transformed $Y$.

   X1     X2     X3     X4      Z1     Z2     Z3       Y
-1.73   0      0      0      -0.58  -0.58  -0.58   -6.96
 0      1.73   0      0       0.58   0.58   0.58    8.30
 0      0      1.73   0       0.58   0.58   0.58   12.37
 0      0      0      1.73    0.58   0.58   0.58   15.51
 0      0      0      0       1.63  -0.82  -0.82   -1.07
 0      0      0      0       0      1.41  -1.41    4.73
 0      0      0      0       0      0      0      -0.73
 0      0      0      0       0      0      0       0.39
 0      0      0      0       0      0      0      -2.19
 0      0      0      0       0      0      0      -1.22
 0      0      0      0       0      0      0      -1.39
 0      0      0      0       0      0      0      -0.66
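The presentation does not say how the transformation was computed. As a hedged check, the first six transformed responses are consistent with an orthogonal transformation whose first four rows are the treatment totals divided by $\sqrt{b}$ and whose next two rows are normalized block contrasts $(2, -1, -1)$ and $(0, 1, -1)$. The sketch below reproduces those values up to sign (the sign of any row of an orthogonal transformation is arbitrary); the specific contrasts are an assumption inferred from the numbers, not stated in the source.

```python
import math

# Simulated responses: y[i] holds treatment i+1 across blocks 1..3.
y = [
    [4.92, 5.36, 1.78],
    [2.96, 6.79, 4.63],
    [6.86, 10.02, 4.55],
    [8.43, 10.30, 8.13],
]
t, b = 4, 3

# Rows 1-4: treatment totals scaled to unit-length rows (1/sqrt(b) per observation).
treat_rows = [sum(row) / math.sqrt(b) for row in y]

# Rows 5-6: assumed orthonormal block contrasts applied to the block totals.
block_totals = [sum(row[j] for row in y) for j in range(b)]
contrast_1 = (2 * block_totals[0] - block_totals[1] - block_totals[2]) / math.sqrt(6 * t)
contrast_2 = (block_totals[1] - block_totals[2]) / math.sqrt(2 * t)

print([round(v, 2) for v in treat_rows])
print(round(contrast_1, 2), round(contrast_2, 2))
```

The magnitudes agree with the last column of the transformed model to two decimals; only the first row differs in sign.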

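Variance Matrix

The variance matrix of the transformed $Y$ is

$$
\left(\begin{array}{rrrrrrrrrrrr}
 1 & -1 & -1 & -1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
-1 &  1 &  1 &  1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
-1 &  1 &  1 &  1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
-1 &  1 &  1 &  1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
 0 &  0 &  0 &  0 & 4 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
 0 &  0 &  0 &  0 & 0 & 4 & 0 & 0 & 0 & 0 & 0 & 0 \\
 0 &  0 &  0 &  0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
 0 &  0 &  0 &  0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
 0 &  0 &  0 &  0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
 0 &  0 &  0 &  0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
 0 &  0 &  0 &  0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
 0 &  0 &  0 &  0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0
\end{array}\right) \sigma_b^2 + \sigma^2 I \tag{4}
$$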

An analysis of variance is

Analysis of Variance

Source       Sum of Squares   Degrees of Freedom   Mean Square
Treatments           510.97                    4        127.74
Blocks                23.51                    2         11.76
Residual               9.33                    6          1.55
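The sums of squares in this table can be reproduced directly from the simulated data. The sketch below is one way to do so, under the assumption (inferred from the 4 degrees of freedom) that the first line is the uncorrected sum of squares for the four treatment means; variable names are illustrative.

```python
# Simulated responses: y[i] holds treatment i+1 across blocks 1..3.
y = [
    [4.92, 5.36, 1.78],
    [2.96, 6.79, 4.63],
    [6.86, 10.02, 4.55],
    [8.43, 10.30, 8.13],
]
t, b = 4, 3

total_ss = sum(v * v for row in y for v in row)           # uncorrected total, t*b df
treat_ss = sum(sum(row) ** 2 for row in y) / b             # uncorrected treatment means, t df
grand_total = sum(sum(row) for row in y)
block_totals = [sum(row[j] for row in y) for j in range(b)]
block_ss = sum(bt ** 2 for bt in block_totals) / t - grand_total ** 2 / (t * b)  # b-1 df
resid_ss = total_ss - treat_ss - block_ss                  # (t-1)*(b-1) df

for name, ss, df in [("first line", treat_ss, t),
                     ("Blocks", block_ss, b - 1),
                     ("Residual", resid_ss, (t - 1) * (b - 1))]:
    print(f"{name:10s} SS={ss:8.2f} df={df} MS={ss / df:7.2f}")
```

These three sums of squares are also the sums of squared transformed responses in rows 1 to 4, rows 5 to 6, and rows 7 to 12 of the transformed model, respectively.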

Exercises

Each of the following exercises refers to the simulated (mock-up) data.

Exercise 4.1. Estimate $\sigma_b^2$ and $\sigma^2$.

Exercise 4.2. Estimate $\mu_1$ and its standard error. Put a 95% confidence interval on $\mu_1$.

Exercise 4.3. Estimate $\mu_1 - \mu_2$ and its standard error. Put a 95% confidence interval on $\mu_1 - \mu_2$.

Exercise 4.4. Test $H_0: \sigma_b^2 = 0$ with $\alpha = 0.05$. What is the power of this test if $\sigma_b^2/\sigma^2 = 2$?

References

[1] Robert O. Kuehl. Design of Experiments: Statistical Principles of Research Design and Analysis. Duxbury Press, second edition, 2000.

[2] F. E. Satterthwaite. An approximate distribution of estimates of variance components. Biometrics Bulletin, 2:110-114, 1946.