Lecture 6 More on Complete Randomized Block Design (RBD)


Multiple tests

Multiple tests. The multiple comparisons (or multiple testing) problem occurs when one considers a set of statistical inferences simultaneously. For a levels with means μ_1, μ_2, …, μ_a, we test the hypotheses H_0: μ_i = μ_j versus H_1: μ_i ≠ μ_j, for all pairs i ≠ j, i, j = 1, 2, …, a.

Fisher Least Significant Difference (LSD) method. This method builds on the equal-variances t-test of the difference between two means. The test statistic is improved by using MS_ε rather than s_p². It is concluded that μ_i and μ_j differ at significance level α if |ȳ_i − ȳ_j| > LSD, where LSD = t_{α/2, df} √( MS_ε (1/r_i + 1/r_j) ).

LSD. Here r_i and r_j are the numbers of observations under levels i and j, and df = Σ_{i=1}^a r_i − a.
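The LSD rule above is a one-line computation once the t critical value is looked up from a table. A minimal sketch; the numeric inputs below (MS_ε = 2.0, r_i = r_j = 5, t_{0.025, 20} ≈ 2.086) are hypothetical illustration values, not from the lecture:

```python
import math

def fisher_lsd(t_crit, ms_error, r_i, r_j):
    """Least Significant Difference for comparing level means i and j.

    t_crit is the two-sided critical value t_{alpha/2, df} with
    df = sum of replications minus a, looked up from a t table
    (assumed given here rather than computed).
    """
    return t_crit * math.sqrt(ms_error * (1.0 / r_i + 1.0 / r_j))

# Hypothetical numbers: MS_e = 2.0, r_i = r_j = 5, t_{0.025, 20} ~= 2.086
lsd = fisher_lsd(2.086, 2.0, 5, 5)
# Declare mu_i and mu_j different when |ybar_i - ybar_j| > lsd
```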

Experiment-wise Type I error rate (α_E) (the effective Type I error). Fisher's method may result in an increased probability of committing a Type I error. The experiment-wise Type I error rate is the probability of committing at least one Type I error when each comparison is made at significance level α. It is calculated as α_E = 1 − (1 − α)^C, where C is the number of pairwise comparisons (all pairs: C = a(a−1)/2). The Bonferroni adjustment determines the required Type I error probability per pairwise comparison (α) to secure a pre-determined overall α_E.

Bonferroni adjustment. The procedure: compute the number of pairwise comparisons C (all pairs: C = a(a−1)/2), where a is the number of populations. Set α = α_E/C, where α_E is the desired probability of making at least one Type I error (the experiment-wise Type I error). It is concluded that μ_i and μ_j differ at the α_E/C significance level if |ȳ_i − ȳ_j| > t_{α_E/(2C), df} √( MS_ε (1/r_i + 1/r_j) ).
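The two error-rate formulas above can be sketched directly; the independence assumption behind 1 − (1 − α)^C is the one stated in the slides:

```python
def experimentwise_alpha(alpha, a):
    """P(at least one Type I error) over all C = a(a-1)/2 pairwise tests,
    assuming the comparisons are independent."""
    c = a * (a - 1) // 2
    return 1 - (1 - alpha) ** c

def bonferroni_alpha(alpha_e, a):
    """Per-comparison significance level securing an overall level alpha_e."""
    c = a * (a - 1) // 2
    return alpha_e / c

# With a = 5 levels there are C = 10 pairwise comparisons, so testing each
# at alpha = 0.05 inflates the experiment-wise rate to about 0.40, while
# Bonferroni would run each comparison at 0.05 / 10 = 0.005.
ae = experimentwise_alpha(0.05, 5)
per = bonferroni_alpha(0.05, 5)
```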

Duncan's multiple range test. The Duncan multiple range test uses different significant-difference values for means next to each other along the real number line and for means with 1, 2, …, a−2 other means in between the two means being compared. The significant difference, or range value, is R_p = r_{α,p,ν} √( MS_ε / n ), where r_{α,p,ν} is Duncan's significant range value with parameters p (the range value) and ν (the MS_ε degrees of freedom), at experiment-wise alpha level α (= α_joint).

Duncan's multiple range test. MS_ε is the mean square error from the ANOVA table and n is the number of observations used to calculate the means being compared. The range value p is: 2 if the two means being compared are adjacent; 3 if one mean separates the two means being compared; 4 if two means separate the two means being compared; and so on.

Significant ranges for Duncan's multiple range test

Tukey-Kramer method. A procedure which controls the experiment-wise error rate is Tukey's honestly significant difference test. It is used when all levels have the same number of replications; assume r_1 = r_2 = … = r_a = r. Basic idea: if H_0 is true, |ȳ_i − ȳ_j| should not be large. The rejection region of the multiple tests (i.e., at least one H_0 is rejected) is W = { max_{i≠j} |ȳ_i − ȳ_j| ≥ c }.

Tukey-Kramer method. We need to determine c such that, when all the H_0 are true, the probability of a Type I error is α, i.e. P(W) = α. When all the H_0 hold, every μ_i equals a common value μ, and

α = P( max_{i,j} |ȳ_i − ȳ_j| ≥ c )
  = P( max_i ȳ_i − min_i ȳ_i ≥ c )
  = P( max_i (ȳ_i − μ)/√(MS_ε/r) − min_i (ȳ_i − μ)/√(MS_ε/r) ≥ c/√(MS_ε/r) ).

Tukey-Kramer method. MS_ε is the mean square of errors in the ANOVA and is an unbiased estimator of σ². It is independent of the ȳ_i, so t_i = (ȳ_i − μ)/√(MS_ε/r) ~ t(f_ε). The maximum and minimum above are then the largest and smallest order statistics, t_(a) and t_(1), from a sample of a such variables (each obeying t(f_ε)). Denote their range by q(a, f_ε) = t_(a) − t_(1), the studentized range.

Tukey-Kramer method. Then P(W) = P( q(a, f_ε) ≥ c/√(MS_ε/r) ) = α, i.e. c = q_{1−α}(a, f_ε) √(MS_ε/r). So the rejection region of these multiple tests with significance level α is |ȳ_i − ȳ_j| ≥ q_{1−α}(a, f_ε) √(MS_ε/r), i ≠ j, i, j = 1, 2, …, a.
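The critical difference c is a single multiplication once the studentized-range quantile is read from a table. A sketch with hypothetical inputs; the value q_{0.95}(4, 20) ≈ 3.96 is an assumed table lookup, not computed here:

```python
import math

def tukey_hsd(q_crit, ms_error, r):
    """Honestly Significant Difference for equal replication r.

    q_crit = q_{1-alpha}(a, f_error), looked up from a
    studentized-range table (assumed given)."""
    return q_crit * math.sqrt(ms_error / r)

# Hypothetical: a = 4 levels, f_error = 20, q_{0.95}(4, 20) ~= 3.96,
# MS_e = 2.09, r = 6 replications per level.
w = tukey_hsd(3.96, 2.09, 6)
# Declare mu_i and mu_j different when |ybar_i - ybar_j| >= w
```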

Scheffé method. For different numbers of replications. Under H_0: μ_i = μ_j, the difference ȳ_i − ȳ_j ~ N(0, σ²(1/r_i + 1/r_j)). Replacing σ² by MS_ε (which is independent of ȳ_i − ȳ_j) gives F_ij = (ȳ_i − ȳ_j)² / ( MS_ε (1/r_i + 1/r_j) ) ~ F(1, f_ε). If H_0 is true, F_ij should not be large.

Scheffé method. When all the H_0 are true, the rejection region of the multiple tests is W = { max_{i≠j} F_ij ≥ c }. Scheffé proved that max_{i≠j} F_ij / (a−1) approximately obeys F(a−1, f_ε). Given significance level α, c = (a−1) F_{1−α}(a−1, f_ε), so the rejection region is |ȳ_i − ȳ_j| ≥ √( (a−1) F_{1−α}(a−1, f_ε) (1/r_i + 1/r_j) MS_ε ), i ≠ j, i, j = 1, 2, …, a.
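The Scheffé rejection rule reduces to comparing each |ȳ_i − ȳ_j| with a critical difference. A sketch; the inputs below (F_{0.95}(3, 20) ≈ 3.10, MS_ε = 2.09, r_i = 7, r_j = 5) are hypothetical illustration values:

```python
import math

def scheffe_diff(f_crit, a, ms_error, r_i, r_j):
    """Critical difference for the Scheffe pairwise test:
    reject H0: mu_i = mu_j when |ybar_i - ybar_j| exceeds this.

    f_crit = F_{1-alpha}(a-1, f_error) from an F table (assumed given)."""
    return math.sqrt((a - 1) * f_crit * ms_error * (1.0 / r_i + 1.0 / r_j))

# Hypothetical: a = 4, F_{0.95}(3, 20) ~= 3.10, MS_e = 2.09,
# unequal replications r_i = 7, r_j = 5.
d = scheffe_diff(3.10, 4, 2.09, 7, 5)
```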

Test of normality

Test of normality. Many test procedures that we have developed rely on the assumption of normality. There are several methods of assessing whether data are normally distributed or not (H_0: the data obey a normal distribution; H_1: the data do not obey a normal distribution).

Test of normality. They fall into two broad categories, graphical and statistical. The most common are: 1. Graphical: Q-Q probability plots; cumulative frequency (P-P) plots. 2. Statistical: Kolmogorov-Smirnov test; Shapiro-Wilk test.

Q-Q probability plots. Q-Q plots display the observed values against the quantiles of a normal distribution (represented by the line). [Figures: Normal Q-Q plot for normally distributed data and for non-normally distributed data; axes: Observation vs. Normal quantile.] Normally distributed data fall along the line.
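The points of a normal Q-Q plot can be computed by pairing each ordered observation with the standard-normal quantile of its plotting position. A minimal sketch using the standard library; the (i − 0.5)/n plotting position is one common convention, and the sample values are hypothetical:

```python
from statistics import NormalDist

def qq_points(data):
    """Return (theoretical normal quantile, ordered observation) pairs.

    Roughly linear points suggest the data are normally distributed."""
    xs = sorted(data)
    n = len(xs)
    nd = NormalDist()  # standard normal
    return [(nd.inv_cdf((i - 0.5) / n), x) for i, x in enumerate(xs, start=1)]

# Hypothetical sample of 8 observations:
pts = qq_points([3.83, 3.16, 4.70, 3.97, 2.03, 2.87, 3.65, 5.09])
# Plot pts (x = theoretical quantile, y = observation) and look for a line.
```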

P-P plots. A P-P (cumulative) plot displays the observed cumulative probabilities against those of a normal distribution (represented by the line). [Figures: Normal P-P plot for normally distributed data and for non-normally distributed data; axes: Observed cumulative probability vs. Normal cumulative probability.] Normally distributed data fall along the line.

Statistical tests. Statistical tests for normality are more precise since actual probabilities are calculated. The Kolmogorov-Smirnov and Shapiro-Wilk tests for normality calculate the probability that the sample was drawn from a normal population. The hypotheses used are: H_0: the sample data are not significantly different from a normal population; H_a: the sample data are significantly different from a normal population.

Statistical tests. Typically, we are interested in finding a difference between groups, so we look for small probabilities: if the probability of an event is rare (less than 5%) and we actually find it, that is of interest. When testing normality, however, we are not looking for a difference; in effect, we want our data set to be no different from normal. So when testing for normality: probabilities > 0.05 mean the data are normal; probabilities < 0.05 mean the data are NOT normal.

Kolmogorov-Smirnov normality test. Based on comparing the observed frequencies and the expected frequencies. Let F(x) = P(X ≤ x) be the cdf of the proposed distribution; in the uniform(0,1) case, F(x) = x. Compare this to the empirical distribution function F̂_n(x) = (1/n) · (number of X_i ≤ x in the sample).

Kolmogorov-Smirnov normality test. If X_1, X_2, …, X_n really come from the distribution with cdf F, the distance D_n = max_x | F̂_n(x) − F(x) | should be small. Example: suppose we have 7 observations: 0.6, 0.2, 0.5, 0.9, 0.1, 0.4, 0.2. Put them in order: 0.1, 0.2, 0.2, 0.4, 0.5, 0.6, 0.9.

Kolmogorov-Smirnov normality test. Now the empirical cdf is:

F̂_7(x) = 0 for x < 0.1; 1/7 for 0.1 ≤ x < 0.2; 3/7 for 0.2 ≤ x < 0.4; 4/7 for 0.4 ≤ x < 0.5; 5/7 for 0.5 ≤ x < 0.6; 6/7 for 0.6 ≤ x < 0.9; and 1 for x ≥ 0.9.

Comparing with F(x) = x gives D_7 = 6/7 − 0.6 = 9/35 ≈ 0.2571, attained at x = 0.6.

Kolmogorov-Smirnov normality test. Let X_(1) ≤ X_(2) ≤ … ≤ X_(n) be the ordered sample. Then D_n can be computed as D_n = max( D_n⁺, D_n⁻ ), where (assuming non-repeating values) D_n⁺ = max_i ( i/n − F(X_(i)) ) and D_n⁻ = max_i ( F(X_(i)) − (i−1)/n ).
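The order-statistic formulas above translate directly into code; applied to the 7-observation example against the uniform(0,1) cdf F(x) = x, they reproduce D_7 = 9/35:

```python
def ks_statistic(data, cdf):
    """D_n = max(D_n+, D_n-) via the order-statistic formulas."""
    xs = sorted(data)
    n = len(xs)
    d_plus = max((i + 1) / n - cdf(x) for i, x in enumerate(xs))
    d_minus = max(cdf(x) - i / n for i, x in enumerate(xs))
    return max(d_plus, d_minus)

# The 7-observation example, tested against the uniform(0,1) cdf:
d7 = ks_statistic([0.6, 0.2, 0.5, 0.9, 0.1, 0.4, 0.2], lambda x: x)
# d7 = 9/35, well below the 5% critical value 0.483 for n = 7
```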

Kolmogorov-Smirnov normality test. We reject the claim that this sample came from the proposed distribution if the empirical cdf is too far from the true cdf of the proposed distribution, i.e., we reject if D_n is too large.

Kolmogorov-Smirnov normality test. In the 1930s, Kolmogorov and Smirnov showed that lim_{n→∞} P( √n · D_n ≤ t ) = 1 − 2 Σ_{j=1}^∞ (−1)^{j−1} e^{−2j²t²}. So, for large sample sizes, you can assume P( √n · D_n ≤ t ) ≈ 1 − 2 Σ_{j=1}^∞ (−1)^{j−1} e^{−2j²t²} and find the value of t that makes the right-hand side 1 − α for an α-level test.
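The limiting distribution above is an alternating series that converges very fast, so it is easy to evaluate numerically. A minimal sketch:

```python
import math

def ks_limit_cdf(t, terms=100):
    """Asymptotic cdf of sqrt(n) * D_n:
    1 - 2 * sum_{j>=1} (-1)^(j-1) * exp(-2 * j^2 * t^2)."""
    s = sum((-1) ** (j - 1) * math.exp(-2 * j * j * t * t)
            for j in range(1, terms + 1))
    return 1 - 2 * s

# t = 1.3581 gives probability close to 0.95, which is why the large-sample
# 5% critical value for D_n is approximated by 1.3581 / sqrt(n).
p = ks_limit_cdf(1.3581)
```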

Kolmogorov-Smirnov normality test. For small samples, people have worked out and tabulated critical values, but there is no nice closed-form solution [J. Pomeranz (1973); J. Durbin (1968)]. Good approximations for n > 40:

α:  0.20       0.10       0.05       0.02       0.01
CV: 1.0730/√n  1.2239/√n  1.3581/√n  1.5174/√n  1.6276/√n

Kolmogorov-Smirnov normality test. From a table, the critical value for a 0.05-level test with n = 7 is 0.483. Since D_7 = 9/35 ≈ 0.2571 < 0.483, we cannot reject H_0; i.e., the data are consistent with the proposed distribution.

Shapiro-Wilk test. The test statistic is W = ( Σ_{i=1}^{[n/2]} a_i ( x_(n+1−i) − x_(i) ) )² / Σ_{i=1}^n ( x_i − x̄ )², where x_(i) is the i-th order statistic. The constants a_i are given by (a_1, …, a_n) = m^T V^{−1} / ( m^T V^{−1} V^{−1} m )^{1/2}, where m = (m_1, …, m_n)^T are the expected values of the order statistics of independent and identically distributed random variables sampled from the standard normal distribution, and V is the covariance matrix of those order statistics.

Shapiro-Wilk test. W should be close to 1 under H_0, and the user may reject the null hypothesis if W is too small. The rejection region is {W ≤ c}, where P(W ≤ c) = α.

Example. The data are FDP activities of mice.

Obs:        1     2     3     4     5     6     7     8
x:          3.83  3.16  4.70  3.97  2.03  2.87  3.65  5.09

Put the data in order and divide them into two halves.

Obs in order: 1     2     3     4     5     6     7     8
x:            2.03  2.87  3.16  3.65  3.83  3.97  4.70  5.09

Example.

i   a_i     x_(n+1−i)  x_(i)  d_i = x_(n+1−i) − x_(i)  a_i · d_i
1   0.6052  5.09       2.03   3.06                     1.8519
2   0.3164  4.70       2.87   1.83                     0.5790
3   0.1743  3.97       3.16   0.81                     0.1412
4   0.0561  3.83       3.65   0.18                     0.0101
                                              Total:   2.5822

So W = ( Σ a_i d_i )² / Σ ( x_i − x̄ )² = (2.5822)² / 6.7826 ≈ 0.9831. From the reference table of W, W_{0.05}(8) = 0.818 < W, so we cannot reject H_0.
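The whole W computation can be sketched in a few lines, using the tabulated coefficients a_1, …, a_4 for n = 8 and the FDP data above:

```python
def shapiro_w(data, a_coeffs):
    """W = (sum_i a_i * (x_(n+1-i) - x_(i)))^2 / sum_i (x_i - xbar)^2,
    with tabulated Shapiro-Wilk coefficients a_1 .. a_[n/2]."""
    xs = sorted(data)
    n = len(xs)
    b = sum(a * (xs[n - 1 - i] - xs[i]) for i, a in enumerate(a_coeffs))
    mean = sum(xs) / n
    ss = sum((x - mean) ** 2 for x in xs)
    return b * b / ss

# FDP activities of mice, with the table coefficients for n = 8:
w = shapiro_w([3.83, 3.16, 4.70, 3.97, 2.03, 2.87, 3.65, 5.09],
              [0.6052, 0.3164, 0.1743, 0.0561])
# w is about 0.983, above W_{0.05}(8) = 0.818, so H0 is not rejected
```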

Choosing the methods. Which normality test should I use? Kolmogorov-Smirnov: more suitable for large samples. Shapiro-Wilk: works best for small data sets.

Test of homogeneity

Test of homogeneity. If we have various groups or levels of a variable, we want to make sure that the variance within these groups or levels is the same; this is a basic assumption for ANOVA. H_0: σ_1² = σ_2² = … = σ_a²; H_1: not H_0.

Test of homogeneity. Some common methods: 1. Hartley test: only usable when each level has the same sample size. 2. Bartlett test: sample sizes may differ, but each size must be ≥ 5. 3. Adjusted Bartlett test: sample sizes may differ, with no restriction on size.

Hartley test. The numbers of replications are the same for each level, i.e. r_1 = r_2 = … = r_a = r. Hartley proposed the statistic H = max{ S_1², S_2², …, S_a² } / min{ S_1², S_2², …, S_a² }. The distribution of H under H_0 can be simulated; denote it H(a, f), with f = r − 1.

Hartley test. Under H_0, the value of H should be close to 1. Given significance level α, the rejection region is W = { H ≥ H_{1−α}(a, f) }, where H_{1−α}(a, f) is the 1 − α quantile of the H(a, f) distribution.
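The Hartley statistic itself is a one-liner over the sample variances; a minimal sketch (the critical value H_{1−α}(a, f) would still come from a table or simulation):

```python
import statistics

def hartley_h(samples):
    """H = max sample variance / min sample variance.

    Requires the same number of replications r in every level;
    compare the result with H_{1-alpha}(a, r-1) from a table."""
    variances = [statistics.variance(s) for s in samples]
    return max(variances) / min(variances)

# Two hypothetical levels with variances 1.0 and 4.0 give H = 4.0:
h = hartley_h([[1.0, 2.0, 3.0], [2.0, 4.0, 6.0]])
```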

Bartlett test. The i-th sample variance is S_i² = Q_i / f_i = ( 1/(r_i − 1) ) Σ_{j=1}^{r_i} ( y_ij − ȳ_i )², with f_i = r_i − 1 degrees of freedom. We know that MS_ε = Q_ε / f_ε = Σ_{i=1}^a f_i S_i² / Σ_{i=1}^a f_i; it is the weighted arithmetic mean of S_1², S_2², …, S_a².

Bartlett test. Denote the geometric mean GMS_ε = ( S_1^{2f_1} S_2^{2f_2} … S_a^{2f_a} )^{1/f}, where f = Σ_{i=1}^a f_i = Σ ( r_i − 1 ) = n − a. It is always true that MS_ε / GMS_ε ≥ 1 (the arithmetic mean is at least the geometric mean). So under H_0, MS_ε / GMS_ε should be close to 1; if it is too large, reject H_0. The rejection region is W = { ln( MS_ε / GMS_ε ) ≥ d }.

Bartlett test. Bartlett proved that, for large samples, a function of MS_ε / GMS_ε approximately obeys χ²(a − 1); i.e., B = (f/C) ln( MS_ε / GMS_ε ) ~ χ²(a − 1), where C = 1 + ( 1/(3(a − 1)) ) ( Σ_{i=1}^a 1/f_i − 1/f ) is always larger than 1.

Bartlett test. Taking the statistic B = (1/C) ( f ln MS_ε − Σ_{i=1}^a f_i ln S_i² ) ~ χ²(a − 1), the rejection region is W = { B ≥ χ²_{1−α}(a − 1) }. Here B only approximately obeys the χ² distribution, so the method is more suitable for data with more than 5 replications in each level.

Adjusted Bartlett test. Box proposed the adjusted Bartlett statistic B′ = f_2 B C / ( f_1 ( A − B C ) ), where B and C are given above, and f_1 = a − 1, f_2 = (a + 1)/(C − 1)², A = f_2 / ( 2 − C + 2/f_2 ). Under H_0, B′ approximately obeys F(f_1, f_2).

Adjusted Bartlett test. The rejection region is W = { B′ ≥ F_{1−α}(f_1, f_2) }. Sometimes f_2 is not an integer; we can then interpolate between the quantiles of the F distribution.

Example. Testing folic acid content for teas from 4 locations.

Level  Data                           Rep      Sum         Mean   SS within groups
A_1    7.9 6.2 6.6 8.6 8.9 10.1 9.6   r_1 = 7  T_1 = 57.9  8.27   Q_1 = 12.83
A_2    5.7 7.5 9.8 6.1 8.4            r_2 = 5  T_2 = 37.5  7.50   Q_2 = 11.30
A_3    6.4 7.1 7.9 4.5 5.0 4.0        r_3 = 6  T_3 = 34.9  5.82   Q_3 = 12.03
A_4    6.8 7.5 5.0 5.3 6.1 7.4        r_4 = 6  T_4 = 38.1  6.35   Q_4 = 5.61
Total                                 n = 24   T = 168.4          ΣQ_i = 41.77

Example. For the Bartlett test, S_1² = 2.14, S_2² = 2.83, S_3² = 2.41, S_4² = 1.12, and MS_ε = 2.09. With f_1 = 6, f_2 = 4, f_3 = 5, f_4 = 5 and f = 20, C = 1 + ( 1/(3 · 3) ) ( 1/6 + 1/4 + 1/5 + 1/5 − 1/20 ) ≈ 1.0856. Then B = (1/C) ( f ln MS_ε − Σ f_i ln S_i² ) = ( 20 ln 2.09 − 6 ln 2.14 − 4 ln 2.83 − 5 ln 2.41 − 5 ln 1.12 ) / 1.0856 ≈ 0.970.

Example. Given α = 0.05 and a = 4, χ²_{0.95}(3) = 7.815 > B ≈ 0.970. So we cannot reject H_0; i.e., we accept that σ_1² = σ_2² = … = σ_a².

Example. For the adjusted Bartlett test, f_1 = a − 1 = 3, f_2 = (a + 1)/(C − 1)² = 5/0.0856² ≈ 682.4, and A = f_2 / ( 2 − C + 2/f_2 ) = 682.4 / ( 2 − 1.0856 + 2/682.4 ) ≈ 743.9, so B′ = f_2 B C / ( f_1 ( A − B C ) ) = 682.4 × 0.970 × 1.0856 / ( 3 × ( 743.9 − 0.970 × 1.0856 ) ) ≈ 0.32. Given α = 0.05, F_{0.95}(3, 682.4) ≈ 2.60 > B′, so we cannot reject H_0.
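Both Bartlett statistics can be sketched end to end; the code below recomputes C, B, and B′ from the per-level variances and degrees of freedom of the tea example (small rounding differences from the slide values are expected, since the slides round intermediate quantities):

```python
import math

def bartlett(variances, fs):
    """Return (B, C): B = (1/C)(f ln MSe - sum f_i ln S_i^2),
    C = 1 + (sum 1/f_i - 1/f) / (3(a - 1))."""
    a = len(variances)
    f = sum(fs)
    mse = sum(fi * s for fi, s in zip(fs, variances)) / f  # weighted mean
    c = 1 + (sum(1 / fi for fi in fs) - 1 / f) / (3 * (a - 1))
    b = (f * math.log(mse)
         - sum(fi * math.log(s) for fi, s in zip(fs, variances))) / c
    return b, c

def adjusted_bartlett(b, c, a):
    """Box's B' = f2*B*C / (f1*(A - B*C)), approx. F(f1, f2) under H0."""
    f1 = a - 1
    f2 = (a + 1) / (c - 1) ** 2
    big_a = f2 / (2 - c + 2 / f2)
    return f2 * b * c / (f1 * (big_a - b * c))

# Tea example: S^2 = 2.14, 2.83, 2.41, 1.12 with f_i = 6, 4, 5, 5.
b, c = bartlett([2.14, 2.83, 2.41, 1.12], [6, 4, 5, 5])
bp = adjusted_bartlett(b, c, 4)
# b is about 0.97 < chi2_{0.95}(3) = 7.815, and bp is about 0.32 <
# F_{0.95}(3, 682) ~= 2.60, so neither test rejects H0.
```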

Data transformation. Data transformation is used to make your data better conform to a normal distribution. The options are: a) no transformation needed; b) square root transformation; c) logarithmic transformation; d) reciprocal transformation.
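The three transformations are elementwise functions; a minimal sketch (the comments on when each is typically used reflect common practice, not a rule from the slides):

```python
import math

def transform(data, kind):
    """Apply one of the common normalizing transformations."""
    if kind == "sqrt":        # often used for counts
        return [math.sqrt(x) for x in data]
    if kind == "log":         # often used for right-skewed, positive data
        return [math.log(x) for x in data]
    if kind == "reciprocal":  # often used for times to response
        return [1.0 / x for x in data]
    return list(data)         # no transformation needed

y = transform([1.0, 4.0, 9.0], "sqrt")
```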

Let's work on the previous example together.

Mutants  Rep I  Rep II  Rep III
A        0.9 9..
B        0.8.3 4.0
C        ..5 0.5
D        9. 0.7 0.
E        .8 3.9 6.8
F        0. 0.6.8
G        0.0.5 4.
H        9.3 0.4 4.4

Do the randomization of the three blocks.
Build the ANOVA table for the observed data.
Multiple tests by LSD.
Normality test by Q-Q and P-P plots.
What else do we want? Orthogonal contrasts.