Analysis of Variance and Design of Experiment-I
MODULE VII
LECTURE - 3
ANALYSIS OF COVARIANCE

Dr. Shalabh
Department of Mathematics and Statistics
Indian Institute of Technology Kanpur

Any scientific experiment is performed to learn something that is unknown about a group of treatments and to test certain hypotheses about the corresponding treatment effects.

When the variability of experimental units is small relative to the treatment differences and the experimenter does not wish to use an experimental design, then one may simply take a large number of observations on each treatment and compute the means. The variation around a mean can be made as small as desired by taking more observations.

When there is considerable variation among observations on the same treatment and it is not possible to take an unlimited number of observations, the techniques used for reducing the variation are (i) use of a proper experimental design and (ii) use of concomitant variables. The use of concomitant variables is accomplished through the technique of analysis of covariance.

If both techniques fail to control the experimental variability, then the number of replications of the different treatments (in other words, the number of experimental units) needs to be increased to a point where adequate control of variability is attained.

Introduction to the analysis of covariance model

In the linear model
$$Y = X_1\beta_1 + X_2\beta_2 + \cdots + X_p\beta_p + \varepsilon,$$
if the explanatory variables include quantitative variables as well as indicator variables, i.e., some of them are qualitative and some are quantitative, then the linear model is termed an analysis of covariance (ANCOVA) model.

Note that the indicator variables do not provide as much information as the quantitative variables. For example, quantitative observations on age can be converted into an indicator variable. Let the indicator variable be
$$D = \begin{cases} 1 & \text{if age} \geq 17 \text{ years} \\ 0 & \text{if age} < 17 \text{ years.} \end{cases}$$

The quantitative values of age are then changed into indicator values, e.g.,

Age (in years)    Age (in terms of indicator variable)
14                0
15                0
16                0
17                1

and every age of 17 years or more is likewise coded as 1.
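As a quick illustration of this coding rule, here is a minimal sketch, assuming Python and a few hypothetical ages that are not part of the lecture:

```python
# Minimal sketch: converting a quantitative age variable into the
# indicator D = 1 if age >= 17 years, 0 otherwise (hypothetical ages).
ages = [14, 15, 16, 17, 18, 21]            # hypothetical values for illustration
D = [1 if age >= 17 else 0 for age in ages]
print(list(zip(ages, D)))                  # (14, 0), ..., (17, 1), (18, 1), (21, 1)
```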

In many real applications, some variables may be quantitative and others may be qualitative. In such cases, ANCOVA provides a way out. It helps in reducing the sum of squares due to error, which in turn is reflected in better model adequacy diagnostics. To see how this works:

In the one-way model $Y_{ij} = \mu + \alpha_i + \varepsilon_{ij}$, we have $TSS_1 = SSA_1 + SSE_1$.
In the two-way model $Y_{ij} = \mu + \alpha_i + \beta_j + \varepsilon_{ij}$, we have $TSS_2 = SSA_2 + SSB_2 + SSE_2$.
In the three-way model $Y_{ijk} = \mu + \alpha_i + \beta_j + \gamma_k + \varepsilon_{ijk}$, we have $TSS_3 = SSA_3 + SSB_3 + SS\gamma_3 + SSE_3$.

For a given data set, ideally
$$TSS_1 = TSS_2 = TSS_3, \quad SSA_1 = SSA_2 = SSA_3, \quad SSB_2 = SSB_3,$$
so
$$SSE_1 \geq SSE_2 \geq SSE_3.$$

Note that in the construction of F-statistics we use
$$\frac{SS(\text{effect})/df}{SSE/df},$$
so the F-statistic essentially depends on the SSE: a smaller SSE gives a larger F and hence a greater chance of rejection.
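The role of the error sum of squares in the F-statistic can be seen in a small sketch (Python is assumed; the sums of squares and degrees of freedom are hypothetical and serve only to show that a smaller SSE yields a larger F):

```python
# Sketch: F = [SS(effect)/df_effect] / [SSE/df_error].
# The two SSE values below are hypothetical; only their ordering matters here.
def f_statistic(ss_effect, df_effect, sse, df_error):
    return (ss_effect / df_effect) / (sse / df_error)

ss_effect, df_effect, df_error = 60.0, 3, 20
print(f_statistic(ss_effect, df_effect, sse=100.0, df_error=df_error))  # larger SSE -> smaller F
print(f_statistic(ss_effect, df_effect, sse=40.0,  df_error=df_error))  # smaller SSE -> larger F
```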

Since SSA, SSB, etc. here are based on dummy variables, obviously if SSA, SSB, etc. were instead based on quantitative variables, they would provide more information. Such ideas are used in ANCOVA models, and we construct the model by incorporating the quantitative explanatory variables into ANOVA models.

As another example, suppose our interest is to compare several different kinds of feed for their ability to put weight on animals. If we use ANOVA, then we use the final weights at the end of the experiment. However, the final weights of the animals depend upon the initial weights of the animals at the beginning of the experiment as well as upon the differences in feeds. Use of ANCOVA models enables us to adjust, or correct for, these initial differences.

ANCOVA is useful for improving the precision of an experiment. Suppose the response Y is linearly related to the covariate X (or concomitant variable), and suppose the experimenter cannot control X but can observe it. ANCOVA involves adjusting for the effect of X. If such an adjustment is not made, then X can inflate the error mean square and make the true differences in Y due to treatments harder to detect.

If, for a given experimental material, the use of a proper experimental design cannot control the experimental variation, the use of concomitant variables (which are related to the experimental material) may be effective in reducing the variability.

Consider the one-way classification model
$$E(Y_{ij}) = \beta_i, \quad i = 1, 2, \ldots, p; \; j = 1, 2, \ldots, N_i, \qquad \text{Var}(Y_{ij}) = \sigma^2.$$

If the usual analysis of variance for testing the hypothesis of equality of treatment effects shows a highly significant difference in the treatment effects due to some factor affecting the experiment, then consider the model which takes this effect into account:
$$E(Y_{ij}) = \beta_i + \gamma t_{ij}, \quad i = 1, 2, \ldots, p; \; j = 1, 2, \ldots, N_i, \qquad \text{Var}(Y_{ij}) = \sigma^2,$$
where the $t_{ij}$ are observations on a concomitant variable (which is related to $X_{ij}$) and $\gamma$ is the regression coefficient associated with $t_{ij}$. With this model, the variability of treatment effects can be considerably reduced.

For example, in an agricultural experiment, if the experimental units are plots of land, then $t_{ij}$ can be a measure of the fertility characteristic of the $j$th plot receiving the $i$th treatment and $X_{ij}$ can be the yield.

As another example, suppose the experimental units are animals and the objective is to compare the growth rates of groups of animals receiving different diets. Note that the observed differences in growth rates can be attributed to diet only if all the animals are similar in observable characteristics like weight, age, etc., which influence the growth rates.
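A minimal numerical sketch of fitting this model, assuming Python with numpy and a small hypothetical data set with p = 2 treatments (none of these numbers come from the lecture); the design matrix carries one column per treatment effect $\beta_i$ plus one column for the concomitant variable $t$:

```python
import numpy as np

# Sketch: least squares fit of E(Y_ij) = beta_i + gamma * t_ij
# for p = 2 treatments (hypothetical data).
y = np.array([10., 12., 11., 15., 17., 16.])
t = np.array([ 2.,  3., 2.5,  4.,  5., 4.5])   # concomitant variable
group = np.array([0, 0, 0, 1, 1, 1])           # treatment labels

X = np.column_stack([(group == 0).astype(float),   # column for beta_1
                     (group == 1).astype(float),   # column for beta_2
                     t])                            # column for gamma
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
beta1, beta2, gamma = coef
print(beta1, beta2, gamma)
```

No intercept column is used here, so each treatment gets its own $\beta_i$, matching the parameterization of the model above.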

In the absence of such similarity, use $t_{ij}$, the weight or age of the $j$th animal receiving the $i$th treatment. If we consider a quadratic regression in $t_{ij}$, then
$$E(Y_{ij}) = \beta_i + \gamma_1 t_{ij} + \gamma_2 t_{ij}^2, \quad i = 1, 2, \ldots, p; \; j = 1, 2, \ldots, n_i, \qquad \text{Var}(Y_{ij}) = \sigma^2.$$
ANCOVA in this case is the same as ANCOVA with the two concomitant variables $t_{ij}$ and $t_{ij}^2$.

In the two-way classification with one observation per cell,
$$E(Y_{ij}) = \mu + \alpha_i + \beta_j + \gamma t_{ij}, \quad i = 1, \ldots, I; \; j = 1, \ldots, J,$$
or
$$E(Y_{ij}) = \mu + \alpha_i + \beta_j + \gamma_1 t_{ij} + \gamma_2 w_{ij}$$
with $\sum_i \alpha_i = 0$, $\sum_j \beta_j = 0$; here $(y_{ij}, t_{ij})$ or $(y_{ij}, t_{ij}, w_{ij})$ are the observations in the $(i, j)$th cell and $t_{ij}$, $w_{ij}$ are the concomitant variables.

The concomitant variables can be fixed or random. We consider the case of fixed concomitant variables only.
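Under the same assumptions as the previous sketch (Python/numpy, hypothetical data), the quadratic case only adds $t_{ij}^2$ as a second covariate column in the design matrix:

```python
import numpy as np

# Sketch: quadratic regression in t is ANCOVA with two concomitant
# variables t and t^2 (hypothetical data).
t = np.array([2., 3., 2.5, 4., 5., 4.5])
group = np.array([0, 0, 0, 1, 1, 1])
X_quadratic = np.column_stack([(group == 0).astype(float),
                               (group == 1).astype(float),
                               t,            # column for gamma_1
                               t ** 2])      # column for gamma_2
print(X_quadratic.shape)   # (6, 4): two treatment columns + two covariate columns
```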

One-way classification

Let $Y_{ij}$ ($i = 1, 2, \ldots, p$; $j = 1, 2, \ldots, n_i$) be a random sample of size $n_i$ from the $i$th normal population with mean
$$\mu_{ij} = E(Y_{ij}) = \beta_i + \gamma t_{ij}, \qquad \text{Var}(Y_{ij}) = \sigma^2,$$
where $\beta_i$, $\gamma$ and $\sigma^2$ are unknown parameters and the $t_{ij}$ are known constants, namely the observations on a concomitant variable.

The null hypothesis is
$$H_0: \beta_1 = \beta_2 = \cdots = \beta_p.$$

Let
$$\bar{y}_{io} = \frac{1}{n_i}\sum_{j} y_{ij}, \quad \bar{y}_{oo} = \frac{1}{n}\sum_{i}\sum_{j} y_{ij}, \quad \bar{t}_{io} = \frac{1}{n_i}\sum_{j} t_{ij}, \quad \bar{t}_{oo} = \frac{1}{n}\sum_{i}\sum_{j} t_{ij}, \quad n = \sum_i n_i.$$

Under the whole parametric space $(\Omega)$, we use the likelihood ratio test, for which we obtain the $\hat\beta_i$'s and $\hat\gamma$ using the least squares principle (or maximum likelihood estimation) as follows. Minimize
$$S = \sum_i\sum_j (y_{ij} - \mu_{ij})^2 = \sum_i\sum_j (y_{ij} - \beta_i - \gamma t_{ij})^2.$$
Setting $\partial S/\partial \beta_i = 0$ for fixed $\gamma$ gives
$$\beta_i = \bar{y}_{io} - \gamma \bar{t}_{io}.$$

Substituting this $\beta_i$ into $S$ and minimizing with respect to $\gamma$ by setting $\partial S/\partial \gamma = 0$, i.e., minimizing
$$\sum_i\sum_j \big[(y_{ij} - \bar{y}_{io}) - \gamma (t_{ij} - \bar{t}_{io})\big]^2$$
with respect to $\gamma$, gives
$$\hat\gamma = \frac{\sum_i\sum_j (y_{ij} - \bar{y}_{io})(t_{ij} - \bar{t}_{io})}{\sum_i\sum_j (t_{ij} - \bar{t}_{io})^2}.$$

Thus we have
$$\hat\beta_i = \bar{y}_{io} - \hat\gamma\, \bar{t}_{io}, \qquad \hat\mu_{ij} = \hat\beta_i + \hat\gamma t_{ij} = \bar{y}_{io} + \hat\gamma (t_{ij} - \bar{t}_{io}).$$
Since
$$y_{ij} - \hat\mu_{ij} = (y_{ij} - \bar{y}_{io}) - \hat\gamma (t_{ij} - \bar{t}_{io}),$$
it follows that
$$\sum_i\sum_j (y_{ij} - \hat\mu_{ij})^2 = \sum_i\sum_j (y_{ij} - \bar{y}_{io})^2 - \frac{\Big[\sum_i\sum_j (y_{ij} - \bar{y}_{io})(t_{ij} - \bar{t}_{io})\Big]^2}{\sum_i\sum_j (t_{ij} - \bar{t}_{io})^2}.$$

Under $H_0: \beta_1 = \cdots = \beta_p = \beta$ (say), consider
$$S_w = \sum_i\sum_j (y_{ij} - \beta - \gamma t_{ij})^2$$
and minimize $S_w$ under the restricted space $(\omega)$ by setting $\partial S_w/\partial \beta = 0$ and $\partial S_w/\partial \gamma = 0$.
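These closed-form expressions can be verified numerically; the following sketch (Python/numpy, the same hypothetical two-treatment data as before) computes $\hat\gamma$ and the $\hat\beta_i$ by within-treatment centering:

```python
import numpy as np

# Sketch: closed-form ANCOVA estimates under the full parameter space Omega.
y = np.array([10., 12., 11., 15., 17., 16.])
t = np.array([ 2.,  3., 2.5,  4.,  5., 4.5])
group = np.array([0, 0, 0, 1, 1, 1])

# Treatment means y_io and t_io attached to each observation.
y_io = np.array([y[group == g].mean() for g in group])
t_io = np.array([t[group == g].mean() for g in group])

gamma_hat = np.sum((y - y_io) * (t - t_io)) / np.sum((t - t_io) ** 2)
beta_hat = {g: y[group == g].mean() - gamma_hat * t[group == g].mean()
            for g in np.unique(group)}
print(gamma_hat, beta_hat)
```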

Hence
$$\hat{\hat\beta} = \bar{y}_{oo} - \hat{\hat\gamma}\, \bar{t}_{oo}, \qquad \hat{\hat\gamma} = \frac{\sum_i\sum_j (y_{ij} - \bar{y}_{oo})(t_{ij} - \bar{t}_{oo})}{\sum_i\sum_j (t_{ij} - \bar{t}_{oo})^2}, \qquad \hat{\hat\mu}_{ij} = \hat{\hat\beta} + \hat{\hat\gamma}\, t_{ij},$$
and
$$\sum_i\sum_j (y_{ij} - \hat{\hat\mu}_{ij})^2 = \sum_i\sum_j (y_{ij} - \bar{y}_{oo})^2 - \frac{\Big[\sum_i\sum_j (y_{ij} - \bar{y}_{oo})(t_{ij} - \bar{t}_{oo})\Big]^2}{\sum_i\sum_j (t_{ij} - \bar{t}_{oo})^2},$$
$$\sum_i\sum_j (\hat\mu_{ij} - \hat{\hat\mu}_{ij})^2 = \sum_i\sum_j \big[(\bar{y}_{io} - \bar{y}_{oo}) + \hat\gamma (t_{ij} - \bar{t}_{io}) - \hat{\hat\gamma}(t_{ij} - \bar{t}_{oo})\big]^2.$$

The likelihood ratio test statistic in this case is given by
$$\lambda = \frac{\max_{\omega} L(\beta, \gamma, \sigma^2)}{\max_{\Omega} L(\beta, \gamma, \sigma^2)},$$
which leads to a test based on the ratio
$$\frac{\sum_i\sum_j (\hat\mu_{ij} - \hat{\hat\mu}_{ij})^2}{\sum_i\sum_j (y_{ij} - \hat\mu_{ij})^2}.$$
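A companion sketch (same hypothetical data and assumptions as above) computes the restricted estimates under $H_0$ and the two sums of squares entering this ratio:

```python
import numpy as np

# Sketch: restricted (H0) estimates and the residual sums of squares.
y = np.array([10., 12., 11., 15., 17., 16.])
t = np.array([ 2.,  3., 2.5,  4.,  5., 4.5])
group = np.array([0, 0, 0, 1, 1, 1])

# Full model Omega: separate beta_i, common gamma (see previous sketch).
y_io = np.array([y[group == g].mean() for g in group])
t_io = np.array([t[group == g].mean() for g in group])
gamma_hat = np.sum((y - y_io) * (t - t_io)) / np.sum((t - t_io) ** 2)
mu_hat = y_io + gamma_hat * (t - t_io)

# Restricted model under H0: common beta, i.e. simple regression of y on t.
gamma_hh = np.sum((y - y.mean()) * (t - t.mean())) / np.sum((t - t.mean()) ** 2)
beta_hh = y.mean() - gamma_hh * t.mean()
mu_hh = beta_hh + gamma_hh * t

sse_full = np.sum((y - mu_hat) ** 2)      # sum (y_ij - mu_hat_ij)^2
sse_null = np.sum((y - mu_hh) ** 2)       # sum (y_ij - mu_hathat_ij)^2
ss_num = np.sum((mu_hat - mu_hh) ** 2)    # sum (mu_hat_ij - mu_hathat_ij)^2
print(sse_full, sse_null, ss_num)
```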

Now we use the following theorems.

Theorem 1: Let $Y = (Y_1, Y_2, \ldots, Y_n)'$ follow a multivariate normal distribution $N(\mu, \Sigma)$ with mean vector $\mu$ and positive definite covariance matrix $\Sigma$. Then $Y'AY$ follows a noncentral chi-square distribution with $p$ degrees of freedom and noncentrality parameter $\mu'A\mu$, i.e., $\chi^2(p, \mu'A\mu)$, if and only if $\Sigma A$ is an idempotent matrix of rank $p$.

Theorem 2: Let $Y = (Y_1, Y_2, \ldots, Y_n)'$ follow a multivariate normal distribution $N(\mu, \Sigma)$ with mean vector $\mu$ and positive definite covariance matrix $\Sigma$. Let $Y'A_1Y$ follow $\chi^2(p_1, \mu'A_1\mu)$ and $Y'A_2Y$ follow $\chi^2(p_2, \mu'A_2\mu)$. Then $Y'A_1Y$ and $Y'A_2Y$ are independently distributed if $A_1\Sigma A_2 = 0$.

Theorem 3: Let $Y = (Y_1, Y_2, \ldots, Y_n)'$ follow a multivariate normal distribution $N(\mu, \sigma^2 I)$. Then the maximum likelihood (or least squares) estimator $L\hat\beta$ of an estimable linear parametric function $L\beta$ is independently distributed of $\hat\sigma^2$; $L\hat\beta$ follows $N\big(L\beta, \sigma^2 L(X'X)^{-1}L'\big)$ and $\frac{n\hat\sigma^2}{\sigma^2}$ follows $\chi^2(n - p)$, where $\text{rank}(X) = p$.

Using these theorems on the independence of quadratic forms and dividing the numerator and denominator by their respective degrees of freedom, we have
$$F = \frac{\sum_i\sum_j (\hat\mu_{ij} - \hat{\hat\mu}_{ij})^2/(p-1)}{\sum_i\sum_j (y_{ij} - \hat\mu_{ij})^2/(n-p-1)} \sim F(p-1, n-p-1) \text{ under } H_0.$$
So reject $H_0$ whenever $F \geq F_{1-\alpha}(p-1, n-p-1)$ at the $\alpha$ level of significance.
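Putting the pieces together, a sketch of the resulting test (Python with numpy and scipy assumed for the F quantile; the data are the same hypothetical values used above):

```python
import numpy as np
from scipy import stats

# Sketch: F test for H0: beta_1 = ... = beta_p in the one-way ANCOVA model.
y = np.array([10., 12., 11., 15., 17., 16.])
t = np.array([ 2.,  3., 2.5,  4.,  5., 4.5])
group = np.array([0, 0, 0, 1, 1, 1])
n, p = len(y), len(np.unique(group))

y_io = np.array([y[group == g].mean() for g in group])
t_io = np.array([t[group == g].mean() for g in group])
gamma_hat = np.sum((y - y_io) * (t - t_io)) / np.sum((t - t_io) ** 2)
mu_hat = y_io + gamma_hat * (t - t_io)                 # fitted values under Omega

gamma_hh = np.sum((y - y.mean()) * (t - t.mean())) / np.sum((t - t.mean()) ** 2)
mu_hh = y.mean() + gamma_hh * (t - t.mean())           # fitted values under H0

F = (np.sum((mu_hat - mu_hh) ** 2) / (p - 1)) / (np.sum((y - mu_hat) ** 2) / (n - p - 1))
alpha = 0.05
F_crit = stats.f.ppf(1 - alpha, p - 1, n - p - 1)
print(F, F_crit, F >= F_crit)   # reject H0 if F >= F_{1-alpha}(p-1, n-p-1)
```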

The terms involved in $\lambda$ can be simplified for computational convenience as follows. We can write
$$y_{ij} - \hat{\hat\mu}_{ij} = (y_{ij} - \hat\mu_{ij}) + (\hat\mu_{ij} - \hat{\hat\mu}_{ij}),$$
so that
$$\sum_i\sum_j (y_{ij} - \hat{\hat\mu}_{ij})^2 = \sum_i\sum_j (y_{ij} - \hat\mu_{ij})^2 + \sum_i\sum_j (\hat\mu_{ij} - \hat{\hat\mu}_{ij})^2.$$

For computational convenience, define
$$T_{yy} = \sum_i\sum_j (y_{ij} - \bar{y}_{oo})^2, \quad T_{tt} = \sum_i\sum_j (t_{ij} - \bar{t}_{oo})^2, \quad T_{yt} = \sum_i\sum_j (y_{ij} - \bar{y}_{oo})(t_{ij} - \bar{t}_{oo}),$$
$$E_{yy} = \sum_i\sum_j (y_{ij} - \bar{y}_{io})^2, \quad E_{tt} = \sum_i\sum_j (t_{ij} - \bar{t}_{io})^2, \quad E_{yt} = \sum_i\sum_j (y_{ij} - \bar{y}_{io})(t_{ij} - \bar{t}_{io}).$$

Then
$$\sum_i\sum_j (y_{ij} - \hat{\hat\mu}_{ij})^2 = T_{yy} - \frac{T_{yt}^2}{T_{tt}}, \qquad \sum_i\sum_j (y_{ij} - \hat\mu_{ij})^2 = E_{yy} - \frac{E_{yt}^2}{E_{tt}},$$
and
$$\frac{\sum_i\sum_j (\hat\mu_{ij} - \hat{\hat\mu}_{ij})^2}{\sum_i\sum_j (y_{ij} - \hat\mu_{ij})^2} = \frac{\Big(T_{yy} - \dfrac{T_{yt}^2}{T_{tt}}\Big) - \Big(E_{yy} - \dfrac{E_{yt}^2}{E_{tt}}\Big)}{E_{yy} - \dfrac{E_{yt}^2}{E_{tt}}}.$$
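The computational form can be checked with the same hypothetical data; the sketch below (Python/numpy) builds the $T$ and $E$ sums of squares and cross-products and reproduces the ratio used in the F-statistic:

```python
import numpy as np

# Sketch: computational form using total (T) and within-treatment (E)
# sums of squares and cross-products.
y = np.array([10., 12., 11., 15., 17., 16.])
t = np.array([ 2.,  3., 2.5,  4.,  5., 4.5])
group = np.array([0, 0, 0, 1, 1, 1])

y_io = np.array([y[group == g].mean() for g in group])
t_io = np.array([t[group == g].mean() for g in group])

T_yy = np.sum((y - y.mean()) ** 2)
T_tt = np.sum((t - t.mean()) ** 2)
T_yt = np.sum((y - y.mean()) * (t - t.mean()))
E_yy = np.sum((y - y_io) ** 2)
E_tt = np.sum((t - t_io) ** 2)
E_yt = np.sum((y - y_io) * (t - t_io))

sse_full = E_yy - E_yt ** 2 / E_tt          # sum (y_ij - mu_hat_ij)^2
sse_null = T_yy - T_yt ** 2 / T_tt          # sum (y_ij - mu_hathat_ij)^2
ratio = (sse_null - sse_full) / sse_full    # numerator / denominator of the test
print(ratio)
```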

The analysis of covariance table for the one-way classification is as follows: