Lecture 07 Hypothesis Testing with Multivariate Regression

23 September 2015
Taylor B. Arnold, Yale Statistics STAT 312/612

Goals for today
1. Review of assumptions and properties of the linear model
2. The multivariate T-test
3. Hypothesis test simulations
4. The multivariate F-test

Review from Last Time

Classical linear model assumptions
I. Linearity: $Y = X\beta + \epsilon$
II. Strict exogeneity: $E(\epsilon \mid X) = 0$
III. No multicollinearity: $P[\operatorname{rank}(X) = p] = 1$
IV. Spherical errors: $V(\epsilon \mid X) = \sigma^2 I_n$
V. Normality: $\epsilon \mid X \sim N(0, \sigma^2 I_n)$

Finite sample properties
Under assumptions I-III:
(A) $E(\hat\beta \mid X) = \beta$
Under assumptions I-IV:
(B) $V(\hat\beta \mid X) = \sigma^2 (X^t X)^{-1}$
(C) $\hat\beta$ is the best linear unbiased estimator (Gauss-Markov)
(D) $\operatorname{Cov}(\hat\beta, r \mid X) = 0$
(E) $E(s^2 \mid X) = \sigma^2$
Under assumptions I-V:
(F) $\hat\beta$ achieves the Cramér-Rao lower bound
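Properties (A) and (B) are easy to check by Monte Carlo. Below is a minimal Python sketch, not from the lecture; the design matrix, coefficients, and noise level are arbitrary choices. It holds $X$ fixed and redraws the errors:

```python
import numpy as np

rng = np.random.default_rng(0)
n, p, sigma = 100, 3, 2.0
X = np.column_stack([np.ones(n), rng.normal(size=(n, p - 1))])
beta = np.array([1.0, -2.0, 0.5])

# Repeatedly draw new errors (X held fixed) and re-estimate beta
betas = []
for _ in range(5000):
    y = X @ beta + sigma * rng.normal(size=n)
    betas.append(np.linalg.solve(X.T @ X, X.T @ y))
betas = np.array(betas)

print(betas.mean(axis=0))                 # approx. beta        (property A)
print(np.cov(betas.T))                    # empirical covariance
print(sigma**2 * np.linalg.inv(X.T @ X))  # sigma^2 (X'X)^{-1}  (property B)
```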

The T-test

Hypothesis tests
Consider testing the hypothesis $H_0: \beta_j = b_j$. Under assumptions I-V we have the following:
$$\hat\beta_j - b_j \mid X, H_0 \sim N\left(0, \; \sigma^2 \left((X^t X)^{-1}\right)_{jj}\right)$$

Z-test
This suggests the following test statistic:
$$z = \frac{\hat\beta_j - b_j}{\sqrt{\sigma^2 \left((X^t X)^{-1}\right)_{jj}}}$$
with $z \mid X, H_0 \sim N(0, 1)$.

T-test
As in the simple linear regression case, we generally need to estimate $\sigma^2$ with $s^2$. This yields the following test statistic:
$$t = \frac{\hat\beta_j - b_j}{\sqrt{s^2 \left((X^t X)^{-1}\right)_{jj}}} = \frac{\hat\beta_j - b_j}{\operatorname{S.E.}(\hat\beta_j)}$$

T-test
The test statistic has a T-distribution with $(n-p)$ degrees of freedom under the null hypothesis:
$$t \mid X, H_0 \sim t_{n-p}$$
This time around, we'll actually prove this.
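Before the proof, here is what the test looks like computationally. A minimal Python sketch; the function name and variable names are mine, not from the course materials:

```python
import numpy as np
from scipy import stats

def t_test(X, y, j, b_j=0.0):
    """t statistic and two-sided p-value for H0: beta_j = b_j."""
    n, p = X.shape
    XtX_inv = np.linalg.inv(X.T @ X)
    beta_hat = XtX_inv @ X.T @ y
    r = y - X @ beta_hat              # residuals
    s2 = r @ r / (n - p)              # unbiased estimate of sigma^2
    se = np.sqrt(s2 * XtX_inv[j, j])  # standard error of beta_hat_j
    t = (beta_hat[j] - b_j) / se
    p_value = 2 * stats.t.sf(abs(t), df=n - p)
    return t, p_value
```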

To start, we re-write the test statistic as:
$$t = \frac{\hat\beta_j - b_j}{\sqrt{\sigma^2 \left((X^t X)^{-1}\right)_{jj}}} \cdot \sqrt{\frac{\sigma^2}{s^2}} = \frac{z}{\sqrt{s^2/\sigma^2}} = \frac{z}{\sqrt{q/(n-p)}}$$
where $q = r^t r / \sigma^2$. We need to show that (1) $q \mid X \sim \chi^2_{n-p}$ and (2) $z \perp q \mid X$.

Lemma: If $B$ is a symmetric idempotent matrix and $u \sim N(0, I_n)$, then $u^t B u \sim \chi^2_{\operatorname{tr}(B)}$.

Proof: The symmetric matrix $B$ can be written via its eigendecomposition, for some orthogonal matrix $Q$ and diagonal matrix $\Lambda$:
$$B = Q^t \Lambda Q$$

Notice that, as $B$ is idempotent:
$$Q^t \Lambda Q = (Q^t \Lambda Q)(Q^t \Lambda Q) = Q^t \Lambda Q Q^t \Lambda Q = Q^t \Lambda^2 Q$$
Since $Q^t = Q^{-1}$:
$$\Lambda = \Lambda^2$$
Therefore all the elements of $\Lambda$ are 0 or 1.

By reordering the columns of $Q$, we can actually write:
$$\Lambda = \begin{pmatrix} I_{\operatorname{rank}(B)} & 0 \\ 0 & 0 \end{pmatrix}$$
Notice that this also shows that:
$$\operatorname{tr}(B) = \operatorname{tr}(Q^t \Lambda Q) = \operatorname{tr}(\Lambda Q Q^t) = \operatorname{tr}(\Lambda) = \operatorname{rank}(B)$$

Now let $v = Qu$. We can see that the mean of $v$ is zero:
$$E v = E(Qu) = Q\,Eu = 0$$
and the variance of $v$ is:
$$E(vv^t) = Q\,E(uu^t)\,Q^t = Q I_n Q^t = I_n$$

Now we calculate the quadratic form $u^t B u$ in terms of $v$. Substituting $u = Q^{-1} v = Q^t v$:
$$u^t B u = v^t Q B Q^t v = v^t Q (Q^t \Lambda Q) Q^t v = v^t Q Q^t \Lambda Q Q^t v = v^t \Lambda v = v^t \begin{pmatrix} I_{\operatorname{tr}(B)} & 0 \\ 0 & 0 \end{pmatrix} v = \sum_{i=1}^{\operatorname{tr}(B)} v_i^2 \sim \chi^2_{\operatorname{tr}(B)}$$
which completes the lemma.
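The lemma is also easy to check numerically: the hat matrix of any full-rank design is symmetric and idempotent with trace $p$. A hedged sketch of such a check, assuming scipy is available (the specific dimensions and seed are arbitrary):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
n, p = 50, 4
X = rng.normal(size=(n, p))
B = X @ np.linalg.inv(X.T @ X) @ X.T   # symmetric idempotent, tr(B) = p

# Draw u ~ N(0, I_n) repeatedly and form the quadratic form u'Bu
draws = np.array([u @ B @ u for u in rng.normal(size=(20000, n))])

print(np.trace(B))                     # approx. p = 4
# Compare the draws against the chi^2_p distribution
print(stats.kstest(draws, stats.chi2(df=p).cdf))
```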

(1) We know that $r^t r = \epsilon^t M \epsilon$, so:
$$q = \frac{r^t r}{\sigma^2} = \frac{\epsilon^t}{\sigma} M \frac{\epsilon}{\sigma}$$
Under Assumption V, $\epsilon/\sigma \mid X \sim N(0, I_n)$. Therefore, from the lemma, $q \sim \chi^2_{\operatorname{tr}(M)}$; last time we showed that $\operatorname{tr}(M) = n - p$, which finishes the first part of the proof.

(2) The random variables $\hat\beta$ and $r$ are both linear combinations of $\epsilon$, and are therefore jointly normally distributed. As they are uncorrelated (problem set 2), this implies that they are independent. Finally, since $z = f(\hat\beta)$ and $q = g(r)$, these are themselves independent.

So, therefore, we have:
$$t = \frac{\hat\beta_j - b_j}{\sqrt{\sigma^2 \left((X^t X)^{-1}\right)_{jj}}} \cdot \sqrt{\frac{\sigma^2}{s^2}} = \frac{z}{\sqrt{q/(n-p)}} \sim t_{n-p}$$

Simulation
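The transcript does not preserve the simulation code or plots from this section. As a stand-in, here is a hedged sketch of the kind of experiment the slides point to: drawing repeated datasets under a true null and checking that the t statistic rejects at roughly its nominal rate. All parameter values below are my own choices.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
n, p, sigma, j = 30, 3, 1.5, 1
X = np.column_stack([np.ones(n), rng.normal(size=(n, p - 1))])
beta = np.array([1.0, 0.0, 0.5])      # H0: beta_1 = 0 is true
XtX_inv = np.linalg.inv(X.T @ X)

t_stats = []
for _ in range(10000):
    y = X @ beta + sigma * rng.normal(size=n)
    beta_hat = XtX_inv @ X.T @ y
    r = y - X @ beta_hat
    s2 = r @ r / (n - p)
    t_stats.append(beta_hat[j] / np.sqrt(s2 * XtX_inv[j, j]))

# Rejection rate at the 5% level should be close to 0.05
crit = stats.t.ppf(0.975, df=n - p)
print(np.mean(np.abs(t_stats) > crit))
```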

F-Test

F-test
Now consider the hypothesis test $H_0: D\beta = d$ for a matrix $D$ with $k$ rows and rank $k$.

We'll form the following test statistic:
$$F = \frac{(D\hat\beta - d)^t \left[D (X^t X)^{-1} D^t\right]^{-1} (D\hat\beta - d) / k}{s^2}$$
and prove that it has an F-distribution with $k$ and $n-p$ degrees of freedom.
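Computed directly from this formula, the test looks as follows. A minimal Python sketch; the helper name f_test is mine, not from the course materials:

```python
import numpy as np
from scipy import stats

def f_test(X, y, D, d):
    """F statistic and p-value for H0: D beta = d, D of full row rank k."""
    n, p = X.shape
    k = D.shape[0]
    XtX_inv = np.linalg.inv(X.T @ X)
    beta_hat = XtX_inv @ X.T @ y
    r = y - X @ beta_hat
    s2 = r @ r / (n - p)
    v = D @ beta_hat - d
    # v' [D (X'X)^{-1} D']^{-1} v / (k s^2)
    F = v @ np.linalg.solve(D @ XtX_inv @ D.T, v) / (k * s2)
    return F, stats.f.sf(F, k, n - p)
```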

We can re-write the test statistic as:
$$F = \frac{w/k}{q/(n-p)}$$
where:
$$w = (D\hat\beta - d)^t \left[\sigma^2 D (X^t X)^{-1} D^t\right]^{-1} (D\hat\beta - d), \qquad q = r^t r / \sigma^2$$
We've already shown that $q \mid X \sim \chi^2_{n-p}$, and can see that $q \perp w$ by the same argument as before. All that is left is to show that $w \sim \chi^2_k$.

Let $v = D\hat\beta - d$. Under the null hypothesis, $v = D(\hat\beta - \beta)$. Therefore:
$$V(v \mid X) = V(D(\hat\beta - \beta) \mid X) = D\, V(\hat\beta - \beta \mid X)\, D^t = \sigma^2 D (X^t X)^{-1} D^t$$
Now, for the multivariate normally distributed $v$ with zero mean, we have:
$$w = v^t\, V(v \mid X)^{-1}\, v$$

Now, decompose $V(v \mid X)^{-1} = \Sigma^{-1}$ as $Q^t \Lambda Q$. Notice that this implies:
$$\Sigma^{-1} = Q^t \Lambda^{1/2} \Lambda^{1/2} Q$$
$$Q \Sigma^{-1} = \Lambda^{1/2} \Lambda^{1/2} Q$$
$$\Lambda^{-1/2} Q \Sigma^{-1} = \Lambda^{1/2} Q$$

From here, we can write $w$ as the inner product of a suitably defined vector $u = \Lambda^{1/2} Q v$:
$$w = v^t \Sigma^{-1} v = v^t Q^t \Lambda^{1/2} \Lambda^{1/2} Q v = u^t u$$
with a mean of zero:
$$E(u \mid X) = E(\Lambda^{1/2} Q v \mid X) = \Lambda^{1/2} Q\, E(v \mid X) = 0$$

and a variance of:
$$E(uu^t \mid X) = E(\Lambda^{1/2} Q v v^t Q^t \Lambda^{1/2} \mid X) = \Lambda^{1/2} Q\, E(vv^t \mid X)\, Q^t \Lambda^{1/2} = \Lambda^{1/2} Q \Sigma Q^t \Lambda^{1/2} = \Lambda^{-1/2} Q \Sigma^{-1} \Sigma Q^t \Lambda^{1/2} = \Lambda^{-1/2} \Lambda^{1/2} = I_k$$
And, finally:
$$w = u^t u \sim \chi^2_k$$
which finishes the proof.

An alternative F-test
Now, consider the following estimator:
$$\tilde\beta = \arg\min_b \|y - Xb\|_2^2, \quad \text{s.t. } Db = d$$
We then define the restricted residuals $\tilde r = y - X\tilde\beta$.

An alternative F-test
An alternative expression for the F-test statistic is:
$$F = \frac{(\tilde r^t \tilde r - r^t r)/k}{r^t r/(n-p)}$$
Conceptually, it should make sense that this is large whenever the null hypothesis is false: imposing a false restriction inflates the residual sum of squares.

An alternative F-test
If we let $SSR_U$ be the sum of squared residuals of the unrestricted model ($r^t r$) and $SSR_R$ be the sum of squared residuals of the restricted model ($\tilde r^t \tilde r$), then this can be re-written as:
$$F = \frac{(SSR_R - SSR_U)/k}{SSR_U/(n-p)}$$
This is the way it is written in the homework and in the Fumio Hayashi text. It is left for you to prove that this is equivalent to the other F-test.
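One way to convince yourself of the equivalence before proving it is to check it numerically. The sketch below is my own construction, using the standard closed form for equality-constrained least squares to obtain the restricted estimator; the data-generating setup is arbitrary:

```python
import numpy as np

rng = np.random.default_rng(3)
n, p, k = 40, 4, 2
X = np.column_stack([np.ones(n), rng.normal(size=(n, p - 1))])
y = X @ np.array([1.0, 0.0, 0.0, 2.0]) + rng.normal(size=n)
D = np.eye(p)[1:3]        # H0: beta_1 = beta_2 = 0
d = np.zeros(k)

XtX_inv = np.linalg.inv(X.T @ X)
beta_hat = XtX_inv @ X.T @ y
# Restricted estimator via the closed form
# beta_til = beta_hat - (X'X)^{-1} D' [D (X'X)^{-1} D']^{-1} (D beta_hat - d)
A = D @ XtX_inv @ D.T
beta_til = beta_hat - XtX_inv @ D.T @ np.linalg.solve(A, D @ beta_hat - d)

v = D @ beta_hat - d
ssr_u = np.sum((y - X @ beta_hat) ** 2)
ssr_r = np.sum((y - X @ beta_til) ** 2)
s2 = ssr_u / (n - p)

F_quad = v @ np.linalg.solve(A, v) / (k * s2)
F_ssr = ((ssr_r - ssr_u) / k) / (ssr_u / (n - p))
print(F_quad, F_ssr)      # the two values agree
```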

Application