Economics 620, Lecture 4: The K-Variable Linear Model I


Nicholas M. Kiefer, Cornell University

Consider the system
$$y_1 = \alpha + \beta x_1 + \varepsilon_1$$
$$y_2 = \alpha + \beta x_2 + \varepsilon_2$$
$$\vdots$$
$$y_N = \alpha + \beta x_N + \varepsilon_N$$
or, in matrix form, $y = X\beta + \varepsilon$, where $y$ is $N \times 1$, $X$ is $N \times 2$, $\beta$ is $2 \times 1$, and $\varepsilon$ is $N \times 1$.

K-Variable Linear Model

$$X = \begin{bmatrix} 1 & x_1 \\ 1 & x_2 \\ \vdots & \vdots \\ 1 & x_N \end{bmatrix}, \qquad \beta = \begin{bmatrix} \alpha \\ \beta \end{bmatrix}$$

Good statistical practice requires inclusion of the column of ones.

Consider the general model $y = X\beta + \varepsilon$. Convention: $y$ is $N \times 1$, $X$ is $N \times K$, $\beta$ is $K \times 1$, and $\varepsilon$ is $N \times 1$.

$$X = \begin{bmatrix} 1 & x_{21} & \cdots & x_{K1} \\ 1 & x_{22} & \cdots & x_{K2} \\ \vdots & \vdots & & \vdots \\ 1 & x_{2N} & \cdots & x_{KN} \end{bmatrix}, \qquad \beta = \begin{bmatrix} \beta_1 \\ \beta_2 \\ \vdots \\ \beta_K \end{bmatrix}$$

More on the Linear Model

A typical row looks like:
$$y_i = \beta_1 + \beta_2 x_{2i} + \beta_3 x_{3i} + \cdots + \beta_K x_{Ki} + \varepsilon_i$$

The Least Squares Method. First assumption: $Ey = X\beta$.
$$S(b) = (y - Xb)'(y - Xb) = y'y - 2b'X'y + b'X'Xb$$

Normal equations: $X'X\hat\beta - X'y = 0$. These equations always have a solution (clear from the geometry to come). If $X'X$ is invertible, $\hat\beta = (X'X)^{-1}X'y$.
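As a concrete illustration of the estimator just derived, here is a minimal numpy sketch on simulated data (the design and coefficient values are hypothetical, chosen only for the example); it solves the normal equations directly rather than forming $(X'X)^{-1}$.

```python
import numpy as np

# Minimal sketch: beta_hat solves the normal equations X'X b = X'y.
rng = np.random.default_rng(0)
N, K = 100, 3
X = np.column_stack([np.ones(N), rng.normal(size=(N, K - 1))])  # column of ones included
beta = np.array([1.0, 2.0, -0.5])                               # hypothetical true coefficients
y = X @ beta + rng.normal(size=N)

beta_hat = np.linalg.solve(X.T @ X, X.T @ y)  # solve rather than invert X'X

e = y - X @ beta_hat   # residuals
print(beta_hat)        # close to (1.0, 2.0, -0.5)
print(X.T @ e)         # numerically zero: X'e = 0, as shown on the next slide
```

Solving the normal equations (or using a QR-based routine such as `np.linalg.lstsq`) is numerically preferable to inverting $X'X$ explicitly.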

More on the Linear Model

Proposition: $\hat\beta$ is a minimizer.
Proof: Let $b$ be any other $K$-vector.
$$(y - Xb)'(y - Xb) = (y - X\hat\beta + X(\hat\beta - b))'(y - X\hat\beta + X(\hat\beta - b))$$
$$= (y - X\hat\beta)'(y - X\hat\beta) + (\hat\beta - b)'X'X(\hat\beta - b) \geq (y - X\hat\beta)'(y - X\hat\beta).$$
(Why?)

Definition: $e = y - X\hat\beta$ is the vector of residuals. Note: $Ee = 0$ and $X'e = 0$.

Proposition: The LS estimator is unbiased.
Proof: $E\hat\beta = E[(X'X)^{-1}X'y] = E[(X'X)^{-1}X'(X\beta + \varepsilon)] = \beta + (X'X)^{-1}X'E\varepsilon = \beta.$

Geometry of Least Squares

Consider $y = X\beta + \varepsilon$ with
$$y = \begin{bmatrix} y_1 \\ y_2 \end{bmatrix}, \qquad X = \begin{bmatrix} x_1 \\ x_2 \end{bmatrix}.$$

Definition: The space spanned by the matrix $X$ is the vector space which consists of all linear combinations of the column vectors of $X$.

Definition: $X(X'X)^{-1}X'y$ is the orthogonal projection of $y$ onto the space spanned by $X$.

Proposition: $e$ is perpendicular to $X$, i.e., $X'e = 0$.
Proof: $e = y - X\hat\beta = y - X(X'X)^{-1}X'y = (I - X(X'X)^{-1}X')y \Rightarrow X'e = (X' - X')y = 0.$

Geometry of Least Squares (cont'd)

Thus the equation $y = X\hat\beta + e$ gives $y$ as the sum of a vector in $R[X]$ and a vector in $N[X']$.

Common (friendly) projection matrices:

1. The matrix which projects onto the space orthogonal to the space spanned by $X$ (i.e., onto $N[X']$) is $M = I - X(X'X)^{-1}X'$. Note: $e = My$. If $X$ has full column rank, $M$ has rank $N - K$.

2. The matrix which projects onto the space spanned by $X$ is $I - M = X(X'X)^{-1}X'$. Note: $\hat y = y - e = y - My = (I - M)y$. If $X$ has full column rank, $(I - M)$ has rank $K$.
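The two projection matrices are easy to verify numerically. The sketch below (reusing the hypothetical simulated design from the earlier example) builds $M$ and $I - M$ and checks that $e = My$ and $\hat y = (I - M)y$.

```python
import numpy as np

# Build the projection matrices and verify e = My and y_hat = (I - M)y.
rng = np.random.default_rng(0)
N, K = 100, 3
X = np.column_stack([np.ones(N), rng.normal(size=(N, K - 1))])
y = X @ np.array([1.0, 2.0, -0.5]) + rng.normal(size=N)

P = X @ np.linalg.solve(X.T @ X, X.T)  # I - M: projects onto the span of X
M = np.eye(N) - P                      # M: projects onto N[X']

beta_hat = np.linalg.solve(X.T @ X, X.T @ y)
print(np.allclose(y - X @ beta_hat, M @ y))  # True: e = My
print(np.allclose(X @ beta_hat, P @ y))      # True: y_hat = (I - M)y
```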

Example in $R^2$

$$y_i = \beta x_i + \varepsilon_i, \qquad i = 1, 2$$

What is the case of singular $X'X$?

Properties of projection matrices

1. Projection matrices are idempotent, e.g. $(I - M)(I - M) = (I - M)$.
Proof: $(I - M)(I - M) = (X(X'X)^{-1}X')(X(X'X)^{-1}X') = X(X'X)^{-1}X' = (I - M).$

2. Idempotent matrices have eigenvalues equal to zero or one.
Proof: Consider the characteristic equation $Mz = \lambda z \Rightarrow M^2 z = \lambda Mz = \lambda^2 z$. Since $M$ is idempotent, $M^2 z = Mz$. Thus $\lambda^2 z = \lambda z$, which implies that $\lambda$ is either 0 or 1.
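Both properties can be confirmed on a small example. A sketch with a hypothetical $6 \times 2$ design: the eigenvalues of $M$ come out as four ones and two zeros, anticipating the trace = rank result on the next slide.

```python
import numpy as np

# Eigenvalues of a projection matrix are 0 or 1 (hypothetical N=6, K=2 design).
rng = np.random.default_rng(1)
X = np.column_stack([np.ones(6), rng.normal(size=6)])
M = np.eye(6) - X @ np.linalg.solve(X.T @ X, X.T)

print(np.allclose(M @ M, M))                 # True: M is idempotent
print(np.round(np.linalg.eigvalsh(M), 10))   # M symmetric: four ones, two zeros
print(round(np.trace(M)))                    # 4 = N - K = rank of M
```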

Properties of projection matrices

3. The number of nonzero eigenvalues of a matrix is equal to its rank. $\Rightarrow$ For idempotent matrices, trace = rank.

More assumptions for the K-variable linear model:

Second assumption: $V(y) = V(\varepsilon) = \sigma^2 I_N$, where $y$ and $\varepsilon$ are $N$-vectors.

With this assumption, we can obtain the sampling variance of $\hat\beta$.

Proposition: $V(\hat\beta) = \sigma^2(X'X)^{-1}$
Proof:
$$\hat\beta = (X'X)^{-1}X'y = (X'X)^{-1}X'X\beta + (X'X)^{-1}X'\varepsilon,$$
hence
$$\hat\beta = \beta + (X'X)^{-1}X'\varepsilon.$$

(cont'd)

$$V(\hat\beta) = E(\hat\beta - E\hat\beta)(\hat\beta - E\hat\beta)' = E[(X'X)^{-1}X'\varepsilon\varepsilon'X(X'X)^{-1}]$$
$$V(\hat\beta) = (X'X)^{-1}X'(E\varepsilon\varepsilon')X(X'X)^{-1} = \sigma^2(X'X)^{-1}$$

Gauss-Markov Theorem: The LS estimator is BLUE.
Proof: Consider estimating $c'\beta$ for some $c$. A possible estimator is $c'\hat\beta$, with variance $\sigma^2 c'(X'X)^{-1}c$. An alternative linear unbiased estimator: $b = a'y$. $Eb = a'Ey = a'X\beta$. Since both $c'\hat\beta$ and $b$ are unbiased, $a'X = c'$.

Gauss-Markov Theorem (cont'd)

Thus, $b = a'y = a'(X\beta + \varepsilon) = a'X\beta + a'\varepsilon = c'\beta + a'\varepsilon$. Hence, $V(b) = \sigma^2 a'a$.

Now, $V(c'\hat\beta) = \sigma^2 a'X(X'X)^{-1}X'a$ since $c' = a'X$.

So $V(b) - V(c'\hat\beta) = \sigma^2 a'Ma$, which is positive semidefinite. Hence, $V(b) \geq V(c'\hat\beta)$.
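The variance formula can be checked by simulation: hold $X$ fixed, redraw $\varepsilon$ many times, and compare the sampling covariance of $\hat\beta$ with $\sigma^2(X'X)^{-1}$. A sketch under hypothetical parameter values:

```python
import numpy as np

# Monte Carlo check of V(beta_hat) = sigma^2 (X'X)^{-1} with X held fixed.
rng = np.random.default_rng(2)
N, K, sigma = 50, 2, 1.5
X = np.column_stack([np.ones(N), rng.normal(size=N)])
beta = np.array([1.0, 2.0])

draws = np.empty((5000, K))
for r in range(5000):
    y = X @ beta + sigma * rng.normal(size=N)
    draws[r] = np.linalg.solve(X.T @ X, X.T @ y)

print(np.cov(draws, rowvar=False))          # simulated sampling covariance
print(sigma**2 * np.linalg.inv(X.T @ X))    # theoretical sigma^2 (X'X)^{-1}
```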

Estimation of Variance

Proposition: $s^2 = e'e/(N - K)$ is an unbiased estimator for $\sigma^2$.
Proof: $e = y - X\hat\beta = My = M\varepsilon \Rightarrow e'e = \varepsilon'M\varepsilon$
$$Ee'e = E\varepsilon'M\varepsilon = E\,\mathrm{tr}\,\varepsilon'M\varepsilon \text{ (Why?)}$$
$$= \mathrm{tr}\,E\varepsilon'M\varepsilon = \mathrm{tr}\,EM\varepsilon\varepsilon' \text{ (important trick)}$$
$$= \mathrm{tr}\,M E\varepsilon\varepsilon' = \sigma^2\,\mathrm{tr}\,M = \sigma^2(N - K)$$
$\Rightarrow s^2 = e'e/(N - K)$ is unbiased for $\sigma^2$.
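The division by $N - K$ rather than $N$ is what delivers unbiasedness, and a quick Monte Carlo (hypothetical parameter values again) makes this visible:

```python
import numpy as np

# Monte Carlo check that s^2 = e'e/(N - K) is unbiased for sigma^2.
rng = np.random.default_rng(3)
N, K, sigma = 30, 3, 2.0
X = np.column_stack([np.ones(N), rng.normal(size=(N, K - 1))])
beta = np.array([1.0, -1.0, 0.5])

s2 = np.empty(20000)
for r in range(20000):
    y = X @ beta + sigma * rng.normal(size=N)
    e = y - X @ np.linalg.solve(X.T @ X, X.T @ y)
    s2[r] = e @ e / (N - K)   # dividing by N instead would bias s^2 downward

print(s2.mean(), sigma**2)    # mean of s^2 is close to sigma^2 = 4.0
```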

Fit: Does the Regression Model Explain the Data?

We will need the useful idempotent matrix
$$A = I - \mathbf{1}(\mathbf{1}'\mathbf{1})^{-1}\mathbf{1}' = I - \mathbf{1}\mathbf{1}'/N,$$
which sweeps out means. Here $\mathbf{1}$ is an $N$-vector of ones. Note that $AM = M$ when $X$ contains a constant term.

Definition: The correlation coefficient in the K-variable case is
$$R^2 = \frac{\text{sum of squares due to } X}{\text{total sum of squares}} = 1 - \frac{e'e}{y'Ay}.$$

More Fit

Using $A$,
$$y'Ay = \sum_{i=1}^{N}(y_i - \bar y)^2$$
$$y'Ay = (Ay)'(Ay) = (A\hat y + Ae)'(A\hat y + Ae) = \hat y'A\hat y + e'Ae \quad \text{since } \hat y'e = 0$$
Thus, $y'Ay = \hat y'A\hat y + e'e$ since $Ae = e$.

Scaling yields:
$$1 = \frac{\hat y'A\hat y}{y'Ay} + \frac{e'e}{y'Ay}$$

What are the two terms of this split-up?

More Fit

$R^2$ gives the fraction of variation explained by $X$:
$$R^2 = 1 - \frac{e'e}{y'Ay}.$$

Note: The adjusted squared correlation coefficient is given by
$$\bar R^2 = 1 - \frac{e'e/(N - K)}{y'Ay/(N - 1)}.$$
(Why might this be preferable?)
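Both measures are one-liners once the residuals are in hand. A sketch on the same kind of hypothetical simulated data as above:

```python
import numpy as np

# Compute R^2 = 1 - e'e / y'Ay and the adjusted version.
rng = np.random.default_rng(4)
N, K = 80, 3
X = np.column_stack([np.ones(N), rng.normal(size=(N, K - 1))])
y = X @ np.array([1.0, 2.0, -0.5]) + rng.normal(size=N)

e = y - X @ np.linalg.solve(X.T @ X, X.T @ y)
yAy = np.sum((y - y.mean()) ** 2)   # y'Ay: total sum of squares about the mean

R2 = 1 - e @ e / yAy
R2_adj = 1 - (e @ e / (N - K)) / (yAy / (N - 1))
print(R2, R2_adj)                   # adjusted R^2 penalizes extra regressors
```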

Reporting

Always report characteristics of the sample, i.e. means, standard deviations, anything unusual or surprising, how the data set was collected, and how the sample was selected.

Report $\hat\beta$ and standard errors (not t-statistics). The usual format is $\hat\beta$ (s.e. of $\hat\beta$).

Specify $s^2$ or $\hat\sigma^2_{ML}$. Report $N$ and $R^2$.

Plots are important. For example, predicted vs. actual values, or predicted and actual values over time in time-series studies, should be presented.

Comments on Linearity

Consider the following argument: Economic functions don't change suddenly. Therefore they are continuous. Thus they are differentiable and hence nearly linear by Taylor's Theorem.

This argument is false (but irrelevant).

NOTE ON THE GAUSS-MARKOV THEOREM

Consider estimation of a mean $\mu$ based on an observation $X$. Assume $X \sim F$ and
$$\int x\,dF = \mu, \qquad \int x^2\,dF = \mu^2 + \sigma^2. \qquad (*)$$

Suppose that the estimator for $\mu$ is $h(X)$. Unbiasedness implies that
$$\int h(x)\,dF = \mu.$$

Theorem: The only function $h$ unbiased for all $F$ satisfying $(*)$ is $h(x) = x$.

Proof: Let $h(x) = x + \delta(x)$. Then $\int \delta(x)\,dF = 0$ for all $F$. Suppose, on the contrary, that $\delta(x) > 0$ on some set $A$. Let $\alpha \in A$ and let $F$ be the distribution assigning point mass to $x = \alpha$. Then $(*)$ is satisfied and
$$\int h(x)\,dF = \alpha + \delta(\alpha) \neq \alpha,$$
which is a contradiction. The argument is the same if $\delta(x) < 0$. This shows that $\delta(x) = 0$.

The logic is that if the estimator is nonlinear, we can choose a distribution so that it is biased.
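The proof's trick can be seen numerically: under one $F$ a nonlinear rule can look unbiased, but a point-mass $F$ exposes the bias. The rule $h$ and the point $\alpha$ below are hypothetical choices for illustration.

```python
import numpy as np

# A nonlinear rule h(x) = x + sin(x): unbiased under N(0,1) by symmetry,
# but biased under a point mass at alpha whenever sin(alpha) != 0.
def h(x):
    return x + np.sin(x)

rng = np.random.default_rng(5)
x = rng.normal(size=1_000_000)   # F = N(0,1): mu = 0 and E sin(X) = 0
print(h(x).mean())               # approximately 0 = mu

alpha = 1.0                      # F = point mass at alpha: mu = alpha
print(h(alpha), alpha)           # 1.841... != 1.0: h is biased under this F
```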