Econometrics I. Professor William Greene Stern School of Business Department of Economics 3-1/29. Part 3: Least Squares Algebra


Econometrics I Professor William Greene Stern School of Business Department of Economics 3-1/29

Econometrics I Part 3 Least Squares Algebra 3-2/29

Vocabulary Some terms to be used in the discussion: population characteristics and entities vs. sample quantities and analogs; residuals and disturbances; population regression line and sample regression line. Objective: learn about the conditional mean function; estimate β and σ². First step: mechanics of fitting a line (hyperplane) to a set of data. 3-3/29

Fitting Criteria The set of points in the sample. Fitting criteria, what are they?
LAD: minimize over b_LAD the sum Σ_i |y_i − x_i'b_LAD|.
Least squares: minimize over b_LS the sum Σ_i (y_i − x_i'b_LS)², and so on.
Why least squares? A fundamental result: sample moments are good estimators of their population counterparts. We will examine this principle and apply it to least squares computation. 3-4/29

An Analogy Principle for Estimating β
In the population: E[y | X] = Xβ, so E[y − Xβ | X] = 0.
Continuing (assumed): E[x_i ε_i] = 0 for every i.
Summing: Σ_i E[x_i ε_i] = Σ_i 0 = 0.
Exchange Σ_i and E[·]: E[Σ_i x_i ε_i] = E[X'ε] = 0, that is, E[X'(y − Xβ)] = 0.
So, if Xβ is the conditional mean, then E[X'ε] = 0. We choose b, the estimator of β, to mimic this population result, i.e., mimic the population mean with the sample mean: find b such that (1/n) X'e = (1/n) X'(y − Xb) = 0.
As we will see, the solution is the least squares coefficient vector. 3-5/29

Population Moments We assumed that E[ε_i | x_i] = 0. (Slide 2:40) It follows that Cov[x_i, ε_i] = 0.
Proof: Cov[x_i, ε_i] = Cov[x_i, E[ε_i | x_i]] = Cov[x_i, 0] = 0. (Theorem B.2)
If E[y_i | x_i] = x_i'β, then β = (Var[x_i])^{-1} Cov[x_i, y_i].
Proof: Cov[x_i, y_i] = Cov[x_i, E[y_i | x_i]] = Cov[x_i, x_i'β] = Var[x_i] β.
This will provide a population analog to the statistics we compute with the data. 3-6/29
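
A minimal simulation sketch (not from the slides; the bivariate data-generating process below is an assumption for illustration) showing that the sample analog of (Var[x])^{-1} Cov[x, y] recovers the slope coefficient:

```python
import numpy as np

# Assumed illustrative DGP: y_i = 1 + 0.5 x_i + eps_i
rng = np.random.default_rng(0)
n = 100_000
x = rng.normal(size=n)
y = 1.0 + 0.5 * x + rng.normal(size=n)

# Sample analog of beta = (Var[x])^{-1} Cov[x, y]
beta_hat = np.cov(x, y)[0, 1] / np.var(x, ddof=1)
print(beta_hat)  # close to 0.5 in a large sample
```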

U.S. Gasoline Market, 1960-1995 3-7/29

Least Squares Example will be G_i regressed on x_i = [1, PG_i, Y_i].
Fitting criterion: the fitted equation will be ŷ_i = b_1 x_i1 + b_2 x_i2 + ... + b_K x_iK.
The criterion is based on the residuals: e_i = y_i − b_1 x_i1 − b_2 x_i2 − ... − b_K x_iK.
Make the e_i as small as possible. Form a criterion and minimize it. 3-8/29

Fitting Criteria
Sum of residuals: Σ_{i=1}^n e_i
Sum of squares: Σ_{i=1}^n e_i²
Sum of absolute values of residuals: Σ_{i=1}^n |e_i|
Absolute value of the sum of residuals: |Σ_{i=1}^n e_i|
We focus on Σ_{i=1}^n e_i² now and Σ_{i=1}^n |e_i| later. 3-9/29

Least Squares Algebra
Σ_{i=1}^n e_i² = Σ_{i=1}^n (y_i − x_i'b)² = e'e = (y − Xb)'(y − Xb)
Matrix and vector derivatives: the derivative of a scalar with respect to a vector, the derivative of a column vector with respect to a row vector, and other derivatives. 3-10/29

Least Squares Normal Equations
∂(Σ_{i=1}^n e_i²)/∂b = ∂[Σ_{i=1}^n (y_i − x_i'b)²]/∂b = Σ_{i=1}^n 2(y_i − x_i'b)(−x_i)
= −2 Σ_{i=1}^n x_i y_i + 2 Σ_{i=1}^n x_i x_i' b = −2X'y + 2X'Xb 3-11/29

Least Squares Normal Equations
∂[(y − Xb)'(y − Xb)]/∂b = −2X'(y − Xb) = 0
Dimensions: the object differentiated is (1×1), the derivative is taken with respect to the (K×1) vector b, and (−2)(n×K)'(n×1) = (−2)(K×n)(n×1) = (K×1).
Note: the derivative of a (1×1) scalar with respect to a (K×1) vector is a (K×1) vector.
Solution: −2X'(y − Xb) = 0, so X'y = X'Xb. 3-12/29

Least Squares Solution Assuming it exists: b = (X'X)^{-1}X'y.
Note the analogy: β = [Var(x)]^{-1} Cov(x, y), while
b = [(1/n) Σ_{i=1}^n x_i x_i']^{-1} [(1/n) Σ_{i=1}^n x_i y_i] = (X'X)^{-1}X'y.
Suggests something desirable about least squares. 3-13/29
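
A short numpy sketch (simulated data; the design and coefficients below are assumptions for illustration) computing b both from the normal equations and from the sample-moment analog; the two coincide because the 1/n factors cancel. Solving the linear system is generally preferred to forming (X'X)^{-1} explicitly.

```python
import numpy as np

rng = np.random.default_rng(1)
n, K = 200, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, K - 1))])  # constant plus two regressors
y = X @ np.array([1.0, 0.5, -2.0]) + rng.normal(size=n)

# b from the normal equations X'Xb = X'y (solve rather than invert)
b = np.linalg.solve(X.T @ X, X.T @ y)

# Sample-moment analog: [(1/n) sum_i x_i x_i']^{-1} [(1/n) sum_i x_i y_i]
b_moments = np.linalg.solve((X.T @ X) / n, (X.T @ y) / n)

print(np.allclose(b, b_moments))  # True: the 1/n factors cancel
```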

Second Order Conditions
Necessary condition: first derivatives equal zero: ∂[(y − Xb)'(y − Xb)]/∂b = −2X'(y − Xb) = 0.
Sufficient condition: second derivatives:
∂²[(y − Xb)'(y − Xb)]/∂b∂b' = ∂[−2X'y + 2X'Xb]/∂b' = 2X'X.
(The derivative of a K×1 column vector with respect to a 1×K row vector is a K×K matrix.) 3-14/29

Side Result: Sample Moments
X'X collects the sums of squares and cross products of the columns of X:
X'X = [ Σ_i x_i1²       Σ_i x_i1 x_i2   ...   Σ_i x_i1 x_iK
        Σ_i x_i2 x_i1   Σ_i x_i2²       ...   Σ_i x_i2 x_iK
        ...             ...             ...   ...
        Σ_i x_iK x_i1   Σ_i x_iK x_i2   ...   Σ_i x_iK² ]
     = Σ_{i=1}^n x_i x_i'. 3-15/29
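
A quick numerical check of this identity (the simulated X below is an assumption for illustration): X'X equals the sum over observations of the outer products x_i x_i'.

```python
import numpy as np

rng = np.random.default_rng(2)
X = rng.normal(size=(50, 4))  # n = 50 rows x_i', K = 4 columns

XtX = X.T @ X
sum_outer = sum(np.outer(x_i, x_i) for x_i in X)  # sum_i x_i x_i'

print(np.allclose(XtX, sum_outer))  # True
```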

Does b Minimize e'e?
∂²(e'e)/∂b∂b' = 2X'X = 2 Σ_{i=1}^n x_i x_i'.
If there were a single coefficient b, we would require this quantity to be positive, which it would be: 2 Σ_{i=1}^n x_i² > 0.
The matrix counterpart of a positive number is a positive definite matrix. 3-16/29

A Positive Definite Matrix
Matrix C is positive definite if a'Ca > 0 for every nonzero a. Generally hard to check: it requires a look at the characteristic roots (later in the course). For some matrices it is easy to verify, and X'X is one of these:
a'X'Xa = (a'X')(Xa) = (Xa)'(Xa) = v'v = Σ_k v_k² ≥ 0.
Could v = 0? v = 0 means Xa = 0 for some a ≠ 0. Is this possible? No, provided X has full column rank.
Conclusion: b = (X'X)^{-1}X'y does indeed minimize e'e. 3-17/29
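
A sketch of the argument in code (the simulated full-column-rank X is an assumption for illustration): for a nonzero vector a, a'X'Xa equals the squared length of v = Xa, and the characteristic roots of X'X are all positive.

```python
import numpy as np

rng = np.random.default_rng(3)
X = rng.normal(size=(100, 4))        # full column rank (with probability one)
XtX = X.T @ X

a = rng.normal(size=4)               # an arbitrary nonzero vector
v = X @ a
print(np.isclose(a @ XtX @ a, v @ v))        # a'X'Xa = v'v = ||Xa||^2
print(a @ XtX @ a > 0)                       # strictly positive for a != 0
print(np.all(np.linalg.eigvalsh(XtX) > 0))   # all characteristic roots positive
```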

Algebraic Results - 1
In the population: E[X'ε] = 0.
In the sample: (1/n) Σ_{i=1}^n x_i e_i = 0, i.e., X'e = 0.
X'e = 0 means that for each column x_k of X, x_k'e = 0:
(1) Each column of X is orthogonal to e.
(2) If one of the columns of X is a column of ones, then i'e = Σ_{i=1}^n e_i = 0: the residuals sum to zero.
(3) It follows that (1/n) Σ_{i=1}^n e_i = 0, which mimics E[ε_i] = 0. 3-18/29
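
These orthogonality results are easy to verify numerically (simulated data, assumed for illustration): the residual vector is orthogonal to every column of X, and with a constant term in the model the residuals sum to zero.

```python
import numpy as np

rng = np.random.default_rng(4)
n = 200
X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])  # includes a column of ones
y = X @ np.array([1.0, 0.5, -1.0]) + rng.normal(size=n)

b = np.linalg.solve(X.T @ X, X.T @ y)
e = y - X @ b

print(np.allclose(X.T @ e, 0.0))  # X'e = 0: each column of X is orthogonal to e
print(np.isclose(e.sum(), 0.0))   # residuals sum to zero (constant term present)
```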

Residuals vs. Disturbances
Disturbances (population): y_i = x_i'β + ε_i. Partitioning y: y = E[y | X] + ε = conditional mean + disturbance.
Residuals (sample): y_i = x_i'b + e_i. Partitioning y: y = Xb + e = projection + residual.
(Note: the projection is into the column space of X, i.e., the set of linear combinations of the columns of X; Xb is one of these.) 3-19/29

Algebraic Results - 2
A residual maker: M = I − X(X'X)^{-1}X'.
e = y − Xb = y − X(X'X)^{-1}X'y = My.
My = the residuals that result when y is regressed on X.
MX = 0. (This result is fundamental!) How do we interpret this result in terms of residuals? When a column of X is regressed on all of X, we get a perfect fit and zero residuals.
(Therefore) My = MXb + Me = Me = e. (You should be able to prove this.)
y = Py + My, where P = X(X'X)^{-1}X' = I − M and PM = MP = 0. Py is the projection of y into the column space of X. 3-20/29
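
A numerical illustration of the residual maker and the projection matrix (simulated data, assumed for illustration): MX = 0, My = e, PM = 0, and y = Py + My.

```python
import numpy as np

rng = np.random.default_rng(5)
n = 100
X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])
y = X @ np.array([2.0, -1.0, 0.5]) + rng.normal(size=n)

P = X @ np.linalg.solve(X.T @ X, X.T)   # P = X(X'X)^{-1}X', the projection matrix
M = np.eye(n) - P                       # M = I - P, the residual maker

b = np.linalg.solve(X.T @ X, X.T @ y)
e = y - X @ b

print(np.allclose(M @ X, 0.0))          # MX = 0
print(np.allclose(M @ y, e))            # My = e, the least squares residuals
print(np.allclose(P @ M, 0.0))          # PM = 0
print(np.allclose(P + M, np.eye(n)))    # y = Py + My for every y
```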

The M Matrix
M = I − X(X'X)^{-1}X' is an n×n matrix.
M is symmetric: M = M'.
M is idempotent: MM = M (just multiply it out).
M is singular: M^{-1} does not exist. (We will prove this later as a side result in another derivation.) 3-21/29
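
The three properties can be checked directly (simulated X, assumed for illustration); singularity shows up as rank n − K rather than n.

```python
import numpy as np

rng = np.random.default_rng(6)
n, K = 50, 3
X = rng.normal(size=(n, K))
M = np.eye(n) - X @ np.linalg.solve(X.T @ X, X.T)  # I - X(X'X)^{-1}X'

print(np.allclose(M, M.T))       # symmetric: M = M'
print(np.allclose(M @ M, M))     # idempotent: MM = M
print(np.linalg.matrix_rank(M))  # n - K = 47 < n, so M is singular
```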

Results when X Contains a Constant Term
X = [1, x_2, ..., x_K]; the first column of X is a column of ones.
Since X'e = 0, x_1'e = 0: the residuals sum to zero.
y = Xb + e. Define i = [1, 1, ..., 1]', a column of n ones. Then i'y = Σ_{i=1}^n y_i = nȳ.
i'y = i'Xb + i'e = i'Xb, so (1/n)i'y = (1/n)i'Xb, which implies ȳ = x̄'b: the regression line passes through the means.
These results do not apply if the model has no constant term. 3-22/29
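
A quick check (simulated data with a constant term, assumed for illustration) that the fitted hyperplane passes through the point of means, i.e. that ȳ = x̄'b:

```python
import numpy as np

rng = np.random.default_rng(7)
n = 150
X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])  # constant term included
y = X @ np.array([3.0, 1.0, -0.5]) + rng.normal(size=n)

b = np.linalg.solve(X.T @ X, X.T @ y)
print(np.isclose(y.mean(), X.mean(axis=0) @ b))  # ybar = xbar'b: True
```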

U.S. Gasoline Market, 1960-1995 3-23/29

Least Squares Algebra 3-24/29

Least Squares 3-25/29

Residuals 3-26/29

Least Squares Residuals (autocorrelated) 3-27/29

Least Squares Algebra-3 M = I − X(X'X)^{-1}X'. M is n×n, potentially huge. 3-28/29
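
Because M is n×n, it is rarely formed explicitly in practice; the residuals My can be obtained as e = y − Xb without materializing the matrix. A sketch of the cheap route (simulated data, assumed for illustration):

```python
import numpy as np

rng = np.random.default_rng(8)
n = 5_000  # forming M here would mean an n-by-n array with 25 million entries
X = np.column_stack([np.ones(n), rng.normal(size=(n, 3))])
y = X @ np.array([1.0, 0.2, -0.3, 0.7]) + rng.normal(size=n)

b = np.linalg.solve(X.T @ X, X.T @ y)
e = y - X @ b                          # equals My without ever building M

print(e.shape, np.allclose(X.T @ e, 0.0))
```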

Least Squares Algebra-4 MX = 0 3-29/29