Econometrics I Professor William Greene Stern School of Business Department of Economics 3-1/29
Econometrics I Part 3 Least Squares Algebra 3-2/29
Vocabulary
Some terms to be used in the discussion:
Population characteristics and entities vs. sample quantities and analogs
Residuals and disturbances
Population regression line and sample regression
Objective: Learn about the conditional mean function. Estimate β and σ².
First step: mechanics of fitting a line (hyperplane) to a set of data.
3-3/29
Fitting Criteria
The set of points in the sample. Fitting criteria - what are they?
LAD: Minimize over b the sum Σ_i |y_i - x_i'b_LAD|
Least squares: Minimize over b the sum Σ_i (y_i - x_i'b_LS)²
and so on.
Why least squares? A fundamental result: sample moments are good estimators of their population counterparts. We will examine this principle and apply it to least squares computation.
3-4/29
An Analogy Principle for Estimating β
In the population: E[y|X] = Xβ, so E[y - Xβ | X] = 0.
Continuing (assumed): E[x_i ε_i] = 0 for every i.
Summing: Σ_i E[x_i ε_i] = Σ_i 0 = 0.
Exchange Σ_i and E[·]: E[Σ_i x_i ε_i] = E[X'ε] = 0.
E[X'(y - Xβ)] = 0.
So, if Xβ is the conditional mean, then E[X'ε] = 0.
We choose b, the estimator of β, to mimic this population result; i.e., mimic the population mean with the sample mean.
Find b such that (1/n) X'e = (1/n) X'(y - Xb) = 0.
As we will see, the solution is the least squares coefficient vector.
3-5/29
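As a concrete sketch of the analogy principle (simulated data; the variable names and the use of NumPy are illustrative, not part of the slides), b is chosen so that the sample moment (1/n)X'e is zero, mimicking E[X'ε] = 0:

```python
import numpy as np

# Hypothetical simulated data -- n, K, rng, beta are illustrative choices.
rng = np.random.default_rng(0)
n, K = 100, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, K - 1))])
beta = np.array([1.0, 0.5, -2.0])
y = X @ beta + rng.normal(size=n)

# Choose b to satisfy the sample moment condition (1/n) X'(y - Xb) = 0.
b = np.linalg.solve(X.T @ X, X.T @ y)
e = y - X @ b
sample_moment = X.T @ e / n   # numerically zero, mimicking E[X'eps] = 0
```

The solve step already anticipates the normal equations derived later in the section.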
Population Moments
We assumed that E[ε_i | x_i] = 0. (Slide 2:40)
It follows that Cov[x_i, ε_i] = 0.
Proof: Cov(x_i, ε_i) = Cov(x_i, E[ε_i | x_i]) = Cov(x_i, 0) = 0. (Theorem B.2)
If E[y_i | x_i] = x_i'β, then β = (Var[x_i])^-1 Cov[x_i, y_i].
Proof: Cov[x_i, y_i] = Cov[x_i, E[y_i | x_i]] = Cov[x_i, x_i'β] = Var[x_i] β.
This will provide a population analog to the statistics we compute with the data.
3-6/29
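A numerical illustration of β = (Var[x])^-1 Cov[x, y] via its sample analog, using a single simulated regressor (all names here are illustrative, not from the slides):

```python
import numpy as np

# Simulated single-regressor model: y = 2 + 3x + eps.
rng = np.random.default_rng(1)
n = 10_000
x = rng.normal(size=n)
y = 2.0 + 3.0 * x + rng.normal(size=n)

# Sample analog of beta = Cov[x,y] / Var[x] (biased moments: both divide by n).
slope_moment = np.cov(x, y, bias=True)[0, 1] / np.var(x)

# Least squares slope from regressing y on [1, x] -- algebraically identical.
X = np.column_stack([np.ones(n), x])
b = np.linalg.solve(X.T @ X, X.T @ y)
```

The two slope estimates agree exactly (up to floating point), since the least squares slope with an intercept is precisely the sample covariance divided by the sample variance.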
U.S. Gasoline Market, 1960-1995 3-7/29
Least Squares
Example will be G_i regressed on x_i = [1, PG_i, Y_i].
Fitting criterion: the fitted equation will be ŷ_i = b_1 x_i1 + b_2 x_i2 + ... + b_K x_iK.
The criterion is based on the residuals: e_i = y_i - (b_1 x_i1 + b_2 x_i2 + ... + b_K x_iK).
Make the e_i as small as possible. Form a criterion and minimize it.
3-8/29
Fitting Criteria
Sum of residuals: Σ_i e_i
Sum of squares: Σ_i e_i²
Sum of absolute values of residuals: Σ_i |e_i|
Absolute value of sum of residuals: |Σ_i e_i|
We focus on Σ_i e_i² now and Σ_i |e_i| later.
3-9/29
Least Squares Algebra
Σ_i e_i² = Σ_i (y_i - x_i'b)² = e'e = (y - Xb)'(y - Xb)
Matrix and vector derivatives:
Derivative of a scalar with respect to a vector
Derivative of a column vector wrt a row vector
Other derivatives
3-10/29
Least Squares Normal Equations
∂(Σ_i e_i²)/∂b = ∂(Σ_i (y_i - x_i'b)²)/∂b
 = Σ_i 2(y_i - x_i'b)(-x_i)
 = -2 Σ_i x_i y_i + 2 Σ_i x_i x_i' b
 = -2X'y + 2X'Xb
3-11/29
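The gradient -2X'y + 2X'Xb can be checked numerically. The sketch below (simulated data; names are illustrative) compares the analytic derivative with a central finite difference, which is exact for a quadratic criterion up to rounding:

```python
import numpy as np

rng = np.random.default_rng(8)
n, K = 25, 3
X = rng.normal(size=(n, K))
y = rng.normal(size=n)
b = rng.normal(size=K)          # any trial coefficient vector

def sse(b):
    """Sum of squared residuals e'e at coefficient vector b."""
    e = y - X @ b
    return e @ e

analytic = -2 * X.T @ (y - X @ b)      # gradient from the slide

# Central finite-difference approximation, one coordinate at a time.
h = 1e-6
numeric = np.array([
    (sse(b + h * np.eye(K)[k]) - sse(b - h * np.eye(K)[k])) / (2 * h)
    for k in range(K)
])
```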
Least Squares Normal Equations
∂(y - Xb)'(y - Xb)/∂b = -2X'(y - Xb) = 0
Dimensions: ∂(1×1)/∂(K×1); (-2)(n×K)'(n×1) = (-2)(K×n)(n×1) = K×1.
Note: the derivative of a (1×1) scalar wrt a K×1 vector is a K×1 vector.
Solution: -2X'(y - Xb) = 0  ⟹  X'y = X'Xb
3-12/29
Least Squares Solution
Assuming it exists: b = (X'X)^-1 X'y
Note the analogy: β = (Var(x))^-1 Cov(x, y)
b = [(1/n) Σ_i x_i x_i']^-1 [(1/n) Σ_i x_i y_i]
Suggests something desirable about least squares.
3-13/29
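A minimal sketch (simulated data; variable names are illustrative) computing b from the normal equations and comparing it with a library least squares routine:

```python
import numpy as np

rng = np.random.default_rng(2)
n, K = 50, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, K - 1))])
y = X @ np.array([1.0, -0.5, 2.0]) + rng.normal(size=n)

b_normal_eq = np.linalg.solve(X.T @ X, X.T @ y)   # solve X'Xb = X'y
b_lstsq, *_ = np.linalg.lstsq(X, y, rcond=None)   # library least squares
```

Solving X'Xb = X'y with a linear solver, rather than forming (X'X)^-1 explicitly, is the numerically preferred route; the two answers coincide here.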
Second Order Conditions
Necessary condition: first derivatives = 0:
∂(y - Xb)'(y - Xb)/∂b = -2X'(y - Xb)
Sufficient condition: second derivatives ...
∂²(y - Xb)'(y - Xb)/∂b∂b' = ∂[-2X'(y - Xb)]/∂b' = ∂[-2X'y + 2X'Xb]/∂b' = 2X'X
∂(K×1 column vector)/∂(1×K row vector) = K×K matrix.
3-14/29
Side Result: Sample Moments
X'X =
[ Σ_i x_i1²       Σ_i x_i1 x_i2   ...   Σ_i x_i1 x_iK ]
[ Σ_i x_i2 x_i1   Σ_i x_i2²       ...   Σ_i x_i2 x_iK ]
[ ...             ...             ...   ...           ]
[ Σ_i x_iK x_i1   Σ_i x_iK x_i2   ...   Σ_i x_iK²     ]
(all sums over i = 1, ..., n)
 = Σ_i [x_i1; x_i2; ...; x_iK][x_i1 x_i2 ... x_iK] = Σ_i x_i x_i'
3-15/29
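The identity X'X = Σ_i x_i x_i' can be verified directly on a small simulated matrix (an illustrative sketch, not from the slides):

```python
import numpy as np

rng = np.random.default_rng(3)
X = rng.normal(size=(6, 3))   # small illustrative n x K matrix

# Accumulate X'X row by row as a sum of K x K outer products x_i x_i'.
sum_outer = sum(np.outer(xi, xi) for xi in X)
```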
Does b Minimize e'e?
∂²e'e/∂b∂b' = 2X'X = 2 Σ_i x_i x_i' (the matrix of sums from the previous slide).
If there were a single b, we would require this to be positive, which it would be: 2 Σ_i x_i² > 0. OK.
The matrix counterpart of a positive number is a positive definite matrix.
3-16/29
A Positive Definite Matrix
Matrix C is positive definite if a'Ca > 0 for every a ≠ 0.
Generally hard to check. Requires a look at characteristic roots (later in the course).
For some matrices, it is easy to verify. X'X is one of these:
a'X'Xa = (a'X')(Xa) = (Xa)'(Xa) = v'v = Σ_i v_i² ≥ 0
Could v = 0? v = 0 means Xa = 0. Is this possible? No (as long as the columns of X are linearly independent).
Conclusion: b = (X'X)^-1 X'y does indeed minimize e'e.
3-17/29
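A quick numerical check of the positive definiteness of X'X (simulated full-column-rank X; checking that all eigenvalues are positive is one standard verification, and the quadratic form a'X'Xa = ||Xa||² is computed directly):

```python
import numpy as np

rng = np.random.default_rng(4)
X = rng.normal(size=(20, 3))   # full column rank with probability 1
A = X.T @ X

# a'Aa = (Xa)'(Xa) = ||Xa||^2 > 0 for any nonzero a.
a = rng.normal(size=3)
quad_form = a @ A @ a

# All eigenvalues of a positive definite matrix are strictly positive.
eigenvalues = np.linalg.eigvalsh(A)
```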
Algebraic Results - 1
In the population: E[X'ε] = 0
In the sample: (1/n) Σ_i x_i e_i = (1/n) X'e = 0
X'e = 0 means for each column x_k of X, x_k'e = 0.
(1) Each column of X is orthogonal to e.
(2) One of the columns of X is a column of ones: i'e = Σ_i e_i = 0. The residuals sum to zero.
(3) It follows that (1/n) Σ_i e_i = 0, which mimics E[ε_i] = 0.
3-18/29
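The sample orthogonality conditions can be confirmed numerically (simulated data with a constant column; an illustrative sketch):

```python
import numpy as np

rng = np.random.default_rng(5)
n = 40
X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])  # constant included
y = rng.normal(size=n)

b = np.linalg.solve(X.T @ X, X.T @ y)
e = y - X @ b

orth = X.T @ e     # each column of X is orthogonal to e
total = e.sum()    # residuals sum to zero because of the constant column
```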
Residuals vs. Disturbances
Disturbances (population): y_i = x_i'β + ε_i
Partitioning y: y = E[y|X] + ε = conditional mean + disturbance
Residuals (sample): y_i = x_i'b + e_i
Partitioning y: y = Xb + e = projection + residual
(Note: projection into the column space of X, i.e., the set of linear combinations of the columns of X. Xb is one of these.)
3-19/29
Algebraic Results - 2
The residual maker: M = I - X(X'X)^-1 X'
e = y - Xb = y - X(X'X)^-1 X'y = My
My = the residuals that result when y is regressed on X.
MX = 0 (This result is fundamental!)
How do we interpret this result in terms of residuals? When a column of X is regressed on all of X, we get a perfect fit and zero residuals.
(Therefore) My = MXb + Me = Me = e. (You should be able to prove this.)
y = Py + My, where P = X(X'X)^-1 X' = I - M. PM = MP = 0. Py is the projection of y into the column space of X.
3-20/29
The M Matrix
M = I - X(X'X)^-1 X' is an n×n matrix.
M is symmetric: M' = M.
M is idempotent: MM = M (just multiply it out).
M is singular; M^-1 does not exist. (We will prove this later as a side result in another derivation.)
3-21/29
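A sketch verifying the stated properties of M, and of the projection matrix P = I - M, on a small simulated X (all names are illustrative):

```python
import numpy as np

rng = np.random.default_rng(6)
n, K = 12, 3
X = rng.normal(size=(n, K))

P = X @ np.linalg.solve(X.T @ X, X.T)   # projection matrix X(X'X)^{-1}X'
M = np.eye(n) - P                       # residual maker

# Properties claimed on the slides:
sym = np.allclose(M, M.T)               # M symmetric
idem = np.allclose(M @ M, M)            # M idempotent
annih = np.allclose(M @ X, 0)           # MX = 0
orth_PM = np.allclose(P @ M, 0)         # PM = 0
```

Singularity of M is visible here too: its rank is n - K, not n, so no inverse exists.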
Results when X Contains a Constant Term
X = [1, x_2, ..., x_K]; the first column of X is a column of ones.
Since X'e = 0, x_1'e = 0: the residuals sum to zero.
y = Xb + e
Define i = [1, 1, ..., 1]', a column of n ones. Then i'y = Σ_i y_i = n ȳ.
i'y = i'Xb + i'e = i'Xb, so (1/n) i'y = (1/n) i'Xb implies ȳ = x̄'b.
(The regression line passes through the means.)
These do not apply if the model has no constant term.
3-22/29
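The "regression passes through the means" result, ȳ = x̄'b, checked on simulated data (an illustrative sketch; it relies on the constant column being present):

```python
import numpy as np

rng = np.random.default_rng(7)
n = 30
X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])  # constant included
y = 1.0 + X[:, 1] - 0.5 * X[:, 2] + rng.normal(size=n)

b = np.linalg.solve(X.T @ X, X.T @ y)

# The fitted hyperplane passes through the point of means.
ybar = y.mean()
xbar = X.mean(axis=0)
```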
U.S. Gasoline Market, 1960-1995 3-23/29
Least Squares Algebra 3-24/29
Least Squares 3-25/29
Residuals 3-26/29
Least Squares Residuals (autocorrelated) 3-27/29
Least Squares Algebra-3
M = I - X(X'X)^-1 X'
M is n×n - potentially huge.
3-28/29
Least Squares Algebra-4
MX = 0
3-29/29