Ma 3/103: Lecture 24 Linear Regression I: Estimation

KC Border, March 3, 2017

Regression analysis

Estimate and test E(Y | X) = f(X). Here f is the regression function; the components of X = (X_1, ..., X_K) are regressors.

The standard linear model

    y_t = x_{t,1} β_1 + ⋯ + x_{t,K} β_K + ε_t    (t = 1, ..., N),

or, in matrix form,

    Y = Xβ + ε.

The linear model is more general than you might think

Kepler's 3rd Law. The square of the orbital period of a planet is directly proportional to the cube of the semi-major axis of its orbit:

    P² = cA³,    or    2 ln P = ln c + 3 ln A.

Hubble's Law:  red shift = c · distance.

Newton's Law of Gravity:

    F = G M₁M₂ / d²,    or    ln F = ln G + ln M₁ + ln M₂ − 2 ln d.

Polynomials:

    y = b_0 + b_1 x + b_2 x² + ⋯ + b_K x^K.

Geometric means:

    y = b_0 x_1^{b_1} x_2^{b_2} ⋯ x_K^{b_K},    so    ln y = ln b_0 + b_1 ln x_1 + ⋯ + b_K ln x_K.

Dummy variables, or indicators: e.g.,

    X_1 = 1 if Honda, 0 otherwise;    X_2 = 1 if Kawasaki, 0 otherwise;    ...;    X_l = 1 if Ducati, 0 otherwise.
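To make these transformations concrete, here is a minimal sketch, not from the lecture, of how a log-linearized Kepler-style relationship and brand dummies each become columns of a design matrix X. The planetary numbers and brand labels are made up for illustration.

```python
# Sketch: turning nonlinear relationships into the linear model y = X beta + eps.
import numpy as np

# Kepler-style example: P^2 = c * A^3 becomes 2 ln P = ln c + 3 ln A,
# i.e. a regression of 2*ln(P) on a constant and ln(A).
A = np.array([0.39, 0.72, 1.00, 1.52, 5.20])    # semi-major axes (illustrative)
P = np.array([0.24, 0.62, 1.00, 1.88, 11.86])   # orbital periods (illustrative)
y = 2 * np.log(P)
X = np.column_stack([np.ones_like(A), np.log(A)])  # constant regressor plus ln A

# Dummy-variable example: one indicator column per brand.
brands = np.array(["Honda", "Kawasaki", "Honda", "Ducati"])
dummies = np.column_stack([(brands == b).astype(float)
                           for b in ["Honda", "Kawasaki", "Ducati"]])
print(X.shape, dummies)
```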

Variates

The variates X_k may be fixed constants chosen by an experimenter, or they may be random variables themselves. They are called regressors. Almost always a constant variate is included.

Data

N observations of the values x_1, ..., x_K and y:

    y_t = x_{t,1} β_1 + ⋯ + x_{t,K} β_K + ε_t    (t = 1, ..., N),

where the ε_t's are unobserved errors. In matrix form: y = Xβ + ε.

Here

    y = (y_1, ..., y_N)′ is an N × 1 column vector,

    X = [x_{t,k}], with rows t = 1, ..., N and columns k = 1, ..., K, is an N × K matrix,

    β = (β_1, ..., β_K)′ is a K × 1 column vector, and

    ε = (ε_1, ..., ε_N)′ is an N × 1 column vector.

The estimation problem

The problem is to estimate (β_1, ..., β_K). Statistical assumptions of the standard model:

    E(ε | X) = 0,    Var(ε | X) = E(εε′ | X) = σ² I_{N×N}.

This last assumption is known as homoskedasticity.

The Least Squares approach

Sum of squared residuals

The vector of residuals, as a function of b, is y − Xb. The sum of squared residuals (SSR) is (y − Xb)′(y − Xb). Expanding yields

    SSR(b) = y′y − 2y′Xb + b′X′Xb,

which is a convex quadratic function in the components of b.

Minimizing the sum of squared residuals

By convexity, the minimum occurs whenever the gradient equals zero. The gradient of this function is

    ∇SSR(b) = −2X′y + 2X′Xb.

Thus the minimizer β̂_OLS satisfies the first-order condition ∇SSR(β̂_OLS) = 0, that is,

    X′y = X′X β̂_OLS.

This matrix equation is known as the normal equation for β̂_OLS.

Least Squares Estimator

On the hypothesis that X′X (a K × K matrix) is nonsingular, we then have that

    β̂_OLS = (X′X)⁻¹ X′y

minimizes the sum of squared residuals. This β̂_OLS is called the ordinary least squares (OLS) estimator of β.
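As a quick numerical illustration, here is a minimal sketch of the formula on simulated data; the design, coefficients, and noise level are made up for illustration. Numerically it is preferable to solve the normal equations X′X b = X′y directly rather than to form the inverse explicitly.

```python
# Sketch: OLS via the normal equations on simulated data.
import numpy as np

rng = np.random.default_rng(0)
N, K = 100, 3
X = np.column_stack([np.ones(N), rng.normal(size=(N, K - 1))])  # constant + 2 regressors
beta = np.array([1.0, 2.0, -0.5])
y = X @ beta + rng.normal(scale=0.3, size=N)

beta_hat = np.linalg.solve(X.T @ X, X.T @ y)   # solves X'X b = X'y, i.e. (X'X)^{-1} X'y
print(beta_hat)
```

For ill-conditioned or rank-deficient designs, numpy's np.linalg.lstsq computes a least-squares solution via the SVD, which connects to the singular case on the next slide.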

The singular case

What if X′X is singular? Then

    a_1 X_1 + ⋯ + a_K X_K = 0,

where not all the a_k are zero. Then

    y = β_1 X_1 + ⋯ + β_K X_K + ε + c (a_1 X_1 + ⋯ + a_K X_K)    [the last term is 0]
      = (β_1 + ca_1) X_1 + ⋯ + (β_K + ca_K) X_K + ε

for any value of c. Whenever a_k is nonzero, the coefficient on X_k can be whatever we want. That is, the data cannot tell us what the coefficient β_k is, even if every error term is zero.

Properties

    β̂_OLS = (X′X)⁻¹ X′y = (X′X)⁻¹ X′(Xβ + ε) = β + (X′X)⁻¹ X′ε.

This is a random vector. Set e = y − X β̂_OLS. The vector e of residuals is orthogonal to each column X_k of regressor values:

    X′e = 0,

since

    X′e = X′(y − X β̂_OLS) = X′y − X′X β̂_OLS = X′y − X′X(X′X)⁻¹ X′y = X′y − X′y = 0.

If the regressors include a constant term, then the fitted plane passes through the sample means. That is,

    ȳ = x̄_1 β̂_1 + ⋯ + x̄_K β̂_K.

Proof: y = X β̂_OLS + e, so

    1′y = 1′X β̂_OLS + 1′e,

where 1 is an N-vector of ones. Since 1 is one of the regressors, 1′e = 0. Dividing by N gives ȳ = x̄_1 β̂_1 + ⋯ + x̄_K β̂_K.
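A quick numerical check of these two facts, on the same kind of simulated data as before (illustrative only): the residuals are orthogonal to every regressor, and with a constant column the regression passes through the sample means.

```python
# Sketch: verify X'e = 0 and ybar = xbar' beta_hat on simulated data.
import numpy as np

rng = np.random.default_rng(1)
N = 200
X = np.column_stack([np.ones(N), rng.normal(size=(N, 2))])
y = X @ np.array([1.0, 2.0, -0.5]) + rng.normal(size=N)

beta_hat = np.linalg.solve(X.T @ X, X.T @ y)
e = y - X @ beta_hat

print(X.T @ e)                               # ~ zero vector, up to rounding error
print(y.mean(), X.mean(axis=0) @ beta_hat)   # ybar equals xbar_1*b1 + ... + xbar_K*bK
```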

The Geometry of LSE

[Figure: y is projected orthogonally onto the plane spanned by the regressors x_1 and x_2; the fitted vector ŷ = β̂_1 x_1 + β̂_2 x_2 lies in that plane, and the residual e = y − ŷ is perpendicular to it.]

OLS and MLE

When the error vector ε has a multivariate normal N(0, σ²I) distribution, the OLS estimator of β is also the Maximum Likelihood Estimator.

MLE of β

The density of ε = y − Xβ is the multivariate normal N(0, σ²I) density

    (1/√(2π))^N · (1/√(det σ²I)) · exp( −½ (y − Xβ)′(σ²I)⁻¹(y − Xβ) )
        = (1/(2π))^{N/2} (1/σ²)^{N/2} exp( −(1/(2σ²)) (y − Xβ)′(y − Xβ) ).

Taking logs, we find the log likelihood function is

    −(N/2) log(2π) − (N/2) log σ² − (1/(2σ²)) (y − Xβ)′(y − Xβ).

Maximizing this with respect to β amounts to minimizing (y − Xβ)′(y − Xβ), which is exactly what OLS does.

MLE of σ²

The first order condition for the maximum with respect to σ² is

    −(N/2)(1/σ²) + ½ (y − Xβ)′(y − Xβ) / (σ²)² = 0.

Multiplying by 2(σ²)² gives

    −Nσ² + (y − Xβ)′(y − Xβ) = 0,

so

    σ̂²_MLE = e′e / N,    where e = y − X β̂.
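A minimal sketch (simulated data, with illustrative names and numbers) that maximizes this log likelihood numerically with scipy and confirms that the fitted β matches the OLS estimate while the fitted σ² equals e′e/N.

```python
# Sketch: numerical MLE of (beta, sigma^2) agrees with OLS and e'e/N.
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(2)
N, K = 150, 2
X = np.column_stack([np.ones(N), rng.normal(size=N)])
y = X @ np.array([0.5, 1.5]) + rng.normal(scale=0.7, size=N)

def neg_loglik(params):
    b, log_s2 = params[:K], params[K]
    s2 = np.exp(log_s2)                      # parameterize sigma^2 > 0
    r = y - X @ b
    return 0.5 * N * np.log(2 * np.pi * s2) + 0.5 * r @ r / s2

fit = minimize(neg_loglik, x0=np.zeros(K + 1))
beta_ols = np.linalg.solve(X.T @ X, X.T @ y)
e = y - X @ beta_ols
print(fit.x[:K], beta_ols)            # agree up to optimizer tolerance
print(np.exp(fit.x[K]), e @ e / N)    # MLE of sigma^2 equals e'e/N
```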

β̂_OLS is unbiased

    β̂_OLS = (X′X)⁻¹ X′y = (X′X)⁻¹ X′(Xβ + ε) = β + (X′X)⁻¹ X′ε,

so

    β̂_OLS − β = (X′X)⁻¹ X′ε,

and

    E(β̂_OLS − β) = E[(X′X)⁻¹ X′ε] = (X′X)⁻¹ X′ E ε = 0.

That is, β̂_OLS is unbiased: E β̂_OLS = β.

Variance-covariance matrix of β̂_OLS

    (β̂_OLS − β)(β̂_OLS − β)′ = (X′X)⁻¹ X′εε′X (X′X)⁻¹,

so

    Var(β̂_OLS) = E[(β̂_OLS − β)(β̂_OLS − β)′] = (X′X)⁻¹ X′(σ²I)X (X′X)⁻¹ = σ² (X′X)⁻¹.
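A Monte Carlo sketch of the last two slides, with a made-up fixed design: averaging β̂_OLS over many error draws recovers β (unbiasedness), and the sampling covariance of β̂_OLS is close to σ²(X′X)⁻¹.

```python
# Sketch: Monte Carlo check of E(beta_hat) = beta and Var(beta_hat) = sigma^2 (X'X)^{-1}.
import numpy as np

rng = np.random.default_rng(3)
N, sigma = 50, 0.5
X = np.column_stack([np.ones(N), rng.normal(size=N)])   # fixed design across draws
beta = np.array([1.0, 2.0])
XtX_inv = np.linalg.inv(X.T @ X)

draws = []
for _ in range(5000):
    y = X @ beta + rng.normal(scale=sigma, size=N)
    draws.append(np.linalg.solve(X.T @ X, X.T @ y))
draws = np.array(draws)

print(draws.mean(axis=0))             # close to beta
print(np.cov(draws, rowvar=False))    # close to sigma^2 * (X'X)^{-1}
print(sigma**2 * XtX_inv)
```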

Gauss-Markov Theorem

In the standard linear model, if X has rank K, then the OLS estimator β̂_OLS is the Best Linear Unbiased Estimator (BLUE) of β in the following sense: given any other estimator b of β that is linear in y and satisfies E b = β for every possible value of β, we have

    Var b = Var β̂_OLS + P,

where P is positive semidefinite. This implies that for any vector w of weights,

    Var w′b ≥ Var w′β̂_OLS.

Proof of Gauss-Markov

Let b = Ay, and define D = A − (X′X)⁻¹X′. Then

    b = Ay = (D + (X′X)⁻¹X′) y = (D + (X′X)⁻¹X′)(Xβ + ε)
      = DXβ + β + (D + (X′X)⁻¹X′) ε,

so

    b − β = DXβ + (D + (X′X)⁻¹X′) ε.    (1)

So in expectation, since E ε = 0,

    E b − β = DXβ.

Proof of Gauss-Markov, continued

Now b is unbiased if and only if DXβ = 0 for all β. Therefore DX = 0, so (1) becomes

    b − β = (D + (X′X)⁻¹X′) ε.

Proof of Gauss-Markov, continued

So for an unbiased linear estimator b,

    Var b = E[(b − β)(b − β)′]
          = (D + (X′X)⁻¹X′) E(εε′) (D + (X′X)⁻¹X′)′
          = σ² (D + (X′X)⁻¹X′)(D′ + X(X′X)⁻¹)
          = σ² ( DD′ + DX(X′X)⁻¹ + (X′X)⁻¹X′D′ + (X′X)⁻¹ )    [DX(X′X)⁻¹ = 0 and (X′X)⁻¹X′D′ = 0, since DX = 0]
          = σ² DD′ + Var β̂_OLS.

But P = σ² DD′ is positive semidefinite, since w′DD′w = (D′w)′(D′w) ≥ 0.    q.e.d.

Estimating σ²

    e = My = Mε,    where M = I − X(X′X)⁻¹X′.

Since M is symmetric and idempotent,

    e′e = ε′M′Mε = ε′Mε.

Since ε′Mε is 1 × 1, it is equal to its trace, and since the trace is a linear operator, the expected value of the trace of a random matrix is the trace of the expected matrix. Thus by the magic of linear algebra,

    E(e′e) = E(ε′Mε) = E tr(Mεε′) = tr(M σ²I) = σ² tr M = (N − K)σ²,

since tr M = tr I − tr[X(X′X)⁻¹X′] = N − tr[(X′X)⁻¹X′X] = N − K.
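A quick numerical check (with an arbitrary simulated design, purely illustrative) that the residual-maker M is symmetric and idempotent and that its trace is N − K.

```python
# Sketch: properties of M = I - X(X'X)^{-1}X'.
import numpy as np

rng = np.random.default_rng(4)
N, K = 30, 4
X = rng.normal(size=(N, K))
M = np.eye(N) - X @ np.linalg.solve(X.T @ X, X.T)    # I - X(X'X)^{-1}X'

print(np.allclose(M, M.T), np.allclose(M @ M, M))    # symmetric, idempotent
print(np.trace(M), N - K)                            # trace equals N - K
```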

Estimating σ², continued

Define

    s² = e′e / (N − K),    s = √( e′e / (N − K) ).

Theorem. If ε ~ N(0, σ²I), then

    β̂_OLS ~ N( β, σ²(X′X)⁻¹ ),    and    (N − K)s² / σ² ~ χ²(N − K).

Also, β̂_OLS and s² are independent.

Test statistics

If ε is jointly normal, then for any K-vector w of weights,

    w′(β̂_OLS − β) ~ N( 0, σ² w′(X′X)⁻¹w ),

so

    w′(β̂_OLS − β) / ( s √( w′(X′X)⁻¹w ) ) ~ t(N − K).    (2)

Standard error of β̂_{k,OLS}

Special case: w is the kth unit coordinate vector. Then

    (β̂_k − β_k) / ( s √( (X′X)⁻¹_kk ) ) ~ t(N − K).

Since σ²(X′X)⁻¹_kk = Var β̂_{k,OLS}, the quantity s √( (X′X)⁻¹_kk ) is the estimated standard deviation of β̂_{k,OLS}, and is called the standard error of β̂_{k,OLS}.
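Here is a minimal sketch (simulated data, illustrative numbers) of computing these standard errors directly from s and the diagonal of (X′X)⁻¹.

```python
# Sketch: standard errors s * sqrt[(X'X)^{-1}_kk] for each coefficient.
import numpy as np

rng = np.random.default_rng(5)
N, K = 80, 3
X = np.column_stack([np.ones(N), rng.normal(size=(N, K - 1))])
y = X @ np.array([1.0, 2.0, -0.5]) + rng.normal(scale=0.4, size=N)

beta_hat = np.linalg.solve(X.T @ X, X.T @ y)
e = y - X @ beta_hat
s2 = e @ e / (N - K)                                   # unbiased estimate of sigma^2
se = np.sqrt(s2 * np.diag(np.linalg.inv(X.T @ X)))     # standard errors
print(beta_hat, se)
```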

Confidence intervals for β_k

The 1 − α confidence interval for β_k is

    ( β̂_k − t_{α/2, N−K} · s √( (X′X)⁻¹_kk ),    β̂_k + t_{α/2, N−K} · s √( (X′X)⁻¹_kk ) ),

where t_{α/2, N−K} is the critical value satisfying P( |t(N − K)| > t_{α/2, N−K} ) = α.

Testing β_k

To test

    H_0: β_k = β_k⁰    versus    H_1: β_k ≠ β_k⁰,

compute

    t = ( β̂_{k,OLS} − β_k⁰ ) / ( s √( (X′X)⁻¹_kk ) ).

We reject the null hypothesis if |t| > t_{α/2, N−K}. For the null hypothesis H_0: β_k = 0, we have

    t = β̂_{k,OLS} / ( s √( (X′X)⁻¹_kk ) ).

It is this value of t that statistical software reports as the t-value for β_k.
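To close, a sketch (again on simulated data, with illustrative values) that puts the last three slides together: confidence intervals, t-values for H_0: β_k = 0, and the corresponding p-values from the t(N − K) distribution via scipy.

```python
# Sketch: confidence intervals, t-values, and p-values for each coefficient.
import numpy as np
from scipy import stats

rng = np.random.default_rng(6)
N, K, alpha = 80, 3, 0.05
X = np.column_stack([np.ones(N), rng.normal(size=(N, K - 1))])
y = X @ np.array([1.0, 2.0, 0.0]) + rng.normal(scale=0.4, size=N)

beta_hat = np.linalg.solve(X.T @ X, X.T @ y)
e = y - X @ beta_hat
s = np.sqrt(e @ e / (N - K))
se = s * np.sqrt(np.diag(np.linalg.inv(X.T @ X)))

t_crit = stats.t.ppf(1 - alpha / 2, df=N - K)                  # two-sided critical value
ci = np.column_stack([beta_hat - t_crit * se, beta_hat + t_crit * se])
t_values = beta_hat / se                                       # t-statistics for H0: beta_k = 0
p_values = 2 * stats.t.sf(np.abs(t_values), df=N - K)
print(ci)
print(t_values, p_values)
```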