Economics 620, Lecture 2: Regression Mechanics (Simple Regression)

1 Economics 620, Lecture 2: Regression Mechanics (Simple Regression)

Observed variables: $y_i, x_i$, $i = 1, \ldots, n$.

Hypothesized (model): $Ey_i = \alpha + \beta x_i$, or $y_i = \alpha + \beta x_i + (y_i - Ey_i)$; renaming, we get $y_i = \alpha + \beta x_i + \varepsilon_i$.

Unobserved: $\alpha$, $\beta$, $\varepsilon_i$.

EXAMPLE: ENGEL CURVES

Utility function: $u(z_1, \ldots, z_k) = \sum_{j=1}^k a_j \ln z_j$. Budget constraint: $m = \sum_{j=1}^k p_j z_j$.

FOC: $a_j / z_j - \lambda p_j = 0$, $j = 1, \ldots, k$
$\Rightarrow \lambda = \sum_{j=1}^k a_j / m$
$\Rightarrow z_j = \dfrac{a_j m}{p_j \sum_{\ell=1}^k a_\ell}$
$\Rightarrow p_j z_j = \dfrac{a_j}{\sum_{\ell=1}^k a_\ell}\, m$
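
As a quick numerical check of the Engel-curve algebra, a minimal sketch (the weights $a_j$, prices $p_j$, and income $m$ below are made-up illustrative values, not from the lecture):

```python
import numpy as np

a = np.array([1.0, 2.0, 3.0])   # hypothetical utility weights a_j
p = np.array([2.0, 5.0, 4.0])   # hypothetical prices p_j
m = 120.0                        # hypothetical income

# Demands implied by the FOC: z_j = a_j * m / (p_j * sum_l a_l)
z = a * m / (p * a.sum())

# The budget constraint holds, and expenditure shares p_j z_j / m equal a_j / sum(a)
assert np.isclose(p @ z, m)
assert np.allclose(p * z / m, a / a.sum())
```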

2 Estimation

We want to estimate $E(y) = \alpha + \beta x$, where $y$ is the expenditure on good $j$ and $x$ is income. According to the model we also have $\beta = a_j / \sum_\ell a_\ell$ and $\alpha = 0$. We would like to estimate the unknowns from a sample of $n$ observations on $y$ and $x$.

3 The Least Squares Method

The Least Squares criterion for estimating $\alpha$ and $\beta$ is to choose $\hat\alpha$ and $\hat\beta$ to minimize the sum of squared vertical distances between $\hat y_i = \hat\alpha + \hat\beta x_i$ and $y_i$. Why do we consider the vertical distances? Why do we square?

Let $S(a, b) = \sum_{i=1}^n (y_i - a - b x_i)^2$. Partial derivatives:
$\partial S / \partial a = -2 \sum_{i=1}^n (y_i - a - b x_i)$
$\partial S / \partial b = -2 \sum_{i=1}^n x_i (y_i - a - b x_i)$
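
The criterion can be explored numerically before solving it analytically. A minimal sketch with simulated data (the true values $\alpha = 1$, $\beta = 2$, $\sigma = 0.5$ and the sample size are arbitrary choices):

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)
n = 100
x = rng.uniform(0.0, 10.0, n)
y = 1.0 + 2.0 * x + rng.normal(0.0, 0.5, n)  # simulated data, arbitrary truth

# Least squares criterion S(a, b) = sum_i (y_i - a - b x_i)^2
def S(ab):
    a, b = ab
    return np.sum((y - a - b * x) ** 2)

res = minimize(S, x0=[0.0, 0.0])
print(res.x)  # numerical minimizer (a, b), close to (1, 2)
```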

4 Normal Equations

Setting the derivatives to zero gives the normal equations:
$0 = \sum_{i=1}^n (y_i - a - b x_i)$
$0 = \sum_{i=1}^n x_i (y_i - a - b x_i)$

The solutions $\hat\alpha$ and $\hat\beta$ are the Least Squares (LS) estimators:
$\sum_{i=1}^n y_i = n \hat\alpha + \hat\beta \sum_{i=1}^n x_i$ (1)
$\sum_{i=1}^n x_i y_i = \hat\alpha \sum_{i=1}^n x_i + \hat\beta \sum_{i=1}^n x_i^2$ (2)

From (1): $\hat\alpha = \bar y - \hat\beta \bar x$, where $\bar y = \sum_{i=1}^n y_i / n$ and $\bar x = \sum_{i=1}^n x_i / n$.

Substituting into (2): $\sum x_i y_i = (\bar y - \hat\beta \bar x) \sum x_i + \hat\beta \sum x_i^2$

5 Normal Equations cont'd.

$\Rightarrow \sum x_i (y_i - \bar y) = \hat\beta \left( \sum x_i^2 - \bar x \sum x_i \right) = \hat\beta \left( \sum x_i^2 - n \bar x^2 \right) = \hat\beta \sum (x_i - \bar x)^2$

$\Rightarrow \hat\beta = \dfrac{\sum x_i (y_i - \bar y)}{\sum (x_i - \bar x)^2} = \dfrac{\sum (x_i - \bar x) y_i}{\sum (x_i - \bar x)^2} = \dfrac{\sum (x_i - \bar x)(y_i - \bar y)}{\sum (x_i - \bar x)^2}$

Is this a minimum? Note that:
$\partial^2 S / \partial a^2 = 2n, \quad \partial^2 S / \partial a \partial b = 2 \sum x_i, \quad \partial^2 S / \partial b^2 = 2 \sum x_i^2$
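
The closed-form solution is a one-liner in code; continuing the simulated data from the sketch above, it should agree with the numerical minimizer:

```python
# Closed-form LS estimates from the normal equations
xbar, ybar = x.mean(), y.mean()
beta_hat = np.sum((x - xbar) * (y - ybar)) / np.sum((x - xbar) ** 2)
alpha_hat = ybar - beta_hat * xbar
print(alpha_hat, beta_hat)  # matches the output of minimize() above
```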

6 Normal Equations cont'd.

Is the Hessian positive definite?
$$H = 2 \begin{bmatrix} n & \sum x_i \\ \sum x_i & \sum x_i^2 \end{bmatrix}$$
YES! Use Cauchy-Schwarz: $\left( \sum x_i z_i \right)^2 \le \left( \sum x_i^2 \right)\left( \sum z_i^2 \right)$. Here, with $z_i = 1$: $\left( \sum x_i \right)^2 \le n \sum x_i^2$.

Define the residuals as $e_i = y_i - \hat\alpha - \hat\beta x_i$. From the normal equations: $\sum e_i = \sum x_i e_i = 0$.
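
Both residual identities are easy to confirm numerically (continuing the same sketch):

```python
# Residuals from the LS fit
e = y - alpha_hat - beta_hat * x

# The normal equations imply sum(e_i) = 0 and sum(x_i e_i) = 0
assert np.isclose(e.sum(), 0.0)
assert np.isclose((x * e).sum(), 0.0)
```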

7 Proof of Minimization

Consider alternative estimators $a^*$ and $b^*$:
$S(a^*, b^*) = \sum (y_i - a^* - b^* x_i)^2 = \sum \left[ (y_i - \hat\alpha - \hat\beta x_i) + (\hat\alpha - a^*) + (\hat\beta - b^*) x_i \right]^2$
$= \sum e_i^2 + 2(\hat\alpha - a^*) \sum e_i + 2(\hat\beta - b^*) \sum x_i e_i + \sum \left[ (\hat\alpha - a^*) + (\hat\beta - b^*) x_i \right]^2 \ge \sum e_i^2$
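
The inequality can also be spot-checked by evaluating the criterion at random alternatives (continuing the sketch; the perturbations are arbitrary):

```python
# Any alternative (a*, b*) yields a criterion value no smaller than sum(e_i^2)
S_min = np.sum(e ** 2)
for _ in range(1000):
    a_star = alpha_hat + rng.normal()
    b_star = beta_hat + rng.normal()
    assert np.sum((y - a_star - b_star * x) ** 2) >= S_min
```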

8 Properties of Estimators

LS estimators are unbiased:
$\hat\beta = \dfrac{\sum (x_i - \bar x) y_i}{\sum (x_i - \bar x)^2} = \dfrac{\alpha \sum (x_i - \bar x) + \beta \sum (x_i - \bar x) x_i + \sum (x_i - \bar x) \varepsilon_i}{\sum (x_i - \bar x)^2} = \beta + \dfrac{\sum (x_i - \bar x) \varepsilon_i}{\sum (x_i - \bar x)^2}$
$\Rightarrow E\hat\beta = \beta$

$\hat\alpha = \bar y - \hat\beta \bar x = \alpha + (\beta - \hat\beta) \bar x + \bar\varepsilon \Rightarrow E\hat\alpha = \alpha$
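
Unbiasedness can be illustrated by simulation: hold $x$ fixed, redraw the errors many times, and average the estimates (a sketch using the arbitrary truth $\alpha = 1$, $\beta = 2$, $\sigma = 0.5$ from above):

```python
# Monte Carlo check: E(alpha_hat) = alpha and E(beta_hat) = beta
alpha, beta, sigma, reps = 1.0, 2.0, 0.5, 5000
a_hats, b_hats = np.empty(reps), np.empty(reps)
for r in range(reps):
    ysim = alpha + beta * x + rng.normal(0.0, sigma, n)
    b_hats[r] = np.sum((x - xbar) * (ysim - ysim.mean())) / np.sum((x - xbar) ** 2)
    a_hats[r] = ysim.mean() - b_hats[r] * xbar
print(a_hats.mean(), b_hats.mean())  # approximately 1.0 and 2.0
```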

9 More Properties

We cannot get more properties without further assumptions. Assume $V(y_i \mid x_i) = V(\varepsilon_i) = \sigma^2$ and $\mathrm{Cov}(\varepsilon_i, \varepsilon_j) = 0$ for $i \ne j$.

Now:
$V(\hat\beta) = E(\hat\beta - \beta)^2 = E \left[ \dfrac{\sum (x_i - \bar x) \varepsilon_i}{\sum (x_i - \bar x)^2} \right]^2 = \dfrac{\sum (x_i - \bar x)^2 \, \sigma^2}{\left[ \sum (x_i - \bar x)^2 \right]^2}$, using $E\varepsilon_i \varepsilon_j = 0$.

Thus: $V(\hat\beta) = \dfrac{\sigma^2}{\sum (x_i - \bar x)^2}$

10 More Properties cont'd.

Now for $V(\hat\alpha)$: $\hat\alpha - \alpha = (\beta - \hat\beta) \bar x + \bar\varepsilon$
$\Rightarrow V(\hat\alpha) = V(\hat\beta) \bar x^2 + \dfrac{\sigma^2}{n}$
$\Rightarrow V(\hat\alpha) = \sigma^2 \left( \dfrac{1}{n} + \dfrac{\bar x^2}{\sum (x_i - \bar x)^2} \right)$

This requires $\mathrm{Cov}(\hat\beta, \bar\varepsilon) = 0$. Why?
$E(\hat\beta - \beta)\bar\varepsilon = E \left[ \dfrac{\sum (x_i - \bar x) \varepsilon_i}{\sum (x_i - \bar x)^2} \cdot \dfrac{1}{n} \sum_j \varepsilon_j \right] = \dfrac{\sigma^2 \sum (x_i - \bar x) / n}{\sum (x_i - \bar x)^2} = 0$

11 Engel Curve Example cont'd.

We know: $p_j z_j = \dfrac{a_j}{\sum_\ell a_\ell}\, m$. Is $V(\varepsilon_j) = \sigma^2$ plausible here?

How about logs: $\ln(p_j z_j) = \ln \dfrac{a_j}{\sum_\ell a_\ell} + \ln m$? This implies the regression equation $y = \alpha + \beta x$, where $y$ is log expenditure on good $j$ and $x$ is log income. What are our expectations about the estimator values? (The model implies $\beta = 1$ and $\alpha = \ln(a_j / \sum_\ell a_\ell)$.) Is this better?

12 Covariance of Estimators

$\mathrm{Cov}(\hat\alpha, \hat\beta) = E[(\hat\alpha - \alpha)(\hat\beta - \beta)] = E \left[ \left( (\beta - \hat\beta) \bar x + \bar\varepsilon \right) \dfrac{\sum (x_i - \bar x) \varepsilon_i}{\sum (x_i - \bar x)^2} \right]$
$= -\bar x \, E \left[ \dfrac{\sum (x_i - \bar x) \varepsilon_i}{\sum (x_i - \bar x)^2} \right]^2 = \dfrac{-\sigma^2 \bar x}{\sum (x_i - \bar x)^2}$
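
All three moment formulas can be checked against the Monte Carlo draws generated above (continuing the same sketch; theoretical value on the left, simulated value on the right):

```python
Sxx = np.sum((x - xbar) ** 2)

# V(beta_hat), V(alpha_hat), and Cov(alpha_hat, beta_hat): theory vs. simulation
print(sigma**2 / Sxx,                     b_hats.var())
print(sigma**2 * (1 / n + xbar**2 / Sxx), a_hats.var())
print(-sigma**2 * xbar / Sxx,             np.cov(a_hats, b_hats)[0, 1])
```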

13 Gauss-Markov Theorem

The LS estimator is the best linear unbiased estimator (BLUE).

Proof: define $w_i = \dfrac{x_i - \bar x}{\sum (x_i - \bar x)^2}$, so $\hat\beta = \sum w_i y_i$. Consider an alternative linear unbiased estimator $\tilde\beta = \sum c_i y_i$, and write $c_i = w_i + d_i$.

Note: $E\tilde\beta = \beta \Rightarrow E \sum c_i (\alpha + \beta x_i + \varepsilon_i) = \alpha \sum c_i + \beta \sum c_i x_i = \beta$
$\Rightarrow \sum c_i = 0, \quad \sum c_i x_i = 1$

14 Gauss-Markov Theorem proof cont'd.

Note that $w_i$ satisfies $\sum w_i = 0$ and $\sum w_i x_i = 1$, so $\sum d_i = 0$ and $\sum d_i x_i = 0$.

Now we have
$V(\tilde\beta) = E \left( \sum c_i \varepsilon_i \right)^2 = \sigma^2 \sum c_i^2 = \sigma^2 \sum (w_i + d_i)^2 = \sigma^2 \left[ \sum w_i^2 + 2 \sum w_i d_i + \sum d_i^2 \right]$

so
$V(\tilde\beta) - V(\hat\beta) = \sigma^2 \left[ \sum d_i^2 + 2 \sum w_i d_i \right] = \sigma^2 \sum d_i^2 \ge 0$

WHY? Because $\sum w_i d_i = \sum d_i (x_i - \bar x) / \sum (x_i - \bar x)^2 = 0$.

The difference is minimized when the estimators are identical! A similar argument applies for $\hat\alpha$ and for any linear combination of $\hat\alpha$ and $\hat\beta$.
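
The inequality can be demonstrated with one concrete alternative. A sketch: build weights $c_i$ from $v_i = \mathrm{sign}(x_i - \bar x)$ (an arbitrary illustrative choice), centered and rescaled to satisfy the unbiasedness constraints:

```python
# An alternative linear unbiased estimator: c_i from v_i = sign(x_i - xbar),
# centered and rescaled so that sum(c_i) = 0 and sum(c_i x_i) = 1
v = np.sign(x - xbar)
c = (v - v.mean()) / np.sum((v - v.mean()) * x)
assert np.isclose(c.sum(), 0.0) and np.isclose((c * x).sum(), 1.0)

# Gauss-Markov: sigma^2 * sum(c_i^2) >= sigma^2 / Sxx = V(beta_hat)
print(sigma**2 * np.sum(c ** 2), ">=", sigma**2 / Sxx)
```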

15 Estimation of Variance

It is natural to use the sum of squared residuals to obtain information about the variance.
$e_i = y_i - \hat\alpha - \hat\beta x_i = (y_i - \bar y) - \hat\beta (x_i - \bar x) = -(\hat\beta - \beta)(x_i - \bar x) + (\varepsilon_i - \bar\varepsilon)$
$\Rightarrow \sum e_i^2 = (\hat\beta - \beta)^2 \sum (x_i - \bar x)^2 + \sum (\varepsilon_i - \bar\varepsilon)^2 - 2 (\hat\beta - \beta) \sum (x_i - \bar x)(\varepsilon_i - \bar\varepsilon)$

This will involve $\sigma^2$ in expectation, term by term. First term:
$E (\hat\beta - \beta)^2 \sum (x_i - \bar x)^2 = \sigma^2$

16 Estimation of Variance cont'd.

Second term:
$E \sum (\varepsilon_i - \bar\varepsilon)^2 = E \left[ \sum \varepsilon_i^2 + n \bar\varepsilon^2 - 2 \bar\varepsilon \sum \varepsilon_i \right] = n\sigma^2 + \sigma^2 - 2\sigma^2 = (n - 1)\sigma^2$

Third term:
$E \left[ -2 (\hat\beta - \beta) \sum (x_i - \bar x)(\varepsilon_i - \bar\varepsilon) \right] = -2 E \dfrac{\left[ \sum (x_i - \bar x) \varepsilon_i \right]^2}{\sum (x_i - \bar x)^2} = -2\sigma^2$,
using $\sum (x_i - \bar x)(\varepsilon_i - \bar\varepsilon) = \sum (x_i - \bar x)\varepsilon_i$ since $\sum (x_i - \bar x) = 0$.

17 Estimation of Variance cont'd.

Adding the terms, we get: $E \sum e_i^2 = \sigma^2 + (n-1)\sigma^2 - 2\sigma^2 = (n - 2)\sigma^2$

This suggests the estimator $s^2 = \sum e_i^2 / (n - 2)$. This is an unbiased estimator. It is a quadratic function of $y$. This is all we can say without further assumptions.
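
The degrees-of-freedom correction can be verified by simulation as well (continuing the sketch; $S_{xx}$ is as defined above):

```python
# Monte Carlo check that s^2 = sum(e_i^2) / (n - 2) is unbiased for sigma^2
s2 = np.empty(reps)
for r in range(reps):
    ysim = alpha + beta * x + rng.normal(0.0, sigma, n)
    b = np.sum((x - xbar) * (ysim - ysim.mean())) / Sxx
    a_ = ysim.mean() - b * xbar
    s2[r] = np.sum((ysim - a_ - b * x) ** 2) / (n - 2)
print(s2.mean(), "vs", sigma**2)  # approximately equal
```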

18 Summing Up

With the assumption $Ey_i = \alpha + \beta x_i$, we can calculate unbiased estimators of $\alpha$ and $\beta$ (linear in $y_i$).

Adding the assumptions $V(y_i \mid x_i) = \sigma^2$ and $E\varepsilon_i \varepsilon_j = 0$ for $i \ne j$, we can obtain sampling variances for $\hat\alpha$ and $\hat\beta$, establish an optimality property, and construct an unbiased estimator for $\sigma^2$.

Note that the optimality property may not be that compelling, and that we have very little information about the variance estimate.