Advanced Econometrics I


Lecture Notes, Autumn 2010. Dr. Getinet Haile, University of Mannheim

1. Introduction

What is econometrics?
Econometrics is the unification of economic statistics, economic theory, and mathematics (according to Ragnar Frisch).
Important notions:
- Data are perceived as realizations of random variables
- Parameters are real numbers, not random variables
- Joint distributions of random variables depend on parameters
- A model is a set of restrictions on the joint distribution of variables

Motivating example: the Mincer equation (the influence of schooling on wages)
ln(WAGE_i) = β_1 + β_2 S_i + β_3 TENURE_i + β_4 EXPR_i + ε_i
Notation:
- ln(WAGE_i): logarithm of the wage rate
- S_i: years of schooling
- TENURE_i: experience in the current job
- EXPR_i: experience in the labor market
We estimate the parameters β_k; in particular, β_2 is the return to schooling.

The importance of relationships
Relationships among variables are what empirical analysis is all about (this section draws on Angrist, 2009). Consider a dependent variable y_i and a vector of K explanatory variables x_i, and let r_i = (y_i, x_i')', i = 1, ..., N, be i.i.d. Our interest is in the relationship between y_i and the explanatory variables, with the ultimate agenda of:

- Description: how does y_i usually vary with x_i?
- Prediction: can we use x_i to forecast y_i?
- Causality: what is the effect of elements of x_i on y_i?
Generally we look for relationships that hold on average. On-average relationships among variables are summarised by the Conditional Expectation Function (CEF):
E[y_i | x_i] ≡ h(x_i), so that y_i = h(x_i) + ε_i with E[ε_i | x_i] = 0.

General regression equations
Generalization: y_i = β_1 x_i1 + β_2 x_i2 + ... + β_K x_iK + ε_i
with index i = 1, 2, ..., n for observations and k = 1, 2, ..., K for regressors. In vector notation:
y_i = x_i'β + ε_i
(1×1) = (1×K)(K×1) + (1×1)
where β = (β_1, β_2, ..., β_K)' and x_i = (x_i1, x_i2, ..., x_iK)'.

The key problem of econometrics: we deal with non-experimental data
Unobservable variables, interdependence, endogeneity, causality. Examples:
- Ability bias in the Mincer equation
- Reverse causality if unemployment is regressed on a liberalization index
- The causal effect of the police force on crime is not an independent outcome
- Simultaneity in demand-price equations

2. The CLRM: Parameter Estimation by OLS (Hayashi, pp. 6, 15-18)

Classical linear regression model (CLRM)
y_i = β_1 x_i1 + β_2 x_i2 + ... + β_K x_iK + ε_i = x_i'β + ε_i
- y_i: dependent variable, observed
- x_i = (x_i1, x_i2, ..., x_iK)': explanatory variables, observed
- β = (β_1, β_2, ..., β_K)': unknown parameters
- ε_i: disturbance component, unobserved
- b = (b_1, b_2, ..., b_K)': estimator of β
- e_i = y_i - x_i'b: estimated residual

For convenience we introduce matrix notation:
y = Xβ + ε
(n×1) = (n×K)(K×1) + (n×1)
where y = (y_1, y_2, ..., y_n)', ε = (ε_1, ε_2, ..., ε_n)', and X is the n×K data matrix whose i-th row is x_i' = (1, x_i2, ..., x_iK).

Written out extensively, this is a system of linear equations:
y_1 = β_1 + β_2 x_12 + ... + β_K x_1K + ε_1
y_2 = β_1 + β_2 x_22 + ... + β_K x_2K + ε_2
...
y_n = β_1 + β_2 x_n2 + ... + β_K x_nK + ε_n

We estimate the linear model by choosing b such that the SSR is minimized
We obtain an estimator b of β by minimizing the SSR (sum of squared residuals):
b = argmin_b S(b) = argmin_b Σ_{i=1}^n e_i² = argmin_b Σ_{i=1}^n (y_i - x_i'b)²
Differentiating with respect to b_1, b_2, ..., b_K yields the first-order conditions (FOCs):
(1) ∂S(b)/∂b_1 = 0 ⇒ Σ e_i = 0
(2) ∂S(b)/∂b_2 = 0 ⇒ Σ e_i x_i2 = 0
...
(K) ∂S(b)/∂b_K = 0 ⇒ Σ e_i x_iK = 0
The FOCs can be conveniently written in matrix notation: X'e = 0

The system of K equations is solved by matrix algebra:
X'e = X'(y - Xb) = X'y - X'Xb = 0
Premultiplying by (X'X)^{-1}:
(X'X)^{-1}X'y - (X'X)^{-1}X'Xb = 0
(X'X)^{-1}X'y - Ib = 0
OLS estimator: b = (X'X)^{-1}X'y
Alternatively: b = ((1/n)X'X)^{-1}(1/n)X'y = ((1/n)Σ_{i=1}^n x_i x_i')^{-1}(1/n)Σ_{i=1}^n x_i y_i
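
A minimal numerical sketch of this formula (the simulated design, variable names, and use of numpy are illustrative, not part of the notes):

```python
import numpy as np

# Sketch: compute b = (X'X)^{-1} X'y on simulated data and check the FOCs X'e = 0.
rng = np.random.default_rng(0)
n, K = 100, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, K - 1))])  # first column: constant
beta = np.array([1.0, 2.0, -0.5])                               # illustrative true parameters
y = X @ beta + rng.normal(size=n)

# Solving the normal equations X'X b = X'y is numerically safer than an explicit inverse
b = np.linalg.solve(X.T @ X, X.T @ y)
e = y - X @ b
print("b:", b)
print("X'e (should be ~0):", X.T @ e)
```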

Zooming into the matrices X'X and X'y: Σ_{i=1}^n x_i x_i' is the K×K matrix
[ Σ x_i1²      Σ x_i1 x_i2   ...   Σ x_i1 x_iK ]
[ Σ x_i1 x_i2  Σ x_i2²       ...   Σ x_i2 x_iK ]
[ ...          ...           ...   ...         ]
[ Σ x_i1 x_iK  Σ x_i2 x_iK   ...   Σ x_iK²     ]
and Σ_{i=1}^n x_i y_i is the K×1 vector (Σ x_i1 y_i, Σ x_i2 y_i, ..., Σ x_iK y_i)'.

3. Assumptions of the CLRM (Hayashi, pp. 3-13)

The four core assumptions of the CLRM
1.1 Linearity: y_i = x_i'β + ε_i
1.2 Strict exogeneity: E(ε_i | X) = 0, which implies E(ε_i) = 0 and Cov(ε_i, x_ik) = E(ε_i x_ik) = 0
1.3 No exact multicollinearity: P(rank(X) = K) = 1, i.e. no linear dependencies in the data matrix
1.4 Spherical disturbances: Var(ε_i | X) = E(ε_i² | X) = σ² and Cov(ε_i, ε_j | X) = E(ε_i ε_j | X) = 0 for i ≠ j; by the LTE these imply E(ε_i²) = σ² and Cov(ε_i, ε_j) = 0 (see Hayashi p. 18)

Interpreting the parameters β in different types of linear equations
- Linear model y_i = β_1 + β_2 x_i2 + ... + β_K x_iK + ε_i: a one-unit increase in the regressor x_ik changes the dependent variable by β_k units.
- Semi-log form log(y_i) = β_1 + β_2 x_i2 + ... + β_K x_iK + ε_i: a one-unit increase in x_ik changes the dependent variable by approximately 100·β_k percent.
- Log-linear model log(y_i) = β_1 log(x_i1) + β_2 log(x_i2) + ... + β_K log(x_iK) + ε_i: a one-percent increase in x_ik changes y_i by approximately β_k percent.
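
Where the "100·β_k percent" rule in the semi-log form comes from (a standard calculus step, spelled out here for completeness):

```latex
% Semi-log form: \log(y_i) = \beta_1 + \dots + \beta_k x_{ik} + \dots + \varepsilon_i
% Differentiating with respect to x_{ik}:
\frac{\partial \log y_i}{\partial x_{ik}} = \beta_k
\quad\Longrightarrow\quad
\frac{\partial y_i / y_i}{\partial x_{ik}} = \beta_k
% A one-unit change in x_{ik} therefore changes y_i by roughly 100\,\beta_k percent;
% the exact change is 100\,(e^{\beta_k} - 1) percent, close to 100\,\beta_k for small \beta_k.
```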

Some important laws
- Law of Total Expectation (LTE): E_X[E_{Y|X}(Y | X)] = E_Y(Y)
- Double Expectation Theorem (DET): E_X[E_{Y|X}(g(Y) | X)] = E_Y(g(Y))
- Law of Iterated Expectations (LIE): E_{Z|X}[E_{Y|X,Z}(Y | X, Z) | X] = E_{Y|X}(Y | X)

Some important laws (continued)
- Generalized DET: E_X[E_{Y|X}(g(X, Y) | X)] = E_{X,Y}(g(X, Y))
- Linearity of conditional expectations: E_{Y|X}[g(X)Y | X] = g(X) E_{Y|X}[Y | X]

4. Finite sample properties of the OLS estimator (Hayashi, pp. 27-31)

Finite sample properties of b = (X'X)^{-1}X'y
1. E(b) = β: unbiasedness of the estimator. Holds for any sample size, under assumptions 1.1-1.3.
2. Var(b | X) = σ²(X'X)^{-1}: conditional variance of b. The conditional variance depends on the data. Holds under assumptions 1.1-1.4.
3. Var(β̂ | X) ≥ Var(b | X), where β̂ is any other linear unbiased estimator of β. Holds under assumptions 1.1-1.4.

Some key results from mathematical statistics
Let z = (z_1, z_2, ..., z_n)' be an (n×1) random vector and A a non-random (m×n) matrix. The new (m×1) random variable v = Az satisfies
E(v) = (E(v_1), E(v_2), ..., E(v_m))' = A E(z)
Var(v) = A Var(z) A', an (m×m) matrix

The OLS estimator's unbiasedness
E(b) = β, i.e. E(b - β) = 0, where b - β is the sampling error:
b - β = (X'X)^{-1}X'y - β
      = (X'X)^{-1}X'(Xβ + ε) - β
      = (X'X)^{-1}X'Xβ + (X'X)^{-1}X'ε - β
      = β + (X'X)^{-1}X'ε - β
      = (X'X)^{-1}X'ε
Hence E(b - β | X) = (X'X)^{-1}X' E(ε | X) = 0 under assumption 1.2, and E_X(E(b | X)) = E_X(β) = E(b) by the LTE.

We show that Var(b | X) = σ²(X'X)^{-1}
Let A ≡ (X'X)^{-1}X', so that b - β = Aε; note that β is non-random and Var(ε | X) = σ²I_n. Then
Var(b | X) = Var(b - β | X)
           = Var(Aε | X)
           = A Var(ε | X) A'
           = A σ²I_n A'
           = σ² A A'
           = σ²(X'X)^{-1}X'X(X'X)^{-1}
           = σ²(X'X)^{-1}
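
Both finite-sample results can be illustrated by a small Monte Carlo experiment: holding X fixed and redrawing ε, the mean of b across replications should approach β and the empirical covariance of b should approach σ²(X'X)^{-1}. The setup below (sample size, parameters, number of replications) is my own illustration:

```python
import numpy as np

# Monte Carlo check of E(b) = beta and Var(b|X) = sigma^2 (X'X)^{-1}; X fixed across draws.
rng = np.random.default_rng(1)
n, R, sigma = 50, 5000, 1.5
X = np.column_stack([np.ones(n), rng.normal(size=n)])
beta = np.array([1.0, 2.0])

draws = np.empty((R, 2))
for r in range(R):
    eps = rng.normal(scale=sigma, size=n)        # fresh disturbances each replication
    y = X @ beta + eps
    draws[r] = np.linalg.solve(X.T @ X, X.T @ y)

print("mean of b over replications:", draws.mean(axis=0))   # close to beta
print("empirical Var(b|X):\n", np.cov(draws, rowvar=False))
print("sigma^2 (X'X)^{-1}:\n", sigma**2 * np.linalg.inv(X.T @ X))
```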

Sketch of the proof of the Gauss-Markov theorem: Var(β̂ | X) ≥ Var(b | X)
Let β̂ = Cy be any linear unbiased estimator, where C is a function of X; define A ≡ (X'X)^{-1}X' and D ≡ C - A. Then
Var(β̂ | X) = Var(β̂ - β | X)
            = Var[(D + A)ε | X]
            = (D + A) Var(ε | X) (D' + A')
            = σ²(D + A)(D' + A')
            = σ²(DD' + AD' + DA' + AA')
            = σ²[DD' + (X'X)^{-1}]
            ≥ σ²(X'X)^{-1} = Var(b | X)
(unbiasedness of β̂ requires DX = 0, so the cross terms AD' and DA' vanish, and DD' is positive semidefinite). Details of the proof: Hayashi, pp. 29-30.

The OLS estimator is BLUE
- OLS is the best estimator: Var(β̂ | X) ≥ Var(b | X); holds by the Gauss-Markov theorem
- OLS is linear: holds under assumption 1.1
- OLS is unbiased: holds under assumptions 1.1-1.3

5. Hypothesis Testing under Normality (Hayashi, pp. 33-45)

Hypothesis testing
Economic theory provides hypotheses about parameters; if the theory is right, it has testable implications. But hypotheses cannot be tested without distributional assumptions about ε.
Distributional assumption 1.5 (normality): the conditional distribution of the errors is ε | X ~ MVN(0, σ²I_n).

Some facts from multivariate statistics
- Vector of random variables: x = (x_1, x_2, ..., x_n)'
- Expectation vector: E(x) = μ = (μ_1, μ_2, ..., μ_n)' = (E(x_1), E(x_2), ..., E(x_n))'
- Variance-covariance matrix: Var(x) = Σ, the n×n matrix with Var(x_i) on the diagonal and Cov(x_i, x_j) in position (i, j)
- For y = c + Ax with non-random c and A: E(y) = c + Aμ and Var(y) = AΣA'
- If x ~ MVN(μ, Σ), then y = c + Ax ~ MVN(c + Aμ, AΣA')

Applying these facts and assumptions 1.1-1.5 to the sampling error b - β = (X'X)^{-1}X'ε: assuming ε | X ~ MVN(0, σ²I_n),
b - β | X ~ MVN((X'X)^{-1}X' E(ε | X), (X'X)^{-1}X' σ²I_n X(X'X)^{-1})
b - β | X ~ MVN(0, σ²(X'X)^{-1})
Note that Var(b | X) = σ²(X'X)^{-1}: the OLS estimator is conditionally normally distributed if ε | X is multivariate normal.

Testing hypotheses about individual parameters (t-test)
Null hypothesis H_0: β_k = β̄_k, where β̄_k is a hypothesized value, a real number; alternative hypothesis H_A: β_k ≠ β̄_k. Under assumption 1.5 (ε | X ~ MVN(0, σ²I_n)), if H_0 is true then E(b_k) = β̄_k and the test statistic (for known σ²) is
t_k = (b_k - β̄_k) / √(σ²[(X'X)^{-1}]_kk) ~ N(0, 1)
Note: [(X'X)^{-1}]_kk is the k-th row, k-th column element of (X'X)^{-1}.

The nuisance parameter σ² can be estimated
σ² = E(ε_i² | X) = Var(ε_i | X) = E(ε_i²) = Var(ε_i). We do not observe ε_i, so we use the residuals e_i = y_i - x_i'b:
σ̂² = (1/n) Σ_{i=1}^n (e_i - ē)² = (1/n) Σ_{i=1}^n e_i² = (1/n) e'e
(the residuals have mean zero by the first FOC when the model includes a constant). σ̂² is a biased estimator: E(σ̂² | X) = ((n - K)/n) σ².

An unbiased estimator of σ²
For s² = (1/(n - K)) Σ_{i=1}^n e_i² = (1/(n - K)) e'e we get an unbiased estimator:
E(s² | X) = (1/(n - K)) E(e'e | X) = σ², and hence E(s²) = E(E(s² | X)) = σ²
Using s² provides an unbiased estimator of Var(b | X) = σ²(X'X)^{-1}, namely s²(X'X)^{-1}. The t-statistic under H_0 becomes
t_k = (b_k - β̄_k) / √(s²[(X'X)^{-1}]_kk) = (b_k - β̄_k) / SE(b_k) ~ t(n - K)
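
The bias of σ̂² and the unbiasedness of s² can also be checked by simulation (the setup is illustrative):

```python
import numpy as np

# Compare E(sigma_hat^2) = ((n-K)/n) sigma^2 with E(s^2) = sigma^2 across replications.
rng = np.random.default_rng(7)
n, K, sigma2, R = 30, 3, 2.0, 20000
X = np.column_stack([np.ones(n), rng.normal(size=(n, K - 1))])
beta = np.array([1.0, 0.5, -0.5])

sig_hat, s2 = np.empty(R), np.empty(R)
for r in range(R):
    y = X @ beta + rng.normal(scale=np.sqrt(sigma2), size=n)
    e = y - X @ np.linalg.solve(X.T @ X, X.T @ y)
    sig_hat[r] = e @ e / n          # biased estimator
    s2[r] = e @ e / (n - K)         # unbiased estimator

print("mean sigma_hat^2:", sig_hat.mean(), "(theory:", (n - K) / n * sigma2, ")")
print("mean s^2:        ", s2.mean(), "(theory:", sigma2, ")")
```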

Decision rule for the t-test
1. State H_0: β_k = β̄_k (often β̄_k = 0) and H_A: β_k ≠ β̄_k.
2. Given β̄_k, the OLS estimate b_k, and s², compute t_k = (b_k - β̄_k) / SE(b_k).
3. Fix the significance level α of the two-sided test.
4. Fix the non-rejection and rejection regions and decide.
Remark: √(σ²[(X'X)^{-1}]_kk) is the standard deviation of b_k | X; √(s²[(X'X)^{-1}]_kk) is the standard error of b_k | X.
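
The same recipe in code (simulated data; the design, the tested coefficient, and the use of scipy for critical values are illustrative choices):

```python
import numpy as np
from scipy import stats

# Sketch of the t-test: estimate, compute s^2 and SE(b_k), compare t_k with t_{alpha/2}(n-K).
rng = np.random.default_rng(2)
n, K = 200, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, K - 1))])
y = X @ np.array([1.0, 0.5, 0.0]) + rng.normal(size=n)

b = np.linalg.solve(X.T @ X, X.T @ y)
e = y - X @ b
s2 = e @ e / (n - K)                                 # unbiased estimator of sigma^2
se = np.sqrt(s2 * np.diag(np.linalg.inv(X.T @ X)))   # standard errors SE(b_k)

k, beta_bar, alpha = 2, 0.0, 0.05                    # H0: beta_3 = 0 (zero-based index k = 2)
t_k = (b[k] - beta_bar) / se[k]
t_crit = stats.t.ppf(1 - alpha / 2, df=n - K)
print(f"t = {t_k:.3f}, critical value = {t_crit:.3f}, reject H0: {abs(t_k) > t_crit}")
```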

Testing joint hypotheses (F-test / Wald test)
Write the hypothesis as
H_0: Rβ = r
where R is a (#r × K) matrix of real numbers, r is a (#r × 1) vector, and #r is the number of restrictions. Replacing β = (β_1, β_2, ..., β_K)' by the estimator b = (b_1, b_2, ..., b_K)' gives Rb, which is compared with r.

Definition of the F-test statistic
Properties of Rb: E(Rb | X) = Rβ = r under H_0, and Var(Rb | X) = R Var(b | X) R' = σ²R(X'X)^{-1}R', so
Rb | X ~ MVN(Rβ, σ²R(X'X)^{-1}R')
We use an additional fact from multivariate statistics: if z = (z_1, z_2, ..., z_m)' ~ MVN(μ, Ω), then (z - μ)'Ω^{-1}(z - μ) ~ χ²(m).
Applying this result gives the Wald statistic: under H_0,
(Rb - r)'[σ²R(X'X)^{-1}R']^{-1}(Rb - r) ~ χ²(#r)

Properties of the F-test statistic
Replace σ² by its unbiased estimate s² = (1/(n - K)) Σ e_i² = (1/(n - K)) e'e and divide by #r to obtain the F-ratio:
F = [(Rb - r)'[R(X'X)^{-1}R']^{-1}(Rb - r)/#r] / [(e'e)/(n - K)]
  = (Rb - r)'[s²R(X'X)^{-1}R']^{-1}(Rb - r)/#r ~ F(#r, n - K)
Note: the F-test is one-sided. Proof: see Hayashi, p. 41.

Decision rule for the F-test
1. Specify H_0: Rβ = r and H_A: Rβ ≠ r.
2. Calculate the F-statistic.
3. Look up the critical value F_α(#r, n - K) in the table of the F-distribution at the given significance level.
4. The null is not rejected at significance level α if F is less than F_α(#r, n - K).

Alternative representation of the F-statistic
Minimize the unrestricted sum of squared residuals: min_b Σ_{i=1}^n (y_i - x_i'b)² → SSR_U
Minimize the restricted sum of squared residuals (subject to Rb = r): min_b Σ_{i=1}^n (y_i - x_i'b)² s.t. Rb = r → SSR_R
F-ratio:
F = ((SSR_R - SSR_U)/#r) / (SSR_U/(n - K))
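
A sketch verifying that the Wald form and the SSR form give the same F-statistic, here for the illustrative hypothesis H_0: β_2 = β_3 = 0 in a K = 3 model (so the restricted regression keeps only the constant):

```python
import numpy as np
from scipy import stats

# F-test in both forms; they coincide exactly for linear restrictions.
rng = np.random.default_rng(3)
n, K = 120, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, K - 1))])
y = X @ np.array([1.0, 0.4, 0.0]) + rng.normal(size=n)

b = np.linalg.solve(X.T @ X, X.T @ y)
e = y - X @ b
s2 = e @ e / (n - K)

R = np.array([[0.0, 1.0, 0.0],        # H0: beta_2 = 0
              [0.0, 0.0, 1.0]])       #     beta_3 = 0, so #r = 2
r = np.zeros(2)

# Wald form: (Rb - r)' [R Var_hat(b|X) R']^{-1} (Rb - r) / #r
V = s2 * np.linalg.inv(X.T @ X)
d = R @ b - r
F_wald = d @ np.linalg.solve(R @ V @ R.T, d) / len(r)

# SSR form: the restricted model regresses y on the constant only
X_r = X[:, :1]
e_r = y - X_r @ np.linalg.solve(X_r.T @ X_r, X_r.T @ y)
F_ssr = ((e_r @ e_r - e @ e) / len(r)) / (e @ e / (n - K))

print(f"F (Wald form) = {F_wald:.4f}, F (SSR form) = {F_ssr:.4f}")
print("critical value F_0.05(2, n-K):", stats.f.ppf(0.95, len(r), n - K))
```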

6. Confidence intervals and goodness of fit measures (Hayashi, pp. 38 and 20)

Duality of t-test and confidence interval
Under H_0: β_k = β̄_k,
t_k = (b_k - β̄_k)/SE(b_k) ~ t(n - K)
The probability of non-rejection is
P(-t_{α/2}(n - K) ≤ t_k ≤ t_{α/2}(n - K)) = 1 - α
Here -t_{α/2}(n - K) and t_{α/2}(n - K) are the lower and upper critical values, t_k is a random variable (the value of the test statistic), and 1 - α is a fixed number. Rearranging gives
P(b_k - SE(b_k)·t_{α/2}(n - K) ≤ β_k ≤ b_k + SE(b_k)·t_{α/2}(n - K)) = 1 - α

The confidence interval
Confidence interval for β_k:
P(b_k - SE(b_k)·t_{α/2}(n - K) ≤ β_k ≤ b_k + SE(b_k)·t_{α/2}(n - K)) = 1 - α
The confidence bounds are random variables: b_k - SE(b_k)·t_{α/2}(n - K) is the lower bound and b_k + SE(b_k)·t_{α/2}(n - K) the upper bound.
Wrong interpretation: "the true parameter β_k lies with probability 1 - α within the bounds of the confidence interval." The problem: the confidence bounds are not fixed, they are random!
H_0 is rejected at significance level α if the hypothesized value does not lie within the bounds of the 1 - α confidence interval.
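
A short sketch of the interval and its duality with the t-test (data and names illustrative):

```python
import numpy as np
from scipy import stats

# 1 - alpha confidence interval for beta_k from b_k, SE(b_k) and t_{alpha/2}(n-K).
rng = np.random.default_rng(4)
n, K, alpha = 150, 2, 0.05
X = np.column_stack([np.ones(n), rng.normal(size=n)])
y = X @ np.array([1.0, 2.0]) + rng.normal(size=n)

b = np.linalg.solve(X.T @ X, X.T @ y)
e = y - X @ b
s2 = e @ e / (n - K)
se = np.sqrt(s2 * np.diag(np.linalg.inv(X.T @ X)))

k = 1                                              # slope coefficient
t_crit = stats.t.ppf(1 - alpha / 2, df=n - K)
lower, upper = b[k] - se[k] * t_crit, b[k] + se[k] * t_crit
print(f"{1 - alpha:.0%} CI for beta_{k + 1}: [{lower:.3f}, {upper:.3f}]")
# Duality: H0: beta_k = beta_bar is rejected at level alpha iff beta_bar falls outside this interval.
```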

Coefficient of determination: uncentered R²
A measure of the variability of the dependent variable: Σ y_i² = y'y. Decomposition of y'y:
y'y = (ŷ + e)'(ŷ + e) = ŷ'ŷ + 2ŷ'e + e'e = ŷ'ŷ + e'e
(the cross term vanishes because ŷ'e = 0). Define
R²_uc ≡ 1 - e'e / y'y
A good model explains much, so the residual variation is small compared to the explained variation.

Coefficient of determination: centered R² and adjusted R²
Use the centered R² if there is a constant in the model (x_i1 = 1):
Σ_{i=1}^n (y_i - ȳ)² = Σ_{i=1}^n (ŷ_i - ȳ)² + Σ_{i=1}^n e_i²
R²_c ≡ 1 - Σ e_i² / Σ (y_i - ȳ)² = 1 - SSR/SST
Note that R²_uc and R²_c both lie in the interval [0, 1] but describe different models; they are not comparable!
The adjusted R² is constructed with a penalty for heavy parametrization:
R²_adj = 1 - (SSR/(n - K)) / (SST/(n - 1)) = 1 - ((n - 1)/(n - K)) · (SSR/SST)
The adjusted R² is an accepted model selection criterion.
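
The three fit measures side by side (illustrative data; the formulas are exactly those above):

```python
import numpy as np

# Uncentered, centered, and adjusted R^2 for one fitted model with a constant.
rng = np.random.default_rng(5)
n, K = 100, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, K - 1))])
y = X @ np.array([1.0, 0.8, -0.3]) + rng.normal(size=n)

b = np.linalg.solve(X.T @ X, X.T @ y)
e = y - X @ b
SSR = e @ e
SST = np.sum((y - y.mean()) ** 2)

R2_uc = 1 - SSR / (y @ y)                        # uncentered R^2
R2_c = 1 - SSR / SST                             # centered R^2 (model has a constant)
R2_adj = 1 - (SSR / (n - K)) / (SST / (n - 1))   # adjusted R^2
print(f"R2_uc = {R2_uc:.3f}, R2_c = {R2_c:.3f}, R2_adj = {R2_adj:.3f}")
```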

Alternative goodness of fit measures
Akaike criterion (AIC): log(SSR/n) + 2K/n
Schwarz criterion (SBC): log(SSR/n) + log(n)·K/n
Both criteria include a penalty term for heavy parametrization; select the model with the smallest AIC/SBC.
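
A sketch comparing two nested specifications by these criteria; with an irrelevant regressor added, both criteria should typically favour the smaller model (the data-generating setup is illustrative):

```python
import numpy as np

# AIC and SBC as defined above: log(SSR/n) + 2K/n and log(SSR/n) + log(n) K/n.
rng = np.random.default_rng(6)
n = 100
X_full = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])
y = X_full @ np.array([1.0, 0.8, 0.0]) + rng.normal(size=n)   # third regressor is irrelevant

def aic_sbc(X, y):
    n, K = X.shape
    e = y - X @ np.linalg.solve(X.T @ X, X.T @ y)
    SSR = e @ e
    return np.log(SSR / n) + 2 * K / n, np.log(SSR / n) + np.log(n) * K / n

for name, Xm in [("K = 2 (true model)", X_full[:, :2]), ("K = 3 (overfitted)", X_full)]:
    aic, sbc = aic_sbc(Xm, y)
    print(f"{name}: AIC = {aic:.4f}, SBC = {sbc:.4f}")
```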