Multivariate Regression Analysis


J.M. Rodriguez-Poo (Université de Genève), Introduction to Econometrics, March 24, 2016

Matrices and vectors

The model for the sample is $Y = X\beta + u$, with $n$ individuals, $l$ response variables, and $k$ regressors:

- $Y$ is an $n \times 1$ vector (or, in general, an $n \times l$ matrix), with $Y^T = (y_1, y_2, \ldots, y_n)$
- $X$ is an $n \times (k+1)$ matrix,
$$X = \begin{pmatrix} 1 & x_{11} & x_{12} & \cdots & x_{1k} \\ \vdots & \vdots & \vdots & & \vdots \\ 1 & x_{n1} & x_{n2} & \cdots & x_{nk} \end{pmatrix}$$
- $\beta$ is a $(k+1) \times 1$ vector, $\beta^T = (\beta_0, \beta_1, \beta_2, \ldots, \beta_k)$
- $u$ is an $n \times 1$ vector, $u^T = (u_1, u_2, \ldots, u_n)$

Throughout this course we consider only the case $l = 1$.

The Multivariate Linear Regression Model: Details

$$y = \beta_0 + \sum_{j=1}^{k} \beta_j x_j + u = (1, x_1, \ldots, x_k)\,(\beta_0, \beta_1, \ldots, \beta_k)^T + u$$

Some remarks:

- From now on it is preferable to use matrix notation, because the formulas become more compact.
- The estimation principles are exactly the same as in the simple regression model.
- The variance decomposition principle and the coefficient of determination ($R^2$) also remain the same.
- Keep in mind the marginal, ceteris paribus interpretation of each coefficient.

Variance Decomposition and the R-squared

Once again, think of each observation as composed of an explained component and an unexplained component, $y_i = \hat{y}_i + \hat{u}_i$, so that

$$\mathrm{Var}[Y] = \mathrm{Var}[u] + \mathrm{Var}[\text{Econ. Model}] + 2\,\mathrm{Cov}[u, \text{Econ. Model}] = E[u^2] + \mathrm{Var}[\text{Econ. Model}] + 0.$$

In the linear regression model with a single regressor, $\mathrm{Var}[\text{Econ. Model}] = \beta_1^2 \mathrm{Var}[X]$. Note that we have obtained $TSS = ESS + RSS$. How well does our regression surface correspond to the observed data? This is exactly what the $R^2$ measures.
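As a minimal sketch (not part of the original slides), the following Python snippet simulates a small dataset, fits OLS in matrix form, and checks the decomposition $TSS = ESS + RSS$ numerically; the data-generating process and all variable names are illustrative choices.

```python
# Illustrative check of TSS = ESS + RSS on simulated data (assumed DGP).
import numpy as np

rng = np.random.default_rng(0)
n, k = 200, 2
X = np.column_stack([np.ones(n), rng.normal(size=(n, k))])  # first column of ones
beta = np.array([1.0, 2.0, -0.5])
y = X @ beta + rng.normal(size=n)

beta_hat = np.linalg.solve(X.T @ X, X.T @ y)   # OLS in matrix form
y_hat = X @ beta_hat
u_hat = y - y_hat

TSS = np.sum((y - y.mean())**2)
ESS = np.sum((y_hat - y.mean())**2)
RSS = np.sum(u_hat**2)
print(np.isclose(TSS, ESS + RSS))  # True: the decomposition holds
print("R^2 =", ESS / TSS)
```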

Variance Decomposition: Projection

(Figure slide: the OLS fit as a projection, with fitted values $\hat{Y} = X\hat{\beta}$.)

Assumptions for Unbiasedness

1. The population model is linear in the parameters: $y = \beta_0 + \sum_{j=1}^{k} \beta_j x_j + u$.
2. We can use a simple random sample of size $n \geq k+1$, $\{(x_{i1}, x_{i2}, \ldots, x_{ik}, y_i) : i = 1, 2, \ldots, n\}$, from the population model, so that $y_i = \beta_0 + \sum_{j=1}^{k} \beta_j x_{ij} + u_i$, i.e. $Y = X\beta + u$.
3. $E[u \mid x_1, x_2, \ldots, x_k] = E[u] = 0$; this implies that all our explanatory variables are exogenous, and $\sigma^2 < \infty$.
4. None of the explanatory variables is constant, and there is no exact linear relationship among the columns of $X$ (otherwise we have multicollinearity).

Matrix Notation for the OLS Estimator

In order to minimize the residual sum of squares we consider, using $(AB)^T = B^T A^T$,

$$\sum_{i=1}^{n} u_i^2 = u^T u = (Y - X\beta)^T (Y - X\beta) = Y^T Y - Y^T X\beta - \beta^T X^T Y + \beta^T X^T X\beta = Y^T Y - 2\beta^T X^T Y + \beta^T X^T X\beta,$$

in such a way that the score vector is

$$\frac{\partial\, u^T u}{\partial \beta^T} = -2X^T Y + 2X^T X\beta,$$

and this gives us the first-order condition (FOC)

$$X^T Y - X^T X\hat{\beta} = 0 \;\Longleftrightarrow\; X^T Y = X^T X\hat{\beta} \;\Longleftrightarrow\; \hat{\beta} = (X^T X)^{-1} X^T Y.$$

(We could also have chosen MM, or ML under normality.)

Bias of the Estimator

Taking our assumptions into account we obtain

$$\hat{\beta} = (X^T X)^{-1} X^T (X\beta + u) = (X^T X)^{-1} X^T X\beta + (X^T X)^{-1} X^T u = \beta + (X^T X)^{-1} X^T u$$

$$\Longrightarrow\; E[\hat{\beta}] = \beta + E[(X^T X)^{-1} X^T u] = \beta + 0, \qquad \text{bias} = E[\hat{\beta}] - \beta = 0,$$

since $X$ and $u$ are uncorrelated and $E[u] = 0$. Now we know that the sampling distribution is centered around the true parameter value (i.e., the estimator is unbiased). What can we say about the dispersion? The variance is easy to analyze under the additional assumption that $\mathrm{Var}(u \mid x_1, x_2, \ldots, x_k) = \sigma^2$ (homoskedasticity), which implies $\mathrm{Var}(Y \mid x) = \sigma^2$.
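A small Monte Carlo sketch (my illustration, not from the slides) of the unbiasedness result: holding $X$ fixed and drawing errors with $E[u] = 0$, the average of $\hat{\beta}$ across replications is close to $\beta$. The DGP, seed, and sample sizes are arbitrary choices.

```python
# Monte Carlo illustration that E[beta_hat] = beta when E[u | X] = 0.
import numpy as np

rng = np.random.default_rng(1)
n, reps = 100, 5000
beta = np.array([1.0, 2.0])
X = np.column_stack([np.ones(n), rng.normal(size=n)])  # X held fixed across draws

draws = np.empty((reps, 2))
for r in range(reps):
    u = rng.normal(size=n)          # errors with E[u] = 0, independent of X
    y = X @ beta + u
    draws[r] = np.linalg.solve(X.T @ X, X.T @ y)

print(draws.mean(axis=0))           # close to (1.0, 2.0): unbiased
```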

Variance of the OLS Estimator

Under homoskedasticity it is rather easy to calculate the variance of $\hat{\beta}$ (conditioning on $X$, and taking the expected value with respect to $u$). If we define $I$ as the identity matrix, we have

$$\mathrm{Var}[\hat{\beta}] = \mathrm{Var}\big[(X^T X)^{-1} X^T u\big] = (X^T X)^{-1} X^T \mathrm{Var}[u]\, X (X^T X)^{-1} = (X^T X)^{-1} X^T \sigma^2 I\, X (X^T X)^{-1} = \sigma^2 (X^T X)^{-1}.$$

Note the relationship between $(X^T X)^{-1}$ and the inverse of $n\,\mathrm{Cov}[(x_1, x_2, \ldots, x_k)]$: as $n \to \infty$, $(X^T X)^{-1}$ shrinks to zero. Hence $\mathrm{bias}^2(\hat{\beta}) + \mathrm{var}(\hat{\beta}) \to 0$ as $n \to \infty$: convergence!

The four assumptions that we needed to show unbiasedness, plus the homoskedasticity assumption, are known as the Gauss-Markov assumptions.

Remark and notation: if $Z$ is a vector, $\mathrm{Var}(Z) = \mathrm{Cov}(Z)$ is its variance-covariance matrix.

Variance of the OLS Estimator: Components

Taking into account the Gauss-Markov assumptions,

$$\mathrm{Var}(\hat{\beta}_j) = \frac{\sigma^2}{TSS_j\,(1 - R_j^2)} \qquad (5)$$

where $TSS_j = \sum_{i=1}^{n} (x_{ij} - \bar{x}_j)^2$ and $R_j^2$ is the $R^2$ from the regression of $x_j$ on all the other regressors. The components of the OLS variance:

- Error variance: a larger $\sigma^2$ implies a larger variance for the OLS estimators.
- Total sampling variation: the larger $TSS_j$ is, the smaller the variance of the OLS estimators.
- Linear relationships among explanatory variables: a larger $R_j^2$ implies a larger variance for the estimators.

Summary so far. Assumptions: additive linear model, $n \geq k+1$, random sample, $E[u \mid x] = E[u] = 0$, absence of multicollinearity, homoskedasticity. Estimator: $\hat{\beta} = (X^T X)^{-1} X^T Y$, with $E[\hat{\beta}] = \beta$ and $\mathrm{Var}[\hat{\beta}] = \sigma^2 (X^T X)^{-1}$.

The Gauss-Markov Theorem: Efficiency

We have shown the convergence of the OLS estimator for the multivariate linear regression model. Taking into account the Gauss-Markov assumptions, we can show that the OLS estimator is BLUE: the Best (smallest variance / mean squared error) Linear Unbiased Estimator. That is, it is efficient (the most efficient one). Therefore, if the Gauss-Markov assumptions are fulfilled, use the OLS estimator. Remark: under the Gauss-Markov assumptions the OLS, MM, and ML (Gaussian errors) estimators coincide. Indeed, the other estimators can also be convergent.
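The component formula (5) can be checked against the matrix formula $\sigma^2(X^TX)^{-1}$. A sketch under an assumed homoskedastic design with known $\sigma^2 = 1$; the correlated-regressor setup is an arbitrary illustration.

```python
# Check that Var(beta_hat_j) = sigma^2 / (TSS_j * (1 - R_j^2)) matches the
# j-th diagonal element of sigma^2 * (X'X)^{-1} (here j = coefficient on x1).
import numpy as np

rng = np.random.default_rng(2)
n = 500
x1 = rng.normal(size=n)
x2 = 0.6 * x1 + rng.normal(size=n)          # deliberately correlated regressors
X = np.column_stack([np.ones(n), x1, x2])
sigma2 = 1.0                                 # assumed known error variance

var_matrix = sigma2 * np.linalg.inv(X.T @ X)

# Regress x1 on the remaining columns (constant and x2) to obtain R_1^2:
others = np.column_stack([np.ones(n), x2])
g = np.linalg.solve(others.T @ others, others.T @ x1)
resid = x1 - others @ g
TSS_1 = np.sum((x1 - x1.mean())**2)
R2_1 = 1 - np.sum(resid**2) / TSS_1

print(var_matrix[1, 1], sigma2 / (TSS_1 * (1 - R2_1)))  # the two values agree
```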

Estimation of the Error Variance

We cannot estimate the error variance $\sigma^2$ directly, because we do not observe the errors $u_i$. What we do observe are the residuals $\hat{u}_i$, and we use them to construct an estimator for $\sigma^2$:

$$\hat{\sigma}^2 = \frac{\sum_i \hat{u}_i^2}{n - k - 1} = \frac{RSS}{df}, \qquad \text{hence} \qquad se(\hat{\beta}_j) = \frac{\hat{\sigma}}{\sqrt{TSS_j\,(1 - R_j^2)}},$$

where $df = n - (k+1) = n - k - 1$ (degrees of freedom) is the number of observations minus the number of estimated parameters.

The sources of bias: Omitted variables bias

Let us now deduce the omitted variables bias. True model: $Y = X\beta + Z\alpha + u$. If we instead estimate $Y = X\beta + v$ with $v = Z\alpha + u$, recall that $\hat{\beta} = \beta + (X^T X)^{-1} X^T v$, so

$$\mathrm{plim}(\hat{\beta}) = \beta + E[(X^T X)^{-1} X^T Z]\,\alpha = \beta \quad \text{if } \mathrm{Cov}(X_j, Z_k) = 0 \;\; \forall j, k.$$

Remarks: In this context it is very complicated to obtain the expected sign of the (asymptotic) bias, because we need to know the whole covariance structure. Note also that the inconsistency is not a finite-sample problem: it does not disappear when adding observations.

Too many or too few variables - I

What are the consequences of including variables in our model specification that are not really part of the model? There is no effect on the identification of the parameters, and the OLS (MM, ML) estimator remains unbiased, but the variance changes. Having too many variables can lead to overfitting, possible multicollinearity, a loss of degrees of freedom, etc., all of which reduce the quality of the estimators of interest. Hence a variable (or model) selection step is recommended if we have too many.
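A simulation sketch (my illustration, with an assumed two-regressor DGP) of the omitted variables bias: dropping $z$ biases the coefficient on $x$ exactly when $x$ and $z$ are correlated.

```python
# Omitted-variable bias: the short regression of y on x alone is biased
# when Cov(x, z) != 0 and unbiased when Cov(x, z) = 0.
import numpy as np

rng = np.random.default_rng(3)
n, reps = 200, 2000
beta, alpha = 2.0, 1.5

est_corr, est_unc = [], []
for r in range(reps):
    x = rng.normal(size=n)
    z_corr = 0.8 * x + rng.normal(size=n)   # z correlated with x
    z_unc = rng.normal(size=n)              # z uncorrelated with x
    for z, store in ((z_corr, est_corr), (z_unc, est_unc)):
        y = beta * x + alpha * z + rng.normal(size=n)
        Xs = np.column_stack([np.ones(n), x])        # short regression: z omitted
        store.append(np.linalg.solve(Xs.T @ Xs, Xs.T @ y)[1])

print(np.mean(est_corr))  # approx. 2 + 1.5 * 0.8 = 3.2: biased
print(np.mean(est_unc))   # approx. 2.0: unbiased
```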

Too many or too few variables - II

What are the consequences of excluding from our specification a variable that belongs to the model? The OLS estimator is biased if the omitted variable is important and correlated with the included variables. Finding the right mix is an art! This is a crucial problem in econometrics: the correct identification of a marginal effect, taking into account all the problems with the variables.

Multivariate Regression Analysis: Inference

Recap: Estimation Algorithm

Assume that $Y = X\beta + u$, with $\beta \in \mathbb{R}^{k+1}$. The OLS/MM/ML(Gaussian) estimators are all $\hat{\beta} = (X^T X)^{-1} X^T Y$. The residuals are $\hat{u} = Y - \hat{Y} = Y - X\hat{\beta}$. The OLS estimator of $\sigma_u^2$ is $\frac{1}{n-k-1}\hat{u}^T\hat{u}$, and therefore $\widehat{\mathrm{Var}}(\hat{\beta}) = \hat{\sigma}_u^2 (X^T X)^{-1}$, where the diagonal elements are the variances and the off-diagonal elements are the covariances:

$$\begin{pmatrix} \mathrm{Var}(\hat{\beta}_0) & \mathrm{Cov}(\hat{\beta}_0, \hat{\beta}_1) & \mathrm{Cov}(\hat{\beta}_0, \hat{\beta}_2) & \cdots & \mathrm{Cov}(\hat{\beta}_0, \hat{\beta}_k) \\ \mathrm{Cov}(\hat{\beta}_1, \hat{\beta}_0) & \mathrm{Var}(\hat{\beta}_1) & \mathrm{Cov}(\hat{\beta}_1, \hat{\beta}_2) & \cdots & \mathrm{Cov}(\hat{\beta}_1, \hat{\beta}_k) \\ \vdots & \vdots & \vdots & & \vdots \\ \mathrm{Cov}(\hat{\beta}_k, \hat{\beta}_0) & \mathrm{Cov}(\hat{\beta}_k, \hat{\beta}_1) & \mathrm{Cov}(\hat{\beta}_k, \hat{\beta}_2) & \cdots & \mathrm{Var}(\hat{\beta}_k) \end{pmatrix}$$

Normal Distribution (illustration)

(Figure slide: the regression line with one explanatory variable and a normal, homoskedastic conditional distribution of $Y$ given $X$.)
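The recap translates directly into a few lines of linear algebra. A minimal sketch on simulated data (assumed DGP; all names are illustrative): estimate $\hat{\beta}$, $\hat{\sigma}_u^2$, and the full covariance matrix.

```python
# Recap in code: beta_hat, sigma2_hat = u'u / (n - k - 1), and the estimated
# covariance matrix sigma2_hat * (X'X)^{-1}; standard errors on its diagonal.
import numpy as np

rng = np.random.default_rng(4)
n, k = 300, 2
X = np.column_stack([np.ones(n), rng.normal(size=(n, k))])
y = X @ np.array([1.0, 2.0, -0.5]) + rng.normal(size=n)

XtX_inv = np.linalg.inv(X.T @ X)
beta_hat = XtX_inv @ X.T @ y
u_hat = y - X @ beta_hat
sigma2_hat = (u_hat @ u_hat) / (n - k - 1)   # unbiased estimator of sigma_u^2

V = sigma2_hat * XtX_inv                     # estimated Var(beta_hat)
se = np.sqrt(np.diag(V))                     # standard errors
print(beta_hat, se)
```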

The t-test: Normality of $\hat{\beta}_j$ (revision)

Under some general assumptions $\hat{\beta}_j$ follows a normal distribution, either because it is a linear combination of the errors, which are all normal (we have conditioned on the explanatory variables $X$ and can therefore disregard their distributions), or because a Central Limit Theorem applies (for large sample sizes, under certain regularity conditions). We then obtain the (asymptotic) approximation

$$\frac{\hat{\beta}_j - \beta_j}{sd(\hat{\beta}_j)} \;\overset{\text{asympt.}}{\sim}\; N(0,1) \qquad (7)$$

and we work with $\hat{\beta}_j \sim N(\beta_j, \mathrm{Var}(\hat{\beta}_j))$. Recall the distinction between $sd$ (computed with $\sigma^2$) and $se$ (computed with $\hat{\sigma}^2$), and that

$$\hat{\sigma}^2 = \frac{\sum_i \hat{u}_i^2}{n-k-1} \sim \frac{\sigma^2}{n-k-1}\,\chi^2_{n-k-1}.$$

The t-test

Due to equation (7) and the distribution of $\hat{\sigma}^2$ we have

$$\frac{\hat{\beta}_j - \beta_j}{se(\hat{\beta}_j)} = \frac{\hat{\beta}_j - \beta_j}{sd(\hat{\beta}_j)} \cdot \sqrt{\frac{\sigma^2}{\hat{\sigma}^2}} = \frac{N(0,1)}{\sqrt{\frac{1}{n-k-1}\chi^2_{n-k-1}}} = t_{n-k-1}.$$

Hence: knowing the standardized sampling distribution of the estimator enables us to carry out hypothesis tests. Example: test $H_0: \beta_j = 0$. If we accept $H_0$, we accept that $x_j$ has no marginal effect on $y$. To carry out the test we need a statistic, for example

$$t_{\hat{\beta}_j} := \frac{\hat{\beta}_j - \beta_j^0}{se(\hat{\beta}_j)} = \frac{\hat{\beta}_j - 0}{se(\hat{\beta}_j)} \;\overset{H_0}{\sim}\; t_{n-k-1}.$$

We are going to use this t-statistic.

Other hypotheses and the confidence bands: Idea, interpretation, and construction

Another way to implement classical inference on a coefficient is to set up a confidence interval. Keep in mind the correct interpretation of confidence intervals (see a statistics course): we can say that there is a $(1-\alpha) \cdot 100\%$ chance that the interval contains the true value $\beta_j$ only if we refer to a random interval (or to $\beta_j$ as a random coefficient, the Bayesian interpretation). Using the same critical value $c$ that was used for the two-sided test, a $(1-\alpha) \cdot 100\%$ confidence interval is defined as

$$\big[\, \hat{\beta}_j - c \cdot se(\hat{\beta}_j)\; ;\; \hat{\beta}_j + c \cdot se(\hat{\beta}_j) \,\big]$$

where $c$ is the $1 - \frac{\alpha}{2}$ percentile of the $t_{n-k-1}$ distribution.
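A self-contained sketch (simulated data with an assumed DGP; scipy is used only for t-distribution quantiles) of the t-statistic for $H_0: \beta_j = 0$ and the $(1-\alpha)$ confidence interval.

```python
# t-statistics, two-sided p-values, and 95% confidence intervals for OLS.
import numpy as np
from scipy import stats

rng = np.random.default_rng(5)
n, k = 300, 2
X = np.column_stack([np.ones(n), rng.normal(size=(n, k))])
y = X @ np.array([1.0, 2.0, -0.5]) + rng.normal(size=n)

XtX_inv = np.linalg.inv(X.T @ X)
beta_hat = XtX_inv @ X.T @ y
u_hat = y - X @ beta_hat
df = n - k - 1
se = np.sqrt(np.diag((u_hat @ u_hat) / df * XtX_inv))

t_stats = beta_hat / se                       # test H0: beta_j = 0
p_vals = 2 * stats.t.sf(np.abs(t_stats), df)  # two-sided p-values
c = stats.t.ppf(1 - 0.05 / 2, df)             # critical value for alpha = 0.05
ci = np.column_stack([beta_hat - c * se, beta_hat + c * se])
print(t_stats, p_vals, ci, sep="\n")
```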

Confidence intervals: Precautionary note

What is the confidence interval that we have set up?