
Lecture 3: Multiple Regression

R.G. Pierse

1 The General Linear Model

Suppose that we have k explanatory variables:

    Y_i = β_1 + β_2 X_{2i} + β_3 X_{3i} + ... + β_k X_{ki} + u_i,   i = 1, 2, ..., n    (1.1)

or

    Y_i = Σ_{j=1}^{k} β_j X_{ji} + u_i,   i = 1, 2, ..., n

where X_{1i} = 1, i = 1, 2, ..., n, is an intercept.

1.1 Assumptions of the Model

    E(u_i) = 0,   i = 1, 2, ..., n                          (A1)
    E(u_i²) = σ²,   i = 1, 2, ..., n                        (A2)
    E(u_i u_j) = 0,   i, j = 1, 2, ..., n,  j ≠ i           (A3)
    X values are fixed in repeated sampling                 (A4)

We now need to make an additional assumption about the explanatory variables:

    The variables X_j are not perfectly collinear.          (A4')

Perfect collinearity occurs when there are one or more variables X_m such that

    X_{mi} = Σ_{j ≠ m} c_j X_{ji},   i = 1, 2, ..., n

where the c_j are fixed constants. In this case, one of the variables is redundant and should be dropped from the regression. When a set of variables is perfectly collinear, dropping any one of them will solve the problem. An example of a perfectly collinear set of variables would be GDP plus all the components of the GDP identity.
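
As a numerical illustration of this condition (a minimal sketch, not part of the original notes; the data are simulated and the variable names are made up), the following Python snippet shows that adding a regressor that is an exact linear combination of the others makes the matrix of regressors lose full column rank, so the normal equations have no unique solution:

```python
# Illustrative sketch of perfect collinearity: x4 is an exact linear
# combination of x2 and x3 (a GDP-identity-style example), so the regressor
# matrix is rank deficient and the coefficients are not identified.
import numpy as np

rng = np.random.default_rng(0)
n = 50
x2 = rng.normal(size=n)          # e.g. one component of GDP
x3 = rng.normal(size=n)          # e.g. the remaining components
x4 = x2 + x3                     # e.g. GDP itself: redundant by construction

X_ok = np.column_stack([np.ones(n), x2, x3])        # intercept X_{1i} = 1
X_bad = np.column_stack([np.ones(n), x2, x3, x4])   # adds the collinear column

print(np.linalg.matrix_rank(X_ok), X_ok.shape[1])    # 3 3  -> full column rank
print(np.linalg.matrix_rank(X_bad), X_bad.shape[1])  # 3 4  -> rank deficient
# X'X is singular here, so (X'X)^{-1} does not exist; dropping any one of
# x2, x3, x4 restores full rank.
```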

Y_i is a random variable with the following properties:

    E(Y_i) = β_1 + β_2 X_{2i} + β_3 X_{3i} + ... + β_k X_{ki},   i = 1, 2, ..., n

    Var(Y_i) = E(Y_i − E(Y_i))² = E(u_i²) = σ²,   i = 1, 2, ..., n

2 Interpretation of the OLS Coefficients

β_j represents the effect of changing the variable X_j on the dependent variable Y while holding all other variables constant. In mathematical terms this is the partial derivative of Y with respect to X_j:

    β_j = ∂Y/∂X_j

3 The Ordinary Least Squares Estimator

    Y_i = Σ_{j=1}^{k} β̂_j X_{ji} + e_i,   i = 1, 2, ..., n    (3.1)

The OLS estimator is the estimator that minimises the sum of squared residuals s = Σ_{i=1}^{n} e_i²:

    min_{β̂_1, ..., β̂_k}  s = Σ_i e_i² = Σ_i ( Y_i − Σ_{j=1}^{k} β̂_j X_{ji} )²
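
The minimisation itself can be illustrated numerically. The following sketch (not from the original notes; simulated data and illustrative names) feeds s into a general-purpose minimiser and compares the result with a standard least-squares routine; the two agree to numerical precision:

```python
# Illustrative sketch: the OLS estimates are the minimisers of the sum of
# squared residuals s, so a generic numerical minimiser recovers the same
# coefficients as a closed-form least-squares solver.
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(1)
n, k = 200, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, k - 1))])  # intercept + 2 regressors
beta_true = np.array([1.0, 2.0, -0.5])
y = X @ beta_true + rng.normal(size=n)

def ssr(b):
    e = y - X @ b                 # residuals for candidate coefficients b
    return e @ e                  # s = sum of squared residuals

b_numeric = minimize(ssr, x0=np.zeros(k)).x
b_closed = np.linalg.lstsq(X, y, rcond=None)[0]
print(b_numeric)
print(b_closed)                   # the two should agree to numerical precision
```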

Differentiating the sum of squared residuals with respect to the k parameters β̂_j gives k first order conditions:

    ∂s/∂β̂_1 = −2 Σ_i (Y_i − β̂_1 − β̂_2 X_{2i} − ... − β̂_k X_{ki}) = 0          (3.2)

    ∂s/∂β̂_2 = −2 Σ_i (Y_i − β̂_1 − β̂_2 X_{2i} − ... − β̂_k X_{ki}) X_{2i} = 0    (3.3)

and so on up to

    ∂s/∂β̂_k = −2 Σ_i (Y_i − β̂_1 − β̂_2 X_{2i} − ... − β̂_k X_{ki}) X_{ki} = 0    (3.4)

Note that the first order conditions can be rewritten as

    Σ_i e_i = 0   and   Σ_i X_{ji} e_i = 0,   j = 2, ..., k    (3.5)

Rearranging the equations gives the k normal equations in the k unknowns β̂_j:

    Σ_i Y_i = n β̂_1 + β̂_2 Σ_i X_{2i} + ... + β̂_k Σ_i X_{ki}

    Σ_i Y_i X_{2i} = β̂_1 Σ_i X_{2i} + β̂_2 Σ_i X_{2i}² + ... + β̂_k Σ_i X_{2i} X_{ki}

and so on up to

    Σ_i Y_i X_{ki} = β̂_1 Σ_i X_{ki} + β̂_2 Σ_i X_{2i} X_{ki} + ... + β̂_k Σ_i X_{ki}².

Solving these k equations defines the OLS estimators. Note that from the first normal equation

    β̂_1 = (1/n) Σ_i Y_i − Σ_{j=2}^{k} β̂_j (1/n) Σ_i X_{ji} = Ȳ − Σ_{j=2}^{k} β̂_j X̄_j    (3.6)

where Ȳ = (1/n) Σ_i Y_i and X̄_j = (1/n) Σ_i X_{ji} are the sample means of Y and X_j respectively.

3.1 Special case (k = 3)

In the general case, the formulae for the estimators β̂_j are messy in summation notation. Elegant expressions require matrix notation. However, for the case k = 3 we have

    β̂_2 = [ Σ_i y_i x_{2i} Σ_i x_{3i}² − Σ_i y_i x_{3i} Σ_i x_{2i} x_{3i} ] / [ Σ_i x_{2i}² Σ_i x_{3i}² − (Σ_i x_{2i} x_{3i})² ]

and

    β̂_3 = [ Σ_i y_i x_{3i} Σ_i x_{2i}² − Σ_i y_i x_{2i} Σ_i x_{2i} x_{3i} ] / [ Σ_i x_{2i}² Σ_i x_{3i}² − (Σ_i x_{2i} x_{3i})² ]

where, as before, lower case letters denote deviations from sample means.
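
The k = 3 formulae can be checked directly. A minimal sketch, assuming simulated data (none of the numbers come from the lecture):

```python
# Illustrative sketch: the k = 3 formulae for beta_2 and beta_3 in deviations
# from sample means, checked against a standard least-squares routine.
import numpy as np

rng = np.random.default_rng(2)
n = 100
X2 = rng.normal(size=n)
X3 = 0.5 * X2 + rng.normal(size=n)          # correlated with X2, but not perfectly
Y = 1.0 + 2.0 * X2 - 1.5 * X3 + rng.normal(size=n)

y, x2, x3 = Y - Y.mean(), X2 - X2.mean(), X3 - X3.mean()   # lower case = deviations
D = (x2 @ x2) * (x3 @ x3) - (x2 @ x3) ** 2                 # common denominator
b2 = ((y @ x2) * (x3 @ x3) - (y @ x3) * (x2 @ x3)) / D
b3 = ((y @ x3) * (x2 @ x2) - (y @ x2) * (x2 @ x3)) / D
b1 = Y.mean() - b2 * X2.mean() - b3 * X3.mean()            # from equation (3.6)

X = np.column_stack([np.ones(n), X2, X3])
print(b1, b2, b3)
print(np.linalg.lstsq(X, Y, rcond=None)[0])                # should match
```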

4 Properties of the OLS Estimator

The OLS estimator in the general model has the same properties as in the simple regression model. Proofs will not be given here.

4.1 The OLS Estimator β̂_j is Unbiased

It can be shown that

    E(β̂_j) = β_j

4.2 OLS is BLUE

It can be shown that the OLS estimators β̂_j are Best Linear Unbiased Estimators (BLUE) in the general regression model.

4.3 The Variance of OLS Estimators

For the general case, again the expressions are messy in summation notation, and elegant expressions require matrix notation. However, for the k = 3 case we have

    Var(β̂_1) = σ² [ 1/n + ( X̄_2² Σ_i x_{3i}² + X̄_3² Σ_i x_{2i}² − 2 X̄_2 X̄_3 Σ_i x_{2i} x_{3i} ) / ( Σ_i x_{2i}² Σ_i x_{3i}² − (Σ_i x_{2i} x_{3i})² ) ]

    Var(β̂_2) = σ² Σ_i x_{3i}² / ( Σ_i x_{2i}² Σ_i x_{3i}² − (Σ_i x_{2i} x_{3i})² )

and

    Var(β̂_3) = σ² Σ_i x_{2i}² / ( Σ_i x_{2i}² Σ_i x_{3i}² − (Σ_i x_{2i} x_{3i})² ).

We now also have an expression for the covariance between β̂_2 and β̂_3:

    Cov(β̂_2, β̂_3) = −σ² Σ_i x_{2i} x_{3i} / ( Σ_i x_{2i}² Σ_i x_{3i}² − (Σ_i x_{2i} x_{3i})² ).

4.4 The OLS estimator of σ²

It can be shown that the estimator

    σ̂² = Σ_{i=1}^{n} e_i² / (n − k)    (4.1)

is an unbiased estimator of σ².
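
Assumption (A4), fixed regressors in repeated sampling, and the unbiasedness results can be illustrated by a small Monte Carlo in which X is held fixed and only the errors are redrawn. A hedged sketch with simulated data (all values illustrative):

```python
# Illustrative Monte Carlo: X is fixed across replications and only the errors
# are redrawn.  Averaging over many samples illustrates E(beta_hat_j) = beta_j
# and E(sigma_hat^2) = sigma^2.
import numpy as np

rng = np.random.default_rng(3)
n, k, sigma = 60, 3, 1.5
X = np.column_stack([np.ones(n), rng.normal(size=(n, k - 1))])  # fixed regressors
beta = np.array([1.0, 2.0, -0.5])

betas, s2s = [], []
for _ in range(5000):
    u = rng.normal(scale=sigma, size=n)          # a new draw of the errors only
    y = X @ beta + u
    b = np.linalg.lstsq(X, y, rcond=None)[0]
    e = y - X @ b
    betas.append(b)
    s2s.append(e @ e / (n - k))                  # unbiased estimator (4.1)

print(np.mean(betas, axis=0))    # close to [1.0, 2.0, -0.5]
print(np.mean(s2s), sigma ** 2)  # close to 2.25
```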

5 Matrix Representation

Writing out all n observations in the model as a set of equations,

    Y_1 = β_1 X_{11} + β_2 X_{21} + β_3 X_{31} + ... + β_k X_{k1} + u_1
    Y_2 = β_1 X_{12} + β_2 X_{22} + β_3 X_{32} + ... + β_k X_{k2} + u_2
      ⋮
    Y_n = β_1 X_{1n} + β_2 X_{2n} + β_3 X_{3n} + ... + β_k X_{kn} + u_n

this can be rewritten in matrix notation as

    [ Y_1 ]   [ X_11  X_21  ...  X_k1 ] [ β_1 ]   [ u_1 ]
    [ Y_2 ] = [ X_12  X_22  ...  X_k2 ] [ β_2 ] + [ u_2 ]
    [  ⋮  ]   [   ⋮     ⋮           ⋮ ] [  ⋮  ]   [  ⋮  ]
    [ Y_n ]   [ X_1n  X_2n  ...  X_kn ] [ β_k ]   [ u_n ]

or

    y = Xβ + u

where y and u are n × 1 column vectors, β is a k × 1 column vector and X is an n × k matrix. In matrix notation the OLS estimator β̂ is given by the expression

    β̂ = (X'X)⁻¹ X'y
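
A minimal sketch of this formula with simulated data (not from the notes); np.linalg.solve is used instead of forming the inverse explicitly, which evaluates the same expression more stably:

```python
# Illustrative sketch of the matrix formula beta_hat = (X'X)^{-1} X'y.
import numpy as np

rng = np.random.default_rng(4)
n, k = 100, 4
X = np.column_stack([np.ones(n), rng.normal(size=(n, k - 1))])  # n x k, first column = intercept
beta = np.array([1.0, 0.5, -2.0, 3.0])                          # k x 1
y = X @ beta + rng.normal(size=n)                               # n x 1

beta_hat = np.linalg.solve(X.T @ X, X.T @ y)   # solves (X'X) b = X'y
print(beta_hat)                                # close to the true beta
```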

6 Goodness of Fit and R²

Recall that

    Y_i = Σ_{j=1}^{k} β̂_j X_{ji} + e_i = Ŷ_i + e_i

or, taking mean deviations,

    y_i = Σ_{j=2}^{k} β̂_j x_{ji} + e_i = ŷ_i + e_i.

Squaring and summing over all observations gives

    Σ_i y_i² = Σ_i ŷ_i² + Σ_i e_i² + 2 Σ_i ŷ_i e_i = Σ_i ŷ_i² + Σ_i e_i²

since

    Σ_i ŷ_i e_i = Σ_{j=2}^{k} β̂_j Σ_i x_{ji} e_i

and Σ_i x_{ji} e_i = 0 for all j, from (3.5). Defining Σ_i y_i² = Total Sum of Squares (TSS), Σ_i ŷ_i² = Explained Sum of Squares (ESS), and Σ_i e_i² = Residual Sum of Squares (RSS), we have the identity

    TSS = ESS + RSS

and define the Multiple Coefficient of Determination, R², as

    R² = ESS/TSS = Σ_i ŷ_i² / Σ_i y_i²

or, alternatively,

    R² = 1 − RSS/TSS = 1 − Σ_i e_i² / Σ_i y_i².

R² measures the proportion of the total variation of Y_i about its mean that is explained by all the regressors jointly. This is the goodness-of-fit of the regression.

6.1 Properties of R²

As long as an intercept is included in the set of regressors then 0 ≤ R² ≤ 1. (If an intercept is not included then it is possible to get a negative value for R².)

R² will increase as k increases, i.e. as more regressors are added. In the limit, when k = n, R² = 1. This suggests that choosing the equation with the highest R² will not always be a sensible strategy.

The value of R² depends on the scale of the dependent variable. R² measures cannot be compared between equations with different dependent variables. Example: Consider the equation

    Y_t = a + b Y_{t−1} + u_t.

Subtracting Y_{t−1} from both sides gives

    ∆Y_t = Y_t − Y_{t−1} = a + (b − 1) Y_{t−1} + u_t

but does not affect the error term or the coefficients (b − 1 is just a reparameterisation of b). This is merely a renormalisation of the equation, and the RSS of the two will be the same. However, the R² statistic will give very different values, purely because of the different scales of the dependent variables.
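
The decomposition and the two definitions of R² can be verified on simulated data. A minimal sketch (illustrative values only):

```python
# Illustrative sketch: TSS, ESS and RSS on simulated data, checking the
# identity TSS = ESS + RSS and the two equivalent R^2 formulae (the regression
# includes an intercept).
import numpy as np

rng = np.random.default_rng(5)
n = 80
X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])
y = X @ np.array([1.0, 2.0, -1.0]) + rng.normal(size=n)

beta_hat = np.linalg.solve(X.T @ X, X.T @ y)
y_hat = X @ beta_hat
e = y - y_hat

TSS = np.sum((y - y.mean()) ** 2)
ESS = np.sum((y_hat - y.mean()) ** 2)
RSS = np.sum(e ** 2)

print(np.isclose(TSS, ESS + RSS))      # True: the decomposition holds
print(ESS / TSS, 1 - RSS / TSS)        # the two R^2 formulae agree
```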

6.2 Correcting for degrees of freedom: R̄²

    R̄² = 1 − (1 − R²) (n − 1)/(n − k)

R̄² makes a correction to R² based on the number of parameters in the model, k. Note that R̄² < R² in general, and that R̄² can be negative.

7 Hypothesis Testing

In order to make statistical inferences on the parameter estimates β̂ we must make a further assumption:

    u_i ~ iid N(0, σ²),   i = 1, 2, ..., n    (A5)

that the errors u_i are distributed as independent normal variables. For hypotheses involving a single parameter β_j,

    (β̂_j − β_j) / √Var(β̂_j)  ~  t_{n−k}

where t_{n−k} is the Student t distribution with n − k degrees of freedom. Note that √Var(β̂_j) is the standard error of the estimator β̂_j, σ̂_{β̂_j}. This is just as in the simple regression case. However, there is now the additional possibility of tests involving more than one parameter, or more than one restriction.
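
A short sketch computing R̄² and the t-ratios from σ̂² (X'X)⁻¹ on simulated data (not part of the original notes; the true value of β_3 is set to zero, so its t-ratio should typically be insignificant):

```python
# Illustrative sketch: adjusted R^2 and t-ratios for H0: beta_j = 0,
# using sigma_hat^2 (X'X)^{-1} as the estimated covariance matrix.
import numpy as np

rng = np.random.default_rng(6)
n, k = 80, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, k - 1))])
y = X @ np.array([1.0, 2.0, 0.0]) + rng.normal(size=n)   # true beta_3 = 0

beta_hat = np.linalg.solve(X.T @ X, X.T @ y)
e = y - X @ beta_hat
RSS = e @ e
TSS = np.sum((y - y.mean()) ** 2)

R2 = 1 - RSS / TSS
R2_bar = 1 - (1 - R2) * (n - 1) / (n - k)                # corrected for degrees of freedom

sigma2_hat = RSS / (n - k)                               # equation (4.1)
var_beta_hat = sigma2_hat * np.linalg.inv(X.T @ X)       # estimated covariance matrix
se = np.sqrt(np.diag(var_beta_hat))                      # standard errors
t_stats = beta_hat / se                                  # t-ratios for H0: beta_j = 0

print(R2, R2_bar)
print(t_stats)   # compare with t_{n-k} critical values, roughly +/-1.99 at 5%
```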

7.1 Testing equality of two coefficients

Without loss of generality, let the null hypothesis be

    H_0: β_2 = β_3

or, alternatively, H_0: β_2 − β_3 = 0. This is a single restriction although it involves two parameters. It can be shown that

    (β̂_2 − β̂_3) / √Var(β̂_2 − β̂_3)  ~  t_{n−k}

where

    Var(β̂_2 − β̂_3) = Var(β̂_2) + Var(β̂_3) − 2 Ĉov(β̂_2, β̂_3)

and the estimated variances and covariances needed to compute this test are available in any computer regression package.

7.2 Jointly testing all the coefficients

One interesting hypothesis is that

    H_0: β_2 = β_3 = ... = β_k = 0

or, alternatively, H_0: β_j = 0, j = 2, ..., k, so that the model collapses to the intercept term

    Y_i = β_1 + u_i,   i = 1, 2, ..., n.

Note that the hypothesis H_0 involves jointly testing k − 1 restrictions. It can be shown that a test of this hypothesis is given by

    [ ESS/(k − 1) ] / [ RSS/(n − k) ] = [ R²/(k − 1) ] / [ (1 − R²)/(n − k) ]  ~  F_{k−1, n−k}

where F_{k−1, n−k} is the F distribution with k − 1 and n − k degrees of freedom. The Student t distribution and the F distribution are related: if x is distributed as t_{n−k} then x² is distributed as F_{1, n−k}.
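
Both of these tests can be computed directly from the quantities above. A minimal sketch with simulated data in which β_2 = β_3 by construction (illustrative only; scipy is used for the p-values):

```python
# Illustrative sketch: (i) the joint test of H0: beta_2 = ... = beta_k = 0
# computed from R^2, and (ii) the t-test for equality of two coefficients
# using Var(b2) + Var(b3) - 2 Cov(b2, b3).
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)
n, k = 100, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, k - 1))])
y = X @ np.array([1.0, 0.8, 0.8]) + rng.normal(size=n)    # beta_2 = beta_3 by construction

beta_hat = np.linalg.solve(X.T @ X, X.T @ y)
e = y - X @ beta_hat
R2 = 1 - (e @ e) / np.sum((y - y.mean()) ** 2)

# Joint significance: F = [R^2/(k-1)] / [(1-R^2)/(n-k)] ~ F_{k-1, n-k} under H0
F = (R2 / (k - 1)) / ((1 - R2) / (n - k))
print(F, stats.f.sf(F, k - 1, n - k))                     # statistic and p-value

# Equality of two coefficients: t = (b2 - b3) / se(b2 - b3) ~ t_{n-k} under H0
V = (e @ e) / (n - k) * np.linalg.inv(X.T @ X)            # estimated covariance matrix
var_diff = V[1, 1] + V[2, 2] - 2 * V[1, 2]
t = (beta_hat[1] - beta_hat[2]) / np.sqrt(var_diff)
print(t, 2 * stats.t.sf(abs(t), n - k))                   # typically not rejected here
```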

7.3 General F Testing

More generally, we may often be interested in jointly testing several linear restrictions. Suppose that there are r restrictions. Then the null hypothesis can be written

    H_0: Σ_{j=1}^{k} a_{ij} β_j − a_{i0} = 0,   i = 1, 2, ..., r

where the a_{ij} are fixed constants. To compute a test of this hypothesis we need to estimate the model both under the null hypothesis (imposing the restrictions) and under the alternative hypothesis (unrestricted). Let RSS_U be the sum of squared residuals for the unrestricted model, and RSS_R be the sum of squared residuals of the restricted model. Then it can be shown that a test of H_0 is given by

    [ (RSS_R − RSS_U)/r ] / [ RSS_U/(n − k) ]  ~  F_{r, n−k}.

7.4 Testing σ²

Finally, as with the simple regression model,

    (n − k) σ̂² / σ²  ~  χ²_{n−k}    (7.1)

where χ²_{n−k} is the Chi-squared distribution with n − k degrees of freedom.

8 Functional Form

We require that the OLS model is linear in parameters. It is not necessary that it is linear in variables. It is often possible to transform variables in order to make an equation linear in parameters.

8.1 Example: Cobb-Douglas Production Function

    Q = A K^γ L^δ

where γ + δ = 1 implies constant returns to scale. Taking logarithms we get

    ln Q = ln A + γ ln K + δ ln L

or

    Y = b_0 + b_1 X_1 + b_2 X_2 + u

where Y = ln Q, X_1 = ln K, X_2 = ln L, b_0 = ln A, b_1 = γ, and b_2 = δ, which is linear in parameters. Constant returns to scale can be tested in this model by the linear hypothesis that b_1 + b_2 = 1. Note that an error term u has been added to the final equation. If we assume that u ~ N(0, σ²) then this implies that the error term on the original equation is e^u and is distributed with a log-normal distribution.
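
A hedged sketch of the restricted/unrestricted F-test, applied to the constant-returns-to-scale restriction b_1 + b_2 = 1 in a simulated Cobb-Douglas example (the data and parameter values are made up). Imposing the restriction gives the regression of ln Q − ln L on ln K − ln L, so r = 1:

```python
# Illustrative sketch: restricted vs. unrestricted F-test for constant returns
# to scale in a simulated Cobb-Douglas model where CRS holds by construction.
import numpy as np
from scipy import stats

rng = np.random.default_rng(8)
n, k, r = 120, 3, 1
lnK = rng.normal(size=n)
lnL = rng.normal(size=n)
lnQ = 0.3 + 0.6 * lnK + 0.4 * lnL + 0.2 * rng.normal(size=n)   # CRS: 0.6 + 0.4 = 1

def rss(X, y):
    b = np.linalg.lstsq(X, y, rcond=None)[0]
    e = y - X @ b
    return e @ e

X_u = np.column_stack([np.ones(n), lnK, lnL])                  # unrestricted model
RSS_U = rss(X_u, lnQ)

X_r = np.column_stack([np.ones(n), lnK - lnL])                 # model with b1 + b2 = 1 imposed
RSS_R = rss(X_r, lnQ - lnL)

F = ((RSS_R - RSS_U) / r) / (RSS_U / (n - k))
print(F, stats.f.sf(F, r, n - k))    # large p-value in most draws: CRS not rejected
```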

Specifications cannot always be transformed to make them linear in parameters. Consider the CES production function

    Q = A [ γ K^ρ + δ L^ρ ]^{µ/ρ}.

This is intrinsically nonlinear in parameters. Similarly, the equation

    Y = β_0 + β_1 X_1 + β_0 β_1 X_2 + u

is linear in variables but nonlinear in parameters.

8.2 Log-linear Specifications

The Cobb-Douglas example is one in which all the variables in the econometric equation are in logarithms. Such specifications are called log-linear or log-log and are very common in economics. Compare a linear and a log-linear specification,

    Y = a + bX + u

and

    ln Y = c + d ln X + v.

In the linear specification b = ∂Y/∂X is the marginal response of Y to a change in X. Note that the effect of doubling X on Y depends on the level of X, because the elasticity is not constant. In the log-linear specification d = ∂ln Y/∂ln X, which is the elasticity of Y with respect to X. Log-linear models impose the assumption that elasticities are constant.

In log-linear models the slope parameters are independent of the scale of measurement of the variables. Suppose we rescale X by multiplying by a factor k, denoting the new variable kX as X*. In the linear model the effect is that the parameter b becomes b* = b/k. In the log-linear model we have ln X* = ln k + ln X, so that the rescaling factor is constant and changes the intercept c but not the elasticity d.

Many economic variables represent money flows or stocks or prices that logically cannot ever be negative. In principle, a linear specification can give a negative value for the dependent variable if u is large and negative enough. In the log-linear specification, the dependent variable can never become negative since Y = e^{ln Y} > 0.
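
The scale-invariance point can be seen in a small simulation (not from the notes; the rescaling factor is written kappa in the code to avoid a clash with k, the number of regressors):

```python
# Illustrative sketch: rescaling the regressor by kappa changes the slope in
# the linear specification (b -> b/kappa) but only the intercept in the
# log-log specification, leaving the elasticity unchanged.
import numpy as np

rng = np.random.default_rng(9)
n, kappa = 200, 1000.0                       # e.g. measuring X in thousands
X = np.exp(rng.normal(size=n))               # strictly positive regressor
Y = 3.0 * X ** 0.7 * np.exp(0.1 * rng.normal(size=n))   # constant elasticity 0.7

def fit(x, y):                               # simple regression with intercept
    A = np.column_stack([np.ones(len(x)), x])
    return np.linalg.lstsq(A, y, rcond=None)[0]

print(fit(X, Y)[1], fit(kappa * X, Y)[1])                    # linear slope scales by 1/kappa
print(fit(np.log(X), np.log(Y)), fit(np.log(kappa * X), np.log(Y)))
# log-log: the slope (the elasticity, about 0.7) is identical in both fits;
# only the intercept shifts, by -0.7 * ln(kappa).
```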

8.3 Linearity as an Approximation

Consider a general function of a single variable, Y = g(X). Weierstrass's theorem states that, under certain very general conditions,

    Y ≈ a_0 + a_1 X + a_2 X² + a_3 X³ + ...

so that the function g(X) can be approximated, to any desired degree of precision, by a polynomial function in X. This polynomial function is linear in parameters so it can be estimated by OLS, treating X, X², X³, etc. as separate regressors. Note that X, X², X³, although they are exact functions of each other, are not linearly dependent, so that there is no perfect collinearity here.
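
A minimal sketch of this idea with simulated data (illustrative only): a cubic in X, estimated by OLS with X, X² and X³ as separate regressors, approximating g(X) = sin(X):

```python
# Illustrative sketch: approximating a nonlinear function by a polynomial that
# is linear in parameters, with the powers of X entered as separate regressors.
import numpy as np

rng = np.random.default_rng(10)
n = 200
X = rng.uniform(-2, 2, size=n)
Y = np.sin(X) + 0.1 * rng.normal(size=n)               # g(X) = sin(X) plus noise

Z = np.column_stack([np.ones(n), X, X ** 2, X ** 3])   # powers of X as regressors
a = np.linalg.lstsq(Z, Y, rcond=None)[0]
print(a)                           # fitted cubic coefficients; odd powers dominate since sin is odd
print(np.linalg.matrix_rank(Z))    # 4: no perfect collinearity among the powers
```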