Topic 4: Model Specifications

Advanced Econometrics (I)
Dong Chen, School of Economics, Peking University

1 Functional Forms

1.1 Redefining Variables

Changing the unit of measurement of a variable changes the coefficient estimates and their standard errors, but it does not alter the significance of the variables or their interpretation. The results of the $t$ test, the $F$ test, and $R^2$ therefore remain unchanged. If a variable enters the model in log form, rescaling it changes only the intercept (since $\ln(cx) = \ln c + \ln x$), while the slopes remain unchanged. For example:

$$y = \beta_0 + \beta_1 x_1 + \varepsilon, \quad (1)$$
$$y = \beta_0 + \beta_1 \ln x_1 + \varepsilon, \quad (2)$$
$$\ln y = \beta_0 + \beta_1 x_1 + \varepsilon, \quad (3)$$
$$\ln y = \beta_0 + \beta_1 \ln x_1 + \varepsilon. \quad (4)$$
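The unit-invariance result is easy to check numerically. The following is a minimal sketch (assuming the numpy and statsmodels packages are available; the data-generating values and variable names are invented for illustration):

```python
# Rescaling a regressor rescales its coefficient and standard error by the
# same factor, so t statistics and R^2 are unchanged (Section 1.1).
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
n = 200
x = rng.normal(10, 2, n)                  # e.g., income in thousands
y = 1.0 + 0.5 * x + rng.normal(0, 1, n)   # invented DGP

fit1 = sm.OLS(y, sm.add_constant(x)).fit()          # x in thousands
fit2 = sm.OLS(y, sm.add_constant(x * 1000)).fit()   # x in single units

print(fit1.params[1], fit2.params[1] * 1000)  # slope rescales by 1/1000
print(fit1.tvalues[1], fit2.tvalues[1])       # identical t statistics
print(fit1.rsquared, fit2.rsquared)           # identical R^2
```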

1.2 β Coefficients

Sometimes, when the meaning of a marginal change in the original variable is unclear, we may report the so-called β coefficients rather than the regular coefficient estimates. To do this, replace the dependent and independent variables by their standardized values:

$$z_y = \frac{y - \bar{y}}{\sigma_y}, \quad (5)$$
$$z_j = \frac{x_j - \bar{x}_j}{\sigma_j}. \quad (6)$$

The coefficients are then interpreted as the change in the dependent variable ($y$) associated with a change in the independent variables ($x$), both measured in standard deviations. This eliminates the impact of the unit of measurement, while the economic and statistical significance remain unchanged.

Consider the following example:
$$y_i = b_0 + b_1 x_{i1} + \ldots + b_k x_{ik} + e_i. \quad (7)$$
Taking the sample average, we have
$$\bar{y} = b_0 + b_1 \bar{x}_1 + \ldots + b_k \bar{x}_k. \quad (8)$$
Taking the difference,
$$y_i - \bar{y} = b_1 (x_{i1} - \bar{x}_1) + \ldots + b_k (x_{ik} - \bar{x}_k) + e_i. \quad (9)$$
Dividing both sides by $\sigma_y$, we have
$$\frac{y_i - \bar{y}}{\sigma_y} = \left(\frac{b_1 \sigma_1}{\sigma_y}\right)\left(\frac{x_{i1} - \bar{x}_1}{\sigma_1}\right) + \ldots + \left(\frac{b_k \sigma_k}{\sigma_y}\right)\left(\frac{x_{ik} - \bar{x}_k}{\sigma_k}\right) + \frac{e_i}{\sigma_y}. \quad (10)$$
In the above model,
$$\hat{\beta}_j = \frac{b_j \sigma_j}{\sigma_y}, \quad j = 1, 2, \ldots, k \quad (11)$$
is called the β coefficient. The interpretation is that when $x_j$ changes by one standard deviation, $y$ changes by $\hat{\beta}_j$ of its own standard deviation.
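The identity in (11) can be verified directly: regressing the standardized variables on each other reproduces $b_j \sigma_j / \sigma_y$. A minimal numpy-only sketch (the DGP and sample size are invented):

```python
# Verifying the beta-coefficient identity (11): standardized-regression
# slopes equal b_j * sigma_j / sigma_y from the raw regression.
import numpy as np

rng = np.random.default_rng(1)
n = 500
X = rng.normal(size=(n, 2))
y = 2.0 + 1.5 * X[:, 0] - 0.8 * X[:, 1] + rng.normal(size=n)

def ols(y, X):
    """OLS coefficients with an intercept prepended."""
    Xc = np.column_stack([np.ones(len(y)), X])
    return np.linalg.lstsq(Xc, y, rcond=None)[0]

b = ols(y, X)[1:]                              # raw slopes b_1, b_2
beta_formula = b * X.std(axis=0) / y.std()     # b_j * sigma_j / sigma_y

# Same ddof on both sides, so the identity is exact in-sample.
z_y = (y - y.mean()) / y.std()
z_X = (X - X.mean(axis=0)) / X.std(axis=0)
beta_direct = ols(z_y, z_X)[1:]                # slopes on standardized data

print(np.allclose(beta_formula, beta_direct))  # True
```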

1.3 Natural Logarithm

Sometimes it is useful to express variables in natural logarithms. There are several possible reasons for this:

1. In a log-linear model, since changes in the variables are measured in percentage terms, the estimates are independent of the units of measurement of the variables.

2. The coefficients can be interpreted as elasticities.

3. When $y > 0$, taking natural logs can reduce heteroskedasticity and skewness.

4. The values of $\ln y$ are more concentrated than those of $y$, which reduces the impact of extreme values on the estimates.

Usually natural logs are used for variables that take large positive values, such as population, income, and GDP. Variables such as percentages or proportions should not be expressed in natural logs. The following are some examples.

In the model
$$\ln y = \beta_0 + \beta_1 \ln x + \varepsilon, \quad (12)$$
$\beta_1$ is interpreted as the elasticity of $y$ with respect to $x$. This model is also called a log-log (or double-log) model.

In the model
$$\ln y = \beta_0 + \beta_1 x + \varepsilon, \quad (13)$$
$100\beta_1$ is interpreted as the approximate percentage change in $y$ associated with a one-unit change in $x$.

In the model
$$y = \beta_0 + \beta_1 \ln x + \varepsilon, \quad (14)$$
$\beta_1/100$ is interpreted as the change in $y$ associated with a one-percent change in $x$.

1.4 Quadratic Terms

Consider the model
$$y = \beta_0 + \beta_1 x + \beta_2 x^2 + \varepsilon. \quad (15)$$
Now
$$\Delta y \approx (\beta_1 + 2\beta_2 x)\,\Delta x. \quad (16)$$
Suppose $\beta_1 > 0$ and $\beta_2 < 0$. In this case, $y$ increases with $x$ at first, but eventually it decreases with $x$. If $\beta_1 < 0$ and $\beta_2 > 0$, then $y$ decreases with $x$ at first, but eventually it increases with $x$. The turning point is where $dy/dx = 0$, which yields
$$x^* = -\frac{\beta_1}{2\beta_2}. \quad (17)$$

1.5 Models with Interaction Terms

Consider the model
$$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_1 x_2 + \varepsilon. \quad (18)$$
When interpreting $\beta_1$, we cannot consider the change in $x_1$ in isolation; we must also account for $\beta_3$, because
$$\frac{\partial y}{\partial x_1} = \beta_1 + \beta_3 x_2. \quad (19)$$
Therefore, when we consider the impact of a change in $x_1$ on $y$, we need to take the level of $x_2$ into account as well. Usually we evaluate $x_2$ at its sample mean, $\bar{x}_2$.

Example 1: Consider the following estimated model, where $n = 680$, $R^2 = 0.229$, $\bar{R}^2 = 0.222$ (standard errors in parentheses):
$$\widehat{stndfnl} = \underset{(1.36)}{2.05} - \underset{(0.0102)}{0.0067}\,atndrte - \underset{(0.48)}{1.63}\,priGPA - \underset{(0.098)}{0.128}\,ACT + \underset{(0.101)}{0.296}\,priGPA^2 + \underset{(0.0022)}{0.0045}\,ACT^2 + \underset{(0.0043)}{0.0056}\,priGPA \cdot atndrte.$$
If we look only at the coefficient on $atndrte$ ($-0.0067$), we find that it has a negative impact on $stndfnl$, although not a statistically significant one ($t = -0.66$). However, this coefficient alone cannot reveal the relationship between $atndrte$ and $stndfnl$, because it holds only when $priGPA = 0$, which is not meaningful for the problem at hand. If we evaluate $priGPA$ at its sample mean, 2.59, the partial effect is $-0.0067 + 0.0056 \times 2.59 \approx 0.0078$.
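To make the last two subsections concrete, here is a small simulation (a numpy-only sketch; all coefficients and ranges are invented) that recovers a quadratic's turning point and evaluates an interaction model's partial effect at the sample mean, mirroring the $priGPA$ calculation in Example 1:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 1000

# Quadratic (Section 1.4): y first rises, then falls.
# True turning point: x* = -beta1/(2*beta2) = -2.0/(2*(-0.25)) = 4.
x = rng.uniform(0, 10, n)
y = 1 + 2.0 * x - 0.25 * x**2 + rng.normal(size=n)
Xq = np.column_stack([np.ones(n), x, x**2])
b = np.linalg.lstsq(Xq, y, rcond=None)[0]
print(-b[1] / (2 * b[2]))                 # estimated turning point, near 4

# Interaction (Section 1.5): the effect of x1 depends on the level of x2.
x1, x2 = rng.uniform(0, 5, n), rng.uniform(0, 4, n)
y2 = 1 + 0.5 * x1 + 1.0 * x2 + 0.3 * x1 * x2 + rng.normal(size=n)
Xi = np.column_stack([np.ones(n), x1, x2, x1 * x2])
g = np.linalg.lstsq(Xi, y2, rcond=None)[0]
print(g[1] + g[3] * x2.mean())            # dy/dx1 at mean x2, near 1.1
```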

2 Omission of Relevant Variables

So far we have assumed that the correct model specification is
$$y = X\beta + \varepsilon. \quad (20)$$

There are many types of errors one might make in specifying the regressor set, of which two are very common: (1) excluding relevant variables; and (2) including irrelevant variables. What happens to the properties of the least squares estimator in the face of these mistakes?

Suppose the true data generating process (DGP) is
$$\underset{n\times 1}{y} = \underset{n\times K_1}{X_1}\,\underset{K_1\times 1}{\beta_1} + \underset{n\times K_2}{X_2}\,\underset{K_2\times 1}{\beta_2} + \underset{n\times 1}{\varepsilon}, \quad (21)$$
where $\varepsilon \sim N(0, \sigma^2 I)$, but we fitted the model
$$y = X_1\beta_1 + u, \quad (22)$$
where $u = X_2\beta_2 + \varepsilon$. That is, we wrongly exclude the regressors in $X_2$. Normally we do not know that we have done this and will apply OLS to (22). Therefore, we would assume that $u \sim N(0, \sigma^2 I)$. However, this is incorrect. In fact, $u = X_2\beta_2 + \varepsilon$, and thus
$$u \sim N(X_2\beta_2,\ \sigma^2 I). \quad (23)$$
That is, the error term has a non-zero mean because of the omission of relevant regressors.

The least squares estimator of $\beta_1$ from (22) is
$$b_1 = (X_1'X_1)^{-1}X_1'y = (X_1'X_1)^{-1}X_1'(X_1\beta_1 + X_2\beta_2 + \varepsilon) = \beta_1 + (X_1'X_1)^{-1}X_1'X_2\beta_2 + (X_1'X_1)^{-1}X_1'\varepsilon. \quad (24)$$
Therefore,
$$E(b_1) = \beta_1 + (X_1'X_1)^{-1}X_1'X_2\beta_2. \quad (25)$$
So $b_1$ is no longer an unbiased estimator of $\beta_1$ if we have omitted variables. Note that $b_1$ is unbiased only if $X_2\beta_2 = 0$ (i.e., we have not actually omitted relevant regressors) or if $X_1'X_2 = 0$ (i.e., $X_1$ and $X_2$ are orthogonal).

In addition, if we do not realize the omitted-variable problem, we would form
$$\operatorname{Var}(b_1) = \sigma^2 (X_1'X_1)^{-1}. \quad (26)$$
But this is not the appropriate formula if we apply least squares to the true DGP (21). Note that (21) can be written as $y = X\beta + \varepsilon$, where
$$\underset{n\times K}{X} = \begin{bmatrix} X_1 & X_2 \end{bmatrix}, \quad \underset{K\times 1}{\beta} = \begin{bmatrix} \beta_1 \\ \beta_2 \end{bmatrix}, \quad K = K_1 + K_2.$$
With the new notation, we have
$$\operatorname{Var}(b) = \sigma^2 (X'X)^{-1} \quad (27)$$
$$= \begin{bmatrix} \operatorname{Var}(b_1) & \operatorname{Cov}(b_1, b_2) \\ \operatorname{Cov}(b_2, b_1) & \operatorname{Var}(b_2) \end{bmatrix}, \quad (28)$$
where $b = \begin{bmatrix} b_1' & b_2' \end{bmatrix}'$.

Therefore, the coefficients on $X_1$ have a covariance matrix equal to the upper-left block of (28), which is $\operatorname{Var}(b_1)$. It can be shown that
$$\operatorname{Var}(b_1) = \sigma^2 \left(X_1'X_1 - X_1'X_2(X_2'X_2)^{-1}X_2'X_1\right)^{-1} \quad (29)$$
$$= \sigma^2 (X_1'M_2X_1)^{-1} \quad (30)$$
$$\geq \sigma^2 (X_1'X_1)^{-1}, \quad (31)$$
where $M_2 = I - X_2(X_2'X_2)^{-1}X_2'$. The above analysis shows that if we omit relevant regressors, we would use the wrong formula for the variance-covariance matrix of the least squares estimator. (Again, if $X_1$ and $X_2$ are orthogonal, the variance-covariance matrix has the correct form. However, as shown below, the regular estimate of $\sigma^2$ will still be wrong under this condition.)

Furthermore, in the presence of the omitted-variable problem, if we use the usual OLS approach, we would estimate $\sigma^2$ by $s^2 = \frac{e_1'e_1}{n - K_1}$, where $e_1 = y - X_1 b_1$. However, it can be shown that
$$E(e_1'e_1) = \beta_2'X_2'M_1X_2\beta_2 + \sigma^2 (n - K_1), \quad (32)$$
where $M_1 = I - X_1(X_1'X_1)^{-1}X_1'$. Therefore,
$$E(s^2) = E\left(\frac{e_1'e_1}{n - K_1}\right) \quad (33)$$
$$= \sigma^2 + \frac{\beta_2'X_2'M_1X_2\beta_2}{n - K_1} \quad (34)$$
$$\geq \sigma^2. \quad (35)$$
So $s^2$ is a biased estimator of $\sigma^2$ when there are omitted variables. Note that the bias of $s^2$ disappears only if $X_2\beta_2 = 0$. Unlike $b_1$, which would be unbiased if $X_1'X_2 = 0$, $s^2$ is still biased even if $X_1'X_2 = 0$.

In summary, omitting relevant variables leads to a biased estimator of the coefficients, an incorrect variance-covariance matrix, and a biased estimator of the standard error of the regression. (Equivalently, we can view omitting $X_2$ as estimating (21) subject to the incorrect restriction $\beta_2 = 0$, which, as we have shown in previous chapters, leads to a biased estimator.) These problems affect the properties of every test we undertake on the model and make it impossible to draw correct inferences about $\beta_1$ due to the misspecification of the model.
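The bias results (25) and (34) can be seen in a short Monte Carlo. The sketch below (numpy only; the DGP, coefficients, and correlation structure are invented) omits a relevant $x_2$ that is correlated with $x_1$:

```python
import numpy as np

rng = np.random.default_rng(3)
n, reps = 200, 2000
beta1, beta2, sigma = 1.0, 0.8, 1.0
x1 = rng.normal(size=n)
x2 = 0.6 * x1 + 0.8 * rng.normal(size=n)   # correlated with x1

X1 = np.column_stack([np.ones(n), x1])     # short regression: x2 omitted
b1_draws, s2_draws = [], []
for _ in range(reps):
    y = beta1 * x1 + beta2 * x2 + sigma * rng.normal(size=n)
    b = np.linalg.lstsq(X1, y, rcond=None)[0]
    e = y - X1 @ b
    b1_draws.append(b[1])
    s2_draws.append(e @ e / (n - 2))       # s^2 with K1 = 2

print(np.mean(b1_draws))  # near 1.48 = beta1 plus the bias term in (25)
print(np.mean(s2_draws))  # exceeds sigma^2 = 1.0, as (34) predicts
```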

3 Inclusion of Irrelevant Variables

Suppose the true DGP is
$$y = X_1\beta_1 + \varepsilon, \quad \varepsilon \sim N(0, \sigma^2 I), \quad (36)$$
but we estimate the model
$$y = X_1\beta_1 + X_2\beta_2 + u. \quad (37)$$
That is, we have included extra (irrelevant) variables, $X_2$. We can write (37) as
$$y = X\beta + u, \quad (38)$$
where $\underset{n\times K}{X} = \begin{bmatrix} X_1 & X_2 \end{bmatrix}$, $\beta = \begin{bmatrix} \beta_1' & \beta_2' \end{bmatrix}'$, and $K = K_1 + K_2$.

In terms of the effect on the properties of the least squares estimator of $\beta_1$, including irrelevant variables is not as serious as omitting relevant regressors. Estimating (38) by OLS, we have
$$b = (X'X)^{-1}X'y \quad (39)$$
$$= (X'X)^{-1}X'X_1\beta_1 + (X'X)^{-1}X'\varepsilon. \quad (40)$$
Therefore,
$$E(b) = (X'X)^{-1}X'X_1\beta_1. \quad (41)$$
Define
$$\underset{K\times K_1}{A} = \begin{bmatrix} I_{K_1} \\ 0_{K_2\times K_1} \end{bmatrix},$$
a selection matrix such that
$$XA = X_1. \quad (42)$$
Then we can write
$$E(b) = (X'X)^{-1}X'XA\beta_1 = A\beta_1 = \begin{bmatrix} I \\ 0 \end{bmatrix}\beta_1,$$
or
$$E\begin{pmatrix} b_1 \\ b_2 \end{pmatrix} = \begin{pmatrix} \beta_1 \\ 0 \end{pmatrix}. \quad (43)$$
Expression (43) implies that $b_1$ and $b_2$ are unbiased estimators of $\beta_1$ and $\beta_2$ (whose true value is $0$). It can also be shown that
$$E(s^2) = \sigma^2, \quad (44)$$
i.e., $s^2$ is an unbiased estimator of $\sigma^2$ even in the face of this misspecification.

The cost of including irrelevant variables lies in the reduced precision of the estimates:

1. Additional regressors mean a loss of degrees of freedom. We try to estimate more parameters with the same number of observations, which leads to a loss of precision.

2. The variance of the estimator from the incorrect model will be higher. We may think of the situation as having failed to impose the valid restriction $\beta_2 = 0$ in (37). We know that the variance of the RLS estimator is smaller than that of OLS when the restrictions are true, so this result applies here, as the simulation sketch following Remark 1 illustrates.

Remark 1: The ultimate criterion for selecting the correct model should be suggested by theory.
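A companion Monte Carlo (again a numpy-only sketch with an invented DGP) illustrates the cost of this mistake: including an irrelevant $x_2$ leaves $b_1$ unbiased but inflates its sampling variance when $x_2$ is correlated with $x_1$:

```python
import numpy as np

rng = np.random.default_rng(4)
n, reps, beta1 = 100, 2000, 1.0
x1 = rng.normal(size=n)
x2 = 0.8 * x1 + 0.6 * rng.normal(size=n)   # irrelevant but correlated

X_short = np.column_stack([np.ones(n), x1])
X_long = np.column_stack([np.ones(n), x1, x2])
b_short, b_long = [], []
for _ in range(reps):
    y = beta1 * x1 + rng.normal(size=n)    # true DGP excludes x2
    b_short.append(np.linalg.lstsq(X_short, y, rcond=None)[0][1])
    b_long.append(np.linalg.lstsq(X_long, y, rcond=None)[0][1])

print(np.mean(b_short), np.mean(b_long))   # both near 1.0: unbiased
print(np.var(b_short) < np.var(b_long))    # True: precision is lost
```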

Remark 2: The discussion also suggests that we need an approach for selecting variables. Recall that we have already seen some criteria:

$R^2 = 1 - \frac{e'e}{SST}$. However, $e'e$ always stays the same or decreases when additional variables are added, irrespective of their relevance. Therefore, $R^2$ is not recommended.

$\bar{R}^2 = 1 - \frac{e'e/(n-K)}{SST/(n-1)}$. This is a better criterion than $R^2$, but some researchers have suggested that $\bar{R}^2$ does not penalize the loss of degrees of freedom enough when variables are added. This leads to other model selection criteria.

Some Model Selection Criteria:

1. FPE: Akaike's (1969, 1979) Final Prediction Error Criterion. In the present context,
$$FPE(K_j) = \left[\frac{e'e}{n - K_j}\right]\left[\frac{n + K_j}{n}\right], \quad (45)$$
where $K_j$ is the number of parameters in model $j$. It is also known as Amemiya's (1985) Prediction Criterion (PC).

2. AIC: Akaike's (1974) Information Criterion. In the present context,
$$\ln(AIC_j) = \ln\left(\frac{e'e}{n}\right) + \frac{2K_j}{n}. \quad (46)$$

3. SC: Schwarz's (1978) Criterion.
$$\ln(SC_j) = \ln\left(\frac{e'e}{n}\right) + \frac{K_j \ln n}{n}. \quad (47)$$
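As a closing illustration, the three criteria can be computed for nested candidate models. The sketch below (numpy only; formulas transcribed from (45) through (47), data invented) compares a correct model with one that adds an irrelevant regressor:

```python
import numpy as np

def criteria(y, X):
    """FPE, ln(AIC) and ln(SC) for an OLS fit of y on X (X includes the constant)."""
    n, k = X.shape
    e = y - X @ np.linalg.lstsq(X, y, rcond=None)[0]
    sse = e @ e
    fpe = (sse / (n - k)) * ((n + k) / n)         # (45)
    ln_aic = np.log(sse / n) + 2 * k / n          # (46)
    ln_sc = np.log(sse / n) + k * np.log(n) / n   # (47)
    return fpe, ln_aic, ln_sc

rng = np.random.default_rng(5)
n = 150
x1, x2 = rng.normal(size=n), rng.normal(size=n)
y = 1 + 0.7 * x1 + rng.normal(size=n)             # x2 is irrelevant

X_small = np.column_stack([np.ones(n), x1])
X_big = np.column_stack([np.ones(n), x1, x2])
print(criteria(y, X_small))   # lower values are preferred;
print(criteria(y, X_big))     # the smaller model typically wins here
```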