CHAPTER 6: SPECIFICATION VARIABLES

Recall, we had the following six assumptions required for the Gauss-Markov Theorem:

1. The regression model is linear, correctly specified, and has an additive error term.
2. The error term has a zero population mean.
3. All explanatory variables are uncorrelated with the error term.
4. Observations of the error term are uncorrelated with each other (no serial correlation).
5. The error term has a constant variance (no heteroskedasticity).
6. No explanatory variable is a perfect linear function of any other explanatory variable (no perfect multicollinearity).

We have assumed these assumptions have been satisfied. For the rest of the course, we will deal with violations of these assumptions.

This chapter deals with Assumption 1. Specifying an equation consists of three parts:

1. Choosing the correct independent variables
2. Choosing the correct functional form
3. Choosing the correct form of the random error term

A specification error occurs when the model is misspecified in terms of the choice of variables, functional form, or error structure. In choosing explanatory variables, two types of errors are likely:

- Omitting a variable that belongs in the model
- Including an irrelevant variable

We examine the theoretical consequences of each of these specification errors.

Omitted Variables

What happens when an important variable that belongs in the model is omitted? Suppose the true model is:

Yᵢ = β₀ + β₁X₁ᵢ + β₂X₂ᵢ + εᵢ

But we instead estimate the model:

Yᵢ = β₀ + β₁X₁ᵢ + uᵢ, where uᵢ = β₂X₂ᵢ + εᵢ

Consequence: The estimated values of all the other regression coefficients included in the model will be biased unless:

- the true regression coefficient on the excluded variable is zero, or
- the excluded variable is uncorrelated with every included variable.

Why?
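A sketch of the answer, using the two-variable setup above: the OLS slope from the short regression of Y on X₁ alone has expectation

E[β̂₁] = β₁ + β₂α₁

where α₁ is the slope from an auxiliary regression of the omitted X₂ on the included X₁. The second term is the omitted variable bias. It vanishes only if β₂ = 0 (the excluded variable does not actually belong in the model) or if α₁ = 0 (the excluded variable is uncorrelated with the included one), which is exactly the pair of conditions listed above.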

Example

Let's think about trying to explain yearly earnings ($). We believe education and ability should both be in the regression equation. Problem: We do not observe ability and therefore do not have data for it.
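To make the earnings example concrete, here is a minimal simulation sketch in Python. The numbers, variable names, and data-generating process are hypothetical, chosen only for illustration: ability raises both schooling and pay, so leaving ability out biases the education coefficient upward.

```python
import numpy as np

# Hypothetical data-generating process (illustrative numbers only):
# ability raises both education and earnings, so omitting it biases the
# education coefficient in the short regression.
rng = np.random.default_rng(0)
n = 10_000
ability = rng.normal(size=n)
education = 12 + 1.5 * ability + rng.normal(size=n)          # correlated with ability
earnings = 20_000 + 3_000 * education + 5_000 * ability + rng.normal(scale=8_000, size=n)

def ols(y, *regressors):
    """OLS with an intercept; returns the coefficient vector [const, b1, b2, ...]."""
    X = np.column_stack([np.ones(len(y)), *regressors])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta

long_fit = ols(earnings, education, ability)    # true model: unbiased
short_fit = ols(earnings, education)            # ability omitted: biased upward

print("education coefficient, ability included:", round(long_fit[1], 1))   # close to 3000
print("education coefficient, ability omitted: ", round(short_fit[1], 1))  # about 3000 + 5000*Cov/Var
```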

Correcting for an Omitted Variable

Omitted variable bias is hard to detect:

- Invest time in thinking about the equation before you even look at the data.
- A warning sign is an estimated coefficient with the wrong sign (and significant) or an implausible magnitude.

Corrections:

- Include the omitted variable.
- Report the expected bias.

Irrelevant Variables

What happens when a variable that does not belong in the model is included in the equation? Suppose the true model is:

Yᵢ = β₀ + β₁X₁ᵢ + εᵢ

But we instead estimate the model:

Yᵢ = β₀ + β₁X₁ᵢ + β₂X₂ᵢ + εᵢ*

where X₂ is the irrelevant variable, so its true coefficient β₂ is zero.

Consequence: The estimated values of all the other regression coefficients included in the model will still be unbiased, but their variance will be higher, so we can expect lower t-statistics and larger standard errors for our estimated coefficients. This will happen unless the irrelevant variable is uncorrelated with every included variable.
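A minimal simulation sketch of this consequence (hypothetical numbers, not course data): adding an irrelevant regressor leaves the estimated slope unbiased, and the standard-error inflation shows up only when the irrelevant variable is correlated with an included one.

```python
import numpy as np

# Sketch: an irrelevant regressor leaves the slope on X1 unbiased, but it
# inflates the standard error on X1 only when it is correlated with X1.
rng = np.random.default_rng(1)
n, reps = 200, 2_000

def slope_and_se(X, y):
    """OLS slope on column 1 of X and its conventional standard error."""
    XtX_inv = np.linalg.inv(X.T @ X)
    beta = XtX_inv @ X.T @ y
    resid = y - X @ beta
    sigma2 = resid @ resid / (len(y) - X.shape[1])
    return beta[1], np.sqrt(sigma2 * XtX_inv[1, 1])

results = {"X1 only": [], "plus correlated junk": [], "plus uncorrelated junk": []}
for _ in range(reps):
    x1 = rng.normal(size=n)
    junk_corr = 0.9 * x1 + rng.normal(scale=0.4, size=n)   # irrelevant, correlated with X1
    junk_unc = rng.normal(size=n)                           # irrelevant, uncorrelated
    y = 1.0 + 2.0 * x1 + rng.normal(size=n)                 # true model uses X1 only
    ones = np.ones(n)
    results["X1 only"].append(slope_and_se(np.column_stack([ones, x1]), y))
    results["plus correlated junk"].append(slope_and_se(np.column_stack([ones, x1, junk_corr]), y))
    results["plus uncorrelated junk"].append(slope_and_se(np.column_stack([ones, x1, junk_unc]), y))

for name, draws in results.items():
    slopes, ses = np.array(draws).T
    print(f"{name:24s} mean slope = {slopes.mean():.3f}, mean SE = {ses.mean():.3f}")
```

In all three cases the mean slope stays near its true value of 2 (unbiasedness), but the mean standard error is noticeably larger only when the irrelevant regressor is correlated with X1.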

Example

Let's go back to our example of trying to explain BUEC 333 final exam grades (out of 100) using term test grades (out of 100), assignment grades (out of 100), and tutorial grades (out of 100). Suppose we consider adding Student ID as an explanatory variable. Our original estimation produced:

Dependent Variable: EXAM
Method: Least Squares
Date: 10/14/11   Time: 13:16
Sample: 1 265   Included observations: 265

Variable        Coefficient   Std. Error   t-Statistic   Prob.
ASSIGNMENTS     -0.029053     0.040932     -0.709776     0.4785
TUTORIALS        0.022809     0.044567      0.511782     0.6092
TEST             0.435593     0.038234     11.39270      0.0000
C               33.79783      5.050905      6.691441     0.0000

R-squared             0.334371    Mean dependent var       62.53052
Adjusted R-squared    0.326720    S.D. dependent var       13.57901
S.E. of regression    11.14207    Akaike info criterion    7.674313
Sum squared resid     32402.04    Schwarz criterion        7.728346
Log likelihood       -1012.846    F-statistic              43.70346
Durbin-Watson stat    2.140137    Prob(F-statistic)        0.000000

With the inclusion of student ID, we obtain the following regression output:

Dependent Variable: EXAM
Method: Least Squares
Date: 10/14/11   Time: 13:17
Sample: 1 265   Included observations: 265

Variable        Coefficient   Std. Error   t-Statistic   Prob.
ASSIGNMENTS     -0.027099     0.041043     -0.660274     0.5097
TUTORIALS        0.025054     0.044697      0.560537     0.5756
TEST             0.437118     0.038316     11.40839      0.0000
STUDENTID       -7.19E-09     9.34E-09     -0.769827     0.4421
C               35.42682      5.479887      6.464881     0.0000

R-squared             0.335885    Mean dependent var       62.53052
Adjusted R-squared    0.325668    S.D. dependent var       13.57901
S.E. of regression    11.15078    Akaike info criterion    7.679583
Sum squared resid     32328.35    Schwarz criterion        7.747125
Log likelihood       -1012.545    F-statistic              32.87460
Durbin-Watson stat    2.124827    Prob(F-statistic)        0.000000

Correlations:
              TEST       ASSIGNMENTS   TUTORIALS   STUDENTID
TEST          1.000000   0.124149      0.047238    0.062746
ASSIGNMENTS   0.124149   1.000000      0.153495    0.079238
TUTORIALS     0.047238   0.153495      1.000000    0.077850
STUDENTID     0.062746   0.079238      0.077850    1.000000

Consequences of including the irrelevant student ID variable in the model (comparing the two outputs above):

- The estimated coefficients on ASSIGNMENTS, TUTORIALS, and TEST barely change, consistent with their remaining unbiased.
- The standard errors of every included coefficient are slightly larger; because Student ID is nearly uncorrelated with the other regressors (all correlations below 0.08), the loss of precision is small.
- STUDENTID itself is statistically insignificant (t ≈ -0.77, p ≈ 0.44).
- Adjusted R-squared falls (from 0.326720 to 0.325668), and both the Akaike and Schwarz criteria rise.

Choosing which variables belong in the model and which do not is difficult. Let economic theory and careful judgment guide your choice of variables. Thinking about the problem is the hard part and must be done before you estimate the model. There are some formal specification criteria we will look at.

What you should not do:

- Do not test various combinations of variables until you find something that you like (data mining: estimating a lot of specifications before the equation has been chosen).
- Avoid a sequential specification search: sequentially dropping variables.

Formal specification criteria

We consider three such criteria. Specification tests, however, cannot prove that a particular specification is true.

Ramsey's Regression Specification Error Test (RESET)

The RESET test is one of the most commonly used specification tests. It is a general test that measures whether the overall fit of a regression equation can be significantly improved by adding polynomials in the fitted values Ŷ. If so, there is evidence of specification error. The RESET test involves three steps:

1. Estimate the equation to be tested using OLS and save the fitted (predicted) values Ŷᵢ.

2. Take these fitted values, create Ŷᵢ², Ŷᵢ³, and Ŷᵢ⁴ terms, and estimate the new regression equation:

   Yᵢ = β₀ + β₁X₁ᵢ + ... + βₖXₖᵢ + γ₂Ŷᵢ² + γ₃Ŷᵢ³ + γ₄Ŷᵢ⁴ + εᵢ

3. Compare the fits of the two equations using the F-test:

   F = [(RSS_old − RSS_new)/M] / [RSS_new/(n − k − 1 − M)] ~ F(M, n − k − 1 − M)

   where RSS_old is the residual sum of squares from the original equation, RSS_new is the residual sum of squares from the equation with the added Ŷ terms, M is the number of added terms, n is the sample size, and k is the number of original explanatory variables.

If the two equations are significantly different in fit, then F will be large enough to reject the null hypothesis H₀: γ₂ = γ₃ = γ₄ = 0. If you do reject, the test does not tell you what the specification error is!
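A minimal Python sketch of these three steps. The function name, the simulated data, and the use of statsmodels/scipy are illustrative assumptions, not the course's EViews workflow.

```python
import numpy as np
import statsmodels.api as sm
from scipy import stats

def reset_test(y, X):
    """Ramsey RESET: add powers of the fitted values and F-test their joint significance.

    y : 1-D array of the dependent variable
    X : 2-D array of explanatory variables (a constant is added here)
    Returns the F statistic and its p-value.
    """
    X1 = sm.add_constant(X)

    # Step 1: estimate the original equation and save the fitted values.
    base = sm.OLS(y, X1).fit()
    yhat = base.fittedvalues

    # Step 2: re-estimate with yhat^2, yhat^3, yhat^4 added as regressors.
    X2 = np.column_stack([X1, yhat**2, yhat**3, yhat**4])
    augmented = sm.OLS(y, X2).fit()

    # Step 3: F-test comparing the two fits (M = 3 added terms).
    M = 3
    F = ((base.ssr - augmented.ssr) / M) / (augmented.ssr / augmented.df_resid)
    p_value = stats.f.sf(F, M, augmented.df_resid)
    return F, p_value

# Illustrative use with simulated data: a quadratic relationship estimated as
# linear, so RESET should reject.
rng = np.random.default_rng(0)
x = rng.normal(size=500)
y = 1 + 2 * x + 1.5 * x**2 + rng.normal(size=500)
F, p = reset_test(y, x.reshape(-1, 1))
print(f"RESET F = {F:.2f}, p-value = {p:.4f}")
```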

Akaike's Information Criterion and the Schwarz Criterion

These criteria allow you to compare alternative specifications by adjusting the residual sum of squares (RSS) for the sample size (n) and the number of explanatory variables (k):

AIC = log(RSS/n) + 2(k + 1)/n

SC = log(RSS/n) + log(n)(k + 1)/n

Estimate two specifications, calculate AIC and SC for each (EViews automatically outputs these), and choose the model with the lower AIC or SC. In contrast to the RESET test, AIC and SC require you to have alternative specifications to test between.
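A quick sketch of the comparison in Python, using the RSS-based formulas above on simulated data. The helper name aic_sc and the numbers are hypothetical; EViews reports likelihood-based AIC/SC values that differ in level, but the ranking of specifications is what matters.

```python
import numpy as np
import statsmodels.api as sm

def aic_sc(y, X):
    """AIC and SC from the RSS-based formulas above; X excludes the constant."""
    n, k = len(y), X.shape[1]
    rss = sm.OLS(y, sm.add_constant(X)).fit().ssr
    aic = np.log(rss / n) + 2 * (k + 1) / n
    sc = np.log(rss / n) + np.log(n) * (k + 1) / n
    return aic, sc

# Illustrative comparison: a specification with one relevant regressor (x1)
# versus one that also carries an irrelevant regressor (x2).
rng = np.random.default_rng(0)
n = 265
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)                      # irrelevant
y = 30 + 0.4 * x1 + rng.normal(scale=11, size=n)

for label, X in [("x1 only", x1.reshape(-1, 1)),
                 ("x1 and irrelevant x2", np.column_stack([x1, x2]))]:
    aic, sc = aic_sc(y, X)
    print(f"{label:24s} AIC = {aic:.4f}, SC = {sc:.4f}")

# The specification with the lower AIC/SC is preferred; here that is usually "x1 only".
```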