EC312: Advanced Econometrics Problem Set 3 Solutions in Stata


Nicola Limodio
www.nicolalimodio.com
N.Limodio1@lse.ac.uk

The data set AIRQ contains observations for 30 standard metropolitan statistical areas (SMSAs) in California for 1972 on the following variables:

    airq   indicator for air quality (lower is better)
    vala   value added of companies (in thousands of US$)
    rain   amount of rain (inches)
    coas   dummy variable, 1 for SMSAs at the coast, 0 for others
    dens   population density (per square mile)
    medi   average income per head (US$)

a) Estimate a linear regression model that explains airq from the other variables using ordinary least squares. Interpret the coefficient estimates.

Simply run

    reg airq vala rain coas dens medi

and give the usual interpretation (look at significance, sign and magnitude).

b) Test the null hypothesis that average income does not affect the air quality. Test the joint hypothesis that none of the variables has an effect upon air quality.

The first hypothesis can be tested by looking at the t-statistic of medi. The second is a test of joint significance (an F test) and can be performed by typing

    test vala rain coas dens medi

or by looking at the F statistic in the top-right corner of the regression output.

c) Test whether the variance of the error terms is different for coastal and non-coastal areas, using the Goldfeld-Quandt test. In view of the outcome of the test, comment upon the validity of the test from b). How would one correct the test of b) in the presence of heteroskedasticity?

From this question you can see that the source of heteroskedasticity that worries the researcher is the variable coas. For this reason the Goldfeld-Quandt (GQ) test is the most appropriate. Recall that GQ is the appropriate test when we observe the following pattern between our Y and X:

[Figure: pattern between Y and X]

However, note that the problem set is challenging on one point. In the usual GQ test we order the sample by the variable we believe to be responsible for the heteroskedasticity, split the sample into 3 equal parts (in terms of number of observations), and compare the RSS from the regressions on the 1st and 3rd parts. Here a dummy variable is responsible for the heteroskedasticity, so we apply the test to the two values of the dummy (no 3-sample partition) and need to apply an adjustment for the degrees of freedom. We can execute it as follows:

    reg airq vala rain coas dens medi if coas==1

Now let's save the RSS of this regression (in current Stata the same value is stored in e(rss)):

    scalar RSS1 = _result(4)
    scalar list RSS1

and analogously let's run the same regression for areas far from the coast:

    reg airq vala rain coas dens medi if coas==0
    scalar RSS2 = _result(4)
    scalar list RSS2

Now compute the ratio of the residual sums of squares, R = RSS2/RSS1. Under the null hypothesis of homoscedasticity, in the usual equal-split version of the test this ratio follows an F distribution with ((n-c-2k)/2, (n-c-2k)/2) degrees of freedom, where n is the sample size, c is the number of dropped observations, and k is the number of regressors in the model. In the example above, n=30, c=0 and k=4, hence R ~ F(11, 11), and we do not reject the null when R is below the critical value F. Hence:

    scalar R = RSS2/RSS1
    scalar list R

However, because the two groups have different sizes, we also need to adjust this number for the degrees of freedom (n1-k) and (n2-k), which in this case gives 4/16, hence

    scalar test = R*(4/16)

Then find the critical value from the relevant F table, here with (n2-k, n1-k) = (16, 4) degrees of freedom, and apply the usual rejection rule.

One way to account for this type of heteroskedasticity is to allow for clustering of the variance within groups, hence run

    reg airq vala rain coas dens medi, cluster(coas)

or, more appropriately given the small number of clusters, we can just use robust standard errors:

    reg airq vala rain coas dens medi, robust

More on this will follow in the next classes.

d) Perform a Breusch-Pagan test for heteroskedasticity related to all explanatory variables.

As before, run the OLS regression

    reg airq vala rain coas dens medi

and get the sum of the squared residuals:

    predict error, resid
    matrix accum E = error
    matrix list E
    scalar N = _result(1)

Now generate a disturbance correction factor, in the form of the sum of the squared residuals divided by the sample size:

    scalar sigmahat = el(E,1,1)/N
    scalar list N sigmahat

Regress the adjusted squared errors (the original squared errors divided by the correction factor) on the explanatory variables supposed to influence the heteroskedasticity. Following the question, we assume all variables influence heteroskedasticity. Hence:

    gen adjerr2 = (error^2)/sigmahat
    regress adjerr2 vala rain coas dens medi

This auxiliary regression gives you a model sum of squares (ESS) equal to:

    scalar ESS = _result(2)
    scalar list ESS

Under the null hypothesis of homoscedasticity, ESS/2 is asymptotically distributed as a chi-squared with k-1 degrees of freedom, where k is the number of coefficients in the auxiliary regression, constant included. In this case k=6 (five regressors plus the constant). Hence, comparing (1/2) ESS with the 5% critical value of a chi-squared with 5 degrees of freedom, you can apply the usual rejection rule for our test.
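The same Breusch-Pagan steps can also be written with the e() results that regress stores in current Stata. This is a minimal do-file sketch, assuming the AIRQ data are already in memory. The names ehat, adj_e2, sig2, BP and df are illustrative and not part of the original solution.

```stata
* Original regression and residuals
regress airq vala rain coas dens medi
predict double ehat, residuals

* Correction factor: residual sum of squares divided by sample size
scalar sig2 = e(rss)/e(N)

* Scaled squared residuals and auxiliary regression
generate double adj_e2 = ehat^2/sig2
regress adj_e2 vala rain coas dens medi

* Test statistic: half the model sum of squares of the auxiliary regression
scalar BP = e(mss)/2
* Degrees of freedom: number of slope regressors in the auxiliary regression
scalar df = e(df_m)

display "BP = " BP "   df = " df "   p-value = " chi2tail(df, BP)
display "5% critical value = " invchi2tail(df, 0.05)

* Built-in cross-check using all right-hand-side variables
quietly regress airq vala rain coas dens medi
estat hettest, rhs
```

Reading the degrees of freedom from e(df_m) avoids counting the coefficients by hand. The estat hettest, rhs line should report the same statistic, since its default is the normality-based version of the test.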

e) Perform a White test for heteroskedasticity. Comment upon the appropriateness of the White test in light of the number of observations and the degrees of freedom of the test.

Here the strategy is as follows:

(i) Run the OLS regression (as done above; the results are omitted).
(ii) Get the residuals.
(iii) Generate the squared residuals.
(iv) Generate new explanatory variables, in the form of the squares of the explanatory variables and the cross-products of the explanatory variables. Remember that you don't need to square a dummy variable.
(v) Regress the squared residuals on a constant, the original explanatory variables, and the set of auxiliary explanatory variables (squares and cross-products) you've just created.
(vi) Get the sample size (N) and the R-squared (R2) of this regression, and construct the test statistic N*R2.
(vii) Under the null hypothesis the errors are homoscedastic, and N*R2 is asymptotically distributed as a chi-squared with k-1 degrees of freedom (where k is the number of coefficients in the auxiliary regression). Apply the usual rejection rule.

Because we have only 30 observations, we come close to running out of degrees of freedom (because of the cross-products), so the White test is not the most appropriate here.

f), g) and h) simply repeat the previous tests with a different functional form; their application is left as an exercise.
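For part e), steps (i) to (vii) can be sketched in a do-file as follows. This is only an illustration, assuming the AIRQ data are in memory. The names uhat, uhat2, NR2 and dfw and the sq_ and x_ prefixes are hypothetical choices, not taken from the original solution.

```stata
* (i)-(iii) OLS regression, residuals, squared residuals
regress airq vala rain coas dens medi
predict double uhat, residuals
generate double uhat2 = uhat^2

* (iv) squares of the non-dummy regressors
foreach v of varlist vala rain dens medi {
    generate double sq_`v' = `v'^2
}

* (iv) all pairwise cross-products of the five regressors
local xs vala rain coas dens medi
local n : word count `xs'
forvalues i = 1/`n' {
    local a : word `i' of `xs'
    forvalues j = `=`i'+1'/`n' {
        local b : word `j' of `xs'
        generate double x_`a'_`b' = `a'*`b'
    }
}

* (v) auxiliary regression on levels, squares and cross-products
regress uhat2 vala rain coas dens medi sq_* x_*

* (vi)-(vii) N*R2 statistic and chi-squared p-value
scalar NR2 = e(N)*e(r2)
* Degrees of freedom: slope regressors kept in the auxiliary regression
scalar dfw = e(df_m)
display "N*R2 = " NR2 "   df = " dfw "   p-value = " chi2tail(dfw, NR2)

* Built-in cross-check
quietly regress airq vala rain coas dens medi
estat imtest, white
```

With 5 levels, 4 squares and 10 cross-products, the auxiliary regression has 19 regressors plus a constant. If none is dropped for collinearity, only 30 - 20 = 10 residual degrees of freedom remain, which is the concern raised above.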