Outline. 9. Heteroskedasticity Cross Sectional Analysis. Homoskedastic Case


Outline
9. Heteroskedasticity
Cross Sectional Analysis
Read Wooldridge (2013), Chapter 8

I. Consequences of Heteroskedasticity
II. Testing for Heteroskedasticity
III. Heteroskedasticity Robust Inference
IV. Weighted Least Squares Estimation

I. Consequences of Heteroskedasticity

Homoskedastic Case
Motivation: Consider the model
$sav_i = \beta_0 + \beta_1 inc_i + u_i$
$y_i = sav_i$ = saving, $x_i = inc_i$ = income
Constant variance (MLR.5): $Var(u_i \mid inc_i) = \sigma^2$, which implies that $Var(sav_i \mid inc_i) = \sigma^2$.
[Figure: conditional densities $f(y \mid x)$ around the line $E(y \mid x) = \beta_0 + \beta_1 x$, drawn at $x_1 = 5{,}000$ and $x_2 = 100{,}000$ with the same spread.]

Heteroskedasticity

Example of Heteroskedasticity
Violation of homoskedasticity: What if the variability of savings of the rich is greater than that of the lower income group?
Here we say that the variance of savings y (or of the unobserved factors u) increases with income:
$Var(sav \mid inc = 5{,}000) = 5$ (see $x_1$)
$Var(sav \mid inc = 100{,}000) = 100$ (see $x_2$)
When variances are unequal, this problem is called heteroskedasticity. (See Graph)
[Figure: conditional densities $f(y \mid x)$ around the line $E(y \mid x) = \beta_0 + \beta_1 x$, with a wider spread at $x_2 = 100{,}000$ than at $x_1 = 5{,}000$.]

MLR.5 violated: Heteroskedasticity (unequal variances)
Consider a model
$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_k x_k + u$
Heteroskedasticity: $Var(u_i \mid x_{i1}, \dots, x_{ik}) = \sigma_i^2$

Properties unaffected by heteroskedasticity:
1) OLS estimators are still unbiased and consistent.
2) The interpretation is the same for the goodness of fit measures, $R^2$ and $\bar{R}^2$.

Properties invalid under heteroskedasticity:
1) The estimators of the variances, $Var(\hat{\beta}_j)$, are biased.
2) t, F and LM statistics no longer have t, F and $\chi^2$ distributions.
3) OLS is no longer the best linear unbiased estimator (BLUE).

II. Testing for Heteroskedasticity

There are many tests for heteroskedasticity, but we will learn two modern tests:
1) Breusch Pagan Test for Heteroskedasticity
2) White Test
   - use no cross terms
   - use cross terms
   - use fitted values of the LHS variable
These modern tests ask whether the variance of the error depends on the explanatory variables (alternative) or does not (null).

Breusch Pagan Test
Given MLR.1-MLR.4, consider the model
$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3 + u$
Want to test whether MLR.5 is true:
$H_0: Var(u \mid x_1, x_2, x_3) = E(u^2 \mid x_1, x_2, x_3) = \sigma^2$
To test whether $u^2$ is related to the x's, write (for k regressors in general)
$u^2 = \delta_0 + \delta_1 x_1 + \delta_2 x_2 + \dots + \delta_k x_k + v$
$H_0: \delta_1 = \delta_2 = \dots = \delta_k = 0$

Use residuals for u
We do not observe $u^2$, but we can estimate it with the squared OLS residuals $\hat{u}^2$:
$\hat{u}^2 = \delta_0 + \delta_1 x_1 + \delta_2 x_2 + \dots + \delta_k x_k + v$
Use the F test or the LM test for the overall significance, $H_0: \delta_1 = \delta_2 = \dots = \delta_k = 0$:
$F = \dfrac{R^2_{\hat{u}^2}/k}{(1 - R^2_{\hat{u}^2})/(n-k-1)} \sim F_{k,\,n-k-1}$
$LM = n \cdot R^2_{\hat{u}^2} \sim \chi^2_k$

Example: Consider the cigarette demand function
income: annual income in dollars
cigpric: state cigarette price, cents per pack
educ: years of schooling
age: age measured in years
restaurn = 1 if a state has restaurant smoking restrictions
         = 0 if a state has no restaurant smoking restrictions
$cigs = \beta_0 + \beta_1 \log(income) + \beta_2 \log(cigpric) + \beta_3 educ + \beta_4 age + \beta_5 age^2 + \beta_6 restaurn + u$

BP Test for heteroskedasticity
Step 1: Estimate the above equation.
Step 2: Obtain the residuals $\hat{u}$ from the cigs equation. In EViews, obtain the residual series resid01.
Step 3: Regress $\hat{u}^2$ on all x's:
resid01^2 $= \delta_0 + \delta_1 \log(income) + \delta_2 \log(cigpric) + \delta_3 educ + \delta_4 age + \delta_5 age^2 + \delta_6 restaurn + v$
Step 4: Use the F and LM tests for heteroskedasticity. Compute the F and LM statistics and compare them to critical values of the $F_{k,\,n-k-1}$ and $\chi^2_k$ distributions.
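The four Breusch Pagan steps above can be sketched in Python as follows. This is only an illustrative sketch on simulated data, not the EViews procedure used in the slides: the helper names `ols` and `breusch_pagan`, the simulated variables and the random seed are all assumptions made for the example.

```python
import numpy as np

def ols(X, y):
    """OLS of y on X (X must contain a constant column).
    Returns coefficients, residuals and R-squared."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    r2 = 1.0 - (resid @ resid) / np.sum((y - y.mean()) ** 2)
    return beta, resid, r2

def breusch_pagan(X, y):
    """F and LM statistics from regressing squared residuals on all x's."""
    n, p = X.shape
    k = p - 1                              # number of slope coefficients
    _, resid, _ = ols(X, y)                # Steps 1-2: estimate, keep residuals
    _, _, r2_u = ols(X, resid ** 2)        # Step 3: regress u-hat^2 on the x's
    f_stat = (r2_u / k) / ((1.0 - r2_u) / (n - k - 1))
    lm_stat = n * r2_u                     # Step 4: LM = n * R-squared
    return f_stat, lm_stat, r2_u

# Hypothetical simulated data: the error standard deviation grows with x.
rng = np.random.default_rng(0)
n = 500
x = rng.uniform(1.0, 10.0, size=n)
u = rng.normal(size=n) * x
y = 1.0 + 0.5 * x + u
X = np.column_stack([np.ones(n), x])

f_stat, lm_stat, r2_u = breusch_pagan(X, y)
print("F =", f_stat, "LM =", lm_stat)
```

With one regressor (k = 1) the LM statistic would be compared with the 5% critical value of the chi-square distribution with 1 degree of freedom, 3.84.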

LM version of the Breusch Pagan Test
Step 4: Use the F test and the LM test
1) F statistic = 5.55, p value = 0.00001
2) LM statistic = Obs*R-squared = 807 × 0.039973 = 32.26
Chi-square distribution with 6 DFs:
c = 12.59 (5% significance level)
c = 16.81 (1% significance level)
What can we say about heteroskedasticity?

EViews: Step 1: Estimate the cigarette demand equation

Dependent Variable: CIGS
Sample: 1 807
Included observations: 807

Variable        Coefficient   Std. Error   t-Statistic   Prob.
C               -3.639823     24.07866     -0.151164     0.8799
LOG(INCOME)      0.880268      0.727783     1.209519     0.2268
LOG(CIGPRIC)    -0.750862      5.7733      -0.130057     0.8966
EDUC            -0.501498      0.167077    -3.001596     0.0028
AGE              0.770694      0.160122     4.813155     0.0000
AGE^2           -0.009023      0.001743    -5.176494     0.0000
RESTAURN        -2.825085      1.111794    -2.541016     0.0112

R-squared            0.052737    Mean dependent var      8.686493
Adjusted R-squared   0.045632    S.D. dependent var      13.72152
S.E. of regression   13.40479    Akaike info criterion   8.037737
Sum squared resid    143750.7    Schwarz criterion       8.078448
Log likelihood      -3236.227    F-statistic             7.423062
Prob(F-statistic)    0.000000

Proc/Make Residual Series.
Step 2: Name for the residual series: resid01

Step 3: Regress resid01^2 ($\hat{u}^2$) on all x's

Dependent Variable: RESID01^2
Sample: 1 807

Variable        Coefficient   Std. Error   t-Statistic   Prob.
C               -636.3032     652.4945     -0.975186     0.3298
LOG(INCOME)       24.63847     19.7218      1.2493       0.2119
LOG(CIGPRIC)      60.97663    156.4487      0.389755     0.6968
EDUC              -2.3842       4.5275     -0.5266       0.5986
AGE               19.41748      4.339068    4.475034     0.0000
AGE^2             -0.214792     0.047234   -4.547398     0.0000
RESTAURN         -71.1814      30.1279     -2.3626       0.0184

R-squared            0.039973    Mean dependent var      178.1297
Adjusted R-squared   0.032773    S.D. dependent var      369.3519
S.E. of regression   363.2491    Akaike info criterion   14.63669
Sum squared resid    1.06E+08    Schwarz criterion       14.6774
F-statistic          5.551687    Prob(F-statistic)       0.00001

White Test of Heteroskedasticity
Consider the three variable model
$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3 + u$
$H_0: Var(u \mid x_1, x_2, x_3) = \sigma^2$
Weaker assumption by White (1980): $u^2$ is uncorrelated with $(x_1, x_2, x_3)$, $(x_1^2, x_2^2, x_3^2)$ and $(x_1 x_2, x_1 x_3, x_2 x_3)$.
$\hat{u}^2 = \delta_0 + \delta_1 x_1 + \delta_2 x_2 + \delta_3 x_3 + \delta_4 x_1^2 + \delta_5 x_2^2 + \delta_6 x_3^2 + \delta_7 x_1 x_2 + \delta_8 x_1 x_3 + \delta_9 x_2 x_3 + v$
$H_0: \delta_1 = \delta_2 = \dots = \delta_9 = 0$
Use the F and LM tests. What are the rejection rules?

Intractable: more regressors
Consider the model with 6 regressors
$cigs = \beta_0 + \beta_1 \log(income) + \beta_2 \log(cigpric) + \beta_3 educ + \beta_4 age + \beta_5 age^2 + \beta_6 restaurn + u$
$\hat{u}^2 = \delta_0 +$ 25 regressors $+ v$
The 25 regressors are:
LOG(INCOME), (LOG(INCOME))^2, (LOG(INCOME))*(LOG(CIGPRIC)), (LOG(INCOME))*EDUC, (LOG(INCOME))*AGE, (LOG(INCOME))*(AGE^2), (LOG(INCOME))*RESTAURN,
LOG(CIGPRIC), (LOG(CIGPRIC))^2, (LOG(CIGPRIC))*EDUC, (LOG(CIGPRIC))*AGE, (LOG(CIGPRIC))*(AGE^2), (LOG(CIGPRIC))*RESTAURN,
EDUC, EDUC^2, EDUC*AGE, EDUC*(AGE^2), EDUC*RESTAURN,
AGE, AGE^2, AGE*(AGE^2), AGE*RESTAURN, (AGE^2)^2, (AGE^2)*RESTAURN,
RESTAURN
$H_0: \delta_1 = \delta_2 = \dots = \delta_{25} = 0$
k = 25, n − k − 1 = n − 26
What are the rejection rules?

Easier way to implement the White Test
Idea: use the OLS fitted values in a test for heteroskedasticity:
$\hat{y} = \hat{\beta}_0 + \hat{\beta}_1 x_1 + \hat{\beta}_2 x_2 + \dots + \hat{\beta}_k x_k$
When we square $\hat{y}$, we get a particular function of all the squares and cross products of the x's.
Simpler form of the White Test:
$\hat{u}^2 = \delta_0 + \delta_1 \hat{y} + \delta_2 \hat{y}^2 + v$
What are the null hypothesis and rejection rules?

A special case of the White Test for heteroskedasticity
Model: $cigs = \beta_0 + \beta_1 \log(income) + \beta_2 \log(cigpric) + \beta_3 educ + \beta_4 age + \beta_5 age^2 + \beta_6 restaurn + u$
Step 1: Estimate the above equation.
Step 2: Obtain the fitted values $\hat{y}$ from the cigarette equation. In EViews, obtain the residual series resid01 (or $\hat{u}$). Note that $y = \hat{y} + \hat{u}$, or cigs = cigshat + resid01. Generate the series for $\hat{y}$, called cigshat:
cigshat = cigs − resid01
Step 3: Regress $\hat{u}^2$ on $\hat{y}$ and $\hat{y}^2$:
resid01^2 $= \delta_0 + \delta_1$ cigshat $+ \delta_2$ cigshat^2 $+ v$
Step 4: Use the F and LM tests for heteroskedasticity. Compute the F and LM statistics and compare them to critical values of the $F_{2,\,n-3}$ and $\chi^2_2$ distributions.

EViews: Step 1: Estimate the cigarette demand equation (Dependent Variable: CIGS, 807 observations; same OLS output as in Step 1 of the Breusch Pagan test).
Step 2: Generate the series for $\widehat{cigs}$, called cigshat.
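The fitted-value form of the White test described above can be sketched in Python as follows. This is a hedged illustration on simulated data: the helper `ols`, the two made-up regressors `x1` and `x2`, and the seed are assumptions for the example and are not part of the slides.

```python
import numpy as np

def ols(X, y):
    """OLS of y on X (X must contain a constant column).
    Returns coefficients, residuals and R-squared."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    r2 = 1.0 - (resid @ resid) / np.sum((y - y.mean()) ** 2)
    return beta, resid, r2

# Hypothetical simulated data: the error variance depends on x1.
rng = np.random.default_rng(1)
n = 500
x1 = rng.uniform(1.0, 10.0, size=n)
x2 = rng.normal(size=n)
u = rng.normal(size=n) * x1
y = 1.0 + 0.5 * x1 - 0.3 * x2 + u
X = np.column_stack([np.ones(n), x1, x2])

# Steps 1-2: estimate the model and recover fitted values as y minus residuals
# (the analogue of cigshat = cigs - resid01).
_, resid, _ = ols(X, y)
yhat = y - resid

# Step 3: regress the squared residuals on yhat and yhat^2.
Z = np.column_stack([np.ones(n), yhat, yhat ** 2])
_, _, r2_u = ols(Z, resid ** 2)

# Step 4: F and LM statistics with 2 restrictions.
f_stat = (r2_u / 2) / ((1.0 - r2_u) / (n - 3))
lm_stat = n * r2_u
print("F =", f_stat, "LM =", lm_stat)
```

The LM statistic would be compared with 5.99, the 5% critical value of the chi-square distribution with 2 degrees of freedom. The number of restrictions stays at 2 however many regressors the original model has, which is what makes this form easier than the full White test.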

Step 3: Regress resid01^2 on cigshat and cigshat^2

Dependent Variable: RESID01^2
Method: Least Squares
Sample: 1 807
Included observations: 807

Variable     Coefficient   Std. Error   t-Statistic   Prob.
C            14.05341      47.79852     0.294013      0.7688
CIGSHAT      14.05344      11.56743     1.214914      0.2248
CIGSHAT^2     0.491978      0.755633    0.65108       0.5152

R-squared            0.032928    Mean dependent var      178.1297
Adjusted R-squared   0.030522    S.D. dependent var      369.3519
S.E. of regression   363.6715    Akaike info criterion   14.63409
Sum squared resid    1.06E+08    Schwarz criterion       14.65154
F-statistic          13.6876     Prob(F-statistic)       0.000001

Step 4: LM and F tests for the White Test
resid01^2 $= \delta_0 + \delta_1$ cigshat $+ \delta_2$ cigshat^2 $+ v$
$H_0: \delta_1 = \delta_2 = 0$
F test: F = 13.6876, p value = .000001
Could you find critical values to verify this?
LM test: LM = obs*R-squared = 807 × 0.032928 = 26.57
Chi-square distribution with 2 DFs:
c = 5.99 (5% significance level)
c = 9.21 (1% significance level)

Model: $cigs = \beta_0 + \beta_1 \log(income) + \beta_2 \log(cigpric) + \beta_3 educ + \beta_4 age + \beta_5 age^2 + \beta_6 restaurn + u$
We could follow similar steps to test for heteroskedasticity using the other White tests.
3) White Test with no cross terms
$\hat{u}^2 = \delta_0 + \delta_1 \log(income) + \delta_2 \log(cigpric) + \delta_3 educ + \delta_4 age + \delta_5 age^2 + \delta_6 restaurn + \delta_7 [\log(income)]^2 + \delta_8 [\log(cigpric)]^2 + \delta_9 educ^2 + \delta_{10} [age^2]^2 + v$
4) White Test with cross terms
$\hat{u}^2 = \delta_0 +$ 25 regressors $+ v$

EViews: It has commands to find the F and LM statistics using the White tests (methods 3 and 4).
In the Equation window, first run the cigs regression (Dependent Variable: CIGS, 807 observations; same OLS output as in Step 1 of the Breusch Pagan test). Then, in the equation output window:
(3) Choose View/Residual Tests/White heteroskedasticity (no cross terms)
(4) Choose View/Residual Tests/White heteroskedasticity (cross terms)

View/Residual Tests/White heteroskedasticity (no cross terms) for (3)

White Heteroskedasticity Test:
F-statistic      3.732565    Probability   0.000065
Obs*R-squared    36.14649    Probability   0.000079

Dependent Variable: RESID^2
Included observations: 807

Variable            Coefficient   Std. Error   t-Statistic   Prob.
C                   18981.92      18710.46      1.014509     0.3106
LOG(INCOME)            16.21011     277.0494     0.05851      0.9534
(LOG(INCOME))^2         0.413056     15.27585    0.02704      0.9784
LOG(CIGPRIC)        -9800.158      9251.297     -1.059328     0.2898
(LOG(CIGPRIC))^2     1219.708      1144.864      1.065374     0.287
EDUC                   16.17731      27.71945    0.583609     0.5596
EDUC^2                 -0.782004      1.0905    -0.7171       0.4735
AGE                    35.95817      11.20129    3.210181     0.0014
AGE^2                  -0.499351      0.185168  -2.69675      0.0071
(AGE^2)^2               1.92E-05      1.22E-05   1.5729       0.1161
RESTAURN              -64.69514      30.44613   -2.124905     0.0339

R-squared        0.044791    Mean dependent var   178.1297
Log likelihood  -5896.875    F-statistic          3.732565
Durbin-Watson stat 1.937056  Prob(F-statistic)    0.000065

View/Residual Tests/White heteroskedasticity (cross terms) for (4)

White Heteroskedasticity Test:
F-statistic      2.159     Probability   0.000905
Obs*R-squared    52.17     Probability   0.0011

Dependent Variable: RESID^2
Included observations: 807

Variable (25 regressors)        Coefficient   Std. Error   t-Statistic   Prob.
C                               29374.77      20559.14      1.428794     0.1535
LOG(INCOME)                     -1049.63        963.4359   -1.0895       0.2763
(LOG(INCOME))^2                    -3.9412       17.071    -0.2309       0.8175
(LOG(INCOME))*(LOG(CIGPRIC))      329.8896      239.2417    1.378897     0.1683
...
RESTAURN                        -2868.2        2986.776    -0.9603       0.3372

R-squared        0.064652    Mean dependent var   178.1297
F-statistic      2.159       Prob(F-statistic)    0.000905

III. Heteroskedasticity Robust Inference after OLS Estimation

In the presence of heteroskedasticity, EViews can adjust standard errors and t, F and LM statistics so that they are valid. This method is called the heteroskedasticity-robust procedure.
Technically, this procedure is valid, at least in large samples, whether or not the errors have constant variance.

Sketch the procedure
Consider the simple regression model
$y_i = \beta_0 + \beta_1 x_i + u_i$, $Var(u_i \mid x_i) = \sigma_i^2$
Steps to find robust standard errors:
1) Find the estimator of $\beta_1$.
2) Under the assumptions MLR.1-MLR.4, the variance $Var(\hat{\beta}_1)$ can be found.
3) White (1980) suggests using $\hat{u}_i^2$ in place of $\sigma_i^2$. Thus, we can find the estimator of $Var(\hat{\beta}_1)$.
4) A heteroskedasticity-robust standard error can be found as its square root.
Sketch for the general case?

Variance with Heteroskedasticity
For the simple case,
$\hat{\beta}_1 = \beta_1 + \dfrac{\sum_i (x_i - \bar{x}) u_i}{\sum_i (x_i - \bar{x})^2}$, so
$Var(\hat{\beta}_1) = \dfrac{\sum_i (x_i - \bar{x})^2 \sigma_i^2}{SST_x^2}$, where $SST_x = \sum_i (x_i - \bar{x})^2$.
A valid estimator for this when $\sigma_i^2 \neq \sigma^2$ is
$\dfrac{\sum_i (x_i - \bar{x})^2 \hat{u}_i^2}{SST_x^2}$, where $\hat{u}_i$ are the OLS residuals.

Variance with Heteroskedasticity
For the general multiple regression model, a valid estimator of $Var(\hat{\beta}_j)$ with heteroskedasticity is
$\widehat{Var}(\hat{\beta}_j) = \dfrac{\sum_i \hat{r}_{ij}^2 \hat{u}_i^2}{SSR_j^2}$,
where $\hat{r}_{ij}$ is the ith residual from regressing $x_j$ on all other independent variables, and $SSR_j$ is the sum of squared residuals from this regression.

Example: cigs equation
Consider the model
$cigs = \beta_0 + \beta_1 \log(income) + \beta_2 \log(cigpric) + \beta_3 educ + \beta_4 age + \beta_5 age^2 + \beta_6 restaurn + u$
Find the cigs equation using the heteroskedasticity-robust procedure.

Steps in EViews:
Step 1: Estimate the equation by the usual OLS method.
Step 2: Find the equation with heteroskedasticity-robust standard errors. In the Equation window, choose Estimate. In the Equation Estimation box, click the Options button. Then click Heteroskedasticity consistent coefficient covariance and click White.
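The robust variance formulas above can be checked numerically. The following Python sketch computes the usual and the White (HC0) standard errors in matrix form and confirms that, in a simple regression, the slope's robust variance equals $\sum_i (x_i - \bar{x})^2 \hat{u}_i^2 / SST_x^2$. The function name `ols_with_se`, the simulated data and the seed are assumptions for illustration. The sketch applies no degrees-of-freedom adjustment, so its output need not match a particular package's default to the last digit.

```python
import numpy as np

def ols_with_se(X, y):
    """OLS with usual and White (HC0) heteroskedasticity-robust standard errors."""
    n, p = X.shape
    xtx_inv = np.linalg.inv(X.T @ X)
    beta = xtx_inv @ X.T @ y
    resid = y - X @ beta
    # Usual formula: sigma-hat^2 * (X'X)^{-1}, valid only under homoskedasticity.
    sigma2 = (resid @ resid) / (n - p)
    se_usual = np.sqrt(np.diag(sigma2 * xtx_inv))
    # White (1980): replace each sigma_i^2 by the squared residual u-hat_i^2.
    meat = X.T @ (X * (resid ** 2)[:, None])
    se_robust = np.sqrt(np.diag(xtx_inv @ meat @ xtx_inv))
    return beta, resid, se_usual, se_robust

# Hypothetical simulated data with error variance increasing in x.
rng = np.random.default_rng(2)
n = 400
x = rng.uniform(1.0, 10.0, size=n)
y = 2.0 + 0.5 * x + rng.normal(size=n) * x
X = np.column_stack([np.ones(n), x])

beta, resid, se_usual, se_robust = ols_with_se(X, y)

# Simple-regression formula from the slide for the slope coefficient.
x_dev = x - x.mean()
sst_x = x_dev @ x_dev
var_slope_formula = np.sum(x_dev ** 2 * resid ** 2) / sst_x ** 2

print("usual s.e.: ", se_usual)
print("robust s.e.:", se_robust)
print("slope variance, scalar formula:", var_slope_formula)
```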

Step 1: Estimate the cigs equation in the usual way (Dependent Variable: CIGS, 807 observations; same OLS output as in Step 1 of the Breusch Pagan test).

Step 2: White Heteroskedasticity-Consistent s.e.'s

Dependent Variable: CIGS
Included observations: 807
White Heteroskedasticity-Consistent Standard Errors & Covariance

Variable        Coefficient   Std. Error   t-Statistic   Prob.
C               -3.639823     25.61646     -0.142089     0.887
LOG(INCOME)      0.880268      0.596011     1.476931     0.1401
LOG(CIGPRIC)    -0.750862      6.035401    -0.12441      0.901
EDUC            -0.501498      0.162394    -3.088167     0.0021
AGE              0.770694      0.138284     5.5733       0.0000
AGE^2           -0.009023      0.001462    -6.170768     0.0000
RESTAURN        -2.825085      1.008033    -2.802573     0.0052

R-squared            0.052737    Mean dependent var      8.686493
Adjusted R-squared   0.045632    S.D. dependent var      13.72152
S.E. of regression   13.40479    Akaike info criterion   8.037737
Sum squared resid    143750.7    Schwarz criterion       8.078448
Log likelihood      -3236.227    F-statistic             7.423062
Prob(F-statistic)    0.000000

OLS and Robust estimates: compared
Dependent Variable: CIGS

Variable        Coefficient   s.e.        s.e. (robust)   Prob.    Prob. (robust)
C               -3.639823     24.07866    25.61646        0.8799   0.887
LOG(INCOME)      0.880268      0.727783    0.596011       0.2268   0.1401
LOG(CIGPRIC)    -0.750862      5.7733      6.035401       0.8966   0.901
EDUC            -0.501498      0.167077    0.162394       0.0028   0.0021
AGE              0.770694      0.160122    0.138284       0        0
AGE^2           -0.009023      0.001743    0.001462       0        0
RESTAURN        -2.825085      1.111794    1.008033       0.0112   0.0052

Interpretation: cigs demand equation
$cigs = \beta_0 + \beta_1 \log(income) + \beta_2 \log(cigpric) + \beta_3 educ + \beta_4 age + \beta_5 age^2 + \beta_6 restaurn + u$
1) In this application, any variable that is statistically significant at the 5% level using the usual t test is still significant under the heteroskedasticity-robust procedure. Variables: educ, age, age^2 and restaurn.
2) The robust standard errors can be either larger or smaller than the usual standard errors.
3) The robust s.e. on log(income) becomes smaller, but that on log(cigpric) is larger.
4) How to interpret the coefficients of the various variables in the model?

Robust F test and Wald Test
$cigs = \beta_0 + \beta_1 \log(income) + \beta_2 \log(cigpric) + \beta_3 educ + \beta_4 age + \beta_5 age^2 + \beta_6 restaurn + u$
Want to test the null hypothesis $H_0: \beta_1 = \beta_2 = 0$
In the equation with robust standard errors, we can easily obtain the heteroskedasticity-robust F statistic, also called the heteroskedasticity-robust Wald statistic.
Step 1. Estimate with White heteroskedasticity-consistent s.e.'s (same output as the White Heteroskedasticity-Consistent Standard Errors & Covariance table for CIGS).
Step 2. View/Coefficient Tests/Wald Coefficient Restrictions
Step 3. Type in c(2)=0, c(3)=0
The usual F statistic is F = 0.7333; p value = .4806 (incorrect under heteroskedasticity)
The robust F statistic = 1.099; p value = .3338 (correct)
Since p value > α, we do not reject $H_0$: there is no evidence that income and price jointly affect cigarette demand.

View/Coefficient Tests/Redundant Variables

Redundant Variables: LOG(INCOME) LOG(CIGPRIC)
F-statistic            0.733326    Probability   0.480631
Log likelihood ratio   1.478131    Probability   0.47756

View/Coefficient Tests/Wald Coefficient Restrictions

Wald Test:
Equation: Untitled
Test Statistic   Value      df         Probability
F-statistic      1.098858   (2, 800)   0.3338
Chi-square       2.197717   2          0.3333

Null Hypothesis Summary:
Normalized Restriction (= 0)   Value       Std. Err.
C(2)                            0.880268   0.596011
C(3)                           -0.750862   6.035401
Restrictions are linear in coefficients.

IV. Weighted Least Squares Estimation

Assume that MLR.1-MLR.4 hold, but MLR.5 does not.
Let x denote $x_1, x_2, \dots, x_k$.
$Var(u_i \mid x_i) = \sigma_i^2$ (heteroskedasticity)
Let $\sigma_i^2 = \sigma^2 h(x_i)$, so that $Var(u \mid x) = \sigma^2 h(x)$, with $h(x) > 0$ (since Var > 0).
Heteroskedasticity can be corrected under two cases:
1) $h(x)$ is known up to a multiplicative constant.
2) $h(x)$ has to be estimated: feasible GLS.

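Case 1: h(x) is known up to a multiplicative constant

Consider the model
$y_i = \beta_0 + \beta_1 x_i + u_i$ (OLS)
$Var(u_i \mid x_i) = \sigma^2 h_i$ (heteroskedastic)
Let y = sav; x = inc; $h_i = x_i$.
Trick: Divide the equation by $\sqrt{h_i}$:
$\dfrac{y_i}{\sqrt{h_i}} = \beta_0 \dfrac{1}{\sqrt{h_i}} + \beta_1 \dfrac{x_i}{\sqrt{h_i}} + \dfrac{u_i}{\sqrt{h_i}}$   (IV.1)
Show that the errors in (IV.1) are homoskedastic!
GLS is an efficient procedure.

Weighted Least Squares
Weighted least squares obtains the values of $b_j^*$ that make the weighted SSR as small as possible:
$\sum_{i=1}^{n} (y_i - b_0^* - b_1^* x_{i1} - \dots - b_k^* x_{ik})^2 / h_i$
where each squared residual is weighted by $1/h_i$.
Bring $1/h_i$ inside the squared residual:
$\sum_{i=1}^{n} \left( \dfrac{y_i}{\sqrt{h_i}} - b_0^* \dfrac{1}{\sqrt{h_i}} - b_1^* \dfrac{x_{i1}}{\sqrt{h_i}} - \dots - b_k^* \dfrac{x_{ik}}{\sqrt{h_i}} \right)^2$

Example: Saving equation
OLS: $sav_i = \beta_0 + \beta_1 inc_i + u_i$
$Var(u_i \mid inc_i) = \sigma^2 inc_i$ is not constant. $\hat{\beta}_0$ and $\hat{\beta}_1$ are not BLUE.
GLS: transformed equation
$sav_i / inc_i^{1/2} = \beta_0 / inc_i^{1/2} + \beta_1 inc_i / inc_i^{1/2} + u_i / inc_i^{1/2}$
$Var[u_i / inc_i^{1/2} \mid inc_i] = \sigma^2$ is constant. $\hat{\beta}_0^*$ and $\hat{\beta}_1^*$ are BLUE.
The GLS estimators obtained after correcting the error for heteroskedasticity are called weighted least squares (WLS) estimators.

OLS & WLS compared

                 OLS                  WLS
MLR.1-MLR.4      yes                  yes
MLR.5            heteroskedasticity   homoskedasticity
error variance   not constant         constant
BLUE             no                   yes
t and F dist.    invalid              valid
R-squared        meaningful           not meaningful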
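The homoskedasticity of the transformed error in (IV.1) follows in one line. As a short worked derivation, using only $E(u_i \mid x_i) = 0$ and $Var(u_i \mid x_i) = \sigma^2 h_i$ from the slide and treating $h_i$ as a known function of $x_i$:

$$
Var\!\left(\frac{u_i}{\sqrt{h_i}} \,\Big|\, x_i\right)
= E\!\left(\frac{u_i^2}{h_i} \,\Big|\, x_i\right)
= \frac{E(u_i^2 \mid x_i)}{h_i}
= \frac{\sigma^2 h_i}{h_i}
= \sigma^2 .
$$

In the saving example $h_i = inc_i$, so dividing every term by $\sqrt{inc_i}$ leaves an error with constant variance $\sigma^2$.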

OLS and WLS Results: saving equation
OLS:  sav = 124.8 + 0.147 inc
      prob {.8493} {.0124}
      n = 100, $R^2$ = .0621, $\bar{R}^2$ = .0526
WLS:  sav/inc^.5 = −125.0 [1/inc^.5] + 0.172 inc^.5
      prob {.7955} {.0032}
      n = 100, $R^2$ = .0225, $\bar{R}^2$ = .0125

EViews Trick: WLS
Step 1: Obtain the original output.
Step 2: In the [Equation: ...] window, choose Estimate and then Options. Click Weighted LS/TSLS. Type in the weight, 1/inc^.5 or 1/sqr(inc).

Step 1: OLS. Regress sav on inc (with intercept)

Dependent Variable: SAV
Method: Least Squares
Sample: 1 100
Included observations: 100

Variable   Coefficient   Std. Error   t-Statistic   Prob.
C          124.8424      655.3931     0.190485      0.8493
INC          0.146628      0.057549   2.547897      0.0124

R-squared            0.062127    Mean dependent var      1582.51
Adjusted R-squared   0.052557    S.D. dependent var      3284.902
S.E. of regression   3197.415    Akaike info criterion   18.99787
Sum squared resid    1.00E+09    Schwarz criterion       19.04997
Log likelihood      -947.894     F-statistic             6.491778
Durbin-Watson stat   1.536387    Prob(F-statistic)       0.012391

WLS: Regress sav on inc with weighting series 1/INC^.5

Dependent Variable: SAV
Included observations: 100
Weighting series: 1/INC^.5

Variable   Coefficient   Std. Error   t-Statistic   Prob.
C          -124.95       480.8606     -0.2599       0.7955
INC           0.171756     0.056813    3.023184     0.0032

Weighted Statistics
R-squared            0.022487    Mean dependent var   1364.931
Adjusted R-squared   0.012512    Sum squared resid    6.93E+08
F-statistic          9.13964     Durbin-Watson stat   1.567035
Unweighted Statistics
R-squared            0.060303    Mean dependent var   1582.51
Adjusted R-squared   0.050714    S.D. dependent var   3284.902
Sum squared resid    1.00E+09

Step 2: Compare to the regression of
$sav_i / inc_i^{1/2} = \beta_0 / inc_i^{1/2} + \beta_1 inc_i / inc_i^{1/2} + u_i / inc_i^{1/2}$
that is, regress sav/(inc^.5) on 1/inc^.5 and inc^.5 (with no intercept)

Dependent Variable: SAV/(INC^.5)
Method: Least Squares
Sample: 1 100
Included observations: 100

Variable     Coefficient   Std. Error   t-Statistic   Prob.
1/(INC^.5)   -124.95       480.8606     -0.2599       0.7955
INC^.5          0.171756     0.056813    3.023184     0.0032

R-squared            0.022487    S.D. dependent var   29.89943
Adjusted R-squared   0.012512    Sum squared resid    86513.48
S.E. of regression   29.71179    Durbin-Watson stat   1.567035
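The two EViews routes above (the Weighted LS option and OLS on the transformed variables) give the same coefficients. A minimal Python sketch of that equivalence follows, on simulated data rather than the saving data set used in the slides. The variable names, the simulated values and the seed are assumptions for illustration only.

```python
import numpy as np

# Hypothetical simulated saving data with Var(u | inc) = sigma^2 * inc.
rng = np.random.default_rng(3)
n = 100
inc = rng.uniform(1.0, 20.0, size=n)
u = rng.normal(size=n) * np.sqrt(inc)
sav = 0.5 + 0.15 * inc + u

X = np.column_stack([np.ones(n), inc])
h = inc                                   # known h(x) = inc

# Route (a): divide every variable, including the constant, by sqrt(h)
# and run OLS with no additional intercept.
w = 1.0 / np.sqrt(h)
X_star = X * w[:, None]                   # columns: 1/sqrt(inc), sqrt(inc)
y_star = sav * w
b_transformed, *_ = np.linalg.lstsq(X_star, y_star, rcond=None)

# Route (b): minimise the weighted SSR, sum of (y - Xb)^2 / h,
# by solving the weighted normal equations directly.
W = 1.0 / h
b_weighted = np.linalg.solve(X.T @ (X * W[:, None]), X.T @ (W * sav))

print("transformed OLS:", b_transformed)
print("weighted LS:    ", b_weighted)
```

The agreement is exact up to floating-point error because bringing $1/h_i$ inside the squared residual does not change the objective function.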

WLS: In practice, we rarely know h(x).
Consider the model:
$sav = \beta_0 + \beta_1 inc + \beta_2 size + \beta_3 educ + \beta_4 age + \beta_5 black + u$
Does the variance depend on age or education?

Individual level data vs. Averages of data
There is a case where the weights needed arise naturally.
Example: the effect of earnings on the contribution to a Private Provident Fund.
Individual data: $contrib_{i,e} = \beta_0 + \beta_1 earn_{i,e} + u_{i,e}$
Assume MLR.1-MLR.4 and $Var(u_{i,e}) = \sigma^2$.
i: denotes a particular firm
e: an employee within the firm
$contrib_{i,e}$: annual contribution
$earn_{i,e}$: annual earnings

Average data at the firm level:
$\overline{contrib}_i = \beta_0 + \beta_1 \overline{earn}_i + \bar{u}_i$
$Var(\bar{u}_i) = \sigma^2 / m_i$ (heteroskedastic), where $m_i$ is the number of employees at firm i and the errors are uncorrelated across employees
$h_i = 1/m_i$
weight = $\sqrt{m_i} = m_i^{1/2}$ (in the transformed equation and in EViews)
Show that the error in the transformed equation is homoskedastic!

WLS: to use with per capita data
A similar weighting arises when we use data at the city, province, or country level.
In summary, WLS gives us an efficient way to treat averages of data.

Case 2: h(x) must be estimated: Feasible GLS Estimator
Suppose the heteroskedasticity function is unknown, i.e.,
$Var(u \mid x_1, \dots, x_k) = \sigma^2 h(x)$
where $h(x)$ is unknown and must be estimated; i.e., find $\hat{h}_i$.
Assume that
$Var(u \mid x) = \sigma^2 \exp(\delta_0 + \delta_1 x_1 + \dots + \delta_k x_k)$
$u^2 = \sigma^2 \exp(\delta_0 + \delta_1 x_1 + \dots + \delta_k x_k)\, v$
$\log(u^2) = \alpha_0 + \delta_1 x_1 + \dots + \delta_k x_k + e$
Estimate
$\log(\hat{u}^2) = \alpha_0 + \delta_1 x_1 + \dots + \delta_k x_k + e$
$\hat{g}_i$ = fitted value of $\log(\hat{u}^2)$
The estimates of $h_i$ are simply $\hat{h}_i = \exp(\hat{g}_i)$.
Using $\hat{h}_i$ in the GLS transformation yields an estimator called the FGLS estimator.
FGLS is no longer unbiased, but it is consistent in large samples.
FGLS is no longer BLUE, but it is asymptotically more efficient than OLS.

Properties: OLS, GLS and FGLS

                 OLS   GLS (known h)                  FGLS
Unbiasedness     yes   yes                            no longer unbiased but consistent
BLUE             no    yes                            no longer BLUE but asymptotically more efficient
t and F dist.    no    exact t and F distributions    approximately t and F distributions

Example: FGLS and cigarette equation
$cigs = \beta_0 + \beta_1 \log(income) + \beta_2 \log(cigpric) + \beta_3 educ + \beta_4 age + \beta_5 age^2 + \beta_6 restaurn + u$
$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3 + \beta_4 x_4 + \beta_5 x_5 + \beta_6 x_6 + u$
Using the Breusch Pagan Test or the White Tests, we found that the variances are nonconstant.

Steps in running the FGLS equation
Step 1: Run the regression of y on $x_1, x_2, \dots, x_6$ and obtain the residuals $\hat{u}$ (called resid01 in EViews).
Step 2: Run the regression of $\log(\hat{u}^2)$, or log(resid01^2), on $x_1, \dots, x_k$ and obtain the residuals (called lresid02 in EViews).
Step 3: From step 2, obtain the fitted values of $\log(\hat{u}^2)$ (called $\hat{g}$):
$\hat{g}$ = log(resid01^2) − lresid02
Since $\hat{h} = \exp(\hat{g})$, in EViews
h01 = exp(log(resid01^2) − lresid02)
Step 4: Run the FGLS equation, weighting each squared residual by $1/\hat{h}$ (in EViews the weighting series is 1/sqr(h01)).

EViews: Step 1: Estimate the cigarette demand equation (Dependent Variable: CIGS, 807 observations; same OLS output as in Step 1 of the Breusch Pagan test).
Proc/Make Residual Series. Then, name for the residual series: resid01

Step 2: Run the regression of $\log(\hat{u}^2)$, or log(resid01^2), on $x_1, \dots, x_k$

Dependent Variable: LOG(RESID01^2)
Included observations: 807

Variable        Coefficient   Std. Error   t-Statistic   Prob.
C               -1.9207       2.563033     -0.7494       0.4538
LOG(INCOME)      0.2915       0.077468      3.763351     0.0002
LOG(CIGPRIC)     0.195418     0.614539      0.31799      0.7506
EDUC            -0.079702     0.017784     -4.481657     0.0000
AGE              0.204005     0.017044     11.9693       0.0000
AGE^2           -0.002392     0.000186    -12.8931       0.0000
RESTAURN        -0.627012     0.118344     -5.298213     0.0000

F-statistic 43.82    Prob(F-statistic) 0.000000

Proc/Make Residual Series. Then, name for the residual series: lresid02
Step 3: Generating Series: h01 = exp(log(resid01^2) − lresid02)
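The four FGLS steps can be sketched in Python as follows. This is an illustrative sketch on simulated data under the exponential variance assumption stated earlier. It is not the EViews session from the slides, and the names `ols`, `g_hat` and `h_hat`, the simulated regressors and the seed are assumptions made for the example.

```python
import numpy as np

def ols(X, y):
    """OLS of y on X (X must contain a constant column).
    Returns coefficients and residuals."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta, y - X @ beta

# Hypothetical simulated data with Var(u | x) = exp(0.2 + 0.3 * x1).
rng = np.random.default_rng(4)
n = 800
x1 = rng.uniform(0.0, 5.0, size=n)
x2 = rng.normal(size=n)
u = rng.normal(size=n) * np.sqrt(np.exp(0.2 + 0.3 * x1))
y = 1.0 + 0.5 * x1 - 0.4 * x2 + u
X = np.column_stack([np.ones(n), x1, x2])

# Step 1: OLS of y on the x's; keep the residuals (resid01).
beta_ols, resid = ols(X, y)

# Step 2: regress log(resid^2) on the same x's; keep its residuals (lresid02).
log_u2 = np.log(resid ** 2)
gamma, lresid = ols(X, log_u2)

# Step 3: fitted values g-hat = log(resid^2) - lresid, then h-hat = exp(g-hat).
g_hat = log_u2 - lresid
h_hat = np.exp(g_hat)

# Step 4: WLS with weighting series 1/sqrt(h-hat), i.e. OLS on transformed data.
w = 1.0 / np.sqrt(h_hat)
beta_fgls, _ = ols(X * w[:, None], y * w)

print("OLS: ", beta_ols)
print("FGLS:", beta_fgls)
```

Because $\hat{h}_i$ is an exponential of a fitted value, it is always positive, which is the practical reason for modelling $\log(u^2)$ rather than $u^2$ itself.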

Step 4: FGLS equation with weights, 1/sqr(h01)
EViews Trick: WLS. In the Equation window, choose Estimate and then Options. Click Weighted LS/TSLS. Type in the weight, 1/sqr(h01).

Dependent Variable: CIGS
Included observations: 807
Weighting series: 1/SQR(H01)

Variable        Coefficient   Std. Error   t-Statistic   Prob.
C                5.635471     17.80314      0.316544     0.7517
LOG(INCOME)      1.295239      0.437012     2.963855     0.0031
LOG(CIGPRIC)    -2.940314      4.460145    -0.659242     0.5099
EDUC            -0.463446      0.120159    -3.856953     0.0001
AGE              0.481948      0.096808     4.978378     0.0000
AGE^2           -0.0056        0.000939    -5.989706     0.0000
RESTAURN        -3.461064      0.795505    -4.350776     0.0000

Weighted Statistics
R-squared     0.002751    F-statistic   17.05549    Prob(F-statistic)   0.000000
Unweighted Statistics
R-squared     0.045739    Mean dependent var   8.686493

Alternatively, Step 4: Regress cigs/sqr(h01) on 1/sqr(h01), log(income)/sqr(h01), ..., restaurn/sqr(h01), with no intercept

Dependent Variable: CIGS/SQR(H01)
Included observations: 807
Regressors: 1/SQR(H01), LOG(INCOME)/SQR(H01), LOG(CIGPRIC)/SQR(H01), EDUC/SQR(H01), AGE/SQR(H01), AGE^2/SQR(H01), RESTAURN/SQR(H01)
The coefficients, standard errors, t-statistics and p-values are identical, row by row, to those of the weighted regression above.

R-squared            0.002751    S.D. dependent var      1.574979
S.E. of regression   1.578698    Akaike info criterion   3.759715
Sum squared resid    1993.831

OLS & FGLS Results Compared
Dep. Var = cigs

Variable        OLS Coefficient   FGLS Coefficient   OLS Prob.   FGLS Prob.
C               -3.6398            5.635471          0.8799      0.7517
LOG(INCOME)      0.8803            1.295239          0.2268      0.0031
LOG(CIGPRIC)    -0.7509           -2.940314          0.8966      0.5099
EDUC            -0.5015           -0.463446          0.0028      0.0001
AGE              0.77069           0.481948          0           0
AGE^2           -0.0090           -0.0056            0           0
RESTAURN        -2.8251           -3.461064          0.0112      0

Interpretation:
1. The income effect is now statistically significant and larger in magnitude.
2. The price effect is still statistically insignificant.
3. Cigarette smoking is negatively related to schooling.
4. Age has a diminishing marginal effect on smoking. Smoking increases with age up until 42.8 years old and then decreases with age.
5. Cigarette smoking is negatively affected by restaurant smoking restrictions.

Recap of Heteroskedasticity
Consequences of Heteroskedasticity
Testing for Heteroskedasticity
Heteroskedasticity Robust Inference
Weighted Least Squares Estimation