Econ107 Applied Econometrics Topic 9: Heteroskedasticity (Studenmund, Chapter 10)


I. Definition and Problems

We now relax another classical assumption. This is a problem that arises often with cross sections of individuals, households or firms. It can be a problem with time-series data, too.

Homoskedasticity exists when the variance of the disturbances is constant:

Var(ε_i) = E(ε_i²) = σ²

This is the assumption of equal (homo) spread (skedasticity) in the distribution of the disturbances for all observations. The variance is a constant, independent of anything else, including the values of the independent variables.

Heteroskedasticity exists when the variance of the disturbances is variable:

Var(ε_i) = E(ε_i²) = σ_i²

The variance of the disturbances can take on a different value for each observation in the sample. This is the most general specification. More often, σ_i² may be related to one or more of the independent variables. Heteroskedasticity violates one of the basic classical assumptions.

Example: Suppose we estimate a cross-sectional savings function. The variance of the disturbances may increase with disposable income, due to increased 'discretionary income': at low incomes, most income has to be devoted to basic necessities. The distributions flatten out as DI rises. Take three distinct income classes (low, medium and high); the two graphs are not reproduced in this transcription. Other examples: the variance might be related to the size of some economic aggregate (e.g., corporation, metropolitan area, state or region).
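The definition can be made concrete with a short simulation in the spirit of the savings example. Everything here is hypothetical (the three income levels and the assumption that the disturbance standard deviation is 5% of income); it only illustrates Var(ε_i) differing across observations.

```python
import random

random.seed(0)

# Hypothetical savings-style disturbances: sd rises with disposable income,
# so Var(eps_i) = (0.05 * income_i)^2 differs across income classes.
incomes = {"low": 1000.0, "medium": 5000.0, "high": 20000.0}

spread = {}
for group, inc in incomes.items():
    draws = [random.gauss(0.0, 0.05 * inc) for _ in range(5000)]
    mean = sum(draws) / len(draws)
    spread[group] = sum((d - mean) ** 2 for d in draws) / len(draws)

# Sample sd of the disturbances by income class: clearly not constant.
for group in ("low", "medium", "high"):
    print(group, spread[group] ** 0.5)
```

By construction the spread climbs with income, which is exactly the 'flattening out' described above.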

What happens if we use OLS in a regression with known heteroskedasticity?

(1) The estimated coefficients are still unbiased. Homoskedasticity is not a necessary condition for unbiasedness. This is the same result as under multicollinearity.

(2) But the OLS estimators are inefficient. This means that they are no longer
BLUE, as in the case of serial correlation. (This differs from multicollinearity.)

Return to the two-variable regression:

Y_i = β_0 + β_1 X_i + ε_i

With homoskedasticity:

Var(β̂_1) = σ² / Σ x_i²   where x_i = X_i − X̄

With heteroskedasticity:

Var(β̂_1) = Σ x_i² σ_i² / (Σ x_i²)²

Of course, if σ_i² = σ² for all i, we can simplify:

Var(β̂_1) = σ² Σ x_i² / (Σ x_i²)² = σ² / Σ x_i²

The earlier formula for calculating the standard error depends on the assumption of homoskedasticity; it is a special case of the more general formula. The OLS estimators will be inefficient, i.e. no longer minimum variance. The formulae for the OLS estimators of the coefficients themselves are still the same under heteroskedasticity.

Intuition: We want the 'best fit' possible for our regression line; our sample regression function should lie as close as possible to the population regression function. OLS 'equally weights' each observation: it assumes each observation contributes the same amount of 'information' to the estimation. With heterogeneity that is no longer appropriate. Observations should not be equally weighted. Those associated with a tighter distribution of ε_i contribute more information about this economic behaviour, while those associated with a wider distribution of ε_i contribute less. A priori, observations from wider distributions have more potential error.

By disregarding heteroskedasticity and using the OLS formula, we would produce biased estimates of the standard errors. In general, we won't know the direction of the bias. As a result, statistical inference would be inappropriate.
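A small Monte Carlo sketch of the result above, under an assumed data-generating process (intercept 2, slope 0.5, disturbance sd proportional to X): the OLS slope stays centred on the true value even though the disturbances are heteroskedastic.

```python
import random

random.seed(0)

def ols_slope(x, y):
    """Bivariate OLS slope: sum of x-dev * y-dev over sum of squared x-dev."""
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    num = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    den = sum((xi - xbar) ** 2 for xi in x)
    return num / den

# Hypothetical DGP: Y = 2 + 0.5 X + eps, with sd(eps_i) = 0.3 X_i
beta1 = 0.5
x = [float(i) for i in range(1, 51)]

slopes = []
for _ in range(2000):
    y = [2.0 + beta1 * xi + random.gauss(0.0, 0.3 * xi) for xi in x]
    slopes.append(ols_slope(x, y))

# Unbiasedness survives: the estimates centre on the true slope, even though
# the classical standard-error formula for them is no longer valid.
mean_slope = sum(slopes) / len(slopes)
print(mean_slope)
```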

Weighted Least Squares (WLS)

WLS essentially takes advantage of this heterogeneity in its estimators. Assume σ_i is known. Transform the data by dividing by σ_i:

Y_i/σ_i = β_0 (1/σ_i) + β_1 (X_i/σ_i) + ε_i/σ_i

or, using asterisks to denote the transformed variables:

Y_i* = β_0 W_i + β_1 X_i* + ε_i*

where W_i = 1/σ_i is the 'weight' given to the i-th observation, just the inverse of the standard deviation. There is no longer a constant term in the regression. OLS estimation on the transformed model is WLS on the original model (denote the estimators by β̂_0* and β̂_1*). But why did we do this? Because

Var(ε_i*) = E(ε_i*²) = E[(ε_i/σ_i)²] = σ_i²/σ_i² = 1

This means that the OLS estimators from the transformed data are BLUE, because the disturbances are now homoskedastic: the variance is not only constant, it equals 1. The transformed model meets all the classical assumptions, including homoskedasticity, so β̂_0* and β̂_1* will be BLUE. Recall that the untransformed OLS estimators β̂_0 and β̂_1 are unbiased, but not efficient.

Another way to motivate the distinction between OLS and WLS is to look at the 'objective functions' of the estimation. Under OLS we minimise the residual sum of squares:

Σ e_i² = Σ (Y_i − β̂_0 − β̂_1 X_i)²

But under WLS we minimise a weighted residual sum of squares:

Σ (e_i/σ_i)²
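A minimal sketch of WLS as 'OLS on transformed data', assuming σ_i is known up to the hypothetical form σ_i = 0.5√X_i. The transformed regression has no constant term, so it is run through the origin by solving its two normal equations directly.

```python
import random

random.seed(1)

# Hypothetical model: Y = 2 + 0.5 X + eps, with sd(eps_i) = 0.5 * sqrt(X_i)
beta0, beta1 = 2.0, 0.5
x = [float(i) for i in range(1, 101)]
sigma = [0.5 * xi ** 0.5 for xi in x]      # assumed-known sd per observation
y = [beta0 + beta1 * xi + random.gauss(0.0, s) for xi, s in zip(x, sigma)]

# Transform: divide every term by sigma_i; the constant becomes W_i = 1/sigma_i.
w  = [1.0 / s for s in sigma]
xs = [xi / s for xi, s in zip(x, sigma)]
ys = [yi / s for yi, s in zip(y, sigma)]

# OLS through the origin on (w, xs): solve the 2x2 normal equations.
a11 = sum(wi * wi for wi in w)
a12 = sum(wi * xi for wi, xi in zip(w, xs))
a22 = sum(xi * xi for xi in xs)
b1v = sum(wi * yi for wi, yi in zip(w, ys))
b2v = sum(xi * yi for xi, yi in zip(xs, ys))
det = a11 * a22 - a12 * a12
b0_wls = (b1v * a22 - b2v * a12) / det
b1_wls = (a11 * b2v - a12 * b1v) / det

print(b0_wls, b1_wls)
```

The estimates land close to the true (2, 0.5), and because the transformed disturbances have unit variance, these estimators are BLUE.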

A couple of things to note:

1. The formulae for the WLS estimates of β_0 and β_1 aren't worth committing to memory, so don't write them down. The key is that they look similar to the formulae under homoskedasticity, except for the weighting factor. This is true of both two-variable and k-variable models. And with software packages you don't have to know these formulae: just transform the data and run the regression through the origin.

2. If σ_i² = σ² for all observations, then the WLS estimators are the OLS estimators. OLS is a special case of this more general procedure.

II. Detection

How do we know when our disturbances are heteroskedastic? The key difficulty is that we never observe the true disturbances or the distributions from which they are drawn. In other words, we never observe σ_i² (at least not unless we see the entire population). For example, take our original example of the savings function. If we had the entire census of 4 million Singaporeans we'd be able to calculate it: we'd know how the dispersion in the disturbances varies with disposable income. But in samples we have to make an educated guess. We consider four diagnostic tests or indicators.

1. A Priori Information

Heteroskedasticity might be 'anticipated' (e.g., on the basis of past empirical work). Check the relevant literature in this area; it might show clear and persistent evidence of heteroskedasticity. For example, check both domestic and overseas studies of savings regressions: is it a commonly reported problem in this empirical work? The key is that you see it coming. The remainder of the tests are post-mortems.

2. Graphical Methods

Key: We'd like information on ε_i², but all we ever see are the squared residuals e_i². We want to know whether or not these squared residuals exhibit any 'systematic pattern'. With homoskedasticity, a plot of e_i² against an explanatory variable shows no systematic pattern (the figures are not reproduced in this transcription).
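Since the figures are missing, here is a crude numeric stand-in for the diagnostic plot, on simulated data whose disturbance spread widens in X (all numbers hypothetical): compare the average squared residual over the low-X and high-X halves of the sample.

```python
import random

random.seed(5)

# Hypothetical data with a disturbance spread that widens in x
x = [float(i) for i in range(1, 101)]
y = [1.0 + 2.0 * xi + random.gauss(0.0, 0.1 * xi) for xi in x]

# Fit bivariate OLS and keep the squared residuals
n = len(x)
xbar, ybar = sum(x) / n, sum(y) / n
sxx = sum((xi - xbar) ** 2 for xi in x)
b1 = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / sxx
b0 = ybar - b1 * xbar
e2 = [(yi - b0 - b1 * xi) ** 2 for xi, yi in zip(x, y)]

# Numeric stand-in for the plot: average e^2 over the low-x vs high-x halves.
# A widening spread shows up as the high-x average dwarfing the low-x one.
low  = sum(e2[: n // 2]) / (n // 2)
high = sum(e2[n // 2 :]) / (n - n // 2)
print(low, high)
```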

Even if we get this pattern of no relationship between e_i² and the explanatory variable, we can't rule out the possibility of heteroskedasticity. We may have to plot the squared residuals against other explanatory variables. The same could be done for the squared residuals and the fitted values, using a two-step procedure: run OLS and retain the squared residuals, then plot them against the fitted values. Alternatively, with heteroskedasticity, the plots show systematic patterns, e.g. a spread that widens with the explanatory variable (figures not reproduced).

3. Park Test

The Park test is just a formalisation of the plotting of the squared residuals against another variable (often one of the explanatory variables).

1. Run OLS on your regression. Retain the squared residuals. Assume that:

σ_i² = σ² Z_i^β

This implies a 'log-log' linear relationship between the squared residuals and Z.

2. Estimate the following:

ln e_i² = ln σ² + β ln Z_i + u_i

Test H_0: β = 0. If the null is rejected, this suggests that heteroskedasticity is present. You need to choose which variable Z might be related to the squared residuals (often an independent variable is used). If β > 0, there is an upward-sloping curved relationship; if β < 0, a downward-sloping one.

One problem is that rejection of H_0 is a sufficient, but not a necessary, condition for heteroskedasticity. Another problem is that this test imposes an assumed relationship between a particular variable and the squared residuals.

4. White Test

This gets around the problem of the Park test that its own disturbances u_i are likely to be heteroskedastic. Use a three-step procedure:

1. Run OLS on your regression (assume X_2 and X_3 are the two independent variables). Retain the squared residuals.

2. Estimate the following auxiliary regression:

e_i² = α_0 + α_1 X_2i + α_2 X_3i + α_3 X_2i² + α_4 X_3i² + α_5 X_2i X_3i + u_i

3. Test the overall significance of the auxiliary regression. To do this, use nR².
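Both tests can be sketched end-to-end on simulated data. The data-generating process (two uniform regressors, disturbance sd proportional to X_2) is hypothetical, and the regressions are solved by hand with a small Gauss-Jordan routine rather than an econometrics package.

```python
import math
import random

random.seed(3)

def solve(A, b):
    """Solve A x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [bv] for row, bv in zip(A, b)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(n):
            if r != c and M[r][c] != 0.0:
                f = M[r][c] / M[c][c]
                M[r] = [v - f * u for v, u in zip(M[r], M[c])]
    return [M[i][n] / M[i][i] for i in range(n)]

def ols(X, y):
    """Multiple regression: coefficients and fitted values (X has a constant)."""
    k = len(X[0])
    XtX = [[sum(r[i] * r[j] for r in X) for j in range(k)] for i in range(k)]
    Xty = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(k)]
    beta = solve(XtX, Xty)
    return beta, [sum(b * v for b, v in zip(beta, row)) for row in X]

# Hypothetical model: y = 1 + 0.5 x2 + 0.3 x3 + eps, with sd(eps_i) = 0.5 x2_i
n = 300
x2 = [random.uniform(1.0, 10.0) for _ in range(n)]
x3 = [random.uniform(1.0, 10.0) for _ in range(n)]
y = [1.0 + 0.5 * a + 0.3 * b + random.gauss(0.0, 0.5 * a)
     for a, b in zip(x2, x3)]

# Step 1 (common to both tests): OLS, keep the squared residuals.
_, fit = ols([[1.0, a, b] for a, b in zip(x2, x3)], y)
e2 = [(yi - fi) ** 2 for yi, fi in zip(y, fit)]

# Park test: regress ln(e^2) on ln(Z) with Z = x2, then t-test the slope.
ln_e2 = [math.log(v) for v in e2]
ln_z = [math.log(a) for a in x2]
(_, b_park), fit_p = ols([[1.0, z] for z in ln_z], ln_e2)
s2 = sum((u - f) ** 2 for u, f in zip(ln_e2, fit_p)) / (n - 2)
zbar = sum(ln_z) / n
t_park = b_park / math.sqrt(s2 / sum((z - zbar) ** 2 for z in ln_z))

# White test: auxiliary regression on levels, squares and the cross product.
Z = [[1.0, a, b, a * a, b * b, a * b] for a, b in zip(x2, x3)]
_, fit_w = ols(Z, e2)
ebar = sum(e2) / n
r2 = 1.0 - (sum((u - f) ** 2 for u, f in zip(e2, fit_w))
            / sum((u - ebar) ** 2 for u in e2))
nr2 = n * r2   # compare with chi-square(5); 5% critical value is about 11.07

print(t_park, nr2)
```

With this DGP both diagnostics should flag heteroskedasticity: the Park t-statistic lands well above 1.96, and nR² well above the 5% chi-square(5) critical value of roughly 11.07.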

Under the null of homoskedasticity, nR² follows the chi-square distribution with degrees of freedom equal to the number of slope coefficients in the auxiliary regression, where n is the sample size and R² is the coefficient of determination of the auxiliary regression.

III. Remedial Measures

What do you do when heteroskedasticity is suspected?

(1) When σ_i² is known, transform the data by dividing both the dependent and independent variables by σ_i and run OLS. This is the weighted least squares procedure. It is not a very interesting situation, because this information is rarely available.

(2) When σ_i² is unknown, determine the likely form of the heteroskedasticity, transform the data accordingly, and run weighted least squares.

Two examples. Suppose we have the following regression for a cross section of cities:

CR_i = β_0 + β_1 EXP_i + β_2 N_i + ε_i

where:
CR = per capita crime rate
EXP = per capita expenditures on police
N = population

The first slope coefficient picks up the effectiveness of police expenditures at the margin (expected negative). The second says that crime might increase with the size of the metropolitan population (expected positive).

(1) Suppose we suspect that:

Var(ε_i) = E(ε_i²) = σ² N_i²

where σ² is a constant. Transform the data and estimate the following:

CR_i/N_i = β_0 (1/N_i) + β_1 (EXP_i/N_i) + β_2 + u_i

The residuals are now homoskedastic. (The proof is left to you as an exercise.) This doesn't change the interpretation of the coefficients: we are dividing both sides by the same variable.

(2) Suppose we now suspect that:

Var(ε_i) = E(ε_i²) = σ² CR̂_i²

where CR̂_i is the fitted value. Transform the data and estimate the following:

CR_i/CR̂_i = β_0 (1/CR̂_i) + β_1 (EXP_i/CR̂_i) + β_2 (N_i/CR̂_i) + u_i

The residuals are now homoskedastic. Operationally, this second example requires two steps:

(1) Run OLS on the original equation with the untransformed data (recall that the estimated coefficients are still unbiased, although they are inefficient).

(2) Transform the data by dividing the dependent and independent variables by these fitted values, and estimate as above.

(3) Use heteroskedasticity-corrected (HC, or White) standard errors. Heteroskedasticity does not bias the OLS coefficient estimates, but it does affect their standard errors. The HC technique directly adjusts the standard errors of the OLS estimates to take account of heteroskedasticity.
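For the two-variable model, the White (HC0) correction replaces the pooled s² in Var(β̂_1) = s²/Σx_i² with each observation's own squared residual, mirroring the general formula Var(β̂_1) = Σ x_i² σ_i² / (Σ x_i²)². A sketch on hypothetical data:

```python
import math
import random

random.seed(4)

# Hypothetical data: disturbance sd proportional to x, so the classical
# standard-error formula (which assumes one common sigma^2) is unreliable.
x = [float(i) for i in range(1, 201)]
y = [2.0 + 0.5 * xi + random.gauss(0.0, 0.2 * xi) for xi in x]

n = len(x)
xbar, ybar = sum(x) / n, sum(y) / n
xd = [xi - xbar for xi in x]
sxx = sum(d * d for d in xd)
b1 = sum(d * (yi - ybar) for d, yi in zip(xd, y)) / sxx
b0 = ybar - b1 * xbar
e = [yi - b0 - b1 * xi for xi, yi in zip(x, y)]

# Classical se: one pooled s^2 for every observation.
s2 = sum(ei * ei for ei in e) / (n - 2)
se_classical = math.sqrt(s2 / sxx)

# White HC0 se: each observation keeps its own e_i^2 inside the sum.
se_hc0 = math.sqrt(sum(d * d * ei * ei for d, ei in zip(xd, e)) / sxx ** 2)

print(se_classical, se_hc0)
```

The coefficient estimates are untouched; only the standard error changes, and the HC0 version consistently estimates the true sampling variability under heteroskedasticity.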

IV. Questions for Discussion: Q.3

V. Computing Exercise: Johnson, Ch., -5