
Chapter 11 Heteroskedasticity

11.1 The Nature of Heteroskedasticity

In Chapter 3 we introduced the linear model

$$y = \beta_1 + \beta_2 x \tag{11.1.1}$$

to explain household expenditure on food ($y$) as a function of household income ($x$). In this function $\beta_1$ and $\beta_2$ are unknown parameters that convey information about the expenditure function. The response parameter $\beta_2$ describes how household food expenditure changes when household income increases by one unit. The intercept parameter $\beta_1$ measures expenditure on food for a zero income level. Knowledge of these parameters aids planning by institutions such as government agencies or food retail chains.

We begin this section by asking whether a function such as $y = \beta_1 + \beta_2 x$ is better at explaining expenditure on food for low-income households than it is for high-income households. Low-income households do not have the option of extravagant food tastes; comparatively, they have few choices, and are almost forced to spend a particular portion of their income on food. High-income households, on the other hand, could have simple food tastes or extravagant food tastes. They might dine on caviar or spaghetti, while their low-income counterparts have to take the spaghetti. Thus, income is less important as an explanatory variable for food expenditure of high-income families. It is harder to guess their food expenditure. This type of effect can be captured by a statistical model that exhibits heteroskedasticity.

To discover how, and what we mean by heteroskedasticity, let us return to the statistical model for the food expenditure-income relationship that we analysed in Chapters 3 through 6. Given $T = 40$ cross-sectional household observations on food expenditure and income, the statistical model specified in Chapter 3 was given by

$$y_t = \beta_1 + \beta_2 x_t + e_t \tag{11.1.2}$$

where $y_t$ represents weekly food expenditure for the $t$-th household, $x_t$ represents weekly household income for the $t$-th household, and $\beta_1$ and $\beta_2$ are unknown parameters to estimate. Specifically, we assumed the $e_t$ were uncorrelated random error terms with mean zero and constant variance $\sigma^2$. That is,

$$E(e_t) = 0, \qquad \operatorname{var}(e_t) = \sigma^2, \qquad \operatorname{cov}(e_i, e_j) = 0 \tag{11.1.3}$$

Using the least squares procedure and the data in Table 3.1 we found estimates $b_1 = 40.768$ and $b_2 = 0.1283$ for the unknown parameters $\beta_1$ and $\beta_2$. Including the standard errors for $b_1$ and $b_2$ (in parentheses), the estimated mean function was

$$\hat{y}_t = \underset{(22.139)}{40.768} + \underset{(0.0305)}{0.1283}\,x_t \tag{11.1.4}$$

A graph of this estimated function, along with all the observed expenditure-income points $(y_t, x_t)$, appears in Figure 11.1. Notice that, as income ($x_t$) grows, the observed data points $(y_t, x_t)$ have a tendency to deviate more and more from the estimated mean function. The points are scattered further away from the line as $x_t$ gets larger.

Another way to describe this feature is to say that the least squares residuals, defined by

$$\hat{e}_t = y_t - b_1 - b_2 x_t \tag{11.1.5}$$

increase in absolute value as income grows. The observable least squares residuals ($\hat{e}_t$) are proxies for the unobservable errors ($e_t$) that are given by

$$e_t = y_t - \beta_1 - \beta_2 x_t \tag{11.1.6}$$

Thus, the information in Figure 11.1 suggests that the unobservable errors also increase in absolute value as income ($x_t$) increases. That is, the variation of food expenditure $y_t$ around mean food expenditure $E(y_t)$ increases as income $x_t$ increases. This observation is consistent with the hypothesis that we posed earlier, namely, that the mean food expenditure function is better at explaining food expenditure for low-income (spaghetti-eating) households than it is for high-income households who might be spaghetti eaters or caviar eaters.

Is this type of behavior consistent with the assumptions of our model? The parameter that controls the spread of $y_t$ around the mean function, and measures the uncertainty in the regression model, is the variance $\sigma^2$. If the scatter of $y_t$ around the mean function increases as $x_t$ increases, then the uncertainty about $y_t$ increases as $x_t$ increases, and we have evidence to suggest that the variance is not constant. Instead, we should be looking for a way to model a variance that increases as $x_t$ increases.

Thus, we are questioning the constant variance assumption, which we have written as

$$\operatorname{var}(y_t) = \operatorname{var}(e_t) = \sigma^2 \tag{11.1.7}$$

The most general way to relax this assumption is to add a subscript $t$ to $\sigma^2$, recognizing that the variance can be different for different observations. We then have

$$\operatorname{var}(y_t) = \operatorname{var}(e_t) = \sigma_t^2 \tag{11.1.8}$$

In this case, when the variances for all observations are not the same, we say that heteroskedasticity exists. Alternatively, we say the random variable $y_t$ and the random error $e_t$ are heteroskedastic. Conversely, if Equation (11.1.7) holds we say that homoskedasticity exists, and $y_t$ and $e_t$ are homoskedastic.

The heteroskedastic assumption is illustrated in Figure 11.2. At $x_1$, the probability density function $f(y_1 \mid x_1)$ is such that $y_1$ will be close to $E(y_1)$ with high probability. When we move to $x_2$, the probability density function $f(y_2 \mid x_2)$ is more spread out; we are less certain about where $y_2$ might fall. When homoskedasticity exists, the probability density function for the errors does not change as $x$ changes, as we illustrated in Figure 3.3.

The existence of different variances, or heteroskedasticity, is often encountered when using cross-sectional data. The term cross-sectional data refers to having data on a number of economic units such as firms or households, at a given point in time. The household data on income and food expenditure fall into this category. With time-series data, where we have data over time on one economic unit, such as a firm, a household, or even a whole economy, it is possible that the error variance will change. This would be true if there was an external shock or change in circumstances that created more or less uncertainty about $y$.

Given that we have a model that exhibits heteroskedasticity, we need to ask about the consequences for least squares estimation of the violation of one of our assumptions. Is there a better estimator that we can use? Also, how might we detect whether or not heteroskedasticity exists? It is to these questions that we now turn.

11.2 The Consequences of Heteroskedasticity for the Least Squares Estimator

If we have a linear regression model with heteroskedasticity and we use the least squares estimator to estimate the unknown coefficients, then:

1. The least squares estimator is still a linear and unbiased estimator, but it is no longer the best linear unbiased estimator (B.L.U.E.).
2. The standard errors usually computed for the least squares estimator are incorrect. Confidence intervals and hypothesis tests that use these standard errors may be misleading.

Now consider the following model

$$y_t = \beta_1 + \beta_2 x_t + e_t \tag{11.2.1}$$

where

$$E(e_t) = 0, \qquad \operatorname{var}(e_t) = \sigma_t^2, \qquad \operatorname{cov}(e_i, e_j) = 0 \quad (i \neq j)$$

Note the heteroskedastic assumption $\operatorname{var}(e_t) = \sigma_t^2$. In Chapter 4, Equation (4.2.1), we wrote the least squares estimator for $\beta_2$ as

$$b_2 = \beta_2 + \sum w_t e_t \tag{11.2.2}$$

where

$$w_t = \frac{x_t - \bar{x}}{\sum (x_t - \bar{x})^2}$$

This expression is a useful one for exploring the properties of least squares estimation under heteroskedasticity. The first property that we establish is that of unbiasedness. This property was derived under homoskedasticity in Equation (4.2.3) of Chapter 4. This proof still holds because the only error term assumption that it used, $E(e_t) = 0$, still holds. We reproduce it here for completeness.

$$E(b_2) = E(\beta_2) + E\left(\sum w_t e_t\right) = \beta_2 + \sum w_t E(e_t) = \beta_2 \tag{11.2.4}$$

The next result is that the least squares estimator is no longer best. That is, although it is still unbiased, it is no longer the best linear unbiased estimator. The way we tackle this question is to derive an alternative estimator which is the best linear unbiased estimator. This new estimator is considered in Sections 11.3 and 11.5.

To show that the usual formulas for the least squares standard errors are incorrect under heteroskedasticity, we return to the derivation of $\operatorname{var}(b_2)$ in Equation (4.2.11). From that equation, and using Equation (11.2.2), we have

$$\begin{aligned}
\operatorname{var}(b_2) &= \operatorname{var}(\beta_2) + \operatorname{var}\left(\sum w_t e_t\right) = \operatorname{var}\left(\sum w_t e_t\right) \\
&= \sum w_t^2 \operatorname{var}(e_t) + \sum_{i \neq j} w_i w_j \operatorname{cov}(e_i, e_j) \\
&= \sum w_t^2 \sigma_t^2 \\
&= \frac{\sum (x_t - \bar{x})^2 \sigma_t^2}{\left[\sum (x_t - \bar{x})^2\right]^2}
\end{aligned} \tag{11.2.5}$$
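As a numerical check on the variance expression just derived, the following sketch evaluates $\sum w_t^2 \sigma_t^2$ and compares it with the familiar constant-variance formula $\sigma^2 / \sum (x_t - \bar{x})^2$. The income values, the variances, and the function names are hypothetical choices made for illustration; they are not the food expenditure data.

```python
import numpy as np

def var_b2_hetero(x, sigma2_t):
    """Variance of b2 when var(e_t) = sigma2_t: sum of w_t^2 * sigma_t^2."""
    x = np.asarray(x, dtype=float)
    d = x - x.mean()                  # deviations x_t - xbar
    w = d / np.sum(d ** 2)            # least squares weights w_t
    return np.sum(w ** 2 * np.asarray(sigma2_t, dtype=float))

def var_b2_homo(x, sigma2):
    """Variance of b2 when every error has the same variance sigma2."""
    x = np.asarray(x, dtype=float)
    return sigma2 / np.sum((x - x.mean()) ** 2)

# Hypothetical regressor values and error variances (assumed, not from the text).
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
sigma2_t = 2.0 * x                    # variances that grow with x

v_het = var_b2_hetero(x, sigma2_t)    # general formula
v_same = var_b2_hetero(x, np.full_like(x, 3.0))
v_homo = var_b2_homo(x, 3.0)          # equals v_same when variances are equal
print(v_het, v_same, v_homo)
```

When every variance is equal the two functions agree, which is the special case used in the earlier homoskedastic proof; once the variances differ, only the general formula is correct.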

In an earlier proof, where the variances were all the same ($\sigma_t^2 = \sigma^2$), we were able to write the next-to-last line as $\sigma^2 \sum w_t^2$. Now, the situation is more complex. Note from the last line in Equation (11.2.5) that

$$\operatorname{var}(b_2) \neq \frac{\sigma^2}{\sum (x_t - \bar{x})^2} \tag{11.2.6}$$

Thus, if we use the least squares estimation procedure and ignore heteroskedasticity when it is present, we will be using an estimate of Equation (11.2.6) to obtain the standard error for $b_2$, when in fact we should be using an estimate of Equation (11.2.5). Using incorrect standard errors means that interval estimates and hypothesis tests will no longer be valid. Note that standard computer software for least squares regression will compute the estimated variance for $b_2$ based on Equation (11.2.6), unless told otherwise.

11.2.1 White's Approximate Estimator for the Variance of the Least Squares Estimator

Halbert White, an econometrician, has suggested an estimator for the variances and covariances of the least squares coefficient estimators when heteroskedasticity exists. In the context of the simple regression model, his estimator for $\operatorname{var}(b_2)$ is obtained by replacing $\sigma_t^2$ by the squares of the least squares residuals $\hat{e}_t^2$ in Equation (11.2.5). Large variances are likely to lead to large values of the squared residuals. Because the squared residuals are used to approximate the variances, White's estimator is strictly appropriate only in large samples. If we apply White's estimator to the food expenditure-income data, we obtain

$$\widehat{\operatorname{var}}(b_1) = 561.89, \qquad \widehat{\operatorname{var}}(b_2) = 0.0014569$$

Taking the square roots of these quantities yields the standard errors, so that we could write our estimated equation as

$$\begin{array}{rll}
\hat{y}_t = & 40.768 + 0.1283\,x_t & \\
& (23.704)\quad\;(0.0382) & \text{(White)} \\
& (22.139)\quad\;(0.0305) & \text{(incorrect)}
\end{array}$$

In this case, ignoring heteroskedasticity and using incorrect standard errors tends to overstate the precision of estimation; we tend to get confidence intervals that are narrower than they should be.
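To make the substitution concrete, here is a minimal sketch of White's variance estimator for the slope in a simple regression, next to the usual least squares formula. It implements only the basic form described above (squared residuals in place of the unknown variances). The four data points and the function names are hypothetical, chosen so the arithmetic can be followed by hand; they are not the food expenditure data.

```python
import numpy as np

def ols_simple(x, y):
    """Least squares intercept b1 and slope b2 for y = b1 + b2*x."""
    d = x - x.mean()
    b2 = np.sum(d * y) / np.sum(d ** 2)
    b1 = y.mean() - b2 * x.mean()
    return b1, b2

def white_var_b2(x, y):
    """White's estimate: sum (x_t - xbar)^2 * ehat_t^2 / [sum (x_t - xbar)^2]^2."""
    b1, b2 = ols_simple(x, y)
    ehat = y - b1 - b2 * x            # least squares residuals
    d = x - x.mean()
    return np.sum(d ** 2 * ehat ** 2) / np.sum(d ** 2) ** 2

def usual_var_b2(x, y):
    """Usual estimate that assumes a constant error variance."""
    b1, b2 = ols_simple(x, y)
    ehat = y - b1 - b2 * x
    s2 = np.sum(ehat ** 2) / (len(x) - 2)   # estimated error variance
    return s2 / np.sum((x - x.mean()) ** 2)

# Hypothetical toy data (assumed for illustration only).
x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([1.0, 3.0, 2.0, 6.0])

v_white = white_var_b2(x, y)
v_usual = usual_var_b2(x, y)
print(np.sqrt(v_white), np.sqrt(v_usual))   # the two standard errors
```

With only four observations the two estimates are not reliable guides to precision; as noted above, White's estimator is strictly appropriate only in large samples.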

Specifically, following Equation (5.1.1) of Chapter 5, we can construct two corresponding 95% confidence intervals for $\beta_2$.

White: $b_2 \pm t_c\,\mathrm{se}(b_2) = 0.1283 \pm 2.024(0.0382) = [0.051,\ 0.206]$

Incorrect: $b_2 \pm t_c\,\mathrm{se}(b_2) = 0.1283 \pm 2.024(0.0305) = [0.067,\ 0.190]$

If we ignore heteroskedasticity, we estimate that $\beta_2$ lies between 0.067 and 0.190. However, recognizing the existence of heteroskedasticity means recognizing that our information is less precise, and we estimate that $\beta_2$ lies between 0.051 and 0.206.

White's estimator for the standard errors helps overcome the problem of drawing incorrect inferences from least squares estimates in the presence of heteroskedasticity. However, if we can get a better estimator than least squares, then it makes more sense to use this better estimator and its corresponding standard errors. What is a better estimator will depend on how we model the heteroskedasticity. That is, it will depend on what further assumptions we make about the $\sigma_t^2$.
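The two 95% intervals for $\beta_2$ quoted above can be reproduced with a few lines of arithmetic. The sketch below uses the slope estimate 0.1283, the critical value 2.024, and the two standard errors 0.0382 (White) and 0.0305 (usual formula); the helper name `conf_interval` is an assumption made for this example.

```python
def conf_interval(estimate, se, t_crit):
    """Return (lower, upper) for estimate +/- t_crit * se."""
    half_width = t_crit * se
    return estimate - half_width, estimate + half_width

b2 = 0.1283       # least squares slope estimate
t_c = 2.024       # critical value used for the 95% intervals

white_ci = conf_interval(b2, 0.0382, t_c)   # White standard error
usual_ci = conf_interval(b2, 0.0305, t_c)   # usual (incorrect) standard error

print([round(v, 3) for v in white_ci])      # [0.051, 0.206]
print([round(v, 3) for v in usual_ci])      # [0.067, 0.19]
```

The White interval is wider by about 0.03, which is the numerical counterpart of the statement that ignoring heteroskedasticity overstates precision in this example.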

11.3 Proportional Heteroskedasticity

Return to the example where weekly food expenditure ($y_t$) is related to weekly income ($x_t$) through the equation

$$y_t = \beta_1 + \beta_2 x_t + e_t \tag{11.3.1}$$

Following the discussion in Section 11.1, we make the following assumptions:

$$E(e_t) = 0, \qquad \operatorname{var}(e_t) = \sigma_t^2, \qquad \operatorname{cov}(e_i, e_j) = 0 \quad (i \neq j)$$

By itself, the assumption $\operatorname{var}(e_t) = \sigma_t^2$ is not adequate for developing a better procedure for estimating $\beta_1$ and $\beta_2$. We would need to estimate $T$ different variances $(\sigma_1^2, \sigma_2^2, \ldots, \sigma_T^2)$ plus $\beta_1$ and $\beta_2$, with only $T$ sample observations; it is not possible to consistently estimate $T$ or more parameters. We overcome this problem by making a further assumption about the $\sigma_t^2$. Our earlier inspection of the least squares residuals suggested that the error variance increases as income increases. A reasonable model for such a variance relationship is

$$\operatorname{var}(e_t) = \sigma_t^2 = \sigma^2 x_t \tag{11.3.2}$$

That is, we assume that the variance of the $t$-th error term $\sigma_t^2$ is given by a positive unknown constant parameter $\sigma^2$ multiplied by the positive income variable $x_t$.

As explained earlier, in economic terms this assumption implies that for low levels of income ($x_t$), food expenditure ($y_t$) will be clustered close to the mean function $E(y_t) = \beta_1 + \beta_2 x_t$. Expenditure on food for low-income households will be largely explained by the level of income. At high levels of income, food expenditures can deviate more from the mean function. This means that there are likely to be many other factors, such as specific tastes and preferences, that reside in the error term, and that lead to a greater variation in food expenditure for high-income households. Thus, the assumption of heteroskedastic errors in Equation (11.3.2) is a reasonable one for the expenditure model. In any given practical setting it is important to think not only about whether the residuals from the data exhibit heteroskedasticity, but also about whether such heteroskedasticity is a likely phenomenon from an economic standpoint.

Under heteroskedasticity the least squares estimator is not the best linear unbiased estimator. One way of overcoming this dilemma is to change or transform our statistical model into one with homoskedastic errors. Leaving the basic structure of the model intact, it is possible to turn the heteroskedastic error model into a homoskedastic error model. Once this transformation has been carried out, application of least squares to the transformed model gives a best linear unbiased estimator.

To demonstrate these facts, we begin by dividing both sides of the original equation in (11.3.1) by $\sqrt{x_t}$:

$$\frac{y_t}{\sqrt{x_t}} = \beta_1 \frac{1}{\sqrt{x_t}} + \beta_2 \frac{x_t}{\sqrt{x_t}} + \frac{e_t}{\sqrt{x_t}} \tag{11.3.3}$$

Now, define the following transformed variables

$$y_t^* = \frac{y_t}{\sqrt{x_t}}, \qquad x_{t1}^* = \frac{1}{\sqrt{x_t}}, \qquad x_{t2}^* = \frac{x_t}{\sqrt{x_t}}, \qquad e_t^* = \frac{e_t}{\sqrt{x_t}} \tag{11.3.4}$$

so that Equation (11.3.3) can be rewritten as

$$y_t^* = \beta_1 x_{t1}^* + \beta_2 x_{t2}^* + e_t^* \tag{11.3.5}$$

The beauty of this transformed model is that the new transformed error term $e_t^*$ is homoskedastic. The proof of this result is:

$$\operatorname{var}(e_t^*) = \operatorname{var}\left(\frac{e_t}{\sqrt{x_t}}\right) = \frac{1}{x_t}\operatorname{var}(e_t) = \frac{1}{x_t}\sigma^2 x_t = \sigma^2 \tag{11.3.6}$$

The transformed error term will retain the properties $E(e_t^*) = 0$ and zero correlation between different observations, $\operatorname{cov}(e_i^*, e_j^*) = 0$ for $i \neq j$. As a consequence, we can apply least squares to the transformed variables $y_t^*$, $x_{t1}^*$ and $x_{t2}^*$ to obtain the best linear unbiased estimator for $\beta_1$ and $\beta_2$. Note that these transformed variables are all observable; it is a straightforward matter to compute the observations on these variables. Also, the transformed model is linear in the unknown parameters $\beta_1$ and $\beta_2$. These are the original parameters that we are interested in estimating. They have not been affected by the transformation. In short, the transformed model is a linear statistical model to which we can apply least squares estimation. The transformed model satisfies the conditions of the Gauss-Markov Theorem, and the least squares estimators defined in terms of the transformed variables are B.L.U.E.
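The transformation can be sketched in a few lines of NumPy: divide $y_t$, the constant, and $x_t$ by $\sqrt{x_t}$, then run least squares on the transformed variables without adding a column of ones. The function name and the sample numbers are assumptions made for this illustration, and the sketch presumes every $x_t$ is strictly positive.

```python
import numpy as np

def gls_proportional(x, y):
    """Estimate beta1, beta2 when var(e_t) = sigma^2 * x_t.

    Applies least squares to the transformed variables
    y*_t = y_t/sqrt(x_t), x*_t1 = 1/sqrt(x_t), x*_t2 = x_t/sqrt(x_t).
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    root = np.sqrt(x)                               # requires x_t > 0
    y_star = y / root
    X_star = np.column_stack([1.0 / root, x / root])  # no column of ones
    beta, *_ = np.linalg.lstsq(X_star, y_star, rcond=None)
    return beta                                     # [beta1_hat, beta2_hat]

# Hypothetical check: data lying exactly on y = 3 + 0.5 x should be
# recovered, since the transformation leaves beta1 and beta2 unchanged.
x_demo = np.array([1.0, 4.0, 9.0, 16.0])
y_demo = 3.0 + 0.5 * x_demo
beta_demo = gls_proportional(x_demo, y_demo)
print(beta_demo)
```

The first column of the transformed design matrix plays the role of the intercept variable, so the coefficients keep their original interpretation.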

To summarize, to obtain the best linear unbiased estimator for a model with heteroskedasticity of the type specified in Equation (11.3.2):

1. Calculate the transformed variables given in Equation (11.3.4).
2. Use least squares to estimate the transformed model given in Equation (11.3.5).

The estimator obtained in this way is called a generalized least squares estimator.

One way of viewing the generalized least squares estimator is as a weighted least squares estimator. Recall that the least squares estimator is those values of $\beta_1$ and $\beta_2$ that minimize the sum of squared errors. In this case, we are minimizing the sum of squared transformed errors that are given by

$$\sum_{t=1}^{T} e_t^{*2} = \sum_{t=1}^{T} \frac{e_t^2}{x_t}$$

The squared errors are weighted by the reciprocal of $x_t$. When $x_t$ is small, the data contain more information about the regression function and the observations are weighted heavily. When $x_t$ is large, the data contain less information and the observations are weighted lightly. In this way we take advantage of the heteroskedasticity to improve parameter estimation.

Remark: In the transformed model $x_{t1}^* \neq 1$. That is, the variable associated with the intercept parameter is no longer equal to 1. Since least squares software usually automatically inserts a 1 for the intercept, when dealing with transformed variables you will need to learn how to turn this option off. If you use a weighted or generalized least squares option on your software, the computer will do both the transforming and the estimating. In this case suppressing the constant will not be necessary.
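The weighted least squares view can be written directly in terms of the untransformed variables. The sketch below solves the weighted normal equations $(X'WX)\hat{\beta} = X'Wy$ with weights $1/x_t$, keeping the ordinary column of ones in $X$, so no constant needs to be suppressed. The function name and the demo numbers are hypothetical.

```python
import numpy as np

def weighted_least_squares(x, y, weights):
    """Minimize sum of weights_t * (y_t - b1 - b2*x_t)^2 over b1, b2."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    w = np.asarray(weights, dtype=float)
    X = np.column_stack([np.ones_like(x), x])   # ordinary intercept column
    XtW = X.T * w                               # X' W with W = diag(weights)
    return np.linalg.solve(XtW @ X, XtW @ y)

# Hypothetical data; weights 1/x_t correspond to var(e_t) = sigma^2 * x_t.
x_demo = np.array([1.0, 2.0, 4.0, 8.0, 16.0])
y_demo = np.array([2.0, 3.0, 3.0, 9.0, 10.0])
beta_wls = weighted_least_squares(x_demo, y_demo, 1.0 / x_demo)
print(beta_wls)
```

These estimates coincide with those from least squares on the variables divided by $\sqrt{x_t}$, which is why a weighted least squares option in software spares the user the manual transformation.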

Applying the generalized (weighted) least squares procedure to our household expenditure data yields the following estimates:

$$\hat{y}_t = \underset{(17.986)}{31.924} + \underset{(0.0270)}{0.1410}\,x_t \tag{R11.4}$$

That is, we estimate the intercept term as $\hat{\beta}_1 = 31.924$ and the slope coefficient that shows the response of food expenditure to a change in income as $\hat{\beta}_2 = 0.1410$. These estimates are somewhat different from the least squares estimates $b_1 = 40.768$ and $b_2 = 0.1283$ that did not allow for the existence of heteroskedasticity. It is important to recognize that the interpretations for $\beta_1$ and $\beta_2$ are the same in the transformed model in Equation (11.3.5) as they are in the untransformed model in Equation (11.3.1).

Transformation of the variables should be regarded as a device for converting a heteroskedastic error model into a homoskedastic error model, not as something that changes the meaning of the coefficients.

The standard errors in Equation (R11.4), namely $\mathrm{se}(\hat{\beta}_1) = 17.986$ and $\mathrm{se}(\hat{\beta}_2) = 0.0270$, are both lower than their least squares counterparts that were calculated from White's estimator, namely $\mathrm{se}(b_1) = 23.704$ and $\mathrm{se}(b_2) = 0.0382$. Since generalized least squares is a better estimation procedure than least squares, we do expect the generalized least squares standard errors to be lower.

Remark: Remember that standard errors are square roots of estimated variances; in a single sample the relative magnitudes of variances may not always be reflected by their corresponding variance estimates. Thus, lower standard errors do not always mean better estimation.

The smaller standard errors have the advantage of producing narrower, more informative confidence intervals. For example, using the generalized least squares results, a 95% confidence interval for $\beta_2$ is given by

$$\hat{\beta}_2 \pm t_c\,\mathrm{se}(\hat{\beta}_2) = 0.1410 \pm 2.024(0.0270) = [0.086,\ 0.196]$$

The least squares confidence interval computed using White's standard errors was $[0.051,\ 0.206]$.

11.4 Detecting Heteroskedasticity

There is likely to be uncertainty about whether a heteroskedastic-error assumption is warranted. A common question is: How do we know if heteroskedasticity is likely to be a problem for our model and our set of data? Is there a way of detecting heteroskedasticity so that we know whether to use generalized least squares techniques? We will consider two ways of investigating these questions.

11.4.1 Residual Plots

One way of investigating the existence of heteroskedasticity is to estimate your model using least squares and to plot the least squares residuals. If the errors are homoskedastic, there should be no patterns of any sort in the residuals. If the errors are heteroskedastic, they may tend to exhibit greater variation in some systematic way. For example, for the household expenditure data, we suspected that the variance may increase as income increased. In Figure 11.1 we plotted the estimated least squares function and the residuals and discovered that the absolute values of the residuals did indeed tend to increase as income increased. This method of investigating heteroskedasticity can be followed for any simple regression.

When we have more than one explanatory variable, the estimated least squares function is not so easily depicted on a diagram. However, what we can do is plot the least squares residuals against each explanatory variable, against time, or against $\hat{y}_t$, to see if those residuals vary in a systematic way relative to the specified variable.

11.4.2 The Goldfeld-Quandt Test

A formal test for heteroskedasticity is the Goldfeld-Quandt test. It involves the following steps:

1. Split the sample into two approximately equal subsamples. If heteroskedasticity exists, some observations will have large variances and others will have small variances. Divide the sample such that the observations with potentially high variances are in one subsample and those with potentially low variances are in the other subsample. For example, in the food expenditure equation, where we believe the variances are related to $x_t$, the observations should be sorted according to the magnitude of $x_t$; the $T/2$ observations with the largest values of $x_t$ would form one subsample and the other $T/2$ observations, with the smallest values of $x_t$, would form the other.
2. Compute estimated error variances $\hat{\sigma}_1^2$ and $\hat{\sigma}_2^2$ for each of the subsamples. Let $\hat{\sigma}_1^2$ be the estimate from the subsample with potentially large variances and let $\hat{\sigma}_2^2$ be the estimate from the subsample with potentially small variances. If a null hypothesis of equal variances is not true, we expect $\hat{\sigma}_1^2 / \hat{\sigma}_2^2$ to be large.
3. Compute $GQ = \hat{\sigma}_1^2 / \hat{\sigma}_2^2$ and reject the null hypothesis of equal variances if $GQ > F_c$, where $F_c$ is a critical value from the $F$-distribution with $(T_1 - K)$ and $(T_2 - K)$ degrees of freedom. The values $T_1$ and $T_2$ are the numbers of observations in each of the subsamples; if the sample is split exactly in half, $T_1 = T_2 = T/2$.

Applying this test procedure to the household food expenditure model, we set up the hypotheses as follows:

$$H_0: \sigma_t^2 = \sigma^2 \qquad H_1: \sigma_t^2 = \sigma^2 x_t \tag{11.4.1}$$

After ordering the data according to decreasing values of $x_t$, and using a partition of 20 observations in each subset of data, we find $\hat{\sigma}_1^2 = 2285.9$ and $\hat{\sigma}_2^2 = 682.46$. Hence, the value of the Goldfeld-Quandt statistic is

$$GQ = \frac{2285.9}{682.46} = 3.35$$

The 5 percent critical value for (18, 18) degrees of freedom is $F_c = 2.22$. Thus, because $GQ = 3.35 > F_c = 2.22$, we reject $H_0$ and conclude that heteroskedasticity does exist; the error variance does depend on the level of income.

REMARK: The above test is a one-sided test because the alternative hypothesis suggested which sample partition will have the larger variance. If we suspect that two sample partitions could have different variances, but we do not know which variance is potentially larger, then a two-sided test with alternative hypothesis $H_1: \sigma_1^2 \neq \sigma_2^2$ is more appropriate. To perform a two-sided test at the 5 percent significance level we put the larger variance estimate in the numerator and use a critical value $F_c$ such that $P[F > F_c] = 0.025$.
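The three steps of the Goldfeld-Quandt test can be sketched as follows for a simple regression. The function names and the six-point toy data set are hypothetical; the last lines only redo the arithmetic of the food expenditure example using the two variance estimates and the critical value 2.22 reported in the text.

```python
import numpy as np

def sigma2_hat(x, y):
    """Estimated error variance from a simple regression of y on x."""
    X = np.column_stack([np.ones_like(x), x])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return np.sum(resid ** 2) / (len(y) - X.shape[1])   # SSE / (T_i - K)

def goldfeld_quandt(x, y):
    """Return (GQ, s2_high, s2_low), splitting on decreasing x."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    order = np.argsort(x)[::-1]          # largest x first
    x, y = x[order], y[order]
    half = len(x) // 2                   # if T is odd, the low group gets the extra point
    s2_high = sigma2_hat(x[:half], y[:half])
    s2_low = sigma2_hat(x[half:], y[half:])
    return s2_high / s2_low, s2_high, s2_low

# Hypothetical toy data (assumed for illustration only).
x_demo = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
y_demo = np.array([1.0, 2.0, 4.0, 4.0, 7.0, 4.0])
gq_demo, s2_high, s2_low = goldfeld_quandt(x_demo, y_demo)

# Arithmetic for the food expenditure example, using the reported estimates.
gq_food = 2285.9 / 682.46
reject_h0 = gq_food > 2.22               # F_c for (18, 18) degrees of freedom
print(round(gq_food, 2), reject_h0)      # 3.35 True
```

The critical value is taken from the text rather than computed, so the sketch needs nothing beyond NumPy; a statistics library could supply $F_c$ for other degrees of freedom.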

11.5 A Sample With a Heteroskedastic Partition

11.5.1 Economic Model

Consider modeling the supply of wheat in a particular wheat growing area in Australia. In the supply function the quantity of wheat supplied will typically depend upon the production technology of the firm, on the price of wheat or expectations about the price of wheat, and on weather conditions. We can depict this supply function as

$$\text{Quantity} = f(\text{Price},\ \text{Technology},\ \text{Weather}) \tag{11.5.1}$$

To estimate how the quantity supplied responds to price and other variables, we move from the economic model in Equation (11.5.1) to an econometric model that we can estimate. If we have a sample of time-series data, aggregated over all farms, there will be price variation from year to year, variation that can be used to estimate the response of quantity to price. Also, production technology will improve over time, meaning that a greater supply can become profitable at the same level of output price. Finally, a large part of the year-to-year variation in supply could be attributable to weather conditions.

The data we have available from the Australian wheat-growing district consist of 26 years of aggregate time-series data on quantity supplied and price. Because there is no obvious index of production technology, some kind of proxy needs to be used for this variable. We use a simple linear time-trend, a variable that takes the value 1 in year 1, 2 in year 2, and so on, up to 26 in year 26. An obvious weather variable is also unavailable; thus, in our statistical model, weather effects will form part of the random error term. Using these considerations, we specify the linear supply function

$$q_t = \beta_1 + \beta_2 p_t + \beta_3 t + e_t, \qquad t = 1, 2, \ldots, 26 \tag{11.5.2}$$

where $q_t$ is the quantity of wheat produced in year $t$, $p_t$ is the price of wheat guaranteed for year $t$, $t = 1, 2, \ldots, 26$ is a trend variable introduced to capture changes in production technology, and $e_t$ is a random error term that includes, among other things, the influence of weather. As before, $\beta_1$, $\beta_2$, and $\beta_3$ are unknown parameters that we wish to estimate. The data on $q$, $p$, and $t$ are given in Table 11.1.

To complete the econometric model in Equation (11.5.2) some statistical assumptions for the random error term $e_t$ are needed. One possibility is to assume the $e_t$ are independent identically distributed random variables with zero mean and constant variance. This assumption is in line with those made in earlier chapters. In this case, however, we have additional information that makes an alternative assumption more realistic. After the 13th year, new wheat varieties whose yields are less susceptible to variations in weather conditions were introduced. These new varieties do not have an average yield that is higher than that of the old varieties, but the variance of their yields is lower because yield is less dependent on weather conditions. Since the weather effect is a major component of the random error term $e_t$, we can model the reduced weather effect of the last 13 years by assuming the error variance in those years is different from the error variance in the first 13 years. Thus, we assume that

$$\begin{aligned}
E(e_t) &= 0, & t &= 1, 2, \ldots, 26 \\
\operatorname{var}(e_t) &= \sigma_1^2, & t &= 1, 2, \ldots, 13 \\
\operatorname{var}(e_t) &= \sigma_2^2, & t &= 14, 15, \ldots, 26 \\
\operatorname{cov}(e_i, e_j) &= 0, & i &\neq j
\end{aligned} \tag{11.5.3}$$

From the above argument, we expect that $\sigma_2^2 < \sigma_1^2$.

Since the error variance in Equation (11.5.3) is not constant for all observations, this model describes another form of heteroskedasticity. It is a form that partitions the sample into two subsets, one subset where the error variance is $\sigma_1^2$ and one where the error variance is $\sigma_2^2$.

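11.5.2 Generalized Least Squares Through Model Transformation

Given the heteroskedastic error model with two variances, one for each subset of thirteen years, we consider transforming the model so that the variance of the transformed error term is constant over the whole sample. In Section 11.3 this approach made it possible to obtain a best linear unbiased estimator by applying least squares to the transformed model. Now we write the model corresponding to the two subsets of observations as

$$\begin{aligned}
q_t &= \beta_1 + \beta_2 p_t + \beta_3 t + e_t, & \operatorname{var}(e_t) &= \sigma_1^2, & t &= 1, \ldots, 13 \\
q_t &= \beta_1 + \beta_2 p_t + \beta_3 t + e_t, & \operatorname{var}(e_t) &= \sigma_2^2, & t &= 14, \ldots, 26
\end{aligned} \tag{11.5.4}$$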

Dividing each variable by σ 1 for he firs 13 observaions and by σ for he las 13 observaions yields q 1 p e =β +β +β + = 1, K,13 σ σ σ σ σ 1 3 1 1 1 1 1 q 1 p e =β +β +β + = 14, K,6 σ σ σ σ σ 1 3 (11.5.5) This ransformaion yields ransformed error erms ha have he same variance for all observaions. Specifically, he ransformed error variances are all equal o one because 41

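$$
\begin{aligned}
\operatorname{var}\!\left(\frac{e_t}{\sigma_1}\right) &= \frac{1}{\sigma_1^2}\operatorname{var}(e_t) = \frac{\sigma_1^2}{\sigma_1^2} = 1, & t &= 1, \ldots, 13 \\
\operatorname{var}\!\left(\frac{e_t}{\sigma_2}\right) &= \frac{1}{\sigma_2^2}\operatorname{var}(e_t) = \frac{\sigma_2^2}{\sigma_2^2} = 1, & t &= 14, \ldots, 26
\end{aligned}
$$

Provided $\sigma_1$ and $\sigma_2$ are known, the transformed model in Equation (11.5.5) provides a set of new transformed variables to which we can apply the least squares principle to obtain the best linear unbiased estimator for $(\beta_1, \beta_2, \beta_3)$. The transformed variables are

$$
\left(\frac{q_t}{\sigma_i}\right), \quad \left(\frac{1}{\sigma_i}\right), \quad \left(\frac{p_t}{\sigma_i}\right), \quad \left(\frac{t}{\sigma_i}\right)
\tag{11.5.6}
$$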

where $\sigma_i$ is either $\sigma_1$ or $\sigma_2$, depending on which half of the observations is being considered. As before, the complete process of transforming variables, then applying least squares to the transformed variables, is called generalized least squares.

11.5.3 Implementing Generalized Least Squares

The transformed variables in Equation (11.5.6) depend on the unknown variance parameters $\sigma_1^2$ and $\sigma_2^2$. Thus, as they stand, the transformed variables cannot be calculated. To overcome this difficulty, we use estimates of $\sigma_1^2$ and $\sigma_2^2$ and transform the variables as if the estimates were the true variances. Since $\sigma_1^2$ is the error variance from the first half of the sample and $\sigma_2^2$ is the error variance from the second half of the sample, it makes sense to split the sample into two, applying least squares to the first half to estimate $\sigma_1^2$ and applying least squares to

the second half to estimate $\sigma_2^2$. Substituting these estimates for the true values causes no difficulties in large samples. If we follow this strategy for the wheat supply example we obtain

$$
\hat{\sigma}_1^2 = 641.64 \quad \text{and} \quad \hat{\sigma}_2^2 = 57.76
\tag{R11.7}
$$

Using these estimates to calculate observations on the transformed variables in Equation (11.5.6), and then applying least squares to the complete sample defined in Equation (11.5.5), yields the estimated equation (standard errors in parentheses)

$$
\hat{q}_t = \underset{(12.7)}{138.1} + \underset{(8.81)}{21.72}\,p_t + \underset{(0.812)}{3.283}\,t
\tag{R11.8}
$$
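The three steps just described (estimate each variance from its own subsample, divide every variable by the matching estimated standard deviation, then apply least squares to the full transformed sample) can be sketched as follows. This is a minimal illustration using NumPy and synthetic data; the wheat data are not reproduced here, and the helper names `ols` and `fgls_two_regimes` as well as all parameter values in the data-generating step are assumptions made for the example.

```python
import numpy as np

def ols(X, y):
    """Least squares coefficients, residuals, and the variance estimate SSE / (T - K)."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    sigma2 = resid @ resid / (X.shape[0] - X.shape[1])
    return beta, resid, sigma2

def fgls_two_regimes(X, y, split):
    """Feasible GLS when the error variance differs before and after row `split`."""
    # Step 1: estimate each error variance from a separate least squares fit.
    _, _, s2_first = ols(X[:split], y[:split])
    _, _, s2_second = ols(X[split:], y[split:])
    # Step 2: divide every variable (including the constant) by sigma-hat of its regime.
    in_first = np.arange(len(y)) < split
    sigma_hat = np.where(in_first, np.sqrt(s2_first), np.sqrt(s2_second))
    X_star = X / sigma_hat[:, None]
    y_star = y / sigma_hat
    # Step 3: least squares on the transformed full sample.
    beta_gls, _, s2_star = ols(X_star, y_star)
    return beta_gls, s2_first, s2_second, s2_star, X_star

# Synthetic data (NOT the wheat data): 26 years, noisier errors in the first 13.
rng = np.random.default_rng(0)
T, split = 26, 13
trend = np.arange(1, T + 1)
price = rng.uniform(1.0, 3.0, size=T)
true_sd = np.where(trend <= split, 25.0, 8.0)
q = 140.0 + 20.0 * price + 3.0 * trend + rng.normal(0.0, true_sd)
X = np.column_stack([np.ones(T), price, trend])

beta_gls, s2_first, s2_second, s2_star, X_star = fgls_two_regimes(X, q, split)
print("variance estimates:", s2_first, s2_second)
print("GLS coefficients:", beta_gls)
print("estimated variance of transformed errors:", s2_star)
```

Note that the constant term is transformed as well, so the transformed regression has no ordinary intercept column; its first regressor is $1/\hat{\sigma}_i$.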

These estimates suggest that an increase in price of 1 unit will bring about an increase in supply of 21.72 units. The coefficient of the trend variable suggests that, each year, technological advances mean that an additional 3.283 units will be supplied, given constant prices. The standard errors are sufficiently small to make the estimated coefficients significantly different from zero. However, the 95% confidence intervals for $\beta_2$ and $\beta_3$, derived using these standard errors, are relatively wide.

$$
\begin{aligned}
\hat{\beta}_2 \pm t_c\,\operatorname{se}(\hat{\beta}_2) &= 21.72 \pm 2.069(8.81) = [3.5,\ 39.9] \\
\hat{\beta}_3 \pm t_c\,\operatorname{se}(\hat{\beta}_3) &= 3.283 \pm 2.069(0.812) = [1.60,\ 4.96]
\end{aligned}
\tag{R11.9}
$$
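The interval arithmetic can be checked with a few lines of Python. The sketch assumes the slope estimates 21.72 and 3.283, the standard errors 8.81 and 0.812, and the critical value 2.069 (26 observations less 3 coefficients gives 23 degrees of freedom); these are the values consistent with the intervals reported in (R11.9). The function name `conf_interval` is just an illustrative helper.

```python
def conf_interval(estimate, std_err, t_crit):
    """Return (lower, upper) for estimate +/- t_crit * std_err."""
    half_width = t_crit * std_err
    return estimate - half_width, estimate + half_width

T_CRIT = 2.069  # critical value used in the text

# (estimate, standard error) pairs read off the estimated equation.
coefficients = {"beta2": (21.72, 8.81), "beta3": (3.283, 0.812)}

intervals = {name: conf_interval(b, se, T_CRIT) for name, (b, se) in coefficients.items()}
for name, (lower, upper) in intervals.items():
    print(f"{name}: [{lower:.2f}, {upper:.2f}]")
```

Up to rounding, the results agree with the intervals in (R11.9).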

Remark: A word of warning about calculation of the standard errors is necessary. As demonstrated below Equation (11.5.5), the transformed errors in Equation (11.5.5) have a variance equal to one. However, when you transform your variables using $\hat{\sigma}_1$ and $\hat{\sigma}_2$, and apply least squares to the transformed variables for the complete sample, your computer program will automatically estimate a variance for the transformed errors. This estimate will not be exactly equal to one. The standard errors in Equation (R11.8) were calculated by forcing the computer to use one as the variance of the transformed errors. Most software packages will have options that let you do this, but it is not crucial if your package does not; the variance estimate will usually be close to one anyway.
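The difference between the two sets of standard errors can be seen in a small NumPy sketch. It uses synthetic data and treats 25 and 8 as if they were the estimates $\hat{\sigma}_1$ and $\hat{\sigma}_2$; both the data and those values are assumptions for illustration only.

```python
import numpy as np

rng = np.random.default_rng(1)
T, split = 26, 13
X = np.column_stack([np.ones(T), rng.uniform(1.0, 3.0, T), np.arange(1, T + 1)])
# Hypothetical sigma-hat values for the two halves of the sample.
sigma_hat = np.where(np.arange(T) < split, 25.0, 8.0)
y = X @ np.array([140.0, 20.0, 3.0]) + rng.normal(0.0, sigma_hat)

# Transform every variable, then apply least squares to the transformed sample.
X_star, y_star = X / sigma_hat[:, None], y / sigma_hat
XtX_inv = np.linalg.inv(X_star.T @ X_star)
beta = XtX_inv @ X_star.T @ y_star
resid = y_star - X_star @ beta
s2_star = resid @ resid / (T - X.shape[1])  # the variance a package estimates by default

se_forced = np.sqrt(np.diag(XtX_inv))             # error variance fixed at one
se_default = np.sqrt(s2_star * np.diag(XtX_inv))  # error variance estimated from residuals

print("estimated variance of transformed errors:", s2_star)
print("standard errors with variance forced to one:", se_forced)
print("default standard errors:", se_default)
```

The two sets of standard errors differ only by the common factor $\sqrt{s^{*2}}$, where $s^{*2}$ is the estimated variance of the transformed errors, which is why the choice matters little when that estimate is close to one.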

11.5.4 Testing the Variance Assumption

To use a residual plot to check whether the wheat-supply error variance has decreased over time, it is sensible to plot the least-squares residuals against time. See Figure 11.3. The dramatic drop in the variation of the residuals after year 13 supports our belief that the variance has decreased. For the Goldfeld-Quandt test the sample is already split into two natural subsamples. Thus, we set up the hypotheses

$$
H_0\colon \sigma_2^2 = \sigma_1^2 \qquad H_1\colon \sigma_2^2 < \sigma_1^2
\tag{11.5.7}
$$

The computed value of the Goldfeld-Quandt statistic is

$$
GQ = \frac{\hat{\sigma}_1^2}{\hat{\sigma}_2^2} = \frac{641.64}{57.76} = 11.11
$$

$T_1 = T_2 = 13$ and $K = 3$; thus, if $H_0$ is true, 11.11 is an observed value from an F-distribution with (10, 10) degrees of freedom. The corresponding 5 percent critical value is $F_c = 2.98$. Since $GQ = 11.11 > F_c = 2.98$ we reject $H_0$ and conclude that the observed difference between $\hat{\sigma}_1^2$ and $\hat{\sigma}_2^2$ could not reasonably be attributable to chance. There is evidence to suggest the new varieties have reduced the variance in the supply of wheat.
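The test calculation can be reproduced in a few lines. In this sketch the variance estimates 641.64 and 57.76 come from the text, the critical value 2.98 is taken as given rather than computed from the F-distribution, and the function name `goldfeld_quandt` is an illustrative helper rather than a library routine.

```python
def goldfeld_quandt(s2_large, s2_small, t1, t2, k):
    """Return the GQ statistic and its (numerator, denominator) degrees of freedom."""
    return s2_large / s2_small, (t1 - k, t2 - k)

# Variance estimates from the two subsamples of 13 years each, K = 3 coefficients.
gq, df = goldfeld_quandt(641.64, 57.76, t1=13, t2=13, k=3)

F_CRIT = 2.98  # 5 percent critical value of F(10, 10), taken from the text
reject_h0 = gq > F_CRIT

print(f"GQ = {gq:.2f}, df = {df}, reject H0: {reject_h0}")
# prints: GQ = 11.11, df = (10, 10), reject H0: True
```

The larger variance estimate goes in the numerator so that a one-sided test against the upper tail of the F-distribution matches the alternative hypothesis.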

Exercises: 11.1, 11.3, 11.4, 11.6, 11.8, 11.9, 11.11