Basic Econometrics in Transportation: Heteroscedasticity


Amir Samimi, Civil Engineering Department, Sharif University of Technology
Primary Source: Basic Econometrics (Gujarati)

Outline
What is the nature of heteroscedasticity?
What are its consequences?
How does one detect it?
What are the remedial measures?

Nature of Heteroscedasticity
An important assumption in the CLRM is that $E(u_i^2) = \sigma^2$ for every observation $i$. This is the assumption of equal (homo) spread (scedasticity). Under heteroscedasticity the variance differs across observations: $E(u_i^2) = \sigma_i^2$.
Example: higher-income families save more on average than lower-income families, but there is also more variability in their savings.

Possible Reasons
1. As people learn, their errors of behavior become smaller over time. For example, as the number of hours of typing practice increases, both the average number of typing errors and their variance decrease.
2. As incomes grow, people have more choices about the disposition of their income. Rich people have more choices about their savings behavior.
3. As data collecting techniques improve, $\sigma_i^2$ is likely to decrease. Banks that have sophisticated data processing equipment are likely to commit fewer errors.

4. Heteroscedasticity can arise when there are outliers. An outlier is an observation that is very different from the other observations in the sample.
5. Heteroscedasticity arises when the model is not correctly specified. Very often what looks like heteroscedasticity is due to important variables being omitted from the model.
6. Skewness in the distribution of a regressor is another source. The distribution of income and wealth in most societies is uneven, with the bulk of the income and wealth owned by a few at the top.
7. Other sources of heteroscedasticity: incorrect data transformation (ratio or first-difference transformations) and incorrect functional form (linear versus log-linear models).

Cross-sectional and Time Series Data
Heteroscedasticity is likely to be more common in cross-sectional than in time series data.
In cross-sectional data, one usually deals with members of a population at a given point in time. These members may differ in size, income, etc.
In time series data, the variables tend to be of similar orders of magnitude because one generally collects the data for the same entity over a period of time.

OLS Estimation with Heteroscedasticity
For the two-variable model, with $x_i = X_i - \bar X$ and $y_i = Y_i - \bar Y$, the OLS estimator and its variance when $E(u_i^2) = \sigma_i^2$ are:
$\hat\beta_2 = \frac{\sum x_i y_i}{\sum x_i^2}, \qquad \mathrm{var}(\hat\beta_2) = \frac{\sum x_i^2 \sigma_i^2}{(\sum x_i^2)^2}$
Is the estimator still BLUE when we drop only the homoscedasticity assumption?
We can easily prove that it is still linear and unbiased.
We can also show that it is a consistent estimator.
It is no longer best, and the minimum variance is not given by the equation above.
What is BLUE in the presence of heteroscedasticity?

Method of Generalized Least Squares
Ideally, we would like to give less weight to the observations coming from populations with greater variability.
Consider:
$Y_i = \beta_1 + \beta_2 X_i + u_i = \beta_1 X_{0i} + \beta_2 X_i + u_i$, where $X_{0i} = 1$ for each $i$.
Assume the heteroscedastic variances $\sigma_i^2$ are known, and divide the model through by $\sigma_i$:
$\frac{Y_i}{\sigma_i} = \beta_1 \frac{X_{0i}}{\sigma_i} + \beta_2 \frac{X_i}{\sigma_i} + \frac{u_i}{\sigma_i}$
The variance of the transformed disturbance term $u_i^* = u_i / \sigma_i$ is now homoscedastic:
$\mathrm{var}(u_i^*) = E\left(\frac{u_i}{\sigma_i}\right)^2 = \frac{\sigma_i^2}{\sigma_i^2} = 1$
Apply OLS to the transformed model and get BLUE estimators.
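The GLS idea above (divide every term, including the intercept column, by a known $\sigma_i$ and then run OLS) can be sketched in a few lines. This is a minimal illustration on simulated data, not part of the original slides: the function names `ols` and `wls_known_sigma`, the simulated coefficients, and the choice $\sigma_i = 0.5 X_i$ are all assumptions made for the example.

```python
import numpy as np

def ols(X, y):
    """Return OLS coefficients for design matrix X (n x k) and response y."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

def wls_known_sigma(x, y, sigma):
    """GLS for Y_i = b1 + b2*X_i + u_i when sd(u_i) = sigma_i is known.

    Every term, including the intercept column X_0i = 1, is divided by
    sigma_i; OLS is then applied to the transformed variables.
    """
    x0 = np.ones_like(x)
    X_star = np.column_stack([x0 / sigma, x / sigma])
    y_star = y / sigma
    return ols(X_star, y_star)

# Simulated data (hypothetical): the error sd grows with X.
rng = np.random.default_rng(0)
n = 200
x = rng.uniform(1.0, 10.0, size=n)
sigma = 0.5 * x                      # assumed known for this sketch
y = 2.0 + 3.0 * x + rng.normal(0.0, sigma)

b_ols = ols(np.column_stack([np.ones(n), x]), y)  # unweighted OLS
b_wls = wls_known_sigma(x, y, sigma)              # GLS / WLS

print("OLS:", b_ols)
print("GLS:", b_wls)
```

Both estimators are unbiased here; the point of the transformation is the smaller sampling variance of the GLS estimates, which a single simulated sample cannot show on its own.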

GLS Estimators
Minimize
$\sum w_i (Y_i - \hat\beta_1^* X_{0i} - \hat\beta_2^* X_i)^2, \qquad w_i = 1/\sigma_i^2$
Following the standard calculus techniques, we have:
$\hat\beta_2^* = \frac{(\sum w_i)(\sum w_i X_i Y_i) - (\sum w_i X_i)(\sum w_i Y_i)}{(\sum w_i)(\sum w_i X_i^2) - (\sum w_i X_i)^2}$
$\mathrm{var}(\hat\beta_2^*) = \frac{\sum w_i}{(\sum w_i)(\sum w_i X_i^2) - (\sum w_i X_i)^2}$

Consequences of Using OLS
The usual OLS estimator of the variance is a biased estimator.
It overestimates or underestimates the true variance on average.
We cannot tell whether the bias is positive or negative.
We can no longer rely on the usual confidence intervals or on t and F tests.
If we persist in using the usual testing procedures despite heteroscedasticity, whatever conclusions we draw may be very misleading.
Heteroscedasticity is potentially a serious problem, and the researcher needs to know whether it is present in a given situation.

Detection
There are no hard-and-fast rules for detecting heteroscedasticity, only a few rules of thumb.
This is inevitable because $\sigma_i^2$ can be known only if we have the entire Y population corresponding to the chosen X's.
More often than not, there is only one sample Y value corresponding to a particular value of X, and there is no way one can know $\sigma_i^2$ from just one Y observation.
Thus, detecting heteroscedasticity may be a matter of intuition, educated guesswork, or prior empirical experience.
Most of the detection methods are based on examination of the OLS residuals $\hat u_i$. Those are what we observe, not the disturbances $u_i$. We hope they are good estimates, and this hope may be fulfilled if the sample size is fairly large.

Informal Methods
Nature of the problem: the nature of the problem may suggest that heteroscedasticity is likely to be encountered. For example, the residual variance around the regression of consumption on income increases with income.
Graphical method: the squared residuals $\hat u_i^2$ are plotted against the estimated $\hat Y_i$. Is the estimated mean value of Y systematically related to the squared residual?
Panel (a): no systematic pattern, perhaps no heteroscedasticity.
Panels (b) to (e): definite patterns, perhaps heteroscedasticity.
Using such knowledge, one may transform the data to alleviate the problem.
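When a plot is not at hand, the graphical method can be approximated numerically: sort the observations by fitted value, split them into a few groups, and compare the average squared residual in each group. The sketch below assumes simulated data and a hypothetical helper name, `squared_residuals_by_fitted`; it is a rough stand-in for the scatter plot, not a formal test.

```python
import numpy as np

def squared_residuals_by_fitted(x, y, n_bins=4):
    """Summarize squared OLS residuals against fitted values.

    Returns one (mean fitted value, mean squared residual) pair per group,
    with groups formed by sorting the observations on the fitted value.
    """
    X = np.column_stack([np.ones_like(x), x])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    fitted = X @ coef
    resid_sq = (y - fitted) ** 2
    order = np.argsort(fitted)
    groups = np.array_split(order, n_bins)
    return [(float(fitted[g].mean()), float(resid_sq[g].mean())) for g in groups]

# Simulated data (hypothetical): the error sd is proportional to X.
rng = np.random.default_rng(1)
n = 400
x = rng.uniform(1.0, 10.0, size=n)
y = 2.0 + 3.0 * x + rng.normal(0.0, 0.5 * x)

summary = squared_residuals_by_fitted(x, y)
for mean_fitted, mean_resid_sq in summary:
    print(f"mean fitted = {mean_fitted:7.2f}   mean squared residual = {mean_resid_sq:7.2f}")
```

A mean squared residual that rises or falls steadily across the groups corresponds to the "definite pattern" panels; roughly equal values correspond to the "no systematic pattern" panel.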

Park Test
Park formalizes the graphical method by suggesting a log-linear model:
$\ln \sigma_i^2 = \ln \sigma^2 + \beta \ln X_i + v_i$
Since $\sigma_i^2$ is generally unknown, Park suggests using $\hat u_i^2$ as a proxy:
$\ln \hat u_i^2 = \ln \sigma^2 + \beta \ln X_i + v_i = \alpha + \beta \ln X_i + v_i$
If $\beta$ turns out to be statistically insignificant, the homoscedasticity assumption may be accepted.
The particular functional form chosen by Park is only suggestive.
Note: the error term $v_i$ may not satisfy the OLS assumptions.

Glejser Test
Glejser suggests regressing the absolute values of the estimated residuals, $|\hat u_i|$, on the X variable. The following functional forms are suggested:
$|\hat u_i| = \beta_1 + \beta_2 X_i + v_i$
$|\hat u_i| = \beta_1 + \beta_2 \sqrt{X_i} + v_i$
$|\hat u_i| = \beta_1 + \beta_2 \frac{1}{X_i} + v_i$
$|\hat u_i| = \beta_1 + \beta_2 \frac{1}{\sqrt{X_i}} + v_i$
$|\hat u_i| = \sqrt{\beta_1 + \beta_2 X_i} + v_i$
$|\hat u_i| = \sqrt{\beta_1 + \beta_2 X_i^2} + v_i$
For large samples the first four give generally satisfactory results.
The last two models are nonlinear in the parameters.
Note: some have argued that $v_i$ does not have a zero expected value, is serially correlated, and is itself heteroscedastic.

Spearman's Rank Correlation Test
Fit the regression to the data on Y and X and estimate the residuals.
Rank both the absolute values of the residuals and $X_i$ (or the estimated $\hat Y_i$), and compute Spearman's rank correlation coefficient:
$r_s = 1 - 6 \left[ \frac{\sum d_i^2}{n(n^2 - 1)} \right]$
where $d_i$ is the difference in the ranks for the $i$th observation.
Assuming that the population rank correlation coefficient is zero and $n > 8$, the significance of the sample $r_s$ can be tested by the t test, with df = $n - 2$:
$t = \frac{r_s \sqrt{n - 2}}{\sqrt{1 - r_s^2}}$
If the computed t value exceeds the critical t value, we may accept the hypothesis of heteroscedasticity.

Goldfeld-Quandt Test
Rank the observations according to their $X_i$ values.
Omit $c$ central observations, and divide the remaining observations into two groups of $(n - c)/2$ observations each.
Fit separate OLS regressions to the first and last sets of observations, and obtain the residual sums of squares $RSS_1$ and $RSS_2$.
Compute the ratio
$\lambda = \frac{RSS_2 / df}{RSS_1 / df}, \qquad df = \frac{n - c}{2} - k$
where $k$ is the number of parameters estimated, including the intercept.
If the $u_i$ are assumed to be normally distributed, and if the assumption of homoscedasticity is valid, then it can be shown that $\lambda$ follows the F distribution with df degrees of freedom in both the numerator and the denominator.
The power of the test depends on how $c$ is chosen.
Goldfeld and Quandt suggest $c = 8$ if $n = 30$ and $c = 16$ if $n = 60$.
Judge et al. note that $c = 4$ if $n = 30$ and $c = 10$ if $n$ is about 60.
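The Goldfeld-Quandt steps above can be sketched as follows for the two-variable model. The data are simulated, the function names `rss` and `goldfeld_quandt` are hypothetical, and the degrees of freedom are taken as $(n - c)/2 - k$ with $k = 2$ estimated parameters, which is an assumption of this sketch. The statistic should still be compared with a tabulated F critical value.

```python
import numpy as np

def rss(x, y):
    """Residual sum of squares from an OLS fit of y on a constant and x."""
    X = np.column_stack([np.ones_like(x), x])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coef
    return float(resid @ resid)

def goldfeld_quandt(x, y, c, k=2):
    """Return (lambda, df) for the Goldfeld-Quandt test.

    c: number of central observations to omit after sorting on x.
    k: number of parameters estimated in each sub-regression.
    """
    n = len(x)
    if (n - c) % 2 != 0:
        raise ValueError("n - c must be even so the two groups have equal size")
    order = np.argsort(x)               # step 1: rank observations by X
    xs, ys = x[order], y[order]
    m = (n - c) // 2                    # step 2: size of each outer group
    rss1 = rss(xs[:m], ys[:m])          # step 3: small-X group
    rss2 = rss(xs[-m:], ys[-m:])        #         large-X group
    df = m - k
    lam = (rss2 / df) / (rss1 / df)     # step 4: ratio of mean squares
    return lam, df

# Simulated data (hypothetical): the error sd is proportional to X.
rng = np.random.default_rng(2)
n = 60
x = rng.uniform(1.0, 10.0, size=n)
y = 2.0 + 3.0 * x + rng.normal(0.0, 0.5 * x)

lam, df = goldfeld_quandt(x, y, c=10)   # c = 10 for n about 60 (Judge et al.)
print(f"lambda = {lam:.2f} with ({df}, {df}) degrees of freedom")
```

The ratio is written with the large-X group in the numerator because the simulated variance increases with X; if the variance were suspected to fall with X, the groups would be swapped.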

Breusch-Pagan-Godfrey Test
The success of the GQ test depends on $c$ and on the X variable by which the observations are ordered. The BPG test avoids these choices.
1. Estimate $Y_i = \beta_1 + \beta_2 X_{2i} + \dots + \beta_k X_{ki} + u_i$ by OLS and obtain the residuals $\hat u_i$.
2. Obtain $\tilde\sigma^2 = \sum \hat u_i^2 / n$ (the ML estimator of $\sigma^2$).
3. Construct variables $p_i$ defined as $p_i = \hat u_i^2 / \tilde\sigma^2$.
4. Regress $p_i$ on the Z's: $p_i = \alpha_1 + \alpha_2 Z_{2i} + \dots + \alpha_m Z_{mi} + v_i$. Here $\sigma_i^2$ is assumed to be a linear function of the Z's, and some or all of the X's can serve as Z's.
5. Obtain the ESS (explained sum of squares) from this regression and define $\Theta = \frac{1}{2} ESS$.
Assuming the $u_i$ are normally distributed, one can show that if there is homoscedasticity and if the sample size $n$ increases indefinitely, then $\Theta \sim \chi^2_{m-1}$ asymptotically.
The BPG test is an asymptotic, or large-sample, test.

White's General Heteroscedasticity Test
The test does not rely on the normality assumption and is easy to implement.
Estimate $Y_i = \beta_1 + \beta_2 X_{2i} + \beta_3 X_{3i} + u_i$ and obtain the residuals $\hat u_i$.
Run the following auxiliary regression:
$\hat u_i^2 = \alpha_1 + \alpha_2 X_{2i} + \alpha_3 X_{3i} + \alpha_4 X_{2i}^2 + \alpha_5 X_{3i}^2 + \alpha_6 X_{2i} X_{3i} + v_i$
Higher powers of the regressors can also be introduced.
Under the null hypothesis (homoscedasticity), if the sample size $n$ increases indefinitely, it can be shown that $n R^2 \sim \chi^2_{df}$ asymptotically, where $R^2$ comes from the auxiliary regression and df is the number of regressors in it, excluding the constant.
If the chi-square value exceeds the critical value, the conclusion is that there is heteroscedasticity. If it does not, there is no heteroscedasticity, which is to say that $\alpha_2 = \alpha_3 = \alpha_4 = \alpha_5 = \alpha_6 = 0$.
It has been argued that if cross-product terms are present, then it is a test of both heteroscedasticity and specification bias.

Remedial Measures
Heteroscedasticity does not destroy unbiasedness and consistency, but OLS estimators are no longer efficient, not even asymptotically.
There are two approaches to remediation: one when $\sigma_i^2$ is known, and one when $\sigma_i^2$ is not known.
When $\sigma_i^2$ is known: the most straightforward method of correcting heteroscedasticity is weighted least squares. The WLS method provides BLUE estimators.
When $\sigma_i^2$ is unknown: is there a way of obtaining consistent estimates of the variances and covariances of OLS estimators even if there is heteroscedasticity? The answer is yes.
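Returning briefly to the two large-sample tests described above, the Breusch-Pagan-Godfrey statistic $\Theta = \frac{1}{2} ESS$ and the White statistic $nR^2$ can both be computed with plain least squares. The sketch below uses simulated data with two regressors; the function names, the choice of the X's as the Z's, and the simulated variance pattern are assumptions for illustration only.

```python
import numpy as np

def ols_fit(X, y):
    """Return (coefficients, fitted values, residuals) from OLS of y on X."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    fitted = X @ coef
    return coef, fitted, y - fitted

def bpg_statistic(X, y, Z):
    """Breusch-Pagan-Godfrey: return (Theta, df), Theta = ESS / 2, df = m - 1.

    X and Z must each contain a column of ones.
    """
    n = len(y)
    _, _, u = ols_fit(X, y)
    sigma2_ml = (u @ u) / n                  # ML estimator of sigma^2
    p = u**2 / sigma2_ml                     # constructed variable p_i
    _, p_fit, _ = ols_fit(Z, p)
    ess = float(np.sum((p_fit - p.mean()) ** 2))
    return 0.5 * ess, Z.shape[1] - 1

def white_statistic(x2, x3, y):
    """White's test: return (n * R^2, df) using levels, squares, cross product."""
    n = len(y)
    X = np.column_stack([np.ones(n), x2, x3])
    _, _, u = ols_fit(X, y)
    u_sq = u**2
    A = np.column_stack([np.ones(n), x2, x3, x2**2, x3**2, x2 * x3])
    _, fit, _ = ols_fit(A, u_sq)
    r2 = 1.0 - np.sum((u_sq - fit) ** 2) / np.sum((u_sq - u_sq.mean()) ** 2)
    return float(n * r2), A.shape[1] - 1

# Simulated data (hypothetical): the error sd depends on x2 only.
rng = np.random.default_rng(3)
n = 500
x2 = rng.uniform(1.0, 10.0, size=n)
x3 = rng.uniform(1.0, 10.0, size=n)
y = 1.0 + 2.0 * x2 - 1.0 * x3 + rng.normal(0.0, 0.5 * x2)

X = np.column_stack([np.ones(n), x2, x3])
theta, df_bpg = bpg_statistic(X, y, Z=X)     # the X's serve as the Z's
white, df_white = white_statistic(x2, x3, y)

# 5 percent chi-square critical values: 5.991 (2 df) and 11.070 (5 df).
print(f"BPG:   Theta = {theta:.1f}, df = {df_bpg}")
print(f"White: nR^2  = {white:.1f}, df = {df_white}")
```

Statistics above the corresponding chi-square critical value lead to rejecting homoscedasticity. Both results are asymptotic, so small samples call for caution.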

White's Correction
White has suggested a procedure by which asymptotically valid statistical inferences can be made about the true parameter values.
Several computer packages present White's heteroscedasticity-corrected variances and standard errors along with the usual OLS variances and standard errors.
White's heteroscedasticity-corrected standard errors are also known as robust standard errors.

White's Procedure
For the two-variable regression model $Y_i = \beta_1 + \beta_2 X_{2i} + u_i$ we showed:
$\mathrm{var}(\hat\beta_2) = \frac{\sum x_i^2 \sigma_i^2}{(\sum x_i^2)^2}$
where $x_i$ is $X_{2i}$ measured as a deviation from its sample mean. White has shown that
$\frac{\sum x_i^2 \hat u_i^2}{(\sum x_i^2)^2}$
is a consistent estimator of $\mathrm{var}(\hat\beta_2)$.
For $Y_i = \beta_1 + \beta_2 X_{2i} + \beta_3 X_{3i} + \dots + \beta_k X_{ki} + u_i$ we have:
$\mathrm{var}(\hat\beta_j) = \frac{\sum \hat w_{ji}^2 \hat u_i^2}{(\sum \hat w_{ji}^2)^2}$
where $\hat u_i$ are the residuals obtained from the original regression and $\hat w_{ji}$ are the residuals obtained from the auxiliary regression of the regressor $X_j$ on the remaining regressors.

Example
Y = per capita expenditure on public schools by state in 1979
Income = per capita income by state in 1979
On the basis of the usual OLS standard errors, both regressors are statistically significant at the 5 percent level, whereas on the basis of the White estimators they are not.
Since robust standard errors are now available in established regression packages, it is recommended to report them.
The WHITE option can be used to compare the output with regular OLS output as a check for heteroscedasticity.

Reasonable Heteroscedasticity Patterns
Apart from being a large-sample procedure, one drawback of the White procedure is that the estimators thus obtained may not be as efficient as those obtained by methods that transform the data to reflect specific types of heteroscedasticity.
We may consider several assumptions about the pattern of heteroscedasticity.
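As a numerical aside on White's procedure for the two-variable model, the robust variance of the slope replaces the unknown $\sigma_i^2$ with the squared residual $\hat u_i^2$, giving $\sum x_i^2 \hat u_i^2 / (\sum x_i^2)^2$ with $x_i$ in deviation form. The sketch below computes this next to the usual OLS standard error. The data are simulated and the function name `slope_standard_errors` is hypothetical. No finite-sample adjustment is applied, which is an assumption of this sketch (packages often offer adjusted variants).

```python
import numpy as np

def slope_standard_errors(x, y):
    """Return (b2, usual OLS se, White robust se) for Y = b1 + b2*X + u."""
    n = len(x)
    xd = x - x.mean()                       # X in deviation form
    sxx = np.sum(xd**2)
    b2 = np.sum(xd * (y - y.mean())) / sxx
    b1 = y.mean() - b2 * x.mean()
    u = y - b1 - b2 * x                     # OLS residuals
    # Usual formula: a single sigma^2 estimated by RSS / (n - 2).
    se_ols = np.sqrt((u @ u) / (n - 2) / sxx)
    # White's formula: sum(x_i^2 * u_i^2) / (sum(x_i^2))^2.
    se_white = np.sqrt(np.sum(xd**2 * u**2) / sxx**2)
    return float(b2), float(se_ols), float(se_white)

# Simulated data (hypothetical): the error sd is proportional to X.
rng = np.random.default_rng(4)
n = 500
x = rng.uniform(1.0, 10.0, size=n)
y = 2.0 + 3.0 * x + rng.normal(0.0, 0.5 * x)

b2, se_ols, se_white = slope_standard_errors(x, y)
print(f"slope = {b2:.3f}, usual se = {se_ols:.4f}, White se = {se_white:.4f}")
```

The robust standard error can be larger or smaller than the usual one depending on how the error variance relates to X, which is why reporting both is a useful check.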

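Assumption 1: if $E(u_i^2) = \sigma^2 X_i^2$, divide the model through by $X_i$.
Assumption 2: if $E(u_i^2) = \sigma^2 X_i$, divide the model through by $\sqrt{X_i}$.
Assumption 3: if $E(u_i^2) = \sigma^2 [E(Y_i)]^2$, divide the model through by $\hat Y_i$, the estimated mean value of $Y_i$.
Assumption 4: a log transformation such as $\ln Y_i = \beta_1 + \beta_2 \ln X_i + u_i$ very often reduces heteroscedasticity.

Homework 5
Basic Econometrics (Gujarati, 2003)
1. Chapter 11, Problem 15 [50 points]
2. Chapter 11, Problem 16 [50 points]
Assignment weight factor = 0.5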