
Econ 388 R. Butler 2014 revisions
Lecture 15

I. HETEROSKEDASTICITY: both pure and impure (the impure version is due to an omitted regressor that is correlated with the included regressors in the model)

Heteroskedasticity arises when the variance of the error is not constant across the observations, that is, when the assumption V(εi) = σ² is violated, so that V(εi) = σi² instead. That is, the variance changes for at least some observations.

A. pure heteroskedasticity--there are no correlated omitted variables that cause the variance to change
1. so if σi² = f(Zi), then we still get PURE heteroskedasticity whenever Z is
   a. one of the independent variables already in the model, or
   b. not an independent variable in our model but not supposed to be in the model (it is not an omitted variable), or
   c. not an independent variable in the model but supposed to be (it is an omitted variable), BUT uncorrelated with all the other independent variables.
2. more generally, if the variance is some function of omitted variables, namely Z1 and Z2, with σi² = f(Z1, Z2), AND at least one of these should be in the regression but is NOT, AND the one that should be in the regression is correlated with the variables that are included as the independent variables, then we have IMPURE heteroskedasticity. Otherwise, we have PURE heteroskedasticity.
3. if there is pure heteroskedasticity, then the following holds:
   a. the estimated coefficients are unbiased, but not efficient (in particular, OLS estimates are no longer BLUE). OLS will not be asymptotically efficient either. There is another kind of estimator, called a generalized least squares (GLS) or weighted least squares (WLS) estimator, that will be more efficient and also unbiased.
   b. estimates of the variances are biased, thus invalidating tests of significance (t-tests and p-values)
   c. IF σi² is positively correlated with (Xik - X̄k)², which is sometimes the case with economic data, then the expected value of the estimated variance will be smaller than the true variance. Hence, OLS would be understating the true variance, and the resulting t-statistics would be too high (p-values too low).
We aren't sure of the direction of the bias otherwise (the more general case).

B. impure heteroskedasticity: the heteroskedasticity is due to an omitted independent variable that is correlated with one or more of the included independent variables. In this case, the heteroskedasticity would be associated with biased coefficients (unlike the PURE case), and the estimated variances of the associated coefficients would also be biased. Wooldridge focuses on the case of pure heteroskedasticity in Chapter 8.
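The claims in 3a-3c above are easy to verify by simulation. Below is a minimal Monte Carlo sketch (in Python, not part of the course's Stata code; all names and parameter values are illustrative assumptions) in which the error variance grows with (x - x̄)²: the OLS slope stays unbiased, but the conventional standard error understates the slope's true sampling variability, just as 3c predicts.

```python
import numpy as np

# Monte Carlo sketch of pure heteroskedasticity: error sd rises with (x - xbar)^2.
rng = np.random.default_rng(42)
n, reps, beta0, beta1 = 100, 2000, 1.0, 2.0
x = np.linspace(0.0, 1.0, n)
sd = 0.2 + 2.0 * (x - x.mean()) ** 2        # error sd grows with (x - xbar)^2
Sxx = ((x - x.mean()) ** 2).sum()

slopes, conv_se = [], []
for _ in range(reps):
    y = beta0 + beta1 * x + rng.normal(0.0, sd)
    b1 = ((x - x.mean()) * (y - y.mean())).sum() / Sxx   # OLS slope
    b0 = y.mean() - b1 * x.mean()
    resid = y - b0 - b1 * x
    s2 = (resid ** 2).sum() / (n - 2)                    # conventional s^2
    slopes.append(b1)
    conv_se.append(np.sqrt(s2 / Sxx))                    # conventional SE of b1

print("mean OLS slope (near 2.0, so unbiased):", np.mean(slopes))
print("true sampling sd of the slope:", np.std(slopes))
print("average conventional SE (understated):", np.mean(conv_se))
```

With this design the average conventional SE comes out well below the Monte Carlo standard deviation of the slope, which is exactly the "t-statistics too high" problem described in 3c.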

II. Robust standard errors. With pure heteroskedasticity, the OLS estimates are unbiased (and consistent--how would you prove this?), but the standard errors may be biased (and inconsistent). There are two general approaches to handling the heteroskedasticity problem: 1) use weighted least squares, in which you get both new standard errors and new estimated coefficients (the standard errors may change a lot; the estimated coefficients will typically not change by very much); or 2) keep the OLS estimated coefficients (they are, after all, unbiased and consistent), but adjust the standard errors. The first approach (Approach 1) was standard until recently, but suffers because you have to model the form of heteroskedasticity, and you are never sure if you are modeling it correctly. So the latter approach (Approach 2) is now becoming standard since robust standard errors have been discovered. These automatically adjust for any (unknown) form of heteroskedasticity (so with pure heteroskedasticity and a large sample, you are probably getting the correct results). Besides being called robust standard errors, they are also known as heteroskedastic-consistent estimates, White estimates, White-adjusted standard errors, etc. (see chapter 8 for more names). Hypothesis testing proceeds using the OLS estimates of the coefficients and the heteroskedastic-consistent standard errors.

Getting the robust standard errors is especially easy in Stata--just use the robust option as follows:

regress y x1 x2 x3, robust;

In SAS, there are many ways to get robust standard errors, but probably the easiest is as follows:

proc genmod;
  class id;
  model y = x1 x2 x3;
  repeated subject=id;
run;

example using the Utah CPS data [[[[[ut_cps_hetcov.do]]]]]

Heteroskedasticity will also invalidate the usual F-tests of statistical significance. So Wooldridge outlines a heteroskedastic-consistent approach to testing linear restrictions in chapter 8.
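For concreteness, the "sandwich" formula behind these robust standard errors can be written out directly. The sketch below (Python on simulated data, not the course's Stata code) computes HC0, the simplest member of the White family; Stata's robust option applies an additional finite-sample adjustment, so its numbers differ slightly.

```python
import numpy as np

# Sketch of the heteroskedasticity-consistent "sandwich" covariance (HC0).
rng = np.random.default_rng(0)
n = 200
x1, x2 = rng.normal(size=n), rng.normal(size=n)
X = np.column_stack([np.ones(n), x1, x2])
y = 1.0 + 0.5 * x1 - 0.3 * x2 + rng.normal(0.0, 1.0 + x1 ** 2)  # heteroskedastic errors

XtX_inv = np.linalg.inv(X.T @ X)
b = XtX_inv @ X.T @ y            # OLS coefficients (kept as-is under Approach 2)
e = y - X @ b                    # OLS residuals

# conventional covariance: s^2 (X'X)^{-1}
s2 = e @ e / (n - X.shape[1])
V_conv = s2 * XtX_inv

# HC0 robust covariance: (X'X)^{-1} X' diag(e_i^2) X (X'X)^{-1}
meat = X.T @ (X * (e ** 2)[:, None])
V_hc0 = XtX_inv @ meat @ XtX_inv

print("conventional SEs:", np.sqrt(np.diag(V_conv)))
print("robust (HC0) SEs:", np.sqrt(np.diag(V_hc0)))
```

Note that only the standard errors change; the coefficient vector b is the same under both covariance estimates, which is the whole point of Approach 2.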
This approach is illustrated with the program below (with some of the relevant output following): STATA:

* program to do hetcov robust standard errors and show a robust "F-tests" wooldridge;
* for heterosk-robust LM stat, see p. 401 in Davidson/MacKinnon;
* ESTIMATION & INFERENCE in ECONOMETRICS;
(((bunch of preliminary code to read and create variables)))
gen ones = 1;
gen lnwage = log(wklywg);
* testing education variables assuming homoskedasticity;
regress lnwage age white male no_h_sc high_sch some_col college exec tech_sal serv_occ oper_occ

test (no_h_sc=0) (high_sch=0) (some_col=0) (college=0);
regress lnwage age white male no_h_sc high_sch some_col college exec tech_sal serv_occ oper_occ ag_cnstr manuf trade pub_admin, robust;
* TESTING EDUCATION VARIABLES while allowing for heteroskedastic errors--see references at top;
* get the residuals from the restr. model (next) and multiply them by residuals from the;
* four auxiliary regressions of the omitted variables, then regress ones on all of these;
regress lnwage age white male exec tech_sal serv_occ oper_occ ag_cnstr manuf trade pub_admin;
predict uhat, residuals;
regress no_h_sc age white male exec tech_sal serv_occ oper_occ
predict no_uhat, residuals;
regress high_sch age white male exec tech_sal serv_occ oper_occ
predict h_uhat, residuals;
regress some_col age white male exec tech_sal serv_occ oper_occ
predict som_uhat, residuals;
regress college age white male exec tech_sal serv_occ oper_occ
predict col_uhat, residuals;
gen uhat1 = uhat*no_uhat;
gen uhat2 = uhat*h_uhat;
gen uhat3 = uhat*som_uhat;
gen uhat4 = uhat*col_uhat;
regress ones uhat1 uhat2 uhat3 uhat4, noconstant;
gen lm_heter = e(N) - e(rss);
*rss = sum of squared residuals;
sum lm_heter;

SOME RELEVANT OUTPUT FOLLOWS:

. * testing education variables assuming homoskedasticity;
. regress lnwage age white male no_h_sc high_sch some_col college exec tech_sal serv_occ oper_occ

      Source |       SS      df       MS           Number of obs =    194
-------------+------------------------------       F( 15,   178) =   4.30
       Model |   3.0048301   15   .13365534        Prob > F      = 0.0000
    Residual |     88.4508  178   .49677083        R-squared     =  0.658
-------------+------------------------------       Adj R-squared =  0.039
       Total |   10.430038  193   .63989835        Root MSE      =  .7048

      lnwage |     Coef.   Std. Err.      t    P>|t|    [95% Conf.
Interval] age.0105187.004351.43 0.016.0019836.0190538 whte -.3686956.36178-1.0 0.309-1.08164.344333 male.193335.113989 1.9 0.056 -.005598.44465 no_h_sc -.8134617.869731 -.83 0.005-1.379769 -.471545 hgh_sch -.7969473.30653-3.46 0.001-1.51349 -.345461 some_col -.573499.16494 -.65 0.009-1.000597 -.1464015 college -.3698765.8009-1.6 0.107 -.8198131.0800601 exec.756144.3040339 0.91 0.366 -.34360.875589 tech_sal -.0885604.3174986-0.8 0.781 -.715106.5379851 serv_occ -.1359019.303503-0.45 0.655 -.73489.463055 oper_occ.1333178.91388 0.46 0.648 -.4414073.708048 ag_cnstr.3985858.1995688.00 0.047.0047605.794111 manuf.1959387.1610785 1. 0.5 -.119305.513808 trade.96175.1618434 1.83 0.069 -.03038.6155537 pub_admn.68051.307576 1.16 0.47 -.187305.73447 _cons 6.45909.543331 11.50 0.000 5.173708 7.31811. test (no_h_sc=0) (hgh_sch=0) (some_col=0) (college=0); 3

( 1)  no_h_sc = 0
( 2)  high_sch = 0
( 3)  some_col = 0
( 4)  college = 0

       F(  4,   178) =   3.42
            Prob > F = 0.0100

. regress lnwage age white male no_h_sc high_sch some_col college exec tech_sal serv_occ oper_occ ag_cnstr manuf trade pub_admin, robust;

Regression with robust standard errors           Number of obs =    194
                                                 F( 15,   178) =   6.75
                                                 Prob > F      = 0.0000
                                                 R-squared     =  0.658
                                                 Root MSE      =  .7048

             |    Robust
      lnwage |     Coef.   Std. Err.      t    P>|t|    [95% Conf. Interval]
         age |   .0105187    .0056464   1.86   0.064   -.000638     .016611
       white |  -.3686956    .19091    -1.93   0.055   -.7454337    .008045
        male |   .193335     .113573    1.93   0.055   -.0047893    .4434563
     no_h_sc |  -.8134617    .1969169  -4.13   0.000   -1.0054     -.448696
    high_sch |  -.7969473    .7590     -3.50   0.001   -1.4607     -.347851
    some_col |  -.573499     .1696881  -3.38   0.001   -.9083585   -.386399
     college |  -.3698765    .18015    -1.74   0.084   -.7898149    .0500618
        exec |   .756144     .60141     1.06   0.91    -.378871     .789116
    tech_sal |  -.0885604    .537479   -0.35   0.77    -.5893017    .411808
    serv_occ |  -.1359019    .470563   -0.55   0.583   -.63438      .3516343
    oper_occ |   .1333178    .30161     0.60   0.551   -.3067779    .5734134
    ag_cnstr |   .3985858    .145866    3.0    0.00     .1579       .644447
       manuf |   .1959387    .1415148   1.38   0.168   -.083338     .475013
       trade |   .96175      .1398489    .1    0.036    .001998     .57150
   pub_admin |   .68051      .038       1.33   0.187   -.131015     .6671167
       _cons |   6.45909     .4179341  14.94   0.000    5.41166     7.07065

. * TESTING EDUCATION VARIABLES while allowing for heteroskedastic errors;
. regress lnwage age white male exec tech_sal serv_occ oper_occ
>          ag_cnstr manuf trade pub_admin;

      Source |       SS      df       MS           Number of obs =    194
-------------+------------------------------       F( 11,   182) =   4.38
       Model |    5.01097    11   .9110088         Prob > F      = 0.0000
    Residual |    95.7984   182   .5330376         R-squared     =  0.093
-------------+------------------------------       Adj R-squared = 0.1615
       Total |  10.430038   193   .63989835        Root MSE      =  .7335

      lnwage |     Coef.   Std. Err.      t    P>|t|    [95% Conf.
Interval] age.0119868.0043631.75 0.007.0033779.005956 whte -.349317.3685086-0.95 0.344-1.0764.3777767 male.61855.115433.7 0.04.034115.4895948 exec.5035434.30145 1.67 0.097 -.095736 1.09966 tech_sal -.0004907.39976-0.00 0.999 -.63779.6368106 serv_occ -.106765.3080585-0.35 0.79 -.7145878.501068 oper_occ.14154.97859 0.48 0.635 -.4461664.79111 ag_cnstr.995889.07568 1.48 0.141 -.100467.6996451 manuf.1779896.160999 1.10 0.74 -.1418471.49786 trade.076753.161685 1.8 0.01 -.111348.566935 pub_admn.384559.331479 1.41 0.161 -.1315645.7884763 _cons 5.49571.5067873 10.84 0.000 4.495787 6.495655. predct uhat, resduals; (93 mssng values generated). regress no_h_sc age whte male exec tech_sal serv_occ oper_occ > 4

. predict no_uhat, residuals;
. regress high_sch age white male exec tech_sal serv_occ oper_occ
>
. predict h_uhat, residuals;
. regress some_col age white male exec tech_sal serv_occ oper_occ
>
. predict som_uhat, residuals;
. regress college age white male exec tech_sal serv_occ oper_occ
>
((((NOTE: I LEFT OUT THE ABOVE REGRESSION RESULTS with school dummy dep var))))

. regress ones uhat1 uhat2 uhat3 uhat4, noconstant;

      Source |       SS      df       MS           Number of obs =    194
-------------+------------------------------       F(  4,   190) =   3.24
       Model |  12.3835641    4  3.09589103        Prob > F      = 0.0134
    Residual |  181.616436  190  .955875978        R-squared     = 0.0638
-------------+------------------------------       Adj R-squared = 0.0441
       Total |         194  194           1        Root MSE      = .97769

        ones |     Coef.   Std. Err.      t    P>|t|    [95% Conf. Interval]
       uhat1 |   -.4635      .7839338   -.86   0.005   -3.788967   -.6963038
       uhat2 |  -1.309697    .4139533  -3.16   0.00    -.163       -.493166
       uhat3 |  -1.15455     .4458074   -.59   0.010   -.033919    -.751839
       uhat4 |  -.7433876    .434706   -1.71   0.089   -1.600858    .114085

. gen lm_heter = e(N) - e(rss);
. *rss = sum of squared residuals;
. sum lm_heter;

    Variable |    Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
    lm_heter |   1117    12.38356          0    12.38356   12.38356

The lm_heter statistic (12.38) is Chi-square with four degrees of freedom under the null hypothesis that the schooling dummies are jointly insignificant (remember, this is a large sample test of that hypothesis that is robust to whether or not there is heteroskedasticity in the model). This is statistically significant at slightly larger than the one percent level, close to the test without the heteroskedasticity correction (based on the F-test above).

III. Detecting heteroskedasticity

A. Goldfeld-Quandt test: While this test requires that the researcher identify the factor of proportionality to order the data into thirds (i.e., requires that Z be identified to divide the sample into higher and lower variance groups), it provides an exact test statistic even in relatively small samples.
The validity of the Parks and White tests, on the other hand, hinges on having sample sizes that are suitably large.

The null hypothesis to be investigated is

Ho: σ1² = σ2² = σ3² = ... = σn²

[Figure: scatter of Y against X, with the sample ordered by X and divided into groups I, II, and III.]

To do the test of this hypothesis, we proceed as follows:
a. Divide the data into three groups (roughly equal sizes, n_I + n_II + n_III = n)
b. Run separate regressions on groups I and III. Let s_I² and s_III² represent the corresponding estimators of σ². Under the null hypothesis of homoskedasticity, we have

s_III² / s_I² ~ F(n_III - k, n_I - k)

where k = the number of coefficients including the intercept. *place the larger s² in the numerator for this test. [[[why? this makes a two-tailed test into a one-tailed test. explain]]]

[Figure: the F-distribution density function, with the "fail to reject Ho" region below the critical value and the "reject Ho" region in the right tail.]

Under the null hypothesis one would expect s_III²/s_I² to be fairly close to one, and large differences from one would provide the basis for rejecting the null hypothesis. Illustrated below, after the discussion of the modern approaches.

B. Modern approaches

1. Goldfeld-Quandt assumes that you know how to partition the data, but offers an exact test even in small samples. The modern approaches are all large sample tests, but they assume that you don't know the form of heteroskedasticity except that the variance is correlated with one or more of the independent variables included in the analysis.

2. Breusch-Pagan Test for Heteroskedasticity--see the text for an explanation. This test and the White test are general purpose tests for heteroskedasticity, strictly valid only when the data sets are large. Homoskedasticity suggests that the variance is unrelated to the values of the explanatory variables, whereas in a heteroskedastic model the variance of the errors is related to the values of the independent variables through some function "f" as follows:

σi² = f(X1, X2, ..., Xk)

An important note here. This does not say anything about impure heteroskedasticity, because these Xs on the right hand side of the above equation are always included in the model, and even if one of the Xs were excluded from the model, it would have to be correlated with the included Xs to yield impure heteroskedasticity. In the absence of a visit from the covariance angel, we probably don't know what form "f" takes. Breusch-Pagan suggest that this could be a linear function, so that the test regression takes the form (if k=3, so there are 3 slope regressors in the original model):

ε̂i² = α0 + α1 X1i + α2 X2i + α3 X3i + vi

nR² will be distributed as a Chi-square distribution under the null hypothesis that there is no heteroskedasticity. Large values of this statistic (beyond the critical values) would indicate that there is heteroskedasticity.

3. White Test for Heteroskedasticity. White suggested taking a (second order) Taylor series expansion of σi² = f(X1, X2, ..., Xk) that includes cross product terms (since cross product terms are also relevant in plim arguments).
We again show this for the simple case of three regressors, though the extension to many regressors is straightforward (since you just keep including all the squares and cross product terms of the original regressors):

σi² = f(X1, X2, X3) = α0 + α1 X1 + α2 X2 + α3 X3 + α4 X1² + α5 X2² + α6 X3² + α7 X1X2 + α8 X1X3 + α9 X2X3

Next replace σi² with ε̂i² (ε̂i is the residual from the original model), and run the following regression:

ε̂i² = α0 + α1 X1 + α2 X2 + α3 X3 + α4 X1² + α5 X2² + α6 X3² + α7 X1X2 + α8 X1X3 + α9 X2X3 + vi

If the product (number of observations * R²) is high, then we reject the null hypothesis of homoskedasticity of errors. nR² is distributed as a Chi-square distribution with degrees of freedom equal to the number of non-intercept terms in the last equation. Large values of nR², larger than the critical value for the Chi-square distribution, indicate that the null hypothesis of homoskedastic errors is rejected. Like the F-distribution, the Chi-square is a skewed-to-the-right distribution whose shape depends on the degrees of freedom parameter.

4. Simple Version of the White Test. When the number of regressors is moderate or large, the White test will obviously include a lot of cross product terms that will eat up a lot of degrees of freedom. An alternate White test, not quite as general, is to regress the squared residuals on the predicted value of Y and the predicted value of Y squared. Under the null hypothesis of no heteroskedasticity, the resulting nR² will be a Chi-square distribution with 2 degrees of freedom.

Examples of some of these tests follow: [[[ut_cps_hettest.do]]]]

* testing for heteroskedasticity;
regress lnwage age white male no_h_sc high_sch some_col college exec tech_sal serv_occ oper_occ
predict resids, residuals;
predict yhat;
hettest;
hettest, rhs;
imtest, preserve white;
** take the yhat and resids to form a White's test: est'd var = a0 + a1*yhat + a2*yhat^2;
gen yhatsq = yhat*yhat;
gen residsq = resids*resids;
regress residsq yhat yhatsq;
gen lm_white = e(N)*e(r2);
sum lm_white;

SOME OF THE OUTPUT FOLLOWS:

. regress lnwage age white male no_h_sc high_sch some_col college exec tech_sal serv_occ oper_occ

      Source |       SS      df       MS           Number of obs =    194
-------------+------------------------------       F( 15,   178) =   4.30
       Model |   3.0048301   15   .13365534        Prob > F      = 0.0000
    Residual |     88.4508  178   .49677083        R-squared     =  0.658
-------------+------------------------------       Adj R-squared =  0.039
       Total |   10.430038  193   .63989835        Root MSE      =  .7048

      lnwage |     Coef.   Std. Err.      t    P>|t|    [95% Conf.
Interval] age.0105187.004351.43 0.016.0019836.0190538 whte -.3686956.36178-1.0 0.309-1.08164.344333 male.193335.113989 1.9 0.056 -.005598.44465 no_h_sc -.8134617.869731 -.83 0.005-1.379769 -.471545 hgh_sch -.7969473.30653-3.46 0.001-1.51349 -.345461 some_col -.573499.16494 -.65 0.009-1.000597 -.1464015 college -.3698765.8009-1.6 0.107 -.8198131.0800601 8

exec.756144.3040339 0.91 0.366 -.34360.875589 tech_sal -.0885604.3174986-0.8 0.781 -.715106.5379851 serv_occ -.1359019.303503-0.45 0.655 -.73489.463055 oper_occ.1333178.91388 0.46 0.648 -.4414073.708048 ag_cnstr.3985858.1995688.00 0.047.0047605.794111 manuf.1959387.1610785 1. 0.5 -.119305.513808 trade.96175.1618434 1.83 0.069 -.03038.6155537 pub_admn.68051.307576 1.16 0.47 -.187305.73447 _cons 6.45909.543331 11.50 0.000 5.173708 7.31811. predct resds, resduals; (93 mssng values generated). predct yhat; (opton xb assumed; ftted values). hettest; Breusch-Pagan / Cook-Wesberg test for heteroskedastcty Ho: Constant varance Varables: ftted values of lnwage. hettest, rhs; ch(1) = 3.15 Prob > ch = 0.0758 **est var regressed on yhat, no heterosk at 5 precent level** Breusch-Pagan / Cook-Wesberg test for heteroskedastcty Ho: Constant varance Varables: age whte male no_h_sc hgh_sch some_col college exec tech_sal serv_occ oper_occ ag_cnstr manuf trade pub_admn ch(15) = 46.67 Prob > ch = 0.0000**est var regressed on all the rhs var, heterosk at better than 1 precent level**. mtest, preserve whte; Whte's test for Ho: homoskedastcty aganst Ha: unrestrcted heteroskedastcty ch(76) = 51.44 Prob > ch = 0.986 **est var regressed on x and cross products, no heteroskedastctyl** Cameron & Trved's decomposton of IM-test --------------------------------------------------- Source ch df p ---------------------+----------------------------- Heteroskedastcty 51.44 76 0.986 Skewness 19.39 15 0.1968 Kurtoss 3.7 1 0.0538 ---------------------+----------------------------- Total 74.54 9 0.9080 ---------------------------------------------------. ** take the yhat and resds to form a whtes test: est var=a0+a1 yhat +a yhat^ ;. gen yhatsq=yhat*yhat;. gen resdsq=resds*resds; (93 mssng values generated). 
regress residsq yhat yhatsq;

      Source |       SS      df       MS           Number of obs =    194
-------------+------------------------------       F(  2,   191) =   0.42
       Model |   1.3143371    2  .657168559        Prob > F      = 0.6555
    Residual |  296.511071  191  1.55241398        R-squared     = 0.0044
-------------+------------------------------       Adj R-squared = -0.0060
       Total |  297.825408  193  1.54313683        Root MSE      =  1.246

     residsq |     Coef.   Std. Err.      t    P>|t|    [95% Conf. Interval]
        yhat |   .0489963    4.985543   0.01   0.99    -9.784797    9.8879

      yhatsq |  -.006454     .4091188  -0.05   0.960   -.876167     .786358
       _cons |   .9180861   15.14971    0.06   0.95    -8.96413    30.80031

. gen lm_white = e(N)*e(r2);
. sum lm_white;

    Variable |    Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
    lm_white |   1117    .8561439          0    .8561439    .8561439

**est'd var regressed on yhat and yhatsq, chi-square with 2 degrees of freedom, no heteroskedasticity**

In SAS, use the SPEC option to test for heteroskedasticity:

proc reg;
  model y = x1 x2 x3 / spec;

or for another variant of White's test for heteroskedasticity, use proc model:

proc model;
  parms b0-b3;
  y = b0 + b1*x1 + b2*x2 + b3*x3;
  fit y / white;
run;

Both of these produce a White-like test using the Xs and cross products of the Xs; again the null is no heteroskedasticity.

[[[[[[ 3x5 Quiz: Testing for heteroskedasticity using the professors' salaries as the example. Below is the regression of the squared residuals on experience and the square of experience (the original regression, from which the residuals were calculated, was the professors' salaries regressed on experience):

RESID_SQUARE = 446043 + 35440 experien - 9364 exper_sq

Predictor      Coef      StDev        T      P
Constant     446043    4419907     0.06   0.956
experien      35440    7404898     0.44   0.674
exper_sq      -9364      60656    -0.36   0.730

S = 3604350   R-Sq = .041   R-Sq(adj) = 0.0%

a. What type of test for heteroskedasticity is being employed here?
b. Is there evidence of heteroskedasticity? Explain. ]]]]]]
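The nR² machinery behind hettest and the quiz above can be sketched directly (Python on simulated data; variable names and values are illustrative assumptions, not the course dataset). The same helper computes both the Breusch-Pagan statistic (squared residuals on the regressors) and the full White statistic (adding squares and the cross product):

```python
import numpy as np
from scipy import stats

# nR^2 sketch of the Breusch-Pagan and White heteroskedasticity tests.
rng = np.random.default_rng(7)
n = 300
x1, x2 = rng.normal(size=n), rng.normal(size=n)
X = np.column_stack([np.ones(n), x1, x2])
y = 1.0 + x1 + x2 + rng.normal(0.0, np.exp(0.5 * x1))  # variance depends on x1

b, *_ = np.linalg.lstsq(X, y, rcond=None)
e2 = (y - X @ b) ** 2                        # squared OLS residuals

def n_r_squared(Z, e2):
    """Regress e^2 on Z (plus a constant) and return n * R^2."""
    Zc = np.column_stack([np.ones(len(e2)), Z])
    g, *_ = np.linalg.lstsq(Zc, e2, rcond=None)
    u = e2 - Zc @ g
    r2 = 1.0 - (u @ u) / ((e2 - e2.mean()) @ (e2 - e2.mean()))
    return len(e2) * r2

# Breusch-Pagan: e^2 on the regressors (chi-square, 2 df here)
bp = n_r_squared(np.column_stack([x1, x2]), e2)
# White: add squares and the cross product (chi-square, 5 df here)
white = n_r_squared(np.column_stack([x1, x2, x1**2, x2**2, x1 * x2]), e2)
print("Breusch-Pagan nR^2 =", bp, " p =", 1.0 - stats.chi2.cdf(bp, 2))
print("White nR^2 =", white, " p =", 1.0 - stats.chi2.cdf(white, 5))
```

Since the White regressors nest the Breusch-Pagan regressors, the White R² (and hence its nR²) can never be smaller; the trade-off is the extra degrees of freedom, which is exactly the motivation given above for the simple yhat/yhat² version.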