CHAPTER 8 SOLUTIONS TO PROBLEMS

8.1 Parts (ii) and (iii). The homoskedasticity assumption played no role in Chapter 5 in showing that OLS is consistent. But we know that heteroskedasticity causes statistical inference based on the usual t and F statistics to be invalid, even in large samples. Because heteroskedasticity is a violation of the Gauss-Markov assumptions, OLS is no longer BLUE.

8.2 With Var(u|inc, price, educ, female) = σ²inc², h(x) = inc², where h(x) is the heteroskedasticity function defined in equation (8.21). Therefore, √h(x) = inc, and so the transformed equation is obtained by dividing the original equation by inc:

beer/inc = β0(1/inc) + β1 + β2(price/inc) + β3(educ/inc) + β4(female/inc) + (u/inc).

Notice that β1, which is the slope on inc in the original model, is now the intercept (a constant) in the transformed equation. This is simply a consequence of the form of the heteroskedasticity and the functional forms of the explanatory variables in the original equation.

8.3 False. The unbiasedness of WLS and OLS hinges crucially on Assumption MLR.3, and, as we know from Chapter 4, this assumption is often violated when an important variable is omitted. When MLR.3 does not hold, both WLS and OLS are biased. Without specific information on how the omitted variable is correlated with the included explanatory variables, it is not possible to determine which estimator has the smaller bias. It is possible that WLS would have more bias than OLS, or less.

8.4 (i) These variables have the anticipated signs. If a student takes courses where grades are, on average, higher (as reflected by higher crsgpa), then his/her grades will be higher. The better the student has been in the past, as measured by cumgpa, the better the student does (on average) in the current semester. Finally, tothrs is a measure of experience, and its coefficient indicates an increasing return to experience.

The t statistic for crsgpa is very large, over five using the usual standard error (which is the larger of the two). Using the robust standard error for cumgpa, its t statistic is about 2.61, which is also significant at the 5% level. The t statistic for tothrs is only about 1.17 using either standard error, so it is not significant at the 5% level.

(ii) This is easiest to see without other explanatory variables in the model. If crsgpa were the only explanatory variable, H0: β_crsgpa = 1 means that, without any information about the student, the best predictor of term GPA is the average GPA in the student's courses; this holds essentially by definition. (The intercept would be zero in this case.) With additional explanatory variables it is not necessarily true that β_crsgpa = 1, because crsgpa could be correlated with characteristics of the student. (For example, perhaps the courses students take are influenced by ability, as measured by test scores and past college performance.) But it is still interesting to test this hypothesis.

The t statistic using the usual standard error is t = (.900 − 1)/.175 ≈ −.57; using the heteroskedasticity-robust standard error gives t ≈ −.60. In either case we fail to reject H0: β_crsgpa = 1 at any reasonable significance level, certainly including 5%.

(iii) The in-season effect is given by the coefficient on season, which implies that, other things equal, an athlete's GPA is about .16 points lower when his/her sport is competing. The t statistic using the usual standard error is about −1.60, while that using the robust standard error is about −1.96. Against a two-sided alternative, the t statistic using the robust standard error is just significant at the 5% level (the standard normal critical value is 1.96), while using the usual standard error the t statistic is not quite significant at the 10% level (cv ≈ 1.65). So the standard error used makes a difference in this case. This example is somewhat unusual, as the robust standard error is more often the larger of the two.

8.5 (i) No. For each coefficient, the usual standard errors and the heteroskedasticity-robust ones are practically very similar.

(ii) The effect is −.029(4) = −.116, so the probability of smoking falls by about .116.

(iii) As usual, we compute the turning point in the quadratic: .020/[2(.00026)] ≈ 38.46, so about 38 and one-half years.

(iv) Holding other factors in the equation fixed, a person in a state with restaurant smoking restrictions has a .101 lower chance of smoking. This is similar to the effect of having four more years of education.

(v) We just plug the values of the independent variables into the OLS regression line:

smokes-hat = .656 − .069·log(67.44) + .012·log(6,500) − .029(16) + .020(77) − .00026(77²) ≈ .0052.

Thus, the estimated probability of smoking for this person is close to zero. (In fact, this person is not a smoker, so the equation predicts well for this particular observation.)
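The two hand calculations above, the t statistic for H0: β_crsgpa = 1 in 8.4(ii) and the turning point of the age quadratic in 8.5(iii), can be sketched in a few lines. This is only an illustration using the rounded values reported above; the helper names are ours, not from any package:

```python
# t statistic for testing H0: beta = hypothesized value (not just zero)
def t_stat(bhat, hypothesized, se):
    return (bhat - hypothesized) / se

# 8.4(ii): crsgpa coefficient .900 with usual standard error .175
t_usual = t_stat(0.900, 1.0, 0.175)

# Turning point of b1*x + b2*x^2, located at x* = -b1/(2*b2)
def turning_point(b1, b2):
    return -b1 / (2 * b2)

# 8.5(iii): age coefficients .020 and -.00026
age_star = turning_point(0.020, -0.00026)
```

Both values reproduce the text: t_usual is about −.57 and age_star is about 38.46.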
SOLUTIONS TO COMPUTER EXERCISES

8.6 (i) Given the equation

sleep = β0 + β1 totwrk + β2 educ + β3 age + β4 age² + β5 yngkid + β6 male + u,

the assumption that the variance of u given all explanatory variables depends only on gender is

Var(u|totwrk, educ, age, yngkid, male) = Var(u|male) = δ0 + δ1 male.

Then the variance for women is simply δ0 and that for men is δ0 + δ1; the difference in variances is δ1.

(ii) After estimating the above equation by OLS, we regress ûᵢ² on maleᵢ, i = 1, 2, ..., 706 (including, of course, an intercept). We can write the results as

û² = 189,359.2 − 28,849.6 male + residual
     (20,546.4)  (27,296.5)
n = 706, R² = .0016.

Because the coefficient on male is negative, the estimated variance is higher for women.

(iii) No. The t statistic on male is only about −1.06, which is not significant at even the 20% level against a two-sided alternative.

8.7 (i) The estimated equation with both sets of standard errors (heteroskedasticity-robust standard errors in brackets) is

price-hat = −21.77 + .00207 lotsize + .123 sqrft + 13.85 bdrms
            (29.48)  (.00064)         (.013)       (9.01)
            [36.28]  [.00122]         [.017]       [8.28]
n = 88, R² = .672.

The robust standard error on lotsize is almost twice as large as the usual standard error, making lotsize much less significant (the t statistic falls from about 3.23 to about 1.70). The t statistic on sqrft also falls, but it is still very significant. The variable bdrms actually becomes somewhat more significant, but it is still barely significant. The most important change is in the significance of lotsize.

(ii) For the log-log model,

log(price)-hat = −1.30 + .168 log(lotsize) + .700 log(sqrft) + .037 bdrms
                 (0.65)  (.038)              (.093)            (.028)
                 [0.76]  [.041]              [.101]            [.030]
n = 88, R² = .643.

Here, the heteroskedasticity-robust standard error is always slightly greater than the corresponding usual standard error, but the differences are relatively small. In particular, log(lotsize) and log(sqrft) still have very large t statistics, and the t statistic on bdrms is not significant at the 5% level against a one-sided alternative using either standard error.
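The auxiliary regression in 8.6(ii) has a simple interpretation: with a single dummy regressor, OLS just fits the two group means, so δ0 is the average squared residual for women and δ1 is the male-female difference. A toy sketch (the numbers here are made up for illustration; they are not the sleep data):

```python
# Regressing squared residuals on an intercept and one dummy recovers group means:
# intercept = mean of the male = 0 group, slope = difference between group means.
resid_sq = [4.0, 6.0, 5.0, 1.0, 3.0]   # hypothetical squared OLS residuals
male = [0, 0, 0, 1, 1]                 # hypothetical gender dummy

mean_female = sum(r for r, m in zip(resid_sq, male) if m == 0) / male.count(0)
mean_male = sum(r for r, m in zip(resid_sq, male) if m == 1) / male.count(1)

delta0 = mean_female              # estimated error variance for women
delta1 = mean_male - mean_female  # estimated difference in variances
```

With these made-up numbers, delta0 is 5.0 and delta1 is −3.0; a negative delta1 corresponds to the finding in 8.6(ii) that the estimated variance is higher for women.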

(iii) As we discussed in Section 6.2, using the logarithmic transformation of the dependent variable often mitigates, if not entirely eliminates, heteroskedasticity. This is certainly the case here, as no important conclusions in the model for log(price) depend on the choice of standard error. (We have also transformed two of the independent variables to make the model of the constant elasticity variety in lotsize and sqrft.)

8.8 After estimating equation (8.18), we obtain the squared OLS residuals û². The full-blown White test is based on the R-squared from the auxiliary regression (with an intercept) of

û² on llotsize, lsqrft, bdrms, llotsize², lsqrft², bdrms², llotsize·lsqrft, llotsize·bdrms, and lsqrft·bdrms,

where "l" in front of lotsize and sqrft denotes the natural log. [See equation (8.19).] With 88 observations, the n-R-squared version of the White statistic is 88(.109) ≈ 9.59, and this is the outcome of an (approximately) χ²₉ random variable. The p-value is about .385, which provides little evidence against the homoskedasticity assumption.

8.9 (i) The estimated equation is

voteA-hat = 37.66 + .252 prtystrA + 3.793 democA + 5.779 log(expendA) − 6.238 log(expendB)
            (4.74)  (.071)          (1.407)        (0.392)               (0.397)
n = 173, R² = .801, adjusted R² = .796.

You can convince yourself that regressing the ûᵢ on all of the explanatory variables yields an R-squared of zero, although it might not be exactly zero in your computer output due to rounding error. Remember, this is how OLS works: the estimates β̂j are chosen to make the residuals uncorrelated in the sample with each independent variable (as well as have zero sample average).

(ii) The B-P test entails regressing the ûᵢ² on the independent variables in part (i). The F statistic for joint significance (with 4 and 168 df) is about 2.33 with p-value ≈ .058. Therefore, there is some evidence of heteroskedasticity, but not quite at the 5% level.

(iii) Now we regress ûᵢ² on voteA-hatᵢ and (voteA-hatᵢ)², where the voteA-hatᵢ are the OLS fitted values from part (i). The F test, with 2 and 170 df, is about 2.79 with p-value ≈ .065. This is slightly less evidence of heteroskedasticity than provided by the B-P test, but the conclusion is very similar.
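Both test statistics used in 8.8 and 8.9 are simple functions of the auxiliary regression's R-squared. A minimal sketch of the two formulas, the n·R² (LM) form behind the 88(.109) ≈ 9.59 White statistic and the F form with k restrictions:

```python
# LM form: n * R^2 from the auxiliary regression, approximately chi-square
# with k df under the null of homoskedasticity.
def lm_stat(n, r2):
    return n * r2

# F form: F = (R^2/k) / ((1 - R^2)/(n - k - 1)), with k and n - k - 1 df.
def f_stat(r2, n, k):
    return (r2 / k) / ((1 - r2) / (n - k - 1))

white_lm = lm_stat(88, 0.109)   # the White statistic from 8.8
```

The same f_stat formula, with R² = .027, n = 807, and k = 2, reproduces the F ≈ 11.15 reported in 8.14(iii) below.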

8.10 (i) By regressing sprdcvr on an intercept only we obtain μ̂ ≈ .515 (se ≈ .021). The asymptotic t statistic for H0: μ = .5 is (.515 − .5)/.021 ≈ .71, which is not significant at the 10% level, or even the 20% level.

(ii) 35 games were played on a neutral court.

(iii) The estimated LPM is

sprdcvr-hat = .490 + .035 favhome + .118 neutral − .023 fav25 + .018 und25
              (.045) (.050)         (.095)         (.050)       (.092)
n = 553, R² = .0034.

The variable neutral has by far the largest effect: if the game is played on a neutral court, the probability that the spread is covered is estimated to be about .12 higher, and, except for the intercept, its t statistic is the only one greater than one in absolute value (about 1.24).

(iv) Under H0: β1 = β2 = β3 = β4 = 0, the response probability does not depend on any explanatory variables, which means neither the mean nor the variance depends on the explanatory variables. [See equation (8.38).]

(v) The F statistic for joint significance, with 4 and 548 df, is about .47 with p-value ≈ .76. There is essentially no evidence against H0.

(vi) Based on these variables, it is not possible to predict whether the spread will be covered. The explanatory power is very low, and the explanatory variables are jointly very insignificant. The coefficient on neutral may indicate something is going on with games played on a neutral court, but we would not want to bet money on it unless it could be confirmed with a separate, larger sample.

8.11 (i) The estimates are given in equation (7.31). Rounded to four decimal places, the smallest fitted value is .0066 and the largest fitted value is .5577.

(ii) The estimated heteroskedasticity function for each observation i is ĥᵢ = arr86-hatᵢ(1 − arr86-hatᵢ), which is strictly between zero and one because 0 < arr86-hatᵢ < 1 for all i. The weights for WLS are 1/ĥᵢ. To show the WLS estimates of each parameter, we report the WLS results using the same equation format as for OLS:

arr86-hat = .448 − .168 pcnv + .0054 avgsen − .0018 tottime − .025 ptime86 − .045 qemp86
            (.018)  (.019)     (.0051)        (.0033)         (.0032)       (.0052)
n = 2,725, R² = .0744.

The coefficients on the significant explanatory variables are very similar to the OLS estimates. The WLS standard errors on the slope coefficients are generally lower than the nonrobust OLS standard errors. A proper comparison would be with the robust OLS standard errors.

(iii) After WLS estimation, the F statistic for joint significance of avgsen and tottime, with 2 and 2,719 df, is about .88 with p-value ≈ .41. They are not close to being jointly significant at the 5% level. If your econometrics package has a command for WLS and a test command for joint hypotheses, the F statistic and p-value are easy to obtain. Alternatively, you can obtain the restricted R-squared using the same weights as in part (ii) and dropping avgsen and tottime from the WLS estimation. (The unrestricted R-squared is .0744.)

8.12 (i) The heteroskedasticity-robust standard error for β̂_white ≈ .129 is about .026, which is notably higher than the nonrobust standard error (about .020). The heteroskedasticity-robust 95% confidence interval is about .078 to .179, while the nonrobust CI is, of course, narrower, about .090 to .168. The robust CI still excludes the value zero by some margin.

(ii) There are no fitted values less than zero, but there are 31 greater than one. Unless we do something about those fitted values, we cannot directly apply WLS, as ĥᵢ will be negative in 31 cases.

8.13 (i) The equation estimated by OLS is

colGPA-hat = 1.36 + .412 hsGPA + .013 ACT − .071 skipped + .124 PC
             (.33)  (.092)       (.010)     (.026)         (.057)
n = 141, R² = .259, adjusted R² = .238.

(ii) The F statistic obtained for the White test is about 3.58. With 2 and 138 df, this gives p-value ≈ .031. So, at the 5% level, we conclude there is evidence of heteroskedasticity in the errors of the colGPA equation. (As an aside, note that the t statistics for each of the terms are very small, and we could have simply dropped the quadratic term without losing anything of value.)

(iii) In fact, the smallest fitted value from the regression in part (ii) is about .027, while the largest is about .165. Using these fitted values as the ĥᵢ in a weighted least squares regression gives the following:

colGPA-hat = 1.402 + .402 hsGPA + .013 ACT − .076 skipped + .126 PC
             (.30)   (.083)       (.010)     (.022)         (.056)
n = 141, R² = .306, adjusted R² = .286.

There is very little difference in the estimated coefficient on PC, and the OLS t statistic and WLS t statistic are also very close. Note that we have used the usual OLS standard errors, even though it would be more appropriate to use the heteroskedasticity-robust form (since we have evidence of heteroskedasticity). The R-squared in the weighted least squares estimation is larger than that from the OLS regression in part (i), but, remember, these are not comparable.

(iv) With robust standard errors, that is, with standard errors that are robust to misspecifying the function h(x), the equation is

colGPA-hat = 1.402 + .402 hsGPA + .013 ACT − .076 skipped + .126 PC
             (.31)   (.086)       (.010)     (.021)         (.059)
n = 141, R² = .306, adjusted R² = .286.

The robust standard errors do not differ by much from those in part (iii); in most cases, they are slightly higher, but all explanatory variables that were statistically significant before are still statistically significant. But the confidence interval for β_PC is a bit wider.

8.14 (i) I now get R² = .0527, but the other estimates seem okay.

(ii) One way to ensure that the unweighted residuals are being provided is to compare them with the OLS residuals. They will not be the same, of course, but they should not be wildly different.

(iii) The R-squared from the regression ûᵢ² on ŷᵢ, ŷᵢ², i = 1, ..., 807, is about .027. We use this as R²_û² in equation (8.15) but with k = 2. This gives F ≈ 11.15, and so the p-value is about zero.

(iv) The substantial heteroskedasticity found in part (iii) shows that the feasible GLS procedure described on page 279 does not, in fact, eliminate the heteroskedasticity. Therefore, the usual standard errors, t statistics, and F statistics reported with weighted least squares are not valid, even asymptotically.

(v) The weighted least squares equation with robust standard errors is

cigs-hat = 5.64 + 1.30 log(income) − 2.94 log(cigpric) − .463 educ + .482 age − .0056 age² − 3.46 restaurn
           (37.31) (.54)             (8.97)              (.149)       (.115)     (.0012)      (.72)
n = 807, R² = .1134.

The substantial differences in standard errors, compared with equation (8.36), are another indication that our proposed correction for heteroskedasticity did not really do the trick. With the exception of restaurn, all standard errors got notably bigger; for example, the standard error for log(cigpric) doubled. All variables that were significant with the nonrobust standard errors remain significant, but the confidence intervals are much wider in several cases.

[Instructor's Note: You can also do this exercise with regression (8.34) used in place of (8.32). This gives a somewhat larger estimated income effect.]

8.15 (i) In the following equation, estimated by OLS, the usual standard errors are in ( ) and the heteroskedasticity-robust standard errors are in [ ]:

e401k-hat = −.506 + .0124 inc − .000062 inc² + .0265 age − .00031 age² − .0035 male
            (.081)  (.0006)     (.000005)      (.0039)     (.00005)      (.0121)
            [.079]  [.0006]     [.000005]      [.0038]     [.00004]      [.0121]
n = 9,275, R² = .094.

There are no important differences; if anything, the robust standard errors are smaller.

(ii) This is a general claim. Since Var(y|x) = p(x)[1 − p(x)], we can write E(u²|x) = p(x) − [p(x)]². Written in error form, u² = p(x) − [p(x)]² + v. In other words, we can write this as a regression model

u² = δ0 + δ1 p(x) + δ2 [p(x)]² + v,

with the restrictions δ0 = 0, δ1 = 1, and δ2 = −1. Remember that, for the LPM, the fitted values, ŷᵢ, are estimates of p(xᵢ) = β0 + β1 xᵢ1 + ... + βk xᵢk. So, when we run the regression ûᵢ² on ŷᵢ, ŷᵢ² (including an intercept), the intercept estimate should be close to zero, the coefficient on ŷᵢ should be close to one, and the coefficient on ŷᵢ² should be close to −1.

(iii) The White F statistic is about 310.32, which is very significant. The coefficient on e401k-hat is about 1.010, the coefficient on e401k-hat² is about −.970, and the intercept is about −.009. This accords quite well with what we expect to find.

(iv) The smallest fitted value is about .030 and the largest is about .697. The WLS estimates of the LPM are

e401k-hat = −.488 + .0126 inc − .000062 inc² + .0255 age − .00030 age² − .0055 male
            (.076)  (.0005)     (.000004)      (.0037)     (.00004)      (.0117)
n = 9,275, R² = .108.

There are no important differences with the OLS estimates. The largest relative change is in the coefficient on male, but this variable is very insignificant using either estimation method.
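Two pieces of 8.11 through 8.15 lend themselves to a quick sketch: the LPM weight construction ĥᵢ = ŷᵢ(1 − ŷᵢ), which fails whenever a fitted value lies outside (0, 1), the problem noted in 8.12(ii), and the algebra behind the δ0 = 0, δ1 = 1, δ2 = −1 restrictions in 8.15(ii). The function names here are illustrative, not from any econometrics package:

```python
# WLS weights for a linear probability model: h_i = yhat_i * (1 - yhat_i),
# weight = 1/h_i. Valid only when every fitted value is strictly inside (0, 1).
def lpm_weights(fitted):
    weights = []
    for yhat in fitted:
        h = yhat * (1 - yhat)
        if h <= 0:
            # the situation in 8.12(ii): a fitted value at or outside (0, 1)
            raise ValueError("fitted value outside (0, 1): weight undefined")
        weights.append(1.0 / h)
    return weights

# Extreme fitted values reported in 8.11(i): both strictly inside (0, 1)
w = lpm_weights([0.0066, 0.5577])

# Algebra from 8.15(ii): for y ~ Bernoulli(p) and u = y - p,
# E(u^2|x) = p(1-p)^2 + (1-p)p^2, which simplifies to p - p^2,
# i.e. u^2 = 0 + 1*p + (-1)*p^2 + v with E(v|x) = 0.
def expected_sq_error(p):
    return p * (1 - p) ** 2 + (1 - p) * p ** 2

checks = [(p, expected_sq_error(p), p - p ** 2) for p in (0.1, 0.25, 0.5, 0.9)]
```

The identity in checks holds exactly for every p, which is why the auxiliary regression of ûᵢ² on ŷᵢ and ŷᵢ² in 8.15(iii) should return coefficients near (0, 1, −1).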