Applied Statistics and Econometrics


Applied Statistics and Econometrics, Lecture 5
Saul Lach, September 2017

Outline of Lecture 5

Now that we know the sampling distribution of the OLS estimator, we are ready:
1. to perform hypothesis tests about β1 (SW 5.1);
2. to construct confidence intervals for β1 (SW 5.2).
In addition, we will cover some loose ends about regression:
3. an empirical application (Italian LFS);
4. regression when X is binary (0/1) (SW 5.3);
5. heteroskedasticity and homoskedasticity (SW 5.4);
6. the theoretical foundations of OLS: the Gauss-Markov theorem (SW 5.5).

Testing hypotheses (SW 5.1)

Suppose a skeptic suggests that reducing the number of students in a class has no effect on learning or, specifically, on test scores. Recall the model

Testscore = β0 + β1 STR + u

The skeptic thus asserts the hypothesis H0 : β1 = 0. We wish to test this hypothesis using data, and reach a tentative conclusion as to whether it is correct or incorrect.

Null and alternative hypotheses

The null hypothesis and the two-sided alternative:

H0 : β1 = β1,0  vs.  H1 : β1 ≠ β1,0

where β1,0 is the hypothesized value of β1 under the null.

The null hypothesis and a one-sided alternative:

H0 : β1 = β1,0  vs.  H1 : β1 < β1,0 (or β1 > β1,0)

In economics it is almost always possible to come up with stories in which an effect could go either way, so it is standard to focus on two-sided alternatives.

General approach to testing

In general, the t-statistic has the form

t = (estimator − hypothesized value under H0) / (standard error of the estimator)

For testing the mean of Y this simplifies to

t = (Ȳ − µY,0) / (sY/√n)

and for testing β1 it becomes

t = (β̂1 − β1,0) / SE(β̂1)

where SE(β̂1) is the square root of an estimator of the variance of the sampling distribution of β̂1.

Formula for SE(β̂1)

Recall the expression for the variance of β̂1 (large n):

σ²_β̂1 = Var(β̂1) = (1/n) · Var((Xi − µX)ui) / (σ²X)²

The estimator of the variance of β̂1 replaces the unknown population values by estimators constructed from the data:

σ̂²_β̂1 = (1/n) · [estimator of Var((Xi − µX)ui)] / [estimator of σ²X]²
        = (1/n) · [ (1/(n−2)) Σᵢ₌₁ⁿ (Xi − X̄)² ûᵢ² ] / [ (1/n) Σᵢ₌₁ⁿ (Xi − X̄)² ]²
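As a check on the variance formula above, here is a minimal sketch (not the lecture's code) that computes the OLS slope and its heteroskedasticity-robust standard error on simulated data; all variable names and numbers are illustrative.

```python
# Illustrative only: simulate a regression with heteroskedastic errors and
# apply the slide's variance estimator for beta1-hat.
import random

random.seed(0)
n = 1000
x = [random.gauss(10, 2) for _ in range(n)]
# heteroskedastic errors: spread grows with the distance of x from its mean
u = [random.gauss(0, 1) * (1 + 0.3 * abs(xi - 10)) for xi in x]
y = [5.0 + 2.0 * xi + ui for xi, ui in zip(x, u)]

xbar = sum(x) / n
ybar = sum(y) / n
sxx = sum((xi - xbar) ** 2 for xi in x)
b1 = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / sxx  # OLS slope
b0 = ybar - b1 * xbar
uhat = [yi - b0 - b1 * xi for xi, yi in zip(x, y)]                 # residuals

# estimated variance of b1 from the slide:
# (1/n) * [ (1/(n-2)) * sum (x_i - xbar)^2 * uhat_i^2 ] / [ (1/n) * sum (x_i - xbar)^2 ]^2
num = sum((xi - xbar) ** 2 * ui ** 2 for xi, ui in zip(x, uhat)) / (n - 2)
se_b1 = (num / (sxx / n) ** 2 / n) ** 0.5

print(b1, se_b1)  # slope close to the true value 2.0, with its robust SE
```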

Formula for SE(β̂1) (continued)

Thus

SE(β̂1) = √(σ̂²_β̂1) = √{ (1/n) · [ (1/(n−2)) Σᵢ₌₁ⁿ (Xi − X̄)² ûᵢ² ] / [ (1/n) Σᵢ₌₁ⁿ (Xi − X̄)² ]² }

It looks complicated, but the software does it for us: in Stata's regression output it appears in the "Std. Err." column.

Coefficient estimates and their standard errors in Stata: test scores and STR

. reg testscr str

      Source |        SS        df        MS       Number of obs =     420
-------------+--------------------------------    F(1, 418)      =   22.58
       Model |  7794.11004       1  7794.11004    Prob > F       =  0.0000
    Residual |  144315.484     418  345.252353    R-squared      =  0.0512
-------------+--------------------------------    Adj R-squared  =  0.0490
       Total |  152109.594     419  363.030056    Root MSE       =  18.581

     testscr |      Coef.   Std. Err.       t    P>|t|    [95% Conf. Interval]
-------------+---------------------------------------------------------------
         str |  -2.279808   .4798256   -4.75   0.000    -3.22298   -1.336637
       _cons |    698.933   9.467491   73.82   0.000    680.3231   717.5428

Summary: testing the null against the two-sided alternative

Construct the t-statistic t = (β̂1 − β1,0) / SE(β̂1). Reject at the 5% significance level if |t| > 1.96. The p-value is p = Pr[|t| > |t_act|], the probability in the tails of the standard normal distribution beyond |t_act|. Reject H0 at the 5% significance level if the p-value < 0.05; in general, reject H0 at the α·100% significance level if the p-value < α. This procedure relies on the large-n approximation; typically n = 50 is large enough for the approximation to be excellent.

Testing whether class size affects performance

We test whether STR has any effect on test scores:

H0 : β1 = 0  vs.  H1 : β1 ≠ 0

testscore = 698.93 − 2.28 str
            (9.47)   (0.48)

The t-statistic is t = (β̂1 − 0)/SE(β̂1) = −4.75. Check the t column in Stata's regression output (Stata automatically tests the hypothesis that each coefficient is zero). Since |t| > 1.64, we reject the null hypothesis at 10%; since |t| > 1.96, we reject it at 5%. Can we reject at 1.35%? Check whether the p-value is less than 0.0135. The p-value given in the column P>|t| is essentially zero, so we reject at any (relevant) significance level.
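The test above can be reproduced with a few lines of arithmetic. This is an illustrative sketch using the estimates reported on the slide (β̂1 = −2.28, SE = 0.48); the p-value comes from the standard normal CDF.

```python
# Two-sided test of H0: beta1 = 0 using the slide's reported estimates.
from math import erf, sqrt

def norm_cdf(z):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

beta1_hat, se, beta1_null = -2.28, 0.48, 0.0
t = (beta1_hat - beta1_null) / se          # t-statistic
p_value = 2.0 * (1.0 - norm_cdf(abs(t)))   # two-sided p-value

print(round(t, 2))        # -4.75, matching the slide
print(p_value < 0.0135)   # True: we can reject even at the 1.35% level
```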

Test for the significance of a regressor

The t-statistic reported by Stata is for the null hypothesis H0 : β1 = 0. Thus, the p-value reported by Stata is the p-value for the test of H0 : β1 = 0 vs. H1 : β1 ≠ 0. If your test is different, e.g. H0 : β1 = 1, you cannot use the reported t-statistic and p-value!

However, the test of H0 : β1 = 0 vs. H1 : β1 ≠ 0 is perhaps the most common one, and we call it a test for the significance of a regressor (or of its parameter). If we reject the null hypothesis, we say that the coefficient β1 is statistically significant.

Confidence interval for β1 (SW 5.2)

Recall that a 95% confidence interval can be expressed, equivalently, as:
- the set of values that cannot be rejected at the 5% significance level;
- an interval (that is a function of the data) that contains the true parameter value 95% of the time in repeated samples.

In general, if the sampling distribution of an estimator is normal for large n, then a 95% confidence interval can be constructed as the estimator ± 1.96 standard errors. So a 95% confidence interval for β1 is

{β̂1 ± 1.96 SE(β̂1)} = (β̂1 − 1.96 SE(β̂1), β̂1 + 1.96 SE(β̂1))

The format of the confidence interval for β1 is similar to that for the mean µY and for the difference in group means (Lecture 3).

Confidence interval: test scores and STR

testscore = 698.93 − 2.28 str, with β̂1 = −2.28 and SE(β̂1) = 0.48.
            (9.47)   (0.48)

95% confidence interval: β̂1 ± 1.96 SE(β̂1) = −2.28 ± 1.96 × 0.48 = (−3.22, −1.34). Check the last two columns in Stata's regression output.

90% confidence interval: β̂1 ± 1.64 SE(β̂1) = −2.28 ± 1.64 × 0.48 = (−3.07, −1.49).
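The two intervals above can be recomputed directly from the reported estimate and standard error; this is illustrative arithmetic, not the Stata output.

```python
# 95% and 90% confidence intervals from the slide's estimates.
beta1_hat, se = -2.28, 0.48

ci95 = (beta1_hat - 1.96 * se, beta1_hat + 1.96 * se)
ci90 = (beta1_hat - 1.64 * se, beta1_hat + 1.64 * se)

print(tuple(round(v, 2) for v in ci95))  # (-3.22, -1.34)
print(tuple(round(v, 2) for v in ci90))  # (-3.07, -1.49)
```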

Empirical application: wages and education

Italian Labour Force Survey (ISTAT), 2015 Q3. RETRIC: net monthly wages of full-time employees.

. sum retric, d

retribuzione netta del mese scorso (net wage last month)
      Percentiles    Smallest
 1%        250          250
 5%        500          250
10%        680          250      Obs           26,127
25%       1000          250      Sum of Wgt.   26,127
50%       1300                   Mean        1306.757
                     Largest     Std. Dev.   522.5101
75%       1550         3000
90%       1950         3000      Variance    273016.8
95%       2290         3000      Skewness    .7246319
99%       3000         3000      Kurtosis    4.273194

Empirical application: wages and education (continued)

educ_lev: education level attained.

. tab educ_lev if retric~=.

          educ_lev |    Freq.   Percent      Cum.
-------------------+-----------------------------
     Nessun titolo |      142      0.54      0.54
Licenza elementare |      700      2.68      3.22
     Licenza media |    7,510     28.74     31.97
       Diploma 2-3 |    2,289      8.76     40.73
       Diploma 4-5 |   10,530     40.30     81.03
            Laurea |    4,956     18.97    100.00
             Total |   26,127    100.00

(The categories are: no qualification, primary school, lower secondary school, 2-3 year diploma, 4-5 year diploma, university degree.)

Recoding education

We recode educ_lev in terms of years of education (and call the new variable educ_years).

// recoding education
. recode educ_lev (1=0) (2=5) (3=8) (4=11) (5=13) (6=18), gen(educ_years)
(89794 differences between educ_lev and educ_years)

. tab educ_lev educ_years if retric~=.

                   |          RECODE of educ_lev (educ_years)
          educ_lev |     0      5      8     11      13      18 |   Total
-------------------+--------------------------------------------+--------
     Nessun titolo |   142      0      0      0       0       0 |     142
Licenza elementare |     0    700      0      0       0       0 |     700
     Licenza media |     0      0  7,510      0       0       0 |   7,510
       Diploma 2-3 |     0      0      0  2,289       0       0 |   2,289
       Diploma 4-5 |     0      0      0      0  10,530       0 |  10,530
            Laurea |     0      0      0      0       0   4,956 |   4,956
             Total |   142    700  7,510  2,289  10,530   4,956 |  26,127
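The recode step above (done with Stata's `recode`) amounts to a simple lookup. A sketch in Python, where the code-to-years mapping is taken from the slide and the example codes are made up:

```python
# Map education-level codes (1..6) to years of education, as on the slide.
years_map = {1: 0, 2: 5, 3: 8, 4: 11, 5: 13, 6: 18}

educ_lev = [3, 5, 6, 1, 5, 4]                    # illustrative raw codes
educ_years = [years_map[c] for c in educ_lev]

print(educ_years)  # [8, 13, 18, 0, 13, 11]
```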

OLS regression of wages on education in Stata

. reg retric educ_years

      Source |       SS         df        MS       Number of obs =   26,127
-------------+--------------------------------    F(1, 26125)    =  2740.72
       Model |  677244129        1   677244129    Prob > F       =   0.0000
    Residual | 6.4556e+09   26,125  247104.039    R-squared      =   0.0949
-------------+--------------------------------    Adj R-squared  =   0.0949
       Total | 7.1328e+09   26,126  273016.809    Root MSE       =    497.1

      retric |      Coef.   Std. Err.       t    P>|t|    [95% Conf. Interval]
-------------+---------------------------------------------------------------
  educ_years |    43.0118   .8215895   52.35   0.000    41.40144    44.62216
       _cons |   788.4206   10.36762   76.05   0.000    768.0995    808.7417

We usually report results in a table, but sometimes we write

ŵages = 788.42 + 43.01 educ_years
       (10.37)  (0.822)

We reject H0 (no effect of education) at the 5% significance level. An additional year of education is associated with an increase of 43 euros in average monthly salary. The average wage for individuals with no education is 788.4 euros per month.
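The fitted line above can be read as a prediction function. A sketch using the slide's OLS estimates, purely for illustration:

```python
# Fitted wage equation from the slide: wages-hat = 788.42 + 43.01 * educ_years.
def predicted_wage(educ_years):
    """Fitted average monthly wage (euros) for given years of education."""
    return 788.42 + 43.01 * educ_years

print(predicted_wage(0))                                   # 788.42: no education
print(round(predicted_wage(14) - predicted_wage(13), 2))   # one extra year: 43.01
```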

Regression when X is binary (SW 5.3)

Sometimes a regressor is binary:
- X = 1 if female, 0 if male;
- X = 1 if treated (experimental drug), 0 if not treated;
- X = 1 if small class size, 0 otherwise.

Binary regressors are often called dummy variables. So far β1 has been called a slope, but that does not make much sense if X is binary. How do we interpret a regression with a binary regressor?

Interpretation of the slope parameter when X is binary

Consider Yi = β0 + β1 Xi + ui, where X is binary. Then:

E[Yi | Xi = 0] = β0
E[Yi | Xi = 1] = β0 + β1

Thus

β1 = E[Yi | Xi = 1] − E[Yi | Xi = 0] = the population difference in group means.

Example of binary X

When X is binary we usually call it D (for dummy). For example,

Testscore = β0 + β1 D + u,  where Di = 1 if STRi ≤ 20 and Di = 0 if STRi > 20.

Generating a dummy variable in Stata

There are several options; for example, g D=(str<=20).

. g D=(str<=20)
. reg testscr D

      Source |        SS        df        MS       Number of obs =     420
-------------+--------------------------------    F(1, 418)      =   15.05
       Model |  5286.87866       1  5286.87866    Prob > F       =  0.0001
    Residual |  146822.715     418  351.250514    R-squared      =  0.0348
-------------+--------------------------------    Adj R-squared  =  0.0324
       Total |  152109.594     419  363.030056    Root MSE       =  18.742

     testscr |      Coef.   Std. Err.       t    P>|t|    [95% Conf. Interval]
-------------+---------------------------------------------------------------
           D |   7.185129    1.85201    3.88   0.000    3.544715    10.82554
       _cons |   649.9994   1.408711  461.41   0.000    647.2304    652.7685

Comparison with the difference in group means

In Lecture 1 we computed:

Test score by STR group:
  group     n     mean      sd
  Small   243   657.19   19.29
  Large   177   650.00   17.97
  All     420   654.16   19.05

The difference in test score means is Ȳsmall − Ȳlarge = 657.19 − 650.00 = 7.19, exactly the OLS estimate of β1, as expected.
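The equality above is an algebraic identity, not a coincidence. A minimal sketch checking it on simulated (illustrative) data:

```python
# With a binary regressor, the OLS slope equals the difference in group means.
import random

random.seed(1)
n = 420
d = [1.0 if random.random() < 0.6 else 0.0 for _ in range(n)]   # dummy regressor
y = [650.0 + 7.2 * di + random.gauss(0, 19) for di in d]        # outcome

dbar = sum(d) / n
ybar = sum(y) / n
b1 = sum((di - dbar) * (yi - ybar) for di, yi in zip(d, y)) \
     / sum((di - dbar) ** 2 for di in d)                        # OLS slope

mean1 = sum(yi for di, yi in zip(d, y) if di == 1.0) / sum(d)
mean0 = sum(yi for di, yi in zip(d, y) if di == 0.0) / (n - sum(d))

print(abs(b1 - (mean1 - mean0)) < 1e-9)  # True: identical, as the algebra shows
```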

Comparison with the difference in group means (continued)

We also showed in Lecture 1 that

SE(Ȳsmall − Ȳlarge) = √( s²small/nsmall + s²large/nlarge ) = √( 19.29²/243 + 17.97²/177 ) = 1.832

The small difference from the regression standard error is due to rounding. Given the OLS estimates, what are the mean test scores in the two groups of districts?

Gender difference in wages

The variable sg11 denotes the gender of the individual, coded 1 for male and 2 for female. We recode it into a new variable, female. What does −291 mean?

. recode sg11 (1=0) (2=1), g(female)
(101916 differences between sg11 and female)

. reg retric female

      Source |       SS         df        MS       Number of obs =   26,127
-------------+--------------------------------    F(1, 26125)    =  2195.02
       Model |  552850657        1   552850657    Prob > F       =   0.0000
    Residual | 6.5800e+09   26,125  251865.512    R-squared      =   0.0775
-------------+--------------------------------    Adj R-squared  =   0.0775
       Total | 7.1328e+09   26,126  273016.809    Root MSE       =   501.86

      retric |      Coef.   Std. Err.       t    P>|t|    [95% Conf. Interval]
-------------+---------------------------------------------------------------
      female |  -291.3592   6.218838  -46.85   0.000   -303.5485     -279.17
       _cons |   1444.535   4.276474  337.79   0.000    1436.153    1452.917
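The group-means standard error above, SE(Ȳsmall − Ȳlarge) = 1.832, can be checked numerically from the Lecture 1 table:

```python
# Standard error of the difference in group means, using the slide's numbers.
from math import sqrt

s_small, n_small = 19.29, 243
s_large, n_large = 17.97, 177

se_diff = sqrt(s_small ** 2 / n_small + s_large ** 2 / n_large)
print(round(se_diff, 3))  # 1.832, as on the slide
```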

Heteroskedasticity and homoskedasticity (SW 5.4)

What do these two terms mean? If Var(u | X = x) is constant, that is, if the variance of the conditional distribution of u given X does not depend on X, then u is said to be homoskedastic. Otherwise, u is heteroskedastic.

What, if any, is the impact on the OLS estimator of having homoskedastic or heteroskedastic errors?

Heteroskedasticity and homoskedasticity: an example

Consider the example

wagei = β0 + β1 educi + ui

Homoskedasticity means that the variance of ui does not change with the education level. Of course, we do not know anything about Var(ui | educi), but we can use the data to get an idea. One option is to plot the boxplot of wages for each educational category: if ui is homoskedastic, the boxes should be of approximately the same size.

Homoskedasticity in a picture

[Figure: boxplots of net monthly wage (retribuzione netta del mese scorso, 0 to 3,000 euros) by education level, from Nessun titolo through Laurea.]

Homoskedasticity in a table

Another option is to check the group standard deviations directly.

. bysort educ_lev: sum retric

  educ_lev           |    Obs       Mean   Std. Dev.    Min    Max
  Nessun titolo      |    142   904.2254   322.8163     250   1810
  Licenza elementare |    700   1037.114   423.5996     250   2700
  Licenza media      |  7,510   1156.618    429.893     250   3000
  Diploma 2-3        |  2,289   1243.128   429.0504     250   3000
  Diploma 4-5        | 10,530   1307.365   485.4817     250   3000
  Laurea             |  4,956   1611.981   633.4167     250   3000

The standard deviation of wages rises with education, which suggests the errors are heteroskedastic.

More pictures... [figure omitted in transcription]

Impact of heteroskedasticity/homoskedasticity on bias

Whether the error is heteroskedastic or homoskedastic does not matter for the unbiasedness and consistency of the OLS estimator. Recall that the OLS estimator is unbiased and consistent under the three Least Squares Assumptions, and the heteroskedasticity/homoskedasticity distinction does not appear in those assumptions, so it does not affect the unbiasedness or consistency of the OLS estimator. The distinction also does not affect the large-sample (asymptotic) normality of the OLS estimator.

Impact of heteroskedasticity/homoskedasticity on the variance

The only place where the distinction matters is in the formula for computing the variance of the OLS estimator β̂1. The formula presented in Lecture 4 is always valid, irrespective of whether there is heteroskedasticity or homoskedasticity; it delivers heteroskedasticity-robust standard errors (which are also correct if there is homoskedasticity). There is another formula, not presented here, which is valid only under homoskedasticity. It is simpler, and it is often the default setting in many software programs; we call standard errors based on this simpler estimator homoskedasticity-only SEs. To get heteroskedasticity-robust standard errors you must override the default. If you do not override the default and there is in fact heteroskedasticity, your standard errors (and t-statistics and confidence intervals) will be wrong. Typically, homoskedasticity-only SEs are somewhat smaller.

Summary on heteroskedasticity/homoskedasticity

If the errors are either homoskedastic or heteroskedastic and you use heteroskedasticity-robust standard errors, you are fine. If the errors are heteroskedastic and you use the homoskedasticity-only formula, your standard errors will be wrong (the homoskedasticity-only estimator of the variance of β̂1 is inconsistent if there is heteroskedasticity). The two formulas coincide (when n is large) in the special case of homoskedastic errors. So you should always use heteroskedasticity-robust standard errors.

Heteroskedasticity-robust standard errors in Stata

Use the option robust.

. reg testscr str, robust

Linear regression                    Number of obs =     420
                                     F(1, 418)     =   19.26
                                     Prob > F      =  0.0000
                                     R-squared     =  0.0512
                                     Root MSE      =  18.581

             |             Robust
     testscr |      Coef.   Std. Err.       t    P>|t|    [95% Conf. Interval]
-------------+---------------------------------------------------------------
         str |  -2.279808   .5194892   -4.39   0.000   -3.300945   -1.258671
       _cons |    698.933   10.36436   67.44   0.000    678.5602    719.3057

For comparison, the homoskedasticity-only output (. reg testscr str, shown earlier) reports a Std. Err. of .4798256 and t = −4.75 for str: somewhat smaller, as expected.
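The gap between the two formulas can be seen on simulated data with heteroskedastic errors. A sketch (names and numbers are illustrative; the robust formula here is the large-n version from earlier in the lecture):

```python
# Compare homoskedasticity-only and heteroskedasticity-robust SEs for the
# OLS slope when Var(u|x) actually depends on x.
import random

random.seed(2)
n = 2000
x = [random.gauss(0, 1) for _ in range(n)]
# heteroskedastic errors: spread grows with |x|
y = [1.0 + 0.5 * xi + random.gauss(0, 1) * (1 + abs(xi)) for xi in x]

xbar = sum(x) / n
ybar = sum(y) / n
sxx = sum((xi - xbar) ** 2 for xi in x)
b1 = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / sxx
b0 = ybar - b1 * xbar
uhat = [yi - b0 - b1 * xi for xi, yi in zip(x, y)]

# homoskedasticity-only SE: s^2 / sum (x_i - xbar)^2, with s^2 = RSS/(n-2)
se_homosk = (sum(u ** 2 for u in uhat) / (n - 2) / sxx) ** 0.5
# robust SE (large n): sum (x_i - xbar)^2 * uhat_i^2 / sxx^2
se_robust = (sum((xi - xbar) ** 2 * u ** 2 for xi, u in zip(x, uhat)) / sxx ** 2) ** 0.5

print(se_homosk < se_robust)  # True here: the default SE understates the uncertainty
```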

The theoretical foundations of OLS: the Gauss-Markov theorem (SW 5.5)

We have already learned a great deal about OLS:
1. OLS is unbiased and consistent (under the three Least Squares Assumptions);
2. we have a formula for heteroskedasticity-robust standard errors;
3. we can use them to construct confidence intervals and test statistics.

These are good properties of OLS. In addition, a very good reason to use OLS is that everyone else does, so by using it others will understand what you are doing. OLS is, in fact, the language of regression analysis, and if you use a different estimator you will be speaking a different language. The natural question to ask is whether there are other estimators with a smaller variance than OLS.

The theoretical foundations of OLS: the Gauss-Markov theorem (continued)

We can always find an estimator with a very small variance (e.g., take the estimator to be the constant 1.1, which has zero variance). So when we ask whether there are estimators with a smaller variance than OLS, we need to be precise about which estimators we compare OLS to. We consider all estimators that are linear functions of Y1, ..., Yn and unbiased, just as OLS is. From the proof of unbiasedness in Lecture 4 we can deduce that the OLS estimator can be written as β̂1 = Σᵢ₌₁ⁿ ωi Yi. We then have the following important result:

Theorem (Gauss-Markov). If the three Least Squares Assumptions hold and the errors are homoskedastic, then the OLS estimator is the Best Linear Unbiased Estimator (BLUE).

The Gauss-Markov assumptions

This is a pretty remarkable result: it says that if, in addition to LSA 1-3, the errors are homoskedastic, then OLS is the best choice among all linear unbiased estimators. An estimator with the smallest variance is called an efficient estimator; the Gauss-Markov theorem says that OLS is the efficient estimator among the linear unbiased estimators of β1. You could choose a biased estimator with zero variance, e.g. the constant 1.1, but that would surely be a poor choice! The set of four assumptions needed for the theorem to hold (LSA 1-3 plus homoskedasticity) is sometimes called the Gauss-Markov assumptions.
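The claim above, that the OLS slope is a linear function of the Yi, can be verified directly: the weights are ωi = (Xi − X̄) / Σⱼ (Xj − X̄)². A sketch on simulated (illustrative) data:

```python
# Verify that beta1_hat = sum_i w_i * Y_i with the OLS weights w_i.
import random

random.seed(3)
n = 200
x = [random.gauss(0, 1) for _ in range(n)]
y = [2.0 + 3.0 * xi + random.gauss(0, 1) for xi in x]

xbar = sum(x) / n
sxx = sum((xi - xbar) ** 2 for xi in x)
w = [(xi - xbar) / sxx for xi in x]                 # the OLS weights
b1_linear = sum(wi * yi for wi, yi in zip(w, y))    # linear in the Y_i

ybar = sum(y) / n
b1_ols = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / sxx

print(abs(b1_linear - b1_ols) < 1e-9)  # True: the same estimator
```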

Limitations of the Gauss-Markov theorem

For the theorem to hold we need homoskedasticity, which is often not a very realistic assumption. If there is heteroskedasticity, the theorem does not hold and there may be more efficient estimators than OLS. But OLS is still unbiased and consistent under the three Least Squares Assumptions; it just may not have the smallest possible variance. The result also covers only linear estimators: there may be non-linear estimators with lower variance. Finally, OLS is more sensitive to outliers than some other estimators.

A stronger Gauss-Markov theorem

Theorem. If the three Least Squares Assumptions hold and the errors are homoskedastic and normally distributed, then the OLS estimator has the smallest variance among all consistent estimators.