Applied Statistics and Econometrics


Applied Statistics and Econometrics
Lecture 6

Saul Lach
September 2017

Outline of Lecture 6

1 Omitted variable bias (SW 6.1)
2 Multiple regression model (SW 6.2, 6.3)
3 Measures of fit (SW 6.4)
4 The Least Squares Assumptions (SW 6.5)
5 Sampling distribution of the OLS estimator (SW 6.6)
6 Hypothesis tests and confidence intervals for a single coefficient (SW 7.1)

Omitted variable bias

The model with a single regressor is

Y = β0 + β1 X + u.

The error u arises because of factors that influence Y but are not included in the regression. These excluded factors are omitted variables from the regression.
There are always omitted variables, and sometimes this can lead to a bias in the OLS estimator. We will study when such a bias arises and the likely direction of this bias.

Example: test scores and STR

The estimated regression line of the test score-class size relationship is

testscore = 698.9 − 2.28 STR

A likely omitted variable here is family income. Suppose that in high-income districts classes are smaller and test scores are higher.
Is 2.28 a credible estimate of the causal effect on test scores of a change in the student-teacher ratio? Probably not, because it is likely that the estimated effect of STR also reflects the impact on test scores of variations in income across districts. Districts with smaller STR have higher test scores partly because of higher income. Thus 2.28 is a larger effect (in absolute value) than the true causal effect of class size.

Omitted variable bias

The bias in the OLS estimator that occurs as a result of an omitted factor is called the omitted variable bias (OVB). Given that there are always omitted variables, it is important to understand when such an OVB occurs.
For OVB to occur, the omitted factor, which we call Z, must satisfy two conditions:
1 Z is a determinant of Y (i.e., Z is part of u).
2 Z is correlated with the regressor X.
Both conditions must hold for the omission of Z to result in omitted variable bias.

Omitted variable bias: test scores and class size

Another omitted variable could be English language ability.
1 English language ability (whether the student has English as a second language) plausibly affects standardized test scores: Z is a determinant of Y.
2 Immigrant communities tend to be less affluent and thus have smaller school budgets and higher STR: Z is correlated with X.
Accordingly, β̂1 is biased: what is the direction of the bias? That is, what is the sign of this bias? If intuition fails you, there is a formula... soon.

Conditions for OVB in CASchools data

Sometimes we can actually check these conditions (at least in a given sample). The California Schools dataset has data on the percentage of students learning English; the variable is el_pct.

. summarize el_pct
[summary statistics for el_pct: Obs, Mean, Std. Dev., Min, Max]

Is this variable correlated with STR and testscore (at least in this sample)?

. correlate el_pct str testscr
[correlation matrix of el_pct, str, testscr]

Conditions for OVB in CASchools data

[Scatterplots: testscore against english, and str against english]

Districts with a lower percentage of English learners have higher test scores.
Districts with a lower percentage of English learners have smaller classes.

OVB formula

Recall from Lecture 4 (Preliminary algebra 3 slide) that we can write

β̂1 − β1 = [Σi (Xi − X̄) ui] / [Σi (Xi − X̄)²] = [(1/n) Σi (Xi − X̄) ui] / [(1/n) Σi (Xi − X̄)²] = [(1/n) Σi (Xi − X̄) ui] / s²_X

Under assumptions LS2 and LS3 we have

β̂1 →p β1 + Cov(X, u)/Var(X)

where the second term is the OVB.
If LS1 holds, then Cov(X, u) = 0 and β̂1 →p β1 (and also E(β̂1) = β1).
If LS1 does not hold, then Cov(X, u) ≠ 0 and β̂1 →p β1 + Cov(X, u)/Var(X) ≠ β1 (and also E(β̂1) ≠ β1).

OVB formula in terms of omitted variable

The previous formula is in terms of the error term u and, although it is called the OVB, it is more general: the formula is correct irrespective of the reason for the correlation (or covariance) between u and X.
Suppose now that we assert that a variable Z is omitted from the regression. We are then saying that Z is part of u, and w.l.o.g. we can write

u = β2 Z + ε

where β2 is a coefficient. Then

Cov(X, u) = Cov(X, β2 Z + ε) = β2 Cov(X, Z)

assuming ε is uncorrelated with X.

OVB formula in terms of omitted variable

The OVB formula in this case becomes

β̂1 →p β1 + β2 Cov(X, Z)/Var(X)

where the second term is the OVB. The math makes clear the two conditions for an OVB:
1 Z is a determinant of Y ⟹ β2 ≠ 0.
2 Z is correlated with the regressor X ⟹ Cov(X, Z) ≠ 0.

OVB formula: correlation version

An alternative formulation of the OVB formula is in terms of the correlation rather than the covariance:

β̂1 →p β1 + Cov(X, u)/Var(X) = β1 + ρ_Xu (σ_u/σ_X)

β̂1 →p β1 + β2 Cov(X, Z)/Var(X) = β1 + β2 ρ_XZ (σ_Z/σ_X)
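The probability-limit formula above can be checked numerically. The sketch below uses simulated data with made-up parameter values (not the CASchools data): Z is omitted from the regression of Y on X, and the short-regression slope converges to β1 + β2 ρ_XZ σ_Z/σ_X.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000  # large n, so the OLS slope is close to its probability limit

# True model: Y = b0 + b1*X + b2*Z + eps, with Z correlated with X.
# All parameter values are illustrative choices, not estimates.
b0, b1, b2 = 1.0, -2.0, -0.5
rho_xz, sigma_x, sigma_z = 0.6, 1.0, 2.0

X = rng.normal(0.0, sigma_x, n)
# Construct Z so that corr(X, Z) = rho_xz and sd(Z) = sigma_z
Z = sigma_z * (rho_xz * X / sigma_x + np.sqrt(1 - rho_xz**2) * rng.normal(size=n))
Y = b0 + b1 * X + b2 * Z + rng.normal(size=n)

# Short regression of Y on X alone (Z omitted): slope = sample Cov(X,Y)/Var(X)
beta1_hat = np.cov(X, Y)[0, 1] / np.var(X)

# OVB formula: plim of the short-regression slope
predicted = b1 + b2 * rho_xz * sigma_z / sigma_x  # = -2.6 here
print(beta1_hat, predicted)  # the two numbers are close
```

With these values the bias term is β2 ρ_XZ σ_Z/σ_X = (−0.5)(0.6)(2) = −0.6, so the short regression converges to −2.6 rather than the true −2.0.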

OVB formula in the test score-class size example

We usually use the OVB formula to try to sign the direction of the bias. For example, when Z is the % of English learners it is likely that β2 < 0 (the sample correlation also suggests this). And ρ_XZ is likely to be positive, ρ_XZ > 0 (also suggested by the sample correlation).
Thus

β2 ρ_XZ (σ_Z/σ_X) = (−)·(+)·(+) < 0

so that β̂1 converges to something smaller than the true parameter β1. Ignoring English learners overstates (in absolute value) the class size effect.
What is the likely sign of the bias when Z is family income?

Three ways to overcome omitted variable bias

1. Run a randomized controlled experiment in which treatment (STR) is randomly assigned: then el_pct is still a determinant of testscore, but el_pct is uncorrelated with STR. Such random experiments are unrealistic in practice.

Three ways to overcome omitted variable bias

2. Adopt the cross-tabulation approach: divide the sample into groups having approximately the same value of el_pct and analyze the relationship within groups. Problems: 1) we soon run out of data; 2) there are other determinants (e.g., family income, parental education) that are still omitted.

Three ways to overcome omitted variable bias

3. Use a regression in which the omitted variable (el_pct) is no longer omitted: include el_pct as an additional regressor in a multiple regression. This is the approach we will focus on.

Where are we?

1 Omitted variable bias (SW 6.1)
2 Multiple regression model (SW 6.2, 6.3)
3 Measures of fit (SW 6.4)
4 The Least Squares Assumptions (SW 6.5)
5 Sampling distribution of the OLS estimator (SW 6.6)
6 Hypothesis tests and confidence intervals for a single coefficient (SW 7.1)

The multiple regression model

The population regression model (or function) is

Y = β0 + β1 X1 + β2 X2 + ... + βk Xk + u

Y is the dependent variable.
X1, X2, ..., Xk are the k independent variables (regressors).
β0 is the (unknown) intercept and β1, ..., βk are the (unknown) slopes.
u is the regression error reflecting other omitted factors affecting Y.
We assume right away that E(u | X1, X2, ..., Xk) = 0, so that the population regression line is the conditional expectation of Y given the k X's and the slope parameters can be interpreted as causal effects.

Interpretation of coefficients (slopes) in multiple regression

Consider changing X1 from x1 to x1 + Δ1, while holding all the other X's fixed.
Before the change we have

E(Y | X1 = x1, ..., Xk = xk) = β0 + β1 x1 + β2 x2 + ... + βk xk

After the change we have

E(Y | X1 = x1 + Δ1, ..., Xk = xk) = β0 + β1 (x1 + Δ1) + β2 x2 + ... + βk xk

The difference is

E(Y | X1 = x1 + Δ1, ..., Xk = xk) − E(Y | X1 = x1, ..., Xk = xk) = β1 Δ1

Interpretation of coefficients (slopes) in multiple regression

When Δ1 = 1 we have

E(Y | X1 = x1 + 1, ..., Xk = xk) − E(Y | X1 = x1, ..., Xk = xk) = β1

β1 measures the effect on (expected) Y of a unit change in X1, holding the other regressors X2, ..., Xk fixed (we also say controlling for X2, ..., Xk).
Whether this partial effect can be given a causal interpretation depends on what we assume about E(u | X1, X2, ..., Xk). If E(u | X1, X2, ..., Xk) is constant, as assumed here, then β1 is the causal effect of X1 on Y. Otherwise, it is not a causal effect. Why?
The same interpretation applies to βj, j = 2, ..., k.

The multiple regression model in the sample

The regression model (or function) in the sample is

Yi = β0 + β1 X1i + β2 X2i + ... + βk Xki + ui,  i = 1, ..., n

The i-th observation in the sample is (Yi, X1i, X2i, ..., Xki).

Estimation

To simplify the presentation we assume that we have two regressors only, k = 2:

Yi = β0 + β1 X1i + β2 X2i + ui

With two regressors, the OLS estimator solves:

min over b0, b1, b2 of  Σi (Yi − (b0 + b1 X1i + b2 X2i))²

The OLS estimator minimizes the sum of squared differences between the actual values Yi and the predictions (predicted values) b0 + b1 X1i + b2 X2i based on such b's.
This minimization problem is solved using calculus. The result is the OLS estimators of β0, β1, β2, denoted, respectively, by β̂0, β̂1, β̂2.
This generalizes the case with one regressor (k = 1).

Graphic intuition

min over b0, b1 of  Σi (Yi − b0 − b1 X1i)²     fits a line through the points in R².

min over b0, b1, b2 of  Σi (Yi − b0 − b1 X1i − b2 X2i)²     fits a plane through the points in R³.

[Plots: fitted line in (str, testscore) space; fitted plane in (str, english, testscore) space]

Matrix notation

The multiple regression model

Yi = β0 + β1 X1i + β2 X2i + ... + βk Xki + ui,  i = 1, ..., n

can be written in matrix form as

Y = Xβ + u

where

Y = (Y1, Y2, ..., Yn)′ is n × 1,

X = [ 1  X11  X21  ...  Xk1
      1  X12  X22  ...  Xk2
      ...
      1  X1n  X2n  ...  Xkn ]  is n × (k+1),

β = (β0, β1, ..., βk)′ is (k+1) × 1,  and  u = (u1, u2, ..., un)′ is n × 1.

OLS in matrix form

Using matrix notation, the minimization of the sum of squared residuals can be compactly written as

min over β of  (Y − Xβ)′(Y − Xβ)

and the first-order conditions are

X′(Y − Xβ) = 0  ⟹  X′X β = X′Y,

where X′X is (k+1) × (k+1) and β and X′Y are (k+1) × 1. This is a system of linear equations that can be solved for β (recall Ax = b, with A = X′X, x = β and b = X′Y). The solution is the OLS estimator

β̂ = (X′X)⁻¹ X′Y

provided X′X is invertible.

Example: the CASchools test score data

What happens to the coefficient on STR?

. reg testscr str
(420 observations; coefficient on str: −2.28)

. reg testscr str el_pct
(420 observations; coefficient on str: −1.10; coefficient on el_pct: −0.65)
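The closed-form solution β̂ = (X′X)⁻¹X′Y can be sketched in a few lines of NumPy. The data below are simulated with arbitrary true coefficients (not the CASchools file): we build the n × (k+1) design matrix with a column of ones, solve the normal equations, and check the answer against a library least-squares routine.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 500
X1 = rng.normal(size=n)
X2 = rng.normal(size=n)
# True coefficients (illustrative): beta0 = 2.0, beta1 = 1.5, beta2 = -0.7
Y = 2.0 + 1.5 * X1 - 0.7 * X2 + rng.normal(size=n)

# Design matrix with a column of ones for the intercept: n x (k+1)
X = np.column_stack([np.ones(n), X1, X2])

# Solve the normal equations X'X beta = X'Y
beta_hat = np.linalg.solve(X.T @ X, X.T @ Y)

# Same answer via least squares (numerically preferable to an explicit inverse)
beta_lstsq, *_ = np.linalg.lstsq(X, Y, rcond=None)

print(beta_hat)  # close to [2.0, 1.5, -0.7]
```

Solving the normal equations directly and calling a least-squares routine give the same estimates; in practice the latter is preferred because it avoids forming (X′X)⁻¹ explicitly.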

OLS predicted values and residuals

Just as in the single-regressor model, the predicted value is

Ŷi = β̂0 + β̂1 X1i + β̂2 X2i + ... + β̂k Xki

and the residual is

ûi = Yi − Ŷi = Yi − (β̂0 + β̂1 X1i + β̂2 X2i + ... + β̂k Xki)

so that we can write

Yi = Ŷi + ûi = β̂0 + β̂1 X1i + β̂2 X2i + ... + β̂k Xki + ûi

Where are we?

1 Omitted variable bias (SW 6.1)
2 Multiple regression model (SW 6.2, 6.3)
3 Measures of fit (SW 6.4)
4 The Least Squares Assumptions (SW 6.5)
5 Sampling distribution of the OLS estimator (SW 6.6)
6 Hypothesis tests and confidence intervals for a single coefficient (SW 7.1)
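The decomposition Yi = Ŷi + ûi is exact in any OLS fit, and the first-order conditions X′û = 0 force the residuals to sum to zero (through the constant column) and to be uncorrelated in-sample with every regressor. A quick check on simulated data:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 100
X1, X2 = rng.normal(size=n), rng.normal(size=n)
# Arbitrary illustrative coefficients
Y = 1.0 + 0.5 * X1 + 2.0 * X2 + rng.normal(size=n)

X = np.column_stack([np.ones(n), X1, X2])
beta_hat, *_ = np.linalg.lstsq(X, Y, rcond=None)

Y_hat = X @ beta_hat   # predicted values
u_hat = Y - Y_hat      # residuals

# Y decomposes exactly into fit plus residual,
# and the first-order conditions make X'u_hat = 0 (up to rounding):
print(np.allclose(Y, Y_hat + u_hat))  # True
print(np.allclose(X.T @ u_hat, 0))    # True, so the residuals also sum to 0
```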

Measures of fit for multiple regression

Same measures as before:
SER (RMSE) = standard deviation of the residuals ûi.
R² = fraction of the variance of Y explained (accounted for) by X1, ..., Xk.
(New!) R̄² is the adjusted R²: R² adjusted for the number of regressors.

Measures of fit: SER and RMSE

As in the regression with a single regressor, the SER/RMSE measures the spread of the Y's around the estimated regression line:

SER (RMSE) = sqrt( [1/(n − k − 1)] Σi ûi² )

Measures of fit: R squared

As in the regression with a single regressor, the R² is the fraction of the variance of Y accounted for by the model (i.e., by X1, ..., Xk):

R² = ESS/TSS = 1 − SSR/TSS

where

ESS = Σi (Ŷi − Ȳ)²,  TSS = Σi (Yi − Ȳ)²,  SSR = Σi ûi²

The R² never decreases when another regressor is added (i.e., when k increases). (Why?) This is not a good feature for a measure of fit.

Measures of fit: adjusted R squared

The adjusted R², R̄², addresses this issue by penalizing you for including another regressor:

R̄² = 1 − [(n − 1)/(n − k − 1)] (SSR/TSS) = R² − [k/(n − k − 1)] (SSR/TSS)

Note that R̄² < R², but their difference tends to vanish for large n.
R̄² does not necessarily increase with k (although SSR decreases, (n − 1)/(n − k − 1) increases).
R̄² can be negative!
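A small sketch (simulated data, illustrative coefficients) that computes R², R̄² and the SER from the formulas above, and illustrates the point about adding regressors: appending a pure-noise regressor can never lower R², while R̄² applies the degrees-of-freedom penalty.

```python
import numpy as np

def r2_stats(Y, X):
    """R^2, adjusted R^2 and SER for an OLS fit of Y on X (X includes the constant)."""
    n, kp1 = X.shape                       # kp1 = k + 1 columns
    beta, *_ = np.linalg.lstsq(X, Y, rcond=None)
    u = Y - X @ beta
    ssr = u @ u                            # sum of squared residuals
    tss = np.sum((Y - Y.mean()) ** 2)      # total sum of squares
    r2 = 1 - ssr / tss
    r2_adj = 1 - (n - 1) / (n - kp1) * ssr / tss
    ser = np.sqrt(ssr / (n - kp1))         # SER with the n - k - 1 correction
    return r2, r2_adj, ser

rng = np.random.default_rng(3)
n = 100
X1 = rng.normal(size=n)
Y = 1 + 2 * X1 + rng.normal(size=n)
junk = rng.normal(size=n)                  # pure noise, unrelated to Y

X_small = np.column_stack([np.ones(n), X1])
X_big = np.column_stack([np.ones(n), X1, junk])

r2_s, adj_s, _ = r2_stats(Y, X_small)
r2_b, adj_b, _ = r2_stats(Y, X_big)
print(r2_b >= r2_s)  # True: R^2 never falls when a regressor is added
print(adj_b < r2_b)  # True: adjusted R^2 sits below R^2
```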

How to interpret the simple and adjusted R squared?

A high R² (or R̄²) means that the regressors account for much of the variation in Y.
A high R² (or R̄²) does not mean that you have eliminated omitted variable bias.
A high R² (or R̄²) does not mean that you have an unbiased estimator of a causal effect.
A high R² (or R̄²) does not mean that the included variables are statistically significant: this must be determined using hypothesis tests.
Maximizing R² (or R̄²) is not a criterion we use to select regressors.

CASchools data example

Regression of testscore against STR:

testscore = 698.9 − 2.28 str,  R² = 0.05

Regression of testscore against STR and el_pct:

testscore = 686.0 − 1.10 str − 0.65 el_pct,  R² = 0.426

Adding the % of English learners substantially improves the fit of the regression. The two regressors account for almost 43% of the variation of test scores across districts.

Where are we?

1 Omitted variable bias (SW 6.1)
2 Multiple regression model (SW 6.2, 6.3)
3 Measures of fit (SW 6.4)
4 The Least Squares Assumptions (SW 6.5)
5 Sampling distribution of the OLS estimator (SW 6.6)
6 Hypothesis tests and confidence intervals for a single coefficient (SW 7.1)

The Least Squares Assumptions for multiple regression

The multiple regression model is

Y = β0 + β1 X1 + β2 X2 + ... + βk Xk + u

The four least squares assumptions are:
Assumption #1: The conditional distribution of u given all the X's has mean zero, that is, E(u | X1 = x1, ..., Xk = xk) = 0 for all (x1, ..., xk).
Assumption #2: (Yi, X1i, ..., Xki), i = 1, ..., n, are i.i.d.
Assumption #3: Large outliers in Y and the X's are unlikely: X1, ..., Xk and Y have finite fourth moments, E(Y⁴) < ∞, E(X1⁴) < ∞, ..., E(Xk⁴) < ∞.
Assumption #4: There is no perfect multicollinearity.

Assumption #1: mean independence

E(u | X1 = x1, ..., Xk = xk) = 0

Same interpretation as in the regression with a single regressor. This assumption gives a causal interpretation to the parameters (the β's).
If an omitted variable (a) belongs in the equation (so it is in u) and (b) is correlated with an included X, then this condition fails and there is OVB (omitted variable bias).
The solution, when possible, is to include the omitted variable in the regression. Usually, this assumption is more likely to hold when one controls for more factors by including them in the regression.

Assumption #2: i.i.d. sample

Same assumption as in the single-regressor model. It is satisfied automatically if the data are collected by simple random sampling.

Assumption #3: large outliers are unlikely

Same assumption as in the single-regressor model. OLS can be sensitive to large outliers, so it is recommended to check the data (via scatterplots, etc.) to make sure there are no large outliers (due to typos, coding errors, etc.). This is a technical assumption that is satisfied automatically by variables with a bounded domain.

Assumption #4: no perfect multicollinearity

This is a new assumption that applies when there is more than a single regressor.
Perfect multicollinearity occurs when one of the regressors is an exact linear function of the other regressors. Assumption #4 rules this out.
We cannot estimate the effect of, say, X1 holding all other variables constant if one of these variables is a perfect linear function of X1.
When there is perfect multicollinearity, the statistical software will let you know: it may crash, give an error message, or drop one of the regressors arbitrarily.

Including a perfectly collinear regressor in Stata

Example: generate str_new = 5 + .2*str and add it to the regression. What happens? Stata drops one of the collinear variables.

. g str_new=5+.2*str
. reg testscr str str_new
note: str_new omitted because of collinearity
(in the output table, str_new appears as 0 (omitted); the estimates on str and _cons are those from regressing testscr on str alone)

The dummy variable trap

Suppose you have a set of multiple binary (dummy) variables which are mutually exclusive and exhaustive; that is, there are multiple categories and every observation falls in one and only one category (think of region of residence: Sicily, Lazio, Tuscany, etc.).
If you include all these dummy variables and a constant in the regression, you will have perfect multicollinearity; this is sometimes called the dummy variable trap. Why is there perfect multicollinearity here?
Solutions to the dummy variable trap:
1 Omit one of the groups (e.g., Lazio), or
2 Omit the intercept.
What are the implications of (1) or (2) for the interpretation of the coefficients? We will analyze this later in an example.
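What Stata detects here can be seen directly in the linear algebra: with str_new = 5 + 0.2·str and a constant, the design matrix has rank 2 rather than 3, so X′X is singular and (X′X)⁻¹X′Y does not exist. A sketch with made-up class-size numbers (not the CASchools data):

```python
import numpy as np

rng = np.random.default_rng(4)
n = 50
str_ = rng.normal(20, 2, size=n)   # hypothetical student-teacher ratios
str_new = 5 + 0.2 * str_           # exact linear function of str, as in the slide

# Design matrix: constant, str, str_new
X = np.column_stack([np.ones(n), str_, str_new])

# str_new = 5*(constant column) + 0.2*(str column), so only 2 columns
# are linearly independent and the normal equations cannot be solved uniquely.
print(np.linalg.matrix_rank(X))    # 2, not 3
```

Dropping either str or str_new restores full column rank, which is exactly what the software does when it "omits" a variable.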

Assumption #4: no perfect multicollinearity

Perfect multicollinearity usually reflects a mistake in the definition of the regressors, or an oddity in the data. The solution is to modify the list of regressors so that you no longer have perfect multicollinearity.

Imperfect multicollinearity

Imperfect and perfect multicollinearity are quite different despite the similarity of their names.
Imperfect multicollinearity occurs when two or more regressors are very highly (but not perfectly) correlated.
Why the term "multicollinearity"? If two regressors are very highly correlated, their scatterplot will pretty much look like a straight line; they are "co-linear". But unless the correlation is exactly ±1, that collinearity is imperfect.

Imperfect multicollinearity

Imperfect multicollinearity implies that one or more of the regression coefficients will be imprecisely estimated.
Intuition: the coefficient on X1 is the effect of X1 holding X2 constant; but if X1 and X2 are highly correlated, there is very little variation in X1 once X2 is held constant, so the data are pretty much uninformative about what happens when X1 changes but X2 does not. This means that the variance of the OLS estimator of the coefficient on X1 will be large.
Thus, imperfect multicollinearity (correctly) results in large standard errors for one or more of the OLS coefficients.
Importantly, imperfect multicollinearity does not violate Assumption #4. The OLS regression will run.

Where are we?

1 Omitted variable bias (SW 6.1)
2 Multiple regression model (SW 6.2, 6.3)
3 Measures of fit (SW 6.4)
4 The Least Squares Assumptions (SW 6.5)
5 Sampling distribution of the OLS estimator (SW 6.6)
6 Hypothesis tests and confidence intervals for a single coefficient (SW 7.1)
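The variance inflation from near-collinearity is easy to see in a Monte Carlo sketch with illustrative values: the sampling standard deviation of β̂1 grows sharply as corr(X1, X2) approaches 1, even though OLS still runs without complaint.

```python
import numpy as np

rng = np.random.default_rng(6)
n, reps = 100, 1000

def sd_of_b1(rho):
    """Monte Carlo sd of the OLS slope on X1 when corr(X1, X2) = rho."""
    b1 = np.empty(reps)
    for r in range(reps):
        X1 = rng.normal(size=n)
        X2 = rho * X1 + np.sqrt(1 - rho**2) * rng.normal(size=n)
        Y = 1.0 + X1 + X2 + rng.normal(size=n)   # true slopes are 1 (illustrative)
        X = np.column_stack([np.ones(n), X1, X2])
        b1[r] = np.linalg.lstsq(X, Y, rcond=None)[0][1]
    return b1.std()

low, high = sd_of_b1(0.1), sd_of_b1(0.95)
print(low, high)  # high is several times larger than low
```

With homoskedastic errors the theoretical sd of β̂1 is proportional to 1/sqrt(n(1 − ρ²)), so moving ρ from 0.1 to 0.95 roughly triples it, which is what the simulation shows.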

The sampling distribution of OLS

Under the four LS assumptions:
1 β̂0, β̂1, ..., β̂k are unbiased and consistent estimators of β0, β1, ..., βk.
2 The joint sampling distribution of β̂0, β̂1, ..., β̂k is well approximated by a multivariate normal distribution.
3 This implies that, in large samples, for j = 0, 1, ..., k,

β̂j ~ N(βj, σ²_β̂j)   or, equivalently,   (β̂j − βj)/σ_β̂j ~ N(0, 1)

The variance of the OLS estimator

There is a more complicated formula for the estimator of the variance of β̂j... but the software computes it for us!
As in the single-regressor case, there is a formula that holds only under homoskedasticity, i.e., when Var(u | X1, ..., Xk) is a constant that does not vary with the values of (X1, ..., Xk), and another formula that holds under heteroskedasticity.
As in the single-regressor case, we prefer the formula that is robust to heteroskedasticity because it is also correct under homoskedasticity.
Intuitively, we expect our estimator to be less precise (to have higher sampling variance) when using the same data to estimate more parameters. This is indeed correct, and the formula for the variance of β̂j (not shown) reflects this intuition, as it usually increases with the number of variables (k) included in the regression. This result prevents us from adding regressors without limit.
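A Monte Carlo sketch of these results, using simulated data with arbitrary true coefficients: across repeated samples, the OLS estimates of a single coefficient center on the true value, and roughly 95% of them fall within two standard deviations of their mean, as the normal approximation predicts.

```python
import numpy as np

rng = np.random.default_rng(5)
n, reps = 200, 2000
beta1 = 1.5                                 # true coefficient on X1 (illustrative)
b1_hats = np.empty(reps)

for r in range(reps):
    X1 = rng.normal(size=n)
    X2 = 0.5 * X1 + rng.normal(size=n)      # correlated regressors are fine
    Y = 1.0 + beta1 * X1 - 0.7 * X2 + rng.normal(size=n)
    X = np.column_stack([np.ones(n), X1, X2])
    b1_hats[r] = np.linalg.lstsq(X, Y, rcond=None)[0][1]

# Unbiasedness: the estimates center on the true beta1.
print(b1_hats.mean())                       # close to 1.5

# Approximate normality: about 95% of estimates lie within 2 sd of the mean.
frac = np.mean(np.abs(b1_hats - b1_hats.mean()) < 2 * b1_hats.std())
print(frac)                                 # close to 0.95
```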

Where are we?

1 Omitted variable bias (SW 6.1)
2 Multiple regression model (SW 6.2, 6.3)
3 Measures of fit (SW 6.4)
4 The Least Squares Assumptions (SW 6.5)
5 Sampling distribution of the OLS estimator (SW 6.6)
6 Hypothesis tests and confidence intervals for a single coefficient (SW 7.1)

Hypothesis tests and confidence intervals for a single coefficient

This follows the same logic and recipe as for the slope coefficient in a single-regressor model.
Because (β̂j − βj)/σ_β̂j is approximately distributed N(0, 1) in large samples (under the four LS assumptions), hypotheses on β1 can be tested using the usual t-statistic

t = (β̂1 − β1,0) / SE(β̂1)

and 95% confidence intervals are constructed as

β̂1 ± 1.96 SE(β̂1)

Similarly for β2, ..., βk.
β̂1 and β̂2 are generally not independently distributed, so neither are their t-statistics (more on this later).

The California school dataset

Single-regressor estimates (heteroskedasticity-robust standard errors):

. reg testscr str, robust
(420 observations; coefficient on str: −2.28)

The California school dataset

Multiple regression estimates (heteroskedasticity-robust standard errors):

. reg testscr str el_pct, robust
(420 observations; coefficient on str: −1.10, robust SE 0.43; coefficient on el_pct: −0.65)

Testing hypotheses and CIs in the California school dataset

The coefficient on STR in the multiple regression is the effect on testscore of a unit change in STR, holding constant the percentage of English learners in the district.
The coefficient on STR falls by one-half (in absolute value) when el_pct is added to the regression (does it make sense?).
The 95% confidence interval for the coefficient on STR is

{−1.10 ± 1.96 × 0.43} ≈ (−1.95, −0.25)

The t-statistic testing H0: β_STR = 0 is

t = (β̂_STR − 0)/SE(β̂_STR) = −1.10/0.43 ≈ −2.54

so we reject the null hypothesis at the 5% significance level.
We use heteroskedasticity-robust standard errors for exactly the same reasons as in the case of a single regressor.
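The slide's arithmetic can be reproduced directly from the reported coefficient and robust standard error; the small discrepancies relative to the slide come from using the rounded SE of 0.43 rather than the unrounded value.

```python
# Coefficient and robust SE on str from the multiple regression slide.
beta_hat, se = -1.10, 0.43

t = (beta_hat - 0) / se                            # t-statistic for H0: beta_STR = 0
ci = (beta_hat - 1.96 * se, beta_hat + 1.96 * se)  # 95% confidence interval
reject = abs(t) > 1.96                             # reject H0 at the 5% level?

print(round(t, 2))                                 # -2.56 (slide, with unrounded SE: -2.54)
print(round(ci[0], 2), round(ci[1], 2))            # -1.94 -0.26 (slide: -1.95, -0.25)
```

Since |t| exceeds the 5% critical value 1.96 and the interval excludes 0, the two ways of stating the result agree, as they must.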


More information

Essential of Simple regression

Essential of Simple regression Essential of Simple regression We use simple regression when we are interested in the relationship between two variables (e.g., x is class size, and y is student s GPA). For simplicity we assume the relationship

More information

ECO220Y Simple Regression: Testing the Slope

ECO220Y Simple Regression: Testing the Slope ECO220Y Simple Regression: Testing the Slope Readings: Chapter 18 (Sections 18.3-18.5) Winter 2012 Lecture 19 (Winter 2012) Simple Regression Lecture 19 1 / 32 Simple Regression Model y i = β 0 + β 1 x

More information

Contest Quiz 3. Question Sheet. In this quiz we will review concepts of linear regression covered in lecture 2.

Contest Quiz 3. Question Sheet. In this quiz we will review concepts of linear regression covered in lecture 2. Updated: November 17, 2011 Lecturer: Thilo Klein Contact: tk375@cam.ac.uk Contest Quiz 3 Question Sheet In this quiz we will review concepts of linear regression covered in lecture 2. NOTE: Please round

More information

ECON3150/4150 Spring 2016

ECON3150/4150 Spring 2016 ECON3150/4150 Spring 2016 Lecture 4 - The linear regression model Siv-Elisabeth Skjelbred University of Oslo Last updated: January 26, 2016 1 / 49 Overview These lecture slides covers: The linear regression

More information

The F distribution. If: 1. u 1,,u n are normally distributed; and 2. X i is distributed independently of u i (so in particular u i is homoskedastic)

The F distribution. If: 1. u 1,,u n are normally distributed; and 2. X i is distributed independently of u i (so in particular u i is homoskedastic) The F distribution If: 1. u 1,,u n are normally distributed; and. X i is distributed independently of u i (so in particular u i is homoskedastic) then the homoskedasticity-only F-statistic has the F q,n-k

More information

The Simple Linear Regression Model

The Simple Linear Regression Model The Simple Linear Regression Model Lesson 3 Ryan Safner 1 1 Department of Economics Hood College ECON 480 - Econometrics Fall 2017 Ryan Safner (Hood College) ECON 480 - Lesson 3 Fall 2017 1 / 77 Bivariate

More information

Econometrics 1. Lecture 8: Linear Regression (2) 黄嘉平

Econometrics 1. Lecture 8: Linear Regression (2) 黄嘉平 Econometrics 1 Lecture 8: Linear Regression (2) 黄嘉平 中国经济特区研究中 心讲师 办公室 : 文科楼 1726 E-mail: huangjp@szu.edu.cn Tel: (0755) 2695 0548 Office hour: Mon./Tue. 13:00-14:00 The linear regression model The linear

More information

Multiple Regression. Midterm results: AVG = 26.5 (88%) A = 27+ B = C =

Multiple Regression. Midterm results: AVG = 26.5 (88%) A = 27+ B = C = Economics 130 Lecture 6 Midterm Review Next Steps for the Class Multiple Regression Review & Issues Model Specification Issues Launching the Projects!!!!! Midterm results: AVG = 26.5 (88%) A = 27+ B =

More information

ECON Introductory Econometrics. Lecture 16: Instrumental variables

ECON Introductory Econometrics. Lecture 16: Instrumental variables ECON4150 - Introductory Econometrics Lecture 16: Instrumental variables Monique de Haan (moniqued@econ.uio.no) Stock and Watson Chapter 12 Lecture outline 2 OLS assumptions and when they are violated Instrumental

More information

ECO321: Economic Statistics II

ECO321: Economic Statistics II ECO321: Economic Statistics II Chapter 6: Linear Regression a Hiroshi Morita hmorita@hunter.cuny.edu Department of Economics Hunter College, The City University of New York a c 2010 by Hiroshi Morita.

More information

LECTURE 10. Introduction to Econometrics. Multicollinearity & Heteroskedasticity

LECTURE 10. Introduction to Econometrics. Multicollinearity & Heteroskedasticity LECTURE 10 Introduction to Econometrics Multicollinearity & Heteroskedasticity November 22, 2016 1 / 23 ON PREVIOUS LECTURES We discussed the specification of a regression equation Specification consists

More information

ECON Introductory Econometrics. Lecture 4: Linear Regression with One Regressor

ECON Introductory Econometrics. Lecture 4: Linear Regression with One Regressor ECON4150 - Introductory Econometrics Lecture 4: Linear Regression with One Regressor Monique de Haan (moniqued@econ.uio.no) Stock and Watson Chapter 4 Lecture outline 2 The OLS estimators The effect of

More information

ECON Introductory Econometrics. Lecture 17: Experiments

ECON Introductory Econometrics. Lecture 17: Experiments ECON4150 - Introductory Econometrics Lecture 17: Experiments Monique de Haan (moniqued@econ.uio.no) Stock and Watson Chapter 13 Lecture outline 2 Why study experiments? The potential outcome framework.

More information

Statistical Inference with Regression Analysis

Statistical Inference with Regression Analysis Introductory Applied Econometrics EEP/IAS 118 Spring 2015 Steven Buck Lecture #13 Statistical Inference with Regression Analysis Next we turn to calculating confidence intervals and hypothesis testing

More information

Introduction to Econometrics

Introduction to Econometrics Introduction to Econometrics STAT-S-301 Panel Data (2016/2017) Lecturer: Yves Dominicy Teaching Assistant: Elise Petit 1 Regression with Panel Data A panel dataset contains observations on multiple entities

More information

2) For a normal distribution, the skewness and kurtosis measures are as follows: A) 1.96 and 4 B) 1 and 2 C) 0 and 3 D) 0 and 0

2) For a normal distribution, the skewness and kurtosis measures are as follows: A) 1.96 and 4 B) 1 and 2 C) 0 and 3 D) 0 and 0 Introduction to Econometrics Midterm April 26, 2011 Name Student ID MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question. (5,000 credit for each correct

More information

Econ 1123: Section 2. Review. Binary Regressors. Bivariate. Regression. Omitted Variable Bias

Econ 1123: Section 2. Review. Binary Regressors. Bivariate. Regression. Omitted Variable Bias Contact Information Elena Llaudet Sections are voluntary. My office hours are Thursdays 5pm-7pm in Littauer Mezzanine 34-36 (Note room change) You can email me administrative questions to ellaudet@gmail.com.

More information

Multivariate Regression: Part I

Multivariate Regression: Part I Topic 1 Multivariate Regression: Part I ARE/ECN 240 A Graduate Econometrics Professor: Òscar Jordà Outline of this topic Statement of the objective: we want to explain the behavior of one variable as a

More information

Chapter 2: simple regression model

Chapter 2: simple regression model Chapter 2: simple regression model Goal: understand how to estimate and more importantly interpret the simple regression Reading: chapter 2 of the textbook Advice: this chapter is foundation of econometrics.

More information

Lecture 4: Multivariate Regression, Part 2

Lecture 4: Multivariate Regression, Part 2 Lecture 4: Multivariate Regression, Part 2 Gauss-Markov Assumptions 1) Linear in Parameters: Y X X X i 0 1 1 2 2 k k 2) Random Sampling: we have a random sample from the population that follows the above

More information

Econometrics I KS. Module 1: Bivariate Linear Regression. Alexander Ahammer. This version: March 12, 2018

Econometrics I KS. Module 1: Bivariate Linear Regression. Alexander Ahammer. This version: March 12, 2018 Econometrics I KS Module 1: Bivariate Linear Regression Alexander Ahammer Department of Economics Johannes Kepler University of Linz This version: March 12, 2018 Alexander Ahammer (JKU) Module 1: Bivariate

More information

Lecture #8 & #9 Multiple regression

Lecture #8 & #9 Multiple regression Lecture #8 & #9 Multiple regression Starting point: Y = f(x 1, X 2,, X k, u) Outcome variable of interest (movie ticket price) a function of several variables. Observables and unobservables. One or more

More information

ECON2228 Notes 2. Christopher F Baum. Boston College Economics. cfb (BC Econ) ECON2228 Notes / 47

ECON2228 Notes 2. Christopher F Baum. Boston College Economics. cfb (BC Econ) ECON2228 Notes / 47 ECON2228 Notes 2 Christopher F Baum Boston College Economics 2014 2015 cfb (BC Econ) ECON2228 Notes 2 2014 2015 1 / 47 Chapter 2: The simple regression model Most of this course will be concerned with

More information

THE MULTIVARIATE LINEAR REGRESSION MODEL

THE MULTIVARIATE LINEAR REGRESSION MODEL THE MULTIVARIATE LINEAR REGRESSION MODEL Why multiple regression analysis? Model with more than 1 independent variable: y 0 1x1 2x2 u It allows : -Controlling for other factors, and get a ceteris paribus

More information

Introduction to Econometrics. Regression with Panel Data

Introduction to Econometrics. Regression with Panel Data Introduction to Econometrics The statistical analysis of economic (and related) data STATS301 Regression with Panel Data Titulaire: Christopher Bruffaerts Assistant: Lorenzo Ricci 1 Regression with Panel

More information

4. Nonlinear regression functions

4. Nonlinear regression functions 4. Nonlinear regression functions Up to now: Population regression function was assumed to be linear The slope(s) of the population regression function is (are) constant The effect on Y of a unit-change

More information

Simple Linear Regression: The Model

Simple Linear Regression: The Model Simple Linear Regression: The Model task: quantifying the effect of change X in X on Y, with some constant β 1 : Y = β 1 X, linear relationship between X and Y, however, relationship subject to a random

More information

Lecture 4: Multivariate Regression, Part 2

Lecture 4: Multivariate Regression, Part 2 Lecture 4: Multivariate Regression, Part 2 Gauss-Markov Assumptions 1) Linear in Parameters: Y X X X i 0 1 1 2 2 k k 2) Random Sampling: we have a random sample from the population that follows the above

More information

Ch 2: Simple Linear Regression

Ch 2: Simple Linear Regression Ch 2: Simple Linear Regression 1. Simple Linear Regression Model A simple regression model with a single regressor x is y = β 0 + β 1 x + ɛ, where we assume that the error ɛ is independent random component

More information

Lecture notes to Stock and Watson chapter 8

Lecture notes to Stock and Watson chapter 8 Lecture notes to Stock and Watson chapter 8 Nonlinear regression Tore Schweder September 29 TS () LN7 9/9 1 / 2 Example: TestScore Income relation, linear or nonlinear? TS () LN7 9/9 2 / 2 General problem

More information

Homoskedasticity. Var (u X) = σ 2. (23)

Homoskedasticity. Var (u X) = σ 2. (23) Homoskedasticity How big is the difference between the OLS estimator and the true parameter? To answer this question, we make an additional assumption called homoskedasticity: Var (u X) = σ 2. (23) This

More information

Lecture 8: Instrumental Variables Estimation

Lecture 8: Instrumental Variables Estimation Lecture Notes on Advanced Econometrics Lecture 8: Instrumental Variables Estimation Endogenous Variables Consider a population model: y α y + β + β x + β x +... + β x + u i i i i k ik i Takashi Yamano

More information

Empirical Application of Simple Regression (Chapter 2)

Empirical Application of Simple Regression (Chapter 2) Empirical Application of Simple Regression (Chapter 2) 1. The data file is House Data, which can be downloaded from my webpage. 2. Use stata menu File Import Excel Spreadsheet to read the data. Don t forget

More information

Final Exam. Question 1 (20 points) 2 (25 points) 3 (30 points) 4 (25 points) 5 (10 points) 6 (40 points) Total (150 points) Bonus question (10)

Final Exam. Question 1 (20 points) 2 (25 points) 3 (30 points) 4 (25 points) 5 (10 points) 6 (40 points) Total (150 points) Bonus question (10) Name Economics 170 Spring 2004 Honor pledge: I have neither given nor received aid on this exam including the preparation of my one page formula list and the preparation of the Stata assignment for the

More information

General Linear Model (Chapter 4)

General Linear Model (Chapter 4) General Linear Model (Chapter 4) Outcome variable is considered continuous Simple linear regression Scatterplots OLS is BLUE under basic assumptions MSE estimates residual variance testing regression coefficients

More information

Introductory Econometrics. Lecture 13: Hypothesis testing in the multiple regression model, Part 1

Introductory Econometrics. Lecture 13: Hypothesis testing in the multiple regression model, Part 1 Introductory Econometrics Lecture 13: Hypothesis testing in the multiple regression model, Part 1 Jun Ma School of Economics Renmin University of China October 19, 2016 The model I We consider the classical

More information

Introduction to Econometrics. Review of Probability & Statistics

Introduction to Econometrics. Review of Probability & Statistics 1 Introduction to Econometrics Review of Probability & Statistics Peerapat Wongchaiwat, Ph.D. wongchaiwat@hotmail.com Introduction 2 What is Econometrics? Econometrics consists of the application of mathematical

More information

8. Instrumental variables regression

8. Instrumental variables regression 8. Instrumental variables regression Recall: In Section 5 we analyzed five sources of estimation bias arising because the regressor is correlated with the error term Violation of the first OLS assumption

More information

Linear Regression with one Regressor

Linear Regression with one Regressor 1 Linear Regression with one Regressor Covering Chapters 4.1 and 4.2. We ve seen the California test score data before. Now we will try to estimate the marginal effect of STR on SCORE. To motivate these

More information

Lab 07 Introduction to Econometrics

Lab 07 Introduction to Econometrics Lab 07 Introduction to Econometrics Learning outcomes for this lab: Introduce the different typologies of data and the econometric models that can be used Understand the rationale behind econometrics Understand

More information

Economics 326 Methods of Empirical Research in Economics. Lecture 14: Hypothesis testing in the multiple regression model, Part 2

Economics 326 Methods of Empirical Research in Economics. Lecture 14: Hypothesis testing in the multiple regression model, Part 2 Economics 326 Methods of Empirical Research in Economics Lecture 14: Hypothesis testing in the multiple regression model, Part 2 Vadim Marmer University of British Columbia May 5, 2010 Multiple restrictions

More information

Specification Error: Omitted and Extraneous Variables

Specification Error: Omitted and Extraneous Variables Specification Error: Omitted and Extraneous Variables Richard Williams, University of Notre Dame, https://www3.nd.edu/~rwilliam/ Last revised February 5, 05 Omitted variable bias. Suppose that the correct

More information

Econometrics -- Final Exam (Sample)

Econometrics -- Final Exam (Sample) Econometrics -- Final Exam (Sample) 1) The sample regression line estimated by OLS A) has an intercept that is equal to zero. B) is the same as the population regression line. C) cannot have negative and

More information

1: a b c d e 2: a b c d e 3: a b c d e 4: a b c d e 5: a b c d e. 6: a b c d e 7: a b c d e 8: a b c d e 9: a b c d e 10: a b c d e

1: a b c d e 2: a b c d e 3: a b c d e 4: a b c d e 5: a b c d e. 6: a b c d e 7: a b c d e 8: a b c d e 9: a b c d e 10: a b c d e Economics 102: Analysis of Economic Data Cameron Spring 2016 Department of Economics, U.C.-Davis Final Exam (A) Tuesday June 7 Compulsory. Closed book. Total of 58 points and worth 45% of course grade.

More information

Warwick Economics Summer School Topics in Microeconometrics Instrumental Variables Estimation

Warwick Economics Summer School Topics in Microeconometrics Instrumental Variables Estimation Warwick Economics Summer School Topics in Microeconometrics Instrumental Variables Estimation Michele Aquaro University of Warwick This version: July 21, 2016 1 / 31 Reading material Textbook: Introductory

More information

Exam ECON3150/4150: Introductory Econometrics. 18 May 2016; 09:00h-12.00h.

Exam ECON3150/4150: Introductory Econometrics. 18 May 2016; 09:00h-12.00h. Exam ECON3150/4150: Introductory Econometrics. 18 May 2016; 09:00h-12.00h. This is an open book examination where all printed and written resources, in addition to a calculator, are allowed. If you are

More information

6. Assessing studies based on multiple regression

6. Assessing studies based on multiple regression 6. Assessing studies based on multiple regression Questions of this section: What makes a study using multiple regression (un)reliable? When does multiple regression provide a useful estimate of the causal

More information

Econometrics Review questions for exam

Econometrics Review questions for exam Econometrics Review questions for exam Nathaniel Higgins nhiggins@jhu.edu, 1. Suppose you have a model: y = β 0 x 1 + u You propose the model above and then estimate the model using OLS to obtain: ŷ =

More information

Motivation for multiple regression

Motivation for multiple regression Motivation for multiple regression 1. Simple regression puts all factors other than X in u, and treats them as unobserved. Effectively the simple regression does not account for other factors. 2. The slope

More information

Lab 6 - Simple Regression

Lab 6 - Simple Regression Lab 6 - Simple Regression Spring 2017 Contents 1 Thinking About Regression 2 2 Regression Output 3 3 Fitted Values 5 4 Residuals 6 5 Functional Forms 8 Updated from Stata tutorials provided by Prof. Cichello

More information

WISE MA/PhD Programs Econometrics Instructor: Brett Graham Spring Semester, Academic Year Exam Version: A

WISE MA/PhD Programs Econometrics Instructor: Brett Graham Spring Semester, Academic Year Exam Version: A WISE MA/PhD Programs Econometrics Instructor: Brett Graham Spring Semester, 2016-17 Academic Year Exam Version: A INSTRUCTIONS TO STUDENTS 1 The time allowed for this examination paper is 2 hours. 2 This

More information

5. Let W follow a normal distribution with mean of μ and the variance of 1. Then, the pdf of W is

5. Let W follow a normal distribution with mean of μ and the variance of 1. Then, the pdf of W is Practice Final Exam Last Name:, First Name:. Please write LEGIBLY. Answer all questions on this exam in the space provided (you may use the back of any page if you need more space). Show all work but do

More information

Multiple Regression Analysis. Part III. Multiple Regression Analysis

Multiple Regression Analysis. Part III. Multiple Regression Analysis Part III Multiple Regression Analysis As of Sep 26, 2017 1 Multiple Regression Analysis Estimation Matrix form Goodness-of-Fit R-square Adjusted R-square Expected values of the OLS estimators Irrelevant

More information

Linear Regression with 1 Regressor. Introduction to Econometrics Spring 2012 Ken Simons

Linear Regression with 1 Regressor. Introduction to Econometrics Spring 2012 Ken Simons Linear Regression with 1 Regressor Introduction to Econometrics Spring 2012 Ken Simons Linear Regression with 1 Regressor 1. The regression equation 2. Estimating the equation 3. Assumptions required for

More information

Correlation and Simple Linear Regression

Correlation and Simple Linear Regression Correlation and Simple Linear Regression Sasivimol Rattanasiri, Ph.D Section for Clinical Epidemiology and Biostatistics Ramathibodi Hospital, Mahidol University E-mail: sasivimol.rat@mahidol.ac.th 1 Outline

More information

2. (3.5) (iii) Simply drop one of the independent variables, say leisure: GP A = β 0 + β 1 study + β 2 sleep + β 3 work + u.

2. (3.5) (iii) Simply drop one of the independent variables, say leisure: GP A = β 0 + β 1 study + β 2 sleep + β 3 work + u. BOSTON COLLEGE Department of Economics EC 228 Econometrics, Prof. Baum, Ms. Yu, Fall 2003 Problem Set 3 Solutions Problem sets should be your own work. You may work together with classmates, but if you

More information

Lecture 5: Omitted Variables, Dummy Variables and Multicollinearity

Lecture 5: Omitted Variables, Dummy Variables and Multicollinearity Lecture 5: Omitted Variables, Dummy Variables and Multicollinearity R.G. Pierse 1 Omitted Variables Suppose that the true model is Y i β 1 + β X i + β 3 X 3i + u i, i 1,, n (1.1) where β 3 0 but that the

More information

Review of Statistics 101

Review of Statistics 101 Review of Statistics 101 We review some important themes from the course 1. Introduction Statistics- Set of methods for collecting/analyzing data (the art and science of learning from data). Provides methods

More information

1 Motivation for Instrumental Variable (IV) Regression

1 Motivation for Instrumental Variable (IV) Regression ECON 370: IV & 2SLS 1 Instrumental Variables Estimation and Two Stage Least Squares Econometric Methods, ECON 370 Let s get back to the thiking in terms of cross sectional (or pooled cross sectional) data

More information

CHAPTER 6: SPECIFICATION VARIABLES

CHAPTER 6: SPECIFICATION VARIABLES Recall, we had the following six assumptions required for the Gauss-Markov Theorem: 1. The regression model is linear, correctly specified, and has an additive error term. 2. The error term has a zero

More information

Chapter 12 - Lecture 2 Inferences about regression coefficient

Chapter 12 - Lecture 2 Inferences about regression coefficient Chapter 12 - Lecture 2 Inferences about regression coefficient April 19th, 2010 Facts about slope Test Statistic Confidence interval Hypothesis testing Test using ANOVA Table Facts about slope In previous

More information

ECON Introductory Econometrics. Lecture 13: Internal and external validity

ECON Introductory Econometrics. Lecture 13: Internal and external validity ECON4150 - Introductory Econometrics Lecture 13: Internal and external validity Monique de Haan (moniqued@econ.uio.no) Stock and Watson Chapter 9 Lecture outline 2 Definitions of internal and external

More information

Lab 11 - Heteroskedasticity

Lab 11 - Heteroskedasticity Lab 11 - Heteroskedasticity Spring 2017 Contents 1 Introduction 2 2 Heteroskedasticity 2 3 Addressing heteroskedasticity in Stata 3 4 Testing for heteroskedasticity 4 5 A simple example 5 1 1 Introduction

More information

ECNS 561 Multiple Regression Analysis

ECNS 561 Multiple Regression Analysis ECNS 561 Multiple Regression Analysis Model with Two Independent Variables Consider the following model Crime i = β 0 + β 1 Educ i + β 2 [what else would we like to control for?] + ε i Here, we are taking

More information

Econometrics. 8) Instrumental variables

Econometrics. 8) Instrumental variables 30C00200 Econometrics 8) Instrumental variables Timo Kuosmanen Professor, Ph.D. http://nomepre.net/index.php/timokuosmanen Today s topics Thery of IV regression Overidentification Two-stage least squates

More information

10. Time series regression and forecasting

10. Time series regression and forecasting 10. Time series regression and forecasting Key feature of this section: Analysis of data on a single entity observed at multiple points in time (time series data) Typical research questions: What is the

More information

WISE International Masters

WISE International Masters WISE International Masters ECONOMETRICS Instructor: Brett Graham INSTRUCTIONS TO STUDENTS 1 The time allowed for this examination paper is 2 hours. 2 This examination paper contains 32 questions. You are

More information

Econometrics I Lecture 3: The Simple Linear Regression Model

Econometrics I Lecture 3: The Simple Linear Regression Model Econometrics I Lecture 3: The Simple Linear Regression Model Mohammad Vesal Graduate School of Management and Economics Sharif University of Technology 44716 Fall 1397 1 / 32 Outline Introduction Estimating

More information

Econ 1123: Section 5. Review. Internal Validity. Panel Data. Clustered SE. STATA help for Problem Set 5. Econ 1123: Section 5.

Econ 1123: Section 5. Review. Internal Validity. Panel Data. Clustered SE. STATA help for Problem Set 5. Econ 1123: Section 5. Outline 1 Elena Llaudet 2 3 4 October 6, 2010 5 based on Common Mistakes on P. Set 4 lnftmpop = -.72-2.84 higdppc -.25 lackpf +.65 higdppc * lackpf 2 lnftmpop = β 0 + β 1 higdppc + β 2 lackpf + β 3 lackpf

More information

Problem Set #3-Key. wage Coef. Std. Err. t P> t [95% Conf. Interval]

Problem Set #3-Key. wage Coef. Std. Err. t P> t [95% Conf. Interval] Problem Set #3-Key Sonoma State University Economics 317- Introduction to Econometrics Dr. Cuellar 1. Use the data set Wage1.dta to answer the following questions. a. For the regression model Wage i =

More information

Regression #8: Loose Ends

Regression #8: Loose Ends Regression #8: Loose Ends Econ 671 Purdue University Justin L. Tobias (Purdue) Regression #8 1 / 30 In this lecture we investigate a variety of topics that you are probably familiar with, but need to touch

More information

Econometrics - 30C00200

Econometrics - 30C00200 Econometrics - 30C00200 Lecture 11: Heteroskedasticity Antti Saastamoinen VATT Institute for Economic Research Fall 2015 30C00200 Lecture 11: Heteroskedasticity 12.10.2015 Aalto University School of Business

More information

1 Linear Regression Analysis The Mincer Wage Equation Data Econometric Model Estimation... 11

1 Linear Regression Analysis The Mincer Wage Equation Data Econometric Model Estimation... 11 Econ 495 - Econometric Review 1 Contents 1 Linear Regression Analysis 4 1.1 The Mincer Wage Equation................. 4 1.2 Data............................. 6 1.3 Econometric Model.....................

More information

Rewrap ECON November 18, () Rewrap ECON 4135 November 18, / 35

Rewrap ECON November 18, () Rewrap ECON 4135 November 18, / 35 Rewrap ECON 4135 November 18, 2011 () Rewrap ECON 4135 November 18, 2011 1 / 35 What should you now know? 1 What is econometrics? 2 Fundamental regression analysis 1 Bivariate regression 2 Multivariate

More information

Statistical Modelling in Stata 5: Linear Models

Statistical Modelling in Stata 5: Linear Models Statistical Modelling in Stata 5: Linear Models Mark Lunt Arthritis Research UK Epidemiology Unit University of Manchester 07/11/2017 Structure This Week What is a linear model? How good is my model? Does

More information