MLR Models: Estimation and Inference (v.3)


Comparison of SLR and MLR Analysis: What's New?

Roadmap:
- Comparison of SLR and MLR analysis: What's new?
- Multicollinearity and standard errors
- F tests of linear restrictions
- F stats, adjusted R-squared, RMSE and t stats
- Playing with Bodyfat: F tests in action

Those (S/M)LR Assumptions. If these hold, OLS is LUE: it provides unbiased parameter estimates.

SLR
- SLR.1: Linear Model: Y = β0 + β1 X + U
- SLR.2: Random sampling
- SLR.3: Sampling variation in the independent variable
- SLR.4: Zero Conditional Mean: E(U | X = x) = 0 for any x

MLR
- MLR.1: Linear Model: Y = β0 + Σj βj Xj + U
- MLR.2: Random sampling
- MLR.3: No perfect collinearity
- MLR.4: Zero Conditional Mean: E(U | x1, ..., xk) = 0 for any x1, ..., xk

PRF (Population Regression Function):
SLR: E(Y | X = x) = β0 + β1 x
MLR: E(Y | x1, ..., xk) = β0 + Σj βj xj

Add one more assumption:
- SLR.5: Homoskedastic errors (conditional on x)
- MLR.5: Homoskedastic errors (conditional on the xj's)

If all 5 conditions hold:
SLR: Var(B1) = σ² / Σi (xi − x̄)²
MLR: Var(B1 | x, w, z, ... etc.) = [σ² / Σi (xi − x̄)²] · 1/(1 − R²1)

VIF (Variance Inflation Factor): VIFj = 1/(1 − R²j)

So Var(B1 | x, w, z, ... etc.) = [σ² / Σi (xi − x̄)²] · VIF1.

Estimating σ²:
SLR: σ̂² = SSR/(n − 2) = MSE
MLR: σ̂² = SSR/(n − (k+1)) = MSE
In both cases MSE is an unbiased estimator of σ².

Gauss-Markov Thm.: SLR.1–SLR.5 ⇒ OLS is BLUE; MLR.1–MLR.5 ⇒ OLS is BLUE.

2 Inference

Add one last assumption (distribution of Yi):
- SLR.6: Ui ~ N(0, σ²) and indept. of X
- MLR.6: Ui ~ N(0, σ²) and indept. of the Xj's

Degrees of freedom: SLR: n − 2; MLR: n − k − 1.

Distribution of estimator (t statistic):
SLR: (B1 − β1)/se(B1) ~ t(n−2), where se(B1) = sqrt[ MSE / Σi (xi − x̄)² ]
MLR: (Bj − βj)/se(Bj) ~ t(n−k−1), where se(Bj) = sqrt[ MSE · VIFj / Σi (xij − x̄j)² ]

Confidence intervals (C is the confidence level):
SLR: B1 ± c·se(B1), where P(|t(n−2)| < c) = C
MLR: Bj ± c·se(Bj), where P(|t(n−k−1)| < c) = C

Null Hypothesis: H0: β1 = 0 (SLR); H0: βj = 0 (MLR).

t stat (under H0):
SLR: tstat = B1/se(B1) ~ t(n−2)
MLR: tstat = Bj/se(Bj) ~ t(n−k−1)

p value (for given data set and OLS estimates):
SLR: p = P(|t(n−2)| > |tstat|)
MLR: p = P(|t(n−k−1)| > |tstat|)

Hypothesis test (for given α): Reject H0 if |tstat| > c, or equivalently if p < α.

Roadmap

1. There are remarkably few substantive differences between SLR and MLR models with respect to estimation and inference. Or put differently: much of what we learned about SLR models carries over to MLR models.
2. Some new issues arise because we now have the possibility of more than one explanatory variable in the model:
a. Small differences in formulas for adjusted R-squared, MSE, estimators, confidence intervals and hypothesis tests (VIFj and n−k−1).
b. F statistics and F tests: testing linear restrictions on unknown parameters (plural).
c. And, umm, that's about it.
3. The MLR.1–MLR.6 conditions are basically the same as the SLR conditions, though SLR.3 (variation in the RHS variable) is replaced by MLR.3 (no perfect collinearity amongst the RHS variables).
4. In several cases, we replace the (n−2) term in various SLR formulas with (n−k−1) in MLR formulas, where k is the number of explanatory variables in the model.
a. The t-distributions used in hypothesis testing and confidence intervals now have n−k−1 degrees of freedom. And the F statistic does as well, sort of.
b. The formula for adjusted R-squared now has an n−k−1 term.
c. In the MSE formula, we now divide by n−k−1 to generate an unbiased estimator. And yes, n−2 = n−k−1 for SLR models, where k = 1.

Adjusted R-squared will now be of more value as we compare goodness of fit across models with differing numbers of explanatory variables.
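Point 4c can be checked by simulation. The sketch below (not from the notes; the sample size, coefficients and error variance are made-up assumptions) repeatedly fits an MLR model by OLS and shows that SSR/(n−k−1) averages out to the true σ².

```python
# Sketch: check that MSE = SSR/(n-k-1) is an unbiased estimator of sigma^2.
# All simulated parameters here are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)
n, k, sigma2 = 200, 2, 4.0
X = np.column_stack([np.ones(n), rng.normal(size=(n, k))])  # intercept + k regressors
beta = np.array([1.0, 2.0, -1.0])

mses = []
for _ in range(2000):
    y = X @ beta + rng.normal(scale=np.sqrt(sigma2), size=n)
    b, *_ = np.linalg.lstsq(X, y, rcond=None)   # OLS estimates
    ssr = np.sum((y - X @ b) ** 2)              # sum of squared residuals
    mses.append(ssr / (n - k - 1))              # divide by n-k-1, not n

print(np.mean(mses))  # close to sigma2 = 4.0
```

Dividing by n instead of n−k−1 would bias the estimate downward, since OLS chooses the coefficients that minimize the SSR.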

Multicollinearity and Standard Errors

1. Multicollinearity: Recall that multicollinearity captures the extent to which the values of one (RHS) variable can be predicted by a linear function of the other RHS variables.
a. Typically measured as R²j (R-squared j): the R² of the collinearity regression in which one of the RHS variables is regressed on the other RHS variables. So, for example, R²1 is the R² of the following MLR model: X1 = γ0 + γ2 X2 + γ3 X3 + ... + γk Xk + U.
2. Variance Inflation Factors: VIFj = 1/(1 − R²j), and so named because given MLR.1–5 and conditional on the x's,
Var(B1) = σ² / [S11 (1 − R²1)] = [σ² / Σi (xi1 − x̄1)²] · 1/(1 − R²1) = [σ² / Σi (xi1 − x̄1)²] · VIF1.
3. As we saw earlier in the semester, while the presence of multicollinearity will lead to higher standard errors, the real problem with it is that it can lead to wacky estimated coefficients because of the ceteris paribus condition (in some cases it doesn't make sense to hold everything else fixed while changing just one explanatory variable). So if you have strange estimated coefficients, see if multicollinearity is driving them.
a. Note that since VIFj = 1/(1 − R²j), R²j = 1 − 1/VIFj. Recall that the second relationship is useful in calculating R²j using the output from Stata's vif command.
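The collinearity regression in point 1a can be sketched directly. The helper below (an assumed name, not from the notes) regresses column j on the other columns and returns VIFj = 1/(1 − R²j); the two simulated regressors are deliberately correlated.

```python
# Sketch (assumed helper): VIF_j from the auxiliary "collinearity regression".
import numpy as np

def vif(X, j):
    """Variance inflation factor for column j of the regressor matrix X
    (X should NOT include the intercept column)."""
    n = X.shape[0]
    y = X[:, j]
    others = np.column_stack([np.ones(n), np.delete(X, j, axis=1)])
    b, *_ = np.linalg.lstsq(others, y, rcond=None)
    resid = y - others @ b
    r2_j = 1 - resid @ resid / np.sum((y - y.mean()) ** 2)  # R^2_j
    return 1.0 / (1.0 - r2_j)

rng = np.random.default_rng(1)
x1 = rng.normal(size=500)
x2 = 0.9 * x1 + rng.normal(scale=0.5, size=500)   # built to be collinear with x1
X = np.column_stack([x1, x2])
print(vif(X, 0))  # substantially above 1 because x1 and x2 are correlated
```

With uncorrelated regressors the auxiliary R² is near zero and the VIF is near 1, its minimum value.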

F Tests of Linear Restrictions

4. To do an F test of a set of q linear restrictions on the estimated coefficients:
a. Start with the Null Hypothesis that it's A-OK to impose some linear restrictions on the OLS model.
i. Examples of linear restrictions: a) β1 = β2, b) β1 + β2 = 0, c) β1 = 0 and β2 = 0, and d) β1 and β2 each set equal to a specific value. Linear restrictions are linear functions of the unknown parameters.
ii. We'll be interested in counting the number of restrictions. To do that, just count the equals signs. So in the examples, a) and b) have one restriction, and c) and d) have two.
b. Estimate the model with and without the restrictions.
c. Since we've imposed a restriction (or restrictions) on the estimated coefficients, the SSRs will almost always increase: SSR_R ≥ SSR_UR. (The subscripts refer to the Restricted and Unrestricted models.)
d. But by how much? A lot? Or maybe not so much?
e. If SSRs increase by a lot (whatever that is), then the restrictions severely impacted the performance of the model, and so we reject the Null Hypothesis (which was that the restrictions were A-OK).
f. But if not so much, then maybe those restrictions weren't so bad after all, and we might fail to reject. Which is to say that it really was A-OK to impose those restrictions.
5. F statistic: We use the F statistic (and the F distribution) to do the F test. The F statistic is defined by
F = [(SSR_R − SSR_UR)/q] / [SSR_UR/(n − k − 1)],
where q is the number of restrictions (eg the # of ='s), and n − k − 1 is the number of degrees of freedom in the unrestricted (UR) model. By construction F ≥ 0.
6. Elasticity, another interpretation of the F statistic: We can rewrite the equation for F as
F = [(SSR_R − SSR_UR)/SSR_UR] / [q/(n − k − 1)] = %ΔSSR / %Δdofs,
so the F statistic tells you the %change in SSRs for a given %change in degrees of freedom (you might call this bang per buck).
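The F statistic in point 5 can be sketched as a two-line function. The SSR values below are made up for illustration; only the formula is from the notes.

```python
# Sketch: F stat for q linear restrictions from restricted/unrestricted SSRs.
from scipy import stats

def f_test(ssr_r, ssr_ur, q, df_ur):
    """F = [(SSR_R - SSR_UR)/q] / [SSR_UR/(n-k-1)], with df_ur = n-k-1."""
    F = ((ssr_r - ssr_ur) / q) / (ssr_ur / df_ur)
    p = stats.f.sf(F, q, df_ur)      # upper-tail probability (the p-value)
    return F, p

# Hypothetical numbers: imposing 2 restrictions raised the SSR from 100 to 120.
F, p = f_test(ssr_r=120.0, ssr_ur=100.0, q=2, df_ur=50)
print(F, p)   # F = 5.0
```

Here the restrictions raised the SSR by 20%, against a 4% gain in degrees of freedom, so F = 5 and the Null would be rejected at conventional significance levels.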

7. Under MLR.1–MLR.6, the F statistic above will have an F(q, n − k − 1) distribution.

[Figure: density functions of the F(q, df) distribution for several combinations of q and df.]

8. We do the F test following the classical hypothesis testing protocol:
a. The Null Hypothesis is that the restrictions are A-OK.
b. Focus on Type I errors and the probability of a False Rejection.
c. Pick a small significance level, α, say α = .05, which will be the maximum acceptable probability of a False Rejection.
d. For the given Fstat, compute the p-value as the probability in the tail to the right of the Fstat: p = P(F(q, n − k − 1) > Fstat).
e. The p value is the probability that you'd observe an F statistic at least as large as Fstat when the Null Hypothesis was in fact true.

f. Reject the Null Hypothesis if p < α, and fail to reject otherwise.
i. So reject the Null Hypothesis if the F statistic is large and the associated probability level is small (less than the significance level α). In other words: the restrictions are rejected because when they were imposed, the world got a whole lot worse.
ii. If the Fstat (elasticity) is small, so that relatively speaking the SSRs did not increase by much, then the associated probability level is high, and the restrictions are not rejected and are deemed to be OK.
9. And so we reject the Null Hypothesis if the probability value associated with this test statistic is below the desired significance level, or equivalently, if the F statistic is too large.
10. Example: Testing whether the hgt parameter is zero in the following SLR model.

. reg Brozek hgt
[regression output table: numeric values garbled in extraction]

. test hgt=0
 ( 1)  hgt = 0
[F stat and Prob > F: values garbled in extraction]

Here q = 1 and the denominator degrees of freedom are n − k − 1. The p value (Prob > F) is large, so we cannot reject at the 5% level the Null Hypothesis that the true parameter value for hgt is 0. Notice that the p value for the F test is the same as the p-value for the t stat in the regression results, and that the Fstat is the square of the reported tstat. These are not coincidences (see below).
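The "not a coincidence" in point 10 can be checked numerically: for a single restriction, P(F(1, df) > t²) = P(|t(df)| > |t|). The t value and degrees of freedom below are arbitrary choices for illustration.

```python
# Sketch: for one restriction, the F test p-value equals the two-sided t p-value.
from scipy import stats

t_stat, df = 1.7, 250
p_t = 2 * stats.t.sf(abs(t_stat), df)    # two-sided p-value from the t test
p_f = stats.f.sf(t_stat ** 2, 1, df)     # p-value from the F test with q = 1
print(p_t, p_f)  # the two p-values agree
```

The identity holds because the square of a t(df) random variable has an F(1, df) distribution.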

11. F stats defined using R-squared: You can also derive the F statistic using R-squared.
a. Define R²_R = 1 − SSR_R/SST and R²_UR = 1 − SSR_UR/SST, and so
F = [(SSR_R − SSR_UR)/q] / [SSR_UR/(n − k − 1)] = [(R²_UR − R²_R)/q] / [(1 − R²_UR)/(n − k − 1)].
b. Relation to t-stats: If you apply the F test to the case of testing just one parameter value (say testing β1 = 0), the F test will be the same as the t test, as above. So there is no inconsistency here. (See the example above.)
c. The Fstat will be the square of the tstat, and the p-values for the two statistics will be the same, since P(F(1, n − k − 1) < c²) = P(−c < t(n − k − 1) < c) for c > 0.
12. Babies and bathwater: Be careful about throwing out the baby with the bath water: you don't want to exclude a significant explanatory variable just because it happens to be associated with a set of variables that are jointly insignificant.
13. Example: Here's an example using a sample from a sovereign debt dataset. The F test does not reject at the 10% level the Null Hypothesis that the inflation and deficit parameters are zero, even though deficit_gdp is almost statistically significant at the 5% level. But inflation is so statistically insignificant, it pulls down deficit_gdp in the process.

. reg NSRate corrupt gdp inflation deficit_gdp debt_gdp eurozone if _n < 30
[regression output table: numeric values garbled in extraction]

. test inflation deficit_gdp
 ( 1)  inflation = 0
 ( 2)  deficit_gdp = 0
[F stat and Prob > F: values garbled in extraction]

14. The reported F stat in OLS output: Standard regression packages report the F statistic associated with the test that all of the (non-intercept) parameter values are zero. This is sometimes called the overall significance of the regression.
a. If you cannot reject that hypothesis, you have a really crummy model.
b. Using the R² formula: the restricted model, regressing Y on a constant only, will have R²_R = 0. And since R²_UR = R² (the reported R-squared for the regression), we have
F = [R²/k] / [(1 − R²)/(n − k − 1)] = the reported F statistic.
c. Note that this is confirmed in the previous regression results:
. di (0.7799/6)/((1-0.7799)/…)
[the result matches the reported F statistic]

F stats, adjusted R-squared, RMSE and t stats

15. Recall that adjusted R² is an attempt to adjust the coefficient of determination for the fact that when you add RHS variables to a model, R² cannot decline, since SSR cannot increase. Small decreases in SSRs will not generate a higher adj R²; larger decreases will, and what is small or large will also depend on how many changes were made.
16. Adj R² is defined:
adj R² = 1 − [SSR/(n − k − 1)] / [SST/(n − 1)] = 1 − MSE / [SYY/(n − 1)].
a. Thus for given SYY, adjusted R² increases if and only if MSE decreases. So if you are adding or subtracting RHS variables from a MLR model, adj R² and MSE will move in exactly opposite directions.
b. And since (n − 1)/(n − k − 1) > 1, adj R² ≤ R², with the size of the gap related to k.
17. Consider two models: UR (unrestricted) and R (restricted). You can think of the process of adding RHS variables to a model as being equivalent to moving from a restricted model (in which the coefficients for all of the added variables are zero) to an unrestricted model (in which they are no longer constrained to be zero).
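The overall-significance formula in point 14b needs only the reported R², k, and n. The values below are made up for illustration (they are not the garbled numbers from the Stata output above).

```python
# Sketch: the "overall significance" F stat computed from R^2 alone.
from scipy import stats

def overall_f(r2, n, k):
    """F = (R^2/k) / ((1 - R^2)/(n - k - 1)) for H0: all slopes are zero."""
    return (r2 / k) / ((1 - r2) / (n - k - 1))

F = overall_f(r2=0.5, n=50, k=4)          # hypothetical regression summary
print(F, stats.f.sf(F, 4, 50 - 4 - 1))    # F = 11.25, plus its p-value
```

This replicates what Stata's `di` command computes from the regression header, as in point 14c.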

10 adjr from adding RHS var(s) to the model R Model: fewer RHS vars UR Model: more RHS vars F statistic from dropping RHS var(s) (restricting coeffs to be zero) 8. It turns out that the change in adj R in dropping the restrictions (adding the variables) is closely related to the F-statistic associated with imposing the restrictions. The change in adjusted R in going from one model to the other is defined by: SSRR SSRUR RUR RR = ( MSER MSEUR ) =. SYY SYY ( n k ) + q n k q R ( n k ) + q UR After much algebra, this is: R R = [ F ] UR R a. The sign of this epression will depend on whether the F statistic is greater or less than : adjusted R increases if and only if the F statistic associated with the restrictions eceeds. b. Recall that for a single restriction, the F statistic is the square of the t stat, and so adjusted R increases if and only if the magnitude of the t stat of the added RHS variable eceeds. 9. Adding and dropping RHS variables: This last fact is useful when considering dropping RHS variables from a model. a. One RHS variable: If a RHS variable has a t stat > then dropping that variable from the model will cause adjusted R-squared to decline. And if t stat <, then adjusted R- squared will increase when that variable is dropped from the model. 3 Or put differently: Adding RHS variables with t stat > will lead to higher adjusted R-squared. b. Multiple RHS variables: : If the Fstat associated with dropping multiple RHS variables is > then dropping those variables from the model will cause adjusted R-squared to decline. And if Fstat <, then adjusted R-squared will increase when those variables are. 3 And yes, if t stat = 0, then there will be no change to adjusted R-squared when the variable is dropped from the model.

dropped from the model. Or put differently: adding RHS variables with Fstat > 1 will lead to higher adjusted R-squared.
c. Relation to statistical significance: It could be that additional RHS variables that lead to a higher adjusted R-squared are not statistically significant. Adding a RHS variable with |t stat| > 1 will be associated with higher adjusted R-squared, but statistical significance will generally want |t stat| > 2 or so. So if the added variable is statistically significant, then adjusted R-squared increased when that variable was added to the model; but it could be that the new variable is not statistically significant, even though adjusted R-squared increased when it was added to the model.
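The "adj R² rises iff F > 1" claim in point 18a can be verified numerically. The simulated data below is an illustrative assumption; the check itself holds for any data, since the sign of the adj R² change is the sign of F − 1.

```python
# Sketch: adding a regressor raises adjusted R^2 exactly when its F stat > 1.
import numpy as np

def ssr(X, y):
    b, *_ = np.linalg.lstsq(X, y, rcond=None)
    return np.sum((y - X @ b) ** 2)

def adj_r2(ssr_val, syy, n, k):
    return 1 - (ssr_val / (n - k - 1)) / (syy / (n - 1))

rng = np.random.default_rng(2)
n = 100
x1, x2 = rng.normal(size=n), rng.normal(size=n)
y = 1 + 2 * x1 + 0.05 * x2 + rng.normal(size=n)   # x2 barely matters
syy = np.sum((y - y.mean()) ** 2)

X_r = np.column_stack([np.ones(n), x1])           # restricted: x2 dropped
X_ur = np.column_stack([np.ones(n), x1, x2])      # unrestricted
ssr_r, ssr_ur = ssr(X_r, y), ssr(X_ur, y)

F = (ssr_r - ssr_ur) / (ssr_ur / (n - 3))         # q = 1 restriction
a_r = adj_r2(ssr_r, syy, n, 1)
a_ur = adj_r2(ssr_ur, syy, n, 2)
print(F, a_ur > a_r)   # adj R^2 increased exactly when F > 1
```

Re-running with a larger coefficient on x2 pushes F well above 1 and the comparison flips accordingly.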

An Example using the bodyfat dataset

. bcuse bodyfat

                 (1)         (2)         (3)         (4)
                 Brozek      Brozek      Brozek      Brozek
wgt             -0.5***     -0.9***     -0.54***    -0.3***
                (-5.)       (-3.68)     (-4.9)      (-3.56)
abd              0.937***    0.9***      0.940***    0.94***
                (7.9)       (5.37)      (6.7)       (5.0)
hip              …           …           …           …
                (-.)        (-.4)       (-.)        (-.4)
thigh            0.77*       0.55*       0.77*       0.55*
                (.43)       (.)         (.4)        (.0)
hgt                          …                       …
                            (-.3)                   (-.)
ankle                                    …           …
                                        (0.4)       (0.3)
_cons           -4.80***    -3.33**     -4.73***    -33.4**
                (-6.45)     (-3.05)     (-5.65)     (-.93)
N, R-sq, adj. R-sq, rmse:  …

t statistics in parentheses
* p<0.05, ** p<0.01, *** p<0.001

Note:
Add hgt -- (1) to (2): hgt |tstat| > 1; R² and adj R² increase, RMSE decreases
Add ankle -- (1) to (3): ankle |tstat| < 1; R² increases, adj R² decreases, and RMSE increases
Add hgt -- (3) to (4): hgt |tstat| > 1; R² and adj R² increase, RMSE decreases
Add ankle -- (2) to (4): ankle |tstat| < 1; R² increases, adj R² decreases, and RMSE increases

Take-aways: adj R² and RMSE (MSE) move in opposite directions, and which way they move depends on whether the t stat of the added variable is smaller or larger than 1 in magnitude.

Playing with Bodyfat: F tests in action

About that BMI metric: BMI = wgt_kg / hgt_m².

If the BMI is a useful metric, then it should do a good job explaining body fat levels:
Brozek = α · BMI^β = α · (wgt_kg / hgt_m²)^β, or
ln(Brozek) = ln(α) + β ln(BMI) = ln(α) + β ln(wgt_kg) − 2β ln(hgt_m).
So if we regress lnbrozek on lnhgt and lnwgt, we should find the lnhgt coefficient to be twice the magnitude of the lnwgt coefficient and opposite in sign. So let's try that.

. bcuse bodyfat
. gen lnbrozek=ln(brozek)
(1 missing value generated)
. gen lnhgt=ln(hgt_m)
. gen lnwgt=ln(wgt_kg)
. reg lnbrozek lnhgt lnwgt
[regression output table: numeric values garbled in extraction]

Oops: the lnhgt coefficient is actually smaller in magnitude than the lnwgt coefficient. So much for the BMI... or maybe not!

. bcuse bodyfat

Looking for data issues I: Compare the reported Brozek measure with a calculated value.
. list Case Brozek … if abs(Brozek - …) > …
[listing of cases whose reported and calculated Brozek values disagree; values garbled in extraction]

Looking for data issues II: Compare reported heights and weights.
. scatter hgt wgt
. list if hgt < …
[full-record listing (Case, Brozek, Siri, density, age, wgt, hgt, BMI, fatfre~t, neck, chest, abd, hip, thigh, knee, ankle, biceps, forearm, wrist, wgt_kg, hgt_m) for the flagged observation]

Yes, that's right: 29.5 inches tall and weighs 205 pounds.

What happens when we drop those curious observations?
. drop if Case == 33 | Case == 48 | Case == 76 | Case == 96 | Case == … | Case == …
(6 observations deleted)
. reg lnbrozek lnhgt lnwgt
[regression output table: numeric values garbled in extraction]

Bingo! A −2 ratio! Now we can use the F test to test for the BMI restriction.

Assume that the restriction is true (the Null Hypothesis is that the ratio of the two coefficients is −2):

. test lnhgt = -2*lnwgt
 ( 1)  lnhgt + 2*lnwgt = 0
       F(1, …) = 0.00
       Prob > F = …

We fail to reject the restriction, since the Prob > F is so large; we typically reject when that probability is less than .10, .05, .01, etc. This probability is similar to the p-value: it's the probability that we would falsely reject the Null Hypothesis.

So let's impose the restriction and see what we get:
. gen lnbmi=ln(BMI)
. reg lnbrozek lnbmi
[regression output table: numeric values garbled in extraction]

The reported R-squared didn't change, but the SSRs increased (barely). (Recall that SSRs must increase, or more precisely cannot decrease, if you impose a restriction.)

The F-test basically looks at how much the SSRs increase when you impose a restriction (or restrictions): If the SSRs increase by a small amount, then little explanatory power is lost when the restriction is imposed, and the F test fails to reject the Null Hypothesis that the restriction is OK. If the SSRs increase by a lot, then the restriction is leading to a much poorer model (as measured by R-squared) and the restriction is rejected. More F tests later. Stay tuned!

and the Chow Test

20. The Chow Test is a test of whether you can combine two datasets (for example, estimating one model for both males and females rather than estimating separate models for the two populations).
a. The easy way to do the Chow test is to define a dummy variable (say, for males) and to then interact that dummy with all of the other RHS variables in the model.
b. Then do an F test to see if you can reject the Null Hypothesis that all of the true parameter values for these newly interacted variables are zero.
c. If you cannot reject that joint hypothesis, then it's OK to combine the datasets (which in the example would essentially amount to requiring identical coefficients for males and females).
d. If you reject the Null Hypothesis, then you should run separate regressions for the two populations.
21. We combine datasets all the time, often without thinking about it. For instance, in the Pythagorean Theorem regressions, without giving it any thought, we included all MLB games played from 1960 on, for both National and American League teams. By estimating a single model on the full dataset, we essentially imposed a restriction that the coefficients for the NL teams are the same as the coefficients for the AL teams. With more data you generally have more precise coefficient estimates, but you don't want to combine NL and AL data if the two leagues follow fundamentally different models.
22. Here are the combined and separate results. The eyeball test says that combining the datasets is fine. We'll do the formal Chow Test when we cover dummy variables (spoiler alert: the Chow test will not reject the combining of the AL and NL datasets).

                 (MLB)       (AL)        (NL)
                 G_W         G_W         G_W
RA_RS            0.935***    0.945***    0.94***
                (95.54)     (76.5)      (60.74)
_cons            .069***     .058***     .080***
                (100.9)     (78.46)     (64.98)
N                …           …           …

t statistics in parentheses
* p<0.05, ** p<0.01, *** p<0.001
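The "easy way" in point 20 can be sketched with simulated data (the group labels, sample size, and coefficients below are illustrative assumptions): interact a group dummy with every regressor, then F-test that the dummy and all interactions are jointly zero.

```python
# Sketch of the dummy-interaction Chow test: pooled (restricted) vs.
# dummy-plus-interactions (unrestricted) model, compared via the F stat.
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
n = 120
d = (np.arange(n) < 60).astype(float)             # group dummy (e.g. AL vs NL)
x = rng.normal(size=n)
y = 1 + 0.9 * x + rng.normal(scale=0.1, size=n)   # same model in both groups

def ssr(X, y):
    b, *_ = np.linalg.lstsq(X, y, rcond=None)
    return np.sum((y - X @ b) ** 2)

X_r = np.column_stack([np.ones(n), x])            # pooled (restricted) model
X_ur = np.column_stack([np.ones(n), x, d, d * x]) # add dummy + interaction
q, df_ur = 2, n - 4                               # 2 restrictions; df = n-k-1
F = ((ssr(X_r, y) - ssr(X_ur, y)) / q) / (ssr(X_ur, y) / df_ur)
p = stats.f.sf(F, q, df_ur)
print(F, p)   # p-value for H0: identical coefficients in both groups
```

A large p-value means pooling is fine; a small one says the two groups need separate regressions.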


More information

1. The shoe size of five randomly selected men in the class is 7, 7.5, 6, 6.5 the shoe size of 4 randomly selected women is 6, 5.

1. The shoe size of five randomly selected men in the class is 7, 7.5, 6, 6.5 the shoe size of 4 randomly selected women is 6, 5. Economics 3 Introduction to Econometrics Winter 2004 Professor Dobkin Name Final Exam (Sample) You must answer all the questions. The exam is closed book and closed notes you may use calculators. You must

More information

Linear Regression with Multiple Regressors

Linear Regression with Multiple Regressors Linear Regression with Multiple Regressors (SW Chapter 6) Outline 1. Omitted variable bias 2. Causality and regression analysis 3. Multiple regression and OLS 4. Measures of fit 5. Sampling distribution

More information

1 The basics of panel data

1 The basics of panel data Introductory Applied Econometrics EEP/IAS 118 Spring 2015 Related materials: Steven Buck Notes to accompany fixed effects material 4-16-14 ˆ Wooldridge 5e, Ch. 1.3: The Structure of Economic Data ˆ Wooldridge

More information

Inference in Regression Model

Inference in Regression Model Inference in Regression Model Christopher Taber Department of Economics University of Wisconsin-Madison March 25, 2009 Outline 1 Final Step of Classical Linear Regression Model 2 Confidence Intervals 3

More information

Introduction to Econometrics. Multiple Regression (2016/2017)

Introduction to Econometrics. Multiple Regression (2016/2017) Introduction to Econometrics STAT-S-301 Multiple Regression (016/017) Lecturer: Yves Dominicy Teaching Assistant: Elise Petit 1 OLS estimate of the TS/STR relation: OLS estimate of the Test Score/STR relation:

More information

Applied Statistics and Econometrics

Applied Statistics and Econometrics Applied Statistics and Econometrics Lecture 6 Saul Lach September 2017 Saul Lach () Applied Statistics and Econometrics September 2017 1 / 53 Outline of Lecture 6 1 Omitted variable bias (SW 6.1) 2 Multiple

More information

Multicollinearity Richard Williams, University of Notre Dame, https://www3.nd.edu/~rwilliam/ Last revised January 13, 2015

Multicollinearity Richard Williams, University of Notre Dame, https://www3.nd.edu/~rwilliam/ Last revised January 13, 2015 Multicollinearity Richard Williams, University of Notre Dame, https://www3.nd.edu/~rwilliam/ Last revised January 13, 2015 Stata Example (See appendices for full example).. use http://www.nd.edu/~rwilliam/stats2/statafiles/multicoll.dta,

More information

Lab 6 - Simple Regression

Lab 6 - Simple Regression Lab 6 - Simple Regression Spring 2017 Contents 1 Thinking About Regression 2 2 Regression Output 3 3 Fitted Values 5 4 Residuals 6 5 Functional Forms 8 Updated from Stata tutorials provided by Prof. Cichello

More information

Homework 1 Solutions

Homework 1 Solutions Homework 1 Solutions January 18, 2012 Contents 1 Normal Probability Calculations 2 2 Stereo System (SLR) 2 3 Match Histograms 3 4 Match Scatter Plots 4 5 Housing (SLR) 4 6 Shock Absorber (SLR) 5 7 Participation

More information

Problem Set 10: Panel Data

Problem Set 10: Panel Data Problem Set 10: Panel Data 1. Read in the data set, e11panel1.dta from the course website. This contains data on a sample or 1252 men and women who were asked about their hourly wage in two years, 2005

More information

STAT 350: Summer Semester Midterm 1: Solutions

STAT 350: Summer Semester Midterm 1: Solutions Name: Student Number: STAT 350: Summer Semester 2008 Midterm 1: Solutions 9 June 2008 Instructor: Richard Lockhart Instructions: This is an open book test. You may use notes, text, other books and a calculator.

More information

Autocorrelation. Think of autocorrelation as signifying a systematic relationship between the residuals measured at different points in time

Autocorrelation. Think of autocorrelation as signifying a systematic relationship between the residuals measured at different points in time Autocorrelation Given the model Y t = b 0 + b 1 X t + u t Think of autocorrelation as signifying a systematic relationship between the residuals measured at different points in time This could be caused

More information

Problem 4.1. Problem 4.3

Problem 4.1. Problem 4.3 BOSTON COLLEGE Department of Economics EC 228 01 Econometric Methods Fall 2008, Prof. Baum, Ms. Phillips (tutor), Mr. Dmitriev (grader) Problem Set 3 Due at classtime, Thursday 14 Oct 2008 Problem 4.1

More information

Lecture 3: Inference in SLR

Lecture 3: Inference in SLR Lecture 3: Inference in SLR STAT 51 Spring 011 Background Reading KNNL:.1.6 3-1 Topic Overview This topic will cover: Review of hypothesis testing Inference about 1 Inference about 0 Confidence Intervals

More information

Question 1 [17 points]: (ch 11)

Question 1 [17 points]: (ch 11) Question 1 [17 points]: (ch 11) A study analyzed the probability that Major League Baseball (MLB) players "survive" for another season, or, in other words, play one more season. They studied a model of

More information

Multiple Regression Analysis: Inference MULTIPLE REGRESSION ANALYSIS: INFERENCE. Sampling Distributions of OLS Estimators

Multiple Regression Analysis: Inference MULTIPLE REGRESSION ANALYSIS: INFERENCE. Sampling Distributions of OLS Estimators 1 2 Multiple Regression Analysis: Inference MULTIPLE REGRESSION ANALYSIS: INFERENCE Hüseyin Taştan 1 1 Yıldız Technical University Department of Economics These presentation notes are based on Introductory

More information

Least Squares Estimation-Finite-Sample Properties

Least Squares Estimation-Finite-Sample Properties Least Squares Estimation-Finite-Sample Properties Ping Yu School of Economics and Finance The University of Hong Kong Ping Yu (HKU) Finite-Sample 1 / 29 Terminology and Assumptions 1 Terminology and Assumptions

More information

coefficients n 2 are the residuals obtained when we estimate the regression on y equals the (simple regression) estimated effect of the part of x 1

coefficients n 2 are the residuals obtained when we estimate the regression on y equals the (simple regression) estimated effect of the part of x 1 Review - Interpreting the Regression If we estimate: It can be shown that: where ˆ1 r i coefficients β ˆ+ βˆ x+ βˆ ˆ= 0 1 1 2x2 y ˆβ n n 2 1 = rˆ i1yi rˆ i1 i= 1 i= 1 xˆ are the residuals obtained when

More information

Handout 12. Endogeneity & Simultaneous Equation Models

Handout 12. Endogeneity & Simultaneous Equation Models Handout 12. Endogeneity & Simultaneous Equation Models In which you learn about another potential source of endogeneity caused by the simultaneous determination of economic variables, and learn how to

More information

Econometrics Summary Algebraic and Statistical Preliminaries

Econometrics Summary Algebraic and Statistical Preliminaries Econometrics Summary Algebraic and Statistical Preliminaries Elasticity: The point elasticity of Y with respect to L is given by α = ( Y/ L)/(Y/L). The arc elasticity is given by ( Y/ L)/(Y/L), when L

More information

Lectures 5 & 6: Hypothesis Testing

Lectures 5 & 6: Hypothesis Testing Lectures 5 & 6: Hypothesis Testing in which you learn to apply the concept of statistical significance to OLS estimates, learn the concept of t values, how to use them in regression work and come across

More information

STATISTICS 110/201 PRACTICE FINAL EXAM

STATISTICS 110/201 PRACTICE FINAL EXAM STATISTICS 110/201 PRACTICE FINAL EXAM Questions 1 to 5: There is a downloadable Stata package that produces sequential sums of squares for regression. In other words, the SS is built up as each variable

More information

Multiple Regression Analysis

Multiple Regression Analysis Multiple Regression Analysis y = 0 + 1 x 1 + x +... k x k + u 6. Heteroskedasticity What is Heteroskedasticity?! Recall the assumption of homoskedasticity implied that conditional on the explanatory variables,

More information

Chapter 6: Linear Regression With Multiple Regressors

Chapter 6: Linear Regression With Multiple Regressors Chapter 6: Linear Regression With Multiple Regressors 1-1 Outline 1. Omitted variable bias 2. Causality and regression analysis 3. Multiple regression and OLS 4. Measures of fit 5. Sampling distribution

More information

Chapter 27 Summary Inferences for Regression

Chapter 27 Summary Inferences for Regression Chapter 7 Summary Inferences for Regression What have we learned? We have now applied inference to regression models. Like in all inference situations, there are conditions that we must check. We can test

More information

ECON3150/4150 Spring 2015

ECON3150/4150 Spring 2015 ECON3150/4150 Spring 2015 Lecture 3&4 - The linear regression model Siv-Elisabeth Skjelbred University of Oslo January 29, 2015 1 / 67 Chapter 4 in S&W Section 17.1 in S&W (extended OLS assumptions) 2

More information

Applied Statistics and Econometrics

Applied Statistics and Econometrics Applied Statistics and Econometrics Lecture 13 Nonlinearities Saul Lach October 2018 Saul Lach () Applied Statistics and Econometrics October 2018 1 / 91 Outline of Lecture 13 1 Nonlinear regression functions

More information

Week 3: Simple Linear Regression

Week 3: Simple Linear Regression Week 3: Simple Linear Regression Marcelo Coca Perraillon University of Colorado Anschutz Medical Campus Health Services Research Methods I HSMP 7607 2017 c 2017 PERRAILLON ALL RIGHTS RESERVED 1 Outline

More information

Inference. ME104: Linear Regression Analysis Kenneth Benoit. August 15, August 15, 2012 Lecture 3 Multiple linear regression 1 1 / 58

Inference. ME104: Linear Regression Analysis Kenneth Benoit. August 15, August 15, 2012 Lecture 3 Multiple linear regression 1 1 / 58 Inference ME104: Linear Regression Analysis Kenneth Benoit August 15, 2012 August 15, 2012 Lecture 3 Multiple linear regression 1 1 / 58 Stata output resvisited. reg votes1st spend_total incumb minister

More information

Computer Exercise 3 Answers Hypothesis Testing

Computer Exercise 3 Answers Hypothesis Testing Computer Exercise 3 Answers Hypothesis Testing. reg lnhpay xper yearsed tenure ---------+------------------------------ F( 3, 6221) = 512.58 Model 457.732594 3 152.577531 Residual 1851.79026 6221.297667619

More information

Hypothesis Tests and Confidence Intervals in Multiple Regression

Hypothesis Tests and Confidence Intervals in Multiple Regression Hypothesis Tests and Confidence Intervals in Multiple Regression (SW Chapter 7) Outline 1. Hypothesis tests and confidence intervals for one coefficient. Joint hypothesis tests on multiple coefficients

More information

Introduction to Econometrics. Multiple Regression

Introduction to Econometrics. Multiple Regression Introduction to Econometrics The statistical analysis of economic (and related) data STATS301 Multiple Regression Titulaire: Christopher Bruffaerts Assistant: Lorenzo Ricci 1 OLS estimate of the TS/STR

More information

Estimating σ 2. We can do simple prediction of Y and estimation of the mean of Y at any value of X.

Estimating σ 2. We can do simple prediction of Y and estimation of the mean of Y at any value of X. Estimating σ 2 We can do simple prediction of Y and estimation of the mean of Y at any value of X. To perform inferences about our regression line, we must estimate σ 2, the variance of the error term.

More information

Lecture 11: Simple Linear Regression

Lecture 11: Simple Linear Regression Lecture 11: Simple Linear Regression Readings: Sections 3.1-3.3, 11.1-11.3 Apr 17, 2009 In linear regression, we examine the association between two quantitative variables. Number of beers that you drink

More information

University of California at Berkeley Fall Introductory Applied Econometrics Final examination. Scores add up to 125 points

University of California at Berkeley Fall Introductory Applied Econometrics Final examination. Scores add up to 125 points EEP 118 / IAS 118 Elisabeth Sadoulet and Kelly Jones University of California at Berkeley Fall 2008 Introductory Applied Econometrics Final examination Scores add up to 125 points Your name: SID: 1 1.

More information

statistical sense, from the distributions of the xs. The model may now be generalized to the case of k regressors:

statistical sense, from the distributions of the xs. The model may now be generalized to the case of k regressors: Wooldridge, Introductory Econometrics, d ed. Chapter 3: Multiple regression analysis: Estimation In multiple regression analysis, we extend the simple (two-variable) regression model to consider the possibility

More information

ECON3150/4150 Spring 2016

ECON3150/4150 Spring 2016 ECON3150/4150 Spring 2016 Lecture 4 - The linear regression model Siv-Elisabeth Skjelbred University of Oslo Last updated: January 26, 2016 1 / 49 Overview These lecture slides covers: The linear regression

More information

Multiple Linear Regression CIVL 7012/8012

Multiple Linear Regression CIVL 7012/8012 Multiple Linear Regression CIVL 7012/8012 2 Multiple Regression Analysis (MLR) Allows us to explicitly control for many factors those simultaneously affect the dependent variable This is important for

More information

Introductory Econometrics. Lecture 13: Hypothesis testing in the multiple regression model, Part 1

Introductory Econometrics. Lecture 13: Hypothesis testing in the multiple regression model, Part 1 Introductory Econometrics Lecture 13: Hypothesis testing in the multiple regression model, Part 1 Jun Ma School of Economics Renmin University of China October 19, 2016 The model I We consider the classical

More information

Section I. Define or explain the following terms (3 points each) 1. centered vs. uncentered 2 R - 2. Frisch theorem -

Section I. Define or explain the following terms (3 points each) 1. centered vs. uncentered 2 R - 2. Frisch theorem - First Exam: Economics 388, Econometrics Spring 006 in R. Butler s class YOUR NAME: Section I (30 points) Questions 1-10 (3 points each) Section II (40 points) Questions 11-15 (10 points each) Section III

More information

Interpreting coefficients for transformed variables

Interpreting coefficients for transformed variables Interpreting coefficients for transformed variables! Recall that when both independent and dependent variables are untransformed, an estimated coefficient represents the change in the dependent variable

More information

Applied Statistics and Econometrics

Applied Statistics and Econometrics Applied Statistics and Econometrics Lecture 7 Saul Lach September 2017 Saul Lach () Applied Statistics and Econometrics September 2017 1 / 68 Outline of Lecture 7 1 Empirical example: Italian labor force

More information

Econometrics Homework 1

Econometrics Homework 1 Econometrics Homework Due Date: March, 24. by This problem set includes questions for Lecture -4 covered before midterm exam. Question Let z be a random column vector of size 3 : z = @ (a) Write out z

More information

Sociology 593 Exam 1 Answer Key February 17, 1995

Sociology 593 Exam 1 Answer Key February 17, 1995 Sociology 593 Exam 1 Answer Key February 17, 1995 I. True-False. (5 points) Indicate whether the following statements are true or false. If false, briefly explain why. 1. A researcher regressed Y on. When

More information

Course Econometrics I

Course Econometrics I Course Econometrics I 3. Multiple Regression Analysis: Binary Variables Martin Halla Johannes Kepler University of Linz Department of Economics Last update: April 29, 2014 Martin Halla CS Econometrics

More information

Chapter 3 Multiple Regression Complete Example

Chapter 3 Multiple Regression Complete Example Department of Quantitative Methods & Information Systems ECON 504 Chapter 3 Multiple Regression Complete Example Spring 2013 Dr. Mohammad Zainal Review Goals After completing this lecture, you should be

More information

ECO375 Tutorial 4 Introduction to Statistical Inference

ECO375 Tutorial 4 Introduction to Statistical Inference ECO375 Tutorial 4 Introduction to Statistical Inference Matt Tudball University of Toronto Mississauga October 19, 2017 Matt Tudball (University of Toronto) ECO375H5 October 19, 2017 1 / 26 Statistical

More information

Specification Error: Omitted and Extraneous Variables

Specification Error: Omitted and Extraneous Variables Specification Error: Omitted and Extraneous Variables Richard Williams, University of Notre Dame, https://www3.nd.edu/~rwilliam/ Last revised February 5, 05 Omitted variable bias. Suppose that the correct

More information

Recent Advances in the Field of Trade Theory and Policy Analysis Using Micro-Level Data

Recent Advances in the Field of Trade Theory and Policy Analysis Using Micro-Level Data Recent Advances in the Field of Trade Theory and Policy Analysis Using Micro-Level Data July 2012 Bangkok, Thailand Cosimo Beverelli (World Trade Organization) 1 Content a) Classical regression model b)

More information

ECON Introductory Econometrics. Lecture 6: OLS with Multiple Regressors

ECON Introductory Econometrics. Lecture 6: OLS with Multiple Regressors ECON4150 - Introductory Econometrics Lecture 6: OLS with Multiple Regressors Monique de Haan (moniqued@econ.uio.no) Stock and Watson Chapter 6 Lecture outline 2 Violation of first Least Squares assumption

More information

Hypothesis testing I. - In particular, we are talking about statistical hypotheses. [get everyone s finger length!] n =

Hypothesis testing I. - In particular, we are talking about statistical hypotheses. [get everyone s finger length!] n = Hypothesis testing I I. What is hypothesis testing? [Note we re temporarily bouncing around in the book a lot! Things will settle down again in a week or so] - Exactly what it says. We develop a hypothesis,

More information

Ch 2: Simple Linear Regression

Ch 2: Simple Linear Regression Ch 2: Simple Linear Regression 1. Simple Linear Regression Model A simple regression model with a single regressor x is y = β 0 + β 1 x + ɛ, where we assume that the error ɛ is independent random component

More information

Using the same data as before, here is part of the output we get in Stata when we do a logistic regression of Grade on Gpa, Tuce and Psi.

Using the same data as before, here is part of the output we get in Stata when we do a logistic regression of Grade on Gpa, Tuce and Psi. Logistic Regression, Part III: Hypothesis Testing, Comparisons to OLS Richard Williams, University of Notre Dame, https://www3.nd.edu/~rwilliam/ Last revised January 14, 2018 This handout steals heavily

More information

Lecture 8: Heteroskedasticity. Causes Consequences Detection Fixes

Lecture 8: Heteroskedasticity. Causes Consequences Detection Fixes Lecture 8: Heteroskedasticity Causes Consequences Detection Fixes Assumption MLR5: Homoskedasticity 2 var( u x, x,..., x ) 1 2 In the multivariate case, this means that the variance of the error term does

More information

Motivation for multiple regression

Motivation for multiple regression Motivation for multiple regression 1. Simple regression puts all factors other than X in u, and treats them as unobserved. Effectively the simple regression does not account for other factors. 2. The slope

More information

Basic econometrics. Tutorial 3. Dipl.Kfm. Johannes Metzler

Basic econometrics. Tutorial 3. Dipl.Kfm. Johannes Metzler Basic econometrics Tutorial 3 Dipl.Kfm. Introduction Some of you were asking about material to revise/prepare econometrics fundamentals. First of all, be aware that I will not be too technical, only as

More information

Group Comparisons: Differences in Composition Versus Differences in Models and Effects

Group Comparisons: Differences in Composition Versus Differences in Models and Effects Group Comparisons: Differences in Composition Versus Differences in Models and Effects Richard Williams, University of Notre Dame, https://www3.nd.edu/~rwilliam/ Last revised February 15, 2015 Overview.

More information

Econometrics Review questions for exam

Econometrics Review questions for exam Econometrics Review questions for exam Nathaniel Higgins nhiggins@jhu.edu, 1. Suppose you have a model: y = β 0 x 1 + u You propose the model above and then estimate the model using OLS to obtain: ŷ =

More information

Econometrics I KS. Module 1: Bivariate Linear Regression. Alexander Ahammer. This version: March 12, 2018

Econometrics I KS. Module 1: Bivariate Linear Regression. Alexander Ahammer. This version: March 12, 2018 Econometrics I KS Module 1: Bivariate Linear Regression Alexander Ahammer Department of Economics Johannes Kepler University of Linz This version: March 12, 2018 Alexander Ahammer (JKU) Module 1: Bivariate

More information

sociology sociology Scatterplots Quantitative Research Methods: Introduction to correlation and regression Age vs Income

sociology sociology Scatterplots Quantitative Research Methods: Introduction to correlation and regression Age vs Income Scatterplots Quantitative Research Methods: Introduction to correlation and regression Scatterplots can be considered as interval/ratio analogue of cross-tabs: arbitrarily many values mapped out in -dimensions

More information