Lecture 4: Hypothesis Testing

We may wish to test prior hypotheses about the coefficients we estimate. We can use the estimates to test whether the data rejects our hypothesis. An example might be that we wish to test whether an elasticity is equal to one. We may wish to test the hypothesis that X has no impact on the dependent variable Y. We may wish to construct a confidence interval for our coefficients.
A hypothesis takes the form of a statement of the true value for a coefficient, or for an expression involving the coefficient. The hypothesis to be tested is called the null hypothesis. The hypothesis against which it is tested is called the alternative hypothesis. Rejecting the null hypothesis does not imply accepting the alternative. We will now consider testing the simple hypothesis that the slope coefficient is equal to some fixed value.
Setting up the hypothesis

Consider the simple regression model:

$Y_i = a + b X_i + u_i$

We wish to test the hypothesis that $b = d$, where d is some known value (for example zero), against the hypothesis that b is not equal to d. We write this as follows:

$H_0: b = d$
$H_1: b \neq d$
To test the hypothesis we need to know the way that our estimator is distributed. We start with the simple case where we assume that the error term in the regression model is a normal random variable with mean zero and variance $\sigma^2$. This is written as

$u_i \sim N(0, \sigma^2)$

Now recall that the OLS estimator can be written as

$\hat{b} = b + \sum_i w_i u_i$

Thus the OLS estimator is equal to a constant (b) plus a weighted sum of normal random variables. Weighted sums of normal random variables are also normal.
The distribution of the OLS slope coefficient

It follows from the above that the OLS coefficient is a Normal random variable. What is the mean and what is the variance of this random variable? Since OLS is unbiased, the mean is b. We have derived the variance and shown it to be

$Var(\hat{b}) = \dfrac{\sigma^2}{\sum_i (X_i - \bar{X})^2}$

Since the OLS estimator is Normally distributed, this means that

$z = \dfrac{\hat{b} - b}{\sqrt{Var(\hat{b})}} \sim N(0, 1)$
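The variance formula above can be checked by simulation. A minimal sketch (illustrative Python, not part of the lecture; the x values and the parameters a, b, sigma are invented):

```python
import random
import statistics

# Monte Carlo check that Var(b_hat) = sigma^2 / sum((X_i - Xbar)^2)
# when the errors are N(0, sigma^2). x is held fixed across replications.

random.seed(1)
x = [1.0, 2.0, 3.0, 4.0, 5.0]
a, b, sigma = 1.0, 2.0, 0.5
xbar = sum(x) / len(x)
sxx = sum((xi - xbar) ** 2 for xi in x)   # = 10, so Var(b_hat) = 0.25/10

slopes = []
for _ in range(20000):
    y = [a + b * xi + random.gauss(0.0, sigma) for xi in x]
    ybar = sum(y) / len(y)
    b_hat = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / sxx
    slopes.append(b_hat)

print(statistics.mean(slopes))       # close to b = 2 (unbiasedness)
print(statistics.variance(slopes))   # close to sigma^2 / sxx = 0.025
```

The simulated mean and variance of the slope estimates line up with the unbiasedness result and the variance formula.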
The difficulty with using this result is that we do not know the variance of the OLS estimator, because we do not know $\sigma^2$. This needs to be estimated. An unbiased estimator of the variance of the residuals is the residual sum of squares divided by the number of observations minus the number of estimated parameters. This quantity (n − 2 in our case) is called the degrees of freedom. Thus

$\hat{\sigma}^2 = \dfrac{\sum_i \hat{u}_i^2}{n - 2}$
Return now to hypothesis testing. Under the null hypothesis $b = d$. Hence it must be the case that

$z = \dfrac{\hat{b} - d}{\sqrt{Var(\hat{b})}} \sim N(0, 1)$

We now replace the variance by its estimated value to obtain a test statistic:

$z^* = \dfrac{\hat{b} - d}{\sqrt{\hat{\sigma}^2 / \sum_i (X_i - \bar{X})^2}}$

This test statistic is no longer Normally distributed, but follows the t-distribution with n − 2 degrees of freedom.
Testing the hypothesis

Thus we have that under the null hypothesis

$z^* = \dfrac{\hat{b} - d}{\sqrt{\hat{\sigma}^2 / \sum_i (X_i - \bar{X})^2}} \sim t_{n-2}$

The next step is to choose the size of the test (significance level). This is the probability that we reject a correct hypothesis. The conventional size is 5%. We say that $\alpha = 0.05$. We now find the critical values $t_{\alpha/2, n-2}$ and $-t_{\alpha/2, n-2}$.
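The whole procedure, from OLS estimates to the accept/reject decision, can be sketched in a few lines (illustrative Python; the data are invented, and 3.182 is the tabulated 2.5% critical value of the t-distribution with 3 degrees of freedom):

```python
import math

# Sketch of the test of H0: b = d in the model y = a + b*x + u.
# The critical value must be looked up in t tables with n - 2
# degrees of freedom; here the caller supplies it.

def slope_test(x, y, d, crit):
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    b_hat = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / sxx
    a_hat = ybar - b_hat * xbar
    rss = sum((yi - a_hat - b_hat * xi) ** 2 for xi, yi in zip(x, y))
    se_b = math.sqrt(rss / (n - 2) / sxx)     # sqrt(sigma2_hat / sxx)
    z_star = (b_hat - d) / se_b
    return z_star, abs(z_star) <= crit        # True => accept H0

x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
z, accept = slope_test(x, y, d=2.0, crit=3.182)
print(round(z, 3), accept)
```

Here the estimated slope is very close to the hypothesised value d = 2, so the test statistic is small and the null hypothesis is accepted.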
We accept the null hypothesis if the test statistic is between the critical values corresponding to our chosen size. Otherwise we reject. The logic of hypothesis testing is that if the null hypothesis is true, then the estimate will lie within the critical values 100(1 − α)% of the time. The ability of a test to reject a false hypothesis is called the power of the test.
Confidence interval

We have argued that

$z^* = \dfrac{\hat{b} - b}{\sqrt{\hat{\sigma}^2 / \sum_i (X_i - \bar{X})^2}} \sim t_{n-2}$

This implies that we can construct an interval such that the chance that the true b lies within that interval is some fixed value chosen by us. Call this value 1 − α. For a 95% confidence interval, say, this would be 0.95.
From statistical tables we can find critical values such that any random variable which follows a t-distribution falls between these two values with probability 1 − α. Denote these critical values by $t_{\alpha/2, n-2}$ and $-t_{\alpha/2, n-2}$. For a t random variable with 10 degrees of freedom and 95% confidence these values are (2.228, −2.228). Thus

$\Pr(-t_{\alpha/2, n-2} < z^* < t_{\alpha/2, n-2}) = 1 - \alpha$

With some manipulation we then get that

$\Pr\big(\hat{b} - se(\hat{b})\, t_{\alpha/2, n-2} < b < \hat{b} + se(\hat{b})\, t_{\alpha/2, n-2}\big) = 1 - \alpha$

The term in the brackets is the confidence interval.
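The construction of the interval is mechanical once the estimate, its standard error, and the tabulated critical value are in hand. A minimal sketch (illustrative Python; the estimate 1.5 and standard error 0.2 are invented, and 2.228 is the critical value for 10 degrees of freedom quoted above):

```python
# A confidence interval from an estimate, its standard error, and a
# critical value taken from t tables with n - 2 degrees of freedom.

def conf_interval(b_hat, se, crit):
    return (b_hat - crit * se, b_hat + crit * se)

lo, hi = conf_interval(b_hat=1.5, se=0.2, crit=2.228)
print(lo, hi)
```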
Example

Consider the regression of log quantity of butter on the log price again:

. regr lbp lpbr                Number of obs = 51

------------------------------------------------------------------------------
         lbp |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
   log price |   -.841586   .1195669    -7.04   0.000    -1.08437    -.608798
       _cons |    4.52206   .1600375    28.26   0.000     4.200453   4.843668
------------------------------------------------------------------------------

The statistic for the hypothesis that the (absolute) elasticity is equal to one is

$z^* = \dfrac{0.84 - 1}{0.12} = \dfrac{-0.16}{0.12} = -1.33$
Critical values for the t-distribution with 51 − 2 = 49 degrees of freedom (51 observations, 2 coefficients estimated) and a significance level of 0.05 are approximately (2, −2) (from statistical tables). Since −1.33 lies within this range we accept the null hypothesis. The 95% confidence interval is 0.84 ± 2 × 0.12 = (0.60, 1.08). Thus the true elasticity lies within this range with 95% probability. Everything we have done is of course applicable to the constant as well. The variance formula is different however.
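The arithmetic on this slide can be replicated directly (a sketch using the rounded estimate 0.84, standard error 0.12, and approximate critical value 2 from the slide):

```python
# Test of H0: elasticity = 1 for the butter regression, using the
# rounded numbers from the slide.
b_hat, se, d, crit = 0.84, 0.12, 1.0, 2.0

z_star = (b_hat - d) / se                  # (0.84 - 1) / 0.12 = -1.33
accept = abs(z_star) <= crit               # -1.33 lies within (-2, 2)
ci = (b_hat - crit * se, b_hat + crit * se)  # 0.84 +/- 0.24

print(round(z_star, 2), accept, ci)
```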
Do we need the assumption of normality of the error term to carry out inference (hypothesis testing)? Under normality our test is exact. This means that the test statistic has exactly a t-distribution. We can carry out tests based on asymptotic approximations when we have large enough samples. To do this we will use Central Limit Theorem results that state that in large samples weighted averages are distributed as normal variables.
A Central Limit Theorem

Suppose we have a set of independent random numbers $v_1, v_2, \ldots, v_n$, all with constant variance $s^2$ and mean $\mu$. Then

$\dfrac{1}{\sqrt{n}} \sum_{i=1}^{n} (v_i - \mu) \overset{a}{\sim} N(0, s^2)$

where the symbol $\overset{a}{\sim}$ reads "distributed asymptotically", i.e. as the sample size tends to infinity.
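This can be seen by simulation. A sketch (illustrative Python; uniform draws on [0, 1] are an arbitrary choice of non-normal random numbers, with $\mu = 0.5$ and $s^2 = 1/12$):

```python
import random
import statistics

# The standardized sum (1/sqrt(n)) * sum(v_i - mu) is approximately
# N(0, s^2) even though the v_i themselves are uniform, not normal.

random.seed(0)
n, reps = 500, 2000
sums = [
    sum(random.random() - 0.5 for _ in range(n)) / n ** 0.5
    for _ in range(reps)
]

print(statistics.mean(sums))      # approximately 0
print(statistics.variance(sums))  # approximately 1/12
```

The sample mean and variance of the standardized sums are close to 0 and $s^2 = 1/12$, as the theorem predicts.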
This extends to weighted sums. Let the mean of the $v_i$ be zero ($\mu = 0$). So we also have that

$\dfrac{1}{\sqrt{n}} \sum_{i=1}^{n} w_i v_i \overset{a}{\sim} N\Big(0,\; s^2 \,\mathrm{plim}\, \dfrac{1}{n}\sum_{i=1}^{n} w_i^2\Big)$

where $\mathrm{plim}\, \frac{1}{n}\sum_i w_i^2$ is the probability limit of the sum of squares of the weights. It is a limit for sums of random variables. This limit can be estimated in practice by the sum itself, $\frac{1}{n}\sum_i w_i^2$. We require the limit to be finite:

$\mathrm{plim}\, \dfrac{1}{n}\sum_{i=1}^{n} w_i^2 < \infty$
Applying the CLT to the slope coefficient for OLS

Recall that the OLS estimator can be written as

$\hat{b} - b = \sum_i w_i u_i = \sum_i \dfrac{(X_i - \bar{X})}{\sum_j (X_j - \bar{X})^2}\, u_i$

This is a weighted sum of random variables $u_i$, as in the previous case.
The Central Limit Theorem applied to the OLS estimator

We can apply the Central Limit Theorem to the OLS estimator. Comparing with the previous slide, the weights are

$w_i = \dfrac{X_i - \bar{X}}{\frac{1}{n}\sum_j (X_j - \bar{X})^2}$

Thus according to the Central Limit Theorem we have that

$\sqrt{n}\,(\hat{b} - b) \overset{a}{\sim} N\Big(0,\; \dfrac{\sigma^2}{\mathrm{plim}\, \frac{1}{n}\sum_i (X_i - \bar{X})^2}\Big)$
The implication is that the statistic we had before has a normal distribution in large samples, irrespective of how the error term is distributed, provided it has a constant variance (the homoskedasticity assumption):

$z^* = \dfrac{\hat{b} - d}{\sqrt{\hat{\sigma}^2 / \sum_i (X_i - \bar{X})^2}} \overset{a}{\sim} N(0, 1)$

Note how the $\sqrt{n}$'s cancel from the top and bottom. In fact the test statistic is identical to the one we used under normality. The only difference is that now we will use the critical values of the Normal distribution. For a size of 5% these are +1.96 and −1.96.
The expression in the denominator is nothing but the standard error of the estimator. The test statistic for the special case when we are testing that the coefficient is in fact zero (no impact on Y) is often called the t-statistic. For a large sample test we can accept the hypothesis that a coefficient is zero with a 5% level of significance if the t-statistic is between (−1.96, 1.96).
Example

Regression of log margarine purchases on the log price:

------------------------------------------------------------------------------
        lmap |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       lpsmr |  -.6856449    .284636    -2.41   0.020    -1.256693   -.1145967
       _cons |   4.183766    .534038     7.83   0.000     3.110577    5.256956
------------------------------------------------------------------------------

Test that the price effect is zero. Assume a large enough sample and use the critical values from the Normal distribution. The t-statistic is −0.69/0.28 ≈ −2.4. The 95% Normal critical values are −1.96, 1.96. The hypothesis is rejected. The 95% confidence interval is (−1.26, −0.11). Quite wide, which implies that the coefficient is not very precisely estimated.
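This large-sample test can be sketched directly with the rounded numbers from the slide (−0.69 and 0.28):

```python
# Large-sample test that the margarine price effect is zero, using the
# 5% normal critical value 1.96 and the rounded estimate and standard
# error from the slide.
coef, se, crit = -0.69, 0.28, 1.96

t_stat = coef / se                             # about -2.46
reject = abs(t_stat) > crit                    # outside (-1.96, 1.96)
ci = (coef - crit * se, coef + crit * se)      # -0.69 +/- 1.96 * 0.28

print(round(t_stat, 2), reject, ci)
```

With the rounded inputs the interval comes out at roughly (−1.24, −0.14), close to the tabulated one; the small difference is just rounding.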
Summary

When the error term is normally distributed we can carry out exact tests by comparing the test statistic to critical values from the t-distribution. If the assumption of normality is not believed to hold, we can still carry out inference when our sample is large enough. In this case we simply use the normal distribution.