Week 3

Simple Linear Regression Model and Properties of Least Squares Estimators

The model:

    Y_i = β1 + β2·X_i + u_i

where Y_i is weekly family expenditure and X_i is weekly family income. For a given level of X, the expected level of food expenditure is:

    E(Y | X_i) = β1 + β2·X_i

Properties of the Simple Linear Regression Model

The population parameters β1 and β2 are unknown population constants. The assumptions of the model are:

1. Y_i = β1 + β2·X_i + u_i
2. E(u_i) = 0, which implies E(Y_i) = β1 + β2·X_i
3. var(u_i) = σ² = var(Y_i)
4. cov(u_i, u_j) = cov(Y_i, Y_j) = 0 for i ≠ j
5. X_i is not constant across observations
6. u_i ~ N(0, σ²), which implies Y_i ~ N(β1 + β2·X_i, σ²)

The formulas that produce the sample estimates b1 and b2 are called the estimators of β1 and β2. When b1 and b2 are used to represent the formulas rather than specific values, they are random variables, because they differ from sample to sample.
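The assumptions above can be illustrated with a small simulation. This is a sketch with made-up parameter values (β1 = 40, β2 = 0.5, σ = 5 are my own illustration, not from the slides): we draw u_i ~ N(0, σ²) and check that the sample moments match assumptions 2 and 3.

```python
# Simulate Y_i = beta1 + beta2*X_i + u_i under the model's assumptions
# (all numeric values are illustrative, not the slides' data).
import numpy as np

rng = np.random.default_rng(0)
beta1, beta2, sigma = 40.0, 0.5, 5.0       # assumed "true" population parameters
X = rng.uniform(50, 250, size=100_000)     # weekly family income (not constant)
u = rng.normal(0.0, sigma, size=X.size)    # disturbance satisfying assumptions 2-6
Y = beta1 + beta2 * X + u                  # weekly family expenditure

print(round(float(u.mean()), 2))           # close to 0   (assumption 2)
print(round(float(u.std()), 2))            # close to 5.0 (assumption 3)
```

With 100,000 draws the sample mean and standard deviation of u sit very close to their population values, which is exactly what assumptions 2 and 3 assert about the disturbance.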
Estimators are Random Variables

If the least squares estimators b1 and b2 are random variables, what are their means, variances, covariances and probability distributions? Knowing these lets us compare the properties of alternative estimators with the properties of the least squares estimators.

The Expected Values of b1 and b2

The least squares formulas (estimators) in the simple regression case:

    b2 = (n·ΣX_iY_i − ΣX_i·ΣY_i) / (n·ΣX_i² − (ΣX_i)²)

    b1 = Ȳ − b2·X̄

where Ȳ = ΣY_i/n and X̄ = ΣX_i/n. In deviation form, with x_i = X_i − X̄ and y_i = Y_i − Ȳ:

    b2 = Σx_i y_i / Σx_i²

Substitute Y_i = β1 + β2·X_i + u_i into the b2 formula to get:

    b2 = β2 + (n·ΣX_i u_i − ΣX_i·Σu_i) / (n·ΣX_i² − (ΣX_i)²)

The mean of b2 is:

    E(b2) = β2 + (n·ΣX_i E(u_i) − ΣX_i·ΣE(u_i)) / (n·ΣX_i² − (ΣX_i)²)

Since E(u_i) = 0, it follows that E(b2) = β2.

An Unbiased Estimator

The result E(b2) = β2 means that the distribution of b2 is centered at β2. Since the distribution of b2 is centered at β2, we say that b2 is an unbiased estimator of β2.
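A minimal sketch (simulated data of my own, not the slides' example) that computes b2 with the raw-sum formula and with the deviation form, confirms they agree, and then repeats the estimation over many samples to show that b2 is centered at the true β2:

```python
# Least squares formulas two ways, plus a Monte Carlo check of unbiasedness
# (beta1 = 40, beta2 = 0.5, sigma = 5 are illustrative values).
import numpy as np

def ols(X, Y):
    """Raw-sum least squares formulas."""
    n = X.size
    b2 = (n * (X * Y).sum() - X.sum() * Y.sum()) / (n * (X**2).sum() - X.sum()**2)
    b1 = Y.mean() - b2 * X.mean()
    return b1, b2

rng = np.random.default_rng(1)
beta1, beta2 = 40.0, 0.5
X = rng.uniform(50, 250, size=40)            # regressor held fixed across samples

Y = beta1 + beta2 * X + rng.normal(0, 5, X.size)
x, y = X - X.mean(), Y - Y.mean()
b1, b2 = ols(X, Y)
b2_dev = (x * y).sum() / (x**2).sum()        # deviation-form formula
print(abs(b2 - b2_dev) < 1e-10)              # True: the two forms agree

# Monte Carlo: average of b2 across 5000 samples is close to beta2 = 0.5
draws = [ols(X, beta1 + beta2 * X + rng.normal(0, 5, X.size))[1]
         for _ in range(5000)]
print(round(float(np.mean(draws)), 2))       # 0.5
```

The average of the 5000 slope estimates lands on the true β2, which is the unbiasedness result E(b2) = β2 seen empirically.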
Wrong Model Specification

The unbiasedness result above assumes that we are using the correct model. If the model has the wrong functional form or there are missing variables, then E(u_i) ≠ 0 and E(b2) ≠ β2. For example, if the true model is

    Y_i = β1 + β2·X2_i + (β3·X3_i + v_i)

and we omit X3, the error term becomes u_i = β3·X3_i + v_i, so E(u_i) ≠ 0.

Unbiased Estimator of the Intercept

In a similar manner, the estimator b1 of the intercept (constant term) can be shown to be an unbiased estimator of β1 when the model is correctly specified:

    E(b1) = β1

Equivalent expressions for b2:

    b2 = Σ(X_i − X̄)(Y_i − Ȳ) / Σ(X_i − X̄)² = Σx_i y_i / Σx_i²

Expanding and multiplying top and bottom by n recovers the raw-sum form:

    b2 = (n·ΣX_iY_i − ΣX_i·ΣY_i) / (n·ΣX_i² − (ΣX_i)²)
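The omitted-variable point can be seen numerically. In this hedged sketch (coefficients and the X2–X3 relationship are my own illustration), the true model includes X3, but we regress Y on X2 alone; because X3 is correlated with X2, the slope estimate is systematically biased away from β2:

```python
# Omitted-variable bias: regress Y on X2 alone when the true model also
# contains a correlated X3 (all values illustrative).
import numpy as np

rng = np.random.default_rng(2)
beta1, beta2, beta3 = 10.0, 2.0, 1.5
X2 = rng.uniform(0, 10, size=50)
X3 = 0.8 * X2 + rng.normal(0, 1, X2.size)    # X3 correlated with X2

def slope(X, Y):                             # b2 = sum(x*y)/sum(x^2)
    x, y = X - X.mean(), Y - Y.mean()
    return (x * y).sum() / (x**2).sum()

draws = []
for _ in range(4000):
    v = rng.normal(0, 1, X2.size)
    Y = beta1 + beta2 * X2 + beta3 * X3 + v  # true model
    draws.append(slope(X2, Y))               # estimated model omits X3

# The average slope is well above beta2 = 2.0: E(b2) != beta2 here.
print(round(float(np.mean(draws)), 2))
```

The bias is roughly β3 times the slope of X3 on X2, which is why the average estimate lands near 3.2 rather than 2.0 in this setup.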
Estimating the Variance of the Error Term, σ²

The residuals are:

    û_i = Y_i − b1 − b2·X_i

and an unbiased estimator of σ² is:

    σ̂² = Σû_i² / (n − k)

where the degrees of freedom are df = n − k, with k the number of estimated parameters.

Variance of b2

Given that both Y_i and u_i have variance σ², the variance of the estimator b2 is:

    var(b2) = σ² / Σ(X_i − X̄)² = σ² / Σx_i²

In the worked food expenditure example, var(b2) = 0.7809, so se(b2) = √0.7809 = 0.8836. Note that b2 is a function of the Y values, but var(b2) does not involve Y directly.

Variance of b1

Given b1 = Ȳ − b2·X̄, the variance of the estimator b1 is:

    var(b1) = σ²·ΣX_i² / (n·Σ(X_i − X̄)²) = σ²·ΣX_i² / (n·Σx_i²)

In the worked example, var(b1) = 87.38, so se(b1) = √87.38 ≈ 9.34.

Covariance of b1 and b2

    cov(b1, b2) = −X̄·σ² / Σ(X_i − X̄)² = −X̄·var(b2)

If X̄ = 0, then cov(b1, b2) = 0: the slope estimate can change without affecting the intercept estimate.
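The variance and covariance formulas can be checked against a Monte Carlo experiment. This sketch uses illustrative data (not the slides' worked example): it compares the theoretical var(b2) = σ²/Σx² and cov(b1, b2) = −X̄·σ²/Σx² with the empirical variance and covariance of the estimates over repeated samples.

```python
# Numerical check of var(b2) and cov(b1, b2) against their formulas
# (parameter values are illustrative).
import numpy as np

rng = np.random.default_rng(3)
beta1, beta2, sigma = 40.0, 0.5, 5.0
X = rng.uniform(50, 250, size=30)
x = X - X.mean()

theory_var_b2 = sigma**2 / (x**2).sum()
theory_cov    = -X.mean() * sigma**2 / (x**2).sum()  # negative since mean(X) > 0

b1s, b2s = [], []
for _ in range(20000):
    Y = beta1 + beta2 * X + rng.normal(0, sigma, X.size)
    b2 = (x * (Y - Y.mean())).sum() / (x**2).sum()
    b1s.append(Y.mean() - b2 * X.mean())
    b2s.append(b2)

print(round(float(np.var(b2s) / theory_var_b2), 2))        # ratio close to 1
print(round(float(np.cov(b1s, b2s)[0, 1] / theory_cov), 2))  # ratio close to 1
```

Both ratios come out near 1, and the empirical covariance is negative, matching the slide's point that a positive X̄ makes cov(b1, b2) negative.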
What Factors Determine the Variances and Covariance?

1. σ²: greater uncertainty about the Y values means greater uncertainty about b1, b2 and their relationship.
2. The more spread out the X values, the more confidence we have in b1, b2, etc.
3. The larger the sample size n, the smaller the variances and covariances.
4. The variance of b1 is large when the (squared) X values are far from zero (in either direction).
5. Changing the slope b2 has no effect on the intercept b1 when the sample mean of X is zero. But if the sample mean is positive, the covariance between b1 and b2 is negative, and vice versa.

Properties of Least Squares Residuals

1. Σû_i = 0
2. Σû_i X_i = 0
3. Σ(Ŷ_i − Ȳ)(Y_i − Ŷ_i) = 0, i.e. Σû_i ŷ_i = 0
4. The linear regression line passes through the sample means of X and Y.

Decomposition of the Sum of Squares

    Y_i − Ȳ = (Ŷ_i − Ȳ) + (Y_i − Ŷ_i)

To measure variation, square both sides and sum over the sample; the cross-product term vanishes by residual property 3, leaving:

    Σ(Y_i − Ȳ)² = Σ(Ŷ_i − Ȳ)² + Σ(Y_i − Ŷ_i)²
        TSS     =      ESS       +      RSS

where TSS is the total sum of squares, ESS the explained sum of squares, and RSS the residual sum of squares (Σû_i²).

r² — A Measure of Goodness of Fit

    r² = ESS/TSS = 1 − RSS/TSS = 1 − Σû_i² / Σ(Y_i − Ȳ)²

with 0 ≤ r² ≤ 1.
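The residual properties and the sum-of-squares decomposition can be verified on any fitted regression. A sketch with simulated data (values are illustrative):

```python
# Verify the residual properties and TSS = ESS + RSS on simulated data.
import numpy as np

rng = np.random.default_rng(4)
X = rng.uniform(0, 10, size=25)
Y = 3.0 + 2.0 * X + rng.normal(0, 1.5, X.size)

x, y = X - X.mean(), Y - Y.mean()
b2 = (x * y).sum() / (x**2).sum()
b1 = Y.mean() - b2 * X.mean()
Yhat = b1 + b2 * X
u = Y - Yhat                                  # least squares residuals

TSS = ((Y - Y.mean())**2).sum()
ESS = ((Yhat - Y.mean())**2).sum()
RSS = (u**2).sum()

print(abs(u.sum()) < 1e-8)                    # True: residuals sum to zero
print(abs((u * X).sum()) < 1e-8)              # True: residuals orthogonal to X
print(abs(TSS - (ESS + RSS)) < 1e-8)          # True: decomposition holds
```

Property 4 follows from the first check together with b1 = Ȳ − b2·X̄: the fitted line evaluated at X̄ gives exactly Ȳ.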
Alternative r² Expressions

    r² = ESS/TSS = Σ(b2·x_i)² / Σy_i² = b2²·Σx_i² / Σy_i²

Substituting b2 = Σx_i y_i / Σx_i²:

    r² = (Σx_i y_i)² / (Σx_i² · Σy_i²)

Note that corr(X, Y) = Σx_i y_i / (Σx_i² · Σy_i²)^(1/2), so r = ±√r²: r² is the squared sample correlation between X and Y.

[Figure: scatter plots comparing sample regression functions (SRFs). As r² → 0, no SRF fits the points well; as r² → 1, the SRF passes through all the points. Interactive demo: http://www.ruf.rice.edu/~lane/stat_sim/comp_r/index.html]
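A quick numerical check (simulated data, illustrative values) that the two r² expressions agree and that r² equals the squared sample correlation between X and Y:

```python
# Check the equivalence of the r^2 formulas and r^2 = corr(X, Y)^2.
import numpy as np

rng = np.random.default_rng(5)
X = rng.uniform(0, 10, size=40)
Y = 1.0 + 0.8 * X + rng.normal(0, 2.0, X.size)

x, y = X - X.mean(), Y - Y.mean()
r2_ratio = (x * y).sum()**2 / ((x**2).sum() * (y**2).sum())  # (Σxy)²/(Σx²Σy²)
b2 = (x * y).sum() / (x**2).sum()
r2_ess = b2**2 * (x**2).sum() / (y**2).sum()                 # ESS/TSS form

corr = np.corrcoef(X, Y)[0, 1]
print(abs(r2_ratio - r2_ess) < 1e-12)        # True: the two forms agree
print(abs(corr**2 - r2_ratio) < 1e-12)       # True: r^2 = corr(X, Y)^2
```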
Gauss-Markov Theorem

Given the assumptions of the classical linear regression model, the ordinary least squares (OLS) estimators b1 and b2 are the best linear unbiased estimators (BLUE) of β1 and β2. This means that b1 and b2 have the smallest variance of all linear unbiased estimators of β1 and β2.

Note: the Gauss-Markov theorem does not apply to non-linear estimators.

Unbiasedness: the expected value of the estimator b_k equals the true value of β_k.

[Figure: sampling distributions of an estimator b around the true value β. If E(b) = β, the estimator is unbiased; if E(b) < β, it is biased and underestimates; if E(b) > β, it is biased and overestimates.]
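The BLUE property can be illustrated with a toy comparison of my own (not from the slides): the "two-point" estimator, the change in Y between the smallest and largest X divided by the X range, is also linear in Y and unbiased for β2, but its sampling variance exceeds that of OLS, as Gauss-Markov guarantees.

```python
# Compare OLS with another linear unbiased estimator of the slope
# (illustrative parameter values).
import numpy as np

rng = np.random.default_rng(6)
beta1, beta2, sigma = 1.0, 2.0, 3.0
X = np.sort(rng.uniform(0, 10, size=30))
x = X - X.mean()

ols_draws, twopoint_draws = [], []
for _ in range(10000):
    Y = beta1 + beta2 * X + rng.normal(0, sigma, X.size)
    ols_draws.append((x * (Y - Y.mean())).sum() / (x**2).sum())
    twopoint_draws.append((Y[-1] - Y[0]) / (X[-1] - X[0]))  # linear, unbiased

# Both are centered at beta2 = 2, but OLS has the smaller variance.
print(abs(float(np.mean(ols_draws)) - beta2) < 0.05)        # True
print(abs(float(np.mean(twopoint_draws)) - beta2) < 0.1)    # True
print(float(np.var(ols_draws)) < float(np.var(twopoint_draws)))  # True
```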
Probability Distribution of the Least Squares Estimators

Under the normality assumption on u_i (or, in large samples, the Central Limit Theorem), the estimators are normally distributed:

    b1 ~ N( β1 , σ²·ΣX_i² / (n·Σx_i²) )

    b2 ~ N( β2 , σ² / Σx_i² )

If the Gauss-Markov assumptions hold and the sample size n is sufficiently large, the least squares estimators b1 and b2 have a distribution that approximates the normal distribution, with greater accuracy the larger the sample size n.

Efficiency: b_k is an efficient unbiased estimator if, for a given sample size n, its variance is the smallest among unbiased estimators.

Consistency: b_k is a consistent estimator of β_k if the probability that |b_k − β_k| < ε approaches 1 as the sample size grows (ε a very small number). As the sample size gets larger, b_k estimates β_k more and more accurately.

The Least Squares Predictor, Ŷ

Given a value of the explanatory variable X, we predict the value of the dependent variable Y from the OLS estimates. The least squares predictor is:

    Ŷ = b1 + b2·X
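Consistency can be made concrete with a sketch (illustrative values of my own): as n grows, the sampling distribution of b2 concentrates around the true β2, so the probability that b2 falls within any small band ε of β2 approaches 1.

```python
# Consistency: coverage of a fixed band around beta2 rises with n.
import numpy as np

rng = np.random.default_rng(7)
beta1, beta2, sigma = 1.0, 2.0, 3.0

def b2_draws(n, reps=2000):
    """Draw `reps` slope estimates from samples of size n."""
    out = []
    for _ in range(reps):
        X = rng.uniform(0, 10, size=n)
        Y = beta1 + beta2 * X + rng.normal(0, sigma, n)
        x = X - X.mean()
        out.append((x * (Y - Y.mean())).sum() / (x**2).sum())
    return np.array(out)

small, large = b2_draws(20), b2_draws(500)
eps = 0.2
cov_small = float(np.mean(np.abs(small - beta2) < eps))
cov_large = float(np.mean(np.abs(large - beta2) < eps))
print(cov_small)   # moderate coverage at n = 20
print(cov_large)   # essentially 1.0 at n = 500
```

The same draws also show the variance formula at work: var(b2) = σ²/Σx² shrinks roughly in proportion to 1/n, which is what drives the coverage toward 1.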
Summary of the BLUE Estimators

Means:

    E(b1) = β1  and  E(b2) = β2

Variances:

    var(b1) = σ²·ΣX_i² / (n·Σx_i²)  and  var(b2) = σ² / Σx_i²

Standard error (standard deviation):

    se(b_k) = √var(b_k)

Estimated Error Variance

    σ̂² = Σ_{i=1}^{n} û_i² / (n − k),   with E(σ̂²) = σ²

Standard Error of the Regression (Estimate), SEE

    SEE = σ̂ = √( Σû_i² / (n − k) )

where k is the number of independent variables plus the constant term.
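The summary formulas fit together in a few lines of code. This sketch (simulated data, illustrative values) computes σ̂² = RSS/(n − k), the SEE, and the estimated standard errors of both coefficients, with k = 2 in the simple regression (slope plus constant):

```python
# Putting the summary formulas together on simulated data.
import numpy as np

rng = np.random.default_rng(8)
X = rng.uniform(0, 10, size=30)
Y = 5.0 + 1.2 * X + rng.normal(0, 2.0, X.size)

n, k = X.size, 2
x = X - X.mean()
b2 = (x * (Y - Y.mean())).sum() / (x**2).sum()
b1 = Y.mean() - b2 * X.mean()
u = Y - b1 - b2 * X

sigma2_hat = (u**2).sum() / (n - k)           # unbiased estimator of sigma^2
see = np.sqrt(sigma2_hat)                     # standard error of the regression
se_b2 = np.sqrt(sigma2_hat / (x**2).sum())
se_b1 = np.sqrt(sigma2_hat * (X**2).sum() / (n * (x**2).sum()))
print(round(float(see), 2), round(float(se_b1), 2), round(float(se_b2), 2))
```

Since the data were generated with σ = 2, the SEE lands near 2; replacing σ² with σ̂² in the variance formulas is exactly how the estimated standard errors are obtained in practice.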