Ordinary Least Squares (OLS): Simple Linear Regression (SLR) Analytics

The SLR Setup
Sample Statistics
Ordinary Least Squares (OLS): FOCs and SOCs
Back to OLS and Sample Statistics
Predictions (and Residuals) with OLS Estimates
Examples in Excel and Stata

The SLR Setup

1. You have a dataset consisting of n observations of (x, y): $\{(x_i, y_i)\},\ i = 1, 2, \dots, n$.

2. You believe that, except for random noise in the data, there is a linear relationship between the x's and the y's: $y_i \approx \beta_0 + \beta_1 x_i$. You are interested in estimating the unknown parameters $\beta_0$ and $\beta_1$.

3. If there were no noise in the data, then since $y_i = \beta_0 + \beta_1 x_i$ for all observations, we could easily determine $\beta_0$ and $\beta_1$ (see footnote 1). But typically, the relationship is not exact in the observed data.

4. Call your parameter estimates $\hat\beta_0$ and $\hat\beta_1$, and your predicted y values $\hat y_i = \hat\beta_0 + \hat\beta_1 x_i$.

5. We call the difference between the observed $y_i$ and the predicted value $\hat y_i = \hat\beta_0 + \hat\beta_1 x_i$ the residual, $\hat u_i$: $\hat u_i = y_i - \hat y_i = y_i - (\hat\beta_0 + \hat\beta_1 x_i)$.

6. Here is an example (negative residuals for #1 and #2; positive residuals for #3 and #4):

[Figure: datapoints #1 to #4 with the fitted line and their residuals.]

Footnote 1: $\beta_1$ is just the slope of the line connecting any two datapoints, and $\beta_0 = y_i - \beta_1 x_i$ for any datapoint.
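Points 4 and 5 can be sketched in a few lines of Python. This is a minimal illustration only: the four datapoints are invented, the candidate values passed as `b0_hat` and `b1_hat` are an arbitrary guess rather than OLS estimates, and the function names `predict` and `residuals` are hypothetical.

```python
# Toy data, invented for illustration only (not from the handout).
x = [1.0, 2.0, 3.0, 4.0]
y = [1.5, 2.5, 4.5, 5.5]


def predict(b0_hat, b1_hat, xs):
    """Predicted values: y_hat_i = b0_hat + b1_hat * x_i."""
    return [b0_hat + b1_hat * xi for xi in xs]


def residuals(b0_hat, b1_hat, xs, ys):
    """Residuals: u_hat_i = y_i - y_hat_i."""
    y_hat = predict(b0_hat, b1_hat, xs)
    return [yi - yh for yi, yh in zip(ys, y_hat)]


# Candidate line y_hat = 1.0 + 1.0 * x (an arbitrary guess, not an OLS fit).
u_hat = residuals(1.0, 1.0, x, y)
print(u_hat)  # [-0.5, -0.5, 0.5, 0.5]
```

As in the example above, the first two datapoints lie below the candidate line (negative residuals) and the last two lie above it (positive residuals).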
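OLS: SLR Analytics v3

Sample Statistics

7. The sample mean: $\bar x = \frac{1}{n}\sum_i x_i$ and $\bar y = \frac{1}{n}\sum_i y_i$. Note that $\sum_i x_i = n\bar x$.

8. Deviations from means: the $(x_i - \bar x)$'s and $(y_i - \bar y)$'s.
   a. By construction, the total of the deviations from the means will be zero: $\sum_i (x_i - \bar x) = 0$ and $\sum_i (y_i - \bar y) = 0$.

9. Squared deviations from means:
   $\sum_i (y_i - \bar y)^2 = \sum_i (y_i^2 - 2 y_i \bar y + \bar y^2) = \sum_i y_i^2 - 2\bar y \sum_i y_i + n\bar y^2 = \sum_i y_i^2 - 2n\bar y^2 + n\bar y^2 = \sum_i y_i^2 - n\bar y^2$,
   and likewise for the x's.

10. The sample variance: $S_{yy} = S_y^2 = \frac{1}{n-1}\sum_i (y_i - \bar y)^2 = \frac{1}{n-1}\left(\sum_i y_i^2 - n\bar y^2\right)$. This is almost the average squared deviation from the mean (except we divide by n-1, not n; the reason for this will become clear when we consider unbiased estimation).

11. The sample standard deviation: $S_y = \sqrt{\frac{1}{n-1}\sum_i (y_i - \bar y)^2}$, the square root of the sample variance.

12. Sum of the product of the deviations from means:
   $\sum_i (x_i - \bar x)(y_i - \bar y) = \sum_i (x_i y_i - x_i \bar y - \bar x y_i + \bar x \bar y) = \sum_i x_i y_i - n\bar x\bar y - n\bar x\bar y + n\bar x\bar y = \sum_i x_i y_i - n\bar x\bar y$.

13. The sample covariance: $S_{xy} = \frac{1}{n-1}\sum_i (x_i - \bar x)(y_i - \bar y) = \frac{1}{n-1}\left(\sum_i x_i y_i - n\bar x\bar y\right)$. Again, this is almost the average product of the deviations from the means (except we again divide by n-1, not n; and yes, this is also related to unbiased estimation).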
   a. Some examples: In the following examples, $\bar x = 0$ and $\bar y = 0$. On the left, most of the data are in quadrants I and III, where $(x_i - \bar x)(y_i - \bar y) > 0$, and so when you sum those products you get a positive sample covariance. Most of the action on the right is in quadrants II and IV, where $(x_i - \bar x)(y_i - \bar y) < 0$, and so those products sum to a negative covariance.

14. The sample correlation: $\rho_{xy} = \frac{S_{xy}}{S_x S_y}$, the ratio of the sample covariance to the product of the sample standard deviations.
   a. It may not be so obvious, but by construction, $-1 \le \rho_{xy} \le 1$, or $|\rho_{xy}| \le 1$.
   b. If $S_{xy} = 0$, the sample covariance is 0 and the sample correlation is also 0. And if the sample covariance is negative (positive), then so is the sample correlation (since sample standard deviations are always positive, so long as they are well defined and not zero).
   c. If $|\rho_{xy}|$ is close to 1, then the relationship between x and y will look quite linear (with a positive slope if $\rho_{xy} > 0$, and a negative slope if $\rho_{xy} < 0$).
      i. If $y_i = \beta_0 + \beta_1 x_i$ for all i, so there is in fact an exact linear relationship between the x's and y's, then the sample correlation is +1 or -1 (see footnote 2).
   d. And as $\rho_{xy}$ gets closer to 0, the relationship between x and y looks less and less linear.

Footnote 2: Proof (for $\beta_1 \ne 0$). Since
$S_{xy} = \frac{1}{n-1}\sum_i (x_i - \bar x)(y_i - \bar y) = \frac{1}{n-1}\sum_i (x_i - \bar x)\left[\beta_0 + \beta_1 x_i - (\beta_0 + \beta_1 \bar x)\right] = \frac{\beta_1}{n-1}\sum_i (x_i - \bar x)^2 = \beta_1 S_{xx}$,
and since $S_{yy} = \frac{1}{n-1}\sum_i (y_i - \bar y)^2 = \beta_1^2 S_{xx}$, we have $S_y = |\beta_1| S_x$. So:
$\rho_{xy} = \frac{S_{xy}}{S_x S_y} = \frac{\beta_1 S_{xx}}{|\beta_1| S_x S_x} = \frac{\beta_1}{|\beta_1|} = \pm 1$.
   e. So: Correlation captures the extent to which x and y are moving together in a linear fashion.

Ordinary Least Squares (OLS): FOCs and SOCs

15. Minimize Sum (of the) Squared Residuals (SSRs)
   a. We can estimate the intercept and slope parameters, $\beta_0$ and $\beta_1$, by minimizing the sum of the squared residuals (SSRs) or errors. This explains where the "least squares" part of OLS comes from. (We square the residuals so that positive and negative residuals don't offset one another.)
   b. So the challenge is to find the slope coefficient ($b_1$) and intercept coefficient ($b_0$) that together minimize
      $SSR = \sum_i \hat u_i^2 = \sum_i \left(y_i - (b_0 + b_1 x_i)\right)^2$.
   c. Finding the minimum:
      i. First Order Conditions (FOCs): A standard approach to finding the minimum of a function is to evaluate its (partial) derivatives and exploit the fact that at the minimum, the first derivatives will be zero (this is called the first order condition, FOC).
      ii. Second Order Conditions (SOCs): We'll also typically want to ensure that we have a minimum and not, say, a maximum, by checking the second derivatives as well. At the minimum, the second (partial) derivatives at the point satisfying the FOC will be positive. We will be skipping this step as it can get a bit complicated.
   d. Focusing on the FOCs for our minimization problem: minimize $SSR = \sum_i \left(y_i - (b_0 + b_1 x_i)\right)^2$ with respect to (wrt) $b_0$ and $b_1$.
      i. FOC 1: Differentiating wrt $b_0$:
         $\frac{\partial SSR}{\partial b_0} = -2\sum_i (y_i - b_0 - b_1 x_i) = -2n\bar y + 2n b_0 + 2 b_1 n \bar x = 0$, and so $b_0 = \bar y - b_1 \bar x$.
      ii. FOC 2: Differentiating wrt $b_1$: Since $b_0 = \bar y - b_1 \bar x$,
         $SSR = \sum_i \left(y_i - (\bar y - b_1 \bar x + b_1 x_i)\right)^2 = \sum_i \left((y_i - \bar y) - b_1 (x_i - \bar x)\right)^2$,
         and we want to minimize this expression wrt $b_1$:
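         $\frac{\partial SSR}{\partial b_1} = -2\sum_i (x_i - \bar x)\left[(y_i - \bar y) - b_1 (x_i - \bar x)\right] = 0$.
         So $\sum_i (x_i - \bar x)(y_i - \bar y) - b_1 \sum_i (x_i - \bar x)^2 = 0$, and
         $b_1 = \frac{\sum_i (x_i - \bar x)(y_i - \bar y)}{\sum_i (x_i - \bar x)^2}$.
   e. The SOCs are more complicated, but it is easy to see that at the coefficient values satisfying the two FOCs, the two second partial derivatives (wrt $b_0$ and wrt $b_1$) are both positive, suggesting that we have a minimum:
      $\frac{\partial^2 SSR}{\partial b_0^2} = 2n > 0$ and $\frac{\partial^2 SSR}{\partial b_1^2} = 2\sum_i (x_i - \bar x)^2 > 0$.

16. OLS coefficient estimates:
   a. For the given sample, the OLS estimates of the unknown intercept and slope parameters are (notice that we use hats to denote estimates):
      $\hat\beta_1 = \frac{\sum_i (x_i - \bar x)(y_i - \bar y)}{\sum_i (x_i - \bar x)^2}$, and $\hat\beta_0 = \bar y - \hat\beta_1 \bar x$.
   b. $\hat\beta_0$ and sample means: Since $\hat\beta_0 = \bar y - \hat\beta_1 \bar x$, the estimated intercept is the sample mean of the y's minus $\hat\beta_1$ times the sample mean of the x's.
      i. The estimate of the intercept assures that the average predicted value, $\hat\beta_0 + \hat\beta_1 \bar x$, is the same as the average observed value $\bar y$, since $\hat\beta_0 + \hat\beta_1 \bar x = (\bar y - \hat\beta_1 \bar x) + \hat\beta_1 \bar x = \bar y$.
   c. $\hat\beta_1$ and sample variances, covariance and correlation:
      i. Using our sample statistics notation:
         $\hat\beta_1 = \frac{\sum_i (x_i - \bar x)(y_i - \bar y)}{\sum_i (x_i - \bar x)^2} = \frac{\sum_i (x_i - \bar x)(y_i - \bar y)/(n-1)}{\sum_i (x_i - \bar x)^2/(n-1)} = \frac{S_{xy}}{S_{xx}}$.
      ii. Thus, the OLS slope estimator is just the ratio of the sample covariance of the x's and y's to the sample variance of the x's:
         $\hat\beta_1 = \frac{\text{Sample Covariance}(x, y)}{\text{Sample Variance}(x)}$.
      iii. Recall that the sample correlation is defined by $\rho_{xy} = \frac{S_{xy}}{S_x S_y}$, where $S_x$ and $S_y$ are the square roots of the respective sample variances.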
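      iv. Since $\rho_{xy} = \frac{S_{xy}}{S_x S_y}$, we have:
         $\hat\beta_1 = \frac{S_{xy}}{S_{xx}} = \frac{S_{xy}}{S_x S_y}\cdot\frac{S_y}{S_x} = \rho_{xy}\frac{S_y}{S_x}$.
      v. So the regression slope coefficient is the product of the sample correlation between the x's and y's and the ratio of the two estimated standard deviations:
         $\hat\beta_1 = \text{Sample Correlation}(x, y)\cdot\frac{\text{Sample StdDev}(y)}{\text{Sample StdDev}(x)}$.
         If the two sample standard deviations are the same, then the estimated slope coefficient will be the estimated correlation between the x's and y's.
   d. Important aside: Since $\sum_i (x_i - \bar x) = 0$,
      $\sum_i (x_i - \bar x)(y_i - \bar y) = \sum_i (x_i - \bar x) y_i - \bar y \sum_i (x_i - \bar x) = \sum_i (x_i - \bar x) y_i$.
      Accordingly, $\hat\beta_1 = \frac{\sum_i (x_i - \bar x) y_i}{\sum_i (x_i - \bar x)^2}$. This will prove useful later.
   e. $\hat\beta_1$ is a weighted average of slopes: The estimated slope coefficient is a weighted average of the slopes of the lines joining the various datapoints to the sample means:
      $\hat\beta_1 = \sum_i w_i \frac{(y_i - \bar y)}{(x_i - \bar x)}$.
      i. This result holds because
         $\hat\beta_1 = \frac{\sum_i (x_i - \bar x)(y_i - \bar y)}{(n-1) S_{xx}} = \sum_i \frac{(x_i - \bar x)^2}{(n-1) S_{xx}}\cdot\frac{(y_i - \bar y)}{(x_i - \bar x)} = \sum_i w_i \frac{(y_i - \bar y)}{(x_i - \bar x)}$,
         where the sum runs over datapoints with $x_i \ne \bar x$ (datapoints with $x_i = \bar x$ get zero weight).
      ii. $\frac{(y_i - \bar y)}{(x_i - \bar x)}$ is the slope of the line connecting $(x_i, y_i)$ to $(\bar x, \bar y)$.
      iii. $w_i = \frac{(x_i - \bar x)^2}{(n-1) S_{xx}} = \frac{(x_i - \bar x)^2}{\sum_j (x_j - \bar x)^2}$ are non-negative weights, which sum to 1, so the slopes are weighted proportionally to $(x_i - \bar x)^2$, the square of the various x-distances from the x mean.
      iv. In this interpretation, datapoints are not weighted equally. Those that are farther away from $\bar x$ (in the x dimension) get greater weight, and that weight increases with the square of the x-distance from $\bar x$.
      v. See the posted handout for an example.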
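The equivalence of the slope formulas in item 16 can be checked numerically. The sketch below uses a small invented dataset (not the handout's data) and only the Python standard library; all variable names are illustrative assumptions.

```python
import math

# Toy data, invented for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]
n = len(x)

x_bar = sum(x) / n
y_bar = sum(y) / n

# Sample variances and covariance (dividing by n - 1, not n).
s_xx = sum((xi - x_bar) ** 2 for xi in x) / (n - 1)
s_yy = sum((yi - y_bar) ** 2 for yi in y) / (n - 1)
s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / (n - 1)

# Sample correlation: covariance over the product of standard deviations.
rho_xy = s_xy / (math.sqrt(s_xx) * math.sqrt(s_yy))

# Slope, three ways.
b1_cov_over_var = s_xy / s_xx
b1_corr_times_sd_ratio = rho_xy * math.sqrt(s_yy) / math.sqrt(s_xx)

# Weighted average of the slopes of the lines joining each datapoint to
# (x_bar, y_bar); points with x_i == x_bar have zero weight and are skipped.
ss_x = sum((xi - x_bar) ** 2 for xi in x)
b1_weighted_slopes = sum(
    ((xi - x_bar) ** 2 / ss_x) * ((yi - y_bar) / (xi - x_bar))
    for xi, yi in zip(x, y)
    if xi != x_bar
)

# Intercept from the first FOC.
b0_hat = y_bar - b1_cov_over_var * x_bar

print(b1_cov_over_var, b1_corr_times_sd_ratio, b1_weighted_slopes, b0_hat)
```

On this toy data the three slope computations agree at 0.6 (up to floating-point rounding), and the intercept is 2.2.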
Predictions (and Residuals) with OLS Estimates

17. OLS coefficient estimates yield predicted values and residuals:
   a. Predicted values: For given $x_i$, the predicted y value given the estimated coefficients is $\hat y_i = \hat\beta_0 + \hat\beta_1 x_i$ (as above, we typically use hats for predicted or estimated values).
   b. Residuals: For the given predicted y value, the residual, $\hat u_i$, is as above the difference between the actual and predicted values: $\hat u_i = y_i - \hat y_i = y_i - (\hat\beta_0 + \hat\beta_1 x_i)$.

18. Sample Regression Function (SRF): The estimated equation that generates the predicted values, $\hat y = \hat\beta_0 + \hat\beta_1 x$, is called the Sample Regression Function.
   a. SRFs will depend on the actual sample used to estimate the slope and intercept parameters: different samples will typically lead to different parameter estimates and, accordingly, different SRFs.
   b. $\hat\beta_0 = \bar y - \hat\beta_1 \bar x$ assures that the SRF passes through $(\bar x, \bar y)$, since the value of the SRF at $\bar x$ is $\hat\beta_0 + \hat\beta_1 \bar x = (\bar y - \hat\beta_1 \bar x) + \hat\beta_1 \bar x = \bar y$.
   c. The sample correlation between the actuals and the predicted values is the same as the sample correlation between the actuals and the x's: $\rho_{y\hat y} = \rho_{yx}$ (see footnote 3). This holds when $\hat\beta_1 > 0$; when $\hat\beta_1 < 0$ the two correlations have the same magnitude but opposite signs.

19. SRFs and elasticities: measuring economic significance
   a. We typically use derivatives and elasticities to estimate the responsiveness of the predicted values (the $\hat y$'s) to changes in the explanatory variable x.
   b. Derivatives will be sensitive to units of measurement but elasticities are not. That is why it is not uncommon to use elasticities to measure economic significance: Is the estimated relationship sizable or noteworthy, or is it so small that it is of little consequence?
   c. Using the SRF to estimate relationships:
      i. Derivatives: The estimated average marginal relationship between x and y: $\frac{d\hat y}{dx} = \hat\beta_1$.
      ii. (Point) Elasticity: $\frac{d\hat y}{dx}\cdot\frac{x}{\hat y} = \hat\beta_1 \frac{x}{\hat y}$, evaluated at $(x, \hat y) = (x, \hat\beta_0 + \hat\beta_1 x)$, or somewhere on the SRF.

Footnote 3: $\rho_{y\hat y} = \frac{S_{y\hat y}}{S_y S_{\hat y}}$. But since $S_{y\hat y} = \hat\beta_1 S_{yx}$, and since $S_{\hat y\hat y} = \hat\beta_1^2 S_{xx}$ (so that $S_{\hat y} = \hat\beta_1 S_x$ when $\hat\beta_1 > 0$),
$\rho_{y\hat y} = \frac{S_{y\hat y}}{S_y S_{\hat y}} = \frac{\hat\beta_1 S_{yx}}{S_y \hat\beta_1 S_x} = \frac{S_{yx}}{S_y S_x} = \rho_{yx}$.
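         Where you evaluate the elasticity on the SRF is often arbitrary, but be sure to evaluate the elasticity at some point on the SRF. You will typically get different elasticities depending on where along the SRF you estimate the elasticity.
         We often evaluate the elasticity at the means: $\hat\beta_1 \frac{\bar x}{\bar y}$. Recall that the mean of the predicted values will be $\bar y$ and that the SRF passes through $(\bar x, \bar y) = (\bar x, \hat\beta_0 + \hat\beta_1 \bar x)$.

20. Properties of OLS residuals
   a. Recall that $\hat u_i = y_i - \hat y_i = y_i - (\hat\beta_0 + \hat\beta_1 x_i)$.
   b. The average residual is zero: $\frac{1}{n}\sum_i \hat u_i = \bar y - (\hat\beta_0 + \hat\beta_1 \bar x) = 0$.
   c. The sample correlation between the x's and the $\hat u$'s is zero, since $\sum_i \hat u_i (x_i - \bar x) = 0$.
      Proof: $\sum_i (y_i - \hat y_i)(x_i - \bar x) = \sum_i \left(y_i - (\hat\beta_0 + \hat\beta_1 x_i)\right)(x_i - \bar x) = \sum_i \left((y_i - \bar y) - \hat\beta_1 (x_i - \bar x)\right)(x_i - \bar x) = \sum_i (y_i - \bar y)(x_i - \bar x) - \hat\beta_1 \sum_i (x_i - \bar x)^2 = 0$, given the definition of $\hat\beta_1$.
   d. The sample correlation between the predicted values (the $\hat y$'s) and the residuals (the $\hat u$'s) is zero.
      Proof: Since the mean residual is zero and $\hat y_i - \bar y = \hat\beta_1 (x_i - \bar x)$, we have $\sum_i \hat u_i (\hat y_i - \bar y) = \hat\beta_1 \sum_i (y_i - \hat y_i)(x_i - \bar x)$, which is zero (see previous proof).
   e. Decomposition: And so OLS essentially decomposes the actuals into two uncorrelated parts, predicteds and residuals: $y_i = \hat y_i + \hat u_i$ and $\rho_{\hat y\hat u} = 0$. This result will prove useful later.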
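The residual properties and the elasticity at the means can be verified on a small example. This is a sketch under stated assumptions: the data are invented for illustration, and every variable name is hypothetical rather than taken from the handout.

```python
# Toy data, invented for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]
n = len(x)

x_bar = sum(x) / n
y_bar = sum(y) / n

# OLS estimates from the slope and intercept formulas.
b1_hat = (
    sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    / sum((xi - x_bar) ** 2 for xi in x)
)
b0_hat = y_bar - b1_hat * x_bar

# Predicted values and residuals.
y_hat = [b0_hat + b1_hat * xi for xi in x]
u_hat = [yi - yh for yi, yh in zip(y, y_hat)]

# Property b: the average residual is zero.
mean_resid = sum(u_hat) / n
# Property c: residuals are uncorrelated with the x's.
cross_u_x = sum(u * (xi - x_bar) for u, xi in zip(u_hat, x))
# Property d: residuals are uncorrelated with the predicted values.
cross_u_yhat = sum(u * (yh - y_bar) for u, yh in zip(u_hat, y_hat))

# Elasticity evaluated at the sample means, a point on the SRF.
elasticity_at_means = b1_hat * x_bar / y_bar

print(mean_resid, cross_u_x, cross_u_yhat)  # all zero up to rounding error
print(elasticity_at_means)
```

With these numbers the slope is 0.6 and the means are (3, 4), so the elasticity at the means is 0.6 * 3 / 4 = 0.45.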
Examples in Excel and Stata

Let's first do this in Excel.

Open the bodyfat.xlsx file in Excel. Generate the x-y scatterplot of Brozek v. wgt, and add a trendline. You should see something like:

[Screenshot: worksheet with columns Case, wgt, Brozek, wgt-wbar, Brozek-Bbar and product, overlaid with a scatterplot of Brozek against wgt and the trendline y = 0.1617x - 9.995.]

For Brozek and wgt, compute sample means, variances, standard deviations, as well as the covariance and correlation, and apply the various formulae for the OLS slope and intercept estimates. You should get something like (parentheses denote negative values):

                        wgt          Brozek
   Sample Variances     863.72       60.08
   StDevs               29.39        7.75
   Means                178.924      18.938
   Sum Squares          216,794.4    15,079.0

   Sample Cov           139.67
   Sample Corr          0.6132
   Sum (product)        35,057.55

   Slope estimates
      Sxy/Sxx           0.1617
      corr*(Sy/Sx)      0.1617
   Intercept estimate
      Bbar-b*wbar       (9.995)

   Case   wgt      Brozek   wgt-wbar   Brozek-Bbar   product
   1      154.25   12.6     (24.67)    (6.34)        156.40
   2      173.25   6.9      (5.67)     (12.04)       68.31
   3      154      24.6     (24.92)    5.66          (141.11)
   4      184.75   10.9     5.83       (8.04)        (46.83)
   5      184.25   27.8     5.33       8.86          47.19

So who knew? The Excel Trendline is generated by OLS!
Running regressions in Excel

You can also run the OLS regression in Excel using Data/Data Analysis/Regression (you may have to load the Data Analysis ToolPak first: go to Options/Add-Ins).

SUMMARY OUTPUT

   Regression Statistics
   Multiple R            0.61316
   R Square              0.37596
   Adjusted R Square     0.37346
   Standard Error        6.135
   Observations          252

   ANOVA
                 df     SS          MS         F        Significance F
   Regression    1      5,669.1     5,669.1    150.62   2.0595E-27
   Residual      250    9,409.9     37.64
   Total         251    15,079.0

                 Coefficients   Standard Error   t Stat   P-value      Lower 95%   Upper 95%
   Intercept     (9.995)        2.389            (4.18)   3.9776E-05   (14.7004)   (5.2899)
   wgt           0.1617         0.0132           12.27    2.0595E-27   0.1358      0.1877

Same OLS slope and intercept!
Now for Stata

. bcuse bodyfat

Contains data from http://fmwww.bc.edu/ec-p/data/wooldridge/bodyfat.dta
  obs:           252

   variable name   storage type   display format   variable label
   ---------------------------------------------------------------
   Case            int            %10.0g           Case
   Brozek          double         %10.0g           457/density - 414.2
   wgt             double         %10.0g           weight (lbs)
   ---------------------------------------------------------------
   Sorted by:

. reg Brozek wgt

         Source |       SS        df       MS            Number of obs =     252
   -------------+------------------------------          F(1, 250)     =  150.62
          Model |  5669.11335      1   5669.11335        Prob > F      =  0.0000
       Residual |  9409.90327    250   37.6396131        R-squared     =  0.3760
   -------------+------------------------------          Adj R-squared =  0.3735
          Total |  15079.0166    251   60.0757635        Root MSE      =  6.1351

   ------------------------------------------------------------------------------
         Brozek |      Coef.   Std. Err.       t    P>|t|    [95% Conf. Interval]
   -------------+----------------------------------------------------------------
            wgt |   .1617088   .0131765    12.27    0.000    .1357578    .1876598
          _cons |   -9.99515   2.389056    -4.18    0.000   -14.70039   -5.289908
   ------------------------------------------------------------------------------

. predict bhat
. scatter bhat Brozek wgt

[Figure: scatterplot of the fitted values (bhat) and Brozek (457/density - 414.2) against weight (lbs).]

Use the summarize, correlation and display commands to generate the OLS slope and intercept estimates:
. summ Brozek wgt

       Variable |   Obs        Mean    Std. Dev.     Min       Max
   -------------+---------------------------------------------------
         Brozek |   252    18.93849    7.750856        0      45.1
            wgt |   252    178.9244    29.38916    118.5    363.15

. corr Brozek wgt, covar

                |   Brozek       wgt
   -------------+--------------------
         Brozek |  60.0758
            wgt |  139.672   863.723

Slope coefficient: ratio of sample covar to sample var

. d 139.672 / 863.723
.16170925

Intercept estimate:

. d 18.93849 - .16170925 * 178.9244
-9.9952405

. corr Brozek wgt

                |   Brozek       wgt
   -------------+--------------------
         Brozek |   1.0000
            wgt |   0.6132    1.0000

Slope coefficient: sample corr * ratio of sample standard deviations

. d 0.6132 * 7.750856 / 29.38916
.16172034

Verify that the correlation of Brozek with wgt is the same as the correlation of Brozek with bhat:

. corr Brozek bhat wgt

                |   Brozek      bhat       wgt
   -------------+------------------------------
         Brozek |   1.0000
           bhat |   0.6132    1.0000
            wgt |   0.6132    1.0000    1.0000

Capture the residuals and verify that they are uncorrelated with the predicteds (bhats) as well as with the explanatory variable wgt:

. predict resids, res
. corr bhat wgt resids

                |     bhat       wgt    resids
   -------------+------------------------------
           bhat |   1.0000
            wgt |   1.0000    1.0000
         resids |  -0.0000   -0.0000    1.0000
Evaluate the elasticity associated with the estimated OLS coefficients:

. d .1617088*178.9244/ 18.93849
1.5277696

Or just run the margins command right after the reg command:

. reg Brozek wgt
. margins, eyex(_all) atmeans

   Conditional marginal effects                    Number of obs   =        252
   Model VCE    : OLS

   Expression   : Linear prediction, predict()
   ey/ex w.r.t. : wgt
   at           : wgt             =    178.9244 (mean)

   ------------------------------------------------------------------------------
                |            Delta-method
                |      ey/ex   Std. Err.       t    P>|t|    [95% Conf. Interval]
   -------------+----------------------------------------------------------------
            wgt |   1.527769   .1283313    11.90    0.000    1.275021    1.780517
   ------------------------------------------------------------------------------
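The same arithmetic can be replayed outside Stata. The sketch below plugs the summary statistics reported in the Stata output above (the means of Brozek and wgt, their covariance, and the variance of wgt) into the OLS formulas; the variable names are illustrative assumptions, and because the inputs are rounded the results match the regression output only to about four significant digits.

```python
# Summary statistics as reported in the Stata output above (rounded values).
mean_brozek = 18.93849
mean_wgt = 178.9244
cov_brozek_wgt = 139.672
var_wgt = 863.723

# Slope: sample covariance over sample variance of the explanatory variable.
slope = cov_brozek_wgt / var_wgt
# Intercept: mean of y minus slope times mean of x.
intercept = mean_brozek - slope * mean_wgt
# Elasticity at the means: slope times x_bar / y_bar.
elasticity_at_means = slope * mean_wgt / mean_brozek

print(f"{slope:.4f} {intercept:.3f} {elasticity_at_means:.3f}")
# 0.1617 -9.995 1.528
```

The elasticity of roughly 1.53 says that, at the sample means, a 1% increase in weight is associated with about a 1.5% increase in predicted Brozek body fat.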