Outline: Introduction · Finding the best fit by regression · Residuals and R-sq · Regression and causality · Summary and next step

ECON 3150/4150, Spring term 2014. Lecture 3
Ragnar Nymoen
University of Oslo
21 January 2014
References for Lectures 3 and 4

Stock and Watson (SW): Ch 3.7 and Ch 4 (main exposition) and Ch 17 (technical exposition; the level matches Ch 2 and Ch 3); Bårdsen and Nymoen (BN): Ch 2, Ch 3 and Ch 5.1–5.8
It is customary to motivate regression, and in particular the estimation method Ordinary Least Squares (OLS), by setting finding the best-fitting line in a scatter plot as the purpose of econometric modelling. Nothing wrong in this, but it should not be taken too far. Goodness of fit is only one aspect of building a relevant econometric model. Model parsimony (explaining a phenomenon by simple models), theory consistency, and a relevant representation of counterfactuals to allow causal analysis are examples of model features that are just as important as goodness of fit. After this caveat we start by presenting the main ideas behind OLS estimation in terms of finding the best-fitting line in a scatter plot of data points.
Basic ideas

[Figure: scatter plot of Y against X with the least squares fitted line.]
Basic ideas

Which line is best? Idea: minimize the sum of squared errors! But which errors?

[Figure: the same scatter plot with several candidate lines.]
Basic ideas

Which squared error? For a data point (X_i, Y_i) and a candidate line:

1. Least vertical distance to the line
2. Least horizontal distance to the line
3. Shortest (perpendicular) distance to the line

[Figure: a point (X_i, Y_i) with the three distances to the line.]
Basic ideas

Choose 1 when we want to minimize the squared errors from predicting Y_i linearly from X_i.

Residual: ε̂_i = Y_i − Ŷ_i, where Ŷ_i is the predicted value.

[Figure: the vertical distance between Y_i and Ŷ_i at X_i.]
Basic ideas

[Figure: regression line and prediction errors (projections) in the scatter plot.]
Least squares algebra: Ordinary least squares (OLS) estimates I

The different lines that we considered placing in the scatter plot correspond to different values of the parameters β_0 and β_1 in the linear function that connects the given numbers X_1, X_2, …, X_n with Y_1^fitted, Y_2^fitted, …, Y_n^fitted:

    Y_i^fitted = β_0 + β_1 X_i,  i = 1, 2, …, n

We obtain the best fit Y_i^fitted ≡ Ŷ_i (i = 1, 2, …, n),

    Ŷ_i = β̂_0 + β̂_1 X_i,  i = 1, 2, …, n    (1)

by finding the estimates of β_0 and β_1 that minimize the sum of squared residuals Σ_{i=1}^n (Y_i − Y_i^fitted)²:

    S(β_0, β_1) = Σ_{i=1}^n (Y_i − β_0 − β_1 X_i)²    (2)
Least squares algebra: Ordinary least squares (OLS) estimates II

Consequently β̂_0 and β̂_1 are determined by the first-order conditions:

    Ȳ − β̂_0 − β̂_1 X̄ = 0    (3)
    Σ X_i Y_i − β̂_0 Σ X_i − β̂_1 Σ X_i² = 0    (4)

where

    X̄ = (1/n) Σ_{i=1}^n X_i    (5)

is the sample mean (empirical mean) of X. It is expected that you can solve the simultaneous equation system (3)–(4). See Question C in the first exercise set!
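As a numerical sketch, the normal equations (3)–(4) can be solved directly as a 2×2 linear system in β̂_0 and β̂_1 by Cramer's rule. The data below are made up for illustration (they are not the numbers behind the lecture's scatter plots):

```python
# Hypothetical data (for illustration only)
X = [1.0, 2.0, 3.0, 4.0, 5.0]
Y = [2.1, 3.9, 6.2, 8.1, 9.8]
n = len(X)

# Normal equations (3)-(4) as a linear system:
#   n  * b0 + sum(X)   * b1 = sum(Y)
#   sx * b0 + sum(X^2) * b1 = sum(X*Y)
sx = sum(X)
sy = sum(Y)
sxx = sum(x * x for x in X)
sxy = sum(x * y for x, y in zip(X, Y))

det = n * sxx - sx * sx  # nonzero as long as X is not constant
b0 = (sy * sxx - sx * sxy) / det
b1 = (n * sxy - sx * sy) / det
```

With these numbers the solution is β̂_0 = 0.14 and β̂_1 = 1.96; any OLS routine applied to the same data should reproduce it.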
Least squares algebra: A trick and a simplified derivation I

The trick is to note that

    β_0 + β_1 X_i ≡ α + β_1 (X_i − X̄)    (6)

when the intercept parameter α is defined as

    α ≡ β_0 + β_1 X̄    (7)

This means that the best prediction Ŷ_i given X_i can be written as

    Ŷ_i = β̂_0 + β̂_1 X_i ≡ α̂ + β̂_1 (X_i − X̄),  where α̂ ≡ β̂_0 + β̂_1 X̄    (8)

and we therefore choose the α and β_1 that minimize

    S(α, β_1) = Σ_{i=1}^n [Y_i − α − β_1 (X_i − X̄)]²    (9)
Least squares algebra: A trick and a simplified derivation II

Calculate the two partial derivatives (the chain rule for each element in the sums):

    ∂S(α, β_1)/∂α = 2 Σ [Y_i − α − β_1 (X_i − X̄)] · (−1)
    ∂S(α, β_1)/∂β_1 = −2 Σ [Y_i − α − β_1 (X_i − X̄)] (X_i − X̄)

and choose α̂ and β̂_1 as the solutions of

    Σ [Y_i − α̂ − β̂_1 (X_i − X̄)] · (−1) = 0    (10)
    Σ [Y_i − α̂ − β̂_1 (X_i − X̄)] (X_i − X̄) = 0    (11)
Least squares algebra: A trick and a simplified derivation III

The first-order conditions (10) and (11) simplify to

    α̂ − Ȳ = 0    (12)
    Σ (X_i − X̄) Y_i − β̂_1 Σ (X_i − X̄)² = 0    (13)

where

    Ȳ = (1/n) Σ_{i=1}^n Y_i    (14)

is the empirical mean of Y. Another DIY exercise: show that (10) gives (12), that (11) gives (13), and that the solutions of (12) and (13) are

    α̂ = Ȳ,    (15)
    β̂_1 = Σ (X_i − X̄) Y_i / Σ (X_i − X̄)²    (16)
Least squares algebra: A trick and a simplified derivation IV

Note that for (16) to make sense, we need to assume

    Σ (X_i − X̄)² > 0

(i.e., X is a variable, not a constant). A generalization of this will be important later, and is then called absence of perfect multicollinearity. To obtain β̂_0 we simply use

    β̂_0 = α̂ − β̂_1 X̄    (17)
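The closed-form solution (15)–(17) translates directly into code. A minimal sketch, again with made-up data:

```python
# Hypothetical data (for illustration only)
X = [1.0, 2.0, 3.0, 4.0, 5.0]
Y = [2.1, 3.9, 6.2, 8.1, 9.8]
n = len(X)

xbar = sum(X) / n
ybar = sum(Y) / n

# (16) requires sum of squared deviations of X to be positive:
# X must not be a constant.
sxx = sum((x - xbar) ** 2 for x in X)
assert sxx > 0

beta1 = sum((x - xbar) * y for x, y in zip(X, Y)) / sxx  # (16)
alpha = ybar                                             # (15)
beta0 = alpha - beta1 * xbar                             # (17)
```

On these data this gives the same β̂_0 = 0.14 and β̂_1 = 1.96 as solving the normal equations directly, as it must.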
Residuals and total sum of squares I

Definition of OLS residuals:

    ε̂_i = Y_i − Ŷ_i,  i = 1, 2, …, n    (18)

where we deviate from the S&W notation, which uses û_i for the residual. Using this definition in the first-order conditions (10) and (13) gives

    Σ ε̂_i = 0    (19)
    Σ ε̂_i (X_i − X̄) = 0    (20)
Residuals and total sum of squares II

    Σ ε̂_i = 0  ⟹  ε̂̄ = (1/n) Σ ε̂_i = 0    (21)
    Σ ε̂_i (X_i − X̄) = 0  ⟹  σ̂_εX = (1/n) Σ (ε̂_i − ε̂̄)(X_i − X̄) = 0    (22)

where σ̂_εX denotes the (empirical) covariance between the residuals and the explanatory variable. These properties always hold when we include the intercept (β_0 or α) in the model. They generalize to the case of multiple regression, as we shall see later. (22) is an orthogonality condition. It says that the OLS residuals are uncorrelated with the explanatory variable.
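Properties (19)–(20) are easy to verify numerically: fit the line, form the residuals, and check that both sums vanish (up to floating-point error). Data are hypothetical:

```python
# Hypothetical data (for illustration only)
X = [1.0, 2.0, 3.0, 4.0, 5.0]
Y = [2.1, 3.9, 6.2, 8.1, 9.8]
n = len(X)
xbar = sum(X) / n
ybar = sum(Y) / n

beta1 = sum((x - xbar) * y for x, y in zip(X, Y)) / sum((x - xbar) ** 2 for x in X)
beta0 = ybar - beta1 * xbar

# Residuals (18) and the two properties:
resid = [y - (beta0 + beta1 * x) for x, y in zip(X, Y)]
sum_resid = sum(resid)                                # (19): zero
orth = sum(e * (x - xbar) for e, x in zip(resid, X))  # (20): zero
```

Both sums are zero by construction whenever the model includes an intercept; this is an algebraic property of OLS, not a feature of the particular data.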
Residuals and total sum of squares III

σ̂_εX = 0 occurs because we have defined the OLS residuals in such a way that they measure what is left unexplained in Y when we have extracted all the explanatory power of X.
Total Sum of Squares and Residual Sum of Squares I

We define the Total Sum of Squares for Y as

    TSS = Σ_{i=1}^n (Y_i − Ȳ)²    (23)

We can guess that TSS can be split into the Explained Sum of Squares

    ESS = Σ_{i=1}^n (Ŷ_i − Ŷ̄)²    (24)

(where Ŷ̄ denotes the mean of the fitted values Ŷ_i) and the Residual Sum of Squares

    RSS = Σ_{i=1}^n (ε̂_i − ε̂̄)² = Σ_{i=1}^n ε̂_i² = SSR    (25)

where the second equality uses ε̂̄ = 0 from (21).
Total Sum of Squares and Residual Sum of Squares II

SSR denotes the Sum of Squared Residuals; RSS and SSR are both used.

    TSS = ESS + RSS    (26)

To show this important decomposition, start with

    Y_i − Ȳ = (Y_i − Ŷ_i) + (Ŷ_i − Ŷ̄) = ε̂_i + (Ŷ_i − Ŷ̄)

where we have used that

    Ȳ = (1/n) Σ Y_i = (1/n) Σ (ε̂_i + Ŷ_i) = Ŷ̄

because of (19). Completing the square gives:
Total Sum of Squares and Residual Sum of Squares III

    Σ (Y_i − Ȳ)² = RSS + 2 Σ ε̂_i (Ŷ_i − Ŷ̄) + ESS

where the left-hand side is TSS. Expand the middle term:

    Σ ε̂_i (Ŷ_i − Ŷ̄) = Σ ε̂_i (α̂ + β̂_1 (X_i − X̄) − Ŷ̄)
                    = α̂ Σ ε̂_i + β̂_1 Σ ε̂_i (X_i − X̄) − Ŷ̄ Σ ε̂_i

The first and third terms are zero by (19), and the second is zero by (20).
Total Sum of Squares and Residual Sum of Squares IV

Therefore

    Σ ε̂_i (Ŷ_i − Ŷ̄) = 0

The residuals are uncorrelated with the predictions Ŷ_i. Could it be different? Hence we have the desired result:

    TSS = ESS + RSS    (27)
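The decomposition (27) can be checked numerically on the same hypothetical data; the identity holds exactly (up to rounding) because the fitted line includes an intercept:

```python
# Hypothetical data (for illustration only)
X = [1.0, 2.0, 3.0, 4.0, 5.0]
Y = [2.1, 3.9, 6.2, 8.1, 9.8]
n = len(X)
xbar = sum(X) / n
ybar = sum(Y) / n

beta1 = sum((x - xbar) * y for x, y in zip(X, Y)) / sum((x - xbar) ** 2 for x in X)
beta0 = ybar - beta1 * xbar
Yhat = [beta0 + beta1 * x for x in X]

# Mean of the fitted values equals ybar, by (19)
TSS = sum((y - ybar) ** 2 for y in Y)
ESS = sum((yh - ybar) ** 2 for yh in Yhat)
RSS = sum((y - yh) ** 2 for y, yh in zip(Y, Yhat))
```

Each of the three sums is nonnegative, and TSS equals ESS + RSS to machine precision.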
The coefficient of determination I

To summarize the goodness of fit in the form of a single number, the coefficient of determination, almost everywhere denoted R², is used:

    R² = ESS/TSS = (TSS − RSS)/TSS = 1 − RSS/TSS
       = 1 − (rate of unexplained Y variation)    (28)

If β̂_1 = 0, then Ŷ_i = α̂ = Ȳ, so

    RSS = Σ (Y_i − α̂)² = Σ (Y_i − Ȳ)² = TSS

and R² = 0. If RSS = 0 (a perfect fit), then R² = 1.
The coefficient of determination II

Hence we have the property

    0 ≤ R² ≤ 1    (29)

These results depend on defining the regression function as in (1). If we instead use

    Ŷ_i^{no int} = β̂_1^{no int} X_i

which forces the regression line through the origin, the corresponding residuals do not sum to zero, and the decomposition of TSS breaks down. R² (as defined above) can be negative! See Question D in the first exercise set!
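A small numerical illustration of the warning above. The data here are deliberately constructed (hypothetical) so that Y is roughly constant around 10: the through-origin estimator β̂_1 = Σ X_i Y_i / Σ X_i² is forced to tilt the line steeply, the residuals do not sum to zero, and R² computed as 1 − RSS/TSS goes negative:

```python
# Hypothetical data: Y nearly constant, far from the origin
X = [1.0, 2.0, 3.0, 4.0, 5.0]
Y = [10.1, 10.0, 10.2, 9.9, 10.05]

# OLS through the origin (no intercept)
beta1 = sum(x * y for x, y in zip(X, Y)) / sum(x * x for x in X)
resid = [y - beta1 * x for x, y in zip(X, Y)]

ybar = sum(Y) / len(Y)
TSS = sum((y - ybar) ** 2 for y in Y)
RSS = sum(e ** 2 for e in resid)
R2 = 1 - RSS / TSS  # can be negative without an intercept
```

Here RSS far exceeds TSS, so R2 is a large negative number, and sum(resid) is visibly nonzero: both intercept-model properties fail.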
Regression and correlation I

We define the empirical correlation coefficient between X and Y as

    r_X,Y ≡ [ (1/(n−1)) Σ (Y_i − Ȳ)(X_i − X̄) ]
            / [ √( (1/(n−1)) Σ (X_i − X̄)² ) · √( (1/(n−1)) Σ (Y_i − Ȳ)² ) ]
          = σ̂_XY / (σ̂_X σ̂_Y)    (30)

σ̂_XY denotes the empirical covariance between Y and X. SW uses s_XY.

σ̂_X and σ̂_Y denote the two empirical standard deviations. SW uses s_X and s_Y. They are square roots of the empirical variances, e.g.,

    σ̂_X = √(σ̂²_X) = √( (1/(n−1)) Σ (X_i − X̄)² )
Regression and correlation II

Note: Dividing by n or n−1 is not really important (but best to stick to one convention).

σ̂_X,Y can be written in three equivalent ways:

    σ̂_X,Y = (1/(n−1)) Σ (X_i − X̄)(Y_i − Ȳ)
          = (1/(n−1)) Σ (Y_i − Ȳ) X_i
          = (1/(n−1)) Σ (X_i − X̄) Y_i
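The three expressions for σ̂_X,Y agree because Σ (X_i − X̄) = Σ (Y_i − Ȳ) = 0, so the cross terms drop out. A quick numerical check on hypothetical data:

```python
# Hypothetical data (for illustration only)
X = [1.0, 2.0, 3.0, 4.0, 5.0]
Y = [2.1, 3.9, 6.2, 8.1, 9.8]
n = len(X)
xbar = sum(X) / n
ybar = sum(Y) / n

# Three equivalent forms of the empirical covariance (1/(n-1) convention)
cov1 = sum((x - xbar) * (y - ybar) for x, y in zip(X, Y)) / (n - 1)
cov2 = sum((y - ybar) * x for x, y in zip(X, Y)) / (n - 1)
cov3 = sum((x - xbar) * y for x, y in zip(X, Y)) / (n - 1)
```

All three evaluate to the same number (4.9 on these data) to machine precision.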
Regression and correlation III

The regression coefficient can therefore be re-expressed as

    β̂_1 = Σ (X_i − X̄) Y_i / Σ (X_i − X̄)²
        = [ (1/(n−1)) Σ (X_i − X̄) Y_i ] / [ (1/(n−1)) Σ (X_i − X̄)² ]
        = σ̂_X,Y / σ̂²_X
        = (σ̂_Y / σ̂_X) · σ̂_X,Y / (σ̂_X σ̂_Y)
        = (σ̂_Y / σ̂_X) · r_X,Y    (31)

This shows that r_X,Y ≠ 0 is necessary for β̂_1 ≠ 0. Correlation is necessary for finding regression relationships. Still, β̂_1 ≠ r_XY (in general), and regression analysis is different from correlation analysis.
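The identity (31) can be verified numerically by computing both sides on the same (hypothetical) data with the 1/(n−1) convention throughout:

```python
import math

# Hypothetical data (for illustration only)
X = [1.0, 2.0, 3.0, 4.0, 5.0]
Y = [2.1, 3.9, 6.2, 8.1, 9.8]
n = len(X)
xbar = sum(X) / n
ybar = sum(Y) / n

# Empirical covariance, standard deviations, and correlation (30)
cov = sum((x - xbar) * (y - ybar) for x, y in zip(X, Y)) / (n - 1)
sx = math.sqrt(sum((x - xbar) ** 2 for x in X) / (n - 1))
sy = math.sqrt(sum((y - ybar) ** 2 for y in Y) / (n - 1))
r = cov / (sx * sy)

# OLS slope (16), to be compared with (sy/sx) * r as in (31)
beta1 = sum((x - xbar) * y for x, y in zip(X, Y)) / sum((x - xbar) ** 2 for x in X)
```

The slope β̂_1 and (σ̂_Y/σ̂_X)·r agree exactly; note that only when σ̂_X = σ̂_Y does the slope coincide with the correlation coefficient itself.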
Regression and causality I

Three possible theoretical causal relationships between X and Y:

I. X causes Y
II. X and Y cause each other (joint causality)
III. Y causes X

Our regression is causal if I is true, and II and III are not true. r_XY ≠ 0 in all three cases. It can also be that a third variable (Z) causes both Y and X (spurious correlation).
Causal interpretation of regression analysis I

Regression analysis can refute a causal relationship, since correlation is necessary for causality. But it cannot confirm or discover a causal relationship by statistical analysis (such as regression) alone. We need to supplement the analysis by theory and by interpretation of natural experiments or quasi-experiments; see page 126 and the text box on page 131 in SW. We will see several examples later in the course.
Causal interpretation of regression analysis II

In time series analysis, the central concept is autonomy of regression parameters with respect to changes in policy variables. The concept is developed in ECON 4160, but for those interested, Ch 2.4 in BN gives an introduction to this line of thinking about correlation and causality.
Summary and next step

In this lecture we have learnt about the method of ordinary least squares (OLS) to fit a straight line to a scatter plot of numbers (data points). The concepts of random variables and statistical model, which were central in Lectures 1 and 2, have not even been mentioned! In Lecture 4 we start to bridge that gap by introducing the regression model.

Note also the limitation of fitting the straight line: many scatter plots do not even resemble a linear relationship; see Figure 3.3 in BN, and the Phillips curve examples in Ch 3 in BN. Luckily, the OLS method can be used in many such cases; the point will be that the conditional expectation function need not be linear.

Hence: several reasons to bring the statistical model back into the story, and in particular the conditional expectation function!