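Correlation & Regression

How To Study the Relation Between Two Quantitative Variables?
Simple Linear Regression 6.

A Simple Regression Problem
Is there a relation between the number of power boats in the area and the number of manatees killed?

    Year   NPB (x)   Nkill (y)
     77      447        13
     78      460        21
     79      481        24
     80      498        16
     81      513        24
     82      512        20
     83      526        15
     84      559        34
     85      585        33
     86      614        33
     87      645        39
     88      675        43
     89      711        50
     90      719        47

[Figure: Scatter Plot. Vertical axis: Manatees Killed; horizontal axis: Number of Power Boats.]

Correlation
The correlation, $\rho$, between two random variables, X and Y, is defined as

$$\rho = \text{average of } \left(\frac{X-\mu_X}{\sigma_X}\right)\left(\frac{Y-\mu_Y}{\sigma_Y}\right),$$

the average product of the standardized deviates of X and Y. It quantifies the strength of the linear relationship.

Pearson Sample Correlation

$$r = \frac{1}{n-1}\sum\left(\frac{x-\bar{x}}{s_x}\right)\left(\frac{y-\bar{y}}{s_y}\right)$$

$s_x$: sample standard deviation of x
$s_y$: sample standard deviation of y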
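Standardized values for the manatee data, with $z_x = (x-\bar{x})/s_x$ and $z_y = (y-\bar{y})/s_y$:

    Year    x     y     z_x     z_y    z_x*z_y
     77    447    13   -1.31   -1.35     1.77
     78    460    21   -1.17   -0.69     0.81
     79    481    24   -0.94   -0.45     0.42
     80    498    16   -0.76   -1.10     0.83
     81    513    24   -0.59   -0.45     0.26
     82    512    20   -0.60   -0.77     0.47
     83    526    15   -0.45   -1.18     0.53
     84    559    34   -0.09    0.38    -0.03
     85    585    33    0.19    0.29     0.06
     86    614    33    0.51    0.29     0.15
     87    645    39    0.84    0.79     0.66
     88    675    43    1.17    1.11     1.30
     89    711    50    1.56    1.69     2.64
     90    719    47    1.65    1.44     2.38
    Total                               12.24

Mean: $\bar{x} = 567.5$, $\bar{y} = 29.43$. Standard deviation: $s_x = 91.91$, $s_y = 12.19$. Correlation: $r = 12.24/13 \approx 0.9415$.

[Figure: Scatter Plot with the point (447, 13) marked and reference lines at $\bar{x} = 567.5$ and $\bar{y} = 29.43$. Axis labels as printed on the slide: Number of Handguns Registered (horizontal), Number of People Killed (vertical).]

$z_x$, $z_y$ for one of the pairs, (447, 13):

$$z_x = \frac{447 - 567.5}{91.91} = -1.31, \qquad z_y = \frac{13 - 29.43}{12.19} = -1.35$$

Shortcut Formula

$$r = \frac{S_{xy}}{\sqrt{S_{xx}\,S_{yy}}}, \quad S_{xx} = \sum x^2 - \frac{(\sum x)^2}{n}, \quad S_{yy} = \sum y^2 - \frac{(\sum y)^2}{n}, \quad S_{xy} = \sum xy - \frac{(\sum x)(\sum y)}{n}$$

Pearson Sample Correlation, a Different Formula

$$r = \frac{\sum (x-\bar{x})(y-\bar{y})}{\sqrt{\sum (x-\bar{x})^2 \sum (y-\bar{y})^2}} = \frac{S_{xy}}{\sqrt{S_{xx}\,S_{yy}}}$$

Correlation Coefficient
$\sum x = 7945$, $(\sum x)^2 = 63123025$, $\sum y = 412$, $(\sum y)^2 = 169744$
$\sum x^2 = 4618597$, $\sum y^2 = 14056$, $\sum xy = 247521$
$S_{xx} = 109809.5$, $S_{yy} = 1931.42857$, $S_{xy} = 13711$
$r \approx 0.9415$, $r^2 \approx 0.8864$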
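The two ways of computing r above (averaging products of z-scores, and the shortcut formula) can be checked against each other with a minimal Python sketch. It assumes the 14 data pairs as reconstructed from the slide table (they reproduce the slide totals 7945 and 412); the variable names are illustrative and not part of the slides, which use R Commander.

```python
import math
from statistics import mean, stdev

# Assumption: manatee data as reconstructed from the slide table.
boats = [447, 460, 481, 498, 513, 512, 526, 559, 585, 614, 645, 675, 711, 719]
kills = [13, 21, 24, 16, 24, 20, 15, 34, 33, 33, 39, 43, 50, 47]
n = len(boats)

# Definition: sum of products of z-scores, divided by n - 1.
x_bar, y_bar = mean(boats), mean(kills)
s_x, s_y = stdev(boats), stdev(kills)
z_products = [((x - x_bar) / s_x) * ((y - y_bar) / s_y)
              for x, y in zip(boats, kills)]
r_from_z = sum(z_products) / (n - 1)

# Shortcut formula: r = S_xy / sqrt(S_xx * S_yy).
s_xx = sum(x * x for x in boats) - sum(boats) ** 2 / n
s_yy = sum(y * y for y in kills) - sum(kills) ** 2 / n
s_xy = sum(x * y for x, y in zip(boats, kills)) - sum(boats) * sum(kills) / n
r_shortcut = s_xy / math.sqrt(s_xx * s_yy)

print(f"r (z-scores) = {r_from_z:.4f}, r (shortcut) = {r_shortcut:.4f}")
print(f"r squared    = {r_shortcut ** 2:.4f}")
```

Both routes should agree up to floating-point rounding, at about r = 0.9415 and r squared = 0.8864.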
Interpretation of r
$-1 \le r \le 1$
It measures the strength and direction of the linear relation between two quantitative variables.
$r = \pm 1$ if all points lie exactly on a straight line.
$\rho$ is the notation for the population correlation coefficient.

Correlation Coefficient
[Figure: four scatter plots, labeled "r close to 1", "r close to -1", "r close to zero", and "r close to zero".]

Correlation Does Not Imply Causation
Example: The number of powerboats registered may not be the direct cause for the death of manatees.

How to Model a Linear Relation?
[Figure: Manatees Killed versus Number of Power Boats, annotated with y = ? + ? x.]

Graph with a Fitted Line
[Figure: Manatees Killed versus Number of Power Boats, with a fitted line.]

Least Squares Principle
Find the solutions $\hat{\alpha}$ and $\hat{\beta}$ of a straight line, $\hat{y} = \hat{\alpha} + \hat{\beta}x$, that minimize the following variability measure:

$$\sum \left[\,y_i - (\hat{\alpha} + \hat{\beta}x_i)\,\right]^2$$
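Minimize

$$q = \sum e_i^2 = \sum \left[\,y_i - (\alpha + \beta x_i)\,\right]^2$$

$$\frac{\partial q}{\partial \alpha} = (-2)\sum \left[\,y_i - (\alpha + \beta x_i)\,\right] = 0$$

$$\frac{\partial q}{\partial \beta} = (-2)\sum x_i\left[\,y_i - (\alpha + \beta x_i)\,\right] = 0$$

which give the two equations

$$\sum y_i = n\alpha + \beta\sum x_i, \qquad \sum x_i y_i = \alpha\sum x_i + \beta\sum x_i^2$$

$\alpha = ?$ $\beta = ?$

The Equation of the Fitted Line: y = ? + ? x
The least-squares estimates of $\alpha$, $\beta$ are denoted $\hat{\alpha}$ and $\hat{\beta}$, and they are

$$\hat{\beta} = \frac{S_{xy}}{S_{xx}}, \qquad \hat{\alpha} = \bar{y} - \hat{\beta}\,\bar{x}$$

Other formulas

$$\hat{\beta} = r\,\frac{s_y}{s_x}, \qquad \hat{\alpha} = \bar{y} - r\,\frac{s_y}{s_x}\,\bar{x}$$

$s_y$: the sample standard deviation of y
$s_x$: the sample standard deviation of x

The Equation of a Fitted Line

$$\hat{y} = \hat{\alpha} + \hat{\beta}x$$

Here $\hat{y}$ is the estimated mean of y at x. The line can be used for estimation or prediction; it gives the estimated location of the mean response for various x.

Manatee Example

$$\hat{\beta} = \frac{S_{xy}}{S_{xx}} = \frac{13711}{109809.5} \approx 0.1248617$$

$$\hat{\alpha} = \bar{y} - \hat{\beta}\,\bar{x} = \frac{412}{14} - 0.1248617 \times \frac{7945}{14} \approx -41.4304$$

The regression (prediction) equation:

$$\hat{y} = \hat{\alpha} + \hat{\beta}x = -41.4304 + 0.12486\,x$$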
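The least-squares estimates above can be reproduced with a short Python sketch. It assumes the 14 data pairs as reconstructed from the slide table; `least_squares` is an illustrative helper name, not something defined in the slides. The sketch also checks that the fitted residuals satisfy the two equations obtained by setting the partial derivatives of q to zero.

```python
# Assumption: manatee data as reconstructed from the slide table.
boats = [447, 460, 481, 498, 513, 512, 526, 559, 585, 614, 645, 675, 711, 719]
kills = [13, 21, 24, 16, 24, 20, 15, 34, 33, 33, 39, 43, 50, 47]


def least_squares(xs, ys):
    """Return (intercept, slope) of the least-squares line through (xs, ys)."""
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    slope = s_xy / s_xx
    return y_bar - slope * x_bar, slope


alpha_hat, beta_hat = least_squares(boats, kills)
residuals = [y - (alpha_hat + beta_hat * x) for x, y in zip(boats, kills)]

# At the minimum, sum(e) and sum(x * e) are both zero (up to rounding).
sum_e = sum(residuals)
sum_xe = sum(x * e for x, e in zip(boats, residuals))

print(f"alpha_hat = {alpha_hat:.4f}, beta_hat = {beta_hat:.5f}")
print(f"sum of residuals = {sum_e:.2e}, sum of x*residual = {sum_xe:.2e}")
```

The deviation form of $S_{xy}$ and $S_{xx}$ is used here instead of the shortcut sums; the two are algebraically identical, and the deviation form is less sensitive to rounding.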
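Data in R Commander
[Screenshot: data set in R Commander.]

R for Simple Linear Regression
[Screenshot: R Commander menu for simple linear regression.]

R Output

    Call:
    lm(formula = MANKILL ~ NPOWERBT, data = Dataset)

    Residuals:
         Min       1Q   Median       3Q      Max
    -9.24681 -2.02166  0.02172  2.33692  5.63275

    Coefficients:
                Estimate Std. Error t value Pr(>|t|)
    (Intercept) -41.4304     7.4122  -5.589 0.000118 ***
    NPOWERBT      0.1249     0.0129   9.675 5.11e-07 ***
    ---
    Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

    Residual standard error: 4.276 on 12 degrees of freedom
    Multiple R-Squared: 0.8864, Adjusted R-squared: 0.8769
    F-statistic: 93.61 on 1 and 12 DF, p-value: 5.109e-07

Slide annotations: the residual standard error is $s_{y|x}$; Multiple R-Squared is the coefficient of determination $R^2$.

Equation of the regression line: $\hat{y} = \hat{\alpha} + \hat{\beta}x$; $\hat{y} = -41.4304 + 0.1249\,x$

An Estimation
If in a certain year the number of power boats registered is 700,000 (x = 700, in thousands), estimate how many manatees on average would be killed.

$$\hat{y} = -41.4304 + 0.124862\,x = -41.4304 + 0.124862 \times 700 \approx 45.973$$

The average response at x = 700 is 45.973.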
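As a cross-check of the estimation above, here is a minimal Python sketch that refits the line and evaluates it at x = 700. It assumes the data pairs as reconstructed from the slide table, with x in thousands of registered boats; `fit_line` and `predict` are illustrative names, not taken from the slides or from R.

```python
# Assumption: manatee data as reconstructed from the slide table
# (x in thousands of registered power boats).
boats = [447, 460, 481, 498, 513, 512, 526, 559, 585, 614, 645, 675, 711, 719]
kills = [13, 21, 24, 16, 24, 20, 15, 34, 33, 33, 39, 43, 50, 47]


def fit_line(xs, ys):
    """Least-squares (intercept, slope)."""
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    slope = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
             / sum((x - x_bar) ** 2 for x in xs))
    return y_bar - slope * x_bar, slope


alpha_hat, beta_hat = fit_line(boats, kills)


def predict(x):
    """Estimated mean number of manatees killed at x thousand boats."""
    return alpha_hat + beta_hat * x


y_hat_700 = predict(700)
print(f"Estimated mean response at x = 700: {y_hat_700:.3f}")
```

Carrying the unrounded coefficients through the calculation is what gives 45.973; rounding the slope to 0.1249 first shifts the answer slightly.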
Graph with a Fitted Line
[Figure: Manatees Killed versus Number of Power Boats, with the fitted line.]

How long should you wait till the next eruption?

Duration and Inter-eruption Time
[Figures: scatter plots of Inter-eruption Time versus Duration of Eruption (duration axis from 1.5 to 5.5), one with points marked by CDUR (1.00, 2.00).]

Caution
Avoid unsure extrapolation.
Causality?
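The caution about extrapolation can be illustrated with the manatee line, since the slides give no numeric eruption data. The sketch below is only an illustration under stated assumptions: the data pairs are those reconstructed from the earlier slide table, and the idea of flagging x values outside the observed range is an illustrative convention, not a rule stated in the slides.

```python
# Assumption: manatee data as reconstructed from the earlier slide table.
boats = [447, 460, 481, 498, 513, 512, 526, 559, 585, 614, 645, 675, 711, 719]
kills = [13, 21, 24, 16, 24, 20, 15, 34, 33, 33, 39, 43, 50, 47]

n = len(boats)
x_bar, y_bar = sum(boats) / n, sum(kills) / n
beta_hat = (sum((x - x_bar) * (y - y_bar) for x, y in zip(boats, kills))
            / sum((x - x_bar) ** 2 for x in boats))
alpha_hat = y_bar - beta_hat * x_bar

# Scope of the data: the range of x values actually observed.
x_min, x_max = min(boats), max(boats)


def predict(x):
    """Return (y_hat, in_scope); in_scope is False when x is extrapolated."""
    return alpha_hat + beta_hat * x, x_min <= x <= x_max


y_700, ok_700 = predict(700)   # inside the observed range 447..719
y_300, ok_300 = predict(300)   # far below the observed range

print(f"x = 700: y_hat = {y_700:.2f}, within scope of data: {ok_700}")
print(f"x = 300: y_hat = {y_300:.2f}, within scope of data: {ok_300}")
```

At x = 300 the fitted line returns a negative number of manatees killed, which is impossible; the line describes the data only within the range of x that was observed.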
Problem of extrapolation
[Figure sequence, four slides: a fitted line drawn over the "Scope of data"; an "Extrapolated result for a value out of the scope of x"; "A possible trend" outside the scope of the data; "Estimate $\hat{y}$ at x".]

Regression and Causality
Regression itself provides no information about causal patterns and must be supplemented by additional analysis (with designed and controlled experiments) to obtain insights about causal relationships.

Example: y = female life expectancy, x = GDP (gross domestic product)
[Figure: Female life expectancy versus GDP per capita. Before Transformation.]
Example: y = female life expectancy, x = GDP (gross domestic product)
[Figures: Female life expectancy versus Natural log of GDP, with the fitted line $\hat{y} = \hat{\alpha} + \hat{\beta}\ln(x)$. After ln(GDP) Transformation.]

Transformation
Circle of Powers: $x^p$ or $y^p$
[Figure: circle divided into four quadrants. Quadrant I: x up, y up. Quadrant II: x down, y up. Quadrant III: x down, y down. Quadrant IV: x up, y down.]

Transformation
For x up or y up: try p > 1 for $x^p$ or $y^p$.
Example: $x^2$, $y^2$, $x^3$, $y^3$, or $e^x$, $e^y$
For x down or y down: try p < 1 for $x^p$ or $y^p$.
Example: $x^{-1/2}$, $y^{-1/2}$, $x^{-1}$, $y^{-1}$, or $\ln(x)$, $\ln(y)$

Simple Linear Regression
8.9 Tests Concerning Regression and Correlation

t-test for correlation
Hypothesis: $H_0: \rho = \rho_0$ vs. $H_a: \rho \ne \rho_0$
Test statistic (if the data are bivariate normal):

$$t = \frac{r - \rho_0}{\sqrt{(1 - r^2)/(n - 2)}} \sim t\text{-distribution with d.f.} = n - 2$$

The t-distribution is exact only for $\rho_0 = 0$, which is the case tested in the manatee example.
Decision rule: Reject $H_0$ if
C.V. approach: $t < -t_{\alpha/2}$ or $t > t_{\alpha/2}$
p-value approach: p-value < $\alpha$
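The test statistic above, for the common case $\rho_0 = 0$, can be sketched in a few lines of Python. The inputs are the manatee sample correlation and sample size from the earlier slides; `correlation_t` is an illustrative helper name, and the critical value 2.179 is the tabulated upper 2.5% point of the t-distribution with 12 degrees of freedom, hard-coded here because the Python standard library has no t-distribution.

```python
import math


def correlation_t(r, n):
    """t statistic for H0: rho = 0, i.e. (r - 0) / sqrt((1 - r^2) / (n - 2))."""
    return r / math.sqrt((1 - r * r) / (n - 2))


r, n = 0.9414773, 14           # manatee sample correlation and sample size
t_stat = correlation_t(r, n)

# Assumption: two-sided test at alpha = 0.05; table value for 12 d.f.
T_CRIT_12_DF = 2.179
reject_h0 = abs(t_stat) > T_CRIT_12_DF

print(f"t = {t_stat:.4f} on {n - 2} d.f., reject H0: {reject_h0}")
```

With r rounded to .941 the statistic comes out near 9.65; with the unrounded r it is about 9.6755. Either way it is far beyond the critical value.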
Is there a significant correlation?
Example (Manatee): $H_0: \rho = 0$ vs. $H_a: \rho \ne 0$, with $r = .941$

$$t = \frac{.941 - 0}{\sqrt{(1 - .886)/(14 - 2)}} = 9.65$$

d.f. = 14 - 2 = 12, p-value < .0005; reject $H_0$. There is a significant linear relation.

R for Correlation
[Screenshot: R Commander menu for the correlation test.]

R Output with t-test for Zero Correlation

    Pearson's product-moment correlation

    data:  Dataset$MANKILL and Dataset$NPOWERBT
    t = 9.6755, df = 12, p-value = 5.109e-07
    alternative hypothesis: true correlation is not equal to 0
    95 percent confidence interval:
     0.8210 0.9817
    sample estimates:
          cor
    0.9414773

Example: In an investigation, a set of countries was included to study the relation between female life expectancy and the birthrate.
[Figure: Female life expectancy versus Births per 1000 population; $r = -.87$.]

First Order Simple Linear Regression Model
Model assumptions:

$$y = \alpha + \beta x + \varepsilon$$

with errors $\varepsilon$ independent, identically and normally distributed as $N(0, \sigma^2)$, and mean of y at x

$$\mu_{y|x} = \alpha + \beta x$$
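Model Assumption
[Figure: illustration of the model assumptions.]

Residual
Residuals:

$$e_i = y_i - \hat{y}_i = y_i - (\hat{\alpha} + \hat{\beta}x_i)$$

Example: Find the residual at x = 460, where the observed y = 21.
$\hat{y}$ = predicted y = $-41.4304 + 0.12486 \times 460 = 16.01$
The residual = 21 - 16.01 = 4.99

Residual Sum of Squares
Residual Sum of Squares (SSResid), or Error Sum of Squares (SSE):

$$SSE = \sum (y_i - \hat{y}_i)^2$$

Mean Square Error and Standard Deviation for Regression
Estimation of $\sigma^2$:

$$s_{y|x}^2 = MSE = \frac{SSE}{n - 2} = 18.287 \qquad (\text{degrees of freedom} = n - 2)$$

Estimated standard error of the regression model: $s_{y|x} = 4.28$

Inference for Regression Coefficient $\beta$ (t-test)
Hypothesis: $H_0: \beta = \beta_0$ vs. $H_a: \beta \ne \beta_0$
(It is often testing for $H_0: \beta = 0$ vs. $H_a: \beta \ne 0$.)
Test statistic:

$$t = \frac{\hat{\beta} - \beta_0}{\widehat{se}(\hat{\beta})} \sim t\text{-distribution with d.f.} = n - 2, \quad \text{where } \widehat{se}(\hat{\beta}) = \frac{s_{y|x}}{\sqrt{\sum (x - \bar{x})^2}} = .013$$

Decision rule: Reject $H_0$ if
C.V. approach: $t < -t_{\alpha/2}$ or $t > t_{\alpha/2}$
p-value approach: p-value < $\alpha$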
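The quantities on this page (the residual at x = 460, SSE, MSE, $s_{y|x}$, and the t statistic for the slope) can be traced step by step with a minimal Python sketch. It assumes the data pairs as reconstructed from the earlier slide table; all variable names are illustrative.

```python
import math

# Assumption: manatee data as reconstructed from the earlier slide table.
boats = [447, 460, 481, 498, 513, 512, 526, 559, 585, 614, 645, 675, 711, 719]
kills = [13, 21, 24, 16, 24, 20, 15, 34, 33, 33, 39, 43, 50, 47]
n = len(boats)

x_bar, y_bar = sum(boats) / n, sum(kills) / n
s_xx = sum((x - x_bar) ** 2 for x in boats)
s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(boats, kills))
beta_hat = s_xy / s_xx
alpha_hat = y_bar - beta_hat * x_bar

# Residual for the observation (x = 460, y = 21).
residual_460 = 21 - (alpha_hat + beta_hat * 460)

# SSE, MSE with n - 2 degrees of freedom, and s_{y|x}.
residuals = [y - (alpha_hat + beta_hat * x) for x, y in zip(boats, kills)]
sse = sum(e * e for e in residuals)
mse = sse / (n - 2)
s_yx = math.sqrt(mse)

# Standard error of the slope and the t statistic for H0: beta = 0.
se_beta = s_yx / math.sqrt(s_xx)
t_beta = (beta_hat - 0) / se_beta

print(f"residual at 460 = {residual_460:.2f}")
print(f"SSE = {sse:.2f}, MSE = {mse:.3f}, s = {s_yx:.2f}")
print(f"se(beta_hat) = {se_beta:.4f}, t = {t_beta:.3f}")
```

The slope t statistic equals the t statistic of the correlation test for the same data, which is why both R outputs report about 9.675.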
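Inference for Regression Coefficient $\alpha$ (t-test)
Hypothesis: $H_0: \alpha = \alpha_0$ vs. $H_a: \alpha \ne \alpha_0$
(It is often testing for $H_0: \alpha = 0$ vs. $H_a: \alpha \ne 0$.)
Test statistic:

$$t = \frac{\hat{\alpha} - \alpha_0}{\widehat{se}(\hat{\alpha})} \sim t\text{-distribution with d.f.} = n - 2, \quad \text{where } \widehat{se}(\hat{\alpha}) = s_{y|x}\sqrt{\frac{1}{n} + \frac{\bar{x}^2}{\sum (x - \bar{x})^2}} = 7.41$$

Decision rule: Reject $H_0$ if
C.V. approach: $t < -t_{\alpha/2}$ or $t > t_{\alpha/2}$
p-value approach: p-value < $\alpha$

Predicting Mean Response
The $(1-\alpha)100\%$ confidence interval for predicting the mean response at $x^*$:

$$\hat{y} \pm t_{\alpha/2}\,\widehat{se}(\hat{y}), \quad \text{d.f.} = n - 2, \quad \text{where } \widehat{se}(\hat{y}) = s_{y|x}\sqrt{\frac{1}{n} + \frac{(x^* - \bar{x})^2}{\sum (x - \bar{x})^2}}$$

Predicted number of manatees killed on average at x = 460: 16.01 ± 3.92, that is (12.09, 19.92).

Predicting a Single New Response
The $(1-\alpha)100\%$ interval for predicting an individual outcome at $x^*$:

$$\hat{y} \pm t_{\alpha/2}\,\widehat{se}(\tilde{y}), \quad \text{d.f.} = n - 2, \quad \text{where } \widehat{se}(\tilde{y}) = s_{y|x}\sqrt{1 + \frac{1}{n} + \frac{(x^* - \bar{x})^2}{\sum (x - \bar{x})^2}}$$

Predicted number of manatees killed at x = 460: 16.01 ± 10.11, that is (5.90, 26.11).

Confidence Interval Band
[Figure: Number of manatees killed versus Number of Powerboats, with the fitted line and interval bands.]

[Figure: Inter-eruption Time versus Duration of Eruption, points marked by CDUR (1.00, 2.00), Total Population.]

Evaluation of the Model
Coefficient of Determination ($R^2$): It is the proportion of variation observed in y that can be explained by the variable x with the linear regression model.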
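The two intervals at x = 460 and the coefficient of determination can be reproduced with the following Python sketch. It assumes the data pairs as reconstructed from the earlier slide table and a 95% level; the critical value 2.179 (t-distribution, 12 degrees of freedom) is hard-coded from a t table because the standard library has no t quantile function. Variable names are illustrative.

```python
import math

# Assumption: manatee data as reconstructed from the earlier slide table.
boats = [447, 460, 481, 498, 513, 512, 526, 559, 585, 614, 645, 675, 711, 719]
kills = [13, 21, 24, 16, 24, 20, 15, 34, 33, 33, 39, 43, 50, 47]
n = len(boats)

x_bar, y_bar = sum(boats) / n, sum(kills) / n
s_xx = sum((x - x_bar) ** 2 for x in boats)
s_yy = sum((y - y_bar) ** 2 for y in kills)
s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(boats, kills))
beta_hat = s_xy / s_xx
alpha_hat = y_bar - beta_hat * x_bar

sse = sum((y - (alpha_hat + beta_hat * x)) ** 2 for x, y in zip(boats, kills))
s_yx = math.sqrt(sse / (n - 2))

T_CRIT = 2.179            # assumption: 95% level, 12 d.f., from a t table
x_star = 460
y_hat = alpha_hat + beta_hat * x_star
leverage = 1 / n + (x_star - x_bar) ** 2 / s_xx

# Interval for the mean response versus interval for one new response.
margin_mean = T_CRIT * s_yx * math.sqrt(leverage)
margin_new = T_CRIT * s_yx * math.sqrt(1 + leverage)
ci_mean = (y_hat - margin_mean, y_hat + margin_mean)
pi_new = (y_hat - margin_new, y_hat + margin_new)

# Coefficient of determination: share of variation in y explained by x.
r_squared = 1 - sse / s_yy

print(f"mean response: {y_hat:.2f} +/- {margin_mean:.2f}")
print(f"new response:  {y_hat:.2f} +/- {margin_new:.2f}")
print(f"R^2 = {r_squared:.4f}")
```

The only difference between the two margins is the extra 1 under the square root, which accounts for the scatter of a single new observation around the mean response.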
R Output

    Call:
    lm(formula = MANKILL ~ NPOWERBT, data = Dataset)

    Residuals:
         Min       1Q   Median       3Q      Max
    -9.24681 -2.02166  0.02172  2.33692  5.63275

    Coefficients:
                Estimate Std. Error t value Pr(>|t|)
    (Intercept) -41.4304     7.4122  -5.589 0.000118 ***
    NPOWERBT      0.1249     0.0129   9.675 5.11e-07 ***
    ---
    Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

    Residual standard error: 4.276 on 12 degrees of freedom
    Multiple R-Squared: 0.8864, Adjusted R-squared: 0.8769
    F-statistic: 93.61 on 1 and 12 DF, p-value: 5.109e-07

Slide annotation: Multiple R-Squared is the coefficient of determination $R^2$.

Residual Plot
A scatter plot of the residuals against the predicted values of the response variable, used to verify the assumptions behind the regression model:
Homogeneity of variance
Random normal error
Appropriateness of the linear model

Graph with a Fitted Line
[Figure: Manatees Killed versus Number of Power Boats, with the fitted line.]

Residual Plot
[Figure: Scatterplot, Dependent Variable: Number of manatees killed. Vertical axis: Regression Standardized Residual; horizontal axis: Regression Standardized Predicted Value.]

Residual Plot
[Figure: two example residual plots, one labeled "Model not a good linear fit" and one labeled "Violation of the equal variance assumption".]
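The numbers behind a residual plot like the one above can be produced with a short Python sketch. It assumes the data pairs as reconstructed from the earlier slide table. Standardizing each residual by dividing by $s_{y|x}$ is one simple convention chosen here for illustration; statistical software may use a leverage-adjusted version, so the values need not match the plotted ones exactly.

```python
import math

# Assumption: manatee data as reconstructed from the earlier slide table.
boats = [447, 460, 481, 498, 513, 512, 526, 559, 585, 614, 645, 675, 711, 719]
kills = [13, 21, 24, 16, 24, 20, 15, 34, 33, 33, 39, 43, 50, 47]
n = len(boats)

x_bar, y_bar = sum(boats) / n, sum(kills) / n
beta_hat = (sum((x - x_bar) * (y - y_bar) for x, y in zip(boats, kills))
            / sum((x - x_bar) ** 2 for x in boats))
alpha_hat = y_bar - beta_hat * x_bar

fitted = [alpha_hat + beta_hat * x for x in boats]
residuals = [y - f for y, f in zip(kills, fitted)]

# Simple standardization (assumption): residual divided by s_{y|x}.
s_yx = math.sqrt(sum(e * e for e in residuals) / (n - 2))
std_resid = [e / s_yx for e in residuals]

# Points of the residual plot: predicted value against standardized residual.
print("fitted  std.resid")
for f, z in sorted(zip(fitted, std_resid)):
    print(f"{f:6.2f}  {z:8.2f}")
```

A plot of these pairs that shows no curve and a roughly constant vertical spread is consistent with the linear model and the equal variance assumption; a curved pattern or a funnel shape points to the two problems named on the last slide.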