Statistics for Managers Using Microsoft Excel/SPSS
Chapter 13: The Simple Linear Regression Model and Correlation
1999 Prentice-Hall, Inc. Chap. 13-1
Chapter Topics
  - Types of Regression Models
  - Determining the Simple Linear Regression Equation
  - Measures of Variation in Regression and Correlation
  - Assumptions of Regression and Correlation
  - Residual Analysis and the Durbin-Watson Statistic
  - Estimation of Predicted Values
  - Correlation: Measuring the Strength of the Association
Purpose of Regression and Correlation Analysis
  - Regression analysis is used primarily for prediction: a statistical model used to predict the values of a dependent (response) variable based on values of at least one independent (explanatory) variable.
  - Correlation analysis is used to measure the strength of the association between numerical variables.
The Scatter Diagram
  - Plot of all $(X_i, Y_i)$ pairs
  [Figure: scatter diagram of Y against X, both axes running from 0 to 60]
Types of Regression Models
  [Figure: four scatter-diagram panels: Positive Linear Relationship; Negative Linear Relationship; Relationship NOT Linear; No Relationship]
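Simple Linear Regression Model
  - The relationship between the variables is a linear function.
  - The straight line that best fits the data:
    $Y_i = \beta_0 + \beta_1 X_i + \varepsilon_i$
  - $Y_i$ = dependent (response) variable
  - $X_i$ = independent (explanatory) variable
  - $\beta_0$ = Y intercept
  - $\beta_1$ = slope
  - $\varepsilon_i$ = random error
Population Linear Regression Model
  - Observed value: $Y_i = \beta_0 + \beta_1 X_i + \varepsilon_i$
  - Population line: $\mu_{YX} = \beta_0 + \beta_1 X_i$
  - $\varepsilon_i$ = random error, the distance between an observed value and the population line
  [Figure: observed values scattered around the population regression line]
Sample Linear Regression Model
  $\hat{Y}_i = b_0 + b_1 X_i$
  - $\hat{Y}_i$ = predicted value of Y for observation i
  - $X_i$ = value of X for observation i
  - $b_0$ = sample Y intercept, used as an estimate of the population $\beta_0$
  - $b_1$ = sample slope, used as an estimate of the population $\beta_1$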
Simple Linear Regression Equation: Example
  You wish to examine the relationship between the square footage of produce stores and their annual sales. Sample data for 7 stores were obtained. Find the equation of the straight line that best fits the data.
    Store   Square Feet   Annual Sales ($000)
    1       1,726         3,681
    2       1,542         3,395
    3       2,816         6,653
    4       5,555         9,543
    5       1,292         3,318
    6       2,208         5,563
    7       1,313         3,760
Scatter Diagram Example
  [Figure: Excel output, scatter diagram of Annual Sales ($000), 0 to 12,000, against Square Feet, 0 to 6,000]
Equation for the Best Straight Line
  $\hat{Y}_i = b_0 + b_1 X_i = 1636.415 + 1.487 X_i$
  From Excel printout:
                    Coefficients
    Intercept       1636.414726
    X Variable 1    1.486633657
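The coefficients in the Excel printout can be reproduced by hand. The following is a minimal sketch in plain Python, assuming the usual least-squares formulas (slope = sum of cross-deviations divided by sum of squared X deviations; intercept = mean of Y minus slope times mean of X). The function name `fit_line` is illustrative and does not come from the slides.

```python
# Produce-store data from the example: square feet (X), annual sales in $000 (Y).
square_feet = [1726, 1542, 2816, 5555, 1292, 2208, 1313]
annual_sales = [3681, 3395, 6653, 9543, 3318, 5563, 3760]


def fit_line(x, y):
    """Return (b0, b1) for the least-squares line Y-hat = b0 + b1 * X."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Sum of squared deviations of X, and sum of cross-deviations of X and Y.
    ssx = sum((xi - x_bar) ** 2 for xi in x)
    ssxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = ssxy / ssx
    b0 = y_bar - b1 * x_bar
    return b0, b1


b0, b1 = fit_line(square_feet, annual_sales)
print(f"b0 = {b0:.3f}, b1 = {b1:.3f}")
```

Rounded to three decimals, the result should agree with the intercept and slope shown in the printout.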
Graph of the Best Straight Line
  [Figure: fitted line through the scatter diagram of Annual Sales ($000), 0 to 12,000, against Square Feet, 0 to 6,000]
Interpreting the Results
  $\hat{Y}_i = 1636.415 + 1.487 X_i$
  - The slope of 1.487 means that for each increase of one unit in X, Y is estimated to increase by 1.487 units.
  - For each increase of 1 square foot in the size of the store, the model predicts that expected annual sales increase by $1,487.
Measures of Variation: The Sum of Squares
  - SST = Total Sum of Squares: measures the variation of the $Y_i$ values around their mean $\bar{Y}$
  - SSR = Regression Sum of Squares: explained variation attributable to the relationship between X and Y
  - SSE = Error Sum of Squares: variation attributable to factors other than the relationship between X and Y
Measures of Variation: The Sum of Squares
  $SST = \sum (Y_i - \bar{Y})^2$
  $SSE = \sum (Y_i - \hat{Y}_i)^2$
  $SSR = \sum (\hat{Y}_i - \bar{Y})^2$
  [Figure: one observation $Y_i$ at $X_i$, showing its distances to the regression line and to $\bar{Y}$]
Measures of Variation: The Sum of Squares: Example
  Excel output for produce stores:
                  df    SS
    Regression    1     30380456.12    (SSR)
    Residual      5     1871199.595    (SSE)
    Total         6     32251655.71    (SST)
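As a check on the SS column, the three sums of squares can be computed directly from their definitions. This is a sketch in plain Python that refits the line from the seven stores; the variable names are illustrative.

```python
# Produce-store data: square feet (X), annual sales in $000 (Y).
square_feet = [1726, 1542, 2816, 5555, 1292, 2208, 1313]
annual_sales = [3681, 3395, 6653, 9543, 3318, 5563, 3760]

n = len(square_feet)
x_bar = sum(square_feet) / n
y_bar = sum(annual_sales) / n

# Least-squares slope and intercept.
b1 = (sum((x - x_bar) * (y - y_bar) for x, y in zip(square_feet, annual_sales))
      / sum((x - x_bar) ** 2 for x in square_feet))
b0 = y_bar - b1 * x_bar
predicted = [b0 + b1 * x for x in square_feet]

# Total, regression (explained), and error (unexplained) sums of squares.
sst = sum((y - y_bar) ** 2 for y in annual_sales)
ssr = sum((y_hat - y_bar) ** 2 for y_hat in predicted)
sse = sum((y - y_hat) ** 2 for y, y_hat in zip(annual_sales, predicted))

print(f"SSR = {ssr:.2f}, SSE = {sse:.3f}, SST = {sst:.2f}")
```

For a least-squares line, SSR and SSE add up to SST, which is why the Total row equals the sum of the Regression and Residual rows.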
The Coefficient of Determination
  $r^2 = \frac{SSR}{SST} = \frac{\text{regression sum of squares}}{\text{total sum of squares}}$
  - Measures the proportion of variation that is explained by the independent variable in the regression model
Coefficients of Determination ($r^2$) and Correlation (r)
  [Figure: four panels, each with a fitted line $\hat{Y}_i = b_0 + b_1 X_i$: $r^2 = 1$, $r = +1$; $r^2 = 1$, $r = -1$; $r^2 = .8$, $r = +0.9$; $r^2 = 0$, $r = 0$]
Standard Error of Estimate
  $S_{YX} = \sqrt{\frac{SSE}{n-2}} = \sqrt{\frac{\sum_{i=1}^{n} (Y_i - \hat{Y}_i)^2}{n-2}}$
  - The standard deviation of the variation of observations around the regression line
Measures of Variation: Example
  Excel output for produce stores:
    Regression Statistics
    Multiple R           0.9705572
    R Square             0.94198129
    Adjusted R Square    0.93037754
    Standard Error       611.751517    ($S_{YX}$)
    Observations         7
  - $r^2 = .94$: 94% of the variation in annual sales can be explained by the variability in the size of the store as measured by square footage.
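The R Square and Standard Error lines follow from the sums of squares in the Excel output. A short sketch that takes SSR, SSE, and SST as quoted on the slides and applies $r^2 = SSR/SST$ and $S_{YX} = \sqrt{SSE/(n-2)}$:

```python
import math

# Sums of squares as quoted in the Excel output for the produce stores.
SSR = 30380456.12
SSE = 1871199.595
SST = 32251655.71
n = 7  # number of stores

r_squared = SSR / SST            # coefficient of determination
s_yx = math.sqrt(SSE / (n - 2))  # standard error of the estimate

print(f"r^2 = {r_squared:.4f}, S_YX = {s_yx:.2f}")
```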
Linear Regression Assumptions
  For linear models:
  1. Normality
     - Y values are normally distributed for each X
     - Probability distribution of error is normal
  2. Homoscedasticity (constant variance)
  3. Independence of errors
Variation of Errors Around the Regression Line
  - Y values are normally distributed around the regression line.
  - For each X value, the spread or variance around the regression line is the same.
  [Figure: normal curves f(e) centered on the regression line at $X_1$ and $X_2$]
Residual Analysis
  - Purposes
    - Examine linearity
    - Evaluate violations of assumptions
  - Graphical analysis of residuals
    - Plot residuals vs. $X_i$ values
    - Residual: difference between actual $Y_i$ and predicted $\hat{Y}_i$
    - Studentized residuals: allow consideration for the magnitude of the residuals
Residual Analysis for Linearity
  [Figure: two plots of residuals e against X: Not Linear; Linear]
Residual Analysis for Homoscedasticity
  - Using standardized residuals (SR)
  [Figure: two plots of SR against X: Heteroscedasticity; Homoscedasticity]
Residual Analysis: Computer Output Example
  Excel output for produce stores:
    Observation   Predicted Y    Residuals
    1             4202.344417    -521.3444173
    2             3928.803824    -533.8038245
    3             5822.775103     830.2248971
    4             9894.664688    -351.6646882
    5             3557.14541     -239.1454103
    6             4918.90184      644.0981603
    7             3588.364717     171.6352829
  [Figure: Excel residual plot, residuals against Square Feet, 0 to 6,000]
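Each row of the residual table is the observed sales figure minus the value predicted by the fitted line. A minimal sketch, assuming the coefficients exactly as printed in the Excel output:

```python
# Produce-store data: square feet (X), annual sales in $000 (Y).
square_feet = [1726, 1542, 2816, 5555, 1292, 2208, 1313]
annual_sales = [3681, 3395, 6653, 9543, 3318, 5563, 3760]

# Coefficients as quoted in the Excel printout.
B0 = 1636.414726
B1 = 1.486633657

predicted = [B0 + B1 * x for x in square_feet]
residuals = [y - y_hat for y, y_hat in zip(annual_sales, predicted)]

for i, (y_hat, e) in enumerate(zip(predicted, residuals), start=1):
    print(f"{i}  {y_hat:12.4f}  {e:12.4f}")
```

Because the line is a least-squares fit with an intercept, the residuals sum to approximately zero, which is a quick sanity check on a table like this one.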
The Durbin-Watson Statistic
  - Used when data is collected over time, to detect autocorrelation (residuals in one time period are related to residuals in another period)
  - Measures violation of the independence assumption
  $D = \frac{\sum_{i=2}^{n} (e_i - e_{i-1})^2}{\sum_{i=1}^{n} e_i^2}$
  - Should be close to 2. If not, examine the model for autocorrelation.
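The statistic is a ratio of two sums over the residuals taken in time order. The sketch below implements the formula directly. The two residual series are made up purely for illustration and are not from the produce-store example, which is not time-series data.

```python
def durbin_watson(residuals):
    """Durbin-Watson D: sum of squared successive differences over sum of squares."""
    numerator = sum(
        (residuals[i] - residuals[i - 1]) ** 2 for i in range(1, len(residuals))
    )
    denominator = sum(e ** 2 for e in residuals)
    return numerator / denominator


# Hypothetical residuals in time order (illustration only).
drifting = [1, 1, 1, -1, -1, -1]  # neighbours tend to share a sign
alternating = [1, -1, 1, -1]      # neighbours tend to flip sign

print(durbin_watson(drifting))     # well below 2
print(durbin_watson(alternating))  # well above 2
```

Runs of same-signed residuals pull D toward 0, while residuals that flip sign every period push it toward 4. A value near 2 is what the slide treats as consistent with independence.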
Residual Analysis for Independence
  [Figure: two plots of standardized residuals (SR) against X: Not Independent; Independent]
Inferences about the Slope: t Test
  - t test for a population slope: is there a linear relationship between X and Y?
  - Null and alternative hypotheses
    - $H_0: \beta_1 = 0$ (no linear relationship)
    - $H_1: \beta_1 \neq 0$ (linear relationship)
  - Test statistic:
    $t = \frac{b_1 - \beta_1}{S_{b_1}}$, with df = n - 2,
    where $S_{b_1} = \frac{S_{YX}}{\sqrt{\sum_{i=1}^{n} (X_i - \bar{X})^2}}$
Example: Produce Stores
  Data for 7 stores:
    Store   Square Feet   Annual Sales ($000)
    1       1,726         3,681
    2       1,542         3,395
    3       2,816         6,653
    4       5,555         9,543
    5       1,292         3,318
    6       2,208         5,563
    7       1,313         3,760
  Regression model obtained: $\hat{Y}_i = 1636.415 + 1.487 X_i$
  The slope of this model is 1.487. Is there a linear relationship between the square footage of a store and its annual sales?
Inferences about the Slope: t Test Example
  - $H_0: \beta_1 = 0$
  - $H_1: \beta_1 \neq 0$
  - $\alpha = .05$, df = 7 - 2 = 5
  - Critical values: -2.5706 and 2.5706 (rejection region of .025 in each tail)
  - Test statistic, from Excel printout:
                    t Stat       P-value
    Intercept       3.6244333    0.0151488
    X Variable 1    9.009944     0.0002812
  - Decision: Reject $H_0$, since 9.009944 exceeds 2.5706.
  - Conclusion: There is evidence of a linear relationship.
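The t Stat for the slope can be rebuilt from the data using $S_{b_1} = S_{YX} / \sqrt{\sum (X_i - \bar{X})^2}$ and $t = b_1 / S_{b_1}$ under $H_0: \beta_1 = 0$. This is a sketch in plain Python. The critical value 2.5706 is taken from the slide rather than computed, since the standard library has no t distribution.

```python
import math

# Produce-store data: square feet (X), annual sales in $000 (Y).
square_feet = [1726, 1542, 2816, 5555, 1292, 2208, 1313]
annual_sales = [3681, 3395, 6653, 9543, 3318, 5563, 3760]
T_CRIT = 2.5706  # two-tailed critical value for alpha = .05, df = 5 (from the slide)

n = len(square_feet)
x_bar = sum(square_feet) / n
y_bar = sum(annual_sales) / n
ssx = sum((x - x_bar) ** 2 for x in square_feet)
b1 = sum((x - x_bar) * (y - y_bar)
         for x, y in zip(square_feet, annual_sales)) / ssx
b0 = y_bar - b1 * x_bar

# Standard error of the estimate, then standard error of the slope.
sse = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(square_feet, annual_sales))
s_yx = math.sqrt(sse / (n - 2))
s_b1 = s_yx / math.sqrt(ssx)

t_stat = (b1 - 0) / s_b1  # hypothesized slope under H0 is 0
reject_h0 = abs(t_stat) > T_CRIT
print(f"t = {t_stat:.4f}, reject H0: {reject_h0}")
```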
Inferences about the Slope: Confidence Interval Example
  - Confidence interval estimate of the slope: $b_1 \pm t_{n-2} S_{b_1}$
  - Excel printout for produce stores:
                    Lower 95%     Upper 95%
    Intercept       475.810926    2797.01853
    X Variable 1    1.06249037    1.91077694
  - At the 95% level of confidence, the confidence interval for the slope is (1.062, 1.911), which does not include 0.
  - Conclusion: There is a significant linear relationship between annual sales and the size of the store.
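The interval can be recovered from figures already quoted on the slides. The sketch below assumes that, because the t Stat for the slope is $b_1 / S_{b_1}$, the standard error can be backed out as $b_1$ divided by that t Stat. The critical value is again the slide's 2.5706, so the endpoints agree with Excel only to about four decimals.

```python
# Figures quoted in the Excel printouts for the produce stores.
B1 = 1.486633657   # sample slope
T_STAT = 9.009944  # t Stat for the slope (b1 / S_b1 under H0: slope = 0)
T_CRIT = 2.5706    # t value for 95% confidence with df = 5 (from the slide)

s_b1 = B1 / T_STAT  # standard error of the slope, backed out of the t Stat
lower = B1 - T_CRIT * s_b1
upper = B1 + T_CRIT * s_b1

print(f"95% CI for the slope: ({lower:.3f}, {upper:.3f})")
print("Interval excludes 0:", not (lower <= 0 <= upper))
```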
Estimation of Predicted Values
  Confidence Interval Estimate for $\mu_{YX}$, the Mean of Y Given a Particular $X_i$
  $\hat{Y}_i \pm t_{n-2} S_{YX} \sqrt{\frac{1}{n} + \frac{(X_i - \bar{X})^2}{\sum_{i=1}^{n} (X_i - \bar{X})^2}}$
  - $S_{YX}$ = standard error of the estimate
  - $t_{n-2}$ = t value from table with df = n - 2
  - The size of the interval varies according to the distance of $X_i$ from the mean, $\bar{X}$.
Estimation of Predicted Values
  Confidence Interval Estimate for an Individual Response $Y_i$ at a Particular $X_i$
  $\hat{Y}_i \pm t_{n-2} S_{YX} \sqrt{1 + \frac{1}{n} + \frac{(X_i - \bar{X})^2}{\sum_{i=1}^{n} (X_i - \bar{X})^2}}$
  - The addition of the 1 under the square root increases the width of the interval relative to that for the mean of Y.
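The two formulas differ only by the extra 1 under the square root. The sketch below computes both half-widths from the seven stores so they can be compared at several X values. The function name `half_widths` and the chosen X values are illustrative, and the t value 2.5706 is the one quoted on the slides. No expected output is shown; the point is the shape of the formulas rather than any quoted figure.

```python
import math

# Produce-store data: square feet (X), annual sales in $000 (Y).
square_feet = [1726, 1542, 2816, 5555, 1292, 2208, 1313]
annual_sales = [3681, 3395, 6653, 9543, 3318, 5563, 3760]
T_CRIT = 2.5706  # t value for 95% confidence with df = 5 (from the slides)

n = len(square_feet)
x_bar = sum(square_feet) / n
y_bar = sum(annual_sales) / n
ssx = sum((x - x_bar) ** 2 for x in square_feet)
b1 = sum((x - x_bar) * (y - y_bar)
         for x, y in zip(square_feet, annual_sales)) / ssx
b0 = y_bar - b1 * x_bar
sse = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(square_feet, annual_sales))
s_yx = math.sqrt(sse / (n - 2))


def half_widths(x_new):
    """Return (mean half-width, individual half-width) at X = x_new."""
    h = 1 / n + (x_new - x_bar) ** 2 / ssx
    mean_hw = T_CRIT * s_yx * math.sqrt(h)       # interval for the mean of Y
    indiv_hw = T_CRIT * s_yx * math.sqrt(1 + h)  # interval for one response
    return mean_hw, indiv_hw


for x_new in (x_bar, 2000, 5000):
    mean_hw, indiv_hw = half_widths(x_new)
    y_hat = b0 + b1 * x_new
    print(f"X = {x_new:8.2f}  Y-hat = {y_hat:9.2f}  "
          f"mean +/- {mean_hw:8.2f}  individual +/- {indiv_hw:8.2f}")
```

Both intervals are narrowest at the mean of X and widen as X moves away from it, and the individual interval is always the wider of the two.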
Interval Estimates for Different Values of X
  [Figure: interval bands around the regression line, labeled "Confidence interval for an individual $Y_i$" and "Confidence interval for the mean of Y", with $\bar{X}$ and a given X marked on the X axis]
Example: Produce Stores
  Data for 7 stores:
    Store   Square Feet   Annual Sales ($000)
    1       1,726         3,681
    2       1,542         3,395
    3       2,816         6,653
    4       5,555         9,543
    5       1,292         3,318
    6       2,208         5,563
    7       1,313         3,760
  Predict the annual sales for a store with 2,000 square feet.
  Regression model obtained: $\hat{Y}_i = 1636.415 + 1.487 X_i$
Estimation of Predicted Values: Example
  Confidence Interval Estimate for $\mu_{YX}$
  - Find the 95% confidence interval for the average annual sales for stores of 2,000 square feet.
  - Predicted sales: $\hat{Y}_i = 1636.415 + 1.487 X_i = 4610.45$ ($000)
  - $\bar{X} = 2350.29$, $S_{YX} = 611.75$, $t_{n-2} = t_5 = 2.5706$
  - Confidence interval for mean Y:
    $\hat{Y}_i \pm t_{n-2} S_{YX} \sqrt{\frac{1}{n} + \frac{(X_i - \bar{X})^2}{\sum_{i=1}^{n} (X_i - \bar{X})^2}} = 4610.45 \pm 980.97$
Estimation of Predicted Values: Example
  Confidence Interval Estimate for Individual Y
  - Find the 95% confidence interval for the annual sales of one particular store of 2,000 square feet.
  - Predicted sales: $\hat{Y}_i = 1636.415 + 1.487 X_i = 4610.45$ ($000)
  - $\bar{X} = 2350.29$, $S_{YX} = 611.75$, $t_{n-2} = t_5 = 2.5706$
  - Confidence interval for individual Y:
    $\hat{Y}_i \pm t_{n-2} S_{YX} \sqrt{1 + \frac{1}{n} + \frac{(X_i - \bar{X})^2}{\sum_{i=1}^{n} (X_i - \bar{X})^2}} = 4610.45 \pm 1853.45$
Correlation: Measuring the Strength of Association
  - Answers "How strong is the linear relationship between 2 variables?"
  - Coefficient of correlation used
    - Population correlation coefficient denoted $\rho$ (rho)
    - Values range from -1 to +1
    - Measures degree of association
  - Is the square root of the coefficient of determination
Test of Coefficient of Correlation
  - Tests if there is a linear relationship between 2 numerical variables
  - Same conclusion as testing the population slope $\beta_1$
  - Hypotheses
    - $H_0: \rho = 0$ (no correlation)
    - $H_1: \rho \neq 0$ (correlation)
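The "same conclusion" point can be checked numerically. The slides do not give a test statistic for $\rho$, so this sketch assumes the standard one, $t = r \sqrt{n-2} / \sqrt{1 - r^2}$ with n - 2 degrees of freedom, and takes $r^2$ and the sign of the slope from the Excel output.

```python
import math

# Figures quoted in the Excel output for the produce stores.
R_SQUARED = 0.94198129
B1 = 1.486633657  # sample slope; r takes the same sign as the slope
n = 7
T_CRIT = 2.5706   # two-tailed critical value for alpha = .05, df = 5 (from the slides)

# r is the square root of r^2, carrying the sign of the slope.
r = math.copysign(math.sqrt(R_SQUARED), B1)

# Assumed standard test statistic for H0: rho = 0 (not shown on the slides).
t_stat = r * math.sqrt(n - 2) / math.sqrt(1 - R_SQUARED)

print(f"r = {r:.4f}, t = {t_stat:.4f}, reject H0: {abs(t_stat) > T_CRIT}")
```

The t value matches the slope's t Stat in the Excel printout to the precision of the inputs, so the two tests reach the same decision.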
Chapter Summary
  - Described Types of Regression Models
  - Determined the Simple Linear Regression Equation
  - Provided Measures of Variation in Regression and Correlation
  - Stated Assumptions of Regression and Correlation
  - Described Residual Analysis and the Durbin-Watson Statistic
  - Provided Estimation of Predicted Values
  - Discussed Correlation: Measuring the Strength of the Association