Linear Regression Analysis
Analysis of paired data, and using a given value of one variable to predict the value of the other
Ex: The chirp rate of crickets is strongly correlated with the outside temperature. On 8 different days a statistician went outside and measured two quantities: the number of chirps a cricket made in a minute (x) and the outside temperature (y). The data are in the table below.
x (chirps in 1 minute):        88    118   110   86    120   103   96    90
y (outside temperature, °F):   69.7  93.3  84.3  76.3  88.6  82.6  71.6  79.6
a) Find the linear correlation coefficient r
b) Find the equation of the least-squares regression line
c) Predict the outside temperature if you hear a cricket chirping 140 times per minute
d) Find a 95% prediction interval for the outside temperature if a cricket is chirping 140 times per minute
Ex (continued):
Predictor variable: x = number of chirps in 1 minute
Response variable: y = outside temperature
[Scatter plot: "Crickets: Temp vs. chirps"; horizontal axis: # of chirps (x); vertical axis: Temperature (y)]
a) r = 0.8693
b) $\hat{y} = 0.5236x + 27.674$
Ex (continued):
c) If a cricket is chirping at 140 chirps per minute, the outside temperature is about
$$\hat{y} = 0.5236(140) + 27.674 \approx 101\ ^\circ\mathrm{F}$$
Ex (continued):
d) For a 95% prediction interval around $\hat{y} = 101\ ^\circ\mathrm{F}$ (the prediction for x = 140 chirps/min), the margin of error is E = 15.9 °F, and the prediction interval is
$$\hat{y} - E < y < \hat{y} + E \quad\text{or}\quad 85.1\ ^\circ\mathrm{F} < y < 116.9\ ^\circ\mathrm{F}$$
The Linear Correlation Coefficient r
r is a number that can be calculated from paired data that tells you how close to a straight line the graph of the data is
r is always between -1 and 1
When r is 1 or -1, the data form a perfect straight line
When r is close to 1 or -1, the graph of the data looks almost like a straight line, but with a little scatter
The further away r is from -1 and 1 (i.e. the closer to 0), the less the data looks like a straight line. It can look scattered all over the place, or can form a neat shape (but not a line)
If r is positive, the data trends upward. That means:
  - when x increases, y increases
  - the slope of the line is positive
If r is negative, the data trends downward. That means:
  - when x increases, y decreases
  - the slope of the line is negative
[Two scatter plots. First plot: r is positive (when x increases, y increases) and r = 1. Second plot: r is positive (when x increases, y increases) and r is close to 1 (actual r = .9698).]
[Two scatter plots. First plot: r is negative (when x increases, y decreases) and r = -1. Second plot: r is negative (when x increases, y decreases) and r is close to -1 (actual r = -.775).]
[Two scatter plots. In both, r is close to 0: when x increases, y ??? First plot: actual r = -.11. Second plot: actual r = .1183.]
The formula for r is:
$$r = \frac{n\sum xy - \sum x \sum y}{\sqrt{n\sum x^2 - \left(\sum x\right)^2}\,\sqrt{n\sum y^2 - \left(\sum y\right)^2}}$$
(more on this later)
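The sums formula for r can be sketched in a few lines of Python. This is only an illustration: the function name `correlation` and the two small data sets are made up for the sketch and are not from the slides.

```python
import math

def correlation(xs, ys):
    """Linear correlation coefficient r, computed from the sums formula."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    numerator = n * sum_xy - sum_x * sum_y
    denominator = (math.sqrt(n * sum_x2 - sum_x ** 2)
                   * math.sqrt(n * sum_y2 - sum_y ** 2))
    return numerator / denominator

# Made-up points lying exactly on the rising line y = 2x + 1.
r_rising = correlation([1, 2, 3, 4], [3, 5, 7, 9])
# Made-up points lying exactly on the falling line y = -3x + 20.
r_falling = correlation([1, 2, 3, 4], [17, 14, 11, 8])
print(r_rising, r_falling)
```

Points on a perfect rising line give r = 1 and points on a perfect falling line give r = -1 (up to floating-point rounding), matching the description of r above.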
The Least-Squares Regression Line
Goal: We want to find a line that is as close as possible to all the data points
Every line has a total error, and we are looking for the line with the smallest total error
Ex: Data
x:   1   3   4
y:   3  13   9
Let's see which line is closer to this data. Our choices in this example are $\hat{y} = -2x + 13$ or $\hat{y} = 2x + 2$
Ex (continued): For the line $\hat{y} = -2x + 13$, the error at each data point is
$$E_1 = y - \hat{y} = [3] - [-2(1) + 13] = -8$$
$$E_2 = y - \hat{y} = [13] - [-2(3) + 13] = 6$$
$$E_3 = y - \hat{y} = [9] - [-2(4) + 13] = 4$$
What is the total error?
$E_1 + E_2 + E_3$? No, because we don't want negative numbers to cancel out positive ones
$|E_1| + |E_2| + |E_3|$? No, because absolute values are hard to deal with in math
So we'll use $E_1^2 + E_2^2 + E_3^2$
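Ex (continued):
For the line $\hat{y} = -2x + 13$:
$$E_1 = [3] - [-2(1) + 13] = -8, \quad E_2 = [13] - [-2(3) + 13] = 6, \quad E_3 = [9] - [-2(4) + 13] = 4$$
$$\text{Total error} = E_1^2 + E_2^2 + E_3^2 = (-8)^2 + (6)^2 + (4)^2 = 116$$
For the line $\hat{y} = 2x + 2$:
$$E_1 = [3] - [2(1) + 2] = -1, \quad E_2 = [13] - [2(3) + 2] = 5, \quad E_3 = [9] - [2(4) + 2] = -1$$
$$\text{Total error} = E_1^2 + E_2^2 + E_3^2 = (-1)^2 + (5)^2 + (-1)^2 = 27$$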
Ex (continued):
Total error for $\hat{y} = -2x + 13$: 116
Total error for $\hat{y} = 2x + 2$: 27
So the line $\hat{y} = 2x + 2$ is closer to the data than the line $\hat{y} = -2x + 13$
Is there an even closer line???
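The comparison of the two candidate lines can be checked with a short Python sketch. The helper name `total_squared_error` is made up for this illustration, and the two lines are taken to be ŷ = -2x + 13 and ŷ = 2x + 2, the slopes and intercepts that reproduce the errors listed above.

```python
def total_squared_error(xs, ys, slope, intercept):
    """Sum of squared errors (y - y_hat)^2 for the line y_hat = slope*x + intercept."""
    return sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))

xs = [1, 3, 4]
ys = [3, 13, 9]

error_line_a = total_squared_error(xs, ys, slope=-2, intercept=13)
error_line_b = total_squared_error(xs, ys, slope=2, intercept=2)
print(error_line_a, error_line_b)  # 116 27
```

The smaller total belongs to the second line, so it is the closer of the two.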
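Using calculus, it can be shown that, given some data, there is a line that is closer to the data than any other line. This line is called the Least-Squares Regression Line (LSRL)
Formula for the Least-Squares Regression Line:
$$\hat{y} = b_1 x + b_0$$
$$b_1 = \frac{n\sum xy - \sum x \sum y}{n\sum x^2 - \left(\sum x\right)^2} \qquad b_0 = \frac{\sum y \sum x^2 - \sum x \sum xy}{n\sum x^2 - \left(\sum x\right)^2}$$
(more on this later)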
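As a rough check that such a line exists for the three-point example (x = 1, 3, 4 and y = 3, 13, 9), the formulas for $b_1$ and $b_0$ can be applied directly. The function names below are made up for this sketch.

```python
def least_squares_line(xs, ys):
    """Return (b0, b1) for the least-squares line y_hat = b1*x + b0."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_x2 = sum(x * x for x in xs)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    denominator = n * sum_x2 - sum_x ** 2
    b1 = (n * sum_xy - sum_x * sum_y) / denominator
    b0 = (sum_y * sum_x2 - sum_x * sum_xy) / denominator
    return b0, b1

def total_squared_error(xs, ys, slope, intercept):
    """Sum of squared errors for the line y_hat = slope*x + intercept."""
    return sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))

xs = [1, 3, 4]
ys = [3, 13, 9]
b0, b1 = least_squares_line(xs, ys)
lsrl_error = total_squared_error(xs, ys, b1, b0)
print(b1, b0, lsrl_error)
```

For this data the sketch gives $b_1 = 17/7 \approx 2.43$ and $b_0 = 13/7 \approx 1.86$, with a total squared error of $162/7 \approx 23.1$, which is smaller than the 27 obtained for the line with slope 2 and intercept 2.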
Prediction Intervals
The Least-Squares Regression Line can be used to predict the value of y given the value of x. The predicted value of y is called $\hat{y}$
The problem: $\hat{y}$ is a point estimate because it is a 1-number guess of the value of y. Instead, we want an interval estimate (a confidence interval).
When finding confidence intervals for problems involving paired data, they are called Prediction Intervals.
As usual, to find a prediction interval, we need the margin of error formula E, and
$$\hat{y} - E < y < \hat{y} + E$$
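Summary of Formulas
Correlation coefficient r:
$$r = \frac{n\sum xy - \sum x \sum y}{\sqrt{n\sum x^2 - \left(\sum x\right)^2}\,\sqrt{n\sum y^2 - \left(\sum y\right)^2}}$$
LSRL:
$$\hat{y} = b_1 x + b_0 \qquad b_1 = \frac{n\sum xy - \sum x \sum y}{n\sum x^2 - \left(\sum x\right)^2} \qquad b_0 = \frac{\sum y \sum x^2 - \sum x \sum xy}{n\sum x^2 - \left(\sum x\right)^2}$$
Prediction interval (use the t distribution), where $x_0$ is the given value of x:
$$s_e = \sqrt{\frac{\sum y^2 - b_0 \sum y - b_1 \sum xy}{n - 2}} \qquad df = n - 2$$
$$E = t_{\alpha/2}\, s_e \sqrt{1 + \frac{1}{n} + \frac{n\left(x_0 - \bar{x}\right)^2}{n\sum x^2 - \left(\sum x\right)^2}}$$
$$\hat{y} - E < y < \hat{y} + E$$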
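The prediction-interval formulas can be put together in a short Python sketch, applied here to the cricket example from the start of these notes. The function name `margin_of_error` is made up for this illustration. The data values are as read from that example, the coefficients are the rounded ones reported there, and the critical value 2.447 is the usual t-table entry for df = 6 at 95% confidence (an assumption, since the slides do not show it).

```python
import math

def margin_of_error(xs, ys, x0, t_crit, b0, b1):
    """Margin of error E of a prediction interval at x = x0."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    # Standard error of estimate, with df = n - 2.
    s_e = math.sqrt((sum_y2 - b0 * sum_y - b1 * sum_xy) / (n - 2))
    x_bar = sum_x / n
    spread = n * (x0 - x_bar) ** 2 / (n * sum_x2 - sum_x ** 2)
    return t_crit * s_e * math.sqrt(1 + 1 / n + spread)

chirps = [88, 118, 110, 86, 120, 103, 96, 90]
temps = [69.7, 93.3, 84.3, 76.3, 88.6, 82.6, 71.6, 79.6]

b0, b1 = 27.674, 0.5236   # rounded coefficients, as reported in the example
t_crit = 2.447            # assumed t-table value for df = 8 - 2 = 6, 95% level
x0 = 140

y_hat = b1 * x0 + b0
E = margin_of_error(chirps, temps, x0, t_crit, b0, b1)
print(round(y_hat, 1), round(E, 1))
print(round(y_hat - E, 1), round(y_hat + E, 1))
```

With these inputs E comes out at about 15.9 °F, in line with the example. Carrying more decimal places in $b_0$ and $b_1$ changes $s_e$ slightly, so small differences in the last digit are to be expected.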
A Detailed Linear Regression Problem
Ex: In order to study global warming, the data below was taken over different years (CO2 levels in parts per million and temperature in °C)
CO2 level x (ppm):    314   317   320   326   331   339   346   354   361   369
Temperature y (°C):   13.9  14.0  13.9  14.1  14.0  14.3  14.1  14.5  14.5  14.4
a) Find the linear correlation coefficient r
b) Find the equation of the least-squares regression line
c) Predict the temperature of the Earth if the CO2 levels reach 400 parts per million
d) Construct a 90% prediction interval for the temperature of the Earth if the CO2 levels reach 400 parts per million
Ex (continued): Before we do any of the parts of this problem, we need to find $\sum x$, $\sum x^2$, $\sum y$, $\sum y^2$, $\sum xy$, and $\bar{x}$ because these appear in many of the formulas.
Ex (continued):
$$\sum x = 314 + 317 + \dots + 369 = 3377$$
$$\sum x^2 = 314^2 + 317^2 + \dots + 369^2 = 1143757$$
$$\sum y = 13.9 + 14 + \dots + 14.4 = 141.7$$
$$\sum y^2 = 13.9^2 + 14^2 + \dots + 14.4^2 = 2008.39$$
$$\sum xy = (314)(13.9) + (317)(14) + \dots + (369)(14.4) = 47888.6$$
$$\bar{x} = \frac{3377}{10} = 337.7$$
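These sums are tedious by hand, so here is a minimal Python sketch that recomputes them from the data table. The variable names are made up for this illustration.

```python
co2 = [314, 317, 320, 326, 331, 339, 346, 354, 361, 369]             # x, ppm
temp = [13.9, 14.0, 13.9, 14.1, 14.0, 14.3, 14.1, 14.5, 14.5, 14.4]  # y, deg C

n = len(co2)
sum_x = sum(co2)
sum_x2 = sum(x * x for x in co2)
sum_y = sum(temp)
sum_y2 = sum(y * y for y in temp)
sum_xy = sum(x * y for x, y in zip(co2, temp))
x_bar = sum_x / n

print(n, sum_x, sum_x2)
print(round(sum_y, 1), round(sum_y2, 2), round(sum_xy, 1), x_bar)
```

The rounding in the last line only hides floating-point noise in the decimal sums; the values agree with the ones listed above.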
Ex (continued): Before we calculate r, here is a graph of the data
[Scatter plot: "Temp vs. CO2 Levels"; horizontal axis: CO2 Level (ppm); vertical axis: Temperature (Celsius)]
What do you think r is?
r is positive: when x increases, y increases
r is close to 1
Ex (continued):
a) To calculate r, plug into the formula for r
$$r = \frac{n\sum xy - \sum x \sum y}{\sqrt{n\sum x^2 - \left(\sum x\right)^2}\,\sqrt{n\sum y^2 - \left(\sum y\right)^2}} = \frac{(10)(47888.6) - (3377)(141.7)}{\sqrt{(10)(1143757) - (3377)^2}\,\sqrt{10(2008.39) - (141.7)^2}} \approx 0.892$$
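The plug-in step for r can be reproduced with a few lines of Python. This is just a sketch that starts from the five sums already computed for this problem; the variable names are made up.

```python
import math

n = 10
sum_x, sum_x2 = 3377, 1143757
sum_y, sum_y2 = 141.7, 2008.39
sum_xy = 47888.6

numerator = n * sum_xy - sum_x * sum_y
denominator = (math.sqrt(n * sum_x2 - sum_x ** 2)
               * math.sqrt(n * sum_y2 - sum_y ** 2))
r = numerator / denominator
print(round(r, 3))
```

The result is positive and fairly close to 1, which agrees with the upward trend in the graph.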
Ex (continued):
b) To find the equation of the regression line, plug into the equations for $b_0$ and $b_1$
$$b_1 = \frac{n\sum xy - \sum x \sum y}{n\sum x^2 - \left(\sum x\right)^2} = \frac{(10)(47888.6) - (3377)(141.7)}{(10)(1143757) - (3377)^2} = 0.0109$$
$$b_0 = \frac{\sum y \sum x^2 - \sum x \sum xy}{n\sum x^2 - \left(\sum x\right)^2} = \frac{(141.7)(1143757) - (3377)(47888.6)}{(10)(1143757) - (3377)^2} = 10.483$$
$$\hat{y} = b_1 x + b_0 \quad\text{so}\quad \hat{y} = 0.0109x + 10.483$$
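The same plug-in step for the slope and intercept, sketched in Python from the sums of this problem (variable names are made up for the illustration):

```python
n = 10
sum_x, sum_x2 = 3377, 1143757
sum_y = 141.7
sum_xy = 47888.6

denominator = n * sum_x2 - sum_x ** 2
b1 = (n * sum_xy - sum_x * sum_y) / denominator       # slope
b0 = (sum_y * sum_x2 - sum_x * sum_xy) / denominator  # intercept
print(round(b1, 4), round(b0, 3))
```

Rounded to the same number of places, the slope and intercept match the values used in the regression equation above.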
Ex (continued):
a) r = 0.892
b) $\hat{y} = 0.0109x + 10.483$
Here's what's going on
[Scatter plot with the regression line: "Temp vs. CO2 Levels"; horizontal axis: CO2 Level (ppm); vertical axis: Temperature (Celsius)]
Ex (continued):
c) The whole point of the regression line is to make predictions. To make the prediction, just plug in the given value of x to predict the y value
When the Earth's CO2 level reaches 400 ppm (that is called $x_0 = 400$), the best one-number point estimate prediction for y is
$$\hat{y} = 0.0109(400) + 10.483 = 14.843\ ^\circ\mathrm{C}$$
Ex (continued):
d) As usual, to find a prediction interval, we need to find E. To find E we need to find $t_{\alpha/2}$ and $s_e$, and plug everything into the formula for E.
$$\alpha = 1 - \text{conf. level} = 1 - 0.90 = 0.10 \qquad \alpha/2 = 0.10/2 = 0.05$$
$$df = n - 2 = 10 - 2 = 8$$
$$t_{\alpha/2} = 1.860 \quad\text{(from table)}$$
Ex (continued):
d) $t_{\alpha/2} = 1.860$, $s_e = ?$, $E = ?$
$$s_e = \sqrt{\frac{\sum y^2 - b_0 \sum y - b_1 \sum xy}{n - 2}} = \sqrt{\frac{2008.39 - (10.483)(141.7) - (0.0109)(47888.6)}{10 - 2}} = 0.347$$
This value uses the rounded $b_0$ and $b_1$ from part b). The numerator is a small difference of large numbers, so $s_e$ is sensitive to that rounding.
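Because the numerator of $s_e$ subtracts numbers near 2000 to leave something below 1, the number of decimal places kept in $b_0$ and $b_1$ matters. The Python sketch below (variable and function names are made up) evaluates $s_e$ twice from the sums of this problem: once with the rounded coefficients 10.483 and 0.0109, and once with unrounded coefficients recomputed from the same sums.

```python
import math

n = 10
sum_x, sum_x2 = 3377, 1143757
sum_y, sum_y2 = 141.7, 2008.39
sum_xy = 47888.6

def standard_error(b0, b1):
    """s_e from the sums formula, with df = n - 2."""
    return math.sqrt((sum_y2 - b0 * sum_y - b1 * sum_xy) / (n - 2))

# Rounded coefficients.
se_rounded = standard_error(10.483, 0.0109)

# Unrounded coefficients, recomputed from the same sums.
denominator = n * sum_x2 - sum_x ** 2
b1_full = (n * sum_xy - sum_x * sum_y) / denominator
b0_full = (sum_y * sum_x2 - sum_x * sum_xy) / denominator
se_full = standard_error(b0_full, b1_full)

print(round(se_rounded, 3), round(se_full, 3))
```

The rounded coefficients give about 0.347, while the unrounded ones give about 0.113. The rest of this example continues with 0.347; if the calculation is redone by computer, a noticeably narrower interval should be expected.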
Ex (continued):
d) $t_{\alpha/2} = 1.860$, $s_e = 0.347$, $E = ?$
$$E = t_{\alpha/2}\, s_e \sqrt{1 + \frac{1}{n} + \frac{n\left(x_0 - \bar{x}\right)^2}{n\sum x^2 - \left(\sum x\right)^2}} = (1.860)(0.347)\sqrt{1 + \frac{1}{10} + \frac{10(400 - 337.7)^2}{10(1143757) - (3377)^2}} = 0.97$$
Ex (continued):
d) $t_{\alpha/2} = 1.860$, $s_e = 0.347$, $E = 0.97\ ^\circ\mathrm{C}$, and the 90% prediction interval is
$$\hat{y} - E < y < \hat{y} + E$$
$$14.843 - 0.97 < y < 14.843 + 0.97$$
$$13.873\ ^\circ\mathrm{C} < y < 15.813\ ^\circ\mathrm{C}$$
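The last two steps of part d) can be reproduced with a short Python sketch. It takes the values of this example as given (the rounded coefficients, $t_{\alpha/2} = 1.860$ from the table, and $s_e = 0.347$) and only redoes the arithmetic; the variable names are made up.

```python
import math

n = 10
sum_x, sum_x2 = 3377, 1143757
x0 = 400                 # CO2 level to predict at, in ppm

b0, b1 = 10.483, 0.0109  # rounded coefficients from part b)
t_crit = 1.860           # table value for df = 8, 90% level
s_e = 0.347              # standard error as computed in this example

y_hat = b1 * x0 + b0
x_bar = sum_x / n
spread = n * (x0 - x_bar) ** 2 / (n * sum_x2 - sum_x ** 2)
E = t_crit * s_e * math.sqrt(1 + 1 / n + spread)

lower, upper = y_hat - E, y_hat + E
print(round(y_hat, 3), round(E, 2))
print(round(lower, 3), round(upper, 3))
```

The endpoints agree with the interval above to within the rounding of E to two decimal places.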
Ex (continued):
a) r = 0.892
b) $\hat{y} = 0.0109x + 10.483$
c) $\hat{y} = 14.843\ ^\circ\mathrm{C}$
d) $13.873\ ^\circ\mathrm{C} < y < 15.813\ ^\circ\mathrm{C}$
[Scatter plot with the regression line: "Temp vs. CO2 Levels"; horizontal axis: CO2 Level (ppm); vertical axis: Temperature (Celsius)]
Homework
Sec. 4.1: #'s 1-43 odd
Sec. 4.: #'s 1-31 odd
Sec. 14.: #'s 1-17 all
Thanks everyone, it's been a fun semester