Paired Data and Linear Correlation

Example 1. A group of calculus students has taken two quizzes. These are their scores:

Student   1st Quiz Score (x data)   2nd Quiz Score (y data)
   1               17                        15
   2               15                        20
   3                0                         3
   4               10                        15
   5               15                        15
   6               10                         8
   7               20                        20
   8               15                        17
   9                8                        10
  10                5                        10

Graphing the scores in a rectangular coordinate system yields the following graph (the so-called scatter diagram).

[Figure: scatter diagram of 2nd quiz score (y) against 1st quiz score (x)]

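There is an obvious relation between the scores on the two quizzes. Better scores on the first quiz tend to go with better scores on the second quiz, and worse scores on the first quiz with worse scores on the second quiz. We measure this correlation with the Pearson (Linear Product Moment) Coefficient of Correlation.

Pearson Coefficient of Correlation Calculation Formulas

For a population of size $N$:

\[ r = \frac{\sum z_x z_y}{N} = \frac{\sum xy - N\mu_x\mu_y}{N\sigma_x\sigma_y} \tag{1.1} \]

   x     y     xy
  17    15    255
  15    20    300
   0     3      0
  10    15    150
  15    15    225
  10     8     80
  20    20    400
  15    17    255
   8    10     80
   5    10     50

$\sum x = 115$, $\sum y = 133$, $\sum xy = 1795$

With $\mu_x = 11.5$, $\mu_y = 13.3$, $\sigma_x \approx 5.7489$ and $\sigma_y \approx 5.1778$:

\[ r = \frac{1795 - 10(11.5)(13.3)}{10(5.7489)(5.1778)} \approx 0.892. \]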
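The population calculation in formula (1.1) can be cross-checked with a short script. This is a minimal sketch, assuming the Example 1 scores are entered as two plain Python lists; the function name `population_r` is only illustrative and does not come from the notes.

```python
from math import sqrt

# Quiz scores from Example 1 (x = 1st quiz, y = 2nd quiz).
x = [17, 15, 0, 10, 15, 10, 20, 15, 8, 5]
y = [15, 20, 3, 15, 15, 8, 20, 17, 10, 10]

def population_r(xs, ys):
    """Pearson r as the mean product of population z-scores (formula 1.1)."""
    n = len(xs)
    mu_x = sum(xs) / n
    mu_y = sum(ys) / n
    # Population standard deviations: divide by n, not n - 1.
    sigma_x = sqrt(sum((v - mu_x) ** 2 for v in xs) / n)
    sigma_y = sqrt(sum((v - mu_y) ** 2 for v in ys) / n)
    z_products = [((a - mu_x) / sigma_x) * ((b - mu_y) / sigma_y)
                  for a, b in zip(xs, ys)]
    return sum(z_products) / n

r = population_r(x, y)
print(round(r, 3))  # about 0.892 for these scores
```

Working through the z-scores directly, rather than through the sums, makes it easier to see which students pull the coefficient up: pairs where both scores sit on the same side of their means contribute positive products.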

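In case we are dealing with a sample (of size $n$) and using sample parameters, then using the sample standard deviations $s_x$ and $s_y$ yields the following:

\[ r = \frac{\sum z_x z_y}{n-1} = \frac{\sum xy - n\bar{x}\bar{y}}{(n-1)s_x s_y} \tag{1.2} \]

Note that we are getting the same number, only using different formulas: for the same $n$ data points, $(n-1)s_x s_y = n\sigma_x\sigma_y$.

Here is one more formula often used as well, given by our textbook:

\[ r = \frac{\sum xy - n\bar{x}\bar{y}}{(n-1)s_x s_y} = \frac{\sum xy - \frac{\sum x \sum y}{n}}{\sqrt{\sum x^2 - \frac{(\sum x)^2}{n}}\,\sqrt{\sum y^2 - \frac{(\sum y)^2}{n}}} \]

\[ r = \frac{n\sum xy - \sum x \sum y}{\sqrt{n\sum x^2 - (\sum x)^2}\,\sqrt{n\sum y^2 - (\sum y)^2}} \tag{1.3} \]

The easiest one to use is probably (1.2), but there you need to calculate the standard deviations for both $x$ and $y$ before you use the formula. Formula (1.3) is useful because we do not have to calculate standard deviations, but you need to calculate the squares of the data points. This is the formula we are going to use most often.

Let's calculate the Pearson Correlation Coefficient for our example using formulas (1.2) and (1.3). We need a slightly expanded table:

   x     y     xy    x^2    y^2
  17    15    255    289    225
  15    20    300    225    400
   0     3      0      0      9
  10    15    150    100    225
  15    15    225    225    225
  10     8     80    100     64
  20    20    400    400    400
  15    17    255    225    289
   8    10     80     64    100
   5    10     50     25    100

$\sum x = 115$, $\sum y = 133$, $\sum xy = 1795$, $\sum x^2 = 1653$, $\sum y^2 = 2037$

Using formula (1.2), with $\bar{x} = 11.5$, $\bar{y} = 13.3$, $s_x \approx 6.0599$ and $s_y \approx 5.4579$:

\[ r = \frac{1795 - 10(11.5)(13.3)}{9(6.0599)(5.4579)} \approx 0.892 \]

And one more time using formula (1.3):

\[ r = \frac{10(1795) - (115)(133)}{\sqrt{10(1653) - 115^2}\,\sqrt{10(2037) - 133^2}} = \frac{2655}{\sqrt{3305}\,\sqrt{2681}} \approx 0.892 \]

Example 2. Instead of quiz #1 we use the shoe size of each student. Here is the data, summarized in the table: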

Student   Student's Shoe Size (x data)   2nd Quiz Score (y data)
   1               10.5                          15
   2               10                            20
   3               11                             3
   4                7                            15
   5               10                            15
   6                8.5                           8
   7                8.5                          20
   8               12                            17
   9                9                            10
  10                8                            10

Here is the scatter diagram.

[Figure: scatter diagram of 2nd quiz score (y) against shoe size (x)]

The diagram does not indicate any connection between shoe size and the quiz #2 score. We should see this from the Pearson correlation coefficient as well.
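The claim can also be checked numerically. The sketch below implements formulas (1.2) and (1.3) as two small functions and applies both to the shoe-size data; the names `r_sample` and `r_sums` are illustrative choices, not notation from the notes.

```python
from math import sqrt

# Example 2 data: x = shoe size, y = 2nd quiz score.
shoe = [10.5, 10, 11, 7, 10, 8.5, 8.5, 12, 9, 8]
quiz2 = [15, 20, 3, 15, 15, 8, 20, 17, 10, 10]

def r_sample(xs, ys):
    """Formula (1.2): sample means and sample standard deviations."""
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    s_x = sqrt(sum((v - x_bar) ** 2 for v in xs) / (n - 1))
    s_y = sqrt(sum((v - y_bar) ** 2 for v in ys) / (n - 1))
    sum_xy = sum(a * b for a, b in zip(xs, ys))
    return (sum_xy - n * x_bar * y_bar) / ((n - 1) * s_x * s_y)

def r_sums(xs, ys):
    """Formula (1.3): only the sums of x, y, xy, x^2 and y^2 are needed."""
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxy = sum(a * b for a, b in zip(xs, ys))
    sxx = sum(a * a for a in xs)
    syy = sum(b * b for b in ys)
    return (n * sxy - sx * sy) / (sqrt(n * sxx - sx ** 2) * sqrt(n * syy - sy ** 2))

# Both values should be close to zero and agree with each other.
print(round(r_sample(shoe, quiz2), 4), round(r_sums(shoe, quiz2), 4))
```

The two functions agree up to floating-point rounding, which is the point made earlier about the formulas being different routes to the same number.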

Calculation shows

    x      y      xy      x^2      y^2
  10.5    15    157.5   110.25    225
  10      20    200     100       400
  11       3     33     121         9
   7      15    105      49       225
  10      15    150     100       225
   8.5     8     68      72.25     64
   8.5    20    170      72.25    400
  12      17    204     144       289
   9      10     90      81       100
   8      10     80      64       100

$\sum x = 94.5$, $\sum y = 133$, $\sum xy = 1257.5$, $\sum x^2 = 913.75$, $\sum y^2 = 2037$

Using formula (1.2), with $\bar{x} = 9.45$, $\bar{y} = 13.3$, $s_x \approx 1.5175$ and $s_y \approx 5.4579$:

\[ r = \frac{1257.5 - 10(9.45)(13.3)}{9(1.5175)(5.4579)} \approx 0.0087. \]

Using formula (1.3):

\[ r = \frac{10(1257.5) - (94.5)(133)}{\sqrt{10(913.75) - 94.5^2}\,\sqrt{10(2037) - 133^2}} = \frac{6.5}{\sqrt{207.25}\,\sqrt{2681}} \approx 0.00872. \]

Both results, rounded to the nearest ten-thousandth, give 0.0087. This is a very small number and indicates that the linear correlation between $x$ and $y$ is insignificant.

Example 3. This is a random sample for a Boston park mugging problem. The sample of paired data was taken over 10 randomly chosen days in the summer of 2000. The first entry in the pair is the number of police officers on duty in the park; the second entry is the number of muggings reported on that day.

Day   Police officers on duty in the park   Number of reported muggings
  1                  10                                 5
  2                  15                                 2
  3                  16                                 1
  4                   1                                 9
  5                   4                                 7
  6                   6                                 8
  7                  18                                 1
  8                  12                                 5
  9                  14                                 3
 10                   7                                 6

[Figure: scatter diagram of reported muggings (y) against police officers on duty (x)]

The diagram indicates significant linear correlation. We would say the correlation is negative, meaning the fitting line is decreasing (more police officers are associated with fewer muggings, fewer police officers with more muggings).
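The statement that the fitting line is decreasing can be checked by computing its slope. The notes do not say how the line is fitted, so the sketch below assumes the usual least-squares line, whose slope is $(n\sum xy - \sum x\sum y)/(n\sum x^2 - (\sum x)^2)$; the function name `least_squares_slope` is illustrative.

```python
# Example 3 data: x = officers on duty, y = reported muggings.
officers = [10, 15, 16, 1, 4, 6, 18, 12, 14, 7]
muggings = [5, 2, 1, 9, 7, 8, 1, 5, 3, 6]

def least_squares_slope(xs, ys):
    """Slope of the least-squares line (an assumed choice of fitting line)."""
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxy = sum(a * b for a, b in zip(xs, ys))
    sxx = sum(a * a for a in xs)
    return (n * sxy - sx * sy) / (n * sxx - sx ** 2)

slope = least_squares_slope(officers, muggings)
print(round(slope, 3))  # a negative value means a decreasing line
```

The slope shares its numerator with formula (1.3), so under this assumption the sign of the slope and the sign of $r$ always agree.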

The correlation coefficient calculation is easy, here using formula (1.3), which does not require calculating standard deviations.

   x     y     xy    x^2    y^2
  10     5     50    100     25
  15     2     30    225      4
  16     1     16    256      1
   1     9      9      1     81
   4     7     28     16     49
   6     8     48     36     64
  18     1     18    324      1
  12     5     60    144     25
  14     3     42    196      9
   7     6     42     49     36

$\sum x = 103$, $\sum y = 47$, $\sum xy = 343$, $\sum x^2 = 1347$, $\sum y^2 = 295$

\[ r = \frac{10(343) - (103)(47)}{\sqrt{10(1347) - 103^2}\,\sqrt{10(295) - 47^2}} = \frac{-1411}{\sqrt{2861}\,\sqrt{741}} \approx -0.969. \]

Important notes about the Pearson Correlation Coefficient (discussed in lecture!)

1. The Correlation Coefficient is used to determine linear correlation; it cannot indicate any other type of correlation. Non-linear correlation is not measured or expressed by the correlation coefficient.
2. The Correlation Coefficient is a number between $-1$ and $1$. When it is close to $-1$ the correlation is negative linear; when it is close to $1$ the correlation is positive linear. When it is close to 0 there is no significant linear correlation between $x$ and $y$.
3. Positive or negative linear correlation does not mean causation.

Homework: Check online.
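Notes 1 and 2 above can be illustrated with made-up data (these points are not from the lecture). In this minimal sketch, points lying exactly on a line give $r = 1$, while points lying exactly on a parabola, a perfect but non-linear relation, give $r = 0$. The helper `pearson_r` is an illustrative name for formula (1.3).

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson r computed from raw sums, as in formula (1.3)."""
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxy = sum(a * b for a, b in zip(xs, ys))
    sxx = sum(a * a for a in xs)
    syy = sum(b * b for b in ys)
    return (n * sxy - sx * sy) / (sqrt(n * sxx - sx ** 2) * sqrt(n * syy - sy ** 2))

# Hypothetical data, symmetric about x = 0.
xs = [-3, -2, -1, 0, 1, 2, 3]
line = [2 * v + 1 for v in xs]   # exact increasing linear relation
parabola = [v * v for v in xs]   # exact non-linear relation

r_line = pearson_r(xs, line)
r_parabola = pearson_r(xs, parabola)
print(r_line, r_parabola)
```

The parabola case shows why a coefficient near 0 should be read as "no linear correlation" rather than "no relation at all".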

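Appendix: Note for Formula (1.1) and More!

In the derivation of formula (1.1) we used

\[ r = \frac{\sum z_x z_y}{N} = \frac{\sum (x-\mu_x)(y-\mu_y)}{N\sigma_x\sigma_y} = \frac{\sum xy - N\mu_x\mu_y}{N\sigma_x\sigma_y}. \]

The last equation follows this way:

\[ \sum (x-\mu_x)(y-\mu_y) = \sum xy - \mu_y\sum x - \mu_x\sum y + N\mu_x\mu_y = \sum xy - N\mu_x\mu_y - N\mu_x\mu_y + N\mu_x\mu_y = \sum xy - N\mu_x\mu_y. \]

There is another popular formula for the Pearson Correlation Coefficient that follows from the second formula above,

\[ r = \frac{\sum (x-\mu_x)(y-\mu_y)}{N\sigma_x\sigma_y} = \frac{\sum (x-\mu_x)(y-\mu_y)}{\sqrt{\sum (x-\mu_x)^2}\,\sqrt{\sum (y-\mu_y)^2}}. \]

That is this one:

\[ r = \frac{\sum (x-\bar{x})(y-\bar{y})}{\sqrt{\sum (x-\bar{x})^2}\,\sqrt{\sum (y-\bar{y})^2}} \tag{1.4} \]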
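Formula (1.4) translates almost directly into code. As a sketch, the function below (the name `r_deviations` is illustrative) applies it to the Example 3 data, where it should reproduce the value of about $-0.969$ obtained with formula (1.3).

```python
from math import sqrt

# Example 3 data: x = officers on duty, y = reported muggings.
officers = [10, 15, 16, 1, 4, 6, 18, 12, 14, 7]
muggings = [5, 2, 1, 9, 7, 8, 1, 5, 3, 6]

def r_deviations(xs, ys):
    """Formula (1.4): Pearson r from deviations about the means."""
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    dx = [v - x_bar for v in xs]
    dy = [v - y_bar for v in ys]
    numerator = sum(a * b for a, b in zip(dx, dy))
    denominator = sqrt(sum(a * a for a in dx)) * sqrt(sum(b * b for b in dy))
    return numerator / denominator

r4 = r_deviations(officers, muggings)
print(round(r4, 3))  # about -0.969
```

Subtracting the means first keeps the intermediate numbers small, which tends to be kinder to floating-point arithmetic than the large raw sums in formula (1.3), at the cost of a second pass over the data.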