8. CORRELATION

Definitions:
1. "Correlation analysis attempts to determine the degree of relationship between variables." - Ya-Kun-Chou
2. "Correlation is an analysis of the covariation between two or more variables." - A.M. Tuttle

Correlation expresses the inter-dependence of two sets of variables upon each other. One variable may be called the subject (independent) variable and the other the relative (dependent) variable. The relative variable is measured in terms of the subject.

Uses of correlation:
1. It is used in physical and social sciences.
2. It is useful for economists to study the relationship between variables like price, quantity etc. Businessmen estimate costs, sales, price etc. using correlation.
3. It is helpful in measuring the degree of relationship between variables like income and expenditure, price and supply, supply and demand etc.
4. Sampling error can be calculated.
5. It is the basis for the concept of regression.

Scatter Diagram:
It is the simplest method of studying the relationship between two variables diagrammatically. One variable is represented along the horizontal axis and the second variable along the vertical axis. For each pair of observations of the two variables, we put a dot in the plane. There are as many dots in the plane as the number of paired observations. The direction of the dots shows the scatter or concentration of the various points. This will show the type of correlation.

M.Sc.- PAGE-01

1. If all the plotted points form a straight line from the lower left-hand
corner to the upper right-hand corner, then there is perfect positive correlation. We denote this as r = +1.

[Figure: two scatter diagrams drawn on X and Y axes through the origin O, one showing perfect positive correlation (r = +1) and the other perfect negative correlation (r = -1).]

2. If all the plotted dots lie on a straight line falling from the upper left-hand corner to the lower right-hand corner, there is perfect negative correlation between the two variables. In this case the coefficient of correlation takes the value r = -1.

Merits:
1. It is the simplest and most attractive method of finding the nature of correlation between two variables.
2. It is a non-mathematical method of studying correlation. It is easy to understand.
3. It is not affected by extreme items.
4. It is the first step in finding out the relation between two variables.
5. We can have a rough idea at a glance whether the correlation is positive or negative.

Demerits:
By this method we cannot get the exact degree of correlation between the two variables.

Types of Correlation:
Correlation is classified into various types. The most important ones are
i) Positive and negative.
ii) Linear and non-linear.
iii) Partial and total.
iv) Simple and multiple.

Linear and Non-linear correlation:
If the ratio of change between the two variables is a constant, then there will be linear correlation between them. Consider the following.

X    2    4    6    8   10   12
Y    3    6    9   12   15   18

Here the ratio of change between the two variables is the same. If we plot these points on a graph we get a straight line.

If the amount of change in one variable does not bear a constant ratio to the amount of change in the other, then the relation is called curvi-linear (or) non-linear correlation. The graph will be a curve.

Computation of correlation:
When there exists some relationship between two variables, we have to measure the degree of relationship. This measure is called the measure of correlation (or) correlation coefficient and it is denoted by r.

Co-variation:
The covariation between the variables x and y is defined as

\[ \operatorname{Cov}(x, y) = \frac{\sum (x - \bar{x})(y - \bar{y})}{n} \]

where \bar{x}, \bar{y} are respectively the means of x and y, and n is the number of pairs of observations.
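The covariance definition above can be sketched in a few lines of Python. This is only an illustration: the function name `covariance` is an assumption of this sketch, and the data are the linear X, Y series from the text (Y rises by 3 whenever X rises by 2).

```python
def covariance(xs, ys):
    """Covariance as defined above: sum((x - mean_x) * (y - mean_y)) / n."""
    if len(xs) != len(ys) or not xs:
        raise ValueError("xs and ys must be non-empty and of equal length")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n


# Linear series from the text.
xs = [2, 4, 6, 8, 10, 12]
ys = [3, 6, 9, 12, 15, 18]

# Ratio of change between consecutive observations (constant for a linear relation).
ratios = [(ys[i + 1] - ys[i]) / (xs[i + 1] - xs[i]) for i in range(len(xs) - 1)]

cov_xy = covariance(xs, ys)
print(ratios)   # every ratio is 1.5
print(cov_xy)   # 17.5
```

A positive covariance only says that the two series move in the same direction; the correlation coefficient introduced next rescales it so that the strength of the relationship can be compared across data sets.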
Karl Pearson's coefficient of correlation:
Karl Pearson, a great biometrician and statistician, suggested a mathematical method for measuring the magnitude of the linear relationship between two variables. It is the most widely used method in practice and is known as the Pearsonian coefficient of correlation. It is denoted by r. The formula for calculating r is

(i) \[ r = \frac{\operatorname{Cov}(x, y)}{\sigma_x \sigma_y} \] where \sigma_x, \sigma_y are the S.D. of x and y respectively.

(ii) \[ r = \frac{\sum XY}{n \, \sigma_x \sigma_y} \]

(iii) \[ r = \frac{\sum XY}{\sqrt{\sum X^2 \cdot \sum Y^2}}, \quad X = x - \bar{x}, \; Y = y - \bar{y} \]

When the deviations are taken from the actual mean we can apply any one of these methods. The simplest formula is the third one. It is easy to calculate, and it is not necessary to calculate the standard deviations of the x and y series.

Steps:
1. Find the means of the two series x and y.
2. Take deviations of the two series from \bar{x} and \bar{y}: X = x - \bar{x}, Y = y - \bar{y}.
3. Square the deviations and get the totals of the respective squares of deviations of x and y; denote them by \sum X^2 and \sum Y^2 respectively.
4. Multiply the deviations of x and y, get the total and divide by n. This is the covariance.
5. Substitute the values in the formula.

\[ r = \frac{\operatorname{Cov}(x, y)}{\sigma_x \sigma_y} = \frac{\sum (x - \bar{x})(y - \bar{y}) / n}{\sqrt{\dfrac{\sum (x - \bar{x})^2}{n}} \, \sqrt{\dfrac{\sum (y - \bar{y})^2}{n}}} \]
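The five steps above can be sketched as follows. This is a minimal illustration, not a library routine: the function name `pearson_r` is an assumption of this sketch, and the data are the father and son heights used in Example 1.

```python
import math


def pearson_r(xs, ys):
    """Pearson's r from deviations about the actual means (formula iii)."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two series of equal length with at least 2 pairs")
    n = len(xs)
    # Step 1: means of the two series.
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Step 2: deviations X = x - mean_x, Y = y - mean_y.
    dev_x = [x - mean_x for x in xs]
    dev_y = [y - mean_y for y in ys]
    # Step 3: totals of squared deviations.
    sum_x2 = sum(d * d for d in dev_x)
    sum_y2 = sum(d * d for d in dev_y)
    # Step 4: total of the products of deviations (n cancels in the ratio).
    sum_xy = sum(a * b for a, b in zip(dev_x, dev_y))
    # Step 5: substitute in r = sum(XY) / sqrt(sum(X^2) * sum(Y^2)).
    denominator = math.sqrt(sum_x2 * sum_y2)
    if denominator == 0:
        raise ValueError("r is undefined when a series has zero variance")
    return sum_xy / denominator


fathers = [64, 65, 66, 67, 68, 69, 70]
sons = [66, 67, 65, 68, 70, 68, 72]
r = pearson_r(fathers, sons)
print(round(r, 2))  # 0.81
```

Because the factor n cancels between the covariance and the two standard deviations, the sketch works with the raw totals rather than dividing each of them by n.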
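The above formula is simplified as follows:

\[ r = \frac{\sum XY}{\sqrt{\sum X^2 \cdot \sum Y^2}}, \quad X = x - \bar{x}, \; Y = y - \bar{y} \]

Example 1:
Find Karl Pearson's coefficient of correlation from the following data between height of father (x) and son (y).

x    64   65   66   67   68   69   70
y    66   67   65   68   70   68   72

Comment on the result.

Solution:

  x     y    X = x - 67    X^2    Y = y - 68    Y^2    XY
 64    66       -3          9        -2          4      6
 65    67       -2          4        -1          1      2
 66    65       -1          1        -3          9      3
 67    68        0          0         0          0      0
 68    70        1          1         2          4      2
 69    68        2          4         0          0      0
 70    72        3          9         4         16     12
469   476        0         28         0         34     25

\[ \bar{x} = \frac{469}{7} = 67; \quad \bar{y} = \frac{476}{7} = 68 \]

\[ r = \frac{\sum XY}{\sqrt{\sum X^2 \cdot \sum Y^2}} = \frac{25}{\sqrt{28 \times 34}} = \frac{25}{\sqrt{952}} = \frac{25}{30.85} = 0.81 \]

Since r = +0.81, the variables are highly positively correlated, i.e. tall fathers have tall sons.

Working rule (i)
We can also find r with the following formula. We have

\[ r = \frac{\operatorname{Cov}(x, y)}{\sigma_x \sigma_y} \]

\[ \operatorname{Cov}(x, y) = \frac{\sum (x - \bar{x})(y - \bar{y})}{n} = \frac{\sum (xy - \bar{x} y - \bar{y} x + \bar{x}\bar{y})}{n} \]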
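\[ \operatorname{Cov}(x, y) = \frac{\sum xy}{n} - \bar{x}\,\frac{\sum y}{n} - \bar{y}\,\frac{\sum x}{n} + \frac{n \bar{x}\bar{y}}{n} \]

\[ \operatorname{Cov}(x, y) = \frac{\sum xy}{n} - \bar{x}\bar{y} - \bar{x}\bar{y} + \bar{x}\bar{y} = \frac{\sum xy}{n} - \bar{x}\bar{y} \]

\[ \sigma_x^2 = \frac{\sum x^2}{n} - \bar{x}^2, \qquad \sigma_y^2 = \frac{\sum y^2}{n} - \bar{y}^2 \]

Now

\[ r = \frac{\operatorname{Cov}(x, y)}{\sigma_x \sigma_y} = \frac{\dfrac{\sum xy}{n} - \dfrac{\sum x}{n} \cdot \dfrac{\sum y}{n}}{\sqrt{\dfrac{\sum x^2}{n} - \left(\dfrac{\sum x}{n}\right)^2} \, \sqrt{\dfrac{\sum y^2}{n} - \left(\dfrac{\sum y}{n}\right)^2}} \]

\[ r = \frac{n \sum xy - (\sum x)(\sum y)}{\sqrt{[\,n \sum x^2 - (\sum x)^2\,][\,n \sum y^2 - (\sum y)^2\,]}} \]

Note: In the above method we need not find the mean or standard deviation of the variables separately.

Example 2:
Calculate the coefficient of correlation from the following data.

X    1   2   3   4   5   6   7   8   9
Y    9   8  10  12  11  13  14  16  15

 x     y    x^2    y^2    xy
 1     9      1     81     9
 2     8      4     64    16
 3    10      9    100    30
 4    12     16    144    48
 5    11     25    121    55
 6    13     36    169    78
 7    14     49    196    98
 8    16     64    256   128
 9    15     81    225   135
45   108    285   1356   597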
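\[ r = \frac{n \sum xy - (\sum x)(\sum y)}{\sqrt{[\,n \sum x^2 - (\sum x)^2\,][\,n \sum y^2 - (\sum y)^2\,]}} \]

\[ r = \frac{9 \times 597 - 45 \times 108}{\sqrt{(9 \times 285 - 45^2)(9 \times 1356 - 108^2)}} = \frac{5373 - 4860}{\sqrt{(2565 - 2025)(12204 - 11664)}} = \frac{513}{\sqrt{540 \times 540}} = \frac{513}{540} = 0.95 \]

Working rule (ii) (shortcut method)
We have

\[ r = \frac{\operatorname{Cov}(x, y)}{\sigma_x \sigma_y}, \quad \text{where} \quad \operatorname{Cov}(x, y) = \frac{\sum (x - \bar{x})(y - \bar{y})}{n} \]

Take the deviation from x as x - A and the deviation from y as y - B.

\[ \operatorname{Cov}(x, y) = \frac{1}{n} \sum [(x - A) - (\bar{x} - A)]\,[(y - B) - (\bar{y} - B)] \]

\[ = \frac{1}{n} \sum [(x - A)(y - B) - (x - A)(\bar{y} - B) - (\bar{x} - A)(y - B) + (\bar{x} - A)(\bar{y} - B)] \]

\[ = \frac{\sum (x - A)(y - B)}{n} - (\bar{y} - B)\,\frac{\sum (x - A)}{n} - (\bar{x} - A)\,\frac{\sum (y - B)}{n} + \frac{\sum (\bar{x} - A)(\bar{y} - B)}{n} \]

\[ = \frac{\sum (x - A)(y - B)}{n} - (\bar{y} - B)\left(\frac{\sum x}{n} - \frac{nA}{n}\right) - (\bar{x} - A)\left(\frac{\sum y}{n} - \frac{nB}{n}\right) + (\bar{x} - A)(\bar{y} - B) \]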
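\[ \operatorname{Cov}(x, y) = \frac{\sum (x - A)(y - B)}{n} - (\bar{y} - B)(\bar{x} - A) - (\bar{x} - A)(\bar{y} - B) + (\bar{x} - A)(\bar{y} - B) \]

\[ = \frac{\sum (x - A)(y - B)}{n} - (\bar{x} - A)(\bar{y} - B) \]

Let x - A = u; y - B = v; \bar{x} - A = \bar{u}; \bar{y} - B = \bar{v}. Then

\[ \operatorname{Cov}(x, y) = \frac{\sum uv}{n} - \bar{u}\bar{v} \]

\[ \sigma_x^2 = \frac{\sum u^2}{n} - \bar{u}^2 = \sigma_u^2, \qquad \sigma_y^2 = \frac{\sum v^2}{n} - \bar{v}^2 = \sigma_v^2 \]

\[ r = \frac{n \sum uv - (\sum u)(\sum v)}{\sqrt{[\,n \sum u^2 - (\sum u)^2\,][\,n \sum v^2 - (\sum v)^2\,]}} \]

Example 3:
Find Karl Pearson's coefficient of correlation from the following data between height of father (x) and son (y).

x    64   65   66   67   68   69   70
y    66   67   65   68   70   68   72

Comment on the result.

Solution:

  x     y    X = x - 67    X^2    Y = y - 68    Y^2    XY
 64    66       -3          9        -2          4      6
 65    67       -2          4        -1          1      2
 66    65       -1          1        -3          9      3
 67    68        0          0         0          0      0
 68    70        1          1         2          4      2
 69    68        2          4         0          0      0
 70    72        3          9         4         16     12
469   476        0         28         0         34     25

\[ \bar{x} = \frac{469}{7} = 67; \quad \bar{y} = \frac{476}{7} = 68 \]

\[ r = \frac{\sum XY}{\sqrt{\sum X^2 \cdot \sum Y^2}} = \frac{25}{\sqrt{28 \times 34}} = \frac{25}{\sqrt{952}} = \frac{25}{30.85} = 0.81 \]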
Example 4:
Calculate Pearson's coefficient of correlation.

X    45   55   56   58   60   65   68   70   75   80   85
Y    56   50   48   60   62   64   65   70   74   82   90

Taking A = 65 and B = 70:

 X     Y    u = X - A    v = Y - B    u^2    v^2     uv
45    56      -20          -14        400    196    280
55    50      -10          -20        100    400    200
56    48       -9          -22         81    484    198
58    60       -7          -10         49    100     70
60    62       -5           -8         25     64     40
65    64        0           -6          0     36      0
68    65        3           -5          9     25    -15
70    70        5            0         25      0      0
75    74       10            4        100     16     40
80    82       15           12        225    144    180
85    90       20           20        400    400    400
Total           2          -49       1414   1865   1393

\[ r = \frac{n \sum uv - (\sum u)(\sum v)}{\sqrt{[\,n \sum u^2 - (\sum u)^2\,][\,n \sum v^2 - (\sum v)^2\,]}} \]

\[ r = \frac{11 \times 1393 - (2)(-49)}{\sqrt{(1414 \times 11 - 2^2)(1865 \times 11 - (-49)^2)}} = \frac{15421}{\sqrt{15550 \times 18114}} = \frac{15421}{16783.11} = +0.92 \]
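The shortcut method can be checked numerically. The sketch below is an illustration only: the function name `pearson_r_assumed_means` is an assumption, and A = 65, B = 70 are the assumed means implied by the u and v columns of Example 4. Calling the same function with A = B = 0 gives the direct formula, so the two results should agree.

```python
import math


def pearson_r_assumed_means(xs, ys, a, b):
    """Pearson's r from deviations u = x - a, v = y - b about assumed means."""
    n = len(xs)
    us = [x - a for x in xs]
    vs = [y - b for y in ys]
    sum_u, sum_v = sum(us), sum(vs)
    sum_u2 = sum(u * u for u in us)
    sum_v2 = sum(v * v for v in vs)
    sum_uv = sum(u * v for u, v in zip(us, vs))
    numerator = n * sum_uv - sum_u * sum_v
    denominator = math.sqrt((n * sum_u2 - sum_u ** 2) * (n * sum_v2 - sum_v ** 2))
    return numerator / denominator


xs = [45, 55, 56, 58, 60, 65, 68, 70, 75, 80, 85]
ys = [56, 50, 48, 60, 62, 64, 65, 70, 74, 82, 90]

r_assumed = pearson_r_assumed_means(xs, ys, a=65, b=70)  # shortcut method
r_raw = pearson_r_assumed_means(xs, ys, a=0, b=0)        # no shift: direct formula
print(round(r_assumed, 2), round(r_raw, 2))  # 0.92 0.92
```

The choice of A and B affects only the size of the intermediate numbers, which is why values near the middle of each series are convenient for hand calculation.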
Example 5:
Calculate the correlation co-efficient for the following heights (in inches) of fathers (X) and their sons (Y).

X :   65   66   67   67   68   69   70   72
Y :   67   68   65   68   72   72   69   71

Solution:

\[ \bar{X} = \frac{\sum X}{n} = \frac{544}{8} = 68; \qquad \bar{Y} = \frac{\sum Y}{n} = \frac{552}{8} = 69 \]

  X     Y    x = X - 68    y = Y - 69    x^2    y^2    xy
 65    67       -3            -2           9      4     6
 66    68       -2            -1           4      1     2
 67    65       -1            -4           1     16     4
 67    68       -1            -1           1      1     1
 68    72        0             3           0      9     0
 69    72        1             3           1      9     3
 70    69        2             0           4      0     0
 72    71        4             2          16      4     8
544   552        0             0          36     44    24

Karl Pearson correlation co-efficient:

\[ r(X, Y) = \frac{\sum xy}{\sqrt{\sum x^2 \, \sum y^2}} = \frac{24}{\sqrt{36 \times 44}} = 0.603 \]

Since r(X, Y) = 0.603, the variables X and Y are positively correlated, i.e. the heights of fathers and their respective sons are said to be positively correlated.

Example 6:
Calculate the correlation co-efficient from the data below:

X :   1   2   3   4   5   6   7   8   9
Y :   9   8  10  12  11  13  14  16  15

Solution:

 X     Y    X^2    Y^2    XY
 1     9      1     81     9
 2     8      4     64    16
 3    10      9    100    30
 4    12     16    144    48
 5    11     25    121    55
 6    13     36    169    78
 7    14     49    196    98
 8    16     64    256   128
 9    15     81    225   135
45   108    285   1356   597
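Example 8:
Calculate the coefficient of correlation from the following data.

X    1   2   3   4   5   6   7   8   9
Y    9   8  10  12  11  13  14  16  15

 x     y    x^2    y^2    xy
 1     9      1     81     9
 2     8      4     64    16
 3    10      9    100    30
 4    12     16    144    48
 5    11     25    121    55
 6    13     36    169    78
 7    14     49    196    98
 8    16     64    256   128
 9    15     81    225   135
45   108    285   1356   597

\[ r = \frac{n \sum xy - (\sum x)(\sum y)}{\sqrt{[\,n \sum x^2 - (\sum x)^2\,][\,n \sum y^2 - (\sum y)^2\,]}} \]

\[ r = \frac{9 \times 597 - 45 \times 108}{\sqrt{(9 \times 285 - 45^2)(9 \times 1356 - 108^2)}} = \frac{5373 - 4860}{\sqrt{(2565 - 2025)(12204 - 11664)}} = \frac{513}{\sqrt{540 \times 540}} = \frac{513}{540} = 0.95 \]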
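The direct formula needs only five totals, so the calculation above can be sketched as a function of those totals. The function name `pearson_r_from_sums` is an assumption of this sketch; the totals are recomputed from the X and Y series of the example rather than typed in.

```python
import math


def pearson_r_from_sums(n, sum_x, sum_y, sum_x2, sum_y2, sum_xy):
    """r = [n*Sxy - Sx*Sy] / sqrt([n*Sx2 - Sx^2] * [n*Sy2 - Sy^2])."""
    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    if denominator == 0:
        raise ValueError("r is undefined when a series has zero variance")
    return numerator / denominator


xs = list(range(1, 10))                       # 1, 2, ..., 9
ys = [9, 8, 10, 12, 11, 13, 14, 16, 15]

sums = (
    len(xs),
    sum(xs),
    sum(ys),
    sum(x * x for x in xs),
    sum(y * y for y in ys),
    sum(x * y for x, y in zip(xs, ys)),
)
r = pearson_r_from_sums(*sums)
print(sums)          # (9, 45, 108, 285, 1356, 597)
print(round(r, 2))   # 0.95
```

Keeping the totals separate from the formula also covers problems where only n and the sums are given, not the individual observations.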
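\[ r(X, Y) = \frac{N \sum XY - \sum X \sum Y}{\sqrt{N \sum X^2 - (\sum X)^2} \, \sqrt{N \sum Y^2 - (\sum Y)^2}} = \frac{9(597) - (45)(108)}{\sqrt{9(285) - (45)^2} \, \sqrt{9(1356) - (108)^2}} = 0.95 \]

X and Y are highly positively correlated.

Example 9:
Calculate the correlation co-efficient for the ages of husbands (X) and their wives (Y).

X :   23   27   28   29   30   31   33   35   36   39
Y :   18   22   23   24   25   26   28   29   30   32

Solution:
Let A = 30 and B = 26; then d_x = X - A and d_y = Y - B.

 X     Y    d_x    d_y    d_x^2    d_y^2    d_x d_y
23    18    -7     -8      49       64        56
27    22    -3     -4       9       16        12
28    23    -2     -3       4        9         6
29    24    -1     -2       1        4         2
30    25     0     -1       0        1         0
31    26     1      0       1        0         0
33    28     3      2       9        4         6
35    29     5      3      25        9        15
36    30     6      4      36       16        24
39    32     9      6      81       36        54
Total       11     -3     215      159       175

\[ r(X, Y) = \frac{N \sum d_x d_y - \sum d_x \sum d_y}{\sqrt{N \sum d_x^2 - (\sum d_x)^2} \, \sqrt{N \sum d_y^2 - (\sum d_y)^2}} \]

\[ = \frac{10(175) - (11)(-3)}{\sqrt{10(215) - (11)^2} \, \sqrt{10(159) - (-3)^2}} = \frac{1783}{\sqrt{2029 \times 1581}} = \frac{1783}{1791.05} \approx 0.995 \]

X and Y are highly positively correlated, i.e. the ages of husbands and their wives have a high degree of correlation.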
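Example 9:
Calculate the correlation co-efficient from the following data:
N = 25, \sum X = 125, \sum Y = 100, \sum X^2 = 650, \sum Y^2 = 436, \sum XY = 520.

Solution:
We know

\[ r = \frac{N \sum XY - \sum X \sum Y}{\sqrt{N \sum X^2 - (\sum X)^2} \, \sqrt{N \sum Y^2 - (\sum Y)^2}} = \frac{25(520) - (125)(100)}{\sqrt{25(650) - (125)^2} \, \sqrt{25(436) - (100)^2}} = \frac{500}{25 \times 30} \]

r = 0.667

Properties of Correlation:
1. The correlation coefficient lies between -1 and +1, i.e. -1 \le r \le +1.

Let

\[ X = \frac{x - \bar{x}}{\sigma_x}; \qquad Y = \frac{y - \bar{y}}{\sigma_y} \]

Since \sum (X + Y)^2, being a sum of squares, is always non-negative,

\[ \sum (X + Y)^2 \ge 0 \]

\[ \sum X^2 + \sum Y^2 + 2 \sum XY \ge 0 \]

\[ \frac{\sum (x - \bar{x})^2}{\sigma_x^2} + \frac{\sum (y - \bar{y})^2}{\sigma_y^2} + \frac{2 \sum (x - \bar{x})(y - \bar{y})}{\sigma_x \sigma_y} \ge 0 \]

Dividing by n we get

\[ \frac{1}{\sigma_x^2} \cdot \frac{1}{n} \sum (x - \bar{x})^2 + \frac{1}{\sigma_y^2} \cdot \frac{1}{n} \sum (y - \bar{y})^2 + \frac{2}{\sigma_x \sigma_y} \cdot \frac{1}{n} \sum (x - \bar{x})(y - \bar{y}) \ge 0 \]

\[ \frac{1}{\sigma_x^2} \cdot \sigma_x^2 + \frac{1}{\sigma_y^2} \cdot \sigma_y^2 + \frac{2}{\sigma_x \sigma_y} \cdot \operatorname{cov}(x, y) \ge 0 \]

\[ 1 + 1 + 2r \ge 0 \]
\[ 2 + 2r \ge 0 \]
\[ 2(1 + r) \ge 0 \]
\[ (1 + r) \ge 0 \]
\[ -1 \le r \qquad \text{-------------(1)} \]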
Similarly,

\[ \sum (X - Y)^2 \ge 0 \]
\[ 2(1 - r) \ge 0 \]
\[ 1 - r \ge 0 \]
\[ r \le +1 \qquad \text{--------------(2)} \]

(1) + (2) gives -1 \le r \le 1.

Note: r = +1 means perfect positive correlation and r = -1 means perfect negative correlation between the variables.

Property 2: r is independent of change of origin and scale.
Property 3: It is a pure number, independent of the units of measurement.
Property 4: Independent variables are uncorrelated, but the converse is not true.
Property 5: The correlation coefficient is the geometric mean of the two regression coefficients.
Property 6: The correlation coefficient of x and y is symmetric: r_{xy} = r_{yx}.

Limitations:
1. The correlation coefficient assumes a linear relationship, regardless of whether that assumption is correct or not.
2. The correlation coefficient is unduly affected by extreme items of the variables.
3. The existence of correlation does not necessarily indicate a cause-effect relation.
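Several of these properties can be checked numerically. The sketch below is an illustration under stated assumptions: the helper `pearson_r`, the shift and scale constants, and the small series `zs`, `ws` are all choices made for this sketch. The scale factors are taken with the same sign; factors of opposite sign would reverse the sign of r.

```python
import math


def pearson_r(xs, ys):
    """Pearson's r from raw sums (direct formula)."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    numerator = n * sum_xy - sum_x * sum_y
    return numerator / math.sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))


xs = [1, 2, 3, 4, 5, 6, 7, 8, 9]
ys = [9, 8, 10, 12, 11, 13, 14, 16, 15]
r_xy = pearson_r(xs, ys)

# Property 2: change of origin (subtract a constant) and scale (divide by a positive constant).
us = [(x - 5) / 2 for x in xs]
vs = [(y - 12) / 3 for y in ys]
r_uv = pearson_r(us, vs)

# Property 6: symmetry.
r_yx = pearson_r(ys, xs)

# Property 4 (converse fails) and Limitation 1: w depends exactly on z, yet r = 0,
# because the relation w = z^2 is not linear.
zs = [-2, -1, 0, 1, 2]
ws = [z * z for z in zs]
r_zw = pearson_r(zs, ws)

print(round(r_xy, 2), round(r_uv, 2), round(r_yx, 2), r_zw)
```

The last pair is a reminder that r = 0 rules out a linear relationship only; a scatter diagram is still needed to spot a curvi-linear one.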
Rank Correlation:
It is studied when no assumption about the parameters of the population is made. This method is based on ranks. It is useful for studying qualitative attributes like honesty, colour, beauty, intelligence, character, morality etc. The individuals in the group can be arranged in order, thereby obtaining for each individual a number showing his/her rank in the group. This method was developed by Charles Edward Spearman in 1904. It is defined as

\[ r = 1 - \frac{6 \sum D^2}{n^3 - n} \]

where
r = rank correlation coefficient,
\sum D^2 = sum of squares of differences between the pairs of ranks,
n = number of pairs of observations.

Note: Some authors use the symbol \rho for rank correlation.

The value of r lies between -1 and +1. If r = +1, there is complete agreement in the order of ranks and the direction of ranks is also the same. If r = -1, there is complete disagreement in the order of ranks and they are in opposite directions.

Computation for tied observations:
There may be two or more items having equal values. In such a case the same rank is to be given, and the ranking is said to be tied. In such circumstances an average rank is to be given to each individual item. For example, if a value is repeated twice at the 5th rank, the common rank to be assigned to each item is (5 + 6)/2 = 5.5, the average of 5 and 6, and it appears twice.

If the ranks are tied, it is required to apply a correction factor, which is \frac{1}{12}(m^3 - m). A slightly different formula is used when there is more than one item having the same value. The formula is

\[ r = 1 - \frac{6\left[\sum D^2 + \frac{1}{12}(m^3 - m) + \frac{1}{12}(m^3 - m) + \dots\right]}{n^3 - n} \]
where m is the number of items whose ranks are common; the correction term should be repeated as many times as there are sets of tied observations.

Example 10:
In a marketing survey the prices of tea and coffee in a town, based on quality, were found as shown below. Could you find any relation between tea and coffee price?

Price of tea       88    90    95    70    60    75    50
Price of coffee   120   134   150   115   110   140   100

Price of tea   Rank   Price of coffee   Rank    D    D^2
     88          3         120            4     1     1
     90          2         134            3     1     1
     95          1         150            1     0     0
     70          5         115            5     0     0
     60          6         110            6     0     0
     75          4         140            2     2     4
     50          7         100            7     0     0
                                          \sum D^2 =  6

\[ r = 1 - \frac{6 \sum D^2}{n^3 - n} = 1 - \frac{6 \times 6}{7^3 - 7} = 1 - \frac{36}{336} = 1 - 0.1071 = 0.8929 \]

The relation between the price of tea and coffee is positive at 0.89. Based on quality, the association between the price of tea and the price of coffee is highly positive.

Example 11:
In an evaluation of answer scripts the following marks are awarded by the examiners.

1st   88   95   70   60   50   80   75   85
2nd   84   90   88   55   48   85   82   72
Do you agree that the evaluation by the two examiners is fair?

 x    R1     y    R2    D    D^2
88     2    84     4    2     4
95     1    90     1    0     0
70     6    88     2    4    16
60     7    55     7    0     0
50     8    48     8    0     0
80     4    85     3    1     1
75     5    82     5    0     0
85     3    72     6    3     9
                   \sum D^2 = 30

\[ r = 1 - \frac{6 \sum D^2}{n^3 - n} = 1 - \frac{6 \times 30}{8^3 - 8} = 1 - \frac{180}{504} = 1 - 0.357 = 0.643 \]

r = 0.643 shows fairness in awarding marks, in the sense that there is uniformity between the two examiners in evaluating the answer scripts.

Example 12:
Rank correlation for tied observations. Following are the marks obtained by 10 students in a class in two tests.

Students    A    B    C    D    E    F    G    H    I    J
Test 1     70   68   67   55   60   60   75   63   60   72
Test 2     65   65   80   60   68   58   75   62   60   70

Calculate the rank correlation coefficient between the marks of the two tests.

Student   Test 1   R1   Test 2    R2       D      D^2
   A        70      3     65      5.5    -2.5     6.25
   B        68      4     65      5.5    -1.5     2.25
   C        67      5     80      1.0     4.0    16.00
   D        55     10     60      8.5     1.5     2.25
   E        60      8     68      4.0     4.0    16.00
   F        60      8     58     10.0    -2.0     4.00
   G        75      1     75      2.0    -1.0     1.00
   H        63      6     62      7.0    -1.0     1.00
   I        60      8     60      8.5    -0.5     0.25
   J        72      2     70      3.0    -1.0     1.00
                                 \sum D^2 =      50.00
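The ranking step in the table above, including average ranks for ties and the tie-corrected formula stated earlier, can be sketched as follows. The function names are assumptions of this sketch, ranks run from the largest value (rank 1) downwards as in the table, and student H's Test 2 mark is taken as 62, following the worked table.

```python
from collections import Counter


def average_ranks(values):
    """Rank from largest (rank 1) to smallest; tied values share the average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i], reverse=True)
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the group while the next value equals the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        shared = (pos + 1 + end + 1) / 2  # average of 1-based positions pos+1 .. end+1
        for k in range(pos, end + 1):
            ranks[order[k]] = shared
        pos = end + 1
    return ranks


def tie_correction(values):
    """Sum of (m^3 - m) / 12 over every group of m tied values."""
    return sum((m ** 3 - m) / 12 for m in Counter(values).values() if m > 1)


def spearman_r_with_ties(xs, ys):
    n = len(xs)
    rank_x, rank_y = average_ranks(xs), average_ranks(ys)
    sum_d2 = sum((a - b) ** 2 for a, b in zip(rank_x, rank_y))
    corrected = sum_d2 + tie_correction(xs) + tie_correction(ys)
    return 1 - 6 * corrected / (n ** 3 - n)


test1 = [70, 68, 67, 55, 60, 60, 75, 63, 60, 72]
test2 = [65, 65, 80, 60, 68, 58, 75, 62, 60, 70]

r1 = average_ranks(test1)
r2 = average_ranks(test2)
sum_d2 = sum((a - b) ** 2 for a, b in zip(r1, r2))
r = spearman_r_with_ties(test1, test2)
print(r1)      # ranks for Test 1
print(r2)      # ranks for Test 2
print(sum_d2)  # 50.0
print(round(r, 2))
```

When no values are tied, `tie_correction` returns 0 and the function reduces to the plain formula r = 1 - 6ΣD²/(n³ - n).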
60 is repeated 3 times in Test 1. 60 and 65 are each repeated twice in Test 2. Hence m_1 = 3; m_2 = 2; m_3 = 2.

\[ r = 1 - \frac{6\left[\sum D^2 + \frac{1}{12}(m_1^3 - m_1) + \frac{1}{12}(m_2^3 - m_2) + \frac{1}{12}(m_3^3 - m_3)\right]}{n^3 - n} \]

\[ = 1 - \frac{6\left[50 + \frac{1}{12}(3^3 - 3) + \frac{1}{12}(2^3 - 2) + \frac{1}{12}(2^3 - 2)\right]}{10^3 - 10} \]

\[ = 1 - \frac{6[50 + 2 + 0.5 + 0.5]}{990} = 1 - \frac{6 \times 53}{990} = \frac{672}{990} = 0.68 \]

Interpretation: There is uniformity in the performance of students in the two tests.

Exercise 8

I. Choose the correct answer:

1. Limits for correlation coefficient.
(a) -1 \le r \le 1   (b) 0 \le r \le 1   (c) -1 \le r \le 0   (d) 1 \le r \le 2

2. The coefficient of correlation
(a) cannot be negative   (b) cannot be positive   (c) always positive   (d) can either be positive or negative

3. The product moment correlation coefficient is obtained by
(a) r = \frac{\sum XY}{xy}   (b) r = \frac{\sum XY}{n \, \sigma_x \sigma_y}   (c) r = \frac{\sum XY}{n \, \sigma_x}   (d) none of these

4. If cov(x, y) = 0 then
(a) x and y are correlated   (b) x and y are uncorrelated   (c) none   (d) x and y are linearly related
5. If r = 0 then cov(x, y) is
(a) 0   (b) -1   (c) 1   (d) 0.2

6. Rank correlation coefficient is given by
(a) 1 + \frac{6 \sum D^2}{n^3 - n}   (b) 1 - \frac{6 \sum D^2}{n^2 - n}   (c) 1 - \frac{6 \sum D^2}{n^3 - n}   (d) 1 - \frac{6 \sum D^2}{n^3 + n}

7. If cov(x, y) = \sigma_x \sigma_y then
(a) r = +1   (b) r = 0   (c) r = 2   (d) r = -1

8. If \sum D^2 = 0, rank correlation is
(a) 0   (b) 1   (c) 0.5   (d) -1

9. Correlation coefficient is independent of change of
(a) Origin   (b) Scale   (c) Origin and Scale   (d) None

10. Rank correlation was found by
(a) Pearson   (b) Spearman   (c) Galton   (d) Fisher

II. Fill in the blanks:

11. Correlation coefficient is free from ______.
12. The diagrammatic representation of two variables is called ______.
13. The relationship between three or more variables is studied with the help of ______ correlation.
14. Product moment correlation was found by ______.
15. When r = +1, there is ______ correlation.
16. If r_{xy} = r_{yx}, correlation between x and y is ______.
17. Rank correlation is useful to study ______ characteristics.
18. The nature of correlation for shoe size and IQ is ______.

III. Answer the following:

19. What is correlation?
20. Distinguish between positive and negative correlation.
21. Define Karl Pearson's coefficient of correlation. Interpret r when r = 1, -1 and 0.
22. What is a scatter diagram? How is it useful in the study of correlation?
23. Distinguish between linear and non-linear correlation.
24. Mention important properties of the correlation coefficient.
25. Prove that the correlation coefficient lies between -1 and +1.
26. Show that the correlation coefficient is independent of change of origin and scale.
27. What is rank correlation? What are its merits and demerits?
28. Explain different types of correlation with examples.
29. Distinguish between Karl Pearson's coefficient of correlation and Spearman's correlation coefficient.
30. For 10 observations \sum x = 130; \sum y = 220; \sum x^2 = 2290; \sum y^2 = 5510; \sum xy = 3467. Find r.
31. Cov(x, y) = 18.6; var(x) = 20.2; var(y) = 23.7. Find r.
32. Given that r = 0.42, cov(x, y) = 10.5 and v(x) = 16, find the standard deviation of y.
33. Rank correlation coefficient r = 0.8, \sum D^2 = 33. Find n.

Karl Pearson Correlation:

34. Compute the coefficient of correlation of the following scores of A and B.

A    5   10    5   11   12    4    3    2    7    1
B    1    6    2    8    5    1    4    6    5    2

35. Calculate the coefficient of correlation between price and supply. Interpret the value of the correlation coefficient.

Price      8   10   15   17   20   22   24   25
Supply    25   30   32   35   37   40   42   45

36. Find Karl Pearson's coefficient of correlation in the following series relating to prices and supply of a commodity.

Price (Rs.)     11   12   13   14   15   16   17   18   19   20
Supply (Rs.)    30   29   29   25   24   24   24   21   18   15

37. Find the correlation coefficient between the marks obtained by ten students in economics and statistics.

Marks (in economics)     70   68   67   55   60   60   75   63   60   72
Marks (in statistics)    65   65   80   60   68   58   75   62   60   70
RANK CORRELATION:

46. Two judges gave the following ranks to eight competitors in a beauty contest. Examine the relationship between their judgements.

Judge A    4   5   1   2   3   6   7   8
Judge B    8   6   2   3   1   4   5   7

47. From the following data, calculate the coefficient of rank correlation.

X    36   56   20   65   42   33   44   50   15   60
Y    50   35   70   25   58   75   60   45   80   38

48. Calculate Spearman's coefficient of rank correlation for the following data.

X    53   98   95   81   75   71   59   55
Y    47   25   32   37   30   40   39   45

49. Apply Spearman's rank difference method and calculate the coefficient of correlation between x and y from the data given below. [Some digits of this table are missing in the source; the values are shown as they appear.]

X    8 31 3 9 31 7 31 18
Y    18 5 5 37 31 35 31 9 18 0

50. Find the rank correlation coefficient.

Marks in Test I     70   68   67   55   60   60   75   63   60   72
Marks in Test II    65   65   80   60   68   58   75   62   60   70

51. Calculate Spearman's rank correlation coefficient for the following table of marks of students in two subjects.

First subject     80   64   54   49   48   35   32   29   20   18   15   10
Second subject    36   38   39   41   27   43   45   52   51   42   40   52
Answers:

I. 1. (a)  2. (d)  3. (b)  4. (b)  5. (a)  6. (c)  7. (a)  8. (b)  9. (c)  10. (b)

II. 11. Units  12. Scatter diagram  13. Multiple  14. Pearson  15. Positive perfect  16. Symmetric  17. Qualitative  18. No correlation

III. 30. r = 0.9574  31. r = 0.85  32. \sigma_y = 6.25  33. n = 10  34. r = +0.58  35. r = +0.98  36. r = -0.96  37. r = +0.68  38. r = -0.9  39. r = +0.64  40. r = +0.1  41. r = +0.98  42. r = +0.746  43. r = +0.533  44. r = +0.596  45. r = +0.0945  46. r = +0.62  47. r = -0.93  48. r = -0.905  49. r = 0.34  50. r = 0.679  51. r = -0.685
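The answers to Exercises 30 to 33 follow directly from the formulas of this chapter, and the short sketch below works them out. It is only a check under stated assumptions: the sums used for Exercise 30 (Σx = 130, Σy = 220, Σx² = 2290, Σy² = 5510, Σxy = 3467) and r = 0.42 for Exercise 32 are the readings of the questions that reproduce the printed answers, and the variable names are choices made for this sketch.

```python
import math

# Exercise 30: r from the totals of 10 observations.
n = 10
sum_x, sum_y, sum_x2, sum_y2, sum_xy = 130, 220, 2290, 5510, 3467
r30 = (n * sum_xy - sum_x * sum_y) / math.sqrt(
    (n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2)
)

# Exercise 31: r = cov(x, y) / (sd_x * sd_y), with sd = sqrt(variance).
r31 = 18.6 / math.sqrt(20.2 * 23.7)

# Exercise 32: rearrange r = cov / (sd_x * sd_y) to get sd_y.
sd_y32 = 10.5 / (0.42 * math.sqrt(16))

# Exercise 33: r = 1 - 6*sum_d2 / (n^3 - n), so n^3 - n = 6*sum_d2 / (1 - r).
target = round(6 * 33 / (1 - 0.8))  # 990
n33 = next(k for k in range(2, 1000) if k ** 3 - k == target)

print(round(r30, 4), round(r31, 2), round(sd_y32, 2), n33)
```

Exercise 33 is solved by trying successive integers because n³ - n = 990 has a single small whole-number solution; a general cubic solver is not needed here.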