Chapter 8 Measures of Dispersion

Definition of Measures of Dispersion (page 31)

A measure of dispersion is a descriptive summary measure that helps us characterize the data set in terms of how varied the observations are from each other.

- A small value indicates that the observations are not too different from each other; that is, there is a concentration of observations about the center of the distribution.
- On the other hand, a large value indicates that the observations are very different from each other or they are widely spread out from the center.
- The smallest possible value of a measure of dispersion should be 0. A zero measure should indicate the absence of variation.
Illustration

A = {98, 98, 99, 99, 99, 100, 100, 100, 100, 100, 100, 100, 101, 101, 101, 102, 102}
B = {20, 30, 40, 50, 60, 70, 80, 90, 100, 110, 120, 130, 140, 150, 160, 170, 180}

The means of both collections are equal to 100. Their medians are also equal to 100.

- Which collection must have a higher measure of dispersion?
- For which collection is the mean a more reliable measure of central tendency? (Reliable in the sense that if we repeatedly select an observation at random from the collection, its value is usually not too different from the mean.)

The measure of dispersion serves as a measure of the reliability of the mean or median as measures of central tendency.

General Classifications of Measures of Dispersion (page 32)

Measures of Absolute Dispersion
A measure of absolute dispersion has the same unit as the observations. (Examples: range, interquartile range, standard deviation)

Measures of Relative Dispersion
A measure of relative dispersion has no unit and is therefore useful in comparing the variability of one distribution with another distribution. (Example: coefficient of variation)
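The comparison in the illustration can be checked numerically. The sketch below is only an illustration: it uses Python's standard `statistics` module and two measures that are defined later in the chapter (the range and the standard deviation, here with the population divisor), and the helper name `summarize` is an assumption made for this example.

```python
import statistics

# Collections A and B from the illustration.
A = [98, 98, 99, 99, 99, 100, 100, 100, 100, 100, 100, 100, 101, 101, 101, 102, 102]
B = [20, 30, 40, 50, 60, 70, 80, 90, 100, 110, 120, 130, 140, 150, 160, 170, 180]

def summarize(values):
    """Return (mean, median, range, population standard deviation)."""
    return (
        statistics.mean(values),
        statistics.median(values),
        max(values) - min(values),
        statistics.pstdev(values),
    )

summary_a = summarize(A)
summary_b = summarize(B)
print(summary_a)  # mean 100, median 100, range 4, sd about 1.14
print(summary_b)  # mean 100, median 100, range 160, sd about 48.99
```

Both collections share the same mean and median, yet every measure of dispersion is far larger for B, so the mean of A is the more reliable summary.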
Definition of Range (page 32)

Definition 8.1 The range is the distance between the maximum value and the minimum value.

Range = Maximum - Minimum

Sometimes the range is presented by stating the smallest and the largest values.

Example 8.1 (page 33)
Given the weights of 5 rabbits (in pounds), find the range.

8 pounds (lightest), 10 pounds, 12 pounds, 14 pounds, 15 pounds (heaviest)

Solution: The maximum is 15 pounds and the minimum is 8 pounds. Thus, the range of the weights of the rabbits is

Range = maximum - minimum = 15 - 8 = 7 pounds

We can also say that the weights of the rabbits range from 8 to 15 pounds.
Approximating the Range from the FDT (pages 34-35)

$$\text{Range} = UCL_{HCI} - LCL_{LCI}$$

where
$UCL_{HCI}$ = upper class limit of the last (highest) class interval
$LCL_{LCI}$ = lower class limit of the first (lowest) class interval

Example 8.5:

Age (in years)    No. of women
5-9               7     (lowest class interval)
10-14             10
15-19             13
20-29             8
30-34             5
35-39             3     (highest class interval)
Total             64

Range = 39 - 5 = 34 years

Characteristics of the Range (page 35)

It is a simple, easy-to-compute and easy-to-understand measure.

Weaknesses:
- It fails to communicate any information about the clustering or the lack of clustering of the values in the middle of the distribution since it uses only the extreme values (minimum and maximum).
- An outlier can greatly affect its value.
- It tends to be smaller for smaller collections than for larger collections.
- It cannot be approximated from frequency distributions with an open-ended class.
- It is not tractable mathematically.
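As a minimal sketch of the approximation above, the function below takes the lowest and highest class intervals as (lower limit, upper limit) pairs. The function name and the use of `None` to mark an open-ended class are assumptions made for this example, not notation from the text.

```python
def approximate_range(lowest_class, highest_class):
    """Approximate the range from the limits of the first and last classes.

    Each class is a (lower_limit, upper_limit) pair. A limit of None stands
    for an open-ended class, in which case no approximation is possible.
    """
    lower_limit = lowest_class[0]
    upper_limit = highest_class[1]
    if lower_limit is None or upper_limit is None:
        return None
    return upper_limit - lower_limit

# Example 8.5: lowest class 5-9, highest class 35-39.
print(approximate_range((5, 9), (35, 39)))    # 34
# An open-ended last class such as "35 and above" gives no approximation.
print(approximate_range((5, 9), (35, None)))  # None
```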
Definition of the Interquartile Range (IQR)

The interquartile range (IQR) is the difference between the third and first quartiles of the data set. That is,

$$IQR = Q_3 - Q_1$$

Remarks about the IQR:
- The interquartile range reflects the variability of the middle 50% of the observations in the array.
- The IQR may be viewed as the range for a trimmed data set where the smallest 25% and the largest 25% of observations have been removed. This modified range addresses the range's sensitivity to outliers.
- A shortcoming of the IQR is that it could be 0 even if there is still some variation among the smallest 25% and largest 25% of all observations.
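The last two remarks can be seen with a small made-up data set. This is a sketch under stated assumptions: the data are hypothetical, and the quartiles come from `statistics.quantiles` with its default "exclusive" method, which may not match the quartile convention used in the textbook.

```python
import statistics

def interquartile_range(values):
    """IQR = Q3 - Q1, using the default quartile method of statistics.quantiles."""
    q1, _median, q3 = statistics.quantiles(values, n=4)
    return q3 - q1

# Hypothetical data: the middle observations are identical, the tails are not.
data = [1, 5, 5, 5, 5, 5, 5, 9]
iqr = interquartile_range(data)
value_range = max(data) - min(data)

# Replacing the largest value with an outlier changes the range, not the IQR.
with_outlier = [1, 5, 5, 5, 5, 5, 5, 900]
iqr_outlier = interquartile_range(with_outlier)
range_outlier = max(with_outlier) - min(with_outlier)

print(iqr, value_range)            # 0.0 8
print(iqr_outlier, range_outlier)  # 0.0 899
```

The IQR is 0 in both cases even though the extreme observations clearly vary, while the range jumps from 8 to 899 because of a single outlier.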
Example
An economist studying the variation in family incomes in a community found that the first quartile income is P36,500 and the third quartile income is P120,000. Find the interquartile range.

$$IQR = Q_3 - Q_1 = 120{,}000 - 36{,}500 = \text{P}83{,}500$$

Definition of the Variance (page 37)

Definition 8.2. The population variance is the mean of the squared deviations between each observed value and the mean.

Population Variance:
$$\sigma^2 = \frac{1}{N}\sum_{i=1}^{N}(X_i - \mu)^2$$
where
$X_i$ is the measure taken from the $i$th element of the population
$\mu$ is the population mean
$N$ is the population size

Sample Variance:
$$s^2 = \frac{1}{n-1}\sum_{i=1}^{n}(X_i - \bar{X})^2$$
where
$X_i$ is the measure taken from the $i$th element of the sample
$\bar{X}$ is the sample mean
$n$ is the sample size
Remarks: (page 37)

- The population variance is a parameter while the sample variance is a statistic.
- The squared difference of an observation from the mean gives us an idea of how close this observation is to the mean. A large squared difference indicates that the observation and the mean are far from each other, while a small squared difference indicates that they are close to each other.
- A small variance indicates that the observations are highly concentrated about the mean, so that it is appropriate to use the mean to represent all of the values in the collection.
- The sample variance is not the mean of the squared deviations of the observations from the mean. The denominator of the sample variance is not $n$ (the size of the sample); rather, it is $(n-1)$. The reason for using $(n-1)$ as the divisor is that, in Inferential Statistics, the corresponding statistic with $n$ as the divisor tends to underestimate the population variance. The divisor $(n-1)$ makes up for this tendency to underestimate.
- The unit of the variance is the square of the unit of measure in the data set. Thus, strictly speaking, the variance is not a measure of absolute dispersion. Often, it is desirable to return to the original unit of measure, and so it is the standard deviation that is presented.

Definition of Standard Deviation (page 38)

The standard deviation is the positive square root of the variance.

The population standard deviation for a finite population with $N$ elements, denoted by the Greek letter $\sigma$ (lower case sigma), is:
$$\sigma = \sqrt{\frac{\sum_{i=1}^{N}(X_i - \mu)^2}{N}}$$

The sample standard deviation for a sample with $n$ elements, denoted by the letter $s$, is:
$$s = \sqrt{\frac{\sum_{i=1}^{n}(X_i - \bar{X})^2}{n-1}}$$
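The difference between the two divisors can be illustrated with a short computation. The data below are hypothetical and chosen only because the arithmetic is clean; the variable names are assumptions made for this sketch.

```python
import math
import statistics

# Hypothetical data set (not from the text).
data = [2, 4, 4, 4, 5, 5, 7, 9]
n = len(data)
mean = sum(data) / n                                # 5.0
sum_sq_dev = sum((x - mean) ** 2 for x in data)     # 32.0

population_variance = sum_sq_dev / n        # divisor N: treats data as a population
sample_variance = sum_sq_dev / (n - 1)      # divisor n - 1: treats data as a sample
population_sd = math.sqrt(population_variance)
sample_sd = math.sqrt(sample_variance)

print(population_variance, population_sd)   # 4.0 2.0
print(sample_variance, sample_sd)           # about 4.571 and 2.138

# The standard library uses the same two divisors.
assert math.isclose(population_variance, statistics.pvariance(data))
assert math.isclose(sample_variance, statistics.variance(data))
```

Dividing by n - 1 always gives the larger value, which is how the sample variance compensates for the tendency to underestimate.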
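Example 8.7 (pages 39-40)
Given the IQ of 7 students in the sample, compute the sample standard deviation.

Let $X_i$ = IQ of the $i$th student in the sample, $n$ = no. of students = 7.

X_i      X_i - 107    (X_i - 107)^2
100      -7           49
99       -8           64
110      3            9
105      -2           4
112      5            25
107      0            0
116      9            81
Total    749                       232

$$\bar{X} = \frac{749}{7} = 107, \qquad s = \sqrt{\frac{\sum_{i=1}^{7}(X_i - 107)^2}{7-1}} = \sqrt{\frac{232}{6}} \approx 6.22$$

Computational Formula of the Variance (page 41)

If the mean is a rounded figure, then rounding errors propagate very quickly when we use the definitional formula to compute the variance. We can avoid this by using the following computational formula for the variance:

Population Variance:
$$\sigma^2 = \frac{N\sum_{i=1}^{N}X_i^2 - \left(\sum_{i=1}^{N}X_i\right)^2}{N^2}$$

Sample Variance:
$$s^2 = \frac{n\sum_{i=1}^{n}X_i^2 - \left(\sum_{i=1}^{n}X_i\right)^2}{n(n-1)}$$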
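The definitional computation in Example 8.7 can be reproduced step by step. This is a minimal sketch; the variable names are chosen for the example.

```python
import math

# IQ scores of the 7 students in Example 8.7.
iq_scores = [100, 99, 110, 105, 112, 107, 116]
n = len(iq_scores)

sample_mean = sum(iq_scores) / n                         # 107.0
deviations = [x - sample_mean for x in iq_scores]        # -7, -8, 3, -2, 5, 0, 9
sum_sq_dev = sum(d ** 2 for d in deviations)             # 232.0

# Definitional formula: divide the sum of squared deviations by n - 1.
s_definitional = math.sqrt(sum_sq_dev / (n - 1))
print(round(s_definitional, 2))  # 6.22
```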
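Proof (shown for the population variance; the sample formula follows by the same steps with $n-1$ as the divisor)

$$\sigma^2 = \frac{1}{N}\sum_{i=1}^{N}(X_i-\mu)^2 = \frac{1}{N}\left[\sum_{i=1}^{N}X_i^2 - 2\mu\sum_{i=1}^{N}X_i + N\mu^2\right]$$

Since $\mu = \frac{1}{N}\sum_{i=1}^{N}X_i$,

$$\sigma^2 = \frac{1}{N}\left[\sum_{i=1}^{N}X_i^2 - \frac{2\left(\sum_{i=1}^{N}X_i\right)^2}{N} + \frac{\left(\sum_{i=1}^{N}X_i\right)^2}{N}\right] = \frac{1}{N}\left[\sum_{i=1}^{N}X_i^2 - \frac{\left(\sum_{i=1}^{N}X_i\right)^2}{N}\right] = \frac{N\sum_{i=1}^{N}X_i^2 - \left(\sum_{i=1}^{N}X_i\right)^2}{N^2}$$

Example (page 42)
Using the same data set on the IQ of 7 students in the sample, we compute the sample standard deviation using the computational formula.

X_i      X_i^2
100      10000
99       9801
110      12100
105      11025
112      12544
107      11449
116      13456
Total    749      80375

$$\sum_{i=1}^{7}X_i = 749, \qquad \sum_{i=1}^{7}X_i^2 = 80375$$

$$s = \sqrt{\frac{7\sum_{i=1}^{7}X_i^2 - \left(\sum_{i=1}^{7}X_i\right)^2}{7(7-1)}} = \sqrt{\frac{7(80375) - (749)^2}{7(6)}} \approx 6.22$$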
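The computational formula can be checked on the same seven IQ scores. In this sketch all intermediate quantities are integers, so no rounding occurs until the final division and square root; the variable names are assumptions made for the example.

```python
import math

iq_scores = [100, 99, 110, 105, 112, 107, 116]
n = len(iq_scores)

sum_x = sum(iq_scores)                        # 749
sum_x_sq = sum(x * x for x in iq_scores)      # 80375

# Computational formula: s^2 = [n * sum(X^2) - (sum X)^2] / [n (n - 1)]
numerator = n * sum_x_sq - sum_x ** 2         # 1624, still an exact integer
sample_variance = numerator / (n * (n - 1))
s_computational = math.sqrt(sample_variance)
print(round(s_computational, 2))  # 6.22
```

The result agrees with the definitional formula, as the proof guarantees.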
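Mathematical Properties of the Standard Deviation (page 47)

Property 1: If each observation of a set of data is transformed by the addition (or subtraction) of a constant $c$, the standard deviation of the new set of data is the same as the standard deviation of the original data set.

Proof:
Original Data = $\{X_1, X_2, \ldots, X_n\}$
Transformed Data = $\{Y_1, Y_2, \ldots, Y_n\}$ where $Y_i = X_i + c$, so that $\bar{Y} = \bar{X} + c$.

$$s_Y = \sqrt{\frac{\sum_{i=1}^{n}(Y_i-\bar{Y})^2}{n-1}} = \sqrt{\frac{\sum_{i=1}^{n}\left[(X_i+c)-(\bar{X}+c)\right]^2}{n-1}} = \sqrt{\frac{\sum_{i=1}^{n}(X_i-\bar{X})^2}{n-1}} = s_X$$

Mathematical Properties of the Standard Deviation (page 48)

Property 2: If each observation of a set of data is transformed by the multiplication (or division) of a constant $c$, the standard deviation of the new set of data is equal to the standard deviation of the original data multiplied (or divided) by $|c|$.

Proof:
Original Data = $\{X_1, X_2, \ldots, X_n\}$
Transformed Data = $\{Y_1, Y_2, \ldots, Y_n\}$ where $Y_i = cX_i$, so that $\bar{Y} = c\bar{X}$.

$$s_Y = \sqrt{\frac{\sum_{i=1}^{n}(Y_i-\bar{Y})^2}{n-1}} = \sqrt{\frac{\sum_{i=1}^{n}(cX_i-c\bar{X})^2}{n-1}} = \sqrt{\frac{\sum_{i=1}^{n}\left[c(X_i-\bar{X})\right]^2}{n-1}} = \sqrt{\frac{c^2\sum_{i=1}^{n}(X_i-\bar{X})^2}{n-1}} = \sqrt{c^2}\,s_X = |c|\,s_X$$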
Example
The weights (in milligrams) of ants in a sample are as follows:

Sample data = {3, 8, 35, 1, 50}

Its mean is 19.4 mg and its standard deviation is 21.89 mg.

- If each measurement is converted to grams (1000 mg = 1 g), what will be the new mean? the standard deviation?
- If each ant gained 2 mg in weight, what will be the new mean? the standard deviation?

Characteristics of the Standard Deviation (page 47)

- It uses every observation in its computation.
- It may be distorted by outliers. This is because squaring large deviations from the mean will give more weight to these outliers.
- It is amenable to algebraic treatment.
- It is always nonnegative. A value of 0 implies the absence of variation.
- Level of measurement must at least be interval for the standard deviation to be interpretable.
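The two properties of the standard deviation can be checked numerically on the ant weights. This sketch takes the weight gain in the example to be 2 mg and uses the sample standard deviation from Python's `statistics` module; the variable names are assumptions made for the example.

```python
import math
import statistics

ants_mg = [3, 8, 35, 1, 50]
mean_mg = statistics.mean(ants_mg)       # 19.4
sd_mg = statistics.stdev(ants_mg)        # about 21.89

# Multiplication or division by a constant: convert milligrams to grams.
ants_g = [x / 1000 for x in ants_mg]
mean_g = statistics.mean(ants_g)         # mean is divided by 1000
sd_g = statistics.stdev(ants_g)          # standard deviation is divided by 1000

# Addition of a constant: every ant gains 2 mg.
ants_gained = [x + 2 for x in ants_mg]
mean_gained = statistics.mean(ants_gained)   # mean shifts by 2
sd_gained = statistics.stdev(ants_gained)    # standard deviation is unchanged

print(round(sd_mg, 2), round(sd_g, 5), round(sd_gained, 2))
```

Adding a constant moves every observation and the mean by the same amount, so the deviations from the mean, and hence the standard deviation, do not change.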
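Approximating the Variance from FDT (pages 43-44)

Population Variance:
$$\sigma^2 = \frac{\sum_{i=1}^{k}f_i(X_i-\mu)^2}{N} = \frac{N\sum_{i=1}^{k}f_iX_i^2 - \left(\sum_{i=1}^{k}f_iX_i\right)^2}{N^2}$$

Sample Variance:
$$s^2 = \frac{\sum_{i=1}^{k}f_i(X_i-\bar{X})^2}{n-1} = \frac{n\sum_{i=1}^{k}f_iX_i^2 - \left(\sum_{i=1}^{k}f_iX_i\right)^2}{n(n-1)}$$

where
$X_i$ = class mark of the $i$th class interval
$f_i$ = frequency of the $i$th class interval
$k$ = number of classes
$N$ = number of observations in the population
$n$ = number of observations in the sample
$\mu$ = population mean
$\bar{X}$ = sample mean

Example
Given the frequency distribution of daily income in pesos received by a sample of 85 women, approximate the mean and standard deviation.

Class Limits    f_i    X_i      f_i X_i      f_i X_i^2
200-249         4      224.5    898.0        201,601.00
250-299         10     274.5    2,745.0      753,502.50
300-349         14     324.5    4,543.0      1,474,203.50
350-399         25     374.5    9,362.5      3,506,256.25
400-449         13     424.5    5,518.5      2,342,603.25
450-499         10     474.5    4,745.0      2,251,502.50
500-549         9      524.5    4,720.5      2,475,902.25
TOTALS          85              32,532.5     13,005,571.25

$$\bar{X} = \frac{\sum_{i=1}^{7}f_iX_i}{\sum_{i=1}^{7}f_i} = \frac{32{,}532.5}{85} = 382.735$$

$$s = \sqrt{\frac{n\sum_{i=1}^{7}f_iX_i^2 - \left(\sum_{i=1}^{7}f_iX_i\right)^2}{n(n-1)}} = \sqrt{\frac{(85)(13{,}005{,}571.25) - (32{,}532.5)^2}{(85)(84)}} \approx 81.23 \text{ pesos}$$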
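The grouped-data computation can be reproduced from the class marks and frequencies in the table. This is a minimal sketch of the computational formula; the variable names are assumptions made for the example.

```python
import math

# Class marks and frequencies from the daily income example.
class_marks = [224.5, 274.5, 324.5, 374.5, 424.5, 474.5, 524.5]
frequencies = [4, 10, 14, 25, 13, 10, 9]

n = sum(frequencies)                                                  # 85
sum_fx = sum(f * x for f, x in zip(frequencies, class_marks))         # 32532.5
sum_fx_sq = sum(f * x * x for f, x in zip(frequencies, class_marks))  # 13005571.25

approx_mean = sum_fx / n
# Computational formula for the sample variance of grouped data.
numerator = n * sum_fx_sq - sum_fx ** 2
approx_sd = math.sqrt(numerator / (n * (n - 1)))

print(round(approx_mean, 3))  # 382.735
print(round(approx_sd, 2))    # 81.23
```

These are approximations because every observation in a class is treated as if it were equal to the class mark.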
Bienaymé-Chebyshev Rule (page 49)

The percentage of observed values that fall within distances of $k$ standard deviations below and above the mean must be at least
$$\left(1-\frac{1}{k^2}\right)\times 100\%,$$
whatever the shape of the data distribution. The value $k$ is any number greater than 1.

The Bienaymé-Chebyshev rule gives us an idea of how to characterize the data set using the standard deviation. We now state this rule for $k$ = 2, 3, and 4 as follows:

- At least $\left(1-\frac{1}{2^2}\right)\times 100\% = \left(1-\frac{1}{4}\right)\times 100\% = \frac{3}{4}\times 100\% = 75\%$ of the observed values fall within distances of 2 standard deviations below and above the mean.
- At least $\left(1-\frac{1}{3^2}\right)\times 100\% = 88.89\%$ of the observed values fall within distances of 3 standard deviations below and above the mean.
- At least $\left(1-\frac{1}{4^2}\right)\times 100\% = 93.75\%$ of the observed values fall within distances of 4 standard deviations below and above the mean.

Example 8.15 (page 49)
Suppose you have information that the mean weight of all female employees in a manufacturing industry is 120 pounds and the standard deviation of the weights is 8 pounds. Use the Bienaymé-Chebyshev rule to determine the interval containing at least 75% of all the measures in the population.

Solution: We solve for $k$ from the equation $0.75 = 1-\frac{1}{k^2}$. We find that $k = 2$, so the interval we are looking for is
$$\mu \pm k\sigma = 120 \pm (2)(8) = 120 \pm 16.$$
That is, at least 75% of all female employees have weights ranging from 104 to 136 lbs.

Question: If the standard deviation had been smaller, say 4 lbs, which interval will contain at least 75% of all the measures in the population?
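The rule and the worked example can be expressed as three small functions. This is a sketch only; the function names are assumptions made for the example.

```python
import math

def chebyshev_bound(k):
    """Minimum proportion of observations within k standard deviations of the mean."""
    if k <= 1:
        raise ValueError("the rule applies only for k greater than 1")
    return 1 - 1 / k ** 2

def k_for_proportion(p):
    """Solve p = 1 - 1/k^2 for k, for a proportion p strictly between 0 and 1."""
    return 1 / math.sqrt(1 - p)

def chebyshev_interval(mean, sd, k):
    """Interval from k standard deviations below to k above the mean."""
    return (mean - k * sd, mean + k * sd)

print(chebyshev_bound(2))               # 0.75
print(chebyshev_bound(4))               # 0.9375
k = k_for_proportion(0.75)              # 2.0
print(chebyshev_interval(120, 8, k))    # (104.0, 136.0)
print(chebyshev_interval(120, 4, k))    # (112.0, 128.0)
```

The last line addresses the closing question: with a smaller standard deviation of 4 lbs, the interval that contains at least 75% of the measures narrows to 112 to 128 lbs.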
Comparing the Variation of Observations of 2 or More Distributions

Consider the following sample of weights of ants in milligrams = {3, 8, 35, 1, 50}. Its standard deviation is 21.89 mg.

This time consider this sample of weights of elephants in grams = {6000000, 5999999, 5999998, 6000001, 6000002}. Its standard deviation is 1.581 grams.

Can we use the standard deviations of the two collections to compare the variation of the observations of these two collections?

We cannot use measures of absolute dispersion to compare the variation of the observations of two or more collections when (1) the units are different or (2) the means are very different from each other.

Definition of Measures of Relative Dispersion

Measures of relative dispersion are measures of dispersion that have no unit of measurement and are used to compare the scatter of one distribution with the scatter of another distribution.
Definition of Coefficient of Variation (page 53)

The coefficient of variation (CV) is a measure of relative dispersion and is defined as:

$$\text{population } CV = \frac{\sigma}{\mu}\times 100\%$$
where
$\sigma$ is the population standard deviation
$\mu$ is the population mean

$$\text{sample } CV = \frac{s}{\bar{X}}\times 100\%$$
where
$s$ is the sample standard deviation
$\bar{X}$ is the sample mean

Note: The coefficient of variation describes the standard deviation as a percentage of the mean. (Example: CV = 10% indicates that the standard deviation is 10% of the mean.) Consequently, the CV is not interpretable when the mean is negative and is undefined when the mean is 0.

Example 8.17b (page 54)
Suppose we get the prices of an 80-gram pack of a certain brand of cracker nuts at 20 different grocery stores. The mean price of the 20 packs of cracker nuts is P9.50 with a standard deviation of P0.6. On the other hand, the weights of the 20 packs of cracker nuts have a mean content of 82 grams with a standard deviation of 3.5 grams. Can we say that prices are more variable than weight?

Solution:
$$CV_{\text{price}} = \frac{0.6}{9.5}\times 100\% = 6.32\%$$
$$CV_{\text{weight}} = \frac{3.5}{82}\times 100\% = 4.27\%$$
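The coefficient of variation can be sketched as a one-line function. The function name is an assumption made for the example, and the cracker nut figures assume a mean content of 82 grams; the ant and elephant samples are the ones used earlier to motivate measures of relative dispersion.

```python
import statistics

def coefficient_of_variation(sd, mean):
    """Standard deviation as a percentage of the mean (requires a positive mean)."""
    if mean <= 0:
        raise ValueError("the CV is interpretable only for a positive mean")
    return sd / mean * 100

# Example 8.17b: price versus weight of packs of cracker nuts.
cv_price = coefficient_of_variation(0.6, 9.5)
cv_weight = coefficient_of_variation(3.5, 82)
print(round(cv_price, 2), round(cv_weight, 2))  # 6.32 4.27

# Ant weights (mg) versus elephant weights (g): different units and means.
ants_mg = [3, 8, 35, 1, 50]
elephants_g = [6000000, 5999999, 5999998, 6000001, 6000002]
cv_ants = coefficient_of_variation(statistics.stdev(ants_mg), statistics.mean(ants_mg))
cv_elephants = coefficient_of_variation(
    statistics.stdev(elephants_g), statistics.mean(elephants_g)
)
print(cv_ants > cv_elephants)  # True
```

Because the CV has no unit, the two percentages in each pair can be compared directly: prices vary more than weights, and the ant weights vary far more, relative to their mean, than the elephant weights.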
Example 8.17c (page 55)
The foreign exchange rate is an indicator of the stability of the peso and an indicator of economic performance. The level of the peso is dependent on market forces and not on government policy. The government intervenes only through the Bangko Sentral ng Pilipinas when there are speculative elements in the market. Given below are the means and standard deviations of the quarterly P-$ exchange rate for the periods 1998 to 1999 and 2000 to 2001. In which of the two periods is the peso more stable?

                      1998-1999    2000-2001
Mean                  P40.4        P48.6
Standard Deviation    P2.01        P1.21

Solution:
$$CV_{98\text{-}99} = \frac{2.01}{40.4}\times 100\% = 4.98\%$$
$$CV_{00\text{-}01} = \frac{1.21}{48.6}\times 100\% = 2.49\%$$
The CV is lower for 2000-2001, so the peso was relatively more stable in that period.

Another important measure: The Standard Score (page 50)

- The standard score or z-score indicates the relative position of an observation in the collection where the observed value came from.
- It is used to compare two values from different collections that (1) differ with respect to their means or standard deviations, or both, or (2) are expressed in different units.
- It is also used to identify possible outliers. As a rule, if the absolute value of the standard score is greater than 3, then the observation is marked as a possible outlier.
- When all the observations in a collection are standardized, the mean and standard deviation of this collection of standard scores are 0 and 1, respectively.
Definition of Standard Score (page 50)

Definition 8.4. The standard score or z-score measures how many standard deviations an observed value is above or below the mean.

$$\text{Population Z-score} = \frac{X-\mu}{\sigma}$$
where
$\mu$ is the population mean
$\sigma$ is the population standard deviation

$$\text{Sample Z-score} = \frac{X-\bar{X}}{s}$$
where
$\bar{X}$ is the sample mean
$s$ is the sample standard deviation

Remarks (page 50)
- A positive z-score measures the number of standard deviations an observation is above the mean, and a negative z-score measures the number of standard deviations an observation is below the mean.
- A z-score of 0 means that the observation is equal to the mean.
- The z-score has no unit, which makes it possible to compare the z-scores computed using different collections.
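The definition can be tried out on the seven IQ scores from Example 8.7. This sketch standardizes every observation with the sample mean and sample standard deviation and then checks that the standard scores have mean 0 and standard deviation 1; the function name is an assumption made for the example.

```python
import math
import statistics

def z_scores(values):
    """Standardize each value using the sample mean and sample standard deviation."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)
    return [(x - mean) / sd for x in values]

iq_scores = [100, 99, 110, 105, 112, 107, 116]
z = z_scores(iq_scores)

print([round(value, 2) for value in z])
# The observation equal to the mean (107) has a z-score of 0.
print(z[5])  # 0.0

# Standard scores have mean 0 and standard deviation 1, up to rounding error.
z_mean = statistics.mean(z)
z_sd = statistics.stdev(z)

# Flag possible outliers: observations whose |z| exceeds 3.
possible_outliers = [x for x, value in zip(iq_scores, z) if abs(value) > 3]
print(possible_outliers)  # []
```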
Example 8.16 (pages 50-51)
The mean grade in Statistics 101 is 70% and the standard deviation is 10%; whereas in Math 17, the mean grade is 80% and the standard deviation is 20%.

a) Mark got a grade of 75% in Statistics 101 and a grade of 90% in Math 17. In which subject did Mark perform better if we consider the grades of the other students in the two subjects?

Solution:
$$z_{\text{Stat 101}} = \frac{75-70}{10} = 0.5, \qquad z_{\text{Math 17}} = \frac{90-80}{20} = 0.5$$
If we consider the grades of the other students in the two subjects, Mark's score in Stat 101 is just as good as his score in Math 17. Based on the z-scores, Mark's scores in both subjects are 0.5 standard deviations above their respective mean scores.

b) Peter got a grade of 70% in both Statistics 101 and Math 17. In which subject did Peter perform better if we consider the grades of the other students in the two subjects?

Solution:
$$z_{\text{Stat 101}} = \frac{70-70}{10} = 0, \qquad z_{\text{Math 17}} = \frac{70-80}{20} = -0.5$$
Peter performed relatively better in Statistics 101. Based on the z-scores, Peter's score in Stat 101 is equal to the mean score in Stat 101, while his score in Math 17 is 0.5 standard deviations below the mean.

c) Paul got a grade of 100% in Stat 101. Compute the z-score and interpret.

Solution:
$$z = \frac{100-70}{10} = 3$$
Paul's score is above the mean. Its distance from the mean is thrice the size of the standard deviation. We may consider this an unusually high score when compared with other Stat 101 grades.

Assignment

1. Using the data in Exercise no. 2, page 51, compute the following:
   a) Mean
   b) Range
   c) Standard deviation using the standard deviation mode of your calculator
   d) Standard deviation using the computational formula (show solution)
   e) Coefficient of variation
   f) Use the Bienaymé-Chebyshev rule to identify an interval that will enclose at least 75% of the observations.
   g) What percentage of observations are actually contained within 2 standard deviations from the mean?
2. Using the FDT in Exercise no. 4, page 52, approximate the following:
   a) Mean
   b) Range
   c) Standard deviation using the computational formula (show all important steps)
   d) Coefficient of variation
3. Answer Exercise no. 6, page 52.
4. Answer Exercise no. 1, page 55. Justify your answer with the appropriate statistics.