Measures of Location
Measures of position are used to describe the relative location of an observation.

Measures of Position
Quartiles and percentiles are two of the most popular measures of position.
An additional measure of central tendency, the midquartile, is defined using quartiles.

Quartiles
Quartiles: Values of the variable that divide the ranked data into quarters; each set of data has three quartiles.
1. The first quartile, $Q_1$, is a number such that at most 25% of the data are smaller in value than $Q_1$ and at most 75% are larger.
2. The second quartile, $Q_2$, is the median.
3. The third quartile, $Q_3$, is a number such that at most 75% of the data are smaller in value than $Q_3$ and at most 25% are larger.
[Diagram: ranked data in increasing order, divided into four 25% segments by L, $Q_1$, $Q_2$, $Q_3$, H]

Median of Grouped Data
$$\text{Median} = L + \frac{\frac{N}{2} - s}{f} \times c$$
L = lower class limit (LCL) of the median class (= 69.5)
N = $\sum f$ = total frequency (= 20)
s = total frequency before the median class (= 9)
f = frequency of the median class (= 5)
c = class size (= 74.5 - 69.5 = 5)
$$\text{Median} = 69.5 + \frac{\frac{20}{2} - 9}{5} \times (74.5 - 69.5) = 70.5$$

Class Interval   Class Limits   Class Midpoint (m)   Frequency (f)   cf (less than UCL)
50-54            49.5-54.5      52                   1               1
55-59            54.5-59.5      57                   1               2
60-64            59.5-64.5      62                   2               4
65-69            64.5-69.5      67                   5               9
70-74            69.5-74.5      72                   5               14    (Median class)
75-79            74.5-79.5      77                   2               16
80-84            79.5-84.5      82                   2               18
85-89            84.5-89.5      87                   2               20
$$\text{Median} = 69.5 + \frac{\frac{20}{2} - 9}{5} \times (74.5 - 69.5) = 70.5$$

Second Quartile $Q_2$ = Median
$$Q_2 = \text{Median} = L + \frac{\frac{N}{2} - s}{f} \times c$$
L = LCL of the median ($Q_2$) class (= 69.5)
N = $\sum f$ = total frequency (= 20)
s = total frequency before the median ($Q_2$) class (= 9)
f = frequency of the median ($Q_2$) class (= 5)
c = class size (= 74.5 - 69.5 = 5)
$$Q_2 = 69.5 + \frac{\frac{20}{2} - 9}{5} \times (74.5 - 69.5) = 70.5$$

First Quartile $Q_1$
$$Q_1 = L + \frac{\frac{N}{4} - s}{f} \times c$$
L = LCL of the $Q_1$ class (= 64.5)
N = $\sum f$ = total frequency (= 20)
s = total frequency before the $Q_1$ class (= 4)
f = frequency of the $Q_1$ class (= 5)
c = class size (= 69.5 - 64.5 = 5)
$$Q_1 = 64.5 + \frac{\frac{20}{4} - 4}{5} \times (69.5 - 64.5) = 65.5$$

Class Interval   Class Limits   Class Midpoint (m)   Frequency (f)   cf (less than UCL)
50-54            49.5-54.5      52                   1               1
55-59            54.5-59.5      57                   1               2
60-64            59.5-64.5      62                   2               4
65-69            64.5-69.5      67                   5               9     ($Q_1$ class)
70-74            69.5-74.5      72                   5               14
75-79            74.5-79.5      77                   2               16
80-84            79.5-84.5      82                   2               18
85-89            84.5-89.5      87                   2               20
$$Q_1 = 64.5 + \frac{\frac{20}{4} - 4}{5} \times (69.5 - 64.5) = 65.5$$

Third Quartile $Q_3$
$$Q_3 = L + \frac{\frac{3N}{4} - s}{f} \times c$$
L = LCL of the $Q_3$ class (= 74.5)
N = $\sum f$ = total frequency (= 20)
s = total frequency before the $Q_3$ class (= 14)
f = frequency of the $Q_3$ class (= 2)
c = class size (= 79.5 - 74.5 = 5)
$$Q_3 = 74.5 + \frac{\frac{3 \times 20}{4} - 14}{2} \times (79.5 - 74.5) = 74.5 + \frac{1}{2} \times 5 = 77.0$$

Class Interval   Class Limits   Class Midpoint (m)   Frequency (f)   cf (less than UCL)
50-54            49.5-54.5      52                   1               1
55-59            54.5-59.5      57                   1               2
60-64            59.5-64.5      62                   2               4
65-69            64.5-69.5      67                   5               9
70-74            69.5-74.5      72                   5               14
75-79            74.5-79.5      77                   2               16    ($Q_3$ class)
80-84            79.5-84.5      82                   2               18
85-89            84.5-89.5      87                   2               20
$$Q_3 = 74.5 + \frac{\frac{3 \times 20}{4} - 14}{2} \times (79.5 - 74.5) = 77.0$$

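The three grouped-data quartile formulas differ only in the target position (N/4, N/2, 3N/4), so they can be checked with one small routine. The following is a minimal Python sketch, not part of the original slides; the function name `grouped_quantile` and its parameters are hypothetical, and it assumes equal class sizes and that the quantile class is the first class whose cumulative frequency reaches the target position.

```python
from itertools import accumulate

# Lower class limits and frequencies taken from the frequency table above.
lower_bounds = [49.5, 54.5, 59.5, 64.5, 69.5, 74.5, 79.5, 84.5]
freqs = [1, 1, 2, 5, 5, 2, 2, 2]
class_size = 5.0


def grouped_quantile(lower_bounds, freqs, class_size, numer, denom):
    """Evaluate L + (numer*N/denom - s) / f * c for grouped data.

    Hypothetical helper: numer/denom is the fraction of the data below
    the quantile, e.g. 1/4 for Q1, 1/2 for the median, 3/4 for Q3.
    """
    n = sum(freqs)                  # N, the total frequency
    cum = list(accumulate(freqs))   # cumulative frequencies (cf)
    target = numer * n / denom      # position N/4, N/2 or 3N/4
    for i, cf in enumerate(cum):
        if cf >= target:            # first class whose cf reaches the target
            s = cum[i - 1] if i > 0 else 0   # total frequency before the class
            return lower_bounds[i] + (target - s) / freqs[i] * class_size
    raise ValueError("target position exceeds the total frequency")


q1 = grouped_quantile(lower_bounds, freqs, class_size, 1, 4)
q2 = grouped_quantile(lower_bounds, freqs, class_size, 1, 2)
q3 = grouped_quantile(lower_bounds, freqs, class_size, 3, 4)
print(q1, q2, q3)
```

For the table above, the sketch evaluates to 65.5, 70.5 and 77.0 for the first, second and third quartiles.
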
Percentiles
Percentiles: Values of the variable that divide a set of ranked data into 100 equal subsets; each set of data has 99 percentiles. The kth percentile, $P_k$, is a value such that at most k% of the data are smaller in value than $P_k$ and at most (100 - k)% of the data are larger.
[Diagram: ranked data from L to H, with at most k% of the data below $P_k$ and at most (100 - k)% above it]
Notes:
1. The 1st quartile and the 25th percentile are the same: $Q_1 = P_{25}$
2. The median, the 2nd quartile, and the 50th percentile are all the same: $\tilde{x} = Q_2 = P_{50}$

Percentiles
[Diagram: ranked data divided into segments of 1% each by $P_1, P_2, P_3, \ldots, P_{97}, P_{98}, P_{99}$]
$P_k$ = the $\frac{kn}{100}$th value

Percentile $P_k$ of Ungrouped Data
Procedure for finding $P_k$:
1. Rank the n observations, lowest to highest.
2. Compute A = (nk)/100.
3. If A is an integer: $d(P_k)$ = A.5 (depth). $P_k$ is halfway between the value of the data in the Ath position and the value of the next data.
   If A is a fraction: $d(P_k)$ = B, the next larger integer. $P_k$ is the value of the data in the Bth position.

Example
Example: The following data represent the pH levels of a random sample of swimming pools in a town. Find: 1) the first quartile, 2) the third quartile, and 3) the 37th percentile:
5.6 5.6 5.8 5.9 6.0 6.0 6.1 6.2 6.3 6.4
6.7 6.8 6.8 6.8 6.9 7.0 7.3 7.4 7.4 7.5
Solutions:
1) k = 25: (20)(25)/100 = 5, depth = 5.5, $Q_1$ = 6
2) k = 75: (20)(75)/100 = 15, depth = 15.5, $Q_3$ = 6.95
3) k = 37: (20)(37)/100 = 7.4, depth = 8, $P_{37}$ = 6.2

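The depth procedure can be checked mechanically. The following is a minimal Python sketch, assuming an integer k between 1 and 99; the function name `depth_percentile` is hypothetical and not part of the original slides. It uses integer arithmetic to decide whether A = nk/100 is a whole number, which avoids floating-point rounding in that test.

```python
def depth_percentile(data, k):
    """k-th percentile by the depth rule, for integer k with 1 <= k <= 99."""
    ranked = sorted(data)                   # step 1: rank lowest to highest
    n = len(ranked)
    whole, remainder = divmod(n * k, 100)   # step 2: A = nk/100
    if remainder == 0:
        # A is an integer: depth is A.5, so average the A-th value and
        # the next one (positions are 1-based, list indices are 0-based).
        return (ranked[whole - 1] + ranked[whole]) / 2
    # A is a fraction: depth is B, the next larger integer, i.e. index `whole`.
    return ranked[whole]


ph_levels = [5.6, 5.6, 5.8, 5.9, 6.0, 6.0, 6.1, 6.2, 6.3, 6.4,
             6.7, 6.8, 6.8, 6.8, 6.9, 7.0, 7.3, 7.4, 7.4, 7.5]

for k in (25, 75, 37):
    print(k, round(depth_percentile(ph_levels, k), 3))
```

For the pH data this reproduces the three solutions: 6.0 for k = 25, 6.95 for k = 75, and 6.2 for k = 37.
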
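Percentile of Ungrouped Data Example
kth percentile, $P_k$ = value at the $\frac{kn}{100}$ position
53 58 68 73 75 76 79 80 85 88 91 99
$$\frac{kn}{100} = \frac{62 \times 12}{100} = 7.44$$
Because 7.44 is a fraction, the depth of the 62nd percentile is the next larger integer, 8.
$P_{62}$ = 62nd percentile = 80 (the value in the 8th position)
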
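Percentile $P_k$
X     f     cf    cf %
38    1     125   100
37    1     124   99
36    3     123   98
35    5     120   96
34    9     115   92
33    8     106   85
32    17    98    78
31    23    81    65
30    24    58    46
29    18    34    27
28    10    16    13
27    3     6     5
26    1     3     2
25    0     2     2
24    2     2     2
$$P_{25} = L + \frac{kN - cf}{f} = 28.5 + 0.85 = 29.35$$
k = 25/100, N = 125
Here kN = 31.25 falls in the X = 29 class, so L = 28.5, the cf below that class is 16, and f = 18, giving (31.25 - 16)/18 ≈ 0.85.
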
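The cumulative columns and the $P_{25}$ value can be recomputed from the X and f columns alone. The following is a minimal Python sketch under stated assumptions: each value X is treated as a class of width 1 with lower limit X - 0.5 (as L = 28.5 for X = 29 suggests), and the name `percentile_from_table` is hypothetical, not from the slides.

```python
from itertools import accumulate

# X values in increasing order with their frequencies, from the table above.
values = list(range(24, 39))
freqs = [2, 0, 1, 3, 10, 18, 24, 23, 17, 8, 9, 5, 3, 1, 1]

n = sum(freqs)                                   # N = 125
cum = list(accumulate(freqs))                    # cf column
cum_pct = [round(100 * cf / n) for cf in cum]    # cf % column, rounded


def percentile_from_table(values, freqs, k, width=1.0):
    """P_k = L + (kN/100 - s) / f * c, assuming classes of equal width."""
    total = sum(freqs)
    running = list(accumulate(freqs))
    target = k * total / 100                     # position kN/100
    for i, cf in enumerate(running):
        if cf >= target:                         # first class reaching the target
            s = running[i - 1] if i > 0 else 0   # frequency before the class
            lower = values[i] - width / 2        # assumed lower class limit
            return lower + (target - s) / freqs[i] * width
    raise ValueError("k must not exceed 100")


p25 = percentile_from_table(values, freqs, 25)
print(round(p25, 2))
```

With these inputs the 25th percentile comes out at about 29.35, in agreement with the slide.
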
Percentile $P_k$ of Grouped Data
$$P_k = L + \frac{\frac{kN}{100} - s}{f} \times c$$
L = LCL of the $P_k$ class
N = $\sum f$ = total frequency
s = total frequency before the $P_k$ class
f = frequency of the $P_k$ class
c = class size

Percentile $P_{25}$ = Quartile $Q_1$
$$P_{25} = Q_1 = L + \frac{\frac{N}{4} - s}{f} \times c$$
L = LCL of the $P_{25}$ class
N = $\sum f$ = total frequency
s = total frequency before the $P_{25}$ class
f = frequency of the $P_{25}$ class
c = class size
$$P_{25} = 64.5 + \frac{\frac{50}{4} - 3}{10} \times 5 = 69.25$$

Percentile $P_{50}$ = Quartile $Q_2$ = Median
$$P_{50} = Q_2 = \text{Median} = L + \frac{\frac{N}{2} - s}{f} \times c$$
L = LCL of the median ($P_{50}$) class
N = $\sum f$ = total frequency
s = total frequency before the median ($P_{50}$) class
f = frequency of the median ($P_{50}$) class
c = class size
$$P_{50} = 69.5 + \frac{\frac{50}{2} - 13}{20} \times (74.5 - 69.5) = 72.5$$

Percentile $P_{75}$ = Quartile $Q_3$
$$P_{75} = Q_3 = L + \frac{\frac{3N}{4} - s}{f} \times c$$
L = LCL of the $P_{75}$ class
N = $\sum f$ = total frequency
s = total frequency before the $P_{75}$ class
f = frequency of the $P_{75}$ class
c = class size
$$P_{75} = 74.5 + \frac{\frac{3 \times 50}{4} - 33}{15} \times 5 = 76.0$$

Midquartile
Midquartile: The numerical value midway between the first and third quartiles:
$$\text{midquartile} = \frac{Q_1 + Q_3}{2}$$
Example: Find the midquartile for the 20 pH values in the previous example:
$$\text{midquartile} = \frac{Q_1 + Q_3}{2} = \frac{6 + 6.95}{2} = \frac{12.95}{2} = 6.475$$
Note: The mean, median, midrange, and midquartile are all measures of central tendency. They are not necessarily equal. Can you think of an example when they would be the same value?

5-Number Summary
5-Number Summary: The 5-number summary is composed of:
1. L, the smallest value in the data set
2. $Q_1$, the first quartile (also $P_{25}$)
3. $\tilde{x}$, the median (also $P_{50}$ and the 2nd quartile)
4. $Q_3$, the third quartile (also $P_{75}$)
5. H, the largest value in the data set
Notes:
1. The 5-number summary indicates how much the data are spread out in each quarter.
2. The interquartile range is the difference between the first and third quartiles. It is the range of the middle 50% of the data.

Box-and-Whisker Display
Box-and-Whisker Display: A graphic representation of the 5-number summary:
1. The five numerical values (smallest, first quartile, median, third quartile, and largest) are located on a scale, either vertical or horizontal.
2. The box is used to depict the middle half of the data that lies between the two quartiles.
3. The whiskers are line segments used to depict the other half of the data.
4. One line segment represents the quarter of the data that is smaller in value than the first quartile.
5. The second line segment represents the quarter of the data that is larger in value than the third quartile.

Example
Example: A random sample of students in a sixth grade class was selected. Their weights are given in the table below. Find the 5-number summary for this data and construct a boxplot:
63   64   76   76   81
83   85   86   88   89
90   91   92   93   93
93   94   97   99   99
99   101  108  109  112
Solution:
L     $Q_1$   $\tilde{x}$   $Q_3$   H
63    85      92            99      112

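The solution can be reproduced by applying the depth procedure for ungrouped percentiles at k = 25, 50 and 75. The following is a minimal Python sketch; the names `depth_percentile` and `five_number_summary` are hypothetical and not from the slides. It also derives the interquartile range and the midquartile defined earlier.

```python
def depth_percentile(ranked, k):
    """k-th percentile of already ranked data by the depth rule (1 <= k <= 99)."""
    whole, remainder = divmod(len(ranked) * k, 100)   # A = nk/100
    if remainder == 0:
        # A is an integer: halfway between the A-th value and the next one.
        return (ranked[whole - 1] + ranked[whole]) / 2
    # A is a fraction: value at the next larger integer position.
    return ranked[whole]


def five_number_summary(data):
    """Return (L, Q1, median, Q3, H) for ungrouped data."""
    ranked = sorted(data)
    return (ranked[0],
            depth_percentile(ranked, 25),
            depth_percentile(ranked, 50),
            depth_percentile(ranked, 75),
            ranked[-1])


weights = [63, 64, 76, 76, 81, 83, 85, 86, 88, 89, 90, 91, 92,
           93, 93, 93, 94, 97, 99, 99, 99, 101, 108, 109, 112]

low, q1, median, q3, high = five_number_summary(weights)
iqr = q3 - q1                  # interquartile range
midquartile = (q1 + q3) / 2    # midway between Q1 and Q3
print(low, q1, median, q3, high, iqr, midquartile)
```

For the weight data this gives the summary 63, 85, 92, 99, 112, an interquartile range of 14, and a midquartile of 92, which here happens to equal the median.
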
Boxplot for Weight Data
[Figure: boxplot titled "Weights from Sixth Grade Class"; axis: Weight, scaled from 60 to 110; marked values: L, $Q_1$, $\tilde{x}$, $Q_3$, H]