Chapter 3: Data Description

Size: px
Start display at page:

Download "Chapter 3: Data Description"

Transcription

1 Chapter 3: Data Description 3.3 Graphical Methods 3.1 a. Pie Chart b. Bar Graph 3.2 a. A pie chart would not be appropriate since we are not proportionally allocating a sample or population into a number of categories. b. Bar Graph c. Yes. There was a decline from the late 80 s to the early 90 s. An abrupt surge upward occurred in 1994, 1995, and Pie Chart and Bar Graph should be plotted. Pie chart is better since we are proportionally allocating the food dollar to seven categories. 3.4 Pie chart should be plotted. 3.5 Bar graph should be plotted. 3.6 a. Range = = 0.33 b. Frequency histogram should be plotted with 7 classes ranging from to The intervals have width.05. c. Relative frequencies are given below. Plot relative frequencies versus class intervals. Class Class Interval Frequence(f i ) Relative Frequency(f i /25) Total n= d. The probability is 7/25 =.28 that the fluoride reading would be greater than.90 ppm. Thus, we would predict the 28% of the days would have a reading greater than.90 ppm. 3.7 Two separate bar graphs could be plotted, one with Lap Belt Only and the other with Lap and Shoulder Belt. A single bar graph with the Lap Belt Only value plotted next to the Lap and Shoulder for each value of Percentage of Use is probably the most effective plot. This plot would clearly demonstrate that the increase in number of lives saved by using a shoulder belt increased considerably as the percentage use increased. 3.8 a. A relative frequency histogram with 10 classes of width 100 units would have the same shape as the stem-and-leaf plot. 5

2 b. The shape is a fairly symmetric, unimodal plot. The stem-and-leaf plot is more informative because it shows the exact crime rates within the leaves (classes) as well as the frequencies. c. Construct a histogram with 10 intervals of width 100, starting at 100 and ending at The relative frequency histogram is multimodal, skewed to the right a. Because there a few relatively very large values in the data set, a frequency table with unequal class widths is given below: Class Class Interval Frequence(f i ) Relative Frequency(f i /25) Total n= b. Plot a relative frequency histogram from the table in a. c. The graph is unimodal skewed to the right. Note that the plot in this case is a graphical distortion because the amount of area in the last two classes is not representative of the relative frequencies in these classes. d. We would estimate that there is a 6/24 = 25% chance that city taxes would be greater than $ The stem-and-leaf would be since the exact values for the 24 cities is being displayed, whereas in the relative frequency histogram the data is grouped into classes. Stem-and-Leaf Plot (Leaf Unit 100) 2 70,71, ,60, ,67,88, ,20,61, ,43, , The stem-and-leaf plot is more informative because it provides the exact values of the data as well as the frequencies within each stem The computer generated relative frequency histogram will generally have classes with equal class widths. This will produce a number of intervals having relative frequency of 0. The stem-and-leaf plot may have a different choice for stems a. Construct separate relative frequency histograms. 6

3 b. The histogram for the New Therapy has one more class than the Standard Therapy. This would indicate that the New Therapy generates a few more large values than the Standard Therapy. However, there is not convincing evidence that the New Therapy generates a longer survival time The plot has a bimodal shape. This would be an indication that there are two separate populations. However, the evidence is not very convincing because the individual plots were similar in shape with the exception that the New Therapy had a few times somewhat larger than the survival times obtained under the Standard Therapy a. The outlays have increased rapidly from 1980 to 1989, then dropped somewhat until 1991, when they increased again to The values had a slight decrease from 1992 through b. The % GNP increased rapidly form 1980 to 1985, remain steady for a couple of years, then decreased fairly rapidly from 1987 through 1997 (except a minor jump from 1991 to 1992). c. The two plots have very different trends. The public interest group s contention is not supported by either graph for the decade of the 90 s a. There appears to have been a dramatic drop in verbal scores for both sexes from 1970 to 1980, with a greater drop in female scores. There appears to have been a slight drop in math scores from 1970 to 1980, then a slow steady rise from 1980 to b. Yes. For the years , the difference between males and females has remained relatively constant. c. The female verbal scores had a larger decrease for the years than the male scores Plot of average public school teacher s salaries: The plot obtained in Exercise 3.17 would lead to the conclusion that there was a dramatic increase in salaries from 1970 to 1985, whereas the plot obtained in Exercise 3.18 shows what really happened-there was a gradual increase in salaries from 1970 to Spacing the points evenly along the horizontal axis made the increase from 1970 to 1985 more pronounced. The interval width between on the horizontal axis should be five times the width of that between , and the interval width between 1983, 1985, 1987 should be two times the width of the interval between If these interval widths between time points do not correspond to the actual length of time between data observations, the resulting trend line could be seriously misleading There are a few states having either a very small or very large number of telephones per thousand residents. The vast majority of states have values in the 450 to 700 range a. Relative frequency histograms for 1985 and 1996 should be plotted. b. The two plots are very similar in shape. The proportion of states having a high percentage of homeownerships appears to have increased relative to c. There was a very robust economy during this time period which may have allowed more people to purchase homes. 7

4 d. Congress would be able to determine that there has not been a large increase in homeownership during this period of a strong economy and decide to increase the tax deductions relative to homeowners Stem-and-Leaf Plot 3.22 The shapes of the 1985 and 1996 histograms and stem-and-leaf plots are asymmetric. The four plots are unimodal and left-skewed The plots show an upward trend from year 1 to year 5. There is a strong seasonal (cyclic) effect; the number of units sold increases dramatically in the late summer and fall months a. Plot the activity data versus time. b. The plot shows an upward linear trend. c. There is no seasonal trend in the data. 3.4 Measures of Central Tendency 3.25 Mean = 243/16 = = 15.2, Median = (14+15)/2 = 14.5, Mode = Mean = 283/16 = = 17.7, Median = (14+15)/2 = 14.5, Mode = 18 The median and mode are unalterered but the mean is inflated by the two large values % of measurements = 10% of 16 = Thus, we delete the two largest and smallest data values. Since the 2 largest values are deleted in computing the 10% Trimmed Mean, its value is the same for both data sets: 10% Trimmed Mean = ( )/12 = Thus, the extreme values do not have an effect. A 5% Trimmed Mean deletes only the smallest and largest values in the data set. Therefore, the second largest extreme value does enter into the calculation and hence would have an effect on the 5% Trimmed Mean. The effect would not be as dramatic as was seen in using the untrimmed mean Mean = 96/16 = 6, Median = (5+6)/2 = 5.5, 3.29 The following table is used to calculate the summary statistics: 8

5 Class Interval Frequency (f i ) Midpoint(y i ) f i y i Total mean 114/15 = 7.6 mode 7 median L + w f m (.5n cf b ) = [(.5)(15) 4] = a. The relative frequency histogram is unimodal, slightly right skewed. b. The following table is used to calculate the summary statistics: Class Interval Frequency (f i ) Midpoint(y i ) f i y i Total ,380 mean 18380/191 = 96.2, mode 80 median L + w f m (.5n cf b ) = [(.5)(191) 92] = c. Since the median is larger than the mean, it would indicate that the plot is somewhat left skewed. This contradiction between what is indicated in the relative frequency histogram and what is indicated by the summary statistics is due to the fact that the class intervals are of different width. The correct plot would have relative frequency divided by class width on the vertical axis. This would then produce a left skewed histogram with mode at approximately 110. d. The median is more informative since the distribution is somewhat skewed to the left which produces a mean somewhat less than the middle of the distribution. The median distance traveled would at least represent a value such that half of the buses traveled less and half greater than 101,600 miles a. The mean cannot be approximated since we do not know the endpoint for the last interval, hence cannot compute the midpoint of this interval. Mode 240. median L + w f m (.5n cf b ) = [(.5)(408) 119] = b. The median since it would give an indication of the cholestrol level for which half of men in this group have a greater of lesser value a. Mean = 8.04, Median = 1.54 b. Terrestrial: Mean = 15.01, Median = 6.03 Aquatic: Mean = 0.38, Median =.375 9

6 c. The mean is more sensitive to extreme values than is the median. d. Terrestrial: Median, because the two large values(76.50 and 41.70) results in a mean which is larger than 82% of the values in the data set. Aquatic: Mean or median since the data set is relatively symmetric a. After removing the survival times of the two individuals who left the study, we obtain Mean = The median can be calculated for all 11 patients, since we know that the values for the two indiduals who left the study were greater than the listed values of 57 and 60 which would place them in the upper half of the survival times. Thus, we obtain Median = 29. b. The median would be unchanged but the mean would increase since these two values will be greater than the mean calculated from the nine observed values a. If we use all 14 failure times, we obtain Mean > and Median = 154. In fact, we know the mean is greater than since the failure times for two of engines are greater than the reported times of 300 hours. b. The median would be unchanged if we replaced the failure times of 300 with the true failure times for the two engines that did not fail. However, the mean would be increased a. The rounded measurements: b. Sample modes = 2.10, 2.77, 2.82 c. Median = ( )/2 = 2.45 d. Mean = mode = 2.10, 2.77, 2.82 median = 2.45 mean = The mode and median stay the same but the mean increases a. The values are given below: Group Mean Median Mode I no mode II , 1.57 III b. mean = median = mode = 0.70, 1.55, 1.57 c. If we were to use one summary for the combined group, then the median would be most appropriate because the three groups are substantially different. If separate summaries are computed for each group, then the mean and median are both appropriate since the three groups have relatively symmetric distributions. 10

7 3.38 Mean = , Median = , Mode = The average of the three net group means and the mean of the complete set of measurements are the same. This will be true whenever the group have the same number of measurements, but is not true if the groups have different sample sizes. However, the average of the group modes and medians are different from the overall median and mode. 3.5 Measures of Variability 3.39 a. X = yi /n = 40 8 = 5. Yes. b. 8 i=1 (y i 5) 2 = (6 5) 2 + (3 5) 2 + (10 5) 2 + (4 5) 2 + (4 5) 2 + (2 5) 2 + (4 5) 2 + (7 5) 2 = = 46 c. s 2 = = 6.57 s = The CV = 100 s 2.56 ȳ % = % = 51%. The standard deviation is 51% of the mean a. s = 7.95 b. Because the magnitude of the racers ages is larger than that of their experience Age: s (10 2)/4 = 2. The estimate is fairly close to the actual value of Experience: s (39 18)/4 = The estimate is somewhat smaller than the actual value of i y i = 1016, ȳ = 1016/50 = 20.32, i (y i 20.32) 2 = 3, , s 2 = [ ] = , s = =

8 3.43 The quantile plot is given here. Quantile Plot of Times Treatment Times(minutes) u a. The 25th percentile is the value associated with u =.25 on the graph which is 14 minutes. Also, by definition 14 minutes is the 25th percentile since 25% of the times are less than or equal to 14 and 75% of the times are greater than or equal to 14 minutes. b. Yes; the 90th percentile is 31.5 minutes. This means that 90% of the patients have a treatment time less than or equal to 31.5 minutes (which is less than 40 minutes) a. The frequency table is given here. Class Intervals Frequency Relative Frequency Total The frequency histogram is given here. 12

9 Frequency Histogram for Trees Frequency Diameters(inches) b. ȳ = = c. s = ȳ ± s yields (5.744,9.713); 50/70 = 71.43% ȳ ± 2s yields (3.759, ); 68/70 = 95.71% ȳ ± 3s yields (1.774, ); 70/70 = 100% The percentages are very close to the percentages given by the empirical rule: 68%, 95%, and 99.7% a. Range 200 b. s 200/4 = 50 or we can use the method of estimating s from grouped data which yields s (( )2 6 + ( ) ( ) ( ) ( )46 + ( ) ( ) ( ) 2 4)) = s = 37.2 and ȳ 96.2 c. ȳ ± s yields (59, 133.4); 121/191 64% ȳ ± 2s yields (21.8, 170.6); 181/ % ȳ ± 3s yields ( 15.4, 207.8); 191/191 = 100% The empirical rule works well for this data set since the relative frequency histogram is mound-shaped. 13

10 3.46 a. Luxury: ȳ = 145.0, s = 27.6 Budget: ȳ = 46.1, s = 5.13 b. Luxury: CV = 19% Budget: CV = 11% c. Luxury hotels vary in quality, location, and price, whereas budget hotels are more competitive for the low-end market so prices tend to be similar. d. The CV would be better because it takes into account the larger difference in the means between the two types of hotels a. The time series plot is given here. Mercury Concentration (ng/g dry weight) Mercury Concentration Site 1 Site Year For Site 1, there is a steady decrease until 1980, after which the level is fairly constant but at a much lower level than the values for Site 2. The concentrations at Site 2 are very erratic from 1969 to 1980, with alternating rises and falls. From 1980 through 1992, there is a fairly steady decline in mercury concentration. b. Site 1: Median = 18.25, Mean = Site 2: Median = 184.1, Mean = Both distributions are right skewed, thus the median is a more appropriate measure of center than is the mean. Site 1 has a considerably lower center than that of Site 2. c. Site 1: s = 26.95, CV = 92% Site 2: s = 194.7, CV = 68% 14

11 Comparing the standard deviations can be misleading because is smaller in magnitude than However, the data values for Site 1 are considerably smaller in magnitude also. Therefore, it is more informative to compare the CV values. Based on CV values the concentrations from Site 1 are relatively more variable than those from Site 2. d. No, Site 1 does not have values for these years. 3.6 The Boxplot 3.48 Median (Q 2 ) = 6 Lower Quartile (Q 1 ) = 4, Upper Quartile (Q 3 ) = Q 2 = 20 Q 1 = (17+18)/2 = 17.5, Q 3 = (23+24)/2 = a. Stem-and-Leaf Plot is given here: Stem Leaf b. Min=250, Q 1 = ( )/2 = 298, Q 2 = ( )/2 = 317.5, Q 3 = ( )/2 = 334, Max=386 15

12 Number of Blood Donors On Fridays Number of Volunteers There are no outliers because Q 1 (1.5)IQR = 244 < 250 and Q 3 + (1.5)IQR = 388 > 386. The distribution is approximately symmetric with no outliers a. CAN: Q , Q , Q DRY: Q , Q , Q b. Canned dogfood is more expensive (median much greater than that for dry dogfood), highly skewed to the right with a few large outliers. Dry dogfood is slightly left skewed with a considerably less degree of variability than canned dogfood. 16

13 3.7 More Than One Variable 3.52 a. Stacked Bar Graph is given here: Literacy Level of Three Subsistence Groups Percent in Literacy Level Middle Primary Illerate Shifting Settled TownDweller Subsitence Group b. Illiterate: 46%, Primary Schooling: 4%, At Least Middle School: 50% Shifting Cultivators: 27%, Settled Agriculturists: 21%, Town Dwellers: 51% There is a marked difference in the distribution in the three literacy levels for the three subsistence groups. Town dwellers and shifting cultivators have the reverse trends in the three categories, whereas settled agriculturists fall into essentially two classes Percentage comparison based on Row Totals: Turnover Age(years) Reason < Total Resigned (n=60) Transferred (n=66) Retired/fired (n=124) 50% of workers who resign are in the youngest age group; of those who transfer, 68.2% are in their 30 s; and of those who either retire or got fired 86.24% are at least 40 years old. 17

14 3.54 Percentage comparison based on Column Totals: Turnover Age(years) Reason < Resigned Transferred Retired/fired Total 100(n=50) 100(n=60) 100(n=60) 100(n=80) 60% of workers in their 20 s resign; 75% of the workers in their 30 s transfer; 87.6% of the workers in their 40 s retire or get fired; and 68.75% of the workers who are at least 50 years old retire or get fired a. The means and standard deviations are given here: Supplier ȳ s b. Side-by-side Boxplots are given here: Deviations from Target Power Value for Three Suppliers Deviations from Target Supplier 1 Supplier 2 Supplier 3 Supplier The three distributions are relatively symmetric but supplier 3 is considerably more variable and is shifted above supplier 1 s values, which in turn are shifted above supplier 2 s values. 18

15 c. Supplier 3 not only has the largest mean but also the largest standard deviation. Suppliers 1 and 2 have similar degrees of variability but supplier 1 has a greater mean than supplier 2. d. Supplier 2 because it has the smallest mean and deviates with essentially the same degree of variability as supplier A scatterplot of M3 versus M2 is given here. Money Supply (Trillions of Dollars) M M2 a. Yes, it would since we want to determine the relative changes in the two over the 20 month period of time. b. See scatterplot. The two measures follow an approximately increasing linear relationship. 19

16 3.57 A time series plot with M2 and M3 values on the vertical axis and months on the horizontal axis is given here. Money Supply (Trillions of Dollars) Money Supply M2 M Month This is a more informative plot than the scatterplot because it shows the relative changes of the two measures of money supply across the 20 months. 20

17 Supplementary Exercises 3.58 a. Mean = 57.5, Median = 34.0 b. Median since the data has a few very large values which results in the mean being larger than all but a few of the data values. c. Range = 273, s = 70.2 d. Using the approximation, s range/4 = 273/4 = The approximation is fairly accurate. e. ȳ ± s ( 12.7, 127.7); yields 82% ȳ ± 2s ( 82.9, 197.0); yields 94% ȳ ± 3s ( 153.1, 268.1); yields 97% These percentages do not match the Empirical Rule very well: 68%, 95%, and 99.7% f. The Empirical Rule applies to data sets with roughly a mound-shaped histogram. The distribution of this data set is highly skewed right a. mode = 5, median = 15, mean = b. range = 34-4 = 30, s 30/4 = 7.5 c. s = 8.5 d. No, the histogram for the data set is skewed to the right and hence is not moundshaped a. Price per roll: Mean = , s = Price per sheet: Mean = , s = b. Price per roll: CV = = 46.03% Price per sheet: CV = = 54.13% The price per sheet is more variable relative to its mean. c. CV; The CV value is unit free, whereas the standard deviation also reflects the relative magnitude of the data values Plot the data in a scatterplot. a. No b. No, as the number of sheets increases from 50 to 100, there is just a scatter of points, no real pattern. The price per roll jumps dramatically for the two brands having the largest number of sheets. c. Paper towel sheets vary in thickness and size which will affect the price From the two boxplots, there are 5 unusual brands with regards to price per roll: $1.49, $1.56, $1.59, $1.78, and $1.98. There are 2 unusual brands with respect to sheet per roll: 180 and a. Mean = 0.65; Median = 0.55; Mode =

18 b. Mean = 1.21; Median = 0.55; Mode = 0.5 The mean increases but the median and mode remain the same a. Range = = 0.9, s = b. s 0.9/4 = (fairly close to the actual value of 0.276) c. Range = = 6.5, s = 1.974, s 6.5/4 = Range = = 67.7, s = 21.32, s 67.7/4 = Note as the data becomes more dispersed, the range increases more rapidly than s a. Construct a relative frequency histogram b. Highly skewed to the right c ± 3488 (-2063, 4912) contains 37/41 = 90.2% 1424 ± (2)3488 (-5551, 8400) contains 38/41 = 92.7% 1424 ± (3)3488 (-9039, 11888) contains 39/41 = 95.1% These values do not match the percentages from the Empirical Rule: 68%, 95%, and 99.7%. d ± 1.54) (-0.06, 3.02) contains 31/41 = 75.6% 1.48 ± (2)1.54 (-1.60, 4.56) contains 40/41 = 97.6% 1.48 ± (3)1.54 (-3.14, 6.10) contains 41/41 = 100% These values closely match the percentages from the Empirical Rule: 68%, 95%, and 99.7% The values for the two years are given here: Year Mean Median Std. Dev a. The median is more appropriate since both distributions are left skewed. b. There is not a large difference in the summary statistics between the two years a. There has been very little change from 1985 to b. Yes; For both 1985 and 1996, DC had extremely low ownership. New York and Hawaii had semi-low ownership. c. No d. The cost of homes is very high Relative frequency histogram plot a. Mode = 2.5 Median L + w f m [0.5n cf b ] = [0.5(90) 35] = 7.04 b. Mean 1 13 n i=1 f iy i = 747/90 = 8.3 c. Since the distribution is skewed to the right, the median provides a better measure of the center of the distribution. 22

19 3.70 a. 75th percentile L + w f p (0.75n cf b ) = [0.75(90) 72] = th percentile L + w f p (0.25n cf b ) = frac215[.25(90) 20] = 3.83 interquartile range = b. s n 1 i=1 f i(y i 8.3) 2 = 1 89 [2(.5 8.3)2 + 18( ) ( ) 2 ] = Thus, s = The quantile plots are given here. Homeownership in Percentage(by State) for 1985 Homeownership in Percentage(by State) for 1996 Homeownership(%) Homeownership(%) u u a. The 20th percentile for 1996 is given by reading the vertical value on the graph for u = %. Thus, approximately 20% of the 1996 homeownership percentages are less than or equal to 63%. b. The upper 10th percentile would correspond to the states having the largest 5 percentages: Michigan, Indiana, West Virginia, Minnesota, and Maine. c. In 1985, the states falling in the upper 10th percentile are Pennsylvania, South Carolina, Wyoming, Maine, and West Virginia. There are only two states which fall into both groups The relative frequency distribution is unimodal, highly right skewed a. mode = 1.0 median =

20 b. mean = 10 i=0 f iy i = 663/500 = c. Becauce the mean is larger than the mode and median, the distribution is skewed to the right Since the distribution is highly skewed to the right, the Empirical Rule may not be appropriate a. Relative frequency histogram 11 b. Mean 1 n i=1 f iy i = 1 50 [(1)(52) + (4)(67) + + (1)(202)] = 5480/50 = s n 1 i=1 f i(y i 109.6) 2 = 1 49 [47412] = s = a. South: i y i = 17930, ȳ = 17930/30 = North: i y i = 14185, ȳ = 14185/29 = West: i y i = 19707, ȳ = 19707/29 = b. ȳ = 1 n [ i n iy i ] = 1 88 [30(597.67) + (29)(489.14) + (29)(679.55)] 1 88 [51822] = c. Total for all three regions is = Thus, ȳ = 1 88 [51822] = Sample mean computed in part b is equal to that obtained in part a. Note that if we were to just average the three means obtained in a we would obtain a slightly different answer since the sample sizes for the three regions are unequal. 1 3 [ ] = a. Plot relative frequency histogram. b c. mean = 25115/23 = , median = 1039 d. Because the mean is slightly larger than the median, it is likely that the distribution is slightly skewed to the right a. Mode = 688; yes, because it would indicate the most frequently occurring production. b. Median = 688 c. Mean = 17664/26 = d. Because the mean is somewhat less than the median, it is likely that the distribution skewed to the left. 24

21 3.79 The stem-and-leaf diagram is given here: Stem Leaf Yes, the distribution is left skewed a. median(q2) = 688, Ql = 667, Q3 = 701, IQR = = 34 b. The inner fence boundaries are: 667-(1.5)(34) = 616 and 701+(1.5)(34) = 752. There is an outlier at 547. c. The boxplot is given here: 25

22 Number of Cars Produced Per Shift Number of Cars Produced The means and medians are given here: Number of Members Mean Median a. mean = 9024/83 = b. Yes, we can use the following method: [20(93.75) + 23(98.652) + 16( ) + 14( ) + 10(131.90)]/83 = /83 = Results agree with part a, except for round-off errors. c. Median = 104 d. No 3.83 a. New policy ȳ = 2.27, s = 3.26 Old policy: ȳ = 4.60, s = 2.61 b. Both the average number of sick days and the variation in number of sick days have decreased with new policy. 26

23 3.84 New policy: ȳ = 1.93, s = 2.31 Old policy: ȳ = 4.27, s = 1.79 Yes, the ranges are affected a. The sample mean will be distorted by several large values which skew the distribution. State 5 and State 11 have more than 10 times as many plants destroyed as any other state; for arrests, States 1, 2, 8 and 12 exceed the other arrest figures substantially. b. Plants: ȳ = /15 = Arrests:ȳ = 1425/15 = 95 10% trimmed mean: Plants: ȳ = /11 = Arrests: ȳ = 657/11 = % trimmed mean: Plants: ȳ = /9 = 133, Arrests: ȳ = 372/9 = For plants, the 10% trimmed mean works well since it eliminates the effect of States 5 and 11. For arrests, the means differ because each takes some of the high values out of calculation. It appears that the distribution is not skewed, but rather separated into at least two parts: states with high number of arrests and states with low numbers. By trimming the mean, we may be losing important information Although the six states with fewest plants also have the fewest arrests, the states with the most plants do not have the most arrests. We could get a better idea of the relationship by plotting the two variables in a scatterplot. Other variables which might relate to number of plants destroyed include size of police force (especially the narcotics division) and size of budget allocated to drug enforcement a. A time series plot is given here: 27

24 Four Monthly FDC Indices FDC Index Pharmaceticals Diversified Chain Wholesaler Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec Month b. All the components vary throughout the year with a slight increasing trend. Wholesalers show the greatest degree of variability through the year The percent change for the four indices are given here: Month Pharmaceuticals Diversified Chain Wholesaler January February March April May June July August September October November December A graph of percent changes helps to show comparisons between components. The diversified index shows much greater monthly variability than the other components; pharmaceuticals also vary, but not as rapidly as diversified a. Average Price =

25 b. Range = c. DJIA = d. Yes; The stocks covered on the NYSE. No; The companies selected are the leading companies in the different sectors of business a. No. The distributions of the data for the two years are left skewed. b. ȳ = 66.84, s = 6.69 ȳ ± s (60.16, 73.53) contains 82% of the data values ȳ ± 2s (53.46, 80.22) contains 94% of the data values ȳ ± 3s (46.77, 86.91) contains 98% of the data values These percentages are not very consistent with those provided by the Empirical Rule: 68%, 95%, 99.7% Only the District of Columbia with 37.4% and 40.4% homeownership in 1985 and 1996, respectively, fell more than 3 standard deviations from the mean. The stem-and-leaf plot showed this value at the low end of the scale, but did not really separate it too much from the rest of the data. It is not too extreme and should not decrease the mean too much. The raw and 10% trimmed means are given here: Year Raw Mean 10% Trimmed Mean a. The job-history percentages within each source are give here: Source Job History Within Firm Related Business Unrelated Business Promoted Same position Resigned Dismissed Total 100(n=57) 100(n=21) 100(n 42) b. If for each source, we compute the percentages combined over the promoted and same position categories, we find that they are 78.94% for within firms, 57.15% for related business and 66.67% for unrelated business. This ordering by source also holds for every job history category except the promoted one in which the three sources are nearly equal. It appears that a company does best when it selects its middle managers from within its own firm and worst when it takes its choices from a related firm a. The value 62 reflects the number of respondents in coal producing states who preferred a national energy policy that encouraged coal production. The value 32.8 is of those who favored a coal policy, 32.80% came from major coal producing states. The value 41.3 tells us that 41.3% of those from coal states favored a coal policy. And the value 7.8 tells us that 7.8% of all the responses come from residents of major coal producing states who were in favor of a national energy policy that encouraged coal production. b. The column percentages because they displayed the distribution of opinions within each of the three types of states. 29

26 c. Yes. For both the coal and oil-gas states, the largest percentage of responses favored the type of energy produced in their own state Arbitration seems to win the largest wage increases. If we assume that the Empirical Rule holds for these data, then a standard error for the mean of the arbitration figures would be s/ n =.25. Thus the mean increase after arbitration is ( )/.25 = 4 standard errors above the next largest mean, that for Poststrike. Management, on the other hand, should favor negotiation. It has the smallest mean percentage wage increase and the smallest variance, or least risk. 30

TOPIC: Descriptive Statistics Single Variable

TOPIC: Descriptive Statistics Single Variable TOPIC: Descriptive Statistics Single Variable I. Numerical data summary measurements A. Measures of Location. Measures of central tendency Mean; Median; Mode. Quantiles - measures of noncentral tendency

More information

Chapter 3. Data Description

Chapter 3. Data Description Chapter 3. Data Description Graphical Methods Pie chart It is used to display the percentage of the total number of measurements falling into each of the categories of the variable by partition a circle.

More information

Section 3. Measures of Variation

Section 3. Measures of Variation Section 3 Measures of Variation Range Range = (maximum value) (minimum value) It is very sensitive to extreme values; therefore not as useful as other measures of variation. Sample Standard Deviation The

More information

Chapter 1. Looking at Data

Chapter 1. Looking at Data Chapter 1 Looking at Data Types of variables Looking at Data Be sure that each variable really does measure what you want it to. A poor choice of variables can lead to misleading conclusions!! For example,

More information

Lecture 2. Quantitative variables. There are three main graphical methods for describing, summarizing, and detecting patterns in quantitative data:

Lecture 2. Quantitative variables. There are three main graphical methods for describing, summarizing, and detecting patterns in quantitative data: Lecture 2 Quantitative variables There are three main graphical methods for describing, summarizing, and detecting patterns in quantitative data: Stemplot (stem-and-leaf plot) Histogram Dot plot Stemplots

More information

STT 315 This lecture is based on Chapter 2 of the textbook.

STT 315 This lecture is based on Chapter 2 of the textbook. STT 315 This lecture is based on Chapter 2 of the textbook. Acknowledgement: Author is thankful to Dr. Ashok Sinha, Dr. Jennifer Kaplan and Dr. Parthanil Roy for allowing him to use/edit some of their

More information

Elementary Statistics

Elementary Statistics Elementary Statistics Q: What is data? Q: What does the data look like? Q: What conclusions can we draw from the data? Q: Where is the middle of the data? Q: Why is the spread of the data important? Q:

More information

Stat 101 Exam 1 Important Formulas and Concepts 1

Stat 101 Exam 1 Important Formulas and Concepts 1 1 Chapter 1 1.1 Definitions Stat 101 Exam 1 Important Formulas and Concepts 1 1. Data Any collection of numbers, characters, images, or other items that provide information about something. 2. Categorical/Qualitative

More information

Copyright 2017 Edmentum - All rights reserved.

Copyright 2017 Edmentum - All rights reserved. Study Island Copyright 2017 Edmentum - All rights reserved. Generation Date: 11/30/2017 Generated By: Charisa Reggie 1. The Little Shop of Sweets on the Corner sells ice cream, pastries, and hot cocoa.

More information

Describing distributions with numbers

Describing distributions with numbers Describing distributions with numbers A large number or numerical methods are available for describing quantitative data sets. Most of these methods measure one of two data characteristics: The central

More information

The empirical ( ) rule

The empirical ( ) rule The empirical (68-95-99.7) rule With a bell shaped distribution, about 68% of the data fall within a distance of 1 standard deviation from the mean. 95% fall within 2 standard deviations of the mean. 99.7%

More information

BIOL 51A - Biostatistics 1 1. Lecture 1: Intro to Biostatistics. Smoking: hazardous? FEV (l) Smoke

BIOL 51A - Biostatistics 1 1. Lecture 1: Intro to Biostatistics. Smoking: hazardous? FEV (l) Smoke BIOL 51A - Biostatistics 1 1 Lecture 1: Intro to Biostatistics Smoking: hazardous? FEV (l) 1 2 3 4 5 No Yes Smoke BIOL 51A - Biostatistics 1 2 Box Plot a.k.a box-and-whisker diagram or candlestick chart

More information

Describing distributions with numbers

Describing distributions with numbers Describing distributions with numbers A large number or numerical methods are available for describing quantitative data sets. Most of these methods measure one of two data characteristics: The central

More information

CHAPTER 1. Introduction

CHAPTER 1. Introduction CHAPTER 1 Introduction Engineers and scientists are constantly exposed to collections of facts, or data. The discipline of statistics provides methods for organizing and summarizing data, and for drawing

More information

Exam: practice test 1 MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question.

Exam: practice test 1 MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question. Exam: practice test MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question. Solve the problem. ) Using the information in the table on home sale prices in

More information

Unit Two Descriptive Biostatistics. Dr Mahmoud Alhussami

Unit Two Descriptive Biostatistics. Dr Mahmoud Alhussami Unit Two Descriptive Biostatistics Dr Mahmoud Alhussami Descriptive Biostatistics The best way to work with data is to summarize and organize them. Numbers that have not been summarized and organized are

More information

2011 Pearson Education, Inc

2011 Pearson Education, Inc Statistics for Business and Economics Chapter 2 Methods for Describing Sets of Data Summary of Central Tendency Measures Measure Formula Description Mean x i / n Balance Point Median ( n +1) Middle Value

More information

Math 223 Lecture Notes 3/15/04 From The Basic Practice of Statistics, bymoore

Math 223 Lecture Notes 3/15/04 From The Basic Practice of Statistics, bymoore Math 223 Lecture Notes 3/15/04 From The Basic Practice of Statistics, bymoore Chapter 3 continued Describing distributions with numbers Measuring spread of data: Quartiles Definition 1: The interquartile

More information

Exercises from Chapter 3, Section 1

Exercises from Chapter 3, Section 1 Exercises from Chapter 3, Section 1 1. Consider the following sample consisting of 20 numbers. (a) Find the mode of the data 21 23 24 24 25 26 29 30 32 34 39 41 41 41 42 43 48 51 53 53 (b) Find the median

More information

3.1 Measure of Center

3.1 Measure of Center 3.1 Measure of Center Calculate the mean for a given data set Find the median, and describe why the median is sometimes preferable to the mean Find the mode of a data set Describe how skewness affects

More information

8/4/2009. Describing Data with Graphs

8/4/2009. Describing Data with Graphs Describing Data with Graphs 1 A variable is a characteristic that changes or varies over time and/or for different individuals or objects under consideration. Examples: Hair color, white blood cell count,

More information

are the objects described by a set of data. They may be people, animals or things.

are the objects described by a set of data. They may be people, animals or things. ( c ) E p s t e i n, C a r t e r a n d B o l l i n g e r 2016 C h a p t e r 5 : E x p l o r i n g D a t a : D i s t r i b u t i o n s P a g e 1 CHAPTER 5: EXPLORING DATA DISTRIBUTIONS 5.1 Creating Histograms

More information

WHEN IS IT EVER GOING TO RAIN? Table of Average Annual Rainfall and Rainfall For Selected Arizona Cities

WHEN IS IT EVER GOING TO RAIN? Table of Average Annual Rainfall and Rainfall For Selected Arizona Cities WHEN IS IT EVER GOING TO RAIN? Table of Average Annual Rainfall and 2001-2002 Rainfall For Selected Arizona Cities Phoenix Tucson Flagstaff Avg. 2001-2002 Avg. 2001-2002 Avg. 2001-2002 October 0.7 0.0

More information

Perhaps the most important measure of location is the mean (average). Sample mean: where n = sample size. Arrange the values from smallest to largest:

Perhaps the most important measure of location is the mean (average). Sample mean: where n = sample size. Arrange the values from smallest to largest: 1 Chapter 3 - Descriptive stats: Numerical measures 3.1 Measures of Location Mean Perhaps the most important measure of location is the mean (average). Sample mean: where n = sample size Example: The number

More information

Histograms allow a visual interpretation

Histograms allow a visual interpretation Chapter 4: Displaying and Summarizing i Quantitative Data s allow a visual interpretation of quantitative (numerical) data by indicating the number of data points that lie within a range of values, called

More information

Chapter 2: Tools for Exploring Univariate Data

Chapter 2: Tools for Exploring Univariate Data Stats 11 (Fall 2004) Lecture Note Introduction to Statistical Methods for Business and Economics Instructor: Hongquan Xu Chapter 2: Tools for Exploring Univariate Data Section 2.1: Introduction What is

More information

What is statistics? Statistics is the science of: Collecting information. Organizing and summarizing the information collected

What is statistics? Statistics is the science of: Collecting information. Organizing and summarizing the information collected What is statistics? Statistics is the science of: Collecting information Organizing and summarizing the information collected Analyzing the information collected in order to draw conclusions Two types

More information

What is Statistics? Statistics is the science of understanding data and of making decisions in the face of variability and uncertainty.

What is Statistics? Statistics is the science of understanding data and of making decisions in the face of variability and uncertainty. What is Statistics? Statistics is the science of understanding data and of making decisions in the face of variability and uncertainty. Statistics is a field of study concerned with the data collection,

More information

Descriptive Statistics-I. Dr Mahmoud Alhussami

Descriptive Statistics-I. Dr Mahmoud Alhussami Descriptive Statistics-I Dr Mahmoud Alhussami Biostatistics What is the biostatistics? A branch of applied math. that deals with collecting, organizing and interpreting data using well-defined procedures.

More information

STP 420 INTRODUCTION TO APPLIED STATISTICS NOTES

STP 420 INTRODUCTION TO APPLIED STATISTICS NOTES INTRODUCTION TO APPLIED STATISTICS NOTES PART - DATA CHAPTER LOOKING AT DATA - DISTRIBUTIONS Individuals objects described by a set of data (people, animals, things) - all the data for one individual make

More information

Int Math 1 Statistic and Probability. Name:

Int Math 1 Statistic and Probability. Name: Name: Int Math 1 1. Juan wants to rent a house. He gathers data on many similar houses. The distance from the center of the city, x, and the monthly rent for each house, y, are shown in the scatter plot.

More information

DEPARTMENT OF QUANTITATIVE METHODS & INFORMATION SYSTEMS QM 120. Spring 2008

DEPARTMENT OF QUANTITATIVE METHODS & INFORMATION SYSTEMS QM 120. Spring 2008 DEPARTMENT OF QUANTITATIVE METHODS & INFORMATION SYSTEMS Introduction to Business Statistics QM 120 Chapter 3 Spring 2008 Measures of central tendency for ungrouped data 2 Graphs are very helpful to describe

More information

STAT 200 Chapter 1 Looking at Data - Distributions

STAT 200 Chapter 1 Looking at Data - Distributions STAT 200 Chapter 1 Looking at Data - Distributions What is Statistics? Statistics is a science that involves the design of studies, data collection, summarizing and analyzing the data, interpreting the

More information

Topic 3: Introduction to Statistics. Algebra 1. Collecting Data. Table of Contents. Categorical or Quantitative? What is the Study of Statistics?!

Topic 3: Introduction to Statistics. Algebra 1. Collecting Data. Table of Contents. Categorical or Quantitative? What is the Study of Statistics?! Topic 3: Introduction to Statistics Collecting Data We collect data through observation, surveys and experiments. We can collect two different types of data: Categorical Quantitative Algebra 1 Table of

More information

Lecture Slides. Elementary Statistics Tenth Edition. by Mario F. Triola. and the Triola Statistics Series. Slide 1

Lecture Slides. Elementary Statistics Tenth Edition. by Mario F. Triola. and the Triola Statistics Series. Slide 1 Lecture Slides Elementary Statistics Tenth Edition and the Triola Statistics Series by Mario F. Triola Slide 1 Chapter 3 Statistics for Describing, Exploring, and Comparing Data 3-1 Overview 3-2 Measures

More information

IB Questionbank Mathematical Studies 3rd edition. Grouped discrete. 184 min 183 marks

IB Questionbank Mathematical Studies 3rd edition. Grouped discrete. 184 min 183 marks IB Questionbank Mathematical Studies 3rd edition Grouped discrete 184 min 183 marks 1. The weights in kg, of 80 adult males, were collected and are summarized in the box and whisker plot shown below. Write

More information

Unit 2: Numerical Descriptive Measures

Unit 2: Numerical Descriptive Measures Unit 2: Numerical Descriptive Measures Summation Notation Measures of Central Tendency Measures of Dispersion Chebyshev's Rule Empirical Rule Measures of Relative Standing Box Plots z scores Jan 28 10:48

More information

Let's Do It! What Type of Variable?

Let's Do It! What Type of Variable? Ch Online homework list: Describing Data Sets Graphical Representation of Data Summary statistics: Measures of Center Box Plots, Outliers, and Standard Deviation Ch Online quizzes list: Quiz 1: Introduction

More information

AP Final Review II Exploring Data (20% 30%)

AP Final Review II Exploring Data (20% 30%) AP Final Review II Exploring Data (20% 30%) Quantitative vs Categorical Variables Quantitative variables are numerical values for which arithmetic operations such as means make sense. It is usually a measure

More information

Descriptive Data Summarization

Descriptive Data Summarization Descriptive Data Summarization Descriptive data summarization gives the general characteristics of the data and identify the presence of noise or outliers, which is useful for successful data cleaning

More information

MATH 1150 Chapter 2 Notation and Terminology

MATH 1150 Chapter 2 Notation and Terminology MATH 1150 Chapter 2 Notation and Terminology Categorical Data The following is a dataset for 30 randomly selected adults in the U.S., showing the values of two categorical variables: whether or not the

More information

Name: JMJ April 10, 2017 Trigonometry A2 Trimester 2 Exam 8:40 AM 10:10 AM Mr. Casalinuovo

Name: JMJ April 10, 2017 Trigonometry A2 Trimester 2 Exam 8:40 AM 10:10 AM Mr. Casalinuovo Name: JMJ April 10, 2017 Trigonometry A2 Trimester 2 Exam 8:40 AM 10:10 AM Mr. Casalinuovo Part 1: You MUST answer this problem. It is worth 20 points. 1) Temperature vs. Cricket Chirps: Crickets make

More information

Summarizing and Displaying Measurement Data/Understanding and Comparing Distributions

Summarizing and Displaying Measurement Data/Understanding and Comparing Distributions Summarizing and Displaying Measurement Data/Understanding and Comparing Distributions Histograms, Mean, Median, Five-Number Summary and Boxplots, Standard Deviation Thought Questions 1. If you were to

More information

The Empirical Rule, z-scores, and the Rare Event Approach

The Empirical Rule, z-scores, and the Rare Event Approach Overview The Empirical Rule, z-scores, and the Rare Event Approach Look at Chebyshev s Rule and the Empirical Rule Explore some applications of the Empirical Rule How to calculate and use z-scores Introducing

More information

Determine the trend for time series data

Determine the trend for time series data Extra Online Questions Determine the trend for time series data Covers AS 90641 (Statistics and Modelling 3.1) Scholarship Statistics and Modelling Chapter 1 Essent ial exam notes Time series 1. The value

More information

A is one of the categories into which qualitative data can be classified.

A is one of the categories into which qualitative data can be classified. Chapter 2 Methods for Describing Sets of Data 2.1 Describing qualitative data Recall qualitative data: non-numerical or categorical data Basic definitions: A is one of the categories into which qualitative

More information

Chapter 4. Displaying and Summarizing. Quantitative Data

Chapter 4. Displaying and Summarizing. Quantitative Data STAT 141 Introduction to Statistics Chapter 4 Displaying and Summarizing Quantitative Data Bin Zou (bzou@ualberta.ca) STAT 141 University of Alberta Winter 2015 1 / 31 4.1 Histograms 1 We divide the range

More information

Unit 2. Describing Data: Numerical

Unit 2. Describing Data: Numerical Unit 2 Describing Data: Numerical Describing Data Numerically Describing Data Numerically Central Tendency Arithmetic Mean Median Mode Variation Range Interquartile Range Variance Standard Deviation Coefficient

More information

Chapter. Numerically Summarizing Data Pearson Prentice Hall. All rights reserved

Chapter. Numerically Summarizing Data Pearson Prentice Hall. All rights reserved Chapter 3 Numerically Summarizing Data Section 3.1 Measures of Central Tendency Objectives 1. Determine the arithmetic mean of a variable from raw data 2. Determine the median of a variable from raw data

More information

Unit 1: Statistics. Mrs. Valentine Math III

Unit 1: Statistics. Mrs. Valentine Math III Unit 1: Statistics Mrs. Valentine Math III 1.1 Analyzing Data Statistics Study, analysis, and interpretation of data Find measure of central tendency Mean average of the data Median Odd # data pts: middle

More information

1-1. Chapter 1. Sampling and Descriptive Statistics by The McGraw-Hill Companies, Inc. All rights reserved.

1-1. Chapter 1. Sampling and Descriptive Statistics by The McGraw-Hill Companies, Inc. All rights reserved. 1-1 Chapter 1 Sampling and Descriptive Statistics 1-2 Why Statistics? Deal with uncertainty in repeated scientific measurements Draw conclusions from data Design valid experiments and draw reliable conclusions

More information

Units. Exploratory Data Analysis. Variables. Student Data

Units. Exploratory Data Analysis. Variables. Student Data Units Exploratory Data Analysis Bret Larget Departments of Botany and of Statistics University of Wisconsin Madison Statistics 371 13th September 2005 A unit is an object that can be measured, such as

More information

Lecture Slides. Elementary Statistics Twelfth Edition. by Mario F. Triola. and the Triola Statistics Series. Section 3.1- #

Lecture Slides. Elementary Statistics Twelfth Edition. by Mario F. Triola. and the Triola Statistics Series. Section 3.1- # Lecture Slides Elementary Statistics Twelfth Edition and the Triola Statistics Series by Mario F. Triola Chapter 3 Statistics for Describing, Exploring, and Comparing Data 3-1 Review and Preview 3-2 Measures

More information

Objective A: Mean, Median and Mode Three measures of central of tendency: the mean, the median, and the mode.

Objective A: Mean, Median and Mode Three measures of central of tendency: the mean, the median, and the mode. Chapter 3 Numerically Summarizing Data Chapter 3.1 Measures of Central Tendency Objective A: Mean, Median and Mode Three measures of central of tendency: the mean, the median, and the mode. A1. Mean The

More information

Introduction to Statistics

Introduction to Statistics Introduction to Statistics Data and Statistics Data consists of information coming from observations, counts, measurements, or responses. Statistics is the science of collecting, organizing, analyzing,

More information

Performance of fourth-grade students on an agility test

Performance of fourth-grade students on an agility test Starter Ch. 5 2005 #1a CW Ch. 4: Regression L1 L2 87 88 84 86 83 73 81 67 78 83 65 80 50 78 78? 93? 86? Create a scatterplot Find the equation of the regression line Predict the scores Chapter 5: Understanding

More information

The point is located eight units to the right of the y-axis and two units above the x-axis. A) ( 8, 2) B) (8, 2) C) ( 2, 8) D) (2, 8) E) ( 2, 8)

The point is located eight units to the right of the y-axis and two units above the x-axis. A) ( 8, 2) B) (8, 2) C) ( 2, 8) D) (2, 8) E) ( 2, 8) Name: Date: 1. Find the coordinates of the point. The point is located eight units to the right of the y-axis and two units above the x-axis. A) ( 8, ) B) (8, ) C) (, 8) D) (, 8) E) (, 8). Find the coordinates

More information

CHAPTER 5: EXPLORING DATA DISTRIBUTIONS. Individuals are the objects described by a set of data. These individuals may be people, animals or things.

CHAPTER 5: EXPLORING DATA DISTRIBUTIONS. Individuals are the objects described by a set of data. These individuals may be people, animals or things. (c) Epstein 2013 Chapter 5: Exploring Data Distributions Page 1 CHAPTER 5: EXPLORING DATA DISTRIBUTIONS 5.1 Creating Histograms Individuals are the objects described by a set of data. These individuals

More information

Sampling, Frequency Distributions, and Graphs (12.1)

Sampling, Frequency Distributions, and Graphs (12.1) 1 Sampling, Frequency Distributions, and Graphs (1.1) Design: Plan how to obtain the data. What are typical Statistical Methods? Collect the data, which is then subjected to statistical analysis, which

More information

additionalmathematicsstatisticsadditi onalmathematicsstatisticsadditionalm athematicsstatisticsadditionalmathem aticsstatisticsadditionalmathematicsst

additionalmathematicsstatisticsadditi onalmathematicsstatisticsadditionalm athematicsstatisticsadditionalmathem aticsstatisticsadditionalmathematicsst additionalmathematicsstatisticsadditi onalmathematicsstatisticsadditionalm athematicsstatisticsadditionalmathem aticsstatisticsadditionalmathematicsst STATISTICS atisticsadditionalmathematicsstatistic

More information

Line Graphs. 1. Use the data in the table to make a line graph. 2. When did the amount spent on electronics increase the most?

Line Graphs. 1. Use the data in the table to make a line graph. 2. When did the amount spent on electronics increase the most? Practice A Line Graphs Use the table to answer the questions. U.S. Personal Spending on Selected Electronics Amount Spent Year ($billions, estimated) 1994 $71 1996 $80 1998 $90 2000 $107 1. Use the data

More information

SESSION 5 Descriptive Statistics

SESSION 5 Descriptive Statistics SESSION 5 Descriptive Statistics Descriptive statistics are used to describe the basic features of the data in a study. They provide simple summaries about the sample and the measures. Together with simple

More information

Math 082 Final Examination Review

Math 082 Final Examination Review Math 08 Final Examination Review 1) Write the equation of the line that passes through the points (4, 6) and (0, 3). Write your answer in slope-intercept form. ) Write the equation of the line that passes

More information

Chapter 2: Descriptive Analysis and Presentation of Single- Variable Data

Chapter 2: Descriptive Analysis and Presentation of Single- Variable Data Chapter 2: Descriptive Analysis and Presentation of Single- Variable Data Mean 26.86667 Standard Error 2.816392 Median 25 Mode 20 Standard Deviation 10.90784 Sample Variance 118.981 Kurtosis -0.61717 Skewness

More information

Chapter 6 Assessment. 3. Which points in the data set below are outliers? Multiple Choice. 1. The boxplot summarizes the test scores of a math class?

Chapter 6 Assessment. 3. Which points in the data set below are outliers? Multiple Choice. 1. The boxplot summarizes the test scores of a math class? Chapter Assessment Multiple Choice 1. The boxplot summarizes the test scores of a math class? Test Scores 3. Which points in the data set below are outliers? 73, 73, 7, 75, 75, 75, 77, 77, 77, 77, 7, 7,

More information

Let's Do It! What Type of Variable?

Let's Do It! What Type of Variable? 1 2.1-2.3: Organizing Data DEFINITIONS: Qualitative Data are those which classify the units into categories. The categories may or may not have a natural ordering to them. Qualitative variables are also

More information

Chapter 6 The Standard Deviation as a Ruler and the Normal Model

Chapter 6 The Standard Deviation as a Ruler and the Normal Model Chapter 6 The Standard Deviation as a Ruler and the Normal Model Overview Key Concepts Understand how adding (subtracting) a constant or multiplying (dividing) by a constant changes the center and/or spread

More information

Statistics for Managers using Microsoft Excel 6 th Edition

Statistics for Managers using Microsoft Excel 6 th Edition Statistics for Managers using Microsoft Excel 6 th Edition Chapter 3 Numerical Descriptive Measures 3-1 Learning Objectives In this chapter, you learn: To describe the properties of central tendency, variation,

More information

Continuous random variables

Continuous random variables Continuous random variables A continuous random variable X takes all values in an interval of numbers. The probability distribution of X is described by a density curve. The total area under a density

More information

Example 2. Given the data below, complete the chart:

Example 2. Given the data below, complete the chart: Statistics 2035 Quiz 1 Solutions Example 1. 2 64 150 150 2 128 150 2 256 150 8 8 Example 2. Given the data below, complete the chart: 52.4, 68.1, 66.5, 75.0, 60.5, 78.8, 63.5, 48.9, 81.3 n=9 The data is

More information

CIVL 7012/8012. Collection and Analysis of Information

CIVL 7012/8012. Collection and Analysis of Information CIVL 7012/8012 Collection and Analysis of Information Uncertainty in Engineering Statistics deals with the collection and analysis of data to solve real-world problems. Uncertainty is inherent in all real

More information

Full file at

Full file at IV SOLUTIONS TO EXERCISES Note: Exercises whose answers are given in the back of the textbook are denoted by the symbol. CHAPTER Description of Samples and Populations Note: Exercises whose answers are

More information

Math: Question 1 A. 4 B. 5 C. 6 D. 7

Math: Question 1 A. 4 B. 5 C. 6 D. 7 Math: Question 1 Abigail can read 200 words in one minute. If she were to read at this rate for 30 minutes each day, how many days would Abigail take to read 30,000 words of a book? A. 4 B. 5 C. 6 D. 7

More information

Lecture 1: Descriptive Statistics

Lecture 1: Descriptive Statistics Lecture 1: Descriptive Statistics MSU-STT-351-Sum 15 (P. Vellaisamy: MSU-STT-351-Sum 15) Probability & Statistics for Engineers 1 / 56 Contents 1 Introduction 2 Branches of Statistics Descriptive Statistics

More information

Vocabulary: Samples and Populations

Vocabulary: Samples and Populations Vocabulary: Samples and Populations Concept Different types of data Categorical data results when the question asked in a survey or sample can be answered with a nonnumerical answer. For example if we

More information

(quantitative or categorical variables) Numerical descriptions of center, variability, position (quantitative variables)

(quantitative or categorical variables) Numerical descriptions of center, variability, position (quantitative variables) 3. Descriptive Statistics Describing data with tables and graphs (quantitative or categorical variables) Numerical descriptions of center, variability, position (quantitative variables) Bivariate descriptions

More information

Pre-Calc Chapter 1 Sample Test. D) slope: 3 4

Pre-Calc Chapter 1 Sample Test. D) slope: 3 4 Pre-Calc Chapter 1 Sample Test 1. Use the graphs of f and g to evaluate the function. f( x) gx ( ) (f o g)(-0.5) 1 1 0 4. Plot the points and find the slope of the line passing through the pair of points.

More information

Further Mathematics 2018 CORE: Data analysis Chapter 2 Summarising numerical data

Further Mathematics 2018 CORE: Data analysis Chapter 2 Summarising numerical data Chapter 2: Summarising numerical data Further Mathematics 2018 CORE: Data analysis Chapter 2 Summarising numerical data Extract from Study Design Key knowledge Types of data: categorical (nominal and ordinal)

More information

QUANTITATIVE DATA. UNIVARIATE DATA data for one variable

QUANTITATIVE DATA. UNIVARIATE DATA data for one variable QUANTITATIVE DATA Recall that quantitative (numeric) data values are numbers where data take numerical values for which it is sensible to find averages, such as height, hourly pay, and pulse rates. UNIVARIATE

More information

ALGEBRA 1 SEMESTER 1 INSTRUCTIONAL MATERIALS Courses: Algebra 1 S1 (#2201) and Foundations in Algebra 1 S1 (#7769)

ALGEBRA 1 SEMESTER 1 INSTRUCTIONAL MATERIALS Courses: Algebra 1 S1 (#2201) and Foundations in Algebra 1 S1 (#7769) Multiple Choice: Identify the choice that best completes the statement or answers the question. 1. Ramal goes to the grocery store and buys pounds of apples and pounds of bananas. Apples cost dollars per

More information

F78SC2 Notes 2 RJRC. If the interest rate is 5%, we substitute x = 0.05 in the formula. This gives

F78SC2 Notes 2 RJRC. If the interest rate is 5%, we substitute x = 0.05 in the formula. This gives F78SC2 Notes 2 RJRC Algebra It is useful to use letters to represent numbers. We can use the rules of arithmetic to manipulate the formula and just substitute in the numbers at the end. Example: 100 invested

More information

Technical note on seasonal adjustment for M0

Technical note on seasonal adjustment for M0 Technical note on seasonal adjustment for M0 July 1, 2013 Contents 1 M0 2 2 Steps in the seasonal adjustment procedure 3 2.1 Pre-adjustment analysis............................... 3 2.2 Seasonal adjustment.................................

More information

The science of learning from data.

The science of learning from data. STATISTICS (PART 1) The science of learning from data. Numerical facts Collection of methods for planning experiments, obtaining data and organizing, analyzing, interpreting and drawing the conclusions

More information

Range The range is the simplest of the three measures and is defined now.

Range The range is the simplest of the three measures and is defined now. Measures of Variation EXAMPLE A testing lab wishes to test two experimental brands of outdoor paint to see how long each will last before fading. The testing lab makes 6 gallons of each paint to test.

More information

MATH 2560 C F03 Elementary Statistics I Lecture 1: Displaying Distributions with Graphs. Outline.

MATH 2560 C F03 Elementary Statistics I Lecture 1: Displaying Distributions with Graphs. Outline. MATH 2560 C F03 Elementary Statistics I Lecture 1: Displaying Distributions with Graphs. Outline. data; variables: categorical & quantitative; distributions; bar graphs & pie charts: What Is Statistics?

More information

3 GRAPHICAL DISPLAYS OF DATA

3 GRAPHICAL DISPLAYS OF DATA some without indicating nonnormality. If a sample of 30 observations contains 4 outliers, two of which are extreme, would it be reasonable to assume the population from which the data were collected has

More information

Types of Information. Topic 2 - Descriptive Statistics. Examples. Sample and Sample Size. Background Reading. Variables classified as STAT 511

Types of Information. Topic 2 - Descriptive Statistics. Examples. Sample and Sample Size. Background Reading. Variables classified as STAT 511 Topic 2 - Descriptive Statistics STAT 511 Professor Bruce Craig Types of Information Variables classified as Categorical (qualitative) - variable classifies individual into one of several groups or categories

More information

Chapters 1 & 2 Exam Review

Chapters 1 & 2 Exam Review Problems 1-3 refer to the following five boxplots. 1.) To which of the above boxplots does the following histogram correspond? (A) A (B) B (C) C (D) D (E) E 2.) To which of the above boxplots does the

More information

Practice Questions for Exam 1

Practice Questions for Exam 1 Practice Questions for Exam 1 1. A used car lot evaluates their cars on a number of features as they arrive in the lot in order to determine their worth. Among the features looked at are miles per gallon

More information

1. Exploratory Data Analysis

1. Exploratory Data Analysis 1. Exploratory Data Analysis 1.1 Methods of Displaying Data A visual display aids understanding and can highlight features which may be worth exploring more formally. Displays should have impact and be

More information

Honors Algebra 1 - Fall Final Review

Honors Algebra 1 - Fall Final Review Name: Period Date: Honors Algebra 1 - Fall Final Review This review packet is due at the beginning of your final exam. In addition to this packet, you should study each of your unit reviews and your notes.

More information

Statistics for IT Managers

Statistics for IT Managers Statistics for IT Managers 95-796, Fall 2012 Module 2: Hypothesis Testing and Statistical Inference (5 lectures) Reading: Statistics for Business and Economics, Ch. 5-7 Confidence intervals Given the sample

More information

ALGEBRA I SEMESTER EXAMS PRACTICE MATERIALS SEMESTER (1.1) Examine the dotplots below from three sets of data Set A

ALGEBRA I SEMESTER EXAMS PRACTICE MATERIALS SEMESTER (1.1) Examine the dotplots below from three sets of data Set A 1. (1.1) Examine the dotplots below from three sets of data. 0 2 4 6 8 10 Set A 0 2 4 6 8 10 Set 0 2 4 6 8 10 Set C The mean of each set is 5. The standard deviations of the sets are 1.3, 2.0, and 2.9.

More information

Descriptive Univariate Statistics and Bivariate Correlation

Descriptive Univariate Statistics and Bivariate Correlation ESC 100 Exploring Engineering Descriptive Univariate Statistics and Bivariate Correlation Instructor: Sudhir Khetan, Ph.D. Wednesday/Friday, October 17/19, 2012 The Central Dogma of Statistics used to

More information

Introduction to Forecasting

Introduction to Forecasting Introduction to Forecasting Introduction to Forecasting Predicting the future Not an exact science but instead consists of a set of statistical tools and techniques that are supported by human judgment

More information

Chapter 3 Data Description

Chapter 3 Data Description Chapter 3 Data Description Section 3.1: Measures of Central Tendency Section 3.2: Measures of Variation Section 3.3: Measures of Position Section 3.1: Measures of Central Tendency Definition of Average

More information

Chapter. Numerically Summarizing Data. Copyright 2013, 2010 and 2007 Pearson Education, Inc.

Chapter. Numerically Summarizing Data. Copyright 2013, 2010 and 2007 Pearson Education, Inc. Chapter 3 Numerically Summarizing Data Section 3.1 Measures of Central Tendency Objectives 1. Determine the arithmetic mean of a variable from raw data 2. Determine the median of a variable from raw data

More information

Chiang Rai Province CC Threat overview AAS1109 Mekong ARCC

Chiang Rai Province CC Threat overview AAS1109 Mekong ARCC Chiang Rai Province CC Threat overview AAS1109 Mekong ARCC This threat overview relies on projections of future climate change in the Mekong Basin for the period 2045-2069 compared to a baseline of 1980-2005.

More information

Chapter 5: Exploring Data: Distributions Lesson Plan

Chapter 5: Exploring Data: Distributions Lesson Plan Lesson Plan Exploring Data Displaying Distributions: Histograms Interpreting Histograms Displaying Distributions: Stemplots Describing Center: Mean and Median Describing Variability: The Quartiles The

More information

Remember your SOCS! S: O: C: S:

Remember your SOCS! S: O: C: S: Remember your SOCS! S: O: C: S: 1.1: Displaying Distributions with Graphs Dotplot: Age of your fathers Low scale: 45 High scale: 75 Doesn t have to start at zero, just cover the range of the data Label

More information