Appendix C Further Concepts in Statistics

Stem-and-Leaf Plots
Histograms and Frequency Distributions
Line Graphs
Choosing an Appropriate Graph
Scatter Plots
Fitting a Line to Data
Measures of Central Tendency

Stem-and-Leaf Plots

Statistics is the branch of mathematics that studies techniques for collecting, organizing, and interpreting data. In this section, you will study several ways to organize and interpret data.

One type of plot that can be used to organize sets of numbers by hand is a stem-and-leaf plot. A set of test scores and the corresponding stem-and-leaf plot are shown below.

Test Scores
93, 70, 76, 58, 86, 93, 82, 78, 83, 86, 64, 78, 76, 66, 83, 83, 96, 74, 69, 76, 64, 74, 79, 76, 88, 76, 81, 82, 74, 70

Stems   Leaves
  5     8
  6     4 4 6 9
  7     0 0 4 4 4 6 6 6 6 6 8 8 9
  8     1 2 2 3 3 3 6 6 8
  9     3 3 6

Note that the leaves represent the units digits of the numbers and the stems represent the tens digits. Stem-and-leaf plots can also be used to compare two sets of data, as shown in the following example.

Example 1 Comparing Two Sets of Data

Use a stem-and-leaf plot to compare the test scores given above with the following test scores. Which set of test scores is better?

90, 81, 70, 62, 64, 73, 81, 92, 73, 81, 92, 93, 83, 75, 76, 83, 94, 96, 86, 77, 77, 86, 96, 86, 77, 86, 87, 87, 79, 88

Begin by ordering the second set of scores.

62, 64, 70, 73, 73, 75, 76, 77, 77, 77, 79, 81, 81, 81, 83, 83, 86, 86, 86, 86, 87, 87, 88, 90, 92, 92, 93, 94, 96, 96

Now that the data have been ordered, you can construct a double stem-and-leaf plot by letting the leaves to the right of the stems represent the units digits for the first group of test scores and letting the leaves to the left of the stems represent the units digits for the second group of test scores.
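The construction just described can also be sketched in code. The following Python sketch is illustrative only: the helper name stem_and_leaf and the variables first, second, left, and right are hypothetical, and the code assumes nonnegative whole-number scores whose stems are the tens digits.

```python
from collections import defaultdict


def stem_and_leaf(values):
    """Map each stem (tens digit) to its sorted list of leaves (units digits)."""
    plot = defaultdict(list)
    for value in sorted(values):
        stem, leaf = divmod(value, 10)
        plot[stem].append(leaf)
    return dict(plot)


# First and second groups of test scores from Example 1.
first = [93, 70, 76, 58, 86, 93, 82, 78, 83, 86, 64, 78, 76, 66, 83,
         83, 96, 74, 69, 76, 64, 74, 79, 76, 88, 76, 81, 82, 74, 70]
second = [90, 81, 70, 62, 64, 73, 81, 92, 73, 81, 92, 93, 83, 75, 76,
          83, 94, 96, 86, 77, 77, 86, 96, 86, 77, 86, 87, 87, 79, 88]

right = stem_and_leaf(first)   # leaves printed to the right of the stems
left = stem_and_leaf(second)   # leaves printed to the left of the stems

for stem in sorted(set(left) | set(right)):
    # Left-hand leaves are reversed so that they grow outward from the stem.
    left_leaves = " ".join(str(d) for d in reversed(left.get(stem, [])))
    right_leaves = " ".join(str(d) for d in right.get(stem, []))
    print(f"{left_leaves:>23} | {stem} | {right_leaves}")
```

Sorting the values before splitting them into stems and leaves plays the same role as ordering the scores by hand.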
     Leaves (2nd Group)  Stems  Leaves (1st Group)
                           5    8
                    4 2    6    4 4 6 9
      9 7 7 7 6 5 3 3 0    7    0 0 4 4 4 6 6 6 6 6 8 8 9
8 7 7 6 6 6 6 3 3 1 1 1    8    1 2 2 3 3 3 6 6 8
          6 6 4 3 2 2 0    9    3 3 6

By comparing the two sets of leaves, you can see that the second group of test scores is better than the first group: most of its scores fall in the 80s and 90s, whereas the scores in the first group are concentrated in the 70s.

Example 2 Using a Stem-and-Leaf Plot

The table shows the percent of the population of each state and the District of Columbia that was at least 65 years old in 2000. Use a stem-and-leaf plot to organize the data. (Source: U.S. Census Bureau)

AK  5.7   AL 13.0   AR 14.0   AZ 13.0   CA 10.6   CO  9.7
CT 13.8   DC 12.2   DE 13.0   FL 17.6   GA  9.6   HI 13.3
IA 14.9   ID 11.3   IL 12.1   IN 12.4   KS 13.3   KY 12.5
LA 11.6   MA 13.5   MD 11.3   ME 14.4   MI 12.3   MN 12.1
MO 13.5   MS 12.1   MT 13.4   NC 12.0   ND 14.7   NE 13.6
NH 12.0   NJ 13.2   NM 11.7   NV 11.0   NY 12.9   OH 13.3
OK 13.2   OR 12.8   PA 15.6   RI 14.5   SC 12.1   SD 14.3
TN 12.4   TX  9.9   UT  8.5   VA 11.2   VT 12.7   WA 11.2
WI 13.1   WV 15.3   WY 11.7

Begin by ordering the numbers, as shown below.

5.7, 8.5, 9.6, 9.7, 9.9, 10.6, 11.0, 11.2, 11.2, 11.3, 11.3, 11.6, 11.7, 11.7, 12.0, 12.0, 12.1, 12.1, 12.1, 12.1, 12.2, 12.3, 12.4, 12.4, 12.5, 12.7, 12.8, 12.9, 13.0, 13.0, 13.0, 13.1, 13.2, 13.2, 13.3, 13.3, 13.3, 13.4, 13.5, 13.5, 13.6, 13.8, 14.0, 14.3, 14.4, 14.5, 14.7, 14.9, 15.3, 15.6, 17.6

Next, construct a stem-and-leaf plot using the leaves to represent the digits to the right of the decimal points.
Stems   Leaves
  5.    7                               Alaska has the lowest percent.
  6.
  7.
  8.    5
  9.    6 7 9
 10.    6
 11.    0 2 2 3 3 6 7 7
 12.    0 0 1 1 1 1 2 3 4 4 5 7 8 9
 13.    0 0 0 1 2 2 3 3 3 4 5 5 6 8
 14.    0 3 4 5 7 9
 15.    3 6
 16.
 17.    6                               Florida has the highest percent.

Histograms and Frequency Distributions

With data such as those given in Example 2, it is useful to group the data into intervals and plot the frequency of the data in each interval. For instance, the frequency distribution and histogram shown in Figure C.1 represent the data given in Example 2.

Frequency Distribution

Interval    Tally
[5, 7)        1
[7, 9)        1
[9, 11)       4
[11, 13)     22
[13, 15)     20
[15, 17)      2
[17, 19)      1

Figure C.1 Frequency distribution and histogram (horizontal axis: percent of population 65 or older; vertical axis: number of states)

A histogram has a portion of a real number line as its horizontal axis. A bar graph is similar to a histogram, except that the bars can be either horizontal or vertical and the labels of the bars are not necessarily numbers. Another difference between a bar graph and a histogram is that the bars in a bar graph are usually separated by spaces, whereas the bars in a histogram are not separated by spaces.
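The tally behind a frequency distribution can be sketched in code. The following Python sketch is illustrative only: the function frequency_distribution and its parameters are hypothetical, and it assumes that each interval includes its left endpoint and excludes its right endpoint, with the seven intervals of width 2 starting at 5 used in Figure C.1.

```python
# Ordered percents from Example 2 (50 states and the District of Columbia).
percents = [
    5.7, 8.5, 9.6, 9.7, 9.9, 10.6, 11.0, 11.2, 11.2, 11.3, 11.3, 11.6,
    11.7, 11.7, 12.0, 12.0, 12.1, 12.1, 12.1, 12.1, 12.2, 12.3, 12.4,
    12.4, 12.5, 12.7, 12.8, 12.9, 13.0, 13.0, 13.0, 13.1, 13.2, 13.2,
    13.3, 13.3, 13.3, 13.4, 13.5, 13.5, 13.6, 13.8, 14.0, 14.3, 14.4,
    14.5, 14.7, 14.9, 15.3, 15.6, 17.6,
]


def frequency_distribution(data, start, width, bins):
    """Count how many values fall in each interval [start + k*width, start + (k+1)*width)."""
    counts = [0] * bins
    for x in data:
        index = int((x - start) // width)
        if 0 <= index < bins:  # values outside the covered range are ignored
            counts[index] += 1
    return counts


counts = frequency_distribution(percents, start=5, width=2, bins=7)
for k, count in enumerate(counts):
    low = 5 + 2 * k
    print(f"[{low}, {low + 2}): {count}")
```

The heights of the bars in a histogram are exactly these counts, one bar per interval.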
Example 3 Constructing a Bar Graph

The data below show the monthly normal precipitation (in inches) in Houston, Texas. Construct a bar graph for these data. What can you conclude? (Source: PC USA)

January    3.7     July       3.2
February   3.0     August     3.8
March      3.4     September  4.3
April      3.6     October    4.5
May        5.2     November   4.2
June       5.4     December   3.7

To create a bar graph, begin by drawing a vertical axis to represent the precipitation and a horizontal axis to represent the month. The bar graph is shown in Figure C.2. From the graph, you can see that Houston receives a fairly consistent amount of rain throughout the year. The driest month tends to be February and the wettest month tends to be June.

Figure C.2 Bar graph of monthly normal precipitation (in inches) by month

Line Graphs

A line graph is similar to a standard coordinate graph. Line graphs are usually used to show trends over periods of time.

Example 4 Constructing a Line Graph

The following data show the number of immigrants (in thousands) to the United States for the years 1983 through 2002. Construct a line graph of the data. What can you conclude? (Source: Bureau of Citizenship and Immigration Services)

Year   Number     Year   Number
1983     560      1993     904
1984     544      1994     804
1985     570      1995     720
1986     602      1996     916
1987     602      1997     798
1988     643      1998     654
1989    1091      1999     647
1990    1536      2000     850
1991    1827      2001    1064
1992     974      2002    1064

Begin by drawing a vertical axis to represent the number of immigrants in thousands. Then label the horizontal axis with years and plot the points shown in the list. Finally, connect the points with line segments, as shown in Figure C.3. From the line graph, you can see that the number of immigrants increased gradually (apart from a small dip in 1984) through 1988, rose sharply from 1989 through 1991, and then dropped suddenly in 1992.
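The trends read from a line graph can be checked against the year-to-year changes in the data. The following Python sketch is illustrative only; the variable names are hypothetical, and the figures are the ones listed in Example 4.

```python
# Number of immigrants (in thousands) from Example 4, keyed by year.
immigrants = {
    1983: 560, 1984: 544, 1985: 570, 1986: 602, 1987: 602,
    1988: 643, 1989: 1091, 1990: 1536, 1991: 1827, 1992: 974,
    1993: 904, 1994: 804, 1995: 720, 1996: 916, 1997: 798,
    1998: 654, 1999: 647, 2000: 850, 2001: 1064, 2002: 1064,
}

years = sorted(immigrants)

# Change from the previous year; this is the rise or fall of each line segment.
changes = {year: immigrants[year] - immigrants[year - 1] for year in years[1:]}

largest_rise = max(changes, key=changes.get)
largest_drop = min(changes, key=changes.get)

print("Largest rise:", largest_rise, changes[largest_rise])
print("Largest drop:", largest_drop, changes[largest_drop])
```

The steepest upward and downward segments of the line graph correspond to the largest positive and negative changes.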
Figure C.3 Line graph of the number of immigrants (in thousands) by year, 1983 through 2002

Choosing an Appropriate Graph

Line graphs and bar graphs are commonly used for displaying data. When you are using a graph to organize and present data, you must first decide which type of graph to use. Here are some guidelines to help you decide.

1. Use a bar graph when the data fall into distinct categories and you want to compare totals.
2. Use a line graph when you want to show the relationship between consecutive amounts or data over time.

Example 5 Organizing Data with a Graph

Listed below are professional golfers who have won the most money on the PGA Tour through 2002. Organize the data graphically. (Source: PGA Tour)

Golfer            Money Won (in millions)
Davis Love III    $20.1
Phil Mickelson    $22.1
Nick Price        $16.7
Vijay Singh       $18.3
Tiger Woods       $33.1

You can use a bar graph because the data fall into distinct categories, and it would be useful to compare totals. The bar graph shown in Figure C.4 is horizontal. This makes it easier to label each bar. Also notice that, in the graph, the golfers are listed in order of most money won.
Figure C.4 Horizontal bar graph of money won (in millions of dollars) by golfer, listed from top to bottom: Tiger Woods, Phil Mickelson, Davis Love III, Vijay Singh, Nick Price

Scatter Plots

Many real-life situations involve finding relationships between two variables, such as the year and the number of people in the labor force. In a typical situation, data are collected and written as a set of ordered pairs. The graph of such a set is called a scatter plot.

From the scatter plot in Figure C.5 that relates the year $t$ with the number of people in the labor force $P$, it appears that the points describe a relationship that is nearly linear. The relationship is not exactly linear because the labor force did not increase by precisely the same amount each year.

Figure C.5 Scatter plot of the number of people in the labor force $P$ (in millions) versus the year $t$ ($t = 4$ corresponds to 1994)

A mathematical equation that approximates the relationship between $t$ and $P$ is called a mathematical model. When developing a mathematical model, you strive for two (often conflicting) goals: accuracy and simplicity. For the data in Figure C.5, a linear model of the form $P = at + b$ appears to be best. It is simple and relatively accurate.

Consider a collection of ordered pairs of the form $(x, y)$. If $y$ tends to increase as $x$ increases, the collection is said to have a positive correlation. If $y$ tends to decrease as $x$ increases, the collection is said to have a negative correlation. Figure C.6 shows three examples: one with a positive correlation, one with a negative correlation, and one with no (discernible) correlation.

Figure C.6 Three scatter plots: positive correlation, negative correlation, and no correlation
Example 6 Interpreting Correlation

On a Friday, 22 students in a class were asked to record the number of hours they spent studying for a test on Monday and the number of hours they spent watching television. The results are shown below. Construct a scatter plot for each set of data. Then determine whether the points are positively correlated, are negatively correlated, or have no discernible correlation. What can you conclude? (The first coordinate is the number of hours and the second coordinate is the score obtained on Monday's test.)

Study Hours: (0, 40), (1, 41), (2, 51), (3, 58), (3, 49), (4, 48), (4, 64), (5, 55), (5, 69), (5, 58), (5, 75), (6, 68), (6, 63), (6, 93), (7, 84), (7, 67), (8, 90), (8, 76), (9, 95), (9, 72), (9, 85), (10, 98)

TV Hours: (0, 98), (1, 85), (2, 72), (2, 90), (3, 67), (3, 93), (3, 95), (4, 68), (4, 84), (5, 76), (7, 75), (7, 58), (9, 63), (9, 69), (11, 55), (12, 58), (14, 64), (16, 48), (17, 51), (18, 41), (19, 49), (20, 40)

Scatter plots for the two sets of data are shown in Figure C.7. The scatter plot relating study hours and test scores has a positive correlation. This means that the more a student studied, the higher his or her score tended to be. The scatter plot relating television hours and test scores has a negative correlation. This means that the more time a student spent watching television, the lower his or her score tended to be.

Figure C.7 Two scatter plots: test scores $y$ versus study hours $x$, and test scores $y$ versus TV hours $x$

Fitting a Line to Data

Finding a linear model that represents the relationship described by a scatter plot is called fitting a line to data. You can do this graphically by simply sketching the line that appears to fit the points, finding two points on the line, and then finding the equation of the line that passes through the two points.
Example 7 Fitting a Line to Data

Find a linear model that relates the year and the number of people $P$ (in millions) in the United States who were part of the labor force from 1994 through 2001. (Source: U.S. Bureau of Labor Statistics)

Year        1994  1995  1996  1997  1998  1999  2000  2001
People, P    131   132   134   136   138   139   141   142

Let $t$ represent the year, with $t = 4$ corresponding to 1994. After plotting the data in the table, draw the line that you think best represents the data, as shown in Figure C.8. Two points that lie on this line are $(4, 131)$ and $(10, 141)$. Using the point-slope form, you can find the equation of the line to be

$$P = \frac{5}{3}(t - 4) + 131.$$

Figure C.8 Scatter plot of people $P$ (in millions) versus year $t$ ($t = 4$ corresponds to 1994), with the line $P = \frac{5}{3}(t - 4) + 131$
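The point-slope step in Example 7 can be sketched in code. The following Python sketch is illustrative only: the helper line_through is hypothetical, and exact fractions are used so that the slope appears as 5/3 rather than as a rounded decimal.

```python
from fractions import Fraction


def line_through(point1, point2):
    """Return the slope and the point-slope model for the line through two points."""
    (t1, p1), (t2, p2) = point1, point2
    slope = Fraction(p2 - p1, t2 - t1)

    def model(t):
        # Point-slope form: P = slope * (t - t1) + p1
        return slope * (t - t1) + p1

    return slope, model


# Two points read from the sketched line in Example 7.
slope, model = line_through((4, 131), (10, 141))
print("slope =", slope)

# Compare the model with the actual data (t = 4 corresponds to 1994).
actual = [131, 132, 134, 136, 138, 139, 141, 142]
for t, p in zip(range(4, 12), actual):
    print(t, p, round(float(model(t)), 1))
```

Because the line was drawn through (4, 131) and (10, 141), the model reproduces those two data values exactly and only approximates the others.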
Once you have found a model, you can measure how well the model fits the data by comparing the actual values with the values given by the model, as shown in the following table.

t           4    5      6      7    8      9      10   11
Actual P    131  132    134    136  138    139    141  142
Model P     131  132.7  134.3  136  137.7  139.3  141  142.7

The sum of the squares of the differences between the actual values and the model's values is called the sum of the squared differences. The model that has the least sum is called the least squares regression line for the data. For the model in Example 7, using the rounded model values in the table, the sum of the squared differences is 1.25. The least squares regression line for the data is

$$P = 1.7t + 124. \qquad \text{Best-fitting linear model}$$

Its sum of squared differences is 1.08. Many graphing calculators have built-in least squares regression programs. If your graphing calculator has such a program, enter the data in the table and use it to find the least squares regression line.

Measures of Central Tendency

In many real-life situations, it is helpful to describe data by a single number that is most representative of the entire collection of numbers. Such a number is called a measure of central tendency. The most commonly used measures are as follows.

1. The mean, or average, of n numbers is the sum of the numbers divided by n.
2. The median of n numbers is the middle number when the numbers are written in numerical order. If n is even, the median is the average of the two middle numbers.
3. The mode of n numbers is the number that occurs most frequently. If two numbers tie for most frequent occurrence, the collection has two modes and is called bimodal.

Example 8 Comparing Measures of Central Tendency

On an interview for a job, the interviewer tells you that the average annual income of the company's 25 employees is $60,849. The actual annual incomes of the 25 employees are shown below. What are the mean, median, and mode of the incomes? Was the person telling you the truth?
$17,305, $478,320, $45,678, $18,980, $17,408, $25,676, $28,906, $12,500, $24,540, $33,450, $12,500, $33,855, $37,450, $20,432, $28,956, $34,983, $36,540, $250,921, $36,853, $16,430, $32,654, $98,213, $48,980, $94,024, $35,671
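The three measures for these incomes can be computed directly. The following Python sketch is illustrative only; it relies on the standard library's statistics module, and the variable names are hypothetical.

```python
import statistics

# Annual incomes (in dollars) of the 25 employees in Example 8.
incomes = [
    17305, 478320, 45678, 18980, 17408, 25676, 28906, 12500, 24540,
    33450, 12500, 33855, 37450, 20432, 28956, 34983, 36540, 250921,
    36853, 16430, 32654, 98213, 48980, 94024, 35671,
]

mean = statistics.mean(incomes)        # sum of the incomes divided by 25
median = statistics.median(incomes)    # middle (13th) value of the ordered list
modes = statistics.multimode(incomes)  # every value that occurs most often

print("mean:", mean)
print("median:", median)
print("mode(s):", modes)
```

The function multimode returns a list, so a bimodal collection would produce two entries instead of one.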
The mean of the incomes is

$$\text{Mean} = \frac{17{,}305 + 478{,}320 + 45{,}678 + 18{,}980 + \cdots + 35{,}671}{25} = \frac{1{,}521{,}225}{25} = \$60{,}849.$$

To find the median, order the incomes as follows.

$12,500, $12,500, $16,430, $17,305, $17,408, $18,980, $20,432, $24,540, $25,676, $28,906, $28,956, $32,654, $33,450, $33,855, $34,983, $35,671, $36,540, $36,853, $37,450, $45,678, $48,980, $94,024, $98,213, $250,921, $478,320

From this list, you can see that the median (the middle number) is $33,450. From the same list, you can see that $12,500 is the only income that occurs more than once. So, the mode is $12,500. Technically, the person was telling the truth because the average is (generally) defined to be the mean. However, of the three measures of central tendency

Mean: $60,849, Median: $33,450, and Mode: $12,500

it seems clear that the median is most representative. The mean is inflated by the two highest salaries.

Which of the three measures of central tendency is the most representative? The answer is that it depends on the distribution of the data and the way in which you plan to use the data. For instance, in Example 8, the mean salary of $60,849 does not seem very representative to a potential employee. To a city income tax collector who wants to estimate 1% of the total income of the 25 employees, however, the mean is precisely the right measure.

Example 9 Choosing a Measure of Central Tendency

Which measure of central tendency is the most representative of the data shown in each of the following frequency distributions?

a. Number      1   2   3   4   5   6   7   8   9
   Frequency   7  20  15  11   8   3   2   0  15

b. Number      1   2   3   4   5   6   7   8   9
   Frequency   9   8   7   6   5   6   7   8   9

c. Number      1   2   3   4   5   6   7   8   9
   Frequency   6   1   2   3   5   5   4   3   0
a. For these data, the mean is 4.23, the median is 3, and the mode is 2. Of these, the median or mode is probably the most representative.
b. For these data, the mean and median are each 5 and the modes are 1 and 9 (the distribution is bimodal). Of these, the mean or median is the most representative.
c. For these data, the mean is 4.59, the median is 5, and the mode is 1. Of these, the mean or median is the most representative.

Appendix C Exercises

1. Exam Scores Construct a stem-and-leaf plot for the following exam scores for a class of 30 students. The scores are for a 100-point exam. See Examples 1 and 2.

77, 100, 77, 70, 83, 89, 87, 85, 81, 84, 81, 78, 89, 78, 88, 85, 90, 92, 75, 81, 85, 100, 98, 81, 78, 75, 85, 89, 82, 75

2. Insurance Coverage The following table shows the total number of persons (in thousands) without health insurance coverage in the 50 states and the District of Columbia in 2000. Use a stem-and-leaf plot to organize the data. (Source: U.S. Census Bureau)

AK  125   AL  600   AR  364   AZ  793   CA 6281   CO  563
CT  263   DC   73   DE   82   FL 2620   GA 1135   HI  117
IA  248   ID  196   IL 1659   IN  701   KS  301   KY  513
LA  810   MA  595   MD  501   ME  145   MI  982   MN  430
MO  586   MS  364   MT  162   NC  980   ND   69   NE  164
NH   85   NJ 1049   NM  427   NV  311   NY 2802   OH 1255
OK  636   OR  465   PA  905   RI   55   SC  448   SD   82
TN  577   TX 4425   UT  296   VA  886   VT   67   WA  780
WI  386   WV  254   WY   70

Exam Scores In Exercises 3 and 4, use the following set of data, which lists students' scores on a 100-point exam.

93, 84, 100, 92, 66, 89, 78, 52, 71, 85, 83, 95, 98, 99, 93, 81, 80, 79, 67, 59, 90, 55, 77, 62, 90, 78, 66, 63, 93, 87, 74, 96, 72, 100, 70, 73

3. Use a stem-and-leaf plot to organize the data.
4. Draw a histogram to represent the data.
5. Complete the following frequency distribution table and draw a histogram to represent the data.
44, 33, 17, 23, 16, 18, 44, 47, 18, 20, 25, 27, 18, 29, 29, 28, 27, 18, 36, 22, 32, 38, 33, 41, 49, 48, 45, 38, 49, 15

Interval    Tally
[15, 22)
[22, 29)
[29, 36)
[36, 43)
[43, 50)

6. Meteorology The data below show the seasonal snowfall (in inches) in Lincoln, Nebraska for the years 1973 through 2002 (the amounts are listed in order by year). How would you organize this data? Explain your reasoning. (Source: University of Nebraska-Lincoln)

33.6, 42.1, 21.1, 21.8, 31.0, 34.4, 23.3, 13.0, 32.3, 38.0, 47.5, 21.5, 18.9, 15.7, 13.0, 19.1, 18.7, 25.8, 23.8, 32.1, 21.3, 21.8, 30.7, 29.0, 44.6, 24.4, 11.9, 37.9, 29.5, 27.1

7. Travel The data below give the places of origin and the numbers of travelers (in millions) to the United States in 2000. Construct a bar graph for this data. (Source: U.S. Department of Commerce) See Example 3.

Canada    14.6
Europe    11.6
Mexico    10.3
Far East   7.6
Other      6.8
8. Agriculture The data below show the cash receipts (in millions of dollars) from fruit crops of farmers in 2001. Construct a bar graph for this data. (Source: U.S. Department of Agriculture)

Apples     1370     Peaches            477
Cherries    326     Pears              267
Grapes     2924     Plums and Prunes   219
Lemons      273     Strawberries      1086
Oranges    1369

Environment In Exercises 9–14, use the line graph, which shows municipal solid waste (in millions of tons) generated, recovered, and disposed from 1990 through 2001. (Source: Franklin Associates, Ltd.)

[Line graph: waste (in millions of tons) versus year (0 corresponds to 1990), with curves for total waste, landfilled, recycled, and incinerated]

9. Estimate the total waste in 1990 and 1999.
10. Estimate the amount of incinerated waste in 1998.
11. Which quantities increased every year?
12. During which time period(s) did the amount of landfilled waste decrease?
13. What is the relationship among the four quantities in the line graph?
14. Why do you think recycled waste is increasing?

15. College Enrollment The table shows the enrollment in a liberal arts college. Construct a line graph for the data. See Example 4.

Year         1996  1997  1998  1999
Enrollment   1675  1704  1710  1768

Year         2000  2001  2002  2003
Enrollment   1833  1918  1967  1972

16. Oil Imports The table shows the crude oil imports into the United States in millions of barrels for the years 1997 through 2004. Construct a line graph for the data and state what information it reveals. (Source: U.S. Energy Information Administration)

Year          1997  1998  1999  2000  2001
Oil imports   3002  3178  3187  3260  3405

17. Stock Market The list below shows stock prices for selected companies in July of 2003. Draw a graph that best represents the data. Explain why you chose that type of graph. (Source: Value Line) See Example 5.

Company              Stock Price
Avon Products        $63
Cheesecake Factory   $34
Tiffany & Co.        $34
Tupperware Corp.     $15
Yankee Candle Co.    $24

18. Revenue The table shows the revenue (in billions of dollars) for AT&T Wireless Services for the years 1997 through 2002. Draw a graph that best represents the data. Explain why you chose that type of graph. (Source: Value Line)

Year      1997  1998  1999  2000  2001  2002
Revenue    4.7   5.4   7.6  10.4  13.6  15.6

19. Entertainment The factory sales (in millions of dollars) of projection televisions for the years 1995 through 2000 are shown in the table. Draw a graph that best represents the data. Explain why you chose that type of graph. (Source: Consumer Electronics Association)

Year    1995  1996  1997  1998  1999  2000
Sales   1417  1361  1577  1632  1481  1057

20. Owning Cats The average numbers (out of 100) of cat owners who state various reasons for owning a cat are listed below. Draw a graph that best represents the data. Explain why you chose that type of graph.

Reason for Owning a Cat               Number
Have a pet to play with               93
Companionship                         84
Help children learn responsibility    78
Have a pet to communicate with        62
Security                              51
Interpreting a Scatter Plot In Exercises 21–24, use the scatter plot shown. The scatter plot compares the number of hits $x$ made by 30 softball players during the first half of the season with the number of runs batted in $y$.

[Scatter plot: hits $x$ (horizontal axis) versus runs batted in $y$ (vertical axis)]

21. Do $x$ and $y$ have a positive correlation, a negative correlation, or no correlation?
22. Why does the scatter plot show only 28 points?
23. From the scatter plot, does it appear that players with more hits tend to have more runs batted in?
24. Can a player have more runs batted in than hits? Explain.

In Exercises 25–28, decide whether a scatter plot relating the two quantities would tend to have a positive correlation, a negative correlation, or no correlation. Explain.

25. The age and value of a car
26. A student's study time and test scores
27. The height and age of a pine tree
28. A student's height and test scores

Pressure In Exercises 29–32, use the data in the table, which show the relationship between the altitude $A$ (in thousands of feet) and the air pressure $P$ (in pounds per square inch).

Altitude, A    0     5     10    15   20   25
Pressure, P    14.7  12.2  10.1  8.3  6.8  5.5

Altitude, A    30   35   40   45   50
Pressure, P    4.4  3.5  2.7  2.1  1.7

29. Sketch a scatter plot of the data. See Example 6.
30. How are $A$ and $P$ related?
31. Estimate the air pressure at 42,500 feet.
32. Estimate the altitude at which the air pressure is 5.0 pounds per square inch.

Agriculture In Exercises 33–36, use the data in the table, where $x$ is the number of units of fertilizer applied to sample plots and $y$ is the yield (in bushels) of a crop.

x   0   1   2   3   4   5   6   7   8
y   58  60  59  61  63  66  65  67  70

33. Sketch a scatter plot of the data.
34. Determine whether the points are positively correlated, are negatively correlated, or have no discernible correlation.
35. Sketch a linear model that you think best represents the data. Find an equation of the line you sketched. Use the line to predict the yield when 10 units of fertilizer are used. See Example 7.
36. Can the model found in Exercise 35 be used to predict yields for arbitrarily large values of $x$? Explain.

Speed of Sound In Exercises 37–40, use the data in the table, where $h$ is altitude in thousands of feet and $v$ is the speed of sound in feet per second.

h   0     5     10    15    20    25    30   35
v   1116  1097  1077  1057  1036  1016  994  972

37. Sketch a scatter plot of the data.
38. Determine whether the points are positively correlated, are negatively correlated, or have no discernible correlation.
39. Sketch a linear model that you think best represents the data. Find an equation of the line you sketched. Use the line to predict the speed of sound at an altitude of 27,000 feet.
40. The speed of sound at an altitude of 70,000 feet is approximately 968 feet per second. What does this suggest about the validity of using the model in Exercise 39 to extrapolate beyond the data given in the table?

In Exercises 41–44, use a graphing calculator to find the least squares regression line for the data. Sketch a scatter plot and the regression line.

41. (0, 23), (1, 20), (2, 19), (3, 17), (4, 15), (5, 11), (6, 10)
42. (4, 52.8), (5, 54.7), (6, 55.7), (7, 57.8), (8, 60.2), (9, 63.1), (10, 66.5)
43. (−10, 5.1), (−5, 9.8), (0, 17.5), (2, 25.4), (4, 32.8), (6, 38.7), (8, 44.2), (10, 50.5)
44. (−10, 213.5), (−5, 174.9), (0, 141.7), (5, 119.7), (8, 102.4), (10, 87.6)

45. Sales The table shows the sales $y$ (in billions of dollars) from full-service restaurants for the years 1997 through 2002, where $t = 7$ corresponds to 1997. (Source: National Restaurant Association)

t   7      8      9      10     11     12
y   110.3  117.8  125.4  134.5  140.4  146.7

(a) Use a graphing calculator to find the least squares regression line. Use the equation to estimate sales in 2003.
(b) Make a scatter plot of the data and sketch the graph of the regression line.

46. Advertising The management of a department store ran an experiment to determine if a relationship existed between sales $S$ (in thousands of dollars) and the amount spent on advertising $x$ (in thousands of dollars). The following data were collected.

x   1    2    3    4    5    6    7    8
S   405  423  455  466  492  510  525  559

(a) Use a graphing calculator to find the least squares regression line. Use the equation to estimate sales when $4500 is spent on advertising.
(b) Make a scatter plot of the data and sketch the graph of the regression line.

In Exercises 47–52, find the mean, median, and mode of the data set. See Example 8.

47. 5, 12, 7, 14, 8, 9, 7
48. 30, 37, 32, 39, 33, 34, 32
49. 5, 12, 7, 24, 8, 9, 7
50. 20, 37, 32, 39, 33, 34, 32

51. Electric Bills A person had the following monthly bills for electricity. What are the mean, median, and mode of the collection of bills?

Jan.   $67.92     Feb.   $59.84     Mar.   $52.00
Apr.   $52.50     May    $57.99     June   $65.35
July   $81.76     Aug.   $74.98     Sept.  $87.82
Oct.   $83.18     Nov.   $65.35     Dec.   $57.00

52. Car Rental A car rental company kept the following record of the numbers of miles driven by a car that was rented. What are the mean, median, and mode of these data?

Monday     410     Tuesday    260
Wednesday  320     Thursday   320
Friday     460     Saturday   150

53. Six-Child Families A study was done on families having six children. The table shows the number of families in the study with the indicated number of girls. Determine the mean, median, and mode of this set of data.

Number of girls   0   1   2   3   4   5   6
Frequency         1   24  45  54  50  19  7

54. Sports A baseball fan examined the records of a favorite baseball player's performance during his last 50 games. The numbers of games in which the player had 0, 1, 2, 3, and 4 hits are recorded in the table.

Number of hits   0   1   2   3   4
Frequency        14  26  7   2   1

(a) Determine the average number of hits per game.
(b) The player had 200 at bats during the 50-game series. Determine his batting average.

55. Think About It Construct a collection of numbers that has the following properties. If this is not possible, explain why it is not.

Mean = 6, median = 4, mode = 4

56. Think About It Construct a collection of numbers that has the following properties. If this is not possible, explain why it is not.

Mean = 6, median = 6, mode = 4

57. Test Scores A professor records the following scores for a 100-point exam.

99, 64, 80, 77, 59, 72, 87, 79, 92, 88, 90, 42, 20, 89, 42, 100, 98, 84, 78, 91

Which measure of central tendency best describes these test scores? See Example 9.

58. Shoe Sales A salesperson sold eight pairs of men's dress shoes. The sizes of the eight pairs were as follows: 10½, 8, 12, 10½, 10, 9½, 11, and 10½. Which measure (or measures) of central tendency best describes the typical shoe size for these data?