Further Concepts in Statistics

Appendix C Further Concepts in Statistics

Stem-and-Leaf Plots
Histograms and Frequency Distributions
Line Graphs
Choosing an Appropriate Graph
Scatter Plots
Fitting a Line to Data
Measures of Central Tendency

Stem-and-Leaf Plots

Statistics is the branch of mathematics that studies techniques for collecting, organizing, and interpreting data. In this section, you will study several ways to organize and interpret data.

One type of plot that can be used to organize sets of numbers by hand is a stem-and-leaf plot. A set of test scores and the corresponding stem-and-leaf plot are shown below.

Test Scores
93, 70, 76, 58, 86, 93, 82, 78, 83, 86, 64, 78, 76, 66, 83, 83, 96, 74, 69, 76, 64, 74, 79, 76, 88, 76, 81, 82, 74, 70

Stems  Leaves
5      8
6      4 4 6 9
7      0 0 4 4 4 6 6 6 6 6 8 8 9
8      1 2 2 3 3 3 6 6 8
9      3 3 6

Note that the leaves represent the units digits of the numbers and the stems represent the tens digits. Stem-and-leaf plots can also be used to compare two sets of data, as shown in the following example.

Example 1 Comparing Two Sets of Data

Use a stem-and-leaf plot to compare the test scores given above with the following test scores. Which set of test scores is better?

90, 81, 70, 62, 64, 73, 81, 92, 73, 81, 92, 93, 83, 75, 76, 83, 94, 96, 86, 77, 77, 86, 96, 86, 77, 86, 87, 87, 79, 88

Begin by ordering the second set of scores.

62, 64, 70, 73, 73, 75, 76, 77, 77, 77, 79, 81, 81, 81, 83, 83, 86, 86, 86, 86, 87, 87, 88, 90, 92, 92, 93, 94, 96, 96

Now that the data have been ordered, you can construct a double stem-and-leaf plot by letting the leaves to the right of the stems represent the units digits for the first group of test scores and letting the leaves to the left of the stems represent the units digits for the second group of test scores.

     Leaves (2nd Group)  Stems  Leaves (1st Group)
                           5    8
                    4 2    6    4 4 6 9
      9 7 7 7 6 5 3 3 0    7    0 0 4 4 4 6 6 6 6 6 8 8 9
8 7 7 6 6 6 6 3 3 1 1 1    8    1 2 2 3 3 3 6 6 8
          6 6 4 3 2 2 0    9    3 3 6

By comparing the two sets of leaves, you can see that the second group of test scores is better than the first group.

Example 2 Using a Stem-and-Leaf Plot

The table shows the percent of the population of each state and the District of Columbia that was at least 65 years old in 2000. Use a stem-and-leaf plot to organize the data. (Source: U.S. Census Bureau)

AK 5.7   AL 13.0  AR 14.0  AZ 13.0  CA 10.6  CO 9.7
CT 13.8  DC 12.2  DE 13.0  FL 17.6  GA 9.6   HI 13.3
IA 14.9  ID 11.3  IL 12.1  IN 12.4  KS 13.3  KY 12.5
LA 11.6  MA 13.5  MD 11.3  ME 14.4  MI 12.3  MN 12.1
MO 13.5  MS 12.1  MT 13.4  NC 12.0  ND 14.7  NE 13.6
NH 12.0  NJ 13.2  NM 11.7  NV 11.0  NY 12.9  OH 13.3
OK 13.2  OR 12.8  PA 15.6  RI 14.5  SC 12.1  SD 14.3
TN 12.4  TX 9.9   UT 8.5   VA 11.2  VT 12.7  WA 11.2
WI 13.1  WV 15.3  WY 11.7

Begin by ordering the numbers, as shown below.

5.7, 8.5, 9.6, 9.7, 9.9, 10.6, 11.0, 11.2, 11.2, 11.3, 11.3, 11.6, 11.7, 11.7, 12.0, 12.0, 12.1, 12.1, 12.1, 12.1, 12.2, 12.3, 12.4, 12.4, 12.5, 12.7, 12.8, 12.9, 13.0, 13.0, 13.0, 13.1, 13.2, 13.2, 13.3, 13.3, 13.3, 13.4, 13.5, 13.5, 13.6, 13.8, 14.0, 14.3, 14.4, 14.5, 14.7, 14.9, 15.3, 15.6, 17.6

Next construct a stem-and-leaf plot using the leaves to represent the digits to the right of the decimal points.
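The construction just described (split each number into a stem and a final-digit leaf, then list the leaves in order beside their stem) can also be sketched in Python. This is only an illustrative sketch: the function name `stem_and_leaf` and its `leaf_unit` parameter are assumptions made for this example, and the sample uses just a few of the percents from the table.

```python
from collections import defaultdict


def stem_and_leaf(values, leaf_unit=1.0):
    """Map each stem to its sorted leaves.

    leaf_unit (a hypothetical parameter) is the place value of the leaf
    digit: 1 for whole-number scores, 0.1 for one-decimal percents.
    """
    groups = defaultdict(list)
    for value in values:
        scaled = round(value / leaf_unit)  # e.g. 13.8 -> 138 when leaf_unit is 0.1
        stem, leaf = divmod(scaled, 10)    # 138 -> stem 13, leaf 8
        groups[stem].append(leaf)
    # Keep empty stems so that gaps in the data stay visible.
    return {stem: sorted(groups.get(stem, []))
            for stem in range(min(groups), max(groups) + 1)}


# A few of the percents from the table in Example 2.
sample = [5.7, 13.0, 14.0, 13.0, 10.6, 9.7, 13.8, 12.2]
plot = stem_and_leaf(sample, leaf_unit=0.1)
for stem, leaves in plot.items():
    print(f"{stem:>2}. {' '.join(str(leaf) for leaf in leaves)}")
```

Empty stems are kept on purpose, since a gap between stems is part of what the plot is meant to show.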

Stems  Leaves
 5.    7                              Alaska has the lowest percent.
 6.
 7.
 8.    5
 9.    6 7 9
10.    6
11.    0 2 2 3 3 6 7 7
12.    0 0 1 1 1 1 2 3 4 4 5 7 8 9
13.    0 0 0 1 2 2 3 3 3 4 5 5 6 8
14.    0 3 4 5 7 9
15.    3 6
16.
17.    6                              Florida has the highest percent.

Histograms and Frequency Distributions

With data such as those given in Example 2, it is useful to group the data into intervals and plot the frequency of the data in each interval. For instance, the frequency distribution and histogram shown in Figure C.1 represent the data given in Example 2.

[Figure C.1: frequency distribution (a tally for each of the intervals [5, 7), [7, 9), [9, 11), [11, 13), [13, 15), [15, 17), [17, 19)) and the corresponding histogram. Horizontal axis: percent of population 65 or older. Vertical axis: number of states.]

A histogram has a portion of a real number line as its horizontal axis. A bar graph is similar to a histogram, except that the bars can be either horizontal or vertical and the labels of the bars are not necessarily numbers. Another difference between a bar graph and a histogram is that the bars in a bar graph are usually separated by spaces, whereas the bars in a histogram are not separated by spaces.
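As a rough illustration of how a frequency distribution like the one in Figure C.1 is tallied, the sketch below counts the Example 2 percents in intervals of width 2 starting at 5. The function name `frequency_distribution` is hypothetical, and the intervals are assumed to be half-open (each includes its left endpoint but not its right), so a value such as 13.0 is counted in the interval from 13 to 15.

```python
def frequency_distribution(values, start, width, bins):
    """Count how many values fall in each interval [start + i*width, start + (i+1)*width)."""
    counts = [0] * bins
    for value in values:
        index = int((value - start) // width)
        if 0 <= index < bins:  # values outside all intervals are ignored
            counts[index] += 1
    return counts


# Ordered percents from Example 2 (50 states and the District of Columbia).
percents = [
    5.7, 8.5, 9.6, 9.7, 9.9, 10.6, 11.0, 11.2, 11.2, 11.3, 11.3, 11.6, 11.7,
    11.7, 12.0, 12.0, 12.1, 12.1, 12.1, 12.1, 12.2, 12.3, 12.4, 12.4, 12.5,
    12.7, 12.8, 12.9, 13.0, 13.0, 13.0, 13.1, 13.2, 13.2, 13.3, 13.3, 13.3,
    13.4, 13.5, 13.5, 13.6, 13.8, 14.0, 14.3, 14.4, 14.5, 14.7, 14.9, 15.3,
    15.6, 17.6,
]

counts = frequency_distribution(percents, start=5, width=2, bins=7)
for i, count in enumerate(counts):
    low = 5 + 2 * i
    print(f"[{low:>2}, {low + 2:>2})  {'|' * count}  {count}")
```

Under these assumptions the tallest bars are the two intervals covering 11 to 15, which together hold 42 of the 51 values.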

Example 3 Constructing a Bar Graph

The data below show the monthly normal precipitation (in inches) in Houston, Texas. Construct a bar graph for these data. What can you conclude? (Source: PC USA)

January    3.7    February   3.0    March      3.4
April      3.6    May        5.2    June       5.4
July       3.2    August     3.8    September  4.3
October    4.5    November   4.2    December   3.7

To create a bar graph, begin by drawing a vertical axis to represent the precipitation and a horizontal axis to represent the month. The bar graph is shown in Figure C.2. From the graph, you can see that Houston receives a fairly consistent amount of rain throughout the year; the driest month tends to be February and the wettest month tends to be June.

[Figure C.2: bar graph of monthly normal precipitation (in inches) by month.]

Line Graphs

A line graph is similar to a standard coordinate graph. Line graphs are usually used to show trends over periods of time.

Example 4 Constructing a Line Graph

The following data show the number of immigrants (in thousands) to the United States for the years 1983 through 2002. Construct a line graph of the data. What can you conclude? (Source: Bureau of Citizenship and Immigration Services)

Year  Number    Year  Number
1983   560      1993   904
1984   544      1994   804
1985   570      1995   720
1986   602      1996   916
1987   602      1997   798
1988   643      1998   654
1989  1091      1999   647
1990  1536      2000   850
1991  1827      2001  1064
1992   974      2002  1064

Begin by drawing a vertical axis to represent the number of immigrants in thousands. Then label the horizontal axis with years and plot the points shown in the list. Finally, connect the points with line segments, as shown in Figure C.3. From the line graph, you can see that the number of immigrants steadily increased until 1989, when there was a sharp increase followed by a sudden decrease in 1992.

[Figure C.3: line graph of the number of immigrants (in thousands) by year, 1983 through 2002.]

Choosing an Appropriate Graph

Line graphs and bar graphs are commonly used for displaying data. When you are using a graph to organize and present data, you must first decide which type of graph to use. Here are some guidelines to help you decide.

1. Use a bar graph when the data fall into distinct categories and you want to compare totals.
2. Use a line graph when you want to show the relationship between consecutive amounts or data over time.

Example 5 Organizing Data with a Graph

Listed below are professional golfers who have won the most money on the PGA Tour through 2002. Organize the data graphically. (Source: PGA Tour)

Golfer            Money Won (in millions)
Davis Love III    $20.1
Phil Mickelson    $22.1
Nick Price        $16.7
Vijay Singh       $18.3
Tiger Woods       $33.1

You can use a bar graph because the data fall into distinct categories, and it would be useful to compare totals. The bar graph shown in Figure C.4 is horizontal. This makes it easier to label each bar. Also notice that in the graph the golfers are listed in order of most money won.

[Figure C.4: horizontal bar graph of money won (in millions of dollars) by golfer, listed from top to bottom as Tiger Woods, Phil Mickelson, Davis Love III, Vijay Singh, Nick Price.]

Scatter Plots

Many real-life situations involve finding relationships between two variables, such as the year and the number of people in the labor force. In a typical situation, data are collected and written as a set of ordered pairs. The graph of such a set is called a scatter plot.

[Figure C.5: scatter plot of the number of people in the labor force P (in millions) against the year t, where t = 4 corresponds to 1994.]

From the scatter plot in Figure C.5 that relates the year t with the number of people in the labor force P, it appears that the points describe a relationship that is nearly linear. The relationship is not exactly linear because the labor force did not increase by precisely the same amount each year.

A mathematical equation that approximates the relationship between t and P is called a mathematical model. When developing a mathematical model, you strive for two (often conflicting) goals: accuracy and simplicity. For the data in Figure C.5, a linear model of the form \(P = at + b\) appears to be best. It is simple and relatively accurate.

Consider a collection of ordered pairs of the form (x, y). If y tends to increase as x increases, the collection is said to have a positive correlation. If y tends to decrease as x increases, the collection is said to have a negative correlation. Figure C.6 shows three examples: one with a positive correlation, one with a negative correlation, and one with no (discernible) correlation.

[Figure C.6: three scatter plots showing positive correlation, negative correlation, and no correlation.]

Example 6 Interpreting Correlation

On a Friday, 22 students in a class were asked to record the number of hours they spent studying for a test on Monday and the number of hours they spent watching television. The results are shown below. Construct a scatter plot for each set of data. Then determine whether the points are positively correlated, are negatively correlated, or have no discernible correlation. What can you conclude? (The first coordinate is the number of hours and the second coordinate is the score obtained on Monday's test.)

Study Hours: (0, 40), (1, 41), (2, 51), (3, 58), (3, 49), (4, 48), (4, 64), (5, 55), (5, 69), (5, 58), (5, 75), (6, 68), (6, 63), (6, 93), (7, 84), (7, 67), (8, 90), (8, 76), (9, 95), (9, 72), (9, 85), (10, 98)

TV Hours: (0, 98), (1, 85), (2, 72), (2, 90), (3, 67), (3, 93), (3, 95), (4, 68), (4, 84), (5, 76), (7, 75), (7, 58), (9, 63), (9, 69), (11, 55), (12, 58), (14, 64), (16, 48), (17, 51), (18, 41), (19, 49), (20, 40)

Scatter plots for the two sets of data are shown in Figure C.7. The scatter plot relating study hours and test scores has a positive correlation. This means that the more a student studied, the higher his or her score tended to be. The scatter plot relating television hours and test scores has a negative correlation. This means that the more time a student spent watching television, the lower his or her score tended to be.

[Figure C.7: two scatter plots of test scores y, one against study hours x and one against TV hours x.]

Fitting a Line to Data

Finding a linear model that represents the relationship described by a scatter plot is called fitting a line to data. You can do this graphically by simply sketching the line that appears to fit the points, finding two points on the line, and then finding the equation of the line that passes through the two points.
Example 7 Fitting a Line to Data

Find a linear model that relates the year and the number of people P (in millions) in the United States who were part of the labor force from 1994 through 2001. (Source: U.S. Bureau of Labor Statistics)

Year       1994  1995  1996  1997  1998  1999  2000  2001
People, P  131   132   134   136   138   139   141   142

Let t represent the year, with t = 4 corresponding to 1994. After plotting the data in the table, draw the line that you think best represents the data, as shown in Figure C.8. Two points that lie on this line are (4, 131) and (10, 141). Using the point-slope form, you can find the equation of the line to be

\[ P = \frac{5}{3}(t - 4) + 131. \]

[Figure C.8: scatter plot of people P (in millions) against year t, where t = 4 corresponds to 1994, with the line \(P = \frac{5}{3}(t - 4) + 131\).]
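The arithmetic in Example 7 can be checked with a short Python sketch. It computes the line through the two chosen points, then measures the fit by adding up the squares of the differences between each actual value and the value the line gives. For comparison it also computes the line that makes that sum as small as possible, using the standard closed-form slope and intercept formulas. Those formulas are not derived in this appendix, and the function names are hypothetical, so treat this as an illustration under those assumptions.

```python
def line_through(p1, p2):
    """Slope and intercept of the line through two points (point-slope form)."""
    (x1, y1), (x2, y2) = p1, p2
    slope = (y2 - y1) / (x2 - x1)
    return slope, y1 - slope * x1


def least_squares(xs, ys):
    """Slope and intercept that minimize the sum of squared differences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def sum_squared_differences(xs, ys, slope, intercept):
    """Sum of (actual - model)^2 over all data points."""
    return sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))


# Labor force data from Example 7 (t = 4 corresponds to 1994).
t = [4, 5, 6, 7, 8, 9, 10, 11]
P = [131, 132, 134, 136, 138, 139, 141, 142]

sketched = line_through((4, 131), (10, 141))
fitted = least_squares(t, P)
for name, (m, b) in [("two-point", sketched), ("least squares", fitted)]:
    total = sum_squared_differences(t, P, m, b)
    print(f"{name}: P = {m:.3f}t + {b:.3f}, sum of squared differences = {total:.3f}")
```

Computed without rounding, the two sums come out near 1.22 and 0.87. Rounding the model values or the coefficients first, as a hand computation typically does, gives slightly larger figures.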

Once you have found a model, you can measure how well the model fits the data by comparing the actual values with the values given by the model, as shown in the following table.

t         4    5      6      7    8      9      10   11
Actual P  131  132    134    136  138    139    141  142
Model P   131  132.7  134.3  136  137.7  139.3  141  142.7

The sum of the squares of the differences between the actual values and the model's values is the sum of the squared differences. The model that has the least sum is called the least squares regression line for the data. For the model in Example 7, the sum of the squared differences is 1.25. The least squares regression line for the data is

\[ P = 1.7t + 124 \qquad \text{(best-fitting linear model)} \]

Its sum of squared differences is 1.08. Many graphing calculators have built-in least squares regression programs. If your graphing calculator has such a program, enter the data in the table and use it to find the least squares regression line.

Measures of Central Tendency

In many real-life situations, it is helpful to describe data by a single number that is most representative of the entire collection of numbers. Such a number is called a measure of central tendency. The most commonly used measures are as follows.

1. The mean, or average, of n numbers is the sum of the numbers divided by n.
2. The median of n numbers is the middle number when the numbers are written in numerical order. If n is even, the median is the average of the two middle numbers.
3. The mode of n numbers is the number that occurs most frequently. If two numbers tie for most frequent occurrence, the collection has two modes and is called bimodal.

Example 8 Comparing Measures of Central Tendency

On an interview for a job, the interviewer tells you that the average annual income of the company's 25 employees is $60,849. The actual annual incomes of the 25 employees are shown below. What are the mean, median, and mode of the incomes? Was the person telling you the truth?
$17,305, $478,320, $45,678, $18,980, $17,408, $25,676, $28,906, $12,500, $24,540, $33,450, $12,500, $33,855, $37,450, $20,432, $28,956, $34,983, $36,540, $250,921, $36,853, $16,430, $32,654, $98,213, $48,980, $94,024, $35,671
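One quick way to check the three measures for the incomes listed above is Python's standard `statistics` module. The sketch below is only a cross-check of the hand computation; the variable names are chosen for illustration, and `multimode` (available in Python 3.8 and later) is used so that every most-frequent value is reported.

```python
from statistics import mean, median, multimode

# The 25 annual incomes from Example 8, in dollars.
incomes = [
    17305, 478320, 45678, 18980, 17408, 25676, 28906, 12500, 24540,
    33450, 12500, 33855, 37450, 20432, 28956, 34983, 36540, 250921,
    36853, 16430, 32654, 98213, 48980, 94024, 35671,
]

average = mean(incomes)      # sum of the incomes divided by 25
middle = median(incomes)     # 13th value once the incomes are sorted
modes = multimode(incomes)   # list of the most frequently occurring values

print(f"mean = {average}, median = {middle}, modes = {modes}")
```

The two largest incomes account for nearly half of the total, which is why the mean lands far above the middle of the sorted list.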

The mean of the incomes is

\[ \text{Mean} = \frac{17{,}305 + 478{,}320 + 45{,}678 + 18{,}980 + \cdots + 35{,}671}{25} = \frac{1{,}521{,}225}{25} = \$60{,}849. \]

To find the median, order the incomes as follows.

$12,500, $12,500, $16,430, $17,305, $17,408, $18,980, $20,432, $24,540, $25,676, $28,906, $28,956, $32,654, $33,450, $33,855, $34,983, $35,671, $36,540, $36,853, $37,450, $45,678, $48,980, $94,024, $98,213, $250,921, $478,320

From this list, you can see that the median (the middle number) is $33,450. From the same list, you can see that $12,500 is the only income that occurs more than once. So, the mode is $12,500. Technically, the person was telling the truth because the average is (generally) defined to be the mean. However, of the three measures of central tendency (mean: $60,849, median: $33,450, and mode: $12,500), it seems clear that the median is most representative. The mean is inflated by the two highest salaries.

Which of the three measures of central tendency is the most representative? The answer is that it depends on the distribution of the data and the way in which you plan to use the data. For instance, in Example 8, the mean salary of $60,849 does not seem very representative to a potential employee. To a city income tax collector who wants to estimate 1% of the total income of the 25 employees, however, the mean is precisely the right measure.

Example 9 Choosing a Measure of Central Tendency

Which measure of central tendency is the most representative of the data shown in each of the following frequency distributions?

a. Number     1   2   3   4   5   6   7   8   9
   Frequency  7  20  15  11   8   3   2   0  15

b. Number     1   2   3   4   5   6   7   8   9
   Frequency  9   8   7   6   5   6   7   8   9

c. Number     1   2   3   4   5   6   7   8   9
   Frequency  6   1   2   3   5   5   4   3   0

a. For these data, the mean is 4.23, the median is 3, and the mode is 2. Of these, the median or mode is probably the most representative.
b. For these data, the mean and median are each 5 and the modes are 1 and 9 (the distribution is bimodal). Of these, the mean or median is the most representative.
c. For these data, the mean is 4.59, the median is 5, and the mode is 1. Of these, the mean or median is the most representative.

Appendix C Exercises

1. Exam Scores: Construct a stem-and-leaf plot for the following exam scores for a class of 30 students. The scores are for a 100-point exam. See Examples 1 and 2.

77, 100, 77, 70, 83, 89, 87, 85, 81, 84, 81, 78, 89, 78, 88, 85, 90, 92, 75, 81, 85, 100, 98, 81, 78, 75, 85, 89, 82, 75

2. Insurance Coverage: The following table shows the total number of persons (in thousands) without health insurance coverage in the 50 states and the District of Columbia in 2000. Use a stem-and-leaf plot to organize the data. (Source: U.S. Census Bureau)

AK 125   AL 600   AR 364   AZ 793   CA 6281  CO 563
CT 263   DC 73    DE 82    FL 2620  GA 1135  HI 117
IA 248   ID 196   IL 1659  IN 701   KS 301   KY 513
LA 810   MA 595   MD 501   ME 145   MI 982   MN 430
MO 586   MS 364   MT 162   NC 980   ND 69    NE 164
NH 85    NJ 1049  NM 427   NV 311   NY 2802  OH 1255
OK 636   OR 465   PA 905   RI 55    SC 448   SD 82
TN 577   TX 4425  UT 296   VA 886   VT 67    WA 780
WI 386   WV 254   WY 70

Exam Scores: In Exercises 3 and 4, use the following set of data, which lists students' scores on a 100-point exam.

93, 84, 100, 92, 66, 89, 78, 52, 71, 85, 83, 95, 98, 99, 93, 81, 80, 79, 67, 59, 90, 55, 77, 62, 90, 78, 66, 63, 93, 87, 74, 96, 72, 100, 70, 73

3. Use a stem-and-leaf plot to organize the data.

4. Draw a histogram to represent the data.

5. Complete the following frequency distribution table and draw a histogram to represent the data.
44, 33, 17, 23, 16, 18, 44, 47, 18, 20, 25, 27, 18, 29, 29, 28, 27, 18, 36, 22, 32, 38, 33, 41, 49, 48, 45, 38, 49, 15

Interval   Tally
[15, 22)
[22, 29)
[29, 36)
[36, 43)
[43, 50)

6. Meteorology: The data below show the seasonal snowfall (in inches) in Lincoln, Nebraska for the years 1973 through 2002 (the amounts are listed in order by year). How would you organize these data? Explain your reasoning. (Source: University of Nebraska-Lincoln)

33.6, 42.1, 21.1, 21.8, 31.0, 34.4, 23.3, 13.0, 32.3, 38.0, 47.5, 21.5, 18.9, 15.7, 13.0, 19.1, 18.7, 25.8, 23.8, 32.1, 21.3, 21.8, 30.7, 29.0, 44.6, 24.4, 11.9, 37.9, 29.5, 27.1

7. Travel: The data below give the places of origin and the numbers of travelers (in millions) to the United States in 2000. Construct a bar graph for these data. (Source: U.S. Department of Commerce) See Example 3.

Canada    14.6
Europe    11.6
Mexico    10.3
Far East   7.6
Other      6.8

8. Agriculture: The data below show the cash receipts (in millions of dollars) from fruit crops of farmers in 2001. Construct a bar graph for these data. (Source: U.S. Department of Agriculture)

Apples            1370
Peaches            477
Cherries           326
Pears              267
Grapes            2924
Plums and Prunes   219
Lemons             273
Strawberries      1086
Oranges           1369

Environment: In Exercises 9-14, use the line graph, which shows municipal solid waste (in millions of tons) generated, recovered, and disposed of from 1990 through 2001. (Source: Franklin Associates, Ltd.)

[Line graph: waste (in millions of tons) against year, where 0 corresponds to 1990, with curves for total waste, landfilled, recycled, and incinerated waste.]

9. Estimate the total waste in 1990 and 1999.

10. Estimate the amount of incinerated waste in 1998.

11. Which quantities increased every year?

12. During which time period(s) did the amount of landfilled waste decrease?

13. What is the relationship among the four quantities in the line graph?

14. Why do you think recycled waste is increasing?

15. College Enrollment: The table shows the enrollment in a liberal arts college. Construct a line graph for the data. See Example 4.

Year        1996  1997  1998  1999  2000  2001  2002  2003
Enrollment  1675  1704  1710  1768  1833  1918  1967  1972

16. Oil Imports: The table shows the crude oil imports into the United States in millions of barrels for the years 1997 through 2004. Construct a line graph for the data and state what information it reveals. (Source: U.S. Energy Information Administration)

Year         1997  1998  1999  2000  2001
Oil imports  3002  3178  3187  3260  3405

17. Stock Market: The list below shows stock prices for selected companies in July of 2003. Draw a graph that best represents the data. Explain why you chose that type of graph. (Source: Value Line) See Example 5.

Company              Stock Price
Avon Products        $63
Cheesecake Factory   $34
Tiffany & Co.        $34
Tupperware Corp.     $15
Yankee Candle Co.    $24

18.
Revenue: The table shows the revenue (in billions of dollars) for AT&T Wireless Services for the years 1997 through 2002. Draw a graph that best represents the data. Explain why you chose that type of graph. (Source: Value Line)

Year     1997  1998  1999  2000  2001  2002
Revenue  4.7   5.4   7.6   10.4  13.6  15.6

19. Entertainment: The factory sales (in millions of dollars) of projection televisions for the years 1995 through 2000 are shown in the table. Draw a graph that best represents the data. Explain why you chose that type of graph. (Source: Consumer Electronics Association)

Year   1995  1996  1997  1998  1999  2000
Sales  1417  1361  1577  1632  1481  1057

20. Owning Cats: The average numbers (out of 100) of cat owners who state various reasons for owning a cat are listed below. Draw a graph that best represents the data. Explain why you chose that type of graph.

Reason for Owning a Cat              Number
Have a pet to play with              93
Companionship                        84
Help children learn responsibility   78
Have a pet to communicate with       62
Security                             51

Interpreting a Scatter Plot: In Exercises 21-24, use the scatter plot shown. The scatter plot compares the number of hits x made by 30 softball players during the first half of the season with the number of runs batted in y.

[Scatter plot: hits x (horizontal axis, 1 through 10) against runs batted in y (vertical axis, 1 through 10).]

21. Do x and y have a positive correlation, a negative correlation, or no correlation?

22. Why does the scatter plot show only 28 points?

23. From the scatter plot, does it appear that players with more hits tend to have more runs batted in?

24. Can a player have more runs batted in than hits? Explain.

In Exercises 25-28, decide whether a scatter plot relating the two quantities would tend to have a positive correlation, a negative correlation, or no correlation. Explain.

25. The age and value of a car

26. A student's study time and test scores

27. The height and age of a pine tree

28. A student's height and test scores

Pressure: In Exercises 29-32, use the data in the table, which show the relationship between the altitude A (in thousands of feet) and the air pressure P (in pounds per square inch).

Altitude, A  0     5     10    15   20   25   30   35   40   45   50
Pressure, P  14.7  12.2  10.1  8.3  6.8  5.5  4.4  3.5  2.7  2.1  1.7

29. Sketch a scatter plot of the data. See Example 6.

30. How are A and P related?

31. Estimate the air pressure at 42,500 feet.

32. Estimate the altitude at which the air pressure is 5.0 pounds per square inch.

Agriculture: In Exercises 33-36, use the data in the table, where x is the number of units of fertilizer applied to sample plots and y is the yield (in bushels) of a crop.

x  0   1   2   3   4   5   6   7   8
y  58  60  59  61  63  66  65  67  70

33. Sketch a scatter plot of the data.

34. Determine whether the points are positively correlated, are negatively correlated, or have no discernible correlation.

35. Sketch a linear model that you think best represents the data. Find an equation of the line you sketched.
Use the line to predict the yield when 10 units of fertilizer are used. See Example 7.

36. Can the model found in Exercise 35 be used to predict yields for arbitrarily large values of x? Explain.

Speed of Sound: In Exercises 37-40, use the data in the table, where h is altitude in thousands of feet and v is the speed of sound in feet per second.

h  0     5     10    15    20    25    30   35
v  1116  1097  1077  1057  1036  1016  994  972

37. Sketch a scatter plot of the data.

38. Determine whether the points are positively correlated, are negatively correlated, or have no discernible correlation.

39. Sketch a linear model that you think best represents the data. Find an equation of the line you sketched. Use the line to predict the speed of sound at an altitude of 27,000 feet.

40. The speed of sound at an altitude of 70,000 feet is approximately 968 feet per second. What does this suggest about the validity of using the model in Exercise 39 to extrapolate beyond the data given in the table?

In Exercises 41-44, use a graphing calculator to find the least squares regression line for the data. Sketch a scatter plot and the regression line.

41. (0, 23), (1, 20), (2, 19), (3, 17), (4, 15), (5, 11), (6, 10)

42. (4, 52.8), (5, 54.7), (6, 55.7), (7, 57.8), (8, 60.2), (9, 63.1), (10, 66.5)

43. (-10, 5.1), (-5, 9.8), (0, 17.5), (2, 25.4), (4, 32.8), (6, 38.7), (8, 44.2), (10, 50.5)

44. (-10, 213.5), (-5, 174.9), (0, 141.7), (5, 119.7), (8, 102.4), (10, 87.6)

45. Sales: The table shows the sales y (in billions of dollars) from full-service restaurants for the years 1997 through 2002, where t = 7 corresponds to 1997. (Source: National Restaurant Association)

t  7      8      9      10     11     12
y  110.3  117.8  125.4  134.5  140.4  146.7

(a) Use a graphing calculator to find the least squares regression line. Use the equation to estimate sales in 2003.
(b) Make a scatter plot of the data and sketch the graph of the regression line.

46. Advertising: The management of a department store ran an experiment to determine if a relationship existed between sales S (in thousands of dollars) and the amount spent on advertising x (in thousands of dollars). The following data were collected.

x  1    2    3    4    5    6    7    8
S  405  423  455  466  492  510  525  559

(a) Use a graphing calculator to find the least squares regression line. Use the equation to estimate sales when $4500 is spent on advertising.
(b) Make a scatter plot of the data and sketch the graph of the regression line.

In Exercises 47-52, find the mean, median, and mode of the data set. See Example 8.

47. 5, 12, 7, 14, 8, 9, 7

48. 30, 37, 32, 39, 33, 34, 32

49. 5, 12, 7, 24, 8, 9, 7

50. 20, 37, 32, 39, 33, 34, 32

51. Electric Bills: A person had the following monthly bills for electricity. What are the mean, median, and mode of the collection of bills?

Jan.  $67.92   Feb.  $59.84   Mar.   $52.00
Apr.  $52.50   May   $57.99   June   $65.35
July  $81.76   Aug.  $74.98   Sept.  $87.82
Oct.  $83.18   Nov.  $65.35   Dec.   $57.00

52. Car Rental: A car rental company kept the following record of the numbers of miles driven by a car that was rented. What are the mean, median, and mode of these data?

Monday     410
Tuesday    260
Wednesday  320
Thursday   320
Friday     460
Saturday   150

53.
Six-Child Families: A study was done on families having six children. The table shows the number of families in the study with the indicated number of girls. Determine the mean, median, and mode of this set of data.

Number of girls  0  1   2   3   4   5   6
Frequency        1  24  45  54  50  19  7

54. Sports: A baseball fan examined the records of a favorite baseball player's performance during his last 50 games. The numbers of games in which the player had 0, 1, 2, 3, and 4 hits are recorded in the table.

Number of hits  0   1   2  3  4
Frequency       14  26  7  2  1

(a) Determine the average number of hits per game.
(b) The player had 200 at bats during the 50-game series. Determine his batting average.

55. Think About It: Construct a collection of numbers that has the following properties. If this is not possible, explain why it is not.

Mean = 6, median = 4, mode = 4

56. Think About It: Construct a collection of numbers that has the following properties. If this is not possible, explain why it is not.

Mean = 6, median = 6, mode = 4

57. Test Scores: A professor records the following scores for a 100-point exam.

99, 64, 80, 77, 59, 72, 87, 79, 92, 88, 90, 42, 20, 89, 42, 100, 98, 84, 78, 91

Which measure of central tendency best describes these test scores? See Example 9.

58. Shoe Sales: A salesperson sold eight pairs of men's dress shoes. The sizes of the eight pairs were as follows: 10 1/2, 8, 12, 10 1/2, 10, 9 1/2, 11, and 10 1/2. Which measure (or measures) of central tendency best describes the typical shoe size for these data?