Further Concepts in Statistics


Appendix D Further Concepts in Statistics

Stem-and-Leaf Plots
Histograms and Frequency Distributions
Line Graphs
Choosing an Appropriate Graph
Scatter Plots
Fitting a Line to Data
Measures of Central Tendency

Stem-and-Leaf Plots

Statistics is the branch of mathematics that studies techniques for collecting, organizing, and interpreting data. In this section, you will study several ways to organize and interpret data.

One type of plot that can be used to organize sets of numbers by hand is a stem-and-leaf plot. A set of test scores and the corresponding stem-and-leaf plot are shown below.

Test Scores
93, 70, 76, 58, 86, 93, 82, 78, 83, 86, 64, 78, 76, 66, 83, 83, 96, 74, 69, 76, 64, 74, 79, 76, 88, 76, 81, 82, 74, 70

Stems   Leaves
5       8
6       4 4 6 9
7       0 0 4 4 4 6 6 6 6 6 8 8 9
8       1 2 2 3 3 3 6 6 8
9       3 3 6

Note that the leaves represent the units digits of the numbers and the stems represent the tens digits. Stem-and-leaf plots can also be used to compare two sets of data, as shown in the following example.

Example 1 Comparing Two Sets of Data

Use a stem-and-leaf plot to compare the test scores given above with the following test scores. Which set of test scores is better?

90, 81, 70, 62, 64, 73, 81, 92, 73, 81, 92, 93, 83, 75, 76, 83, 94, 96, 86, 77, 77, 86, 96, 86, 77, 86, 87, 87, 79, 88

Begin by ordering the second set of scores.

62, 64, 70, 73, 73, 75, 76, 77, 77, 77, 79, 81, 81, 81, 83, 83, 86, 86, 86, 86, 87, 87, 88, 90, 92, 92, 93, 94, 96, 96

Now that the data have been ordered, you can construct a double stem-and-leaf plot by letting the leaves to the right of the stems represent the units digits for the first group of test scores and letting the leaves to the left of the stems represent the units digits for the second group of test scores.
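The grouping into stems and leaves described above can also be sketched in Python. This is only an illustration, not part of the text: the function name stem_and_leaf and the printed layout are assumptions, and the sketch assumes whole-number scores, so the stem is the tens digit and the leaf is the units digit.

```python
from collections import defaultdict

def stem_and_leaf(scores):
    """Map each stem (tens digit) to its sorted list of leaves (units digits)."""
    plot = defaultdict(list)
    for score in sorted(scores):
        stem, leaf = divmod(score, 10)
        plot[stem].append(leaf)
    return dict(plot)

# First and second groups of test scores from Example 1.
first_group = [93, 70, 76, 58, 86, 93, 82, 78, 83, 86, 64, 78, 76, 66, 83,
               83, 96, 74, 69, 76, 64, 74, 79, 76, 88, 76, 81, 82, 74, 70]
second_group = [90, 81, 70, 62, 64, 73, 81, 92, 73, 81, 92, 93, 83, 75, 76,
                83, 94, 96, 86, 77, 77, 86, 96, 86, 77, 86, 87, 87, 79, 88]

first_plot = stem_and_leaf(first_group)
second_plot = stem_and_leaf(second_group)

# Double plot: second group's leaves on the left (largest leaf first),
# first group's leaves on the right.
for stem in sorted(set(first_plot) | set(second_plot)):
    left = " ".join(str(d) for d in reversed(second_plot.get(stem, [])))
    right = " ".join(str(d) for d in first_plot.get(stem, []))
    print(f"{left:>23}  {stem}  {right}")
```

Sorting the scores first means each list of leaves comes out already ordered, which mirrors the "begin by ordering" step in the example.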

     Leaves (2nd Group)  Stems  Leaves (1st Group)
                           5    8
                    4 2    6    4 4 6 9
      9 7 7 7 6 5 3 3 0    7    0 0 4 4 4 6 6 6 6 6 8 8 9
8 7 7 6 6 6 6 3 3 1 1 1    8    1 2 2 3 3 3 6 6 8
          6 6 4 3 2 2 0    9    3 3 6

By comparing the two sets of leaves, you can see that the second group of test scores is better than the first group.

Example 2 Using a Stem-and-Leaf Plot

The table shows the percent of the population of each state and the District of Columbia that was at least 65 years old in 2006. Use a stem-and-leaf plot to organize the data. (Source: U.S. Census Bureau)

AK  6.8   AL 13.4   AR 13.9   AZ 12.8   CA 10.8   CO 10.0
CT 13.4   DC 12.3   DE 13.4   FL 16.8   GA  9.8   HI 14.0
IA 14.6   ID 11.5   IL 12.0   IN 12.4   KS 12.9   KY 12.8
LA 12.2   MA 13.3   MD 11.6   ME 14.6   MI 12.5   MN 12.1
MO 13.3   MS 12.4   MT 13.8   NC 12.2   ND 14.6   NE 13.3
NH 12.4   NJ 12.9   NM 12.4   NV 11.1   NY 13.1   OH 13.4
OK 13.2   OR 12.9   PA 15.2   RI 13.9   SC 12.8   SD 14.2
TN 12.7   TX  9.9   UT  8.8   VA 11.6   VT 13.3   WA 11.5
WI 13.0   WV 15.3   WY 12.2

Begin by ordering the numbers, as shown below.

6.8, 8.8, 9.8, 9.9, 10.0, 10.8, 11.1, 11.5, 11.5, 11.6, 11.6, 12.0, 12.1, 12.2, 12.2, 12.2, 12.3, 12.4, 12.4, 12.4, 12.4, 12.5, 12.7, 12.8, 12.8, 12.8, 12.9, 12.9, 12.9, 13.0, 13.1, 13.2, 13.3, 13.3, 13.3, 13.3, 13.4, 13.4, 13.4, 13.4, 13.8, 13.9, 13.9, 14.0, 14.2, 14.6, 14.6, 14.6, 15.2, 15.3, 16.8

Next construct a stem-and-leaf plot using the leaves to represent the digits to the right of the decimal points.

Stems   Leaves
 6.     8    (Alaska has the lowest percent.)
 7.
 8.     8
 9.     8 9
10.     0 8
11.     1 5 5 6 6
12.     0 1 2 2 2 3 4 4 4 4 5 7 8 8 8 9 9 9
13.     0 1 2 3 3 3 3 4 4 4 4 8 9 9
14.     0 2 6 6 6
15.     2 3
16.     8    (Florida has the highest percent.)

Histograms and Frequency Distributions

With data such as those given in Example 2, it is useful to group the data into intervals and plot the frequency of the data in each interval. For instance, the frequency distribution and histogram shown in Figure D.1 represent the data given in Example 2.

Frequency Distribution

Interval    Tally (count)
[6, 8)       1
[8, 10)      3
[10, 12)     7
[12, 14)    32
[14, 16)     7
[16, 18)     1

[Figure D.1: Frequency distribution and histogram for the data in Example 2. Horizontal axis: percent of population 65 or older (6 to 18). Vertical axis: number of states.]

A histogram has a portion of a real number line as its horizontal axis. A bar graph is similar to a histogram, except that the bars can be either horizontal or vertical and the labels of the bars are not necessarily numbers. Another difference between a bar graph and a histogram is that the bars in a bar graph are usually separated by spaces, whereas the bars in a histogram are not separated by spaces.
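As a rough illustration of the grouping behind a frequency distribution, the following Python sketch counts how many of the Example 2 percents fall in each interval of width 2 starting at 6. The function name frequency_distribution and its parameters are hypothetical, and each interval is assumed to include its left endpoint but not its right one.

```python
def frequency_distribution(data, start, width, count):
    """Count the values in each half-open interval [low, low + width)."""
    freqs = [0] * count
    for x in data:
        index = int((x - start) // width)
        if 0 <= index < count:
            freqs[index] += 1
    return [((start + i * width, start + (i + 1) * width), f)
            for i, f in enumerate(freqs)]

# Ordered percents from Example 2 (50 states and the District of Columbia).
percents = [6.8, 8.8, 9.8, 9.9, 10.0, 10.8, 11.1, 11.5, 11.5, 11.6, 11.6,
            12.0, 12.1, 12.2, 12.2, 12.2, 12.3, 12.4, 12.4, 12.4, 12.4,
            12.5, 12.7, 12.8, 12.8, 12.8, 12.9, 12.9, 12.9, 13.0, 13.1,
            13.2, 13.3, 13.3, 13.3, 13.3, 13.4, 13.4, 13.4, 13.4, 13.8,
            13.9, 13.9, 14.0, 14.2, 14.6, 14.6, 14.6, 15.2, 15.3, 16.8]

table = frequency_distribution(percents, start=6, width=2, count=6)
for (low, high), freq in table:
    print(f"[{low}, {high})  {freq}")
```

The heights of the histogram bars are exactly these counts, one bar per interval.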

Example 3 Constructing a Bar Graph

The data below show the monthly normal precipitation (in inches) in Houston, Texas. Construct a bar graph for these data. What can you conclude? (Source: National Oceanic and Atmospheric Administration)

January    3.7     July       3.2
February   3.0     August     3.8
March      3.4     September  4.3
April      3.6     October    4.5
May        5.2     November   4.2
June       5.4     December   3.7

To create a bar graph, begin by drawing a vertical axis to represent the precipitation and a horizontal axis to represent the month. The bar graph is shown in Figure D.2. From the graph, you can see that Houston receives a fairly consistent amount of rain throughout the year. The driest month tends to be February and the wettest month tends to be June.

[Figure D.2: Bar graph of monthly normal precipitation (in inches) by month, January through December.]

Line Graphs

A line graph is similar to a standard coordinate graph. Line graphs are usually used to show trends over periods of time.

Example 4 Constructing a Line Graph

The following data show the number of persons (in thousands) obtaining legal permanent resident status in the United States for the years 1987 through 2006. Construct a line graph of the data. What can you conclude? (Source: U.S. Department of Homeland Security)

Year   Number     Year   Number
1987     600      1997     798
1988     641      1998     653
1989    1090      1999     645
1990    1536      2000     841
1991    1827      2001    1059
1992     973      2002    1059
1993     904      2003     704
1994     804      2004     958
1995     720      2005    1122
1996     916      2006    1266

Begin by drawing a vertical axis to represent the number of immigrants in thousands. Then label the horizontal axis with years and plot the points shown in the list. Finally, connect the points with line segments, as shown in Figure D.3 on the next page. From the line graph, you can see that the number of immigrants rose and fell over the period, peaking at 1827 thousand in 1991.

[Figure D.3: Line graph of the number of persons (in thousands) obtaining legal permanent resident status by year, 1987 through 2006.]

Choosing an Appropriate Graph

Line graphs and bar graphs are commonly used for displaying data. When you are using a graph to organize and present data, you must first decide which type of graph to use. Here are some guidelines to help you decide.

1. Use a bar graph when the data fall into distinct categories and you want to compare totals.
2. Use a line graph when you want to show the relationship between consecutive amounts or data over time.

Example 5 Organizing Data with a Graph

Listed below are professional golfers who have won the most money on the PGA Tour through 2007. Organize the data graphically. (Source: PGA Tour)

Golfer           Money Won (in millions)
Jim Furyk        $35.7
Davis Love III   $35.6
Phil Mickelson   $45.5
Vijay Singh      $54.3
Tiger Woods      $77.5

You can use a bar graph because the data fall into distinct categories, and it would be useful to compare totals. The bar graph shown in Figure D.4 is horizontal. This makes it easier to label each bar. Also notice that the golfers are listed in order of most money won.

[Figure D.4: Horizontal bar graph of money won (in millions of dollars) by Tiger Woods, Vijay Singh, Phil Mickelson, Jim Furyk, and Davis Love III.]

Scatter Plots

Many real-life situations involve finding relationships between two variables, such as the year and the number of people in the labor force. In a typical situation, data are collected and written as a set of ordered pairs. The graph of such a set is called a scatter plot.

From the scatter plot in Figure D.5 that relates the year t with the number of people in the labor force P, it appears that the points describe a relationship that is nearly linear. The relationship is not exactly linear because the labor force did not increase by precisely the same amount each year.

[Figure D.5: Scatter plot of the number of people P (in millions) in the labor force against the year t, where t = 0 corresponds to 2000.]

A mathematical equation that approximates the relationship between t and P is called a mathematical model. When developing a mathematical model, you strive for two (often conflicting) goals: accuracy and simplicity. For the data in Figure D.5, a linear model of the form \(P = at + b\) appears to be best. It is simple and relatively accurate.

Consider a collection of ordered pairs of the form (x, y). If y tends to increase as x increases, the collection is said to have a positive correlation. If y tends to decrease as x increases, the collection is said to have a negative correlation. Figure D.6 shows three examples: one with a positive correlation, one with a negative correlation, and one with no (discernible) correlation.

[Figure D.6: Three scatter plots showing positive correlation, negative correlation, and no correlation.]

Example 6 Interpreting Correlation

On a Friday, 22 students in a class were asked to record the number of hours they spent studying for a test on Monday and the number of hours they spent watching television. The results are shown below. Construct a scatter plot for each set of data. Then determine whether the points are positively correlated, are negatively correlated, or have no discernible correlation. What can you conclude? (The first coordinate is the number of hours and the second coordinate is the score obtained on Monday's test.)

Study Hours: (0, 40), (1, 41), (2, 51), (3, 58), (3, 49), (4, 48), (4, 64), (5, 55), (5, 69), (5, 58), (5, 75), (6, 68), (6, 63), (6, 93), (7, 84), (7, 67), (8, 90), (8, 76), (9, 95), (9, 72), (9, 85), (10, 98)

TV Hours: (0, 98), (1, 85), (2, 72), (2, 90), (3, 67), (3, 93), (3, 95), (4, 68), (4, 84), (5, 76), (7, 75), (7, 58), (9, 63), (9, 69), (11, 55), (12, 58), (14, 64), (16, 48), (17, 51), (18, 41), (19, 49), (20, 40)

Scatter plots for the two sets of data are shown in Figure D.7. The scatter plot relating study hours and test scores has a positive correlation. This means that the more a student studied, the higher his or her score tended to be. The scatter plot relating television hours and test scores has a negative correlation. This means that the more time a student spent watching television, the lower his or her score tended to be.

[Figure D.7: Two scatter plots of test scores (vertical axis, 20 to 100), one against study hours and one against TV hours.]

Fitting a Line to Data

Finding a linear model that represents the relationship described by a scatter plot is called fitting a line to data. You can do this graphically by simply sketching the line that appears to fit the points, finding two points on the line, and then finding the equation of the line that passes through the two points.

Example 7 Fitting a Line to Data

Find a linear model that relates the year and the number of people P (in millions) in the United States who were part of the labor force from 2000 through 2006. (Source: U.S. Bureau of Labor Statistics)

Year        2000  2001  2002  2003  2004  2005  2006
People, P    143   144   145   147   147   149   151

Let t represent the year, with t = 0 corresponding to 2000. After plotting the data in the table, draw the line that you think best represents the data, as shown in Figure D.8. Two points that lie on this line are (2, 145) and (6, 151). Using the point-slope form, you can find the equation of the line to be

\[ P = \frac{3}{2}(t - 2) + 145. \]

[Figure D.8: Scatter plot of the labor force data with the line \(P = \frac{3}{2}(t - 2) + 145\). Horizontal axis: year t (t = 0 corresponds to 2000). Vertical axis: people P (in millions).]
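The arithmetic in Example 7 can be checked with a short Python sketch. It is an illustration under stated assumptions rather than the text's own method: the helper name line_through is hypothetical, and exact fractions are used so the slope 3/2 is not rounded. The sketch also adds up the squared gaps between the table values and the line as a rough check of how closely the line tracks the data.

```python
from fractions import Fraction

def line_through(p1, p2):
    """Return (slope, intercept) of the line through two points with different x-values."""
    (x1, y1), (x2, y2) = p1, p2
    slope = Fraction(y2 - y1, x2 - x1)
    intercept = y1 - slope * x1
    return slope, intercept

# The two points read off the sketched line in Example 7.
slope, intercept = line_through((2, 145), (6, 151))

actual = [143, 144, 145, 147, 147, 149, 151]      # P for t = 0, 1, ..., 6
model = [slope * t + intercept for t in range(7)]  # values on the line
sum_sq = sum((a - m) ** 2 for a, m in zip(actual, model))

print(f"slope = {slope}, intercept = {intercept}")
print([float(m) for m in model])
print(float(sum_sq))
```

Writing the result in slope-intercept form gives P = (3/2)t + 142, which is the same line as the point-slope form found in the example.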

Once you have found a model, you can measure how well the model fits the data by comparing the actual values with the values given by the model, as shown in the following table.

t            0      1      2      3      4      5      6
Actual P   143    144    145    147    147    149    151
Model P    142    143.5  145    146.5  148    149.5  151

The sum of the squares of the differences between the actual values and the model's values is the sum of the squared differences. The model that has the least sum is called the least squares regression line for the data. For the model in Example 7, the sum of the squared differences is 2.75. The least squares regression line for the data is

\[ P = 1.3t + 143. \qquad \text{Best-fitting linear model} \]

Its sum of squared differences is 2.19. Many graphing calculators have built-in least squares regression programs. If your graphing calculator has such a program, enter the data in the table and use it to find the least squares regression line.

Measures of Central Tendency

In many real-life situations, it is helpful to describe data by a single number that is most representative of the entire collection of numbers. Such a number is called a measure of central tendency. The most commonly used measures are as follows.

1. The mean, or average, of n numbers is the sum of the numbers divided by n.
2. The median of n numbers is the middle number when the numbers are written in numerical order. If n is even, the median is the average of the two middle numbers.
3. The mode of n numbers is the number that occurs most frequently. If two numbers tie for most frequent occurrence, the collection has two modes and is called bimodal.

Example 8 Comparing Measures of Central Tendency

In a job interview, the interviewer tells you that the average annual income of the company's 25 employees is $60,849. The actual annual incomes of the 25 employees are shown below. What are the mean, median, and mode of the incomes? Was the person telling you the truth?

$17,305, $478,320, $45,678, $18,980, $17,408, $25,676, $28,906, $12,500, $24,540, $33,450, $12,500, $33,855, $37,450, $20,432, $28,956, $34,983, $36,540, $250,921, $36,853, $16,430, $32,654, $98,213, $48,980, $94,024, $35,671
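The three definitions above can be applied to this list directly. The following is a minimal sketch using Python's standard statistics module (multimode requires Python 3.8 or later); the variable names are illustrative only.

```python
import statistics

# The 25 annual incomes from Example 8, in dollars.
incomes = [17305, 478320, 45678, 18980, 17408, 25676, 28906, 12500, 24540,
           33450, 12500, 33855, 37450, 20432, 28956, 34983, 36540, 250921,
           36853, 16430, 32654, 98213, 48980, 94024, 35671]

mean = statistics.mean(incomes)        # sum of the incomes divided by 25
median = statistics.median(incomes)    # middle (13th) value after sorting
modes = statistics.multimode(incomes)  # every value tied for most frequent

print(mean, median, modes)
```

Using multimode rather than mode returns a list, so a collection with two modes would show both values.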

The mean of the incomes is

\[ \text{Mean} = \frac{17{,}305 + 478{,}320 + 45{,}678 + 18{,}980 + \cdots + 35{,}671}{25} = \frac{1{,}521{,}225}{25} = \$60{,}849. \]

To find the median, order the incomes as follows.

$12,500, $12,500, $16,430, $17,305, $17,408, $18,980, $20,432, $24,540, $25,676, $28,906, $28,956, $32,654, $33,450, $33,855, $34,983, $35,671, $36,540, $36,853, $37,450, $45,678, $48,980, $94,024, $98,213, $250,921, $478,320

From this list, you can see that the median (the middle number) is $33,450. From the same list, you can see that $12,500 is the only income that occurs more than once. So, the mode is $12,500. Technically, the person was telling the truth because the average is (generally) defined to be the mean. However, of the three measures of central tendency

Mean: $60,849
Median: $33,450
Mode: $12,500

it seems clear that the median is most representative. The mean is inflated by the two highest salaries.

Which of the three measures of central tendency is the most representative? The answer is that it depends on the distribution of the data and the way in which you plan to use the data. For instance, in Example 8, the mean salary of $60,849 does not seem very representative to a potential employee. To a city income tax collector who wants to estimate 1% of the total income of the 25 employees, however, the mean is precisely the right measure.

Example 9 Choosing a Measure of Central Tendency

Which measure of central tendency is the most representative of the data shown in each of the following frequency distributions?

a. Number      1   2   3   4   5   6   7   8   9
   Frequency   7  20  15  11   8   3   2   0  15

b. Number      1   2   3   4   5   6   7   8   9
   Frequency   9   8   7   6   5   6   7   8   9

c. Number      1   2   3   4   5   6   7   8   9
   Frequency   6   1   2   3   5   5   4   3   0
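One way to work through the three frequency distributions above is to expand each table into its full list of values and then apply the definitions of mean, median, and mode. The sketch below does this in Python; the helper name expand and the labels a, b, and c for the three tables are assumptions made for illustration, and multimode requires Python 3.8 or later.

```python
import statistics

def expand(frequencies):
    """Turn a frequency table for the numbers 1, 2, 3, ... into the full list of values."""
    values = []
    for number, freq in enumerate(frequencies, start=1):
        values.extend([number] * freq)
    return values

# Frequencies of the numbers 1 through 9 in each distribution.
tables = {
    "a": [7, 20, 15, 11, 8, 3, 2, 0, 15],
    "b": [9, 8, 7, 6, 5, 6, 7, 8, 9],
    "c": [6, 1, 2, 3, 5, 5, 4, 3, 0],
}

summary = {}
for label, freqs in tables.items():
    values = expand(freqs)
    summary[label] = (
        round(statistics.mean(values), 2),  # mean, rounded to two decimals
        statistics.median(values),          # middle value of the expanded list
        statistics.multimode(values),       # all values tied for most frequent
    )
    print(label, summary[label])
```

The code only computes the three measures; deciding which one is most representative still depends on the shape of each distribution.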

a. For these data, the mean is 4.23, the median is 3, and the mode is 2. Of these, the median or mode is probably the most representative.
b. For these data, the mean and median are each 5 and the modes are 1 and 9 (the distribution is bimodal). Of these, the mean or median is the most representative.
c. For these data, the mean is 4.59, the median is 5, and the mode is 1. Of these, the mean or median is the most representative.

Appendix D Exercises

1. Exam Scores Construct a stem-and-leaf plot for the following exam scores for a class of 30 students. The scores are for a 100-point exam.
77, 100, 77, 70, 83, 89, 87, 85, 81, 84, 81, 78, 89, 78, 88, 85, 90, 92, 75, 81, 85, 100, 98, 81, 78, 75, 85, 89, 82, 75

2. Insurance Coverage The following table shows the total number of persons (in thousands) without health insurance coverage in the 50 states and the District of Columbia in 2005. Use a stem-and-leaf plot to organize the data. (Source: U.S. Census Bureau)

AK  113   AL  657   AR  482   AZ 1183   CA 6757   CO  772
CT  381   DC   71   DE  103   FL 3616   GA 1654   HI  110
IA  241   ID  213   IL 1730   IN  832   KS  278   KY  498
LA  725   MA  583   MD  746   ME  136   MI 1033   MN  408
MO  668   MS  483   MT  145   NC 1312   ND   69   NE  185
NH  126   NJ 1265   NM  393   NV  418   NY 2474   OH 1288
OK  627   OR  566   PA 1196   RI  122   SC  721   SD   90
TN  798   TX 5394   UT  414   VA  951   VT   72   WA  828
WI  509   WV  304   WY   75

Exam Scores In Exercises 3-5, use the following set of data, which lists students' scores on a 100-point exam.
93, 84, 100, 92, 66, 89, 78, 52, 71, 85, 83, 95, 98, 99, 93, 81, 80, 79, 67, 59, 90, 55, 77, 62, 90, 78, 66, 63, 93, 87, 74, 96, 72, 100, 70, 73

3. Use a stem-and-leaf plot to organize the data.

4. Complete the following frequency distribution table for the data.

Interval     Tally
[50, 60)
[60, 70)
[70, 80)
[80, 90)
[90, 100)
[100, 110)

5. Use the frequency distribution table in Exercise 4 to draw a histogram representing the data.

6. Meteorology The data below show the seasonal snowfall (in inches) in Lincoln, Nebraska for the years 1977 through 2006 (the amounts are listed in order by year). How would you organize this data? Explain your reasoning. (Source: National Oceanic and Atmospheric Administration)
21.8, 31.0, 34.4, 23.3, 13.0, 32.3, 38.0, 47.5, 21.5, 18.9, 15.7, 13.0, 19.1, 18.7, 25.8, 23.8, 32.1, 21.3, 21.7, 30.7, 29.0, 44.6, 24.4, 11.7, 37.9, 29.5, 31.7, 35.9, 16.3, 19.5

7. Travel The data below give the places of origin and the numbers of travelers (in millions) to the United States in 2006. Construct a bar graph for this data. (Source: U.S. Department of Commerce)

Canada    16.0
Europe    10.1
Mexico    13.3
Far East   6.9
Other     10.8

8. Agriculture The data below show the cash receipts (in millions of dollars) from fruit crops of farmers in 2005. Construct a bar graph for this data. (Source: U.S. Department of Agriculture)

Apples         1591
Peaches         510
Cherries        549
Strawberries   1383
Grapes         3461
Grapefruit      748
Oranges        1605

Environment In Exercises 9-14, use the line graph, which shows municipal solid waste (in millions of tons) generated, recovered, and disposed from 1996 through 2005. (Source: Franklin Associates)

[Line graph: waste (in millions of tons, 40 to 400) against year (6 corresponds to 1996), with curves for total waste, landfilled, recycled, and incinerated.]

9. Estimate the total waste in 1996 and 2004.
10. Estimate the amount of incinerated waste in 2003.
11. Which quantities increased every year?
12. During which time period(s) did the amount of landfilled waste decrease?
13. What is the relationship among the four quantities in the line graph?
14. Why do you think recycled waste is increasing?

15. College Enrollment The table shows the enrollment in a liberal arts college. Construct a line graph for the data.

Year         2000  2001  2002  2003  2004  2005  2006  2007
Enrollment   1675  1704  1710  1768  1833  1918  1967  1972

16. Oil Imports The table shows the crude oil imports into the United States (in millions of barrels) for the years 2001 through 2005. Construct a line graph for the data. (Source: U.S. Energy Information Administration)

Year          2001  2002  2003  2004  2005
Oil imports   3405  3336  3528  3692  3670

17. Stock Market The list below shows stock prices for selected companies in January of 2008. Draw a graph that best represents the data. Explain why you chose that type of graph. (Source: Value Line)

Company              Stock Price
Avon Products        $39
Barnes & Noble       $33
Cheesecake Factory   $19
Target Corp.         $49
Tiffany & Co.        $41

18. Net Sales The table shows the net sales (in billions of dollars) for Gap, Inc. for the years 2002 through 2006. Draw a graph that best represents the data. Explain why you chose that type of graph. (Source: Gap, Inc.)

Year        2002  2003  2004  2005  2006
Net Sales   14.5  15.9  16.3  16.0  15.9

19. Entertainment The factory sales (in millions of dollars) of LCD TVs for the years 2000 through 2005 are shown in the table. Draw a graph that best represents the data. Explain why you chose that type of graph. (Source: Consumer Electronics Association)

Year    2000  2001  2002  2003  2004  2005
Sales     64    62   246   664  1579  3295

20. Owning Cats The average numbers (out of 100) of cat owners who state various reasons for owning a cat are listed below. Draw a graph that best represents the data. Explain why you chose that type of graph.

Reason for Owning a Cat               Number
Have a pet to play with                 93
Companionship                           84
Help children learn responsibility      78
Have a pet to communicate with          62
Security                                51

Interpreting a Scatter Plot In Exercises 21-24, use the scatter plot shown. The scatter plot compares the number of hits x made by 30 softball players during the first half of the season with the number of runs batted in y.

[Scatter plot: runs batted in y (1 to 10) against hits x (1 to 10).]

21. Do x and y have a positive correlation, a negative correlation, or no correlation?
22. Why does the scatter plot show only 28 points?
23. From the scatter plot, does it appear that players with more hits tend to have more runs batted in?
24. Can a player have more runs batted in than hits? Explain.

In Exercises 25-28, decide whether a scatter plot relating the two quantities would tend to have a positive correlation, a negative correlation, or no correlation. Explain.

25. The age and value of a car
26. A student's study time and test scores
27. The height and age of a pine tree
28. A student's height and test scores

Pressure In Exercises 29-32, use the data in the table, which show the relationship between the altitude A (in thousands of feet) and the air pressure P (in pounds per square inch).

Altitude, A     0     5    10    15    20    25    30    35    40    45    50
Pressure, P  14.7  12.2  10.1   8.3   6.8   5.5   4.4   3.5   2.7   2.1   1.7

29. Sketch a scatter plot of the data.
30. How are A and P related?
31. Estimate the air pressure at 42,500 feet.
32. Estimate the altitude at which the air pressure is 5.0 pounds per square inch.

Agriculture In Exercises 33-36, use the data in the table, where x is the number of units of fertilizer applied to sample plots and y is the yield (in bushels) of a crop.

x    0   1   2   3   4   5   6   7   8
y   58  60  59  61  63  66  65  67  70

33. Sketch a scatter plot of the data.
34. Determine whether the points are positively correlated, are negatively correlated, or have no discernible correlation.
35. Sketch the line that you think best represents the data. Find an equation of the line you sketched. Use the line to predict the yield when 10 units of fertilizer are used.
36. Can the model found in Exercise 35 be used to predict yields for arbitrarily large values of x? Explain.

Speed of Sound In Exercises 37-40, use the data in the table, where h is altitude in thousands of feet and v is the speed of sound in feet per second.

h      0     5    10    15    20    25    30    35
v   1116  1097  1077  1057  1036  1016   994   972

37. Sketch a scatter plot of the data.
38. Determine whether the points are positively correlated, are negatively correlated, or have no discernible correlation.
39. Sketch the line that you think best represents the data. Find an equation of the line you sketched. Use the line to predict the speed of sound at an altitude of 27,000 feet.
40. The speed of sound at an altitude of 70,000 feet is approximately 968 feet per second. What does this suggest about the validity of using the model in Exercise 39 to predict the speed of sound beyond the values of h given in the table?

In Exercises 41-44, use a graphing calculator to find the least squares regression line for the data. Sketch a scatter plot and the regression line.

41. (0, 23), (1, 20), (2, 19), (3, 17), (4, 15), (5, 11), (6, 10)

42. (4, 52.8), (5, 54.7), (6, 55.7), (7, 57.8), (8, 60.2), (9, 63.1), (10, 66.5)
43. (-10, 5.1), (-5, 9.8), (0, 17.5), (2, 25.4), (4, 32.8), (6, 38.7), (8, 44.2), (10, 50.5)
44. (-10, 213.5), (-5, 174.9), (0, 141.7), (5, 119.7), (8, 102.4), (10, 87.6)

45. Sales The table shows the sales y (in billions of dollars) from full-service restaurants for the years 2002 through 2006, where t = 2 corresponds to 2002. (Source: National Restaurant Association)

t       2      3      4      5      6
y   141.9  148.3  156.9  165.0  172.8

(a) Use a graphing calculator to find the least squares regression line for the data. Use the equation to estimate sales in 2007.
(b) Make a scatter plot of the data and sketch the graph of the regression line.

46. Advertising The management of a department store ran an experiment to determine if a relationship existed between sales S (in thousands of dollars) and the amount spent on advertising x (in thousands of dollars). The following data were collected.

x     1    2    3    4    5    6    7    8
S   405  423  455  466  492  510  525  559

(a) Use a graphing calculator to find the least squares regression line. Use the equation to estimate sales when $4500 is spent on advertising.
(b) Make a scatter plot of the data and sketch the graph of the regression line.

In Exercises 47-50, find the mean, median, and mode of the data set.

47. 5, 12, 7, 14, 8, 9, 7
48. 30, 37, 32, 39, 33, 34, 32
49. 5, 12, 7, 24, 8, 9, 7
50. 20, 37, 32, 39, 33, 34, 32

51. Electric Bills A person had the following monthly bills for electricity. What are the mean, median, and mode of the collection of bills?

Jan.   $67.92     July    $81.76
Feb.   $59.84     Aug.    $74.98
Mar.   $52.00     Sept.   $87.82
Apr.   $52.50     Oct.    $83.18
May    $57.99     Nov.    $65.35
June   $65.35     Dec.    $57.00

52. Car Rental A car rental company kept the following record of the numbers of miles driven by a car that was rented. What are the mean, median, and mode of these data?

Monday      410
Tuesday     260
Wednesday   320
Thursday    320
Friday      460
Saturday    150

53. Six-Child Families A study was done on families with six children. The table shows the number of families in the study with the indicated number of girls. Determine the mean, median, and mode of this set of data.

Number of girls   0   1   2   3   4   5   6
Frequency         1  24  45  54  50  19   7

54. Sports A baseball fan examined the records of a baseball player's performance during his last 50 games. The number of games in which the player had 0, 1, 2, 3, and 4 hits are recorded in the table.

Number of hits    0   1   2   3   4
Frequency        14  26   7   2   1

(a) Determine the average number of hits per game.
(b) The player had 200 at bats during his last 50 games. Determine his batting average.

55. Think About It Construct a collection of numbers that has the following properties. If this is not possible, explain why it is not.
Mean = 6, median = 4, mode = 4

56. Think About It Construct a collection of numbers that has the following properties. If this is not possible, explain why it is not.
Mean = 6, median = 6, mode = 4

57. Test Scores A professor records the following scores for a 100-point exam.
99, 64, 80, 77, 59, 72, 87, 79, 92, 88, 90, 42, 20, 89, 42, 100, 98, 84, 78, 91
Which measure of central tendency best describes these test scores?

58. Shoe Sales A salesperson sold eight pairs of men's dress shoes. The sizes of the eight pairs were as follows: 10 1/2, 8, 12, 10 1/2, 10, 9 1/2, 11, and 10 1/2. Which measure (or measures) of central tendency best describes the typical shoe size sold?