Section 6.2: Measures of Variation

Similar documents
Math 082 Final Examination Review

σ. We further know that if the sample is from a normal distribution then the sampling STAT 2507 Assignment # 3 (Chapters 7 & 8)

8.1 Frequency Distribution, Frequency Polygon, Histogram page 326

Chapter 5: Normal Probability Distributions

Biostatistics Presentation of data DR. AMEER KADHIM HUSSEIN M.B.CH.B.FICMS (COM.)

Surds 1. Good form. ab = a b. b = a. t is an integer such that

Chapter 6. The Standard Deviation as a Ruler and the Normal Model 1 /67

Math Want to have fun with chapter 4? Find the derivative. 1) y = 5x2e3x. 2) y = 2xex - 2ex. 3) y = (x2-2x + 3) ex. 9ex 4) y = 2ex + 1

Essential Statistics Chapter 6

IB Questionbank Mathematical Studies 3rd edition. Grouped discrete. 184 min 183 marks

Exam: practice test 1 MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question.

Essential Question: What are the standard intervals for a normal distribution? How are these intervals used to solve problems?

Chapter 6 The Standard Deviation as a Ruler and the Normal Model

A C E. Answers Investigation 4. Applications

Continuous random variables

Sections 6.1 and 6.2: The Normal Distribution and its Applications

The Normal Distribution

Topic 5: Statistics 5.3 Cumulative Frequency Paper 1

Honors Algebra 1 - Fall Final Review

STAB22 Statistics I. Lecture 7

STAT 200 Chapter 1 Looking at Data - Distributions

Lecture 2. Descriptive Statistics: Measures of Center

Chapter 6 Group Activity - SOLUTIONS

Math 2311 Sections 4.1, 4.2 and 4.3

QUANTITATIVE DATA. UNIVARIATE DATA data for one variable

How spread out is the data? Are all the numbers fairly close to General Education Statistics

(c) Plot the point ( x, y ) on your scatter diagram and label this point M. (d) Write down the product-moment correlation coefficient, r.

Chapter. Numerically Summarizing Data Pearson Prentice Hall. All rights reserved

download instant at

MODULE 9 NORMAL DISTRIBUTION

Density Curves and the Normal Distributions. Histogram: 10 groups

Algebra 2. Outliers. Measures of Central Tendency (Mean, Median, Mode) Standard Deviation Normal Distribution (Bell Curves)

DEPARTMENT OF QUANTITATIVE METHODS & INFORMATION SYSTEMS QM 120. Spring 2008

AP Final Review II Exploring Data (20% 30%)

M 225 Test 1 B Name SHOW YOUR WORK FOR FULL CREDIT! Problem Max. Points Your Points Total 75

CONTENTS Page Rounding 3 Addition 4 Subtraction 6 Multiplication 7 Division 10 Order of operations (BODMAS)

Unit 1: Number System Fluency

STT 315 This lecture is based on Chapter 2 of the textbook.

Chapter 6 Assessment. 3. Which points in the data set below are outliers? Multiple Choice. 1. The boxplot summarizes the test scores of a math class?

LC OL - Statistics. Types of Data

Chapter 5. Understanding and Comparing. Distributions

Tabulation means putting data into tables. A table is a matrix of data in rows and columns, with the rows and the columns having titles.

M1-Lesson 8: Bell Curves and Standard Deviation

Intermediate Mathematics League of Eastern Massachusetts

Negative Exponents Scientific Notation for Small Numbers

Mean, Median, Mode, and Range

A is one of the categories into which qualitative data can be classified.

Vocabulary: Samples and Populations

are the objects described by a set of data. They may be people, animals or things.

Shape, Outliers, Center, Spread Frequency and Relative Histograms Related to other types of graphical displays

Math 223 Lecture Notes 3/15/04 From The Basic Practice of Statistics, bymoore

CHAPTER 5: EXPLORING DATA DISTRIBUTIONS. Individuals are the objects described by a set of data. These individuals may be people, animals or things.

Example: What number is the arrow pointing to?

Accelerated Intermediate 2 Summer Math Packet

Complement: 0.4 x 0.8 = =.6

(a) Find the value of x. (4) Write down the standard deviation. (2) (Total 6 marks)

Section 7.2 Homework Answers

SL - Binomial Questions

1 Probability Distributions

3.1 Measure of Center

Recall that the standard deviation σ of a numerical data set is given by

6 th Grade Mathematics Alignment Common Core State Standards and CT Frameworks

FSA Algebra I End-of-Course Review Packet

California 5 th Grade Standards / Excel Math Correlation by Lesson Number

Measures of the Location of the Data

Sampling, Frequency Distributions, and Graphs (12.1)

ALGEBRA 1. Workbook Common Core Standards Edition. Published by TOPICAL REVIEW BOOK COMPANY. P. O. Box 328 Onsted, MI

Chapter 6 Continuous Probability Distributions

Mathematics Midterm Exam 2. Midterm Exam 2 Practice Duration: 1 hour This test has 7 questions on 9 pages, for a total of 50 points.

Chapter 3: Examining Relationships

Quantitative Bivariate Data

Unit 2. Describing Data: Numerical

Chapter 3. Data Description

Elementary Statistics

Performance of fourth-grade students on an agility test

Algebra 2 Level 2 Summer Packet

FSA Algebra I End-of-Course Review Packet

Chapter. Numerically Summarizing Data. Copyright 2013, 2010 and 2007 Pearson Education, Inc.

Stat 101 Exam 1 Important Formulas and Concepts 1

[ z = 1.48 ; accept H 0 ]

The Normal Distribution. Chapter 6

DMA 50 Worksheet #1 Introduction to Graphs: Analyzing, Interpreting, and Creating Graphs

Chapter 9: Equations and Inequalities

Section 3.2 Measures of Central Tendency

Slide 1. Slide 2. Slide 3. Pick a Brick. Daphne. 400 pts 200 pts 300 pts 500 pts 100 pts. 300 pts. 300 pts 400 pts 100 pts 400 pts.

The empirical ( ) rule

Chapter 2: Tools for Exploring Univariate Data

Statistics. Industry Business Education Physics Chemistry Economics Biology Agriculture Psychology Astronomy, etc. GFP - Sohar University

6. 5x Division Property. CHAPTER 2 Linear Models, Equations, and Inequalities. Toolbox Exercises. 1. 3x = 6 Division Property

Summer Math Packet for Students Entering Algebra 2 1. Percentages. a. What is 15% of 250 b. 150 is what percent of 750?

Further Mathematics 2018 CORE: Data analysis Chapter 2 Summarising numerical data

Texas Practice End-of-Course Assessment 3

3. Evaluate: (a) ( 8) 144 ( 9) [2] (b) {[(54 39) ( 5) + 7] ( 3) + ( 6)} ( 3 + 1) [2] (c) ( 2) ( 5) 2 ( 3) 3 [2]

Section 2.3: One Quantitative Variable: Measures of Spread

Math 1020 TEST 3 VERSION A Spring 2017

Measures of Dispersion

Ohio s Learning Standards-Extended. Mathematics. Ratio and Proportional Relationships Complexity a Complexity b Complexity c

IDAHO EXTENDED CONTENT STANDARDS MATHEMATICS

Name: Class: Date: ID: A. Find the mean, median, and mode of the data set. Round to the nearest tenth. c. mean = 9.7, median = 8, mode =15

LHS Algebra Pre-Test

Transcription:

Section 6.2: Measures of Variation Measures of variation (or spread) refers to a set of numerical summaries that describe the degree to which the data are spread out. Why do we need them? Why is using measures of center not sucient in describing data sets? To answer these kind of questions consider the following example. Example 1. Babysitting Suppose you are starting a new babysitting business. You have advertised in your neighborhood and it does not take long for calls to come in. You get calls from two moms, Mom A and Mom B. The information given by both moms is illustrated in the picture below. As you can see the information provided is not enough to distinguish between the two families. It is highly unlikely that these two moms have kids that are the exact same ages. You politely ask the moms for the actual ages of their children. Mom A Mom B 1, 2, 4, 8, 10 3, 4, 5, 6, 7 We will use these two groups of data to nd a new set of measures called measures of variation: range and standard deviation. These measures will help us quantify the age variation for each family and better understand our data. 184

Objective: Compute a range Denition. The range of a set of data is the dierence between the largest value and the smallest value in the data set. Range = Largest value Smallest value The larger the range value, the more scattered the data. Example 2. Babysitting Find the range for our data: Mom A Mom B 1, 2, 4, 8, 10 3, 4, 5, 6, 7 Range of children's ages of Mom A = 10-1 = 9 Range of children's ages of Mom B = 7-3 = 4 Since the range value 9 is the highest of the two range values, we conclude that children's ages of Mom A have greater age variation than the children's ages of Mom B. A disadvantage of using the range is that it only considers the maximum and the minimum values and ignores the rest of data. 185

Objective: Compute a standard deviation We would prefer a measure of variation to quantify spread with respect to the center as well as to use all the data values in the set. This measure is the standard deviation. Denition 3. The standard deviation is the average distance the data values are from the mean. Example 4. Babysitting s = q P (x x) 2 n 1 Consider the Babysitting example again. Find the standard deviation for Mom A children's ages. Step 1: Start by setting up a table like the one below. Step 2: Write each data value in the rst column of the table. In column 2 nd the dierence betwen each age and the mean. Next square the values from column 2 and enter the results in column 3 (see the computations in the table below). Mom A Children's ages mean is x = 5 Age =x Age - Mean = x x (Age Mean) 2 = (x x) 2 1 1-5 = - 4 ( 4) 2 =16 2 2-5 = - 3 ( 3) 2 =9 4 4-5 = -1 ( 1) 2 =1 8 8-5 = 3 (3) 2 =9 10 10-5 = 5 (5) 2 =25 Step 3: Substitute the values in the formula for standard deviation and perform the computations. s = s = q P (x x) 2 n 1 r 16+9 + 1 + 9 + 25 5 1 q s = 60 4 p s = 15 Standard deviation formula. Add the values from column 3 of the table on the numerator. On the denominator subtract 1 from the number of children of Mom A. Divide 60 by 4 Take square root of 15 s 3.87 This is the standard deviation of childrend ages for Mom A. 186

Example 5. Babysitting Find the standard deviation for Mom B children's ages. We repeat the same steps as in previous example to nd the standard deviation of children ages for Mom B. Step 1: Set up the table. Step 2: We write each data value in the rst column of the table. In column 2 we nd the dierence betwen each age and the mean. Square the values from column 2 and enter the results in column 3 (see the computations in the table below). Mom B Children's ages mean is x = 5 Age =x Age - Mean = x x (Age Mean) 2 = (x x) 2 3 3-5 = - 2 ( 2) 2 =4 4 4-5 = - 1 ( 1) 2 =1 5 5-5 = 0 (0) 2 =0 6 6-5 = 1 (1) 2 =1 7 7-5 = 2 (2) 2 =4 Step 3: Substitute the values in the formula for standard deviation and perform the computations. s = q P (x x) 2 n 1 Standard deviation formula. r s = 4 + 1 + 0 + 1 + 4 5 1 q s = 10 4 p s = 2.5 Add the values from column 3 of the table on the numerator. On the denominator subtract 1 from the number of children of Mom A. Divide 10 by 4 Take square root of 2.5 s 1.58 This is the standard deviation of childrend ages for Mom B. 187

Objective: Compare distributions with dierent standard deviations As mentioned previously, the standard deviation can be used to determine the variation (spread) of data. That is, the larger the standard deviaiton, the more the data values are spread. This also permits us to compare data sets with different standard deviations. Let us look at a few examples. Example 6. Babysitting We would like to compare the age variation (spread) between the two families. To visualize the idea of spread consider the dot plots below coresponding to our data sets. Mom A Mom B Clearly, the data in the dot plot A is spread out more from the mean, while the data in the dot plot B is closer to the mean. Thus, the graphs show that Mom A children's ages are more spread out from the mean than the Mom B children's ages. We can also compare the standard deviations calculated previously for each family. We have a standard deviation value of 3.87 which is larger than the standard deviation value of 1.58. Thus, we reach the same conclusion that Mom A children's ages have a greater age variation than Mom B children's ages. Example 7. Emergency Room Waiting Time Consider the following scenario: The mean of waiting times in an emergency room is 90 minutes with a standard deviation of 12 minutes for people who are admitted for additional treatment. The mean waiting time for patients who are discharged after receiving treatment is 130 minutes with a standard deviation of 19 minutes. In this example, by comparing the standard deviation values (13 min. and 19 min.), we can conclude that the waiting time for patients who are discharged after receiving treatment are more spread out than the waiting time for people who are admitted for additional treatment. 188

Example 8. Bank Waiting Time We can also determine which data set has a greater variation by looking at histograms. Consider the two histograms below which illustrate the waiting time (in minutes) of customers at two banks. The data in histogram B is spread out more from the mean, while the data in histogram A is closer to the mean. This means that histogram B has a larger standard deviation than histogram A. Therefore, Bank A has more consistency in time. 189

Objective: Describe data that are bell shaped - using Empirical Rule Empirical Rule species the proportions of the spread in terms of the standard deviation for distributions that are unimodal and symmetric. The Empirical Rule (or 68-95-99.7) If the data distribution is unimodal and symmetric, then: About 68% of the data values will fall within 1 standard deviation of the mean. About 95% of the data values will fall within 2 standard deviations of the mean. About 99.7% of the data values will fall within 3 standard deviations of the mean. Example 9. Assume the amount of milk consumption per day of six-months-old babies are unimodal and symmetric with a mean of 15 ounces and a standard deviation of 2 ounces. a) What percentage of babies consume between 13 and 17 ounces of milk per day? b) What percent of babies consume between 11 and 13 ounces of milk per day? c) What percent of babies consume less than 19 ounces of milk per day? d) What percent of babies consume more than 13 ounces of milk per day? e) 95% of six-months-old babies consume milk between what limits? The key to answer these questions is to sketch a bell-shaped curve and label the horizontal axis with the correct marks. 190

Find the marks which are 1 standard deviation away from the mean. 15-2 = 13 15 +2 = 17 13 is the mark 1 standard deviation below the mean. 17 is the mark 1 standard deviation above the mean. Find the marks which are 2 standard deviations away from the mean. 13-2 = 11 17 +2 = 19 11 is the mark 2 standard deviations below the mean. 19 is the mark 2 standard deviations above the mean. Find the marks which are 3 standard deviations away from the mean. 11-2 = 9 19 +2 = 21 9 is the mark 3 standard deviations below the mean. 21 is the mark 3 standard deviations above the mean. Sketch a bell-shaped curve and label the horizontal axis with the marks. 191

a) To nd the percentage of babies that consume between 13 and 17 ounces of milk per day shade the area under the curve between 13 and 17 (see picture below) and gure out the percent of the shaded area. Answer: 68% b) To nd the percentage of babies that consume between 11 and 13 ounces of milk per day shade the area under the curve between 11 and 13 (see picture below) and gure out the percent of the shaded area. Answer: 13.5% 192

c) To nd the percentage of babies that consume less than 19 ounces of milk per day shade the area under the curve from 19 and below (see picture below) and gure out the percent of the shaded area. Answer: 97.5% d) To nd the percentage of babies that consume more than 13 ounces of milk per day shade the area under the curve from 13 and above (see picture below) and gure out the percent of the shaded area. Answer: 84% 193

e) To nd the requested limits we use the Empirical Rule statement that 95% of data values fall in between 2 standard deviations of the mean. From the picture below the limits are the marks on the horizontal line which are 2 standard deviations away from the mean. Answer: 11 and 19 194

6.2 Practice 1. The number of hours a student went to work after school last week are: 1, 4, 3, 2, 6 a. Find the range. b. Calculate the standard deviation. 2. The daily vehicle pass charge in dollars for ve National parks are: 10, 8, 15, 20, 15 a. Find the range. b. Calculate the standard deviation. 3. The range of scores on a statistics test was 40. The highest score was 99. What was the lowest score? 4. The weather station announced that the temeprature uctuates between a low of 80F and a high of 93F. Which measure of spread could be calculated using just this information? What is its value? 5. The table below shows the preparation tax return times (in hours) of two certi- ed public accountants. Accountant X 7 9 5 11 8 Accountant Z 5 11 14 3 7 a. Find the range preparation time for each accountant. b. Calculate the standard deviations for each accountant. c. Who has the more consistent tax preparation time? 6. The average number of days construction workers miss per year is 11 with a standard deviation of 2.3. The average number of days factory workers miss per year is 8 with a standard deviation of 1.8. Which class of workers is more variable in terms of days missed? 7. The histograms below illustrate the contribution (in dollars) made to two presidential candidates in a recent election. Which histogram has a larger standard deviation? 195

8. A standardized test has a mean score of 500 points with a standard deviation of 100 points. A group of students' scores are listed below: Bill: 589 Maria: 695 John: 725 Ella: 435 Adriana: 265 Which of the students have scores within two standard deviations of the mean? 9. The distribution of scores on a test is unimoad and symmetric with a mean of 75. If 68% of the scores fall between 70 and 80, what is the standard deviation of the distribution? 10. Upload speeds for a particular Internet provider are known to be unimodal and symmetric. The mean upload speed is 500 KBps with a standard deviation of 60 KBps. a. What percent of speeds were between 320 KBps and 680 KBps? b. 68% of speeds are between what limits? c. What percent of speeds are greater than 560 KBps? d. What percent of speeds are between 380 KBps and 440 KBps? 196

6.2 Answers 1. a. 5 hours b. 0.86 hours 2. a. 12 dollars b. 4.72 dollars 3. 59 4. Range, 13 F 5. a. Accountant X: range = 6 hours, Accountant Y: range = 11 hours b. Accountant X: s = 2.24 hours, Accountant Y: s = 4.47 hours c. Accountant X has a more consistant tax preparation time (his standard deviation is smaller) 6. The construction workers have more variability in their days missed because the standard deviation is larger. 7. Candidate B because there appears to be more data (taller bars) farther away from the mean. 8. Bill, Maria, and Ella 9. 5 10. a. 99.7% b. 440 KBps and 560 KBps c. 16% d. 13.5% 197