MEASURING THE SPREAD OF DATA: 6F


CONTINUING WITH DESCRIPTIVE STATS 6E, 6F, 6G, 6H, 6I

MEASURING THE SPREAD OF DATA: 6F
o Think about this example: Suppose you are at a high school football game and you ask 40 people in the student section their age.
o Then you head to a professional game and sample 40 random people there. You find the same mean age as at the high school game.
o What is different about the two scenarios?
o Are the means a good representation of the data collected?

DISPERSION
o Sometimes the mean, median and mode don't give you an accurate description of the distribution. To get one, we need to measure both the centre and the dispersion.
o We can identify the centre, but the spread of the data can be analyzed in 3 different ways:
  o range,
  o interquartile range,
  o standard deviation.

RANGE
The range of the data, the max minus the min, is not a particularly reliable measure of spread. Why do you think that would be true?
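One way to explore the range question is to try it on numbers. The sketch below uses made-up ages (hypothetical, not data from these notes) for the two football crowds: both samples share a mean, yet their ranges differ widely, and a single unusual age is enough to move the range a lot.

```python
from statistics import mean

# Hypothetical ages, invented for illustration only.
high_school = [16, 17, 17, 17, 18]
pro_game = [3, 10, 17, 24, 31]

def data_range(values):
    """Range = maximum minus minimum."""
    return max(values) - min(values)

# Same mean, very different spread.
print(mean(high_school), data_range(high_school))
print(mean(pro_game), data_range(pro_game))

# Adding one unusual value changes the range dramatically,
# because the range only looks at the two most extreme values.
with_grandparent = high_school + [71]
print(data_range(with_grandparent))
```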

THE QUARTILES AND THE INTERQUARTILE RANGE
The median divides the data into two equal halves. The middle of the lower half is the 1st quartile, or lower quartile (Q1). The middle of the upper half is the 3rd quartile, or upper quartile (Q3). The distance between the two quartiles is called the interquartile range (IQR). The IQR tells us the range of the middle 50% of the data.

IQR = Q3 - Q1

EXAMPLE
1) Reorder the set.
2) Find the median.
3) Find the lower quartile and upper quartile. If there is a single middle term, disregard it when finding the quartiles. If there are 2 middle terms for the median, use the lower one for Q1 and the upper one for Q3.
4) Calculate the IQR.
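The four steps above can be sketched in Python. This is a minimal sketch of the split-halves method as described here; the function name `quartiles` and the sample values are hypothetical, and a GDC or other software may use a slightly different quartile convention.

```python
from statistics import median

def quartiles(data):
    """Return (Q1, median, Q3) by splitting the ordered data into halves."""
    ordered = sorted(data)            # step 1: reorder the set
    n = len(ordered)
    half = n // 2
    lower = ordered[:half]            # lower half (middle term excluded if n is odd)
    upper = ordered[half + 1:] if n % 2 else ordered[half:]
    return median(lower), median(ordered), median(upper)

# Hypothetical data set, invented for illustration only.
sample = [7, 3, 9, 1, 5, 11, 13]
q1, q2, q3 = quartiles(sample)
iqr = q3 - q1                         # step 4: IQR = Q3 - Q1
print(q1, q2, q3, iqr)
```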

CALCULATOR EXAMPLE
Use a GDC to calculate the range, Q1 & Q3, and IQR.

6G BOX AND WHISKER PLOT
Hopefully you have seen these before. Let's break it down quickly. Make sure that you use a number line that is in equal increments. Why would that need to be true?

WHAT WOULD A B&W LOOK LIKE IF... (DRAW AN EXAMPLE!)
o The data was a symmetrical distribution?
o The data was positively skewed?
o The data was negatively skewed?
Where on the number line is the outlier?
o Toward the positive side = positively skewed
o Toward the negative side = negatively skewed

OYO: TRY IT
Create a box and whisker plot (boxplot) from the data: 13, 24, 14, 11, 9, 31, 33, 33, 33, 18, 29, 28
Use of a calculator can be helpful, but it doesn't label the important values for you, so

PARALLEL BOXPLOTS
Simply put, two sets of data are compared on the same number line with two boxplots.
Example: A hospital is trialing a new anesthetic drug and has collected data on how long the new and old drugs take before the patient becomes unconscious. They wish to know which drug acts faster and which is more reliable.
Old drug times (s): 8, 12, 9, 8, 16, 10, 14, 7, 5, 21, 13, 10, 8, 10, 11, 8, 11, 9, 11, 14
New drug times (s): 8, 12, 7, 8, 12, 11, 9, 8, 10, 8, 10, 9, 12, 8, 8, 7, 10, 7, 9, 9
Let's put these on the same number line and compare the data. Use a 5-number summary!
o Faster?
o Reliable?
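As a check on the hand-drawn boxplots, the 5-number summaries for both drugs can be computed with a short sketch. It uses the split-halves quartile method from the quartiles slide, so a GDC may report slightly different quartiles. The `outliers` helper applies the common 1.5 x IQR fence convention, which is an assumption here and is not stated in these notes; both function names are hypothetical.

```python
from statistics import median

def five_number_summary(data):
    """Return (min, Q1, median, Q3, max) using the split-halves quartile method."""
    ordered = sorted(data)
    n = len(ordered)
    half = n // 2
    lower = ordered[:half]
    upper = ordered[half + 1:] if n % 2 else ordered[half:]
    return ordered[0], median(lower), median(ordered), median(upper), ordered[-1]

def outliers(data, k=1.5):
    """Values more than k * IQR beyond the quartiles (assumed convention)."""
    _, q1, _, q3, _ = five_number_summary(data)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [x for x in data if x < low or x > high]

old_drug = [8, 12, 9, 8, 16, 10, 14, 7, 5, 21, 13, 10, 8, 10, 11, 8, 11, 9, 11, 14]
new_drug = [8, 12, 7, 8, 12, 11, 9, 8, 10, 8, 10, 9, 12, 8, 8, 7, 10, 7, 9, 9]

for label, times in (("old", old_drug), ("new", new_drug)):
    print(label, five_number_summary(times), "outliers:", outliers(times))
```

When reading the output, a lower median points to the faster drug, and a smaller IQR and range point to the more consistent one.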

INTERESTING TO NOTE
Old drug times (s): 8, 12, 9, 8, 16, 10, 14, 7, 5, 21, 13, 10, 8, 10, 11, 8, 11, 9, 11, 14
Are any of these outliers?

CUMULATIVE FREQUENCY GRAPHS
Before we get started:
o Cumulative frequency: the cumulative frequency of an event is the sum of the frequencies up to and including that event.
o Cumulative relative frequency: the cumulative relative frequency of an event is the sum of the frequencies up to and including that event, divided by the total number n (the percent of data used thus far).

EXAMPLE BY HAND
Length of steel rod, to 3 decimal places.

Lengths   Tally   Frequency   Relative frequency   Cumulative frequency   Cumu. relative frequency
1.00              2
1.25              7
1.50              7
1.75              10
2.00              15
2.25              24
2.50              33
2.75              14
3.00              11
3.25              21
3.50              6
3.75              3
Total

PERCENTILES (EXACT PERCENTILES ARE NOT ON IB EXAM)
A percentile is the score below which a certain percentage of the data lies. For example:
o The 85th percentile is the score below which 85% of the data lies.
o If your score in a test is the 95th percentile, then 95% of the class have scored less than you.
Notice that:
o the lower quartile (Q1) is the 25th percentile
o the median (Q2) is the 50th percentile
o the upper quartile (Q3) is the 75th percentile.
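The steel rod table and the percentile definitions can be tied together with a short sketch, useful for checking the columns once they have been filled in by hand. The `percentile_class` helper is hypothetical: it returns the first tabulated length whose cumulative frequency reaches the requested share of the data, which is only a rough table lookup and not an exact percentile.

```python
from itertools import accumulate

lengths = [1.00, 1.25, 1.50, 1.75, 2.00, 2.25, 2.50, 2.75, 3.00, 3.25, 3.50, 3.75]
freqs = [2, 7, 7, 10, 15, 24, 33, 14, 11, 21, 6, 3]

total = sum(freqs)
rel_freq = [f / total for f in freqs]        # relative frequency
cum_freq = list(accumulate(freqs))           # running total of frequencies
cum_rel = [c / total for c in cum_freq]      # cumulative relative frequency

for row in zip(lengths, freqs, rel_freq, cum_freq, cum_rel):
    print("{:.2f} {:>3} {:.3f} {:>4} {:.3f}".format(*row))

def percentile_class(p):
    """First tabulated length whose cumulative frequency reaches share p (0 to 1)."""
    for length, c in zip(lengths, cum_freq):
        if c >= p * total:
            return length
    return lengths[-1]

print(percentile_class(0.25), percentile_class(0.50), percentile_class(0.75))
```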

CUMULATIVE FREQUENCY GRAPH
Represents only cumulative frequency. It starts at 0 and ends at the total (these are the boundaries).

LET'S CREATE OUR OWN
From the table on slide 18: length of steel rod.

OYO FROM YOUR BOOK

9/27/2017

THE LIMITATIONS
Range and IQR are limited in the amount of information they give. We talked about the limitations of range. What would the limitations of IQR be? We need a better way of describing the dispersion of the data!!

STANDARD DEVIATION
DEF: The measure of deviation between the scores and the mean; a measure of the dispersal of the data.
o The larger the standard deviation, the more widely spread the data.
o The smaller the standard deviation, the less spread (less dispersed) the data.
o It tells us how far each score deviates from the mean.
We calculate it by considering a data set of n values $x_1, x_2, x_3, \dots, x_n$ with mean $\bar{x}$:

$\sigma = \sqrt{\dfrac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n}}$

LET'S BREAK IT DOWN
First things first: we are talking about individual, ungrouped data.
o $n$ = total frequency
o $x_i$ = individual score
o $\bar{x}$ = mean
o $\sigma$ = the standard deviation (SD)
$(x_i - \bar{x})$ measures how far an individual score is from the mean. We then sum all of those distances after making them positive by squaring them. If this number is small, we know that most of the data values are close to $\bar{x}$. Dividing by $n$ averages the squared distances, and taking the square root corrects the units.

STANDARD DEVIATION BY HAND
This will be an expectation of mine, so learn how to do it. I will be testing you on this, but the IB papers and IA will not require you to do it by hand. The best way to find standard deviation by hand is to use a table. Let's look at an example and fill in the table by hand. We do it by hand to understand the mechanism the GDC uses to compute the standard deviation.
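The breakdown above maps onto code one step at a time. The sketch below is a minimal version of the divide-by-n (population) calculation; the function name `population_sd` and the sample scores are hypothetical.

```python
from math import sqrt

def population_sd(values):
    """Standard deviation with division by n, following the breakdown step by step."""
    n = len(values)                                      # total frequency
    x_bar = sum(values) / n                              # mean
    squared_devs = [(x - x_bar) ** 2 for x in values]    # squaring makes every distance positive
    return sqrt(sum(squared_devs) / n)                   # average, then correct the units

# Hypothetical scores, invented for illustration only.
scores = [2, 4, 4, 4, 5, 5, 7, 9]
print(population_sd(scores))
```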

EXAMPLE: IA MATH SCORES FOR WILLAMETTE HS
Calculate the SD, or $\sigma$, for the data below. We will need to know some information before we can calculate it. What info do we need?

Math IA Scores
4
2
5
4
5
6
7
6
4
3
TOTAL

$\sigma = \sqrt{\dfrac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n}}$

NOW, USING A CALCULATOR
For larger sets of data, it only makes sense to use a GDC, so let's do an example. Calculate the standard deviation of the data set:
2, 5, 4, 6, 7, 5, 6, 8, 5, 8, 3, 9, 6, 8, 1, 1, 2, 2, 2, 5
As before, we would enter this into a list and use 1-variable stats to calculate the standard deviation. Make sure you always use the standard deviation of the population (this is a new development!).
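Both parts of this slide can be checked with a short sketch. The first part prints a by-hand style table for the IA scores; the column headings (score, deviation, squared deviation) are an assumed layout, since the original table columns are not shown. The second part uses `statistics.pstdev`, which divides by n (the population version), in place of the GDC's 1-variable stats; `statistics.stdev` would divide by n - 1 instead.

```python
from math import sqrt
from statistics import pstdev

ia_scores = [4, 2, 5, 4, 5, 6, 7, 6, 4, 3]
n = len(ia_scores)
x_bar = sum(ia_scores) / n

# By-hand style table (assumed column layout).
print(f"{'x':>3} {'x - mean':>9} {'(x - mean)^2':>13}")
total_sq = 0.0
for x in ia_scores:
    dev = x - x_bar
    total_sq += dev ** 2
    print(f"{x:>3} {dev:>9.2f} {dev ** 2:>13.2f}")
print("TOTAL of squared deviations:", round(total_sq, 2))

sd_by_hand = sqrt(total_sq / n)
print("SD of IA scores:", round(sd_by_hand, 3))

# Larger data set from the calculator example, using the population SD.
calc_data = [2, 5, 4, 6, 7, 5, 6, 8, 5, 8, 3, 9, 6, 8, 1, 1, 2, 2, 2, 5]
sd_calc = pstdev(calc_data)
print("SD of calculator data:", round(sd_calc, 3))
```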

FREQUENCY TABLES
For frequency tables, we can still find the SD by hand or with a GDC. By hand, we use the formula below, where $f_i$ is the frequency of score $x_i$ and $n = \sum f_i$. This simply adds one more column to our table. Let's calculate this by hand.

$\sigma = \sqrt{\dfrac{\sum f_i (x_i - \bar{x})^2}{n}}$

Math IA Scores   Frequency
1                1
2                2
3                4
4                8
5                17
6                11
7                3

GROUPED DATA FREQUENCY TABLES
Same thing here, but we use $x_i$ as the midpoint of the class intervals. Let's use technology to calculate this one!
Steps for grouped data:
1. Create
2. Enter into
3. Enter freq. into
4. Press. 1
5. List
6. Freq.
7. Press calculate and find
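For the Math IA frequency table, the extra frequency column amounts to weighting each score by how often it occurs. The sketch below is one way to check a by-hand answer, assuming the divide-by-n (population) formula used throughout these notes; the variable names are hypothetical.

```python
from math import sqrt

scores = [1, 2, 3, 4, 5, 6, 7]
freqs = [1, 2, 4, 8, 17, 11, 3]

n = sum(freqs)                                             # total frequency
x_bar = sum(f * x for x, f in zip(scores, freqs)) / n      # weighted mean
sum_sq = sum(f * (x - x_bar) ** 2 for x, f in zip(scores, freqs))
sd = sqrt(sum_sq / n)

print("n =", n)
print("mean =", round(x_bar, 3))
print("SD =", round(sd, 3))
```

For grouped data, the same code applies if `scores` is replaced by the midpoints of the class intervals.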

COMPARING THE SPREAD OF TWO DATA SETS
The following exam results were recorded by two classes of students studying Spanish:
Class A: 64, 69, 74, 67, 78, 88, 76, 90, 89, 84, 83, 87, 78, 80, 95, 75, 55, 78, 81
Class B: 94, 90, 88, 81, 86, 96, 92, 93, 88, 72, 94, 61, 87, 90, 97, 95, 77, 77, 82, 90
Compare the results of the two classes, including their spread. Let's use the GDC to 1) compare the means and 2) compare the SDs for dispersion.
CORRECT, in a galaxy far far away
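The same comparison can be sketched in Python in place of the GDC, using `statistics.mean` and the population standard deviation `statistics.pstdev`. The class with the higher mean scored better on average; the class with the larger SD has the more widely spread results.

```python
from statistics import mean, pstdev

class_a = [64, 69, 74, 67, 78, 88, 76, 90, 89, 84, 83, 87, 78, 80, 95, 75, 55, 78, 81]
class_b = [94, 90, 88, 81, 86, 96, 92, 93, 88, 72, 94, 61, 87, 90, 97, 95, 77, 77, 82, 90]

for label, results in (("Class A", class_a), ("Class B", class_b)):
    print(label,
          "n =", len(results),
          "mean =", round(mean(results), 2),
          "SD =", round(pstdev(results), 2))
```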

THAT PUTS US AT THE END OF THE POWER STANDARD!
We will have one day of review/activity, then take the PS2 assessment!
Your homework is:
6G.1 #2, 4
6G.2 #2, 4 (use GDC!!)
6H #1, 4, 5
6I.1 #1, 3, 5
6I.2 #1, 2, 6, 7
6I.3 #1, 3