Measures of Dispersion

MATH 130, Elements of Statistics I
J. Robert Buchanan
Department of Mathematics
Fall 2017

Introduction

Recall that a measure of central tendency is a number which is typical of all the values of a variable in a data set. Another important measure is the amount of spread in the values of the variable. A measure of dispersion is a number which describes the spread of the data.

Example

A sample of the times (in minutes) spent waiting in line at two different banks is shown below. Find the mean, median, mode, and midrange for each bank.

Bank A    Bank B
6.5       4.2
6.6       5.4
6.7       5.8
6.8       6.2
7.1       6.7
7.3       7.7
7.4       7.7
7.7       8.5
7.7       9.3
7.7       10.0

                     Bank A    Bank B
Mean ($\bar{x}$)     7.15      7.15
Median ($M$)         7.20      7.20
Mode                 7.7       7.7
Midrange             7.10      7.10
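The four measures of center above can be checked with a short script. This is a minimal sketch using Python's standard `statistics` module; the helper names `midrange` and `summarize` are illustrative and do not come from the slides.

```python
from statistics import mean, median, mode

# Waiting times in minutes, copied from the example.
bank_a = [6.5, 6.6, 6.7, 6.8, 7.1, 7.3, 7.4, 7.7, 7.7, 7.7]
bank_b = [4.2, 5.4, 5.8, 6.2, 6.7, 7.7, 7.7, 8.5, 9.3, 10.0]


def midrange(values):
    """Average of the smallest and largest data values."""
    return (min(values) + max(values)) / 2


def summarize(values):
    """Return (mean, median, mode, midrange), rounded to two decimals."""
    return (
        round(mean(values), 2),
        round(median(values), 2),
        mode(values),
        round(midrange(values), 2),
    )


print("Bank A:", summarize(bank_a))
print("Bank B:", summarize(bank_b))
```

Both banks produce the same four numbers, which is why a measure of center alone cannot tell the two samples apart.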

The Range

Definition: The range, denoted $R$, is the difference between the largest and smallest data values.

$R = \text{largest data value} - \text{smallest data value}$

Example: Find the range in waiting times for each of the banks in the previous example.

$R_A = 7.7 - 6.5 = 1.20$
$R_B = 10.0 - 4.2 = 5.80$
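As a quick check of the definition, the range can be computed directly from the minimum and maximum. The function name `data_range` is an illustrative choice; results are rounded to avoid floating-point noise.

```python
# Waiting times in minutes for the two banks.
bank_a = [6.5, 6.6, 6.7, 6.8, 7.1, 7.3, 7.4, 7.7, 7.7, 7.7]
bank_b = [4.2, 5.4, 5.8, 6.2, 6.7, 7.7, 7.7, 8.5, 9.3, 10.0]


def data_range(values):
    """Largest data value minus smallest data value."""
    return max(values) - min(values)


range_a = round(data_range(bank_a), 2)
range_b = round(data_range(bank_b), 2)
print(f"R_A = {range_a:.2f}, R_B = {range_b:.2f}")
```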

The Variance

Idea: the variance of a variable measures how far its values spread about the mean, using the squared deviations from the mean.
The population variance will be denoted $\sigma^2$.
The sample variance will be denoted $s^2$.

Formula for Population Variance

Definition: The population variance of a variable is the sum of the squared deviations about the population mean divided by the number of observations in the population.

$\sigma^2 = \frac{(x_1 - \mu)^2 + (x_2 - \mu)^2 + \cdots + (x_N - \mu)^2}{N} = \frac{\sum (x_i - \mu)^2}{N}$

Example

Suppose the following table contains the total number of hours students spent studying per week for their courses.

28.4    44.4    36.4    33.5
34.9    30.5    32.7    24.6

Find the population variance. Hint: the population mean is $\mu = 33.18$.

Solution

Hours $x_i$    Deviation $x_i - \mu$    Squared deviation $(x_i - \mu)^2$
28.4           -4.78                    22.8484
44.4           11.22                    125.888
36.4           3.22                     10.3684
33.5           0.32                     0.1024
34.9           1.72                     2.9584
30.5           -2.68                    7.1824
32.7           -0.48                    0.2304
24.6           -8.58                    73.6164

$\sum (x_i - \mu)^2 = 243.195$

$\sigma^2 = \frac{243.195}{8} = 30.40$
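The table can be reproduced step by step in code. The sketch below follows the definition literally (deviation, squared deviation, sum, divide by $N$) and then compares against `statistics.pvariance`. It uses the unrounded mean 33.175 rather than the hint's rounded 33.18, so intermediate values may differ from the table in the last digits.

```python
from statistics import pvariance

# Weekly study hours, treated as the whole population.
hours = [28.4, 44.4, 36.4, 33.5, 34.9, 30.5, 32.7, 24.6]

N = len(hours)
mu = sum(hours) / N                       # population mean
deviations = [x - mu for x in hours]      # x_i - mu
squared = [d ** 2 for d in deviations]    # (x_i - mu)^2
sigma_sq = sum(squared) / N               # population variance

for x, d, sq in zip(hours, deviations, squared):
    print(f"{x:6.1f} {d:8.3f} {sq:10.4f}")
print(f"population variance = {sigma_sq:.2f}")

# The standard library gives the same value (up to rounding error).
library_value = pvariance(hours)
```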

Sample Variance

Definition: The sample variance of a variable is the sum of the squared deviations about the sample mean divided by $n - 1$.

$s^2 = \frac{(x_1 - \bar{x})^2 + (x_2 - \bar{x})^2 + \cdots + (x_n - \bar{x})^2}{n - 1} = \frac{\sum (x_i - \bar{x})^2}{n - 1}$

The quantity $n - 1$ is called the degrees of freedom.

Alternative Formula for $s^2$

Remark: the formula for variance needs the sample mean. An alternative formula for calculating $s^2$ which does not require prior knowledge of the sample mean is

$s^2 = \frac{\sum x_i^2 - \frac{\left(\sum x_i\right)^2}{n}}{n - 1}$
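The slides state the alternative formula without proof. One way to see that it agrees with the definition is to expand the sum of squared deviations and substitute $\bar{x} = \frac{1}{n}\sum x_i$; the following is a sketch of that standard algebra:

```latex
\begin{aligned}
\sum (x_i - \bar{x})^2
  &= \sum x_i^2 - 2\bar{x}\sum x_i + n\bar{x}^2 \\
  &= \sum x_i^2 - 2\,\frac{\left(\sum x_i\right)^2}{n} + \frac{\left(\sum x_i\right)^2}{n} \\
  &= \sum x_i^2 - \frac{\left(\sum x_i\right)^2}{n}
\end{aligned}
```

Dividing both sides by $n - 1$ gives the alternative formula for $s^2$.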

Example

The percentage of on-time arrivals for a sample of airports is tabulated below. Find the sample variance.

Airport              On-time arrivals (%)
Atlanta              79.9
Dallas/Fort Worth    84.9
Los Angeles          81.0
Orlando              82.2
San Diego            78.4
Tampa                79.9

Solution (1 of 2)

Data $x_i$    Data squared $x_i^2$
79.9          6384.01
84.9          7208.01
81.0          6561.00
82.2          6756.84
78.4          6146.56
79.9          6384.01

$\sum x_i = 486.3$
$\sum x_i^2 = 39440.43$

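Solution (2 of 2)

$\sum x_i = 486.3$, $\sum x_i^2 = 39440.43$

$s^2 = \frac{\sum x_i^2 - \frac{\left(\sum x_i\right)^2}{n}}{n - 1} = \frac{39440.43 - \frac{(486.3)^2}{6}}{6 - 1} = \frac{39440.43 - \frac{236487.69}{6}}{5} = \frac{39440.43 - 39414.615}{5} = \frac{25.815}{5} = 5.16$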
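To confirm that the definition and the alternative formula give the same answer for this sample, here is a small sketch that computes both and compares them with `statistics.variance`. The variable names are illustrative.

```python
from statistics import variance

# Percentage of on-time arrivals for the six sampled airports.
on_time = {
    "Atlanta": 79.9,
    "Dallas/Fort Worth": 84.9,
    "Los Angeles": 81.0,
    "Orlando": 82.2,
    "San Diego": 78.4,
    "Tampa": 79.9,
}
x = list(on_time.values())
n = len(x)

# Definition: squared deviations about the sample mean, divided by n - 1.
x_bar = sum(x) / n
s2_definition = sum((xi - x_bar) ** 2 for xi in x) / (n - 1)

# Alternative formula: needs only the sum and the sum of squares.
sum_x = sum(x)
sum_x2 = sum(xi ** 2 for xi in x)
s2_shortcut = (sum_x2 - sum_x ** 2 / n) / (n - 1)

print(f"definition: {s2_definition:.2f}, shortcut: {s2_shortcut:.2f}")
library_value = variance(x)
```

The shortcut is convenient by hand, but it subtracts two large, nearly equal numbers, so in floating-point code the definition form (or a library routine) is usually the safer choice.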

Standard Deviation

Definition: The population standard deviation, denoted $\sigma$, is the square root of the population variance.

$\sigma = \sqrt{\sigma^2}$

The sample standard deviation, denoted $s$, is the square root of the sample variance.

$s = \sqrt{s^2}$

Example

Find the standard deviations of the two previous samples (repeated below).

Bank A    Bank B
6.5       4.2
6.6       5.4
6.7       5.8
6.8       6.2
7.1       6.7
7.3       7.7
7.4       7.7
7.7       8.5
7.7       9.3
7.7       10.0

$s_A = 0.48$
$s_B = 1.82$
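These two values can be reproduced with the standard library. The sketch below treats both lists as samples (divisor $n - 1$) and also checks that the standard deviation is the square root of the sample variance.

```python
from math import sqrt
from statistics import stdev, variance

# Waiting times in minutes for the two banks.
bank_a = [6.5, 6.6, 6.7, 6.8, 7.1, 7.3, 7.4, 7.7, 7.7, 7.7]
bank_b = [4.2, 5.4, 5.8, 6.2, 6.7, 7.7, 7.7, 8.5, 9.3, 10.0]

s_a = stdev(bank_a)  # sample standard deviation of Bank A
s_b = stdev(bank_b)  # sample standard deviation of Bank B

# By definition, s is the square root of the sample variance s^2.
s_a_from_variance = sqrt(variance(bank_a))

print(f"s_A = {s_a:.2f}, s_B = {s_b:.2f}")
```

For data that form a whole population rather than a sample, `statistics.pstdev` (divisor $N$) would be the corresponding call.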

Interpreting Standard Deviation

Assume the distribution of values for a variable is bell-shaped and symmetric.
The mean measures the center of the distribution.
The standard deviation measures the spread of the distribution.
The larger the standard deviation, the greater the spread in the values of the variable.

Example

Recall the bank waiting times presented earlier, for which $s_A = 0.48$ and $s_B = 1.82$. Both banks have the same mean waiting time (7.15 minutes), but the larger standard deviation for Bank B indicates that its waiting times are more spread out.

The Empirical Rule (1 of 2)

If a distribution is roughly bell-shaped, then

Approximately 68% of the data will lie within 1 standard deviation of the mean. In other words, approximately 68% of the data lie between $\mu - \sigma$ and $\mu + \sigma$.
Approximately 95% of the data will lie within 2 standard deviations of the mean. In other words, approximately 95% of the data lie between $\mu - 2\sigma$ and $\mu + 2\sigma$.
Approximately 99.7% of the data will lie within 3 standard deviations of the mean. In other words, approximately 99.7% of the data lie between $\mu - 3\sigma$ and $\mu + 3\sigma$.

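The Empirical Rule (2 of 2)

[Figure: bell curve with the horizontal axis marked at $\mu - 3\sigma$, $\mu - 2\sigma$, $\mu - \sigma$, $\mu$, $\mu + \sigma$, $\mu + 2\sigma$, $\mu + 3\sigma$. The areas of the regions, from left to right, are 0.15%, 2.35%, 13.5%, 34%, 34%, 13.5%, 2.35%, 0.15%; 68% of the area lies within one standard deviation of the mean and 95% within two.]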

Example

Suppose college students sleep an average of 7.3 hours per night with a standard deviation of 0.9 hours, and the distribution of sleep hours is roughly bell-shaped.

1. Determine the percentage of students whose sleep hours are within 2 standard deviations of the mean.
2. Determine the percentage of students who get between 6.4 and 9.1 hours of sleep per night.
3. Determine the percentage of students who get less than 5.5 hours of sleep per night.
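The slides do not show a solution for this example. One way to work it is to express each cutoff as a whole number of standard deviations from the mean and add up the corresponding band percentages from the empirical rule. The sketch below does this under two assumptions: the distribution is roughly bell-shaped, and every cutoff falls on a whole number of standard deviations. All function and variable names are illustrative.

```python
# Percent of data in each one-standard-deviation band under the empirical rule.
# Each key is the band's lower edge, in standard deviations from the mean.
BAND_PERCENT = {-3: 2.35, -2: 13.5, -1: 34.0, 0: 34.0, 1: 13.5, 2: 2.35}
TAIL_PERCENT = 0.15  # percent below mu - 3*sigma (by symmetry, also above mu + 3*sigma)


def whole_z(x, mu, sigma):
    """Standard deviations between x and mu, rounded to a whole number.

    Only meaningful when x really sits on a whole multiple of sigma.
    """
    return round((x - mu) / sigma)


def percent_between(z_low, z_high):
    """Approximate percent of data between z_low and z_high (whole numbers in -3..3)."""
    return sum(BAND_PERCENT[z] for z in range(z_low, z_high))


def percent_below(z):
    """Approximate percent of data below z (a whole number in -3..3)."""
    return TAIL_PERCENT + percent_between(-3, z)


mu, sigma = 7.3, 0.9  # mean and standard deviation of nightly sleep, in hours

within_two = percent_between(-2, 2)
between_64_91 = percent_between(whole_z(6.4, mu, sigma), whole_z(9.1, mu, sigma))
below_55 = percent_below(whole_z(5.5, mu, sigma))

print(f"within 2 standard deviations: {within_two:.1f}%")
print(f"between 6.4 and 9.1 hours:    {between_64_91:.1f}%")
print(f"less than 5.5 hours:          {below_55:.1f}%")
```

Here 6.4 is one standard deviation below the mean, 9.1 is two above, and 5.5 is two below, so under the stated assumptions the three answers come out to about 95%, 81.5%, and 2.5%.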

Chebyshev's Inequality

For any data set, regardless of the shape of the distribution, at least $\left(1 - \frac{1}{k^2}\right) \cdot 100\%$ of observations will lie within $k$ standard deviations of the mean, where $k$ is any number greater than 1. In other words, at least $\left(1 - \frac{1}{k^2}\right) \cdot 100\%$ of observations will lie between $\mu - k\sigma$ and $\mu + k\sigma$.

Example

1. Determine the minimum percentage of data which will lie within 2 standard deviations of the mean.

   $\left(1 - \frac{1}{2^2}\right) \cdot 100\% = \left(1 - \frac{1}{4}\right) \cdot 100\% = 75\%$

2. Determine the minimum percentage of data which will lie within 3 standard deviations of the mean.

   $\left(1 - \frac{1}{3^2}\right) \cdot 100\% = \left(1 - \frac{1}{9}\right) \cdot 100\% = 88.9\%$

3. Determine the minimum percentage of data which will lie within 1.5 standard deviations of the mean.

   $\left(1 - \frac{1}{(1.5)^2}\right) \cdot 100\% = \left(1 - \frac{1}{2.25}\right) \cdot 100\% = 55.6\%$
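The three calculations above follow the same pattern, so they can be wrapped in a small helper. This is a minimal sketch; the function name `chebyshev_min_percent` is an illustrative choice.

```python
def chebyshev_min_percent(k):
    """Minimum percent of observations within k standard deviations of the mean.

    Chebyshev's inequality applies for any k greater than 1.
    """
    if k <= 1:
        raise ValueError("k must be greater than 1")
    return (1 - 1 / k ** 2) * 100


for k in (2, 3, 1.5):
    print(f"k = {k}: at least {chebyshev_min_percent(k):.1f}%")
```

Unlike the empirical rule, these bounds hold for any distribution shape, which is why they are weaker (for example, at least 75% within 2 standard deviations rather than approximately 95%).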