Determining the Spread of a Distribution Variance & Standard Deviation

Similar documents
Determining the Spread of a Distribution

Determining the Spread of a Distribution

Chapter 1 - Lecture 3 Measures of Location

Unit 2. Describing Data: Numerical

Chapter 3 Data Description

Test 1 Review. Review. Cathy Poliak, Ph.D. Office in Fleming 11c (Department Reveiw of Mathematics University of Houston Exam 1)

Lecture 11. Data Description Estimation

6.2A Linear Transformations

2011 Pearson Education, Inc

Unit Two Descriptive Biostatistics. Dr Mahmoud Alhussami

Exercises from Chapter 3, Section 1

MATH 117 Statistical Methods for Management I Chapter Three

Chapter 2: Tools for Exploring Univariate Data

3.1 Measures of Central Tendency: Mode, Median and Mean. Average a single number that is used to describe the entire sample or population

Descriptive Statistics-I. Dr Mahmoud Alhussami

CHAPTER 2 Modeling Distributions of Data

are the objects described by a set of data. They may be people, animals or things.

Quantitative Tools for Research

Preliminary Statistics course. Lecture 1: Descriptive Statistics

The Normal Distribuions

AP Final Review II Exploring Data (20% 30%)

Statistics I Chapter 2: Univariate data analysis

Density Curves & Normal Distributions

P8130: Biostatistical Methods I

ADMS2320.com. We Make Stats Easy. Chapter 4. ADMS2320.com Tutorials Past Tests. Tutorial Length 1 Hour 45 Minutes

Statistics I Chapter 2: Univariate data analysis

Introduction to Statistics

Percentile: Formula: To find the percentile rank of a score, x, out of a set of n scores, where x is included:

TOPIC: Descriptive Statistics Single Variable

Lecture 2 and Lecture 3

CHAPTER 2: Describing Distributions with Numbers

Math 7 /Unit 5 Practice Test: Statistics

Slide 1. Slide 2. Slide 3. Pick a Brick. Daphne. 400 pts 200 pts 300 pts 500 pts 100 pts. 300 pts. 300 pts 400 pts 100 pts 400 pts.

Describing distributions with numbers

3 Lecture 3 Notes: Measures of Variation. The Boxplot. Definition of Probability

STP 420 INTRODUCTION TO APPLIED STATISTICS NOTES

CHAPTER 5: EXPLORING DATA DISTRIBUTIONS. Individuals are the objects described by a set of data. These individuals may be people, animals or things.

Inverse Normal Distribution and Sampling Distributions

Last Lecture. Distinguish Populations from Samples. Knowing different Sampling Techniques. Distinguish Parameters from Statistics

Chapter 3. Measuring data

Elementary Statistics

GRACEY/STATISTICS CH. 3. CHAPTER PROBLEM Do women really talk more than men? Science, Vol. 317, No. 5834). The study

Describing Distributions

Describing distributions with numbers

2.1 Measures of Location (P.9-11)

Section 3. Measures of Variation

Estimation and Confidence Intervals

Stats Review Chapter 3. Mary Stangler Center for Academic Success Revised 8/16

Unit 2: Numerical Descriptive Measures

CIVL 7012/8012. Collection and Analysis of Information

Z score indicates how far a raw score deviates from the sample mean in SD units. score Mean % Lower Bound

What is statistics? Statistics is the science of: Collecting information. Organizing and summarizing the information collected

Int Math 1 Statistic and Probability. Name:

Lecture Slides. Elementary Statistics Tenth Edition. by Mario F. Triola. and the Triola Statistics Series. Slide 1

Section 3.2 Measures of Central Tendency

Describing Distributions With Numbers

Mean, Mode, Median and Range. I know how to calculate the mean, mode, median and range.

LHS Algebra Pre-Test

AP STATISTICS: Summer Math Packet

Math 2311 Sections 4.1, 4.2 and 4.3

The Normal Distribution. Chapter 6

Lecture 3: Chapter 3

MgtOp 215 Chapter 3 Dr. Ahn

Math Sec 4 CST Topic 7. Statistics. i.e: Add up all values and divide by the total number of values.

+ Check for Understanding

Announcements. Lecture 1 - Data and Data Summaries. Data. Numerical Data. all variables. continuous discrete. Homework 1 - Out 1/15, due 1/22

Mean/Average Median Mode Range

Topic-1 Describing Data with Numerical Measures

Q 1 = 23.8 M = Q 3 = 29.8 IQR = 6 The numbers are in order and there are 18 pieces of data so the median is the average of the 9th and 10th

Lecture 3B: Chapter 4, Section 2 Quantitative Variables (Displays, Begin Summaries)

Inference for Regression

MEASURING THE SPREAD OF DATA: 6F

Measures of center. The mean The mean of a distribution is the arithmetic average of the observations:

Chapter 6 The Standard Deviation as a Ruler and the Normal Model

DEPARTMENT OF QUANTITATIVE METHODS & INFORMATION SYSTEMS QM 120. Spring 2008

CHAPTER 1. Introduction

Lecture 6: Chapter 4, Section 2 Quantitative Variables (Displays, Begin Summaries)

MEASURES OF LOCATION AND SPREAD

Descriptive Univariate Statistics and Bivariate Correlation

Review: Central Measures

Math 221, REVIEW, Instructor: Susan Sun Nunamaker

Recap: Ø Distribution Shape Ø Mean, Median, Mode Ø Standard Deviations

Lecture 12: Small Sample Intervals Based on a Normal Population Distribution

Course ID May 2017 COURSE OUTLINE. Mathematics 130 Elementary & Intermediate Algebra for Statistics

Practice problems from chapters 2 and 3

Review Packet for Test 8 - Statistics. Statistical Measures of Center: and. Statistical Measures of Variability: and.

Statistics for Managers using Microsoft Excel 6 th Edition

Chapter 3: Displaying and summarizing quantitative data p52 The pattern of variation of a variable is called its distribution.

STOR 155 Introductory Statistics. Lecture 4: Displaying Distributions with Numbers (II)

1.3: Describing Quantitative Data with Numbers

Chapter 6 Assessment. 3. Which points in the data set below are outliers? Multiple Choice. 1. The boxplot summarizes the test scores of a math class?

Gamma and Normal Distribuions

Algebra 2/Trig: Chapter 15 Statistics In this unit, we will

Standard Normal Calculations

Measures of disease spread

Math 14 Lecture Notes Ch Percentile

Measures of the Location of the Data

Lecture Slides. Elementary Statistics Twelfth Edition. by Mario F. Triola. and the Triola Statistics Series. Section 3.1- #

Objective A: Mean, Median and Mode Three measures of central of tendency: the mean, the median, and the mode.

Confidence Intervals for Two Means

Transcription:

Determining the Spread of a Distribution Variance & Standard Deviation 1.3 Cathy Poliak, Ph.D. cathy@math.uh.edu Department of Mathematics University of Houston Lecture 3 Lecture 3 1 / 32

Outline 1 Describing Quantitative Variables 2 Understanding Standard Deviation 3 Calculating The Standard Deviation 4 Coefficient of Variation Lecture 3 2 / 32

Describing distributions of quantitative variables The distribution of a variable tells us what values it takes and how often it takes these values. There are four main characteristics to describe a distribution: 1. Shape 2. Center 3. Spread 4. Outliers Lecture 3 3 / 32

Describing distributions An initial view of the distribution and the characteristics can be shown through the graphs. Then we use numerical descriptions to get a better understanding of the distributions characteristics. Lecture 3 4 / 32

Describing Center with Numbers Mean: Sum all of the numbers then divide by number of values in the data set. Median: The center value of ordered data. Mode: The value that has the highest frequency in the data. Lecture 3 5 / 32

Types of Measurements for the Spread Range Percentiles Quartiles IQR; Interquartile range Variance Standard deviation Coefficient of Variation Lecture 3 6 / 32

Measuring Spread: The Standard Deviation Measures spread by looking at how far the observations are from their mean. Most common numerical description for the spread of a distribution. A larger standard deviation implies that the values have a wider spread from the mean. Denoted s when used with a sample. This is the one we calculate from a list of values. Denoted σ when used with a population. This is the "idealized" standard deviation. The standard deviation has the same units of measurements as the original observations. Lecture 3 7 / 32

Definition of the Standard Deviation The standard deviation is the average distance each observation is from the mean. Using this list of values from a sample: 3, 3, 9, 15, 15 The mean is 9. By definition, the average distance each of these values are from the mean is 6. So the standard deviation is 6. Lecture 3 8 / 32

Values of the Standard Deviation The standard deviation is a value that is greater than or equal to zero. It is equal to zero only when all of the observations have the same value. By the definition of standard deviation determine s for the following list of values. 2, 2, 2, 2 : standard deviation = 0 125, 125, 125, 125, 125: standard deviation = 0 Lecture 3 9 / 32

Adding or Subtracting a Value to the Observations Adding or subtracting the same value to all the original observations does not change the standard deviation of the list. Using this list of values: 3, 3, 9, 15, 15 mean = 9, standard deviation = 6. If we add 4 to all the values: 7, 7, 13, 19, 19 mean = 13, standard deviation = 6 Lecture 3 10 / 32

Multiplying or Dividing a Value to the Observations Multiplying or dividing the same value to all the original observations will change the standard deviation by that factor. Using this list of values: 3, 3, 9, 15, 15: mean = 9, standard deviation = 6. If we double all the values: 6, 6, 18, 30, 30 mean = 18, standard deviation = 12 athy Poliak, Ph.D. cathy@math.uh.edu (Department of Mathematics 1.3 University of Houston ) Lecture 3 11 / 32

Population Variance and Standard Deviation If N is the number of values in a population with mean mu, and x i represents each individual in the population, the the population variance is found by: σ 2 = N i=1 (x i µ) 2 N and the population standard deviation is the square root, σ = σ 2. Lecture 3 12 / 32

Sample Variance and Standard Deviation Most of the time we are working with a sample instead of a population. So the sample variance is found by: s 2 = n i=1 (x i x) 2 n 1 and the sample standard deviation is the square root, s = s 2. Where n is the number of observations (samples), x i is the value for the i th observation and x is the sample mean. Lecture 3 13 / 32

Calculating the Standard Deviation By Hand When calculating by hand we will calculate s. 1. Find the mean of the observations x. 2. Calculate the difference between the observations and the mean for each observation x i x. This is called the deviations of the observations. 3. Square the deviations for each observation (x i x) 2. 4. Add up the squared deviations together n i=1 (x i x) 2. 5. Divide the sum of the squared deviations by one less than the number of observations n 1. This is the variance s 2 = 1 n 1 n (x i x) 2 i=1 athy Poliak, Ph.D. cathy@math.uh.edu (Department of Mathematics 1.3 University of Houston ) Lecture 3 14 / 32

Step 6: Standard Deviation 6. Find the square root of the variance. This is the standard deviation s = 1 n (x i x) n 1 2 i=1 Lecture 3 15 / 32

Example: Section A Determine the sample standard deviation of the test scores for Section A. Section A Scores (X i ) 65 66 67 68 71 73 74 77 77 77 Lecture 3 16 / 32

Step 1: Calculate the Mean The sample mean is x = 71.5. Lecture 3 17 / 32

Step 2: Calculate Deviations For All Values Variable Deviations Deviations Squared Score (X i ) X i X (X i X) 2 65 65 71.5 = 6.5 66 66 71.5 = 5.5 67 67 71.5 = 4.5 68 68 71.5 = 3.5 71 71 71.5 = 0.5 73 73 71.5 = 1.5 74 74 71.5 = 2.5 77 77 71.5 = 5.5 77 77 71.5 = 5.5 77 77 71.5 = 5.5 sum Lecture 3 19 / 32

Step 3: Calculate Squared Deviations Variable Deviations Deviations Squared Score (X i ) X i X (X i X) 2 65 65 71.5 = 6.5 ( 6.5) 2 = 42.25 66 66 71.5 = 5.5 ( 5.5) 2 = 30.25 67 67 71.5 = 4.5 ( 4.5) 2 = 20.25 68 68 71.5 = 3.5 ( 3.5) 2 = 12.25 71 71 71.5 = 0.5 ( 0.5) 2 = 0.25 73 73 71.5 = 1.5 1.5 2 = 2.25 74 74 71.5 = 2.5 2.5 2 = 6.25 77 77 71.5 = 5.5 5.5 2 = 30.25 77 77 71.5 = 5.5 5.5 2 = 30.25 77 77 71.5 = 5.5 5.5 2 = 30.25 sum athy Poliak, Ph.D. cathy@math.uh.edu (Department of Mathematics 1.3 University of Houston ) Lecture 3 20 / 32

Step 4: Calculate the Sum of the Squared Deviations Variable Deviations Deviations Squared Score(X i ) X i X (X i X) 2 65 65 71.5 = 6.5 ( 6.5) 2 = 42.25 66 66 71.5 = 5.5 ( 5.5) 2 = 30.25 67 67 71.5 = 4.5 ( 4.5) 2 = 20.25 68 68 71.5 = 3.5 ( 3.5) 2 = 12.25 71 71 71.5 = 0.5 ( 0.5) 2 = 0.25 73 73 71.5 = 1.5 1.5 2 = 2.25 74 74 71.5 = 2.5 2.5 2 = 6.25 77 77 71.5 = 5.5 5.5 2 = 30.25 77 77 71.5 = 5.5 5.5 2 = 30.25 77 77 71.5 = 5.5 5.5 2 = 30.25 sum n i=1 (X i X) 2 = 204.5 athy Poliak, Ph.D. cathy@math.uh.edu (Department of Mathematics 1.3 University of Houston ) Lecture 3 21 / 32

Step 5: Calculate the Variance variance = s 2 = 1 n 1 n (x i x) 2 i=1 = 204.5 9 = 22.7222 Lecture 3 22 / 32

Step 6: Take the Square Root of the Variance standard deviation = s = 1 n 1 = 22.7222 = 4.77 n (x i x) 2 i=1 Lecture 3 23 / 32

Sample Standard Deviation of Section A test scores Sample standard deviation is s = 4.77. This implies that from the sample of the 10 students from section A the tests scores has a spread, on average, of 4.77 points from the mean of 71.50 points. athy Poliak, Ph.D. cathy@math.uh.edu (Department of Mathematics 1.3 University of Houston ) Lecture 3 24 / 32

Example A statistics teacher wants to decide whether or not to curve an exam. From her class of 300 students, she chose a sample of 10 students and their grade were: 72, 88, 85, 81, 60, 54, 70, 72, 63, 43 Determine the sample mean. What is the variance? What is the standard deviation? athy Poliak, Ph.D. cathy@math.uh.edu (Department of Mathematics 1.3 University of Houston ) Lecture 3 25 / 32

Add 10 Suppose the statistics instructor decides to curve the grade by adding 10 points to each score. What is the new mean, variance and standard deviation? Lecture 3 26 / 32

Multiply by 2 For the following dataset the mean is x = 4.5, the variance is s 2 = 3.5 and the standard deviation is s = 1.870829. 3, 6, 2, 7, 4, 5 Now, multiply each value by 2. What is the new variance and the new standard deviation? Lecture 3 27 / 32

Calculating Standard Deviation For larger data sets use a calculator or computer software. Each calculator is different if you cannot determine how to compute standard deviation from your calculator ask your instructor. For this course we will be using R as the software. The function for the sample standard deviation in R is sd(data name$variable name). athy Poliak, Ph.D. cathy@math.uh.edu (Department of Mathematics 1.3 University of Houston ) Lecture 3 28 / 32

Coefficient of Variation This is to compare the variation between two groups. The coefficient of variation (cv) is the ratio of the standard deviation to the mean. cv = sd mean A smaller ratio will indicate less variation in the data. Lecture 3 29 / 32

CV of test scores Section A Section B Sample Size 10 10 Sample Mean 71.5 71.5 Sample Standard Deviation 4.770 18.22 4.77 CV 71.5 = 0.066 18.22 71.5 = 0.2548 Lecture 3 30 / 32

CV Example The following statistics were collected on two different groups of stock prices: Portfolio A Portfolio B Sample size 10 15 Sample mean $52.65 $49.80 Sample standard deviation $6.50 $2.95 What can be said about the variability of each portfolio? Lecture 3 31 / 32