Mean/Average Median Mode Range

Similar documents
Statistics, continued

are the objects described by a set of data. They may be people, animals or things.

Last few slides from last time

Determining the Spread of a Distribution

Determining the Spread of a Distribution

Lecture 3. Measures of Relative Standing and. Exploratory Data Analysis (EDA)

CHAPTER 5: EXPLORING DATA DISTRIBUTIONS. Individuals are the objects described by a set of data. These individuals may be people, animals or things.

STAT 200 Chapter 1 Looking at Data - Distributions

TOPIC: Descriptive Statistics Single Variable

Probability Distributions

Determining the Spread of a Distribution Variance & Standard Deviation

STAT/SOC/CSSS 221 Statistical Concepts and Methods for the Social Sciences. Random Variables

STP 420 INTRODUCTION TO APPLIED STATISTICS NOTES

Chapter 4. Displaying and Summarizing. Quantitative Data

MATH 10 INTRODUCTORY STATISTICS

1-1. Chapter 1. Sampling and Descriptive Statistics by The McGraw-Hill Companies, Inc. All rights reserved.

What are the mean, median, and mode for the data set below? Step 1

CHAPTER 2: Describing Distributions with Numbers

MATH 117 Statistical Methods for Management I Chapter Three

Measures of center. The mean The mean of a distribution is the arithmetic average of the observations:

3 Lecture 3 Notes: Measures of Variation. The Boxplot. Definition of Probability

EXAM. Exam #1. Math 3342 Summer II, July 21, 2000 ANSWERS

Sampling, Frequency Distributions, and Graphs (12.1)

MA 1125 Lecture 15 - The Standard Normal Distribution. Friday, October 6, Objectives: Introduce the standard normal distribution and table.

Exercises from Chapter 3, Section 1

ADMS2320.com. We Make Stats Easy. Chapter 4. ADMS2320.com Tutorials Past Tests. Tutorial Length 1 Hour 45 Minutes

AP Final Review II Exploring Data (20% 30%)

Last time. Numerical summaries for continuous variables. Center: mean and median. Spread: Standard deviation and inter-quartile range

Data set B is 2, 3, 3, 3, 5, 8, 9, 9, 9, 15. a) Determine the mean of the data sets. b) Determine the median of the data sets.

Unit 2: Numerical Descriptive Measures

Unit 2. Describing Data: Numerical

2011 Pearson Education, Inc

MATH 1150 Chapter 2 Notation and Terminology

Unit Two Descriptive Biostatistics. Dr Mahmoud Alhussami

Describing distributions with numbers

Chapter 2: Tools for Exploring Univariate Data

Descriptive Statistics-I. Dr Mahmoud Alhussami

Elementary Statistics

a table or a graph or an equation.

Chapter 5. Understanding and Comparing. Distributions

Chapter 8: Confidence Intervals

Section 2.5 Absolute Value Functions

Chapter 1: Exploring Data

Central Limit Theorem and the Law of Large Numbers Class 6, Jeremy Orloff and Jonathan Bloom

3.1 Measures of Central Tendency: Mode, Median and Mean. Average a single number that is used to describe the entire sample or population

Statistics for Managers using Microsoft Excel 6 th Edition

Chapter. Numerically Summarizing Data Pearson Prentice Hall. All rights reserved

Bayesian Inference for Binomial Proportion

Statistics I Chapter 2: Univariate data analysis

Chapters 1 & 2 Exam Review

Lecture 1: Description of Data. Readings: Sections 1.2,

STT 315 This lecture is based on Chapter 2 of the textbook.

CS 361: Probability & Statistics

Objective A: Mean, Median and Mode Three measures of central of tendency: the mean, the median, and the mode.

Lecture Slides. Elementary Statistics Twelfth Edition. by Mario F. Triola. and the Triola Statistics Series. Section 3.1- #

Counting principles, including permutations and combinations.

Chapter 3. Data Description

Statistics I Chapter 2: Univariate data analysis

Perhaps the most important measure of location is the mean (average). Sample mean: where n = sample size. Arrange the values from smallest to largest:

MEASURING THE SPREAD OF DATA: 6F

Exam 1 Solutions. Problem Points Score Total 145

Chapter 1 - Lecture 3 Measures of Location

Describing Distributions With Numbers

Math 14 Lecture Notes Ch Percentile

Section 3.4 Normal Distribution MDM4U Jensen

Section 3. Measures of Variation

Lecture 2 and Lecture 3

18.05 Practice Final Exam

4/1/2012. Test 2 Covers Topics 12, 13, 16, 17, 18, 14, 19 and 20. Skipping Topics 11 and 15. Topic 12. Normal Distribution

Lecture Slides. Elementary Statistics Tenth Edition. by Mario F. Triola. and the Triola Statistics Series. Slide 1

Last Lecture. Distinguish Populations from Samples. Knowing different Sampling Techniques. Distinguish Parameters from Statistics

What is statistics? Statistics is the science of: Collecting information. Organizing and summarizing the information collected

Unit 1: Statistics. Mrs. Valentine Math III

CHAPTER 1. Introduction

The area under a probability density curve between any two values a and b has two interpretations:

AP Statistics Semester I Examination Section I Questions 1-30 Spend approximately 60 minutes on this part of the exam.

Lesson One Hundred and Sixty-One Normal Distribution for some Resolution

3.1 Measure of Center

Recap: Ø Distribution Shape Ø Mean, Median, Mode Ø Standard Deviations

Chapter 6 Group Activity - SOLUTIONS

1 Probability Distributions

Men. Women. Men. Men. Women. Women

+ Check for Understanding

Further Mathematics 2018 CORE: Data analysis Chapter 2 Summarising numerical data

Discrete Distributions

QUIZ 1 (CHAPTERS 1-4) SOLUTIONS MATH 119 SPRING 2013 KUNIYUKI 105 POINTS TOTAL, BUT 100 POINTS = 100%

Describing Distributions with Numbers

The Union and Intersection for Different Configurations of Two Events Mutually Exclusive vs Independency of Events

ACMS Statistics for Life Sciences. Chapter 9: Introducing Probability

Lecture 6: Chapter 4, Section 2 Quantitative Variables (Displays, Begin Summaries)

Chapter 4.notebook. August 30, 2017

Chapter 1. Looking at Data

Chapter 3 Data Description

MAT Mathematics in Today's World

6 THE NORMAL DISTRIBUTION

Unit Six Information. EOCT Domain & Weight: Algebra Connections to Statistics and Probability - 15%

Chapter 01 : What is Statistics?

z and t tests for the mean of a normal distribution Confidence intervals for the mean Binomial tests

Slide 1. Slide 2. Slide 3. Pick a Brick. Daphne. 400 pts 200 pts 300 pts 500 pts 100 pts. 300 pts. 300 pts 400 pts 100 pts 400 pts.

Section 5.4. Ken Ueda

Transcription:

Normal Curves

Today s Goals Normal curves! Before this we need a basic review of statistical terms. I mean basic as in underlying, not easy. We will learn how to retrieve statistical data from normal curves. As an application, we ll see how to determine the margin of error of a poll.

Statistics basics Here s some terminology you should be familiar with: Mean/Average: For a set of N numbers, d 1, d 2,..., d N, the mean is given by µ = (d 1 + d 2 + + d N )/N. Median: Sort the data set from smallest to largest: d 1, d 2,..., d N. The median is the middle number. If N is odd, the median is d (N+1)/2. If N is even, the median is the average of d N/2 and d (N/2)+1. Mode: The most common number(s). A data set can have more than one mode. (We won t really study mode. It was just feeling left out so I put it on the slide.) Range: The difference between the highest and lowest values of the data (R = Max Min).

Percentiles The pth percentiles of a data set is a number X p such that p% is smaller or equal to X p and (100 p)% of the data is bigger or equal to X p. To find the pth percentile of a sorted data set d 1, d 2,..., d N, first find the locator L = (p/100) N. If L is a whole number, then X p = d L+d L+1 2. If L is not a whole number, then X p = d L + up. where L + is L rounded

Percentiles This is Evelyn. Evelyn is in the 40th percentile for height (40% of babies Evelyn s age weigh as much or less than she does while 60% weigh as much or more).

Quartiles The first quartile Q 1 is the 25th percentile of a data set. The median is the 50th percentile of a data set (also technically the second quartile). The third quartile Q 3 is the 75th percentile of a data set. The fourth quartile is d N (the last number in the data set). The interquartile range (IQR) is the difference between the third quartile and the first quartile (IQR = Q 3 Q 1 ). IQR tells us how spread out the middle 50% of the data values are.

Hold up! Why aren t we doing any examples? Because I m not going to ask you to compute any of these things directly from a set of data. Instead, we will study visual representations of the data called bell curves. But, I want you to be familiar with the terminology and how it s computed. So bear with me.

Standard deviation and variance Standard deviation tells us how spread out a data set is from the mean. Let A be the mean of a data set. For each value x in the data set, x A is the deviation from the mean. We want to average these values but for technical reasons we actually need to average their squares. This average is called the variance V. The standard deviation is the square root of the variance, σ = V.

Example The number 330.78 is the variance V, the average of squared devations. The standard deviation is then σ = V 18.19

Normal curves Say we flipped a coin 100 times? We expect to get heads 50 times and tails 50 times, but it s also very likely that we would not get this. (For a challenge, compute the probability of this event.) When John Kerrich was a POW during World War II he wanted to test the probabilistic theory on coin flipping with a real life experiment. He flipped a coin 10,000 times and recorded the number of heads for each 100 trials. What took Kerrich weeks (months?) we can do in a matter of seconds via computer software like Maple.

Bell curves A set of data with normal distribution has a bar graph that is perfectly bell shaped.

Properties of normal curves Symmetry: Every normal curve has a vertical axis of symmetry. Median and mean: If a data set, then the median and mean are the same and they correspond to the point where the axis of symmetry intersects the horizontal axis. Standard deviation: The standard deviation is the horizontal distance between the mean and the point of inflection, where the graph changes the direction it is bending.

Normal distributions and curve We say a distribution of data is normal if its bar graph is perfectly bell shaped. This type of curve is called normal

Properties of normal curves Symmetry: Every normal curve has a vertical axis of symmetry.

Properties of normal curves Median and mean: If a data set is normal, then the median and mean are the same and they correspond to the point where the axis of symmetry intersects the horizontal axis.

Properties of normal curves Standard deviation: The standard deviation is the horizontal distance between the mean and the point of inflection, where the graph changes the direction it is bending.

Properties of normal curves Quartiles: The first and third quartiles can be found using the mean µ and the standard deviation σ. Q 1 = µ (.675)σ and Q 3 = µ + (.675)σ.

Properties of normal curves Q 1 = µ (.675)σ

Properties of normal curves Q 3 = µ + (.675)σ

Properties of normal curves The 68-95-99.7 Rule: In a normal data set, Approximately 68% of the data falls between one standard deviation of the mean (µ ± σ). This is the data between P 16 and P 84. Approximately 95% of the data falls within two standard deviations of the mean (µ ± 2σ). This is the data between P 2.5 and P 97.5. Approximately 99.7% of the data falls within three standard deviations of the mean (µ ± 3σ). This is the data between P 0.15 and P 99.85.

Properties of normal curves 68%

Properties of normal curves 95%

Properties of normal curves 99.7%

Example Suppose we have a normal data set with mean µ = 500 and standard deviation σ = 150. We have the following: Q 1 = 500.675 150 399 Q 3 = 500 +.675 150 601 Middle 68%: P 16 = 500 150 = 350, P 84 = 500 + 150 = 650. Middle 95%: P 2.5 = 500 2(150) = 200, P 97.5 = 500 + 2(150) = 800. Middle 99.7%: P 0.15 = 500 3(150) = 50, P 99.85 = 500 + 3(150) = 850.

Example Consider a normal distribution represented by the normal curve with points of inflection at x = 23 and x = 45. Find the mean and standard deviation. Use them to compute Q 1, Q 3 and the middle 68%, 95%, and 99.7%.

Standardizing normal data In essence, all normalized data sets are the same. They all have a mean µ and standard deviation σ. The same percentage of data is located in the same increments of σ from the mean. Thus, there is value in standardizing normal data.

Psychometry This is Laura. Laura is a psychometrist. She conducts psychological assessments. Her patients are adults but their ages range from 18 and up. She uses z-values to standardize her patients assessment scores.

Standardizing Rule In a normal distribution with mean µ and standard deviation σ, the standardized value of a data point x is z = x µ σ. The result of this is the z-value of the data point x.

Conversions Suppose we have a normal data set with mean µ = 120 and standard deviation σ = 30. If x = 100, then the z-value of x is z = x 120 30 = 2 3.67. If a z-value of some x is.5, what is x (for the data above)?.5 = x 120 30 15 = x 120 135 = x Or, we could recognize that a z-value of.5 means that x is 1 2 a standard deviation to the right of the mean (so 120 + 15 = 135).

Variables In algebra, a variable typically is a placeholder for some type of solution or set of solutions. Given the equation x + 3 = 10, then the variable x represents the number 7. Given the equation x 2 + 5 = 21, then x represents a member of the set of solutions { 4, 4}.

Random variables A variable representing a random (probabilistic) event is called a random variable. For example, if we toss a coin 100 times and let X represent the number of times heads comes up, then X is a random variable. Like an algebraic variable, X represents a number between 0 and 100, but the possible values for X are not equally likely. The probability of X = 0 or X = 100 is (1/2) 100, which is a very small number. The probability of X = 50 is about 8%.

Random variable Continuing with the example, we know that X has an approximately normal distribution with mean µ = 50 and standard deviation σ = 5 (for a sufficiently large number of repetitions). What is the (approximate) probability that X will fall between 45 and 55? This is 1 standard deviation from the mean, so the probability is approximately 68%.

The Honest-Coin Principle We can now generalize the previous example to a trial with n tosses. Let X be a random variable representing the number of heads in n tosses of an honest (fair) coin (assume n 30). Then X has an approximately normal distribution with mean µ = n/2 and standard deviation σ = n/2.

The Dishonest-Coin Principle Let X be a random variable representing the number of heads in n tosses of a coin (assume n 30), and let p denote the probability of heads on each toss of the coin. Then X has an approximately normal distribution with mean µ = np and standard deviation σ = np(1 p). Note that when p = 1 2 we recover the Honest-Coin Principle.

Margin of Error In a poll conducted by Public Policy Polling before the recent Democratic primary in Missouri interviewed 839 likely voters. Their poll found almost a tie between Hillary Clinton and Bernie Sanders. Therefore, we can use the Honest-Coin Principle to compute the margin of error for the poll.

Margin of Error According the Honest-Coin Principle, we have µ = 839 839 = 419.5 and σ = = 14.48. 2 2 The standard deviation σ is approximately 1.72% of the sample. This means that the pollsters could assume with 95% confidence that either candidate would get between (50 ± 2(1.72))% of the vote. That is, between 46.55% and 53.45%. The value 2σ is called the margin of error.

Margin of Error On the other hand, in a poll conducted by Public Policy Polling before the recent Democratic primary in North Carolina interviewed 747 likely voters. Their poll found Hillary Clinton with 60% support and Bernie Sanders with 40%. Therefore, we can use the Dishonest-Coin Principle to compute the margin of error for the poll.

Margin of Error According the Dishonest-Coin Principle, we have µ = 747.6 = 448.2 and σ = 747.6.4 = 13.39. The standard deviation σ is approximately 1.79% of the sample. This means that the pollsters could assume with 95% confidence that Clinton candidate would get between (60 ± 2(1.79))% of the vote. That is, between 56.42% and 63.58%. The margin of error in this example is 2σ = 3.58%.