Density curves and the normal distribution

Similar documents
Unit Two Descriptive Biostatistics. Dr Mahmoud Alhussami

Descriptive Statistics-I. Dr Mahmoud Alhussami

Part 13: The Central Limit Theorem

Chapter 6 The Normal Distribution

MATH 117 Statistical Methods for Management I Chapter Three

Section 5.4. Ken Ueda

FREQUENCY DISTRIBUTIONS AND PERCENTILES

Exercises from Chapter 3, Section 1

UNIT 3 CONCEPT OF DISPERSION

A is one of the categories into which qualitative data can be classified.

SESSION 5 Descriptive Statistics

Chapter 6. The Standard Deviation as a Ruler and the Normal Model 1 /67

2011 Pearson Education, Inc

MODULE 9 NORMAL DISTRIBUTION

1 Probability Distributions

Section 7.1 Properties of the Normal Distribution

Chapter 3 Data Description

3.1 Measure of Center

Sequences A sequence is a function, where the domain is a set of consecutive positive integers beginning with 1.

GRACEY/STATISTICS CH. 3. CHAPTER PROBLEM Do women really talk more than men? Science, Vol. 317, No. 5834). The study

Chapter 4. Displaying and Summarizing. Quantitative Data

TOPIC: Descriptive Statistics Single Variable

Elementary Statistics

Determining the Spread of a Distribution Variance & Standard Deviation

CHAPTER 1. Introduction

ADMS2320.com. We Make Stats Easy. Chapter 4. ADMS2320.com Tutorials Past Tests. Tutorial Length 1 Hour 45 Minutes

4.2 The Normal Distribution. that is, a graph of the measurement looks like the familiar symmetrical, bell-shaped

Chapter 5. Understanding and Comparing. Distributions

Using the z-table: Given an Area, Find z ID1050 Quantitative & Qualitative Reasoning

(i) The mean and mode both equal the median; that is, the average value and the most likely value are both in the middle of the distribution.

Looking at data: distributions - Density curves and Normal distributions. Copyright Brigitte Baldi 2005 Modified by R. Gordon 2009.

THE SUMMATION NOTATION Ʃ

EXAM 3 Math 1342 Elementary Statistics 6-7

Data set B is 2, 3, 3, 3, 5, 8, 9, 9, 9, 15. a) Determine the mean of the data sets. b) Determine the median of the data sets.

Z score indicates how far a raw score deviates from the sample mean in SD units. score Mean % Lower Bound

(i) The mean and mode both equal the median; that is, the average value and the most likely value are both in the middle of the distribution.

Describing distributions with numbers

Online Practice Quiz KEY Chapter 2: Modeling Distributions of Data

4/1/2012. Test 2 Covers Topics 12, 13, 16, 17, 18, 14, 19 and 20. Skipping Topics 11 and 15. Topic 12. Normal Distribution

Statistics for Managers using Microsoft Excel 6 th Edition

Sections 6.1 and 6.2: The Normal Distribution and its Applications

Percentile: Formula: To find the percentile rank of a score, x, out of a set of n scores, where x is included:

Lecture 10/Chapter 8 Bell-Shaped Curves & Other Shapes. From a Histogram to a Frequency Curve Standard Score Using Normal Table Empirical Rule

Hi, my name is Dr. Ann Weaver of Argosy University. This WebEx is about something in statistics called z-

MATH 1150 Chapter 2 Notation and Terminology

The Normal Distribution Review

Complement: 0.4 x 0.8 = =.6

6/25/14. The Distribution Normality. Bell Curve. Normal Distribution. Data can be "distributed" (spread out) in different ways.

Introduction to Statistics for Traffic Crash Reconstruction

6.2A Linear Transformations

Review. Midterm Exam. Midterm Review. May 6th, 2015 AMS-UCSC. Spring Session 1 (Midterm Review) AMS-5 May 6th, / 24

NORMAL CURVE STANDARD SCORES AND THE NORMAL CURVE AREA UNDER THE NORMAL CURVE AREA UNDER THE NORMAL CURVE 9/11/2013

z-scores z-scores z-scores and the Normal Distribu4on PSYC 300A - Lecture 3 Dr. J. Nicol

STAT 200 Chapter 1 Looking at Data - Distributions

Chapter 3. Measuring data

Chapter 2. Mean and Standard Deviation

Experimental Design, Data, and Data Summary

Continuous Probability Distributions

Lecture 11. Data Description Estimation

Density Curves and the Normal Distributions. Histogram: 10 groups

Statistics and parameters

Range The range is the simplest of the three measures and is defined now.

What does a population that is normally distributed look like? = 80 and = 10

The Shape, Center and Spread of a Normal Distribution - Basic

Test 2C AP Statistics Name:

Chapter Four. Numerical Descriptive Techniques. Range, Standard Deviation, Variance, Coefficient of Variation

Producing data Toward statistical inference. Section 3.3

a. 0 b. 8. c. 16. e. 24

Answers Investigation 2

3.1 Measures of Central Tendency: Mode, Median and Mean. Average a single number that is used to describe the entire sample or population

Math 2311 Sections 4.1, 4.2 and 4.3

Measures of Central Tendency and their dispersion and applications. Acknowledgement: Dr Muslima Ejaz

Chapter. Numerically Summarizing Data. Copyright 2013, 2010 and 2007 Pearson Education, Inc.

Math 1342 Test 2 Review. Total number of students = = Students between the age of 26 and 35 = = 2012

Francine s bone density is 1.45 standard deviations below the mean hip bone density for 25-year-old women of 956 grams/cm 2.

How spread out is the data? Are all the numbers fairly close to General Education Statistics

STAT/SOC/CSSS 221 Statistical Concepts and Methods for the Social Sciences. Random Variables

Section 4.4 Z-Scores and the Empirical Rule

Numbering Systems. Contents: Binary & Decimal. Converting From: B D, D B. Arithmetic operation on Binary.

Math 120 Introduction to Statistics Mr. Toner s Lecture Notes 3.1 Measures of Central Tendency

Chapter 6 The Standard Deviation as a Ruler and the Normal Model

Math Conventions. for the Quantitative Reasoning measure of the GRE General Test.

DSST Principles of Statistics

Stats Review Chapter 3. Mary Stangler Center for Academic Success Revised 8/16

AP Final Review II Exploring Data (20% 30%)

MALLOY PSYCH 3000 MEAN & VARIANCE PAGE 1 STATISTICS MEASURES OF CENTRAL TENDENCY. In an experiment, these are applied to the dependent variable (DV)

Algebra 2. Outliers. Measures of Central Tendency (Mean, Median, Mode) Standard Deviation Normal Distribution (Bell Curves)

EXPERIMENT 2 Reaction Time Objectives Theory

Unit 2. Describing Data: Numerical

Section 3.2 Measures of Central Tendency

CHAPTER 5: EXPLORING DATA DISTRIBUTIONS. Individuals are the objects described by a set of data. These individuals may be people, animals or things.

CENTRAL LIMIT THEOREM (CLT)

Review Notes for IB Standard Level Math

+ Check for Understanding

Chapter. Numerically Summarizing Data Pearson Prentice Hall. All rights reserved

Lecture 10: The Normal Distribution. So far all the random variables have been discrete.

Objective A: Mean, Median and Mode Three measures of central of tendency: the mean, the median, and the mode.

ST Presenting & Summarising Data Descriptive Statistics. Frequency Distribution, Histogram & Bar Chart

Surds 1. Good form. ab = a b. b = a. t is an integer such that

Review of Multiple Regression

Transcription:

Density curves and the normal distribution - Imagine what would happen if we measured a variable X repeatedly and make a histogram of the values. What shape would emerge? A mathematical model of this shape is called a density curve. The connection between a density curve of a variable X and repeated measurements of X can be described in either of the following ways: The area of the density curve between a range of two values on the measurement axis gives both: a) the probability that a single measurement of X will produce a value withing that range. b) or, if X is measured repeatedly,the

proportion (or percentage) of values that lie withing that range. Basic fact: Since probabilities and percentages can only add up to one, density curves are always scaled so that the total area under the density curve is one. Example A: A random number generator utilizes a uniform or rectangular density. Lets say that it is uniform over the interval [0,3] a) What is the probability that, in a single try, the random number generator produces a value less than 1?

b) If the random number generator selects a large number of numbers, what percentage of these numbers do we expect to see above 2.5? Answers: 1/3, 1/6 Normal Distributions A variable X is said to be normally distributed or to have a normal distribution if, after many many repeated measurements of X, the values display a dispersion pattern that is symmetric and bell shaped, according to a very specific density curve. All normal distribution are characterized simply by two parameters, one that indicates the location of the central value, and another

to indicate who spread out the values are around the mean. Borrowing terminology already used, we call these parameters the mean and the standard deviation. We use the Greek letters for the mean and for the standard deviation. Notation: (, ) N means normal distribution with mean and standard deviation Basic fact (68-95-99.7 rule)

1. The portion within one standard deviation of the mean comprises 68% of the area. 2. The portion within two standard deviations of the mean comprises 95% of the area. 3. The portion within three standard deviations comprises 99.7% of the area. Lets say that typical college applicants take one of two possible tests. Let s also say that in a given year, the range of scores for each

test can be modeled as a normal distribution such that Test A: The scores are normally distributed with 400 and 100. Test B: The scores are normally distributed with 300 and 80. Karen takes Test A and scores a 550. Allison takes Test B and scores a 460. Who performed better relative to their populations? We would like to know who scored higher than a greater percentage of fellow test takers. Although Karen had a higher raw score, more of her fellow students also scored high as well, so it is not clear who

outperformed a greater percentage of classmates. We would like to compute both Karen s and Allison s scoring percentage, which is accomplished by converting to standardized values, or z-scores Standardized values (z-scores) 2 If x is a sample from a N (, ) distribution, then the equation z x relates x to its standardized score (zscore)

Thus, Karen s z-score is 1.5, whereas Allison s z-score is 2. This shows that Allison actually outperformed Karen, even though she had the smaller raw score. In fact, Allison scored better than 97.72% of her classmates, and Karen scored better than 93.32% of her classmates. See the left tail probability table. Tables of the Normal Distribution Probability Content from -oo to Z Z 0.00 0.01 0.02 0.03 0.04 0.05 0.06 0.07 0.08 0.09 ----+---------------------------------------------------------------------- 0.0 0.5000 0.5040 0.5080 0.5120 0.5160 0.5199 0.5239 0.5279 0.5319 0.5359 0.1 0.5398 0.5438 0.5478 0.5517 0.5557 0.5596 0.5636 0.5675 0.5714 0.5753 0.2 0.5793 0.5832 0.5871 0.5910 0.5948 0.5987 0.6026 0.6064 0.6103 0.6141 0.3 0.6179 0.6217 0.6255 0.6293 0.6331 0.6368 0.6406 0.6443 0.6480 0.6517 0.4 0.6554 0.6591 0.6628 0.6664 0.6700 0.6736 0.6772 0.6808 0.6844 0.6879 0.5 0.6915 0.6950 0.6985 0.7019 0.7054 0.7088 0.7123 0.7157 0.7190 0.7224 0.6 0.7257 0.7291 0.7324 0.7357 0.7389 0.7422 0.7454 0.7486 0.7517 0.7549 0.7 0.7580 0.7611 0.7642 0.7673 0.7704 0.7734 0.7764 0.7794 0.7823 0.7852 0.8 0.7881 0.7910 0.7939 0.7967 0.7995 0.8023 0.8051 0.8078 0.8106 0.8133

0.9 0.8159 0.8186 0.8212 0.8238 0.8264 0.8289 0.8315 0.8340 0.8365 0.8389 1.0 0.8413 0.8438 0.8461 0.8485 0.8508 0.8531 0.8554 0.8577 0.8599 0.8621 1.1 0.8643 0.8665 0.8686 0.8708 0.8729 0.8749 0.8770 0.8790 0.8810 0.8830 1.2 0.8849 0.8869 0.8888 0.8907 0.8925 0.8944 0.8962 0.8980 0.8997 0.9015 1.3 0.9032 0.9049 0.9066 0.9082 0.9099 0.9115 0.9131 0.9147 0.9162 0.9177 1.4 0.9192 0.9207 0.9222 0.9236 0.9251 0.9265 0.9279 0.9292 0.9306 0.9319 1.5 0.9332 0.9345 0.9357 0.9370 0.9382 0.9394 0.9406 0.9418 0.9429 0.9441 1.6 0.9452 0.9463 0.9474 0.9484 0.9495 0.9505 0.9515 0.9525 0.9535 0.9545 1.7 0.9554 0.9564 0.9573 0.9582 0.9591 0.9599 0.9608 0.9616 0.9625 0.9633 1.8 0.9641 0.9649 0.9656 0.9664 0.9671 0.9678 0.9686 0.9693 0.9699 0.9706 1.9 0.9713 0.9719 0.9726 0.9732 0.9738 0.9744 0.9750 0.9756 0.9761 0.9767 2.0 0.9772 0.9778 0.9783 0.9788 0.9793 0.9798 0.9803 0.9808 0.9812 0.9817 2.1 0.9821 0.9826 0.9830 0.9834 0.9838 0.9842 0.9846 0.9850 0.9854 0.9857 2.2 0.9861 0.9864 0.9868 0.9871 0.9875 0.9878 0.9881 0.9884 0.9887 0.9890 2.3 0.9893 0.9896 0.9898 0.9901 0.9904 0.9906 0.9909 0.9911 0.9913 0.9916 2.4 0.9918 0.9920 0.9922 0.9925 0.9927 0.9929 0.9931 0.9932 0.9934 0.9936 2.5 0.9938 0.9940 0.9941 0.9943 0.9945 0.9946 0.9948 0.9949 0.9951 0.9952 2.6 0.9953 0.9955 0.9956 0.9957 0.9959 0.9960 0.9961 0.9962 0.9963 0.9964 2.7 0.9965 0.9966 0.9967 0.9968 0.9969 0.9970 0.9971 0.9972 0.9973 0.9974 2.8 0.9974 0.9975 0.9976 0.9977 0.9977 0.9978 0.9979 0.9979 0.9980 0.9981 2.9 0.9981 0.9982 0.9982 0.9983 0.9984 0.9984 0.9985 0.9985 0.9986 0.9986 3.0 0.9987 0.9987 0.9987 0.9988 0.9988 0.9989 0.9989 0.9989 0.9990 0.9990 Computing probabilities for a normal distribution The previous example shows the major steps needed to do normal probability calculations. Basically, calculating probabilities involve two steps:

1. Convert raw scores x to standardized scores z 2. Look up the left tail probabilities P( Z z) using a table 3. Compute the area using arithmetic: a) A right tail probability is computed by subtraction of a left tail probability from 1: P( Z z) 1 P( Z z) the one s complement rule b) A bounded region probability is computed by subtraction of left tail probabilities P( a Z b) P( Z b) P( Z a)

Notation: The symbol P( Z z) refers to the probability that a standardized score Z will fall below z. It also refers to the proportion of times one expects Z to fall below z. Thus a statement such as P ( Z 1.00). 8413 means the proportion of z-scores which fall below 1.00, is 84.13% Example: Ball bearing are manufactured to a target diameter of 1.00mm with a tolerance of 0.035 mm above and below this value. Suppose in actuality the diameters are

normally distributed with mean 0.99 mm with a standard deviation of 0.02 mm a. What proportion of ball bearings have diameters within one standard deviation of the mean? (That means between 0.97 and 1.01mm?) b. What proportion of ball bearings have diameters within two standard deviations of the mean? (That means between 0.95 and 1.03mm?) c. What proportion of ball bearings have diameters at least 1.00mm? d. What proportion of ball bearings are within specifications? Answers:

a) Let X denote the diameter of a given ball bearing. We wish to calculate P (. 97 X 1.01). By the 68-95-99.7 rule, this is approximately 68%. On the other hand, we can get a more accurate result from the tables. Since the z-score of.97 is z x.97.99.02 and the z-score of 1.01 is 1.00 z x 1.01.99.02 1.00 We convert the probability P(.97 X 1.01) to P( 1.00 Z 1.00). We can now use the standardized normal tables to compute P ( 1.00 Z 1.00). Since this is a bounded area, we subtract:

P( 1.00 Z 1.00) P( Z 1.00) P( Z 1.00).8413.1587.6826 b) This time we wish to calculate P (. 95 X 1.03). By the 68-95-99.7 rule, this is approximately 95%. On the other hand, we can get a more accurate result from the tables. Since the z-score of.95 is z x.95.99.02 and the z-score of 1.03 is 2.00 z x 1.03.99.02 2.00 We convert the probability P(.95 X 1.03) to P( 2.00 Z 2.00). We can now use the standardized normal tables to

compute P ( 2.00 Z 2.00). Since this is a bounded area, we subtract: P( 2.00 Z 2.00) P( Z 2.00) P( Z 2.00).9772.0228.9544 c) We can t use the 68-95-99.7 rule, so we have to use our strategy. Since the z-score of 1.00 is z x 1.00.99.02 0.50 we convert the probability P ( X 1.00) to P( Z 0.50). This is a right tail area, so we can use the normal tables and the one s complement rule to obtain

P( Z 0.50) 1 P( Z 0.50) 1.6915.3185 d) We wish to calculate (. 965 X 1.035) (The 68-95-99.7 rule does not apply) Since the z-score of.96 is P. z x.965.99.02 and the z-score of 1.035 is 1.25 z x 1.035.99.02 2.25 We convert the probability P(.965 X 1.035) to P( 1.25 Z 2.25). We can now use the standardized normal tables to compute P ( 1.25 Z 2.25). Since this is a bounded area, we subtract:

P( 1.00 Z 1.00) P( Z 2.25) P( Z 1.25).9878.1056.8822 Working Backwards Example: In a national exam, the score distribution is approximately normal with mean 550 and standard deviation 110. Sondra scored at the 63 rd percentile, meaning she scored at least as well as 63% of the test takers. What was her score? Solution:

In this case, we have to first assign Sondra her correct z-score. We then have to find the raw score that corresponds to this z-score. Using the normal tables we find that the 63 rd percentile occurs at a z-score of 0.33 (approximately). So Sondra s z-score is 0.33 Now we can find her raw score using the relation : z x x 550 110 in 550 and 110 plugging We know z, it is 0.33. so now we can solve for x

550 0.33 x 110 ( 0.33)(110) x 550 x 550 (0.33)(110) 586 Example: Assume diastolic blood pressure readings are normally distributed with mean 88 and standard deviation 12. What range of blood pressures encompasses the middle 75% of readings? Solution: If X is the variable of blood pressure readings, we want to find numbers c and d such that P ( c X d). 7500. Because of symmetry, it is easy to find c once d is found. Now since the remainder area is

.2500, and this is split into two tails, it must be true that the area above d is.1250. Thus d must be the 87.50 th percentile. So using the table: The z-score of d is 1.15. Therefore: 88 1.15 d 12 d 88 (12)(1.15) 101.8 So d is 13.8 units above the central reading of 88. Therefore c is 13.8 units below 88. So c = 74.2 and d = 101.8