
STAT 203 Lecture 4-1.
- The normal distribution is symmetric.
- Getting the probability between two z-scores.
- Translating standard scores to and from raw scores.
- Extreme values beyond the table.
So Majestic!

Text from last Friday: Say a value X followed the normal distribution, with mean μ (mu, pronounced "mew") and standard deviation σ (sigma). We used the z-table to find things like the probability that X is more than 1.28 standard deviations above the mean. In other words, we found Pr( X > μ + 1.28σ).

μ + 1.28σ means a z-score of 1.28. From the z-table, page 515:

z      Area between mean and z   Area beyond z
1.27   39.80                     10.20
1.28   39.97                     10.03
1.29   40.15                      9.85

Since we're looking at the values farther away from the mean than the cutoff, we want the area beyond z.

Pr( X > μ + 1.28σ) = 10.03%, or about 10%.
Can we find Pr( X > μ - 1.28σ)? Hint: Think symmetry.

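We can find Pr( X > μ - 1.28σ). Symmetry: The same on both sides. The area below μ - 1.28σ matches the area above μ + 1.28σ, which is 10.03%. So Pr( X > μ - 1.28σ) = 100% - 10.03% = 89.97%, or about 90%.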
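
The symmetry argument can be checked numerically. This is only a sketch: it assumes Python's standard-library `statistics.NormalDist` as a stand-in for the printed z-table, and the helper name `area_beyond` is made up for illustration.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal: mean 0, standard deviation 1

def area_beyond(z):
    """Hypothetical helper: area to the right of z, i.e. Pr(Z > z)."""
    return 1.0 - Z.cdf(z)

upper = area_beyond(1.28)              # Pr(X > mu + 1.28 sigma)
lower = Z.cdf(-1.28)                   # Pr(X < mu - 1.28 sigma)
above_low_cutoff = area_beyond(-1.28)  # Pr(X > mu - 1.28 sigma)

# The two tails should match, and the last value is everything except one tail.
print(f"{upper:.4f} {lower:.4f} {above_low_cutoff:.4f}")
```

The two tail areas agree, and the third value is one minus a single tail, which is the symmetry argument in numbers.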

What is the chance that this value, X, is more than 1 standard deviation away from the mean in either direction? Start with Pr( X > μ + 1σ), or, because it's simpler to write: Pr( Z > 1). By the table (page 514):

z      Area between mean and z   Area beyond z
0.99   33.89                     16.11
1.00   34.13                     15.87
1.01   34.38                     15.62

Pr( Z > 1) =.16

Pr( Z > 1) =.16, so Pr( Z < -1) =.16 also

Pr( Z > 1) + Pr( Z < -1) = .32
Not surprising, since Pr( -1 < Z < 1) = .68, and .68 + .32 = 1.00

We could have done this the other way too, working backwards from Pr( -1 < Z < 1) = .68
By converse, we could get Pr( Z < -1) + Pr( Z > 1) = .32
and by symmetry, Pr( Z > 1) = .16
One other thing to note is that Z = 0 right at the mean, because the mean is 0 standard deviations above or below the mean.
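
Both directions of that argument can be sketched in a few lines. This assumes Python's standard-library `statistics.NormalDist` in place of the z-table; the variable names are illustrative only.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal

upper_tail = 1 - Z.cdf(1)            # Pr(Z > 1)
lower_tail = Z.cdf(-1)               # Pr(Z < -1), same size by symmetry
both_tails = upper_tail + lower_tail
middle = Z.cdf(1) - Z.cdf(-1)        # Pr(-1 < Z < 1)

# Working backwards: converse gives both tails, symmetry halves them.
tails_from_middle = 1 - middle
one_tail_from_middle = tails_from_middle / 2

print(f"{upper_tail:.2f} {both_tails:.2f} {middle:.2f}")
```

Rounded to two decimals these come out to the .16, .32, and .68 used above.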

Let's try with some uglier z-scores: Pr( -1.75 < Z < 0.52)

z      Area between mean and z   Area beyond z
0.51   19.50                     30.50
0.52   19.85                     30.15
1.74   45.91                      4.09
1.75   45.99                      4.01

Doing the math

Pr( -1.75 < Z < 0.52) can be split into two ranges using the mean as the split point:
Pr( -1.75 < Z < 0) + Pr( 0 < Z < 0.52)
Why would we do this? Because the table has everything from the mean.
Pr( -1.75 < Z < 0) = .4599
Pr( 0 < Z < 0.52) = .1985
.4599 + .1985 = .6584
About 66% of the area.
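
The split at the mean can be reproduced as a quick check. This sketch assumes Python's standard-library `statistics.NormalDist`; the table itself is not used, so tiny rounding differences from the printed values are possible.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal

left_piece = Z.cdf(0) - Z.cdf(-1.75)   # Pr(-1.75 < Z < 0)
right_piece = Z.cdf(0.52) - Z.cdf(0)   # Pr(0 < Z < 0.52)
total = left_piece + right_piece       # Pr(-1.75 < Z < 0.52)

# Going straight from one cutoff to the other gives the same area.
direct = Z.cdf(0.52) - Z.cdf(-1.75)

print(f"{left_piece:.4f} + {right_piece:.4f} = {total:.4f}")
```

Splitting at the mean is only needed because the printed table measures from the mean; with a cumulative function available, the direct subtraction is equivalent.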

Pic of the 66%

Z-scores, or standard scores, are a bridge between real data and probabilities surrounding them. We find z-scores with this (important!):

Z = (X - μ) / σ

X is the value that we're interested in. We usually want to know the probability of getting a value below or above X.

X is also called the raw score, meaning we haven't prepared it for use at all. Raw as in uncooked.

μ is the mean; in most cases this will be given to you. Look for clues like "average" and "centered around".

σ is the standard deviation; in most cases it's given or computed from SPSS.

The Z-Score is the number of standard deviations above the mean.

Z-Score is also called Standard Score.

Example problem: The time spent on homework in hours/week for full-time students is normally distributed with mean 25 and standard deviation 7. What proportion of students spend more than 20 hours on homework?

Step 1: Identify μ = 25, σ = 7, x = 20. We want the proportion, which is like the probability. We know the distribution is normal. These are clues to find the z-score / standard score, and use it in the z-table to get the proportion.

Step 2: Apply. What do we want?! Z!!!! What do we have?! μ = 25, σ = 7, x = 20!!!! Use the formula that has Z on one side, and μ, σ, and x on the other.

Z = (20 - 25) / 7 = -0.71

-0.71 isn't on the table, but by symmetry, we can use 0.71.

By the table, 26.11% is between the mean and z = 0.71, and 23.89% is beyond z = 0.71. We want Pr( X > 20), which is Pr( Z > -0.71).
Method 1: Split
Pr( Z > -0.71) = Pr( Z > 0) + Pr( -0.71 < Z < 0) = .5000 + .2611 = .7611
Method 2: Converse
Pr( Z > -0.71) = 1 - Pr( Z < -0.71) = 1 - .2389 = .7611
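
The whole raw score, to z-score, to proportion chain for this example can be sketched as below. It assumes Python's standard-library `statistics.NormalDist` in place of the z-table; rounding z to two decimals is done only to mimic the table's precision.

```python
from statistics import NormalDist

mu, sigma, x = 25, 7, 20       # values from the homework example
Z = NormalDist()               # standard normal

z_exact = (x - mu) / sigma     # about -0.714
z_table = round(z_exact, 2)    # -0.71, the precision the z-table offers

# Method 1 (split): everything above the mean, plus the slice from z up to the mean.
p_split = 0.5 + (Z.cdf(0) - Z.cdf(z_table))

# Method 2 (converse): one minus the area below z.
p_converse = 1 - Z.cdf(z_table)

# Skipping the rounding step gives a slightly different answer.
p_exact = 1 - NormalDist(mu, sigma).cdf(x)

print(f"z = {z_table}, split = {p_split:.4f}, converse = {p_converse:.4f}")
```

Both methods agree with each other. The unrounded version differs in the third decimal place, which is the price of reading a two-decimal table.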

We can work backwards from a probability to get a value too, with this (also important):

X = μ + Zσ

This is the same formula as the z-score (standard score) formula, but rearranged so that X is the value we get out of it.

Example problem: Homework/week is normally distributed, μ = 25, σ = 7. What's the minimum homework I can expect 90% of the class to do? In other words, Pr( X > ???) = .9000
Step 1: Identify. We have the proportion, and we want the value x. Again, the z-score is going to be our bridge.

Going X → Z → Prob, we used the table last. Going Prob → Z → X, we'll use the table first. We want the Z value such that 10% of the area is beyond z.

As z increases, the area beyond that value decreases.

Z      % Area Beyond
0.00   50.00
0.01   49.60
0.02   49.20
0.03   48.80
0.04   48.40
0.05   48.01

We can use that to find the Z-score with 10% beyond. (Approximation may be needed.)

Z      % Area Beyond
0.00   50.00
0.01   49.60
0.02   49.20
...
0.44   33.00
0.45   32.64
0.46   32.28
...
1.27   10.20
1.28   10.03
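
Scanning the column for the closest entry can be sketched like this. The table here is rebuilt with Python's standard-library `statistics.NormalDist`, which is an assumption standing in for the printed table, and `closest_z` is a made-up helper name.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal

# Rebuild an "area beyond z" column, in percent, for z = 0.00, 0.01, ..., 3.50.
table = {}
for i in range(351):
    z = i / 100
    table[z] = round(100 * (1 - Z.cdf(z)), 2)

def closest_z(target_percent):
    """Hypothetical helper: the z whose tabled area beyond is nearest the target."""
    return min(table, key=lambda z: abs(table[z] - target_percent))

z_10 = closest_z(10.0)
print(z_10, table[z_10])
```

Because the area beyond z only shrinks as z grows, the nearest entry is unambiguous; the search is the same "run your finger down the column" step done by hand.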

Now we know Pr( Z > 1.28) = 10.03%; that's the closest z-score to 10% in the table. We want 90% of the class above X, so the cutoff sits below the mean and we use z = -1.28. What do we want?! X!!!! What do we have?! μ = 25, σ = 7, z = -1.28!!!!

X = μ + Zσ = 25 + (-1.28)(7) = 16.04

So 90% of the full-time students spend 16.04 hours or more on homework.
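
Going Prob → Z → X can also be sketched in code. This assumes Python's standard-library `statistics.NormalDist`, whose `inv_cdf` plays the role of reading the table backwards; it is a check on the arithmetic, not part of the course method.

```python
from statistics import NormalDist

mu, sigma = 25, 7            # values from the homework example
Z = NormalDist()             # standard normal

# Table route: closest tabled z with 10% beyond, made negative because
# the cutoff sits below the mean.
z_table = -1.28
x_table = mu + z_table * sigma

# Direct route: the z with exactly 10% of the area below it.
z_exact = Z.inv_cdf(0.10)
x_exact = mu + z_exact * sigma

print(f"table: {x_table:.2f} hours, exact: {x_exact:.2f} hours")
```

The two answers differ by about a hundredth of an hour, because the table only offers z to two decimals.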

What proportion of students spend more than 60 hours/week? μ = 25, σ = 7, x = 60.

Z = (60 - 25) / 7 = 5

Now we have z = 5; how do we get Pr( Z > 5)? The table only goes to about z = 3.5. Use inference: We want the area beyond z = 5, and the area shrinks as z goes up. The smallest area in the table is 0.01%, so the area beyond z = 5 must be smaller than that. That's all we can tell from this table. Fewer than 0.01% of students spend more than 60 hours/week on homework.
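
For the curious, the area past the end of the table can be computed rather than bounded. This sketch uses the standard-library `math.erfc`, a choice made here because it stays accurate far out in the tail; the helper name `area_beyond` is made up for illustration.

```python
import math

def area_beyond(z):
    """Hypothetical helper: Pr(Z > z) for a standard normal, via erfc."""
    return 0.5 * math.erfc(z / math.sqrt(2))

mu, sigma, x = 25, 7, 60     # values from the homework example
z = (x - mu) / sigma         # 5.0
p = area_beyond(z)

print(f"z = {z}, Pr(Z > z) = {p:.3e}")
```

The result is a few parts in ten million, comfortably under the 0.01% bound that the table gives.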

(for interest) Very few data points are going to be more than six standard deviations above or below the mean: far less than 0.01%. Six Sigma is a business practice based on making each part in a machine consistent enough that it will work as long as it's within six standard deviations, or 6σ, of the mean.

Next time:
- A few more notes on Z-scores.
- Discuss Midterm.
- We start chapter 6.