Section 7.1 How Likely are the Possible Values of a Statistic? The Sampling Distribution of the Proportion

CNN / USA Today / Gallup Poll September 22-24, 2008 www.poll.gallup.com 12% of Americans describe the current economic conditions in the country as "excellent" (1%) or "good" (11%). Thirty-three percent say the economy is "only fair" and 55% say it is "poor." Quick Check: Are these numbers statistics or parameters? Results are based on telephone interviews with 1,520 national adults, aged 18 and older.

Sampling Variability We don't expect samples to match the population exactly, but we hope that our sample is fairly representative and therefore that the sample statistic will be a fair estimate of the population parameter. The larger the sample, the better the sample should be at describing the population.

How can we be sure that the proportions from ONE sample reflect the views of all Americans? Because we know (or will shortly know) that although sample proportions vary, they vary according to a predictable pattern. The key to the answer is knowing how sample statistics vary from sample to sample.

Sample Proportions When working with a categorical variable, the most common statistic computed is the sample proportion. We will denote the sample proportion by $\hat{p}$ ("p-hat"). The population proportion will be denoted by p. Example: In the Gallup Poll, what is $\hat{p}$? What is p?

Sampling Distribution The sampling distribution of a statistic describes how the statistic would vary if the experiment were conducted many, many times, each time with the same number of subjects. What is the sampling distribution of $\hat{p}$? In other words, can you describe how you would expect values of $\hat{p}$ to vary if repeated random samples, each of size n, were taken from the same population?

Sampling Distribution of p-hat Histograms of proportions from samples tend to be roughly normally distributed.

Sampling Distribution of p-hat Histograms of sample proportions tend to be roughly normally distributed and centered at p, the proportion in the population from which the samples were drawn. Also, as the sample size increases, the variability (or standard deviation) of the sample proportion decreases.
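The two claims on this slide (centered at p, less spread for larger n) can be checked with a small simulation. The sketch below is only an illustration: the population proportion 0.12, the sample sizes, the number of repetitions, and the function name simulate_sample_proportions are all assumptions chosen for the example, not values from the slides.

```python
import random
import statistics

def simulate_sample_proportions(p, n, num_samples, seed=0):
    """Draw num_samples random samples of size n from a population in which
    a fraction p has the trait; return the list of sample proportions."""
    rng = random.Random(seed)
    proportions = []
    for _ in range(num_samples):
        # Each sampled individual has the trait with probability p.
        successes = sum(1 for _ in range(n) if rng.random() < p)
        proportions.append(successes / n)
    return proportions

# Hypothetical population proportion, chosen only for illustration.
p = 0.12
results = {}
for n in (50, 200, 800):
    p_hats = simulate_sample_proportions(p, n, num_samples=2000)
    results[n] = (statistics.mean(p_hats), statistics.stdev(p_hats))
    print(f"n={n:4d}  mean of p-hat={results[n][0]:.4f}  "
          f"sd of p-hat={results[n][1]:.4f}")
```

In each row the mean of the simulated proportions should sit close to p, while the standard deviation should shrink as n grows.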

Standard Error "Standard deviation of the sampling distribution" is a mouthful, so we'll simply say "standard error." The formula for the standard error of the sample proportion is $\mathrm{SE}(\hat{p}) = \sqrt{\frac{p(1-p)}{n}}$, where p is the population proportion and n is the sample size.
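As a quick illustration of the formula, the sketch below computes the standard error for a few sample sizes. The helper name standard_error and the values p = 0.5 and n = 100, 400, 1600 are assumptions picked for the example.

```python
import math

def standard_error(p, n):
    """Standard error of the sample proportion: sqrt(p * (1 - p) / n)."""
    if not 0 <= p <= 1:
        raise ValueError("p must be between 0 and 1")
    if n <= 0:
        raise ValueError("n must be positive")
    return math.sqrt(p * (1 - p) / n)

# Illustrative values only: p = 0.5 and three sample sizes.
for n in (100, 400, 1600):
    print(f"n={n:5d}  SE={standard_error(0.5, n):.4f}")
# n=  100  SE=0.0500
# n=  400  SE=0.0250
# n= 1600  SE=0.0125
```

Because n sits under the square root, quadrupling the sample size only halves the standard error.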

How varied are the sample proportions? If the histogram of the sample proportions is roughly normal, then (by the empirical rule) about 68% of the sample proportions fall within 1 standard error of the center and about 95% of the sample proportions fall within 2 standard errors of the center.
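The empirical rule can be checked by simulation. The sketch below counts how many simulated sample proportions land within 1 and 2 standard errors of p. The values p = 0.3 and n = 400, the number of repetitions, and the function name fraction_within are assumptions made for the example.

```python
import math
import random

def fraction_within(p, n, k, num_samples=5000, seed=1):
    """Estimate the fraction of simulated sample proportions that fall
    within k standard errors of the population proportion p."""
    rng = random.Random(seed)
    se = math.sqrt(p * (1 - p) / n)
    hits = 0
    for _ in range(num_samples):
        # One random sample of size n, reduced to its sample proportion.
        p_hat = sum(rng.random() < p for _ in range(n)) / n
        if abs(p_hat - p) <= k * se:
            hits += 1
    return hits / num_samples

# Hypothetical population proportion and sample size.
p, n = 0.3, 400
within_1 = fraction_within(p, n, 1)
within_2 = fraction_within(p, n, 2)
print(f"within 1 SE: {within_1:.3f}")
print(f"within 2 SE: {within_2:.3f}")
```

The two fractions should come out near 68% and 95%. They will not match exactly, since the normal curve is only an approximation and a sample proportion can take only the values 0/n, 1/n, ..., n/n.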

Does this bell-shaped phenomenon appear only in this situation? No. The bell shape that appears in histograms of proportions appears in lots of systems that involve chance. Some examples include stock prices, mortality rates, birth rates, SAT scores, etc. As long as the population being sampled and the sample size both stay the same, the bell shape begins to emerge as the number of samples increases.

Practice Problem In the 2003 recall election of California's governor, Gray Davis, the exit poll showed that 54% of the 3160 people sampled were in favor of the recall. If the exit poll constituted a random sample, how confident would you be in predicting that more than 50% of the population voted yes? If the population proportion of those supporting the recall was 50%, would you be surprised to see an exit poll result of 54%?
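One possible way to approach the second question is to ask how many standard errors the exit poll result lies from 50%. The sketch below assumes the exit poll behaves like a simple random sample and that the normal approximation is reasonable. The variable names are chosen for illustration.

```python
import math
from statistics import NormalDist

n = 3160        # exit poll sample size (from the problem)
p_null = 0.50   # assumed population proportion supporting the recall
p_hat = 0.54    # observed sample proportion (from the problem)

# Standard error of p-hat if the population proportion really were 50%.
se = math.sqrt(p_null * (1 - p_null) / n)

# Distance of the observed result from 50%, in standard errors.
z = (p_hat - p_null) / se

# Normal-approximation chance of a sample proportion at least this large.
upper_tail = 1 - NormalDist().cdf(z)

print(f"standard error: {se:.4f}")
print(f"standard errors above 50%: {z:.2f}")
print(f"approximate upper-tail probability: {upper_tail:.2e}")
```

The result lies about 4.5 standard errors above 50%. The empirical rule says about 95% of sample proportions fall within 2 standard errors of the center, so a result this far out would be very surprising if the true proportion were 50%.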