Lecture 8: Sampling Theory
Thais Paiva
STA 111 - Summer 2013 Term II
July 11, 2013
Lecture Plan

1 Sampling Distributions
2 Law of Large Numbers
3 Central Limit Theorem
Statistical Inference

We want to study some quantity of interest (a parameter) in a large population. Example: Obama's approval rating. But we cannot observe the whole population. What do we do?

- Design a study to sample individuals from the population. Example: eligible voters.
- Study the quantity of interest in your sample.
- Infer (draw conclusions) about the unknown parameter. Example:
  1 Determine a range that will include the parameter of interest: 0.45 < approval rating < 0.55
  2 Test a hypothesis: is the approval rating > 0.5?
Statistical Inference

- A statistic is a characteristic of the sample (e.g., sample mean, sample standard deviation, sample maximum).
- A parameter is a characteristic of the population (e.g., population mean, population standard deviation, population proportion that votes Republican).
- Our goal is to use statistics to infer the parameter in the population (e.g., what is the relation between the sample mean and the population mean?).
- The sampling distribution is the bridge!
Example Population: STA 111 Heights

[Histogram: distribution of students' heights, 50-80 inches, density on the vertical axis]

Let's assume this is the true population, with parameters µ = 68.4 and σ² = 18.6. We wish to take a sample to estimate µ and σ².
Samples of Size 4

Let's say we take a sample of size 4 and repeat it 5 times. For each sample, we calculate the sample mean x̄ and the sample variance s².

    Sample #   x1   x2   x3   x4     x̄      s²
       1       63   72   73   71   69.80   20.90
       2       72   70   73   73   72.00    2.00
       3       70   71   63   60   66.00   28.70
       4       62   76   74   72   71.00   38.70
       5       73   74   71   75   73.20    2.90

We see that the x̄'s are pretty close to µ = 68.4, but there is quite some variability in s² across samples.
Sampling Distribution (n = 4)

What if I carry on and repeat it 1000 times?

[Histogram of 1000 sample means (n = 4) on roughly 60-75 inches, with the population mean marked by a red vertical line]

Some x̄'s are quite extreme, but most of them hover around the population mean (the red vertical line).
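This repeated-sampling experiment is easy to reproduce in code. The sketch below is a hypothetical Python simulation, assuming a normal population with the height-example parameters µ = 68.4 and σ² = 18.6; it draws 1000 samples of size 4 and summarizes the resulting sample means.

```python
import random
import statistics

# Assumed population parameters, borrowed from the height example
MU, SIGMA2 = 68.4, 18.6

def sample_means(n, reps, seed=0):
    """Draw `reps` independent samples of size `n` and return their sample means."""
    rng = random.Random(seed)
    sigma = SIGMA2 ** 0.5
    return [statistics.mean(rng.gauss(MU, sigma) for _ in range(n))
            for _ in range(reps)]

means = sample_means(n=4, reps=1000)
# The means cluster around mu, with variance close to sigma^2 / n
print(statistics.mean(means))      # close to 68.4
print(statistics.variance(means))  # close to 18.6 / 4 = 4.65
```

Plotting `means` as a histogram reproduces the picture above: centered at µ, with a few extreme values in the tails.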
Sampling Distribution

What if we change the sample size?

[Four histograms of sample means for n = 4, 15, 50, and 100, all on the range 60-75 inches: the histograms become narrower as n grows]
Sampling Distribution

- The previous histograms are examples of sampling distributions: distributions of a statistic calculated from a random sample.
- Each individual in the population is equally likely to be chosen every time we draw an observation.
- A statistic is random because each sample is different: if the data have not been recorded yet, the statistic is simply a function of some random elements.
- Viewing a statistic as a random variable, we can define its mean and variance. For example,

      E(X̄) = µ_X̄        V(X̄) = σ²_X̄

  (Tricky notation: the population mean µ of a statistic X̄!)
Estimator

We saw that:
- The sampling distribution of x̄ is centered around µ.
- The variability of x̄ becomes smaller with larger sample size.

If we use x̄ to infer about µ, we call x̄ an estimator of µ. There are many other potential estimators for µ; for example, if the underlying population is Normal, we can use the sample median. In the next lecture, we will discuss ways to evaluate and compare estimators.
Combination of Random Variables

There are two important properties of random variables that are useful in studying estimators. If we let X and Y be two independent random variables, then

    E(X + Y) = E(X) + E(Y)
    Var(X + Y) = Var(X) + Var(Y)

We will discuss these properties later in the class.
Mean and Variance of the Sample Mean

Let X1, ..., Xn be independent and identically distributed random variables. This assumption says X1, ..., Xn are randomly sampled from the same distribution (a random sample). Then

    E(X̄) = E((X1 + ... + Xn)/n)
          = (E(X1) + ... + E(Xn))/n
          = nµ/n
          = µ

    Var(X̄) = Var((X1 + ... + Xn)/n)
            = (Var(X1) + ... + Var(Xn))/n²
            = nσ²/n²
            = σ²/n
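These two identities can be checked exactly on a toy example. The sketch below is illustrative Python (the four-value population is made up): it enumerates every possible sample of size 2, drawn with replacement, from a tiny population and verifies E(X̄) = µ and Var(X̄) = σ²/n using exact rational arithmetic.

```python
from itertools import product
from fractions import Fraction

# Made-up illustrative population; each value equally likely on a draw
pop = [Fraction(v) for v in (60, 66, 70, 78)]
mu = sum(pop) / len(pop)
sigma2 = sum((x - mu) ** 2 for x in pop) / len(pop)

n = 2
samples = list(product(pop, repeat=n))   # every possible sample (with replacement)
means = [sum(s) / n for s in samples]

# Moments of the sampling distribution of X-bar, computed by brute force
e_xbar = sum(means) / len(means)
var_xbar = sum((m - e_xbar) ** 2 for m in means) / len(means)

print(e_xbar == mu)            # True: E(X-bar) = mu exactly
print(var_xbar == sigma2 / n)  # True: Var(X-bar) = sigma^2 / n exactly
```

Because `Fraction` avoids floating-point rounding, the equalities hold exactly, not just approximately.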
Mean and Variance of the Sample Mean

E(X̄) = µ says:
- If I repeatedly collect my sample, the overall average of x̄ is µ, the true population mean.
- In reality, we usually only collect the sample once.
- This holds for any sample size!

V(X̄) = σ²/n says:
- The variability in X̄ decreases as the sample size increases; specifically, it goes down at a rate of 1/n.
- The variability also depends on the underlying population's variability!
Mean and Variance of the Sample Mean

However, E(X̄) = µ by itself does not guarantee that X̄ = µ! Luckily, V(X̄) = σ²/n says that the variance of X̄ decreases toward zero as the sample size increases. So when the sample is large, the uncertainty goes to zero:

    lim_{n→∞} Var(X̄) = 0   ⟹   X̄ → µ as n → ∞

This is the Law of Large Numbers.
Mean and Variance of the Sample Mean

Recall our first example:

[The four histograms of sample means for n = 4, 15, 50, and 100 again: the spread shrinks as n grows]
Law of Large Numbers: Interpretation

- Suppose you want to estimate µ in a specific population.
- What you can do is draw a sample from the population and compute the sample mean x̄.
- If the sample is big enough, x̄ will be close to µ.
- If you increase the sample size, x̄ should get closer to µ.
- The more you increase the sample size, the closer x̄ gets to µ.
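The convergence described above can be watched numerically. The sketch below is a hypothetical Python experiment, again assuming a normal population with the height-example parameters; it averages the estimation error |x̄ − µ| over many repetitions for increasing sample sizes.

```python
import random

MU, SIGMA = 68.4, 18.6 ** 0.5  # assumed height-example parameters

def avg_abs_error(n, reps=200, seed=42):
    """Average |x-bar - mu| over `reps` simulated samples of size `n`."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(reps):
        xbar = sum(rng.gauss(MU, SIGMA) for _ in range(n)) / n
        total += abs(xbar - MU)
    return total / reps

# Typical error shrinks as n grows: the Law of Large Numbers at work
errors = [avg_abs_error(n) for n in (10, 100, 1000)]
print(errors)  # each entry smaller than the last
```

The theory predicts the typical error shrinks like 1/√n, so each tenfold increase in n cuts it by roughly a factor of 3.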
Central Limit Theorem

The Law of Large Numbers tells me how X̄ behaves in terms of central tendency and variability. That is useful information, but it does not tell me its actual distribution!

The Central Limit Theorem says: when n is large, X̄ is approximately normally distributed,

    X̄ ≈ N(µ, σ²/n)

Important: the CLT holds regardless of the underlying distribution of X! No matter what the shape of the original distribution is, the sampling distribution of the mean approaches a normal distribution.
Central Limit Theorem

[Density plot of a weird, far-from-normal distribution on roughly 0-15]

Here is a weird distribution with parameters µ = 6.5 and σ = 2.9. By the CLT,

    if n = 10:  X̄ ≈ N(6.5, 2.9²/10)
    if n = 50:  X̄ ≈ N(6.5, 2.9²/50)
Central Limit Theorem

[Two histograms of sample means from the weird distribution, for n = 10 and n = 50: both look approximately normal, and the n = 50 one is narrower]

Amazing.
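We can reproduce the spirit of this picture without a plot. The sketch below is illustrative Python using a strongly skewed Exponential(1) population as a stand-in for the "weird distribution" above (so µ = 1 and σ² = 1, not the slide's 6.5 and 2.9²), and it checks that the sample means nonetheless behave like N(1, 1/n).

```python
import random
import statistics

def clt_demo(n, reps=2000, seed=7):
    """Sample means of size-n samples from an Exponential(1) population.

    The population is heavily right-skewed, but by the CLT the means
    should be roughly N(1, 1/n)."""
    rng = random.Random(seed)
    return [statistics.mean(rng.expovariate(1.0) for _ in range(n))
            for _ in range(reps)]

means = clt_demo(n=50)
print(statistics.mean(means))   # close to mu = 1
print(statistics.stdev(means))  # close to sigma/sqrt(n) = 1/sqrt(50) = 0.141
```

A histogram of `means` looks symmetric and bell-shaped even though the exponential density itself is nothing like a bell.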
Using CLT: Height Example

Assume the distribution of height in our class has a mean of 70 (inches) and a variance of 100 (inches²). In my study, I will obtain measurements of 20 individuals. What is the probability that X̄ will be between 65 and 75?

By the CLT, X̄ ≈ N(µ, σ²/n), where µ = 70 and σ²/n = 100/20 = 5.

    P(65 < X̄ < 75) = P((65 − 70)/√5 < Z < (75 − 70)/√5)
                    = P(−√5 < Z < √5)
                    = 0.974
Using CLT: Height Example

Note: in the previous example we calculated

    P(65 < X̄ < 75) = 0.974

NOT

    P(65 < X < 75) = P((65 − 70)/√100 < Z < (75 − 70)/√100)
                   = P(−0.5 < Z < 0.5)
                   = 0.38

The first one is about the sample average; the second one is about one actual height!
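Both probabilities can be computed directly from the standard normal CDF, which we can build from `math.erf` in the Python standard library. A minimal sketch using only the slide's numbers:

```python
from math import erf, sqrt

def phi(z):
    """Standard normal CDF, via the error function."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

mu, sigma2, n = 70, 100, 20
se = sqrt(sigma2 / n)  # standard error of the sample mean: sqrt(5)

# P(65 < X-bar < 75): standardize with the standard error sqrt(5)
p_mean = phi((75 - mu) / se) - phi((65 - mu) / se)
# P(65 < X < 75): one individual height, standardize with sigma = 10
p_single = phi((75 - mu) / 10) - phi((65 - mu) / 10)

print(round(p_mean, 3), round(p_single, 3))  # 0.975 0.383
```

The small difference from the slide's 0.974 comes from table rounding; the contrast between the two answers is the point: averages vary far less than individuals.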
Using CLT: Sample Size

Assume the distribution of height in our class has a mean of 70 (inches) and a variance of 100 (inches²). In designing my study, what sample size should I use so that the probability that my sample average X̄ is between 69 and 71 is 90%?

    P(69 < X̄ < 71) = P((69 − 70)/√(100/n) < Z < (71 − 70)/√(100/n))
                    = P(−√n/10 < Z < √n/10)
                    = 0.90

Because Z is symmetric, this means P(Z > √n/10) = 0.05, i.e., P(Z < √n/10) = 0.95. So

    √n/10 = 1.645   ⟹   n = 270.6

and we should take a sample of about 271 individuals.
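The sample-size algebra above reduces to one line of arithmetic. A quick Python check (the 1.645 quantile is hard-coded from a normal table rather than computed):

```python
import math

z = 1.645        # 95th percentile of the standard normal, from a table
sigma2 = 100     # population variance (inches^2)
half_width = 1   # we want X-bar within +/- 1 inch of mu = 70

# From sqrt(n)/10 = z:  n = z^2 * sigma^2 / (half-width)^2
n = z ** 2 * sigma2 / half_width ** 2
print(round(n, 4))   # 270.6025
print(math.ceil(n))  # 271: round up so the probability is at least 90%
```

Note the quadratic cost of precision: halving the target interval width would quadruple the required sample size.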
Sample Percentage

The sample percentage is defined to be the ratio of the number of successes to the number of trials:

    P = (X₁ + ... + Xₙ)/n = (Σᵢ₌₁ⁿ Xᵢ)/n

For example, batting averages (P) are estimates of the unknown proportion of successful batting over the whole career (π). If we could observe the data for the whole career, then we would know the true value.
Sample Percentage

    P = (X₁ + ... + Xₙ)/n

    E(P) = (E(X₁) + ... + E(Xₙ))/n = (π + ... + π)/n = π

    Var(P) = (Var(X₁) + ... + Var(Xₙ))/n²
           = (π(1 − π) + ... + π(1 − π))/n²
           = π(1 − π)/n

Law of Large Numbers: P → π as n → ∞.

Central Limit Theorem: P ≈ N(π, π(1 − π)/n).
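A quick simulation confirms both moments. The sketch below is hypothetical Python; the batting probability π = 0.3 and the n = 400 at-bats are made-up numbers for illustration.

```python
import random
import statistics

# Each at-bat is a Bernoulli(pi) trial; P is the fraction of successes
pi, n, reps = 0.3, 400, 2000
rng = random.Random(11)
props = [sum(rng.random() < pi for _ in range(n)) / n for _ in range(reps)]

print(statistics.mean(props))      # close to pi = 0.3
print(statistics.variance(props))  # close to pi(1 - pi)/n = 0.000525
```

The simulated proportions also look approximately normal around π, as the CLT statement above predicts.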
Sample Percentage

Suppose we toss a fair coin 1,000 times. What is the probability of observing heads less than half of the time? A fair coin means that π = 0.5, so

    P(P < 500/1000) = P(Z < (0.5 − 0.5)/√(0.5(1 − 0.5)/1000))
                    = P(Z < 0)
                    = 0.5

We could also work with Binomial distribution probabilities, but n is very large here.
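In fact, the exact Binomial probability is still tractable in Python even at n = 1000. The sketch below computes P(heads < 500) directly and shows the normal approximation P(Z < 0) = 0.5 is close but not exact; the small gap is the discreteness (continuity-correction) effect.

```python
from math import comb

def binom_cdf_below(k, n, p=0.5):
    """P(X < k) for X ~ Binomial(n, p), summed term by term (exact)."""
    return sum(comb(n, i) * p ** i * (1 - p) ** (n - i) for i in range(k))

exact = binom_cdf_below(500, 1000)
print(round(exact, 4))  # 0.4874 -- close to the CLT answer of 0.5
```

The missing 0.0126 is essentially half of P(X = 500), the probability of landing exactly on the boundary, which the continuous normal curve splits across it.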