Lecture 7: Confidence interval and Normal approximation

1 Lecture 7: Confidence interval and Normal approximation 26th of November 2015 Confidence interval 26th of November / 23

2 Random sample and uncertainty
Example: we aim to estimate the average height of British men. We measure the height of 198 men. The sample mean of the 198 men's heights is 1732 mm, and the sample standard deviation is 68.8 mm.
What does this tell us about the average height of British men? We measured only 198 of the many millions of men in the country. Measuring the height of 198 other men would have led to a different estimate of the average height of British men.

3 Imagine the height of British men follows a normal distribution $N(\mu, \sigma^2)$. Denote by $X_1, \dots, X_n$ the random measurements of the heights of $n = 198$ men. Our best guess for $\mu$ will of course be the sample mean
$$\bar{X} = \frac{1}{n}(X_1 + \dots + X_n).$$
What is the distribution of $\bar{X}$? It is a scaled sum of independent normally distributed random variables, so it follows a normal distribution. Its mean is $\mu$. Its variance is
$$\mathrm{Var}(\bar{X}) = \frac{1}{n^2}\bigl(\mathrm{Var}(X_1) + \dots + \mathrm{Var}(X_n)\bigr) = \frac{\sigma^2}{n}.$$
Therefore $\bar{X} \sim N\left(\mu, \frac{\sigma^2}{n}\right)$.

4 We can standardise $\bar{X}$ by writing
$$Z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}}, \quad \text{where } Z \sim N(0, 1).$$
Using the standard normal tables we can show that $P(-1.96 \le Z \le 1.96) = 0.95$. Therefore we have that
$$P\left(\bar{X} - 1.96\frac{\sigma}{\sqrt{n}} \le \mu \le \bar{X} + 1.96\frac{\sigma}{\sqrt{n}}\right) = 0.95,$$
i.e., the probability is 0.95 that $\mu$ is in the range $\bar{X} - 1.96\,\sigma/\sqrt{n}$ to $\bar{X} + 1.96\,\sigma/\sqrt{n}$; $\sigma$ can be approximated using the sample standard deviation.
We call this interval a 95% confidence interval for the unknown population mean.
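As a minimal sketch, the 95% interval for the height example can be computed from the figures quoted on slide 2 (n = 198, sample mean 1732 mm, sample SD 68.8 mm), with the sample SD standing in for σ. The function name `normal_ci` is illustrative and not part of the lecture.

```python
import math

def normal_ci(mean, sd, n, z=1.96):
    """Return (lower, upper) for mean +/- z * sd / sqrt(n)."""
    se = sd / math.sqrt(n)  # standard error of the sample mean
    return mean - z * se, mean + z * se

# Height example: n = 198, sample mean 1732 mm, sample SD 68.8 mm
lower, upper = normal_ci(1732, 68.8, 198)
print(f"95% CI: ({lower:.1f} mm, {upper:.1f} mm)")
```

Rounded to the nearest millimetre, this gives roughly 1722 mm to 1742 mm.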

5 General procedure for normal confidence interval
Suppose $X_1, \dots, X_n$ are independent samples from a normal distribution with unknown mean $\mu$ and known variance $\sigma^2$. Then a (symmetric) $c\%$ normal confidence interval for $\mu$ is the interval
$$\left(\bar{X} - z\frac{\sigma}{\sqrt{n}},\ \bar{X} + z\frac{\sigma}{\sqrt{n}}\right), \quad \text{which we also write as } \bar{X} \pm z\frac{\sigma}{\sqrt{n}},$$
where $z$ is the number such that $(100 - c)/2\,\%$ of the probability in the standard normal distribution is above $z$.
For a 95% confidence interval, we take $z = 1.96$.
For a 99% confidence interval, we take $z = 2.6$ (more precisely, 2.576).
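The value of z for a given confidence level c can also be obtained numerically instead of from tables. The sketch below uses `statistics.NormalDist` from the Python standard library; the helper name `z_for_confidence` is an assumption made for illustration.

```python
from statistics import NormalDist

def z_for_confidence(c):
    """z such that (100 - c)/2 percent of N(0, 1) lies above z."""
    upper_tail = (100 - c) / 200  # tail probability as a fraction
    return NormalDist().inv_cdf(1 - upper_tail)

for c in (68, 90, 95, 99):
    print(f"{c}%: z = {z_for_confidence(c):.3f}")
```

For c = 95 this returns about 1.960 and for c = 99 about 2.576, which the slides round to 2.6.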

6 Standard error
A $c\%$ normal confidence interval for $\mu$ is the interval $\left(\bar{X} - z\,\sigma/\sqrt{n},\ \bar{X} + z\,\sigma/\sqrt{n}\right)$.
The quantity $\sigma/\sqrt{n}$, which determines the scale of the confidence interval, is called the Standard Error (SE) for the sample mean.
Here we assume that $\sigma$ is known. If it is not, we will estimate it from the data.

7 Interpreting the confidence interval
What does confidence mean? The quantity $\mu$ is a fact, not a random quantity.
Example: imagine we obtain the following 95% confidence interval for men's height: (1722 mm, 1742 mm). We cannot say: "The probability is 0.95 that $\mu$ is between 1722 mm and 1742 mm."
The randomness is in our estimate $\bar{X}$. The probability statement is about the random interval.

8 Interpreting the confidence interval
A $(\alpha \times 100)\%$ confidence interval for a parameter $\mu$, based on observations $X = (X_1, \dots, X_n)$, is a pair of statistics $A(X)$ and $B(X)$ such that
$$P\bigl(A(X) \le \mu \le B(X)\bigr) = \alpha.$$
We can say that: $(\alpha \times 100)\%$ of the time, the random interval generated according to this recipe will cover the true parameter.
The quantity $P(A(X) \le \mu \le B(X))$ is called the coverage probability for $\mu$.
A $(\alpha \times 100)\%$ confidence interval is sometimes also called a confidence interval with confidence coefficient or confidence level $\alpha$.
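The coverage interpretation can be checked by simulation. The sketch below repeatedly draws normal samples, builds the interval from the sample mean and a known σ, and counts how often it covers the true mean. The "true" values of µ and σ, the number of trials, and the seed are all assumptions chosen for illustration (they reuse the height figures), not results from the lecture.

```python
import math
import random

def coverage(mu=1732.0, sigma=68.8, n=198, trials=2000, z=1.96, seed=0):
    """Fraction of simulated intervals xbar +/- z*sigma/sqrt(n) that contain mu."""
    rng = random.Random(seed)
    se = sigma / math.sqrt(n)  # known-sigma standard error
    hits = 0
    for _ in range(trials):
        # One simulated study: the mean of n independent N(mu, sigma^2) draws
        xbar = sum(rng.gauss(mu, sigma) for _ in range(n)) / n
        if xbar - z * se <= mu <= xbar + z * se:
            hits += 1
    return hits / trials

print(f"Empirical coverage: {coverage():.3f}")
```

The printed fraction should be close to 0.95; it is the interval that varies from run to run, while µ stays fixed.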

9 Example: blood sample
[Figure: four panels plotting Blood Pressure against Trial: "Four measurements for 100 patients", "95% confidence interval", "90% confidence interval", "68% confidence interval".]

10 The Normal Approximation
So far, we have been assuming that our data are sampled from a population with a normal distribution. What justification do we have for this assumption? And what do we do if the data come from a different distribution?
One of the great early discoveries of probability theory is that many different kinds of random variables come close to a normal distribution when you average enough of them.
You have already seen examples of this phenomenon in the normal approximation to the binomial and Poisson distributions.

11 The Normal Approximation
Approximation Theorems in Probability
Suppose $X_1, X_2, \dots, X_n$ are independent samples from a probability distribution with mean $\mu$ and variance $\sigma^2$. Then:
Law of Large Numbers (LLN): for $n$ large, $\bar{X}$ will be close to $\mu$.
Central Limit Theorem (CLT): for $n$ large, the error in the LLN is close to a normal distribution, with variance $\sigma^2/n$. That is, using our standardisation procedure for the normal distribution,
$$Z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} \qquad (1)$$
is close to having a standard normal distribution. Equivalently, $X_1 + \dots + X_n$ has approximately $N(n\mu, n\sigma^2)$ distribution.

12 In probability textbooks, you can find very precise statements about what it means for the distribution to be close. For our purposes, we will simply treat $Z$ as being actually a standard normal random variable.
However, we also need to know what it means for $n$ to be large. For most distributions that you might encounter, 20 is usually plenty, while 2 or 3 are not enough.
The key rules of thumb are that the approximation works best when the distribution of $X$
1. is reasonably symmetric: not skewed in either direction;
2. has thin tails: most of the probability is close to the mean, not many standard deviations away from the mean.
More specific indications will be given in the following examples.

13 The Poisson distribution
Suppose the $X_i$ are drawn from a Poisson distribution with parameter $\mu$. The variance is then also $\mu$.
We know that $X_1 + \dots + X_n$ has a Poisson distribution with parameter $n\mu$. The CLT tells us that for $n$ large enough, the $\mathrm{Po}(n\mu)$ distribution is very close to the $N(n\mu, n\mu)$ distribution; or, in other words, $\mathrm{Po}(\lambda)$ is approximately the same as $N(\lambda, \lambda)$ for $\lambda$ large.
How large should it be?

14 [Figure: four panels, for $\lambda = 1$, $\lambda = 4$, $\lambda = 10$ and $\lambda = 20$; axes: Z (horizontal), Density (vertical).]
Note that when $\lambda < 2.5\sqrt{\lambda}$ or, equivalently, $\lambda < 2.5^2 \approx 6.2$, the normal curve will have substantial probability below $-0.5$.
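The question of how large λ needs to be can be explored numerically. The sketch below compares an exact Poisson probability with its N(λ, λ) approximation and reports how much probability the normal curve places below −0.5, a region no Poisson count can reach. The half-unit continuity correction and the function names are assumptions added for illustration; the lecture does not specify them.

```python
import math
from statistics import NormalDist

def normal_mass_below_minus_half(lam):
    """Probability that N(lam, lam) assigns to values below -0.5."""
    return NormalDist(lam, math.sqrt(lam)).cdf(-0.5)

def poisson_cdf(k, lam):
    """Exact P(Po(lam) <= k)."""
    return sum(math.exp(-lam) * lam**i / math.factorial(i) for i in range(k + 1))

for lam in (1, 4, 10, 20):
    exact = poisson_cdf(lam, lam)
    # Normal approximation evaluated at lam + 0.5 (continuity correction)
    approx = NormalDist(lam, math.sqrt(lam)).cdf(lam + 0.5)
    wasted = normal_mass_below_minus_half(lam)
    print(f"lambda={lam:>2}: exact={exact:.4f} normal={approx:.4f} "
          f"mass below -0.5={wasted:.2e}")
```

For λ = 1 the normal curve puts several percent of its probability below −0.5, while for λ = 20 that mass is negligible.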

15 The Bernoulli distribution
Bernoulli variables are random variables that take the value 1 or 0, with probability $p$ or $1 - p$ respectively. Then $B = X_1 + \dots + X_n$ is the number of successes in $n$ trials, with success probability $p$ each time; that is, a $\mathrm{Bin}(n, p)$ random variable.
The CLT implies that $\mathrm{Bin}(n, p) \approx N(np, np(1 - p))$ for large values of $n$.
Note that $B/n$ is the proportion of successes in $n$ trials, and this has approximately $N(p, p(1 - p)/n)$ distribution. In other words, the observed proportion will be close to $p$, but will be off by a small multiple of the standard deviation, which shrinks as $\sigma/\sqrt{n}$, where $\sigma = \sqrt{p(1 - p)}$.

16 The Bernoulli distribution
How large does $n$ need to be? As in the case of the Poisson distribution, a minimum requirement is that the mean be substantially larger than the standard deviation, $np \gg \sqrt{np(1 - p)}$, so that $n \gg 1/p$.
The condition is symmetric, so we also need $n \gg 1/(1 - p)$.
This fits with our rule of thumb that $n$ needs to be bigger when the distribution of $X$ is skewed, which is the case when $p$ is close to 0 or 1.

17 The Bernoulli distribution
When $p = 0.5$ the normal approximation is quite good, even when $n$ is only 10. When $p = 0.1$ we have a good normal approximation when $n = 100$, but not when $n$ is 25.
[Figure: two rows of four panels, one row for $p = 0.5$ ($n = 3$, $n = 10$, $n = 25$, ...) and one row for $p = 0.1$.]
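One rough way to quantify "quite good" is the largest gap between the exact binomial CDF and its normal approximation. The sketch below computes that gap for a few (n, p) pairs mentioned on this slide. The error measure, the half-unit continuity correction and the name `max_cdf_error` are assumptions made for illustration, not the lecture's own criterion.

```python
import math
from statistics import NormalDist

def max_cdf_error(n, p):
    """Largest gap between the Bin(n, p) CDF and its normal approximation."""
    approx = NormalDist(n * p, math.sqrt(n * p * (1 - p)))
    cdf, worst = 0.0, 0.0
    for k in range(n + 1):
        cdf += math.comb(n, k) * p**k * (1 - p) ** (n - k)  # exact P(B <= k)
        # Compare with the normal CDF at k + 0.5 (continuity correction)
        worst = max(worst, abs(cdf - approx.cdf(k + 0.5)))
    return worst

for n, p in ((10, 0.5), (25, 0.1), (100, 0.1)):
    print(f"p={p}, n={n}: max CDF error = {max_cdf_error(n, p):.4f}")
```

The error for p = 0.1 shrinks as n grows from 25 to 100, and the symmetric case p = 0.5 is already accurate at n = 10.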

18 CLT for real data
Example: household incomes in the state of California in 1999.
Challenge: household income has a highly skewed distribution, hence it is a poor candidate for applying the CLT.
[Figure: six density plots, with income in $thousands on the horizontal axis: "Histogram of California household income 1999" and "Averages of 2", "Averages of 5", "Averages of 10", "Averages of 50" and "Averages of 100 California household incomes".]
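The same effect can be reproduced in a small simulation. The sketch below does not use the California data: it assumes a lognormal distribution as a stand-in for a skewed income distribution, and measures how the skewness of averages falls as more values are averaged. The distribution, its parameters, the seed and the function names are all illustrative assumptions.

```python
import random
from statistics import fmean, pstdev

def skewness(values):
    """Sample skewness: mean cubed standardised deviation."""
    m, s = fmean(values), pstdev(values)
    return fmean(((v - m) / s) ** 3 for v in values)

def skew_of_averages(k, reps=2000, seed=1):
    """Skewness of `reps` averages, each of k draws from a skewed stand-in."""
    rng = random.Random(seed)
    # Assumption: lognormal(0, 1) stands in for incomes; NOT the California data.
    averages = [fmean(rng.lognormvariate(0.0, 1.0) for _ in range(k))
                for _ in range(reps)]
    return skewness(averages)

for k in (1, 10, 100):
    print(f"averages of {k:>3}: skewness = {skew_of_averages(k):.2f}")
```

A skewness near 0 indicates a roughly symmetric distribution, so falling values are consistent with the averages approaching a normal shape.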

19 Use of the CLT
The CLT enables us to approximate probability distributions by a normal distribution.
There are many implications of the Central Limit Theorem. Here we discuss one crucial application: the CLT allows us to compute normal confidence intervals for data that are not themselves normally distributed.

20 Example: average incomes
Suppose we take a random sample of 400 households in Oxford. We find that they have an average income of 36,200, with an SD of 26,400. What can we infer about the average income of all households in Oxford?
Answer: although the distribution of incomes is not normal, the average of 400 incomes is approximately normally distributed.
The SE for the mean is $26400/\sqrt{400} = 1320$, so a 95% confidence interval for the average income in the population will be $36200 \pm 2640 = (33560, 38840)$. This half-width is 2 SE; using 1.96 SE gives $(33613, 38787)$.
A 99% confidence interval is $36200 \pm 3400 = (32800, 39600)$.
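The arithmetic for this example can be checked in a few lines. The sketch uses the figures from the slide (n = 400, mean 36,200, SD 26,400) with z = 1.96 and z = 2.576; the choice of these exact multipliers is an assumption, since the slide's rounded intervals are consistent with slightly different values.

```python
import math

n, mean, sd = 400, 36200, 26400
se = sd / math.sqrt(n)  # 26400 / 20 = 1320

for label, z in (("95%", 1.96), ("99%", 2.576)):
    print(f"{label} CI: ({mean - z * se:.0f}, {mean + z * se:.0f})")
```

With z = 1.96 the 95% interval is about (33613, 38787), slightly narrower than the interval obtained from a half-width of 2 SE.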

21 Confidence intervals for probability of success
Example: the Gallup organisation carried out a poll in October 2005 of Americans' attitudes about guns. They surveyed 1,012 Americans, chosen at random. 30% said they personally owned a gun.
If they'd picked different people, purely by chance they would have gotten a somewhat different percentage. What does this survey tell us about the true proportion of Americans who own guns?
Here, $n = 1{,}012$ is large so, according to the CLT, the distribution of the sample proportion of gun owners can be approximated by a $N(p, p(1 - p)/n)$ distribution.

22 We can compute a 95% confidence interval for the true proportion of Americans who own guns as $0.30 \pm 1.96\,\mathrm{SE}$, where SE can be approximated using the sample proportion $\hat{p} = 0.3$:
$$\mathrm{SE} = \sqrt{\frac{\hat{p}(1 - \hat{p})}{n}} = \sqrt{\frac{0.3 \times 0.7}{1012}} \approx 0.0144.$$
So a 95% confidence interval for the true fraction of Americans who own guns is $0.30 \pm 0.029 = (0.271, 0.329)$. This half-width corresponds to rounding 1.96 up to 2; with 1.96 exactly the interval is $(0.272, 0.328)$.
Loosely put, we can be 95% confident that the true proportion of Americans who own guns is between 27% and 33%.
A 99% confidence interval comes from multiplying by 2.6 instead of 1.96: it goes from 26.3% to 33.7%.
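As a hedged sketch of the calculation on this slide, the function below builds a normal-approximation interval for a proportion from the observed proportion and the sample size. The name `proportion_ci` and its return layout are illustrative assumptions.

```python
import math

def proportion_ci(p_hat, n, z=1.96):
    """Return (lower, upper, se) for p_hat +/- z * sqrt(p_hat * (1 - p_hat) / n)."""
    se = math.sqrt(p_hat * (1 - p_hat) / n)  # estimated standard error
    return p_hat - z * se, p_hat + z * se, se

# Gallup example: 30% of 1,012 respondents
lower, upper, se = proportion_ci(0.30, 1012)
print(f"SE = {se:.4f}, 95% CI = ({lower:.3f}, {upper:.3f})")
```

Passing `z=2.6` reproduces the wider 99% interval of roughly 26.3% to 33.7%.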

23 Summary
- Definition of confidence interval for the mean when data are normally distributed
- Approximation theorems in probability: the law of large numbers and the central limit theorem (CLT)
- Indications about the conditions of the CLT for different probability distributions
- Applications of the CLT to compute confidence intervals when data are not normally distributed
Reminder: the lecture notes contain more details and more examples; they are available on my website.
