Confidence intervals for parameters of normal distribution.

Lecture 5. Confidence intervals for parameters of normal distribution.

Let us consider a Matlab example based on the dataset of body temperature measurements of 130 individuals from the article [1]. The dataset can be downloaded from the journal's website. This dataset was derived from the article [2]. First of all, if we use dfittool to fit a normal distribution to this data we get a pretty good approximation, see figure 5.1.

Figure 5.1: Fitting a body temperature dataset. (a) Histogram of the data and p.d.f. of the fitted normal distribution; (b) empirical c.d.f. and c.d.f. of the fitted normal distribution.

The tool also outputs the following MLE estimates µ̂ and σ̂ of the parameters µ and σ of the normal distribution:

    Parameter    Estimate    Std. Err.
    mu           98.249      0.064
    sigma        0.733       0.045
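The same fit can also be done from the command line; here is a minimal sketch, assuming the data sit in a plain text file (the file name normtemp.csv is only a placeholder) and that the Statistics Toolbox functions fitdist and pdf are available:

    % Fit a normal distribution to the body temperature data without the dfittool GUI.
    normtemp = csvread('normtemp.csv');        % placeholder file name; one temperature per row
    pd = fitdist(normtemp(:), 'Normal')        % fitted normal distribution, displays mu and sigma
    histogram(normtemp, 'Normalization', 'pdf'); hold on
    t = linspace(min(normtemp), max(normtemp), 200);
    plot(t, pdf(pd, t)); hold off              % overlay the fitted p.d.f. as in figure 5.1(a)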

Also, if our dataset vector name is normtemp, then using the Matlab function normfit by typing

    [mu,sigma,muint,sigmaint] = normfit(normtemp)

outputs the following:

    mu = 98.249, sigma = 0.733, muint = [98.122, 98.376], sigmaint = [0.654, 0.835].

The last two intervals here are 95% confidence intervals for the parameters µ and σ. This means that not only are we able to estimate the parameters of the normal distribution using the MLE, but we can also guarantee with confidence 95% that the true unknown parameters of the distribution belong to these confidence intervals. How this is done is the topic of this lecture. Notice that the conventional normal temperature 98.6 does not fall into the estimated 95% confidence interval [98.122, 98.376].

Distribution of the estimates of parameters of normal distribution.

Let us consider a sample X₁, ..., Xₙ ∼ N(µ, σ²) from a normal distribution with mean µ and variance σ². The MLE gave us the following estimates of µ and σ²:

    µ̂ = X̄  and  σ̂² = (X₁² + ... + Xₙ²)/n − (X̄)².

The question is: how close are these estimates to the actual values of the unknown parameters µ and σ²? By the LLN we know that these estimates converge to µ and σ²,

    X̄ → µ  and  σ̂² → σ²  as n → ∞,

but we will try to describe precisely how close X̄ and σ̂² are to µ and σ². We will start by studying the following question:

What is the joint distribution of (X̄, σ̂²) when X₁, ..., Xₙ are i.i.d. from N(0, 1)?

A similar question for a sample from a general normal distribution N(µ, σ²) can be reduced to this one by renormalizing Zᵢ = (Xᵢ − µ)/σ. We will need the following definition.

Definition. If X₁, ..., Xₙ are i.i.d. standard normal, then the distribution of

    X₁² + ... + Xₙ²

is called the χ²ₙ-distribution (chi-squared distribution) with n degrees of freedom.

We will find the p.d.f. of this distribution in the following lectures. At this point we only need to note that this distribution does not depend on any parameters besides the degrees of freedom n and, therefore, it could be tabulated even if we were not able to find the explicit formula for its p.d.f.

Here is the main result that will allow us to construct confidence intervals for the parameters of the normal distribution as in the Matlab example above.

Theorem. If X₁, ..., Xₙ are i.i.d. standard normal, then the sample mean X̄ and the sample variance σ̂² = (X₁² + ... + Xₙ²)/n − (X̄)² are independent,

    √n X̄ ∼ N(0, 1)  and  n σ̂² ∼ χ²ₙ₋₁,

i.e. √n X̄ has the standard normal distribution and n σ̂² has the χ² distribution with (n − 1) degrees of freedom.
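Before proving the theorem, it can be checked by a quick simulation; the following sketch (sample size and number of replications are arbitrary, and corr requires the Statistics Toolbox) compares the empirical moments of √n X̄ and n σ̂² with those of N(0, 1) and χ²ₙ₋₁:

    % Simulation sketch for the theorem: sqrt(n)*Xbar ~ N(0,1), n*sigma2hat ~ chi2(n-1),
    % and the two statistics are independent.
    n = 5; M = 100000;                      % sample size and number of replications
    X = randn(M, n);                        % each row is a sample of size n from N(0,1)
    Xbar  = mean(X, 2);                     % sample means
    s2hat = mean(X.^2, 2) - Xbar.^2;        % sample variances (1/n)*sum(X_i^2) - Xbar^2
    A = sqrt(n) * Xbar;                     % should be standard normal
    B = n * s2hat;                          % should be chi-square with n-1 degrees of freedom
    [mean(A) var(A)]                        % close to [0 1]
    [mean(B) var(B)]                        % close to [n-1 2*(n-1)] = [4 8]
    corr(A, B)                              % close to 0, consistent with independence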

Proof. Consider the vector Y given by a specific orthogonal transformation of X:

    Y = (Y₁, ..., Yₙ)ᵀ = V X.

Here we choose the first row of the matrix V to be equal to

    v₁ = (1/√n, ..., 1/√n)

and let the remaining rows be any vectors such that the matrix V defines an orthogonal transformation. This can be done since the length of the first row vector is |v₁| = 1, and we can simply choose the rows v₂, ..., vₙ to be any orthonormal basis in the hyperplane orthogonal to the vector v₁.

Let us discuss some properties of this particular transformation. First of all, we showed above that Y₁, ..., Yₙ are also i.i.d. standard normal. Because of the particular choice of the first row v₁ in V, the first random variable is

    Y₁ = (1/√n) X₁ + ... + (1/√n) Xₙ = √n X̄,  and therefore  X̄ = Y₁/√n.    (5.0.1)

Next, n times the sample variance can be written as

    n σ̂² = X₁² + ... + Xₙ² − n(X̄)² = X₁² + ... + Xₙ² − ((1/√n) X₁ + ... + (1/√n) Xₙ)² = X₁² + ... + Xₙ² − Y₁².

The orthogonal transformation V preserves the length of X, i.e. |Y| = |V X| = |X|, or

    Y₁² + ... + Yₙ² = X₁² + ... + Xₙ²,

and, therefore, we get

    n σ̂² = Y₁² + ... + Yₙ² − Y₁² = Y₂² + ... + Yₙ².    (5.0.2)

Equations (5.0.1) and (5.0.2) show that the sample mean and the sample variance are independent, since Y₁ and (Y₂, ..., Yₙ) are independent; √n X̄ = Y₁ has the standard normal distribution; and n σ̂² = Y₂² + ... + Yₙ² has the χ²ₙ₋₁ distribution, since Y₂, ..., Yₙ are independent standard normal.

Let us write down the implications of this result for a general normal distribution: X₁, ..., Xₙ ∼ N(µ, σ²).
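The two identities (5.0.1) and (5.0.2) in the proof are easy to verify numerically; in the sketch below, completing v₁ to an orthogonal matrix via null is just one convenient choice:

    % Numerical check of (5.0.1) and (5.0.2) for a single random sample.
    n  = 6;
    v1 = ones(1, n) / sqrt(n);              % first row of V, a vector of length 1
    V  = [v1; null(v1)'];                   % null(v1) gives an orthonormal basis of the
                                            % hyperplane orthogonal to v1, so V is orthogonal
    X  = randn(n, 1);
    Y  = V * X;
    Y(1) - sqrt(n) * mean(X)                        % (5.0.1): Y_1 = sqrt(n)*Xbar, so ~ 0
    sum(Y.^2) - sum(X.^2)                           % V preserves length, so ~ 0
    n*(mean(X.^2) - mean(X)^2) - sum(Y(2:end).^2)   % (5.0.2): n*sigma2hat = Y_2^2+...+Y_n^2, so ~ 0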

In this case, we know that

    Z₁ = (X₁ − µ)/σ, ..., Zₙ = (Xₙ − µ)/σ ∼ N(0, 1)

are independent standard normal. The Theorem applied to Z₁, ..., Zₙ gives that

    √n Z̄ = √n (X̄ − µ)/σ ∼ N(0, 1)

and

    n ((Z₁² + ... + Zₙ²)/n − (Z̄)²) = n ((X₁² + ... + Xₙ²)/n − (X̄)²)/σ² (the terms involving µ cancel) = n σ̂²/σ² ∼ χ²ₙ₋₁.

We proved that the MLE µ̂ = X̄ and σ̂² = (X₁² + ... + Xₙ²)/n − (X̄)² are independent and

    √n (µ̂ − µ)/σ ∼ N(0, 1),   n σ̂²/σ² ∼ χ²ₙ₋₁.

Confidence intervals for parameters of normal distribution.

We know that by the LLN the sample mean µ̂ and the sample variance σ̂² converge to the mean µ and the variance σ²:

    µ̂ = X̄ → µ,   σ̂² = (X₁² + ... + Xₙ²)/n − (X̄)² → σ².

In other words, these estimates are consistent. Based on the above description of the joint distribution of the estimates, we will give a precise quantitative description of how close µ̂ and σ̂² are to the unknown parameters µ and σ².

Let us start by giving a definition of a confidence interval in our usual setting, when we observe a sample X₁, ..., Xₙ with distribution P_θ₀ from a parametric family {P_θ : θ ∈ Θ}, and θ₀ is unknown.

Definition. Given a confidence level parameter α ∈ [0, 1], if there exist two statistics

    S₁ = S₁(X₁, ..., Xₙ)  and  S₂ = S₂(X₁, ..., Xₙ)

such that the probability

    P_θ₀(S₁ ≤ θ₀ ≤ S₂) = α  (or ≥ α),

then we will call [S₁, S₂] a confidence interval for the unknown parameter θ₀ with confidence level α.
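The definition can be illustrated by a coverage experiment: an interval with confidence level α = 0.95 should contain the true parameter in roughly 95% of repeated samples. Here is a sketch using Matlab's normfit interval for µ; the values of mu0, sigma0 and n are chosen only to mimic the body temperature example above:

    % Coverage sketch for a 95% confidence interval for mu (Statistics Toolbox assumed).
    mu0 = 98.25; sigma0 = 0.73; n = 130; M = 2000;   % illustrative values
    cover = 0;
    for k = 1:M
        x = mu0 + sigma0 * randn(n, 1);              % a sample from N(mu0, sigma0^2)
        [~, ~, muint, ~] = normfit(x, 0.05);         % 95% confidence interval for mu
        cover = cover + (muint(1) <= mu0 && mu0 <= muint(2));
    end
    cover / M                                        % should be close to 0.95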

This definition means that we can guarantee with probability/confidence α that our unknown parameter lies within the interval [S₁, S₂]. We will now show how, in the case of a normal distribution N(µ, σ²), we can construct confidence intervals for the unknown µ and σ².

Let us recall that in the last lecture we proved that if X₁, ..., Xₙ are i.i.d. with distribution N(µ, σ²), then

    A = √n (µ̂ − µ)/σ ∼ N(0, 1)  and  B = n σ̂²/σ² ∼ χ²ₙ₋₁,

and the random variables A and B are independent. If we recall the definition of the χ² distribution, this means that we can represent A and B as

    A = Y₁  and  B = Y₂² + ... + Yₙ²

for some Y₁, ..., Yₙ i.i.d. standard normal.

Figure 5.2: p.d.f. of the χ²ₙ₋₁-distribution and the α-confidence interval; the tail areas to the left of c₁ and to the right of c₂ are each (1 − α)/2.

First, let us consider the p.d.f. of the χ²ₙ₋₁ distribution (see figure 5.2) and choose points c₁ and c₂ so that the area in each tail is (1 − α)/2. Then the area between c₁ and c₂ is α, which means that

    P(c₁ ≤ B ≤ c₂) = α.

Therefore, we can guarantee with probability α that

    c₁ ≤ n σ̂²/σ² ≤ c₂.
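In practice c₁ and c₂ are read off a χ² table or computed with the inverse c.d.f.; for example, with the Statistics Toolbox functions chi2inv and chi2cdf:

    % Tail points of the chi-square(n-1) distribution for confidence level alpha.
    alpha = 0.95; n = 10;                       % illustrative values
    c1 = chi2inv((1 - alpha)/2, n - 1);         % area (1-alpha)/2 to the left of c1
    c2 = chi2inv((1 + alpha)/2, n - 1);         % area (1-alpha)/2 to the right of c2
    chi2cdf(c2, n - 1) - chi2cdf(c1, n - 1)     % equals alpha, i.e. P(c1 <= B <= c2)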

Solving this for σ² gives

    n σ̂²/c₂ ≤ σ² ≤ n σ̂²/c₁.

This precisely means that the interval

    [ n σ̂²/c₂ , n σ̂²/c₁ ]

is the α-confidence interval for the unknown variance σ².

Next, let us construct the confidence interval for the mean µ. We will need the following definition.

Definition. If Y₀, Y₁, ..., Yₙ are i.i.d. standard normal, then the distribution of the random variable

    Y₀ / √( (Y₁² + ... + Yₙ²)/n )

is called the (Student) tₙ-distribution with n degrees of freedom.

We will find the p.d.f. of this distribution in the following lectures, together with the p.d.f. of the χ²-distribution and some others. At this point we only note that this distribution does not depend on any parameters besides the degrees of freedom n and, therefore, it can be tabulated.

Consider the following expression:

    A / √( B/(n − 1) ) = Y₁ / √( (Y₂² + ... + Yₙ²)/(n − 1) ) ∼ tₙ₋₁,

which, by definition, has the tₙ₋₁-distribution with n − 1 degrees of freedom. On the other hand,

    A / √( B/(n − 1) ) = ( √n (µ̂ − µ)/σ ) / √( n σ̂² / (σ² (n − 1)) ) = √(n − 1) (µ̂ − µ)/σ̂.

If we now look at the p.d.f. of the tₙ₋₁ distribution (see figure 5.3) and choose the constants −c and c so that the area in each tail is (1 − α)/2 (the constant is the same on each side because the distribution is symmetric), we get that with probability α,

    −c ≤ √(n − 1) (µ̂ − µ)/σ̂ ≤ c,

and solving this for µ, we get the confidence interval

    µ̂ − c σ̂/√(n − 1) ≤ µ ≤ µ̂ + c σ̂/√(n − 1).

Figure 5.3: p.d.f. of the tₙ₋₁ distribution and the confidence interval for µ; the tail areas to the left of −c and to the right of c are each (1 − α)/2.
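Both intervals can be packaged into a short Matlab function. The sketch below follows the formulas above literally (tinv and chi2inv are Statistics Toolbox functions); it is not the code behind normfit, although the interval for µ reduces to the same expression, and taking square roots of s2int gives an interval for σ like the sigmaint reported earlier.

    function [muint, s2int] = normci_mle(x, alpha)
    % Confidence intervals for mu and sigma^2 at level alpha, based on the MLE
    % mu_hat = Xbar and sigma2_hat = (1/n)*sum(x_i^2) - Xbar^2, using the
    % t_{n-1} and chi-square_{n-1} tail points derived above.
    n  = numel(x);
    m  = mean(x);
    s2 = mean(x.^2) - m^2;                      % MLE of the variance
    c  = tinv((1 + alpha)/2, n - 1);            % t_{n-1} tail point
    c1 = chi2inv((1 - alpha)/2, n - 1);         % chi-square tail points
    c2 = chi2inv((1 + alpha)/2, n - 1);
    muint = [m - c*sqrt(s2/(n - 1)), m + c*sqrt(s2/(n - 1))];
    s2int = [n*s2/c2, n*s2/c1];
    end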

Example. (Textbook, Section 7.5.) Consider a sample of size n = 10 from a normal distribution with unknown parameters:

    0.86, 1.53, 1.57, 1.81, 0.99, 1.09, 1.29, 1.78, 1.29, 1.58.

We compute the estimates

    µ̂ = X̄ = 1.379  and  σ̂² = (X₁² + ... + Xₙ²)/n − (X̄)² = 0.0966.

Let us choose confidence level α = 95% = 0.95. We have to find c, c₁ and c₂ as explained above. Using the table of the t₉-distribution we need to find c such that t₉((−∞, c]) = 0.975, which gives us c = 2.262. To find c₁ and c₂ we have to use the χ²₉-distribution table, so that

    χ²₉([0, c₁]) = 0.025  ⟹  c₁ = 2.70,
    χ²₉([0, c₂]) = 0.975  ⟹  c₂ = 19.02.

Plugging these into the formulas above, with probability 95% we can guarantee that

    X̄ − c √( σ̂²/9 ) ≤ µ ≤ X̄ + c √( σ̂²/9 ),  i.e.  1.1446 ≤ µ ≤ 1.6134,

and with probability 95% we can guarantee that

    n σ̂²/c₂ ≤ σ² ≤ n σ̂²/c₁,  i.e.  0.0508 ≤ σ² ≤ 0.3579.
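Assuming the normci_mle sketch above is on the Matlab path, the numbers in this example can be reproduced (up to rounding) as follows:

    % Reproduce the worked example with n = 10 observations.
    x = [0.86 1.53 1.57 1.81 0.99 1.09 1.29 1.78 1.29 1.58];
    tinv(0.975, 9)                   % c  = 2.262
    chi2inv([0.025 0.975], 9)        % c1 = 2.70 and c2 = 19.02
    [muint, s2int] = normci_mle(x, 0.95)
    % muint is approximately [1.14, 1.61] and s2int approximately [0.05, 0.36]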

These confidence intervals may not look impressive, but the sample size is very small here, n = 10.

References.

[1] Shoemaker, Allen L. (1996), "What's Normal? - Temperature, Gender, and Heart Rate." Journal of Statistics Education, v. 4, n. 2.

[2] Mackowiak, P. A., Wasserman, S. S., and Levine, M. M. (1992), "A Critical Appraisal of 98.6 Degrees F, the Upper Limit of the Normal Body Temperature, and Other Legacies of Carl Reinhold August Wunderlich." Journal of the American Medical Association, 268, 1578-1580.
