Math 6, Problem Set. Due //.

1. (4..8) Determine the mean and variance of the mean $\bar X$ of a random sample of size 9 from a distribution having pdf $f(x) = 4x^3$, $0 < x < 1$, zero elsewhere.

We have that
$$E[\bar X] = E\Big[\frac{1}{n}\sum_{i=1}^n X_i\Big] = \frac{1}{n}\, n E[X_i] = E[X_i], \qquad \operatorname{var}(\bar X) = \frac{\operatorname{var}\big(\sum_{i=1}^n X_i\big)}{n^2} = \frac{\operatorname{var}(X_i)}{n}.$$
Using the pdf,
$$E[X] = \int_0^1 4x^4\,dx = \frac{4}{5}, \qquad E[X^2] = \int_0^1 4x^5\,dx = \frac{2}{3},$$
so
$$\operatorname{var}(X_i) = \frac{2}{3} - \Big(\frac{4}{5}\Big)^2 = \frac{2}{75}, \qquad \operatorname{var}(\bar X) = \frac{2}{675}.$$

2. (4..5) Let $S^2 = \frac{1}{n-1}\sum_{i=1}^n (X_i - \bar X)^2$ denote the sample variance of a random sample from a distribution with variance $\sigma^2 > 0$. Since $E[S^2] = \sigma^2$, why isn't $E[S] = \sigma$? Note: there is a hint in the book that gives it away, but maybe think about it for a second before looking there.

This follows from the fact that $f(x) = x^2$ is strictly convex (that is, $f''(x) = 2 > 0$): from Jensen's inequality, $\sigma^2 = E[S^2] > (E[S])^2$,
so $E[S] < \sigma$. Note: some of you also (cleverly) noticed this follows from the fact that $\operatorname{var}(S) > 0$.

3. (5..4) Let $X_1, \dots, X_n$ be a random sample from the $\Gamma(2, \theta)$ distribution, where $\theta$ is unknown. (Recall: look back in the earlier chapter if you forget the specifics of the $\Gamma$ distribution.) Let $Y = \sum_{i=1}^n X_i$.

(a) Find the distribution of $Y$ and determine $c$ so that $cY$ is an unbiased estimator of $\theta$.

(b) If $n = 5$, show that
$$P\Big(9.59 < \frac{2Y}{\theta} < 34.2\Big) = 0.95.$$

(c) Using (b), show that if $y$ is the value of $Y$ once the sample is drawn, then the interval
$$\Big(\frac{2y}{34.2},\ \frac{2y}{9.59}\Big)$$
is a 95% confidence interval for $\theta$.

(d) Suppose the sample results in the values 44.879.55.99.574 4.5. Based on these data, obtain the point estimate of $\theta$ as described in part (a) and the computed 95% confidence interval in part (c). What does the confidence interval mean?

For (a), $Y \sim \Gamma(2n, \theta)$ by the additive property of the Gamma distribution. Thus $E[Y] = 2n\theta$, and we should take $c = \frac{1}{2n}$.

For (b), note that $2Y/\theta \sim \chi^2(20)$, and 9.59 and 34.2 are the 0.025 and 0.975 quantiles of the $\chi^2(20)$ distribution. We will simply set up the integral: since $Y \sim \Gamma(10, \theta)$,
$$P\Big(9.59 < \frac{2Y}{\theta} < 34.2\Big) = P(4.795\,\theta < Y < 17.1\,\theta) = \int_{4.795\theta}^{17.1\theta} \frac{1}{9!\,\theta^{10}}\, x^9 e^{-x/\theta}\,dx = \int_{4.795}^{17.1} \frac{u^9}{9!}\, e^{-u}\,du = 0.95,$$
substituting $u = x/\theta$. Note: the important part here is that the final integral does not depend on $\theta$.
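As a sanity check on (b): for integer shape $k$, the $\Gamma(k, 1)$ CDF has the closed form $F(y) = 1 - e^{-y}\sum_{m=0}^{k-1} y^m/m!$, so the probability can be evaluated without numerical integration. A quick Python sketch (an illustration added for checking, not part of the assigned solution):

```python
from math import exp, factorial

def gamma_cdf(y, k=10):
    # CDF of Gamma(shape=k, scale=1) for integer k:
    # F(y) = 1 - e^(-y) * sum_{m=0}^{k-1} y^m / m!
    return 1.0 - exp(-y) * sum(y**m / factorial(m) for m in range(k))

# P(9.59 < 2Y/theta < 34.2) = P(4.795 < Y/theta < 17.1), with Y/theta ~ Gamma(10, 1)
p = gamma_cdf(17.1) - gamma_cdf(4.795)
print(round(p, 4))  # approximately 0.95
```

The closed form is just the statement that a Gamma(k, 1) variable exceeds $y$ exactly when a Poisson($y$) count is below $k$.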
For (c), note that
$$P\Big(9.59 < \frac{2Y}{\theta} < 34.2\Big) = P\Big(\frac{2Y}{34.2} < \theta < \frac{2Y}{9.59}\Big) = 0.95,$$
so the given interval is a 95% confidence interval.

For (d), $\hat\theta = \frac{y}{2n} = 12.8654$, and we have a confidence interval of $(2y/34.2,\ 2y/9.59) = (7.52,\ 26.83)$. The interval does not mean that $\theta$ is random: before the sample is drawn, the random interval $(2Y/34.2,\ 2Y/9.59)$ covers the true value of $\theta$ with probability 0.95, so in repeated sampling roughly 95% of the intervals computed this way would contain $\theta$.

4. (5..5) Suppose the number of customers $X$ that enter a store between the hours of 9AM and 10AM follows a Poisson distribution with parameter $\theta$. Suppose a random sample of the number of customers for 10 days results in the values
$$9 \quad 7 \quad 9 \quad 15 \quad 10 \quad 13 \quad 11 \quad 7 \quad 2 \quad 12$$
Based on these data, obtain an unbiased point estimate of $\theta$. Explain the meaning of this estimate in terms of the number of customers.

Since $E[X] = \theta$ for a Poisson($\theta$) random variable, the sample mean is unbiased, and we have a point estimate of $\bar x = 9.5$ customers per day. (Of course it is impossible to actually observe 9.5 customers on any given day, but it is a perfectly fine estimate of the average number of customers per day; the true mean of the distribution need not be an integer either.)

5. (5..) Obtain the probability that an observation is a potential outlier for the following distributions:

(a) The underlying distribution is normal.

(b) The underlying distribution is logistic; in other words, it has pdf
$$f(x) = \frac{e^{-x}}{(1 + e^{-x})^2}, \qquad -\infty < x < \infty.$$

(c) The underlying distribution is Laplace, with pdf
$$f(x) = \frac{1}{2} e^{-|x|}, \qquad -\infty < x < \infty.$$

For (a), we find (from the back of the book) that $Q_3 = 0.675$ and $Q_1 = -0.675$.
Thus $h = 1.5(Q_3 - Q_1) = 2.025$, $UF = Q_3 + h = 2.7$ and $LF = -2.7$. We have $P(Z > UF) = 0.00347$, so $P(LF < Z < UF) = 0.99307$, and the probability of a potential outlier is $2(0.00347) \approx 0.007$.

For (b), we have that
$$\int_{-\infty}^{Q_1} \frac{e^{-x}}{(1 + e^{-x})^2}\,dx = \frac{1}{4},$$
and similarly for $Q_3$. We thus get $Q_3 = \ln 3$ and $Q_1 = -\ln 3$. Thus $h = 3\ln 3$, $UF = 4\ln 3$ and $LF = -4\ln 3$. We have then that
$$P(LF < X < UF) = \int_{-4\ln 3}^{4\ln 3} \frac{e^{-x}}{(1 + e^{-x})^2}\,dx = \frac{40}{41},$$
so the probability of a potential outlier is $\frac{1}{41} \approx 0.0244$.

For (c), we have that $\int_{-\infty}^{Q_1} \frac{1}{2} e^{x}\,dx = \frac{1}{4}$. Solving, we find $Q_1 = -\ln 2$ and likewise $Q_3 = \ln 2$. Thus $h = 3\ln 2$, and we have $UF = 4\ln 2$ and $LF = -4\ln 2$. Since
$$\int_{-4\ln 2}^{4\ln 2} \frac{1}{2} e^{-|x|}\,dx = \frac{15}{16},$$
the probability of a potential outlier is $\frac{1}{16} = 0.0625$.

6. (5..5) Let $Y_1 < Y_2 < Y_3 < Y_4$ be the order statistics of a random sample of size 4 from the distribution having pdf $f(x) = e^{-x}$, $0 < x < \infty$, zero elsewhere. Find $P(3 \le Y_4)$.

We have that $F(x) = \int_0^x e^{-t}\,dt = 1 - e^{-x}$. Using the formula we derived in class:
$$P(3 \le Y_4) = \int_3^\infty 4\,(1 - e^{-t})^3 e^{-t}\,dt = 1 - (1 - e^{-3})^4.$$

7. (5..) Let $Y_1 < Y_2 < Y_3$ be the order statistics of a random sample of size 3 from a distribution having the pdf $f(x) = 2x$, $0 < x < 1$, zero elsewhere. Show that $Z_1 = Y_1/Y_2$, $Z_2 = Y_2/Y_3$ and $Z_3 = Y_3$ are mutually independent.
We have that $Y_3 = Z_3$, $Y_2 = Z_2 Z_3$ and $Y_1 = Z_1 Z_2 Z_3$. Thus
$$J = \begin{vmatrix} z_2 z_3 & z_1 z_3 & z_1 z_2 \\ 0 & z_3 & z_2 \\ 0 & 0 & 1 \end{vmatrix} = z_2 z_3^2.$$
We have
$$f_{Z_1,Z_2,Z_3}(z_1, z_2, z_3) = f_{Y_1,Y_2,Y_3}(z_1 z_2 z_3,\ z_2 z_3,\ z_3)\, z_2 z_3^2 = 3!\,(2 z_1 z_2 z_3)(2 z_2 z_3)(2 z_3)\, z_2 z_3^2 = (2 z_1)(4 z_2^3)(6 z_3^5),$$
where $0 < z_i < 1$. This is the product of the marginal pdfs of $Z_1$, $Z_2$ and $Z_3$, which shows that the variables are mutually independent.

8. (5..) Let $X_1, X_2, \dots, X_n$ be a random sample. A measure of spread is Gini's mean difference
$$G = \sum_{j=2}^{n} \sum_{i=1}^{j-1} |X_i - X_j| \Big/ \binom{n}{2}.$$

(a) If $n = 10$, find $a_1, \dots, a_{10}$ so that $G = \sum_{i=1}^{10} a_i Y_i$, where $Y_1, Y_2, \dots, Y_{10}$ are the order statistics of the sample.

(b) Show that $E[G] = 2\sigma/\sqrt{\pi}$ if the sample arises from the normal distribution $N(\mu, \sigma^2)$.

Hint: this looks worse than it is. Think about the fact that $|X_i - X_j|$ is either $X_i - X_j$ or $X_j - X_i$, depending on which is larger.

The $Y_i$ appear in the sum $G$, as they are some $X_j$. The key observation is that $|Y_i - Y_j| = Y_i - Y_j$ if $i > j$, and $|Y_i - Y_j| = Y_j - Y_i$ if $i < j$. Thus $Y_i$ appears with a positive sign $i - 1$ times, and with a negative sign $n - i$ times. In all, $Y_i$ has coefficient
$$a_i = \frac{(i - 1) - (n - i)}{\binom{n}{2}} = \frac{2i - n - 1}{\binom{n}{2}}.$$
Since $\binom{10}{2} = 45$, we have, for $i = 1, \dots, 10$,
$$a_i = \frac{2i - 11}{45}.$$

For (b), note that $E[G] = E[\,|X_i - X_j|\,]$, since $G$ is an average of $\binom{n}{2}$ identically distributed terms. Since $X_i - X_j \sim N(0, 2\sigma^2)$, we have
$$E[\,|X_i - X_j|\,] = \int_{-\infty}^{\infty} |x|\, \frac{1}{\sqrt{2\pi(2\sigma^2)}} \exp\Big(-\frac{x^2}{4\sigma^2}\Big)\,dx = \frac{2\sigma}{\sqrt{\pi}} \int_0^{\infty} e^{-u}\,du = \frac{2\sigma}{\sqrt{\pi}},$$
where we substituted $u = x^2/(4\sigma^2)$.
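Both parts can be sanity-checked numerically. The Python sketch below (an illustration, not part of the solution) verifies that the order-statistic form $\sum_i a_i Y_i$ with $a_i = (2i - n - 1)/\binom{n}{2}$ reproduces the pairwise definition of $G$ for $n = 10$, and that the Monte Carlo mean of $G$ for $N(0, 1)$ samples is close to $2/\sqrt{\pi} \approx 1.1284$:

```python
import random
from math import sqrt, pi

def gini_pairwise(xs):
    # Gini's mean difference via the defining double sum over pairs i < j.
    n = len(xs)
    pairs = n * (n - 1) / 2
    return sum(abs(xs[i] - xs[j]) for j in range(1, n) for i in range(j)) / pairs

def gini_order_stats(xs):
    # The same statistic as a linear combination of the order statistics,
    # with coefficients a_i = (2i - n - 1) / C(n, 2), i = 1, ..., n.
    n = len(xs)
    c = n * (n - 1) / 2
    return sum((2 * i - n - 1) / c * y for i, y in enumerate(sorted(xs), start=1))

random.seed(1)

# Part (a): the two forms agree on any fixed sample.
sample = [random.gauss(0, 1) for _ in range(10)]
print(abs(gini_pairwise(sample) - gini_order_stats(sample)) < 1e-9)  # True

# Part (b): Monte Carlo estimate of E[G] for N(0, 1) data (sigma = 1).
reps = 20000
avg = sum(gini_pairwise([random.gauss(0, 1) for _ in range(10)])
          for _ in range(reps)) / reps
print(avg, 2 / sqrt(pi))  # the two values should be close
```

The exact agreement in part (a) holds because sorting the sample only reorders the pairwise differences, turning each $|X_i - X_j|$ into $Y_j - Y_i$ with $i < j$.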