Inference and Regression
Assignment 3
Department of IOMS
Professor William Greene
Phone: 212.998.0876
Office: KMC 7-90
Home page: www.stern.nyu.edu/~wgreene
Email: wgreene@stern.nyu.edu
Course web page: www.stern.nyu.edu/~wgreene/econometrics/econometrics.htm

1. The random variable X has mean µx and standard deviation σx. The random variable Y has mean µy and standard deviation σy. If X and Y are independent, find E[XY], Var[XY] and the standard deviation of XY.

If they are independent, E[XY] = E[X]E[Y] = µxµy.
Var[XY] = E[(XY)²] − {E[XY]}²
= E[X²Y²] − (E[X]E[Y])²
= E[X²]E[Y²] − (E[X])²(E[Y])²
= (σx² + µx²)(σy² + µy²) − µx²µy²
= σx²σy² + µx²µy² + σx²µy² + σy²µx² − µx²µy²
= σx²σy² + σx²µy² + σy²µx².
The standard deviation is the square root of this.

2. The number of auto vandalism claims reported per month at Sunny Daze Insurance Company (SDIC) has mean 100 and standard deviation 20. The individual losses have mean $1,200/claim and standard deviation $80/claim. The number of claims and the amounts of individual losses are independent. Using the normal approximation, calculate the probability that SDIC's aggregate auto vandalism losses reported for a month will be less than $110,000.

Aggregate losses are claims times loss per claim. The expected value is the product of the means, 100 claims × $1,200/claim = $120,000. The variance, based on question 1, is 20²×80² + 20²×1,200² + 80²×100² = 642,560,000. The standard deviation is 25,349. Prob[losses < 110,000] = Prob[z < (110,000 − 120,000)/25,349] = Prob[z < −.394] = .347.

3. Let x̄ be the average of a sample of 16 independent normal random variables with mean 0 and variance 1. Determine c such that Prob(|x̄| < c) = .5.

Prob(|x̄| < c) = .5 implies that the probability that x̄ is less than −c is .25 (and the probability that it is greater than c is .25). x̄ is normally distributed with mean 0 and standard deviation 1/sqr(16) = 1/4 = .25. So, Prob(x̄ < −c) = Prob[(x̄ − 0)/.25 < −c/.25] = .25. From the normal table, −c/.25 = −.675. So, 4c = .675, or c = .16875.
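The arithmetic in question 2 can be reproduced numerically. This is a minimal sketch using only the Python standard library; the variable names are mine.

```python
import math

# Moments given in question 2 (claim count N, loss per claim X)
mu_n, sd_n = 100, 20       # claims per month
mu_x, sd_x = 1200, 80      # dollars per claim

# Product formulas from question 1 for independent N and X
mean = mu_n * mu_x
var = sd_n**2 * sd_x**2 + sd_n**2 * mu_x**2 + sd_x**2 * mu_n**2
sd = math.sqrt(var)

# Normal approximation: Prob[aggregate losses < 110,000]
z = (110_000 - mean) / sd
prob = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))  # standard normal CDF

print(mean, var, round(sd), round(prob, 3))  # → 120000 642560000 25349 0.347
```

This confirms the figures in the solution: mean $120,000, variance 642,560,000, standard deviation 25,349, and probability .347.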
4. Show that if X ~ Fn,m, then Y = 1/X ~ Fm,n.

It is possible to do this problem by brute force, using a change of variable and the density of Fn,m. But the result follows trivially from the definition of F: Fn,m = [chi-squared(n)/n] / [chi-squared(m)/m]. Then 1/Fn,m = [chi-squared(m)/m] / [chi-squared(n)/n] = Fm,n.

5. Show that if T ~ tn, then T² ~ F1,n.

T ~ tn means T = Normal(0,1) / sqr[chi-squared(n)/n], so T² = [Normal(0,1)]² / [chi-squared(n)/n]. The key now is to note that the square of a standard normal is chi-squared with 1 degree of freedom. The definition of F1,n then follows immediately.

6. Suppose that X1, X2, ..., Xn are i.i.d. random variables on the interval [0,1] with density
f(x|α) = [Γ(3α) / (Γ(α)Γ(2α))] x^(α−1) (1−x)^(2α−1), α > 0, 0 ≤ x ≤ 1.
The parameter α is to be estimated from the sample. It can be shown that E[X] = 1/3 and Var[X] = 2/[9(3α+1)].
a. How could the method of moments be used to estimate α?
b. What equation does the MLE of α satisfy? (I.e., what is the likelihood equation?)
c. What is the asymptotic variance of the MLE of α?
d. Find a sufficient statistic for α.

a. You could not use the mean in the method of moments, since E[X] does not depend on α. So, use the variance. Equate s² to 2/[9(3α+1)]. Solving, 2/s² = 9(3α+1), or (2/9)/s² = 3α + 1, or α̂ = [2/(9s²) − 1]/3.
b. The log likelihood is the sum of the logs of the density: logL = n logΓ(3α) − n logΓ(α) − n logΓ(2α) + (α−1)Σi logxi + (2α−1)Σi log(1−xi). The likelihood equation is ∂logL/∂α = 3nΨ(3α) − nΨ(α) − 2nΨ(2α) + Σi logxi + 2Σi log(1−xi) = 0.
c. (Ouch!) Differentiate it again: ∂²logL/∂α² = 9nΨ′(3α) − nΨ′(α) − 4nΨ′(2α) = H. The asymptotic variance is −1/H. Since H does not involve xi, we don't need to take the expected value.
d. There are two statistics that are jointly sufficient for α. The density is an exponential family, so we can see immediately that the sufficient statistics are Σi logxi and Σi log(1−xi).
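The method-of-moments recipe in part a of question 6 can be sanity-checked by simulation, since the density is a beta(α, 2α). This is a sketch with a made-up sample size and true α; `betavariate` is in the Python standard library.

```python
import random
import statistics

random.seed(7)
true_alpha = 2.0   # arbitrary true value for the experiment
n = 100_000

# The density in question 6 is beta(a, b) with a = alpha, b = 2*alpha
xs = [random.betavariate(true_alpha, 2.0 * true_alpha) for _ in range(n)]

# Method of moments: equate s^2 to 2/[9(3*alpha + 1)] and solve for alpha
s2 = statistics.variance(xs)
alpha_hat = (2.0 / (9.0 * s2) - 1.0) / 3.0
print(round(alpha_hat, 2))  # close to 2.0
```

The sample mean also lands near 1/3 regardless of α, which is why the variance, not the mean, must carry the estimation.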
The reason there are two statistics and one parameter is that this is a version of the beta distribution, which generally has two parameters, α and β, and the two sufficient statistics shown. This distribution forces β to equal 2α.

7. Suppose that X1, X2, ..., Xn are an i.i.d. random sample from a Rayleigh distribution with parameter θ > 0,
f(x|θ) = (x/θ²) exp[−x²/(2θ²)], x ≥ 0.
a. Find a method of moments estimator of θ.
b. Find the MLE of θ.
c. Find the asymptotic variance of the MLE of θ.
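A quick simulation previews the two estimators derived in parts a and b below, which are the standard Rayleigh results θ̂(MM) = x̄·sqr(2/π) and θ̂(ML) = sqr[Σi xi²/(2n)]. The true θ and sample size here are arbitrary choices of mine.

```python
import math
import random

random.seed(123)
theta = 2.0      # true parameter (arbitrary choice)
n = 100_000

# Rayleigh draws by inverse CDF: F(x) = 1 - exp(-x^2 / (2 theta^2))
xs = [theta * math.sqrt(-2.0 * math.log(1.0 - random.random())) for _ in range(n)]

# Method of moments (from E[x] = theta * sqr(pi/2)) and the MLE
theta_mm = (sum(xs) / n) * math.sqrt(2.0 / math.pi)
theta_ml = math.sqrt(sum(x * x for x in xs) / (2.0 * n))
print(round(theta_mm, 2), round(theta_ml, 2))  # both close to 2.0
```
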
a. First find the expected value:
E[x] = ∫0∞ x (x/θ²) exp(−x²/(2θ²)) dx = (1/θ²) ∫0∞ x² exp(−ax²) dx, where a = 1/(2θ²).
This is a gamma integral. There is a formula for this integral, but it is revealing to work through it. Make the change of variable z = x², so x dx = (1/2)dz and x = z^(1/2), to get
∫0∞ x² exp(−ax²) dx = (1/2) ∫0∞ z^(1/2) exp(−az) dz = (1/2) Γ(3/2) a^(−3/2) = (1/2)(1/2)Γ(1/2) a^(−3/2) = (sqr(π)/4) a^(−3/2).
With a = 1/(2θ²), E[x] = (1/θ²)(sqr(π)/4)(2θ²)^(3/2) = θ sqr(π/2).
Since this is the mean, the method of moments estimator is θ̂ = x̄ sqr(2/π).
b. The log likelihood is the sum of the logs of the densities:
logL = Σi [log xi − 2 log θ − xi²/(2θ²)].
The likelihood equation is ∂logL/∂θ = −2n/θ + (1/θ³) Σi xi² = 0.
The solution is θ̂ = sqr[Σi xi² / (2n)].
c. Differentiate the log likelihood function again:
∂²logL/∂θ² = 2n/θ² − (3/θ⁴) Σi xi².
We need the expected value of this. Superficially, this needs a solution for the expected square of x. But we have a trick available to us. The expected value of the first derivative is zero. So, solving that moment equation, we find E[Σi xi²] = 2nθ². Use this to find the expected value of the second derivative: it is 2n/θ² − (3/θ⁴)(2nθ²) = −4n/θ². The negative of the inverse of this gives the asymptotic variance, θ²/(4n).

8. In the application of Bayesian estimation of a proportion of defectives that we examined in class and that is detailed in Notes-3, we used a uniform prior on [0,1] for the distribution of θ. Suppose we assume, instead, an informative beta prior with parameters α = 3 and β = 7. Repeat the analysis of the problem to find the Bayesian estimator of θ given this prior. (Hint: The beta distribution is a conjugate prior for the likelihood function given in the problem.)

Consider estimation of the probability θ that a production process will produce a defective product. In case 1, suppose the sampling design is to choose N = 25 items from the production line and count the number of defectives. If the probability that any item is defective is a constant θ between zero and one, then the likelihood for the sample of data is
L(θ|data) = θ^D (1−θ)^(25−D), where D is the number of defectives, say, 8. The maximum likelihood estimator of θ will be p = D/25 = 0.32, and the asymptotic variance of the maximum likelihood estimator is estimated by p(1−p)/25 = 0.008704.
Now, consider a Bayesian approach to the same analysis. The posterior density is obtained by the following reasoning:
p(θ|data) = p(θ, data)/p(data) = p(data|θ) p(θ)/p(data), where p(data) = ∫ p(θ, data) dθ,
so p(θ|data) = Likelihood(θ|data) × p(θ) / p(data),
where p(θ) is the prior density assumed for θ. [We have taken some license with the terminology, since the likelihood function is conventionally defined as L(θ|data).] Inserting the results of the sample first drawn, we have the posterior density:
p(θ|data) = θ^D (1−θ)^(N−D) p(θ) / ∫ θ^D (1−θ)^(N−D) p(θ) dθ.
With the beta(3,7) prior, p(θ) = [Γ(3+7)/(Γ(3)Γ(7))] θ^(3−1) (1−θ)^(7−1). The posterior is
p(θ|data) = θ^D (1−θ)^(N−D) [Γ(3+7)/(Γ(3)Γ(7))] θ^(3−1) (1−θ)^(7−1) / ∫ θ^D (1−θ)^(N−D) [Γ(3+7)/(Γ(3)Γ(7))] θ^(3−1) (1−θ)^(7−1) dθ
= θ^(D+3−1) (1−θ)^(N−D+7−1) / ∫ θ^(D+3−1) (1−θ)^(N−D+7−1) dθ
= [Γ(D+3+N−D+7) / (Γ(D+3)Γ(N−D+7))] θ^(D+3−1) (1−θ)^(N−D+7−1).
This is the density of a random variable with a beta distribution with parameters (α, β) = (D+3, N−D+7). The mean of this random variable is (D+3)/(N+10) = 11/35 = 0.3143 (as opposed to 0.32, the MLE). The posterior variance is [(D+3)(N−D+7)] / [(N+10)²(N+11)] = 0.005986.
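The conjugate update in question 8 is short enough to verify numerically. This is a minimal sketch; the function name is mine.

```python
# Conjugate update: beta(a, b) prior + binomial data (D successes in N trials)
# gives a beta(a + D, b + N - D) posterior.
def beta_posterior(a, b, N, D):
    """Return (posterior mean, posterior variance) of the beta update."""
    a_post, b_post = a + D, b + N - D
    s = a_post + b_post
    mean = a_post / s
    var = a_post * b_post / (s**2 * (s + 1))
    return mean, var

mean, var = beta_posterior(3, 7, 25, 8)
print(round(mean, 4), round(var, 6))  # → 0.3143 0.005986
```

With the beta(3,7) prior and D = 8 of N = 25 defective, the posterior is beta(11, 24): mean 11/35 ≈ 0.3143, pulled toward the prior mean 0.3 relative to the MLE 0.32.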
9. The angle θ at which electrons are emitted in muon decay has a distribution with density f(x|α) = (1 + αx)/2, −1 < x < 1 and −1 < α < 1, where x = cos θ. The parameter α is related to polarization. Physical considerations dictate that |α| < 1/3, but f(x|α) is a density for |α| < 1. The method of moments may be used to estimate α from a sample of experimental measurements, x1, ..., xn. The mean of the random variable x is E[x] = µ = ∫−1 to 1 x(1 + αx)/2 dx = α/3. Thus, the method of moments estimator of α is α̂ = 3x̄.
a. Show that E[α̂] = α. That is, the estimator is unbiased.
If E[x] = µ = α/3, then by random sampling results, E[x̄] = E[x] = µ, so E[3x̄] = 3µ = α.
b. Show that Var[α̂] = (3 − α²)/n. (Hint: what is Var[x̄]?)
Use integration to find E[x²] to get Var[x] = E[x²] − µ² = (1/3 − α²/9). The variance of 3x̄ is 9Var[x]/n = (3 − α²)/n.
c. Use the central limit theorem to deduce a normal approximation to the sampling distribution of α̂.
α̂ will be asymptotically normally distributed with mean α and variance (3 − α²)/n.
d. According to this approximation, if n = 25 and α = 0, what is Prob(|α̂| > .5)?
If n = 25 and α = 0, then the mean of the asymptotic distribution will be 0 and the variance will be 3/25. The probability that a normal variable with mean 0 and standard deviation sqr(3)/5 ≈ .35 is > .5 or < −.5 is 2 times the probability that the standard normal variable is greater than (.5 − 0)/.35 = 1.43, which is 2 × .07636 = .15272.
e. Show how you would obtain the MLE of α. (Note, there is no simple direct solution for the MLE.)
lnL = Σi ln[(1 + αxi)/2]. The likelihood equation to be solved is ∂lnL/∂α = Σi xi/(1 + αxi) = 0.
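The tail probability in part d can also be computed without rounding the standard deviation to .35. A small sketch, assuming the same normal approximation; keeping sqr(3)/5 = .3464 exact gives roughly .149 rather than .153.

```python
import math

n, alpha = 25, 0.0
sd = math.sqrt((3 - alpha**2) / n)   # sqr(3)/5 = 0.3464...

def norm_cdf(z):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

# Prob(|alpha_hat| > .5) = 2 * P(Z > .5/sd) under the normal approximation
prob = 2.0 * (1.0 - norm_cdf(0.5 / sd))
print(round(sd, 4), round(prob, 4))  # → 0.3464 0.1489
```

The difference from the solution's .15272 comes entirely from rounding the standard deviation before dividing.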