STAT 514 Solutions to Assignment #6

Question 1: Suppose that X_1, ..., X_n are a simple random sample from a Weibull distribution with density function

f_θ(x) = θ c x^{c-1} exp{-θ x^c} I{x > 0}

for some fixed and known value of c. (This situation comes up occasionally in the context of lifetime testing of mechanical equipment. When c = 1, a Weibull distribution is simply an exponential distribution.)

a) Show that X_i^c is exponentially distributed.

b) If we wish to test H_0: θ ≥ θ_0 against H_1: θ < θ_0, derive the value k_α such that the test that rejects H_0 whenever Σ_{i=1}^n X_i^c > k_α is of size α. Relate k_α to a critical value of the χ² distribution on 2n degrees of freedom.

c) Show that the test in part b) is the U.M.P. level α test.

d) Using a normal approximation to the distribution of the test statistic in part b), give an explicit approximate form of the power function of the test. Then, use this approximation to find the sample size needed for a size 0.05 test to have power at least 0.9 at the alternative θ = 0.2 if θ_0 = 0.25. (In other words, assume that the test statistic is approximately normally distributed with exactly the same mean and variance that it has in reality.)

Solution:

a) The Weibull random variable X has a closed-form cumulative distribution function (cdf), given by

F_θ(x) = (1 - exp{-θ x^c}) I{x > 0}.

Therefore, the cdf of X^c is

P(X^c < x) = P(X < x^{1/c}) = F_θ(x^{1/c}) = 1 - exp{-θ (x^{1/c})^c} = 1 - exp{-θ x}

for x > 0, which is easily seen to be the cdf of an exponential distribution with rate θ.

b) Using the same logic as in part a), we find that θX^c has a standard exponential distribution:

P(θX^c < x) = P(X < (x/θ)^{1/c}) = F_θ[(x/θ)^{1/c}] = 1 - exp{-θ [(x/θ)^{1/c}]^c} = 1 - exp{-x}.

If T = Σ_{i=1}^n X_i^c, this means that 2θT has the same distribution as the sum of n independent exponential random variables with mean 2. Since the χ²_2 distribution is the same as the exponential distribution with mean 2, we conclude that 2θT is distributed as a chi-squared random variable with 2n degrees of freedom; in particular, 2θ_0 T ~ χ²_{2n} when θ = θ_0.
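The distributional fact above (2θT ~ χ²_{2n} when the X_i are Weibull) can be checked by simulation. The following Python sketch does this with numpy and scipy; the parameter values θ, c, and n are hypothetical choices for illustration, not part of the assignment:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
theta, c, n = 0.25, 2.0, 10      # hypothetical parameter values
reps = 20000

# numpy's rng.weibull(c) draws W with P(W > w) = exp(-w^c), so W^c ~ Exp(1);
# rescaling gives X with density theta*c*x^(c-1)*exp(-theta*x^c).
W = rng.weibull(c, size=(reps, n))
X = (W ** c / theta) ** (1.0 / c)

T = (X ** c).sum(axis=1)         # the test statistic T = sum of X_i^c
emp_q = np.quantile(2 * theta * T, [0.25, 0.50, 0.75, 0.95])
chi_q = stats.chi2.ppf([0.25, 0.50, 0.75, 0.95], df=2 * n)
print(np.round(emp_q, 2))        # empirical quantiles of 2*theta*T
print(np.round(chi_q, 2))        # chi-square(2n) quantiles for comparison
```

With 20,000 replications the empirical quantiles typically agree with the χ²_{2n} quantiles to well within a few percent, for any seed.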
This means that if we set

k_α = χ²_{2n}(1-α) / (2θ_0),

where χ²_{2n}(1-α) is the 1-α quantile of the χ²_{2n} distribution, then the test that rejects H_0: θ ≥ θ_0 whenever T > k_α is of size α: P_θ(T > k_α) is decreasing in θ, so its supremum over θ ≥ θ_0 is attained at θ_0, where it equals α.

c) The likelihood is

L(θ) = (θc)^n [Π_{i=1}^n X_i^{c-1}] exp{-θ Σ_{i=1}^n X_i^c} = [c^n Π_{i=1}^n X_i^{c-1}] · [θ^n exp{-θT}],

so T is sufficient for this family of distributions. By the Karlin-Rubin Theorem, it suffices to verify that this family has a monotone likelihood ratio (MLR) in the statistic T. To verify this, we check that for θ_1 < θ_2,

L(θ_2) / L(θ_1) = (θ_2/θ_1)^n exp{(θ_1 - θ_2)T}

is a strictly decreasing function of T, since θ_1 - θ_2 < 0.

d) The result of part a) shows that X_i^c is exponential with mean 1/θ and variance 1/θ². Therefore, T has mean n/θ and variance n/θ². We conclude that (θT - n)/√n has mean 0 and variance 1. Using the normal approximation, with Z denoting a standard normal random variable, the power function is

β(θ) = P_θ(T > k_α) = P_θ( (θT - n)/√n > (θk_α - n)/√n ) ≈ P( Z > (θk_α - n)/√n ) = 1 - Φ( (θk_α - n)/√n ).

It is not easy to find a closed-form solution for the smallest n that gives β(0.2) ≥ 0.9 when θ_0 = 0.25, since k_α depends on n in a complicated way. Thus, we'll find this n numerically. We can depict the power of the size 0.05 test at θ = 0.2 for various values of n as follows:

> n <- 1:200
> k <- qchisq(.95, 2*n) / (2*0.25)  # This gives size 0.05 when theta0=0.25
> beta <- 1 - pnorm((0.2*k - n)/sqrt(n))
> plot(n, beta, type="l")
> abline(h=0.9, lty=2)  # Put a line at 0.9, the desired power

[Figure: beta plotted against n = 1, ..., 200, rising toward the dashed horizontal line at 0.9.]

To find the smallest n that achieves the desired power, we may use the which function with the min function:

> min(which(beta >= 0.9))
[1] 174

Many people used the normal approximation to find the value of k_α on which β(0.2) is based; in fact, this might be the most sensible course of action given that I was not specific about whether to do this. Here is the result:

> k2 <- qnorm(.95, mean=n/0.25, sd=sqrt(n)/0.25)  # Use normal approx for k_alpha
> beta2 <- 1 - pnorm((0.2*k2 - n)/sqrt(n))
> min(which(beta2 >= 0.9))

[1] 169

In fact, for the sake of comparison we can simply find the exact result, which does not use any normal approximation:

> beta3 <- 1 - pchisq(2*0.2*k, 2*n)
> min(which(beta3 >= 0.9))
[1] 171

Question 2: Suppose that X_1, ..., X_m and Y_1, ..., Y_n are independent simple random samples, with X_i ~ N(θ_1, σ²) and Y_i ~ N(θ_2, σ²) for an unknown σ². We wish to test H_0: θ_1 ≤ θ_2 against H_1: θ_1 > θ_2. Show that the likelihood ratio test is equivalent to the two-sample t-test.

Solution: The likelihood function is

L(θ_1, θ_2, σ²) = (2πσ²)^{-(m+n)/2} exp{ -(1/(2σ²)) [ Σ_{i=1}^m (X_i - θ_1)² + Σ_{i=1}^n (Y_i - θ_2)² ] }.

Maximizing with no constraints to obtain the denominator of the LRT statistic, we easily obtain θ̂_1 = X̄ and θ̂_2 = Ȳ. This leads to

σ̂² = (1/(m+n)) [ Σ_{i=1}^m (X_i - X̄)² + Σ_{i=1}^n (Y_i - Ȳ)² ],   (1)

and we will refer to the expression in Equation (1) as the pooled sample variance, or S²_pooled.

As for the numerator of the LRT statistic, we first note that in the case X̄ ≤ Ȳ, the unconstrained maximizer described above lies in the null space Θ_0, and we conclude that the LRT statistic equals one whenever X̄ ≤ Ȳ. However, in the case X̄ > Ȳ, we must carry out a constrained maximization and find the largest possible value of L(θ_1, θ_2, σ²) such that θ_1 ≤ θ_2. We will prove that the maximizer in this case occurs on the boundary, i.e., we must have θ̂_1 = θ̂_2.

One way to prove this is to notice that if there is a local maximizer on the interior of Θ_0, i.e., such that θ_1 < θ_2, then the gradient vector must be zero at that local maximum since it is on the interior of the space. But the unique zero of the gradient function satisfies θ_1 = X̄ and θ_2 = Ȳ, which contradicts X̄ > Ȳ. So there is no interior local maximum.

Another way to prove that we only need to consider θ_1 = θ_2 is as follows: We will show directly that when X̄ > Ȳ and θ_1 < θ_2, either

L(θ_1, θ_1, σ²) > L(θ_1, θ_2, σ²)   or   L(θ_2, θ_2, σ²) > L(θ_1, θ_2, σ²),   (2)

which means that we can eliminate any point with θ_1 < θ_2 from consideration in the maximization problem.
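Claim (2) can be sanity-checked numerically before it is proved. The Python sketch below (with hypothetical sample values chosen so that X̄ > Ȳ; it is not part of the original solution) evaluates the three log-likelihoods over a grid and confirms that one of the two diagonal points always wins:

```python
import numpy as np

def loglik(theta1, theta2, sigma2, x, y):
    """Log-likelihood of two normal samples with means theta1, theta2
    and common variance sigma2."""
    m, n = len(x), len(y)
    rss = np.sum((x - theta1) ** 2) + np.sum((y - theta2) ** 2)
    return -0.5 * (m + n) * np.log(2 * np.pi * sigma2) - rss / (2 * sigma2)

# hypothetical data with xbar > ybar, the case under discussion
x = np.array([2.1, 1.7, 2.5, 1.9])   # xbar = 2.05
y = np.array([1.2, 0.8, 1.5])        # ybar = 1.1667
sigma2 = 1.0

# For every grid point with theta1 < theta2, one of the diagonal points
# (theta1, theta1) or (theta2, theta2) has strictly higher likelihood.
boundary_wins = all(
    loglik(t1, t1, sigma2, x, y) > loglik(t1, t2, sigma2, x, y)
    or loglik(t2, t2, sigma2, x, y) > loglik(t1, t2, sigma2, x, y)
    for t1 in np.linspace(-1.0, 3.0, 9)
    for t2 in np.linspace(-1.0, 3.0, 9)
    if t1 < t2
)
print(boundary_wins)  # prints True
```

This only illustrates the claim for one hypothetical data set; the algebraic argument covers the general case.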
To prove (2), we may verify after a bit of algebra that

log L(θ_1, θ_1, σ²) - log L(θ_1, θ_2, σ²) = (n/(2σ²)) (θ_2 - θ_1) [θ_1 + θ_2 - 2Ȳ],
log L(θ_2, θ_2, σ²) - log L(θ_1, θ_2, σ²) = (m/(2σ²)) (θ_2 - θ_1) [2X̄ - (θ_2 + θ_1)].

Since X̄ > Ȳ, at least one of the two expressions in square brackets must be greater than zero no matter what the value of θ_1 + θ_2 is. This proves the result.

Maximizing L(θ, θ, σ²) is straightforward because it is the same likelihood we would obtain if X_1, ..., X_m, Y_1, ..., Y_n were a simple random sample of size m+n from N(θ, σ²). Therefore, we find

θ̂ = ( Σ_{i=1}^m X_i + Σ_{j=1}^n Y_j ) / (m+n)   and   σ̂² = (1/(m+n)) [ Σ_{i=1}^m (X_i - θ̂)² + Σ_{i=1}^n (Y_i - θ̂)² ],

where we will denote the latter expression by S²_0. By writing X_i - θ̂ = (X_i - X̄) + (X̄ - θ̂) and Y_i - θ̂ = (Y_i - Ȳ) + (Ȳ - θ̂), we may simplify the numerator of S²_0 as follows:

Σ_{i=1}^m (X_i - θ̂)² + Σ_{i=1}^n (Y_i - θ̂)² = Σ_{i=1}^m (X_i - X̄)² + Σ_{i=1}^n (Y_i - Ȳ)² + (mn/(m+n)) (X̄ - Ȳ)².

In the case X̄ > Ȳ, the LRT statistic is therefore

(S²_pooled / S²_0)^{(m+n)/2} = [ S²_pooled / (S²_pooled + mn(X̄ - Ȳ)²/(m+n)²) ]^{(m+n)/2} = [ 1 / (1 + kT²) ]^{(m+n)/2},

where

T = (X̄ - Ȳ) / sqrt{ [(m+n)/(m+n-2)] S²_pooled (1/m + 1/n) }

is the usual two-sample T statistic and k is a constant depending only on m and n (here k = 1/(m+n-2)). Thus, rejecting H_0: θ_1 ≤ θ_2 when the LRT statistic is too small is equivalent to rejecting H_0: θ_1 ≤ θ_2 when X̄ > Ȳ and T² is too large, which proves that the LRT is equivalent to the usual two-sample T test.

Question 3: In each of the following cases, calculate a valid p-value:

a) We observe 20 successes in 90 Bernoulli trials to test H_0: p ≤ 1/6 against H_1: p > 1/6.

b) We observe data 4, 3, 1, 4, 1 from a sample of n = 5 independent Poisson(λ) random variables and we wish to test H_0: λ ≤ 2 against H_1: λ > 2.

c) For the test in part b), imagine that you have not yet observed the data, which still consist of n = 5 observations from Poisson(λ). Plot the power function for the most powerful level 0.05 test and give its exact values at λ = 2 and λ = 3. Is the size exactly 0.05?

Solution:

a) Since more extreme values of X according to H_1 are larger values, we need to consider the probability that X ≥ 20. Since P_p(X ≥ 20) is an increasing function of p for X ~ binom(90, p) (we can verify this easily for p ≤ p_0 = 1/6, since the pmf at every value ≥ 20 is an increasing function of p for p < 1/6),

p-value = sup_{p ≤ p_0} P_p(X ≥ 20) = P_{p_0}(X ≥ 20).

We may therefore find the p-value as follows:

> 1 - pbinom(19, 90, 1/6)
[1] 0.104315

b) In this case we know that T = Σ_{i=1}^5 X_i is sufficient and that T is distributed as Poisson with parameter 5λ.
Since the observed value of T in this case is 4 + 3 + 1 + 4 + 1 = 13, we use the same logic as in part a) to find the p-value as

> 1 - ppois(12, 10)
[1] 0.2084435

c) Here, it is easy to verify that the Poisson family has an MLR property in T = Σ_{i=1}^5 X_i. Thus, the test should reject H_0: λ ≤ 2 when T ≥ c for some c, and the appropriate cutoff for a level 0.05 test can be found as follows:

> min(which(ppois(1:50, 10) > .95))

[1] 15

In other words, 15 is the smallest value m such that P(T ≤ m) > 0.95 when λ = 2, so our UMP level 0.05 test rejects whenever T ≥ 16 (that is, whenever T > 15). The power function is the probability that this happens as a function of the true λ, so we may plot this function for a range of λ values as follows:

> lambda <- seq(1, 4, len=200)
> plot(lambda, 1 - ppois(15, 5*lambda), type="l")
> abline(h=0.05, v=2:3, lty=2)

[Figure: the power 1 - ppois(15, 5*lambda) plotted against lambda from 1 to 4, with dashed reference lines at 0.05 and at lambda = 2 and 3.]

Finally, here are the values of β(λ) for λ = 2 and λ = 3:

> 1 - ppois(15, 5*c(2,3))
[1] 0.0487404 0.4319103

We see that the exact size of the test is 0.0487, not exactly 0.05: because T is discrete, no test of the form "reject when T ≥ c" can have size exactly 0.05.
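These discrete calculations can be cross-checked outside R. The Python sketch below (scipy.stats is an assumption on my part; the assignment itself uses R) recomputes the p-values from parts a) and b) and the exact size and power of the test that rejects when T ≥ 16, the cutoff implied by P(T ≤ 15) > 0.95 at λ = 2:

```python
from scipy import stats

# (a) P(X >= 20) for X ~ binom(90, 1/6); sf(19) = P(X > 19)
p_a = stats.binom.sf(19, 90, 1/6)

# (b) observed T = 4+3+1+4+1 = 13, T ~ Poisson(5*lambda);
#     p-value evaluated at the boundary lambda = 2, i.e. 5*lambda = 10
p_b = stats.poisson.sf(12, 10)       # P(T >= 13)

# (c) size and power of the test rejecting when T >= 16
size = stats.poisson.sf(15, 10)      # beta(2) = P(T >= 16) at lambda = 2
power = stats.poisson.sf(15, 15)     # beta(3) = P(T >= 16) at lambda = 3

print(round(p_a, 6), round(p_b, 7), round(size, 7), round(power, 7))
```

Here sf(k) is the survival function P(X > k), so sf at k-1 gives the one-sided p-value P(X ≥ k) directly.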