Chapter 9

Hypothesis Testing

9.1 Wald, Rao, and Likelihood Ratio Tests

Suppose we wish to test H_0: \theta = \theta_0 against H_1: \theta \ne \theta_0. The likelihood-based results of Chapter 8 give rise to several possible tests. To this end, let l(\theta) denote the loglikelihood and \hat\theta_n the consistent root of the likelihood equation. Intuitively, the farther \theta_0 is from \hat\theta_n, the stronger the evidence against the null hypothesis. But how far is far enough?

Note that if \theta_0 is close to \hat\theta_n, then l(\theta_0) should also be close to l(\hat\theta_n) and l'(\theta_0) should be close to l'(\hat\theta_n) = 0. If we base a test upon the value of (\hat\theta_n - \theta_0), we obtain a Wald test. If we base a test upon the value of l(\hat\theta_n) - l(\theta_0), we obtain a likelihood ratio test. If we base a test upon the value of l'(\theta_0), we obtain a (Rao) score test.

Recall that in order to prove Theorem 8.8, we argued that under certain regularity conditions, the following facts are true under H_0:

    \sqrt{n}(\hat\theta_n - \theta_0) \to^d N\{0, 1/I(\theta_0)\};    (9.1)

    \frac{1}{\sqrt{n}} l'(\theta_0) \to^d N\{0, I(\theta_0)\};    (9.2)

    -\frac{1}{n} l''(\theta_0) \to^P I(\theta_0).    (9.3)

Equation (9.2) is proved using the central limit theorem; Equation (9.3) is proved using the weak law of large numbers; and Equation (9.1) is the result of a Taylor expansion together with Equations (9.2) and (9.3). The three equations above will help us to justify the definitions of Wald, score, and likelihood ratio tests to follow.

Equation (9.1) suggests that if we define

    W_n = \sqrt{n I(\theta_0)}\,(\hat\theta_n - \theta_0),    (9.4)

then W_n, called a Wald statistic, should converge in distribution to standard normal under H_0. Note that this fact remains true if we define

    W_n = \sqrt{n \hat I}\,(\hat\theta_n - \theta_0),    (9.5)

where \hat I is consistent for I(\theta_0); for example, \hat I could be I(\hat\theta_n).

Definition 9.1 A Wald test is any test that rejects H_0: \theta = \theta_0 in favor of H_1: \theta \ne \theta_0 when |W_n| \ge u_{\alpha/2} for W_n defined as in Equation (9.4) or Equation (9.5). As usual, u_{\alpha/2} denotes the 1 - \alpha/2 quantile of the standard normal distribution.

Equation (9.2) suggests that if we define

    R_n = \frac{1}{\sqrt{n I(\theta_0)}}\, l'(\theta_0),    (9.6)

then R_n, called a Rao score statistic or simply a score statistic, converges in distribution to standard normal under H_0. We could also replace I(\theta_0) by a consistent estimator \hat I as in Equation (9.5), but usually this is not done: One of the main benefits of the score statistic is that it is not necessary to compute \hat\theta_n, and using I(\hat\theta_n) instead of I(\theta_0) would defeat this purpose.

Definition 9.2 A score test, sometimes called a Rao score test, is any test that rejects H_0: \theta = \theta_0 in favor of H_1: \theta \ne \theta_0 when |R_n| \ge u_{\alpha/2} for R_n defined as in Equation (9.6).

The third type of test, the likelihood ratio test, requires a bit of development. It is a test based on the statistic

    \Delta_n = 2\{l(\hat\theta_n) - l(\theta_0)\}.    (9.7)

If we Taylor expand l'(\hat\theta_n) = 0 around the point \theta_0, we obtain

    l'(\theta_0) = -(\hat\theta_n - \theta_0)\{l''(\theta_0) + n\,o_P(1)\}.    (9.8)

We now use a second Taylor expansion, this time of l(\hat\theta_n), to obtain

    \Delta_n = 2(\hat\theta_n - \theta_0) l'(\theta_0) + (\hat\theta_n - \theta_0)^2 \{l''(\theta_0) + n\,o_P(1)\}.    (9.9)

If we now substitute Equation (9.8) into Equation (9.9), we obtain

    \Delta_n = n(\hat\theta_n - \theta_0)^2 \left\{ -\frac{1}{n} l''(\theta_0) + o_P(1) \right\}.    (9.10)

By Equations (9.1) and (9.3) and Slutsky's theorem, this implies that \Delta_n \to^d \chi^2_1 under the null hypothesis. Noting that the 1 - \alpha quantile of a \chi^2_1 distribution is u_{\alpha/2}^2, we make the following definition.

Definition 9.3 A likelihood ratio test is any test that rejects H_0: \theta = \theta_0 in favor of H_1: \theta \ne \theta_0 when \Delta_n \ge u_{\alpha/2}^2 or, equivalently, when \sqrt{\Delta_n} \ge u_{\alpha/2}.

Since it may be shown that \Delta_n - W_n^2 \to^P 0 and W_n^2 - R_n^2 \to^P 0, the three tests defined above (Wald tests, score tests, and likelihood ratio tests) are asymptotically equivalent in the sense that under H_0, they reach the same decision with probability approaching 1 as n \to \infty. However, they may be quite different for a fixed sample size n, and furthermore they have some relative advantages and disadvantages with respect to one another. For example:

- It is straightforward to create one-sided Wald and score tests (i.e., tests of H_0: \theta = \theta_0 against H_1: \theta > \theta_0 or H_1: \theta < \theta_0), but this is more difficult with a likelihood ratio test.
- The score test does not require \hat\theta_n whereas the other two tests do.
- The Wald test is most easily interpretable and yields immediate confidence intervals.
- The score test and likelihood ratio test are invariant under reparameterization, whereas the Wald test is not.

Example 9.4 Suppose that X_1, ..., X_n are independent with density f_\theta(x) = \theta e^{-x\theta} I\{x > 0\}. Then l(\theta) = n(\log\theta - \theta\bar X_n), which yields

    l'(\theta) = n\left(\frac{1}{\theta} - \bar X_n\right)  and  l''(\theta) = -\frac{n}{\theta^2}.

From the form of l''(\theta), we see that I(\theta) = 1/\theta^2, and setting l'(\theta) = 0 yields \hat\theta_n = 1/\bar X_n. From these facts, we obtain as the Wald, score, and likelihood ratio

statistics

    W_n = \frac{\sqrt{n}}{\theta_0}\left(\frac{1}{\bar X_n} - \theta_0\right),

    R_n = \sqrt{n}\,\theta_0\left(\frac{1}{\theta_0} - \bar X_n\right) = \theta_0 \bar X_n W_n,  and

    \Delta_n = 2n\left\{\theta_0 \bar X_n - 1 - \log(\theta_0 \bar X_n)\right\}.

Thus, we reject H_0: \theta = \theta_0 in favor of H_1: \theta \ne \theta_0 whenever |W_n| \ge u_{\alpha/2}, |R_n| \ge u_{\alpha/2}, or \Delta_n \ge u_{\alpha/2}^2, depending on which test we're using.

Generalizing the three tests to the multiparameter setting is straightforward. Suppose we wish to test H_0: \theta = \theta_0 against H_1: \theta \ne \theta_0, where \theta \in R^k. Then

    W_n := n(\hat\theta - \theta_0)^T I(\theta_0)(\hat\theta - \theta_0) \to^d \chi^2_k,    (9.11)

    R_n := \frac{1}{n}\,\nabla l(\theta_0)^T I^{-1}(\theta_0)\,\nabla l(\theta_0) \to^d \chi^2_k,  and    (9.12)

    \Delta_n := 2\{l(\hat\theta) - l(\theta_0)\} \to^d \chi^2_k.    (9.13)

Therefore, if c^k_\alpha denotes the 1 - \alpha quantile of the \chi^2_k distribution, then the multivariate Wald test, score test, and likelihood ratio test reject H_0 when W_n \ge c^k_\alpha, R_n \ge c^k_\alpha, and \Delta_n \ge c^k_\alpha, respectively. As in the one-parameter case, the Wald test may also be defined with I(\theta_0) replaced by a consistent estimator \hat I.

Exercises for Section 9.1

Exercise 9.1 Let X_1, ..., X_n be a simple random sample from a Pareto distribution with density f(x) = \theta c^\theta x^{-(\theta+1)} I\{x > c\} for a known constant c > 0 and parameter \theta > 0. Derive the Wald, Rao, and likelihood ratio tests of \theta = \theta_0 against a two-sided alternative.

Exercise 9.2 Suppose that X is multinomial(n, p), where p \in R^k. In order to satisfy the regularity condition that the parameter space be an open set, define \theta = (p_1, ..., p_{k-1}). Suppose that we wish to test H_0: \theta = \theta_0 against H_1: \theta \ne \theta_0.

(a) Prove that the Wald and score tests are the same as the usual Pearson chi-square test.

(b) Derive the likelihood ratio statistic \Delta_n.
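The three statistics of Example 9.4 are easy to compute directly from a sample. The sketch below (plain Python; the helper name exponential_tests is ours, not from the text) implements the formulas above for the exponential model.

```python
import math

def exponential_tests(x, theta0):
    """Wald, score, and likelihood ratio statistics of Example 9.4 for an
    i.i.d. sample from f_theta(x) = theta * exp(-theta * x), x > 0."""
    n = len(x)
    xbar = sum(x) / n
    theta_hat = 1.0 / xbar                              # MLE: root of l'(theta) = 0
    w = (math.sqrt(n) / theta0) * (theta_hat - theta0)  # W_n, using I(theta0) = 1/theta0^2
    r = theta0 * xbar * w                               # R_n = theta0 * xbar * W_n
    delta = 2 * n * (theta0 * xbar - 1 - math.log(theta0 * xbar))  # Delta_n >= 0 always
    return w, r, delta

w, r, delta = exponential_tests([0.5, 1.5, 1.0, 2.0], theta0=1.0)
# Here xbar = 1.25 and theta_hat = 0.8, so w = -0.4 and r = -0.5.
```

Each test rejects at level \alpha when |W_n| \ge u_{\alpha/2}, |R_n| \ge u_{\alpha/2}, or \Delta_n \ge u_{\alpha/2}^2; for this tiny sample none of the three comes close to rejecting at \alpha = .05, as expected from their asymptotic equivalence.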

9.2 Contiguity and Local Alternatives

Suppose that we wish to test the hypotheses

    H_0: \theta = \theta_0  against  H_1: \theta > \theta_0.    (9.14)

The test is to be based on a statistic T_n, where as always n denotes the sample size, and we shall decide to reject H_0 in (9.14) if

    T_n \ge C_n    (9.15)

for some constant C_n. The test in question may be one of the three types of tests introduced in Section 9.1, or it may be an entirely different test. We may define some basic asymptotic concepts regarding tests of this type.

Definition 9.5 If P_{\theta_0}(T_n \ge C_n) \to \alpha for test (9.15), then test (9.15) is said to have asymptotic level \alpha.

Definition 9.6 If two different tests of the same hypotheses reach the same conclusion with probability approaching 1 under the null hypothesis as n \to \infty, the tests are said to be asymptotically equivalent.

The power of test (9.15) under the alternative \theta is defined to be

    \beta_n(\theta) = P_\theta(T_n \ge C_n).

We expect that the power should approach 1.

Definition 9.7 A test (or, more precisely, a sequence of tests) is said to be consistent against the alternative \theta if \beta_n(\theta) \to 1.

Unfortunately, the concepts we have defined so far are of limited usefulness. If we wish to compare two different tests of the same hypotheses, then if the tests are both sensible they should be asymptotically equivalent and consistent. Thus, consistency is nice but it doesn't tell us much; asymptotic equivalence is nice but it doesn't allow us to compare tests. We make things more interesting by considering, instead of a fixed alternative \theta, a sequence of alternatives \theta_1, \theta_2, ....

Let us make some assumptions about the asymptotic distribution of the test statistic T_n in (9.15). First, define \mu(\theta) and \tau^2(\theta)/n to be the mean and variance, respectively, of T_n when \theta is the true value of the parameter. In other words, let \mu(\theta) = E_\theta(T_n) and \tau^2(\theta) = n\,Var_\theta(T_n). We assume that if the null hypothesis is true, which means \theta = \theta_0, then

    \frac{\sqrt{n}\{T_n - \mu(\theta_0)\}}{\tau(\theta_0)} \to^d N(0, 1)    (9.16)

as n \to \infty. Furthermore, we assume that if the alternatives are true, which means that \theta = \theta_n for each n, then

    \frac{\sqrt{n}\{T_n - \mu(\theta_n)\}}{\tau(\theta_n)} \to^d N(0, 1)    (9.17)

as n \to \infty. The limit in (9.17) is trickier than the one in (9.16) because it assumes that the underlying parameter is changing along with n. Consider Example 9.8.

Example 9.8 Suppose we have a sequence of independent and identically distributed random variables X_1, X_2, ... with common distribution F_\theta. In a one-sample t-test, the test statistic T_n may be taken to be the sample mean \bar X_n. In this case, the limit in (9.16) follows immediately from the central limit theorem. Yet to verify (9.17), it is necessary to consider a triangular array of random variables:

    X_{11} ~ F_{\theta_1}
    X_{21}, X_{22} ~ F_{\theta_2}
    X_{31}, X_{32}, X_{33} ~ F_{\theta_3}
    ...

We may often check that the Lyapunov or Lindeberg condition is satisfied, then use the results of Section 4.2 to establish (9.17). In fact, the existence of, say, a finite third absolute moment \gamma(\theta) = E_\theta|X_1|^3 is sufficient because

    \frac{1}{(\sqrt{n}\,\tau(\theta_n))^3} \sum_{i=1}^n E|X_{ni}|^3 = \frac{\gamma(\theta_n)}{\sqrt{n}\,\tau^3(\theta_n)}

and the Lyapunov condition holds as long as \gamma(\theta_n)/\tau^3(\theta_n) tends to some finite limit. We generally assume that \theta_n \to \theta_0, so as long as \gamma(\theta) and \tau(\theta) are continuous, \gamma(\theta_n)/\tau^3(\theta_n) tends to \gamma(\theta_0)/\tau^3(\theta_0).

If (9.16) and (9.17) are true, we may calculate the power of the test against the sequence of alternatives \theta_1, \theta_2, ... in a straightforward way using normal distribution calculations. As mentioned in Example 9.8, we typically assume that this sequence converges to \theta_0. Such a sequence is often referred to as a contiguous sequence of alternatives, since contiguous means "next to"; the idea of contiguity is that we choose not a single alternative hypothesis but a sequence of alternatives next to the null hypothesis.

First, we should determine a value for C_n so that test (9.15) has asymptotic level \alpha. Define u_\alpha to be the 1 - \alpha quantile of the standard normal distribution. By limit (9.16),

    P_{\theta_0}\left\{ T_n \ge \mu(\theta_0) + \frac{\tau(\theta_0) u_\alpha}{\sqrt{n}} \right\} \to \alpha;

therefore, we define a new test, namely reject H_0 in (9.14) if

    T_n \ge \mu(\theta_0) + \frac{\tau(\theta_0) u_\alpha}{\sqrt{n}},    (9.18)

and conclude that test (9.18) has asymptotic level \alpha as desired.

We now calculate the power of test (9.18) against the alternative \theta_n:

    \beta_n(\theta_n) = P_{\theta_n}\left\{ T_n \ge \mu(\theta_0) + \frac{\tau(\theta_0) u_\alpha}{\sqrt{n}} \right\}
                      = P_{\theta_n}\left\{ \frac{\sqrt{n}\{T_n - \mu(\theta_n)\}}{\tau(\theta_n)} \ge \frac{\tau(\theta_0)}{\tau(\theta_n)} u_\alpha - \sqrt{n}(\theta_n - \theta_0)\,\frac{\mu(\theta_n) - \mu(\theta_0)}{(\theta_n - \theta_0)\,\tau(\theta_n)} \right\}.    (9.19)

Thus, \beta_n(\theta_n) tends to an interesting limit (i.e., a limit between \alpha and 1) if \tau(\theta_n) \to \tau(\theta_0); \sqrt{n}(\theta_n - \theta_0) tends to a nonzero, finite limit; and \mu(\theta) is differentiable at \theta_0. This fact is summarized in the following theorem:

Theorem 9.9 Let \theta_n > \theta_0 for all n. Suppose that limits (9.16) and (9.17) hold, \tau(\theta) is continuous at \theta_0, \mu(\theta) is differentiable at \theta_0, and \sqrt{n}(\theta_n - \theta_0) \to \Delta for some finite \Delta > 0. If \mu'(\theta_0) or \tau(\theta_0) depends on n, then suppose that \mu'(\theta_0)/\tau(\theta_0) tends to a nonzero, finite limit. Then if \beta_n(\theta_n) denotes the power of test (9.18) against the alternative \theta_n,

    \beta_n(\theta_n) \to \lim_{n\to\infty} \Phi\left( \frac{\Delta\,\mu'(\theta_0)}{\tau(\theta_0)} - u_\alpha \right).

The proof of Theorem 9.9 merely uses Equation (9.19) and Slutsky's theorem, since the hypotheses of the theorem imply that \tau(\theta_n)/\tau(\theta_0) \to 1 and \{\mu(\theta_n) - \mu(\theta_0)\}/(\theta_n - \theta_0) \to \mu'(\theta_0).

Example 9.10 Let X ~ binomial(n, p_n), where p_n = p_0 + \Delta/\sqrt{n}, and T_n = X/n. To test H_0: p = p_0 against H_1: p > p_0, note that

    \frac{\sqrt{n}(T_n - p_0)}{\sqrt{p_0(1 - p_0)}} \to^d N(0, 1)

under H_0. Thus, test (9.18) says to reject H_0 whenever T_n \ge p_0 + u_\alpha\sqrt{p_0(1 - p_0)/n}. This test has asymptotic level \alpha. Since \tau(p) = \sqrt{p(1 - p)} is continuous and \mu(p) = p is differentiable, Theorem 9.9 applies in this case as long as we can verify the limit (9.17). Let X_{n1}, ..., X_{nn} be independent Bernoulli(p_n) random variables. Then if X_{n1} - p_n, ..., X_{nn} - p_n can be shown to satisfy the Lyapunov condition, we have

    \frac{\sqrt{n}(T_n - p_n)}{\tau(p_n)} \to^d N(0, 1)

and so Theorem 9.9 applies. The Lyapunov condition follows since |X_{ni} - p_n| \le 1 implies

    \frac{1}{\{Var(nT_n)\}^2} \sum_{i=1}^n E|X_{ni} - p_n|^4 \le \frac{n}{\{np_n(1 - p_n)\}^2} \to 0.

Thus, we conclude by Theorem 9.9 that

    \beta_n(p_n) \to \Phi\left( \frac{\Delta}{\sqrt{p_0(1 - p_0)}} - u_\alpha \right).

To apply this result, suppose that we wish to test whether a coin is fair by flipping it 100 times. We reject H_0: p = 1/2 in favor of H_1: p > 1/2 if the number of successes divided by 100 is at least as large as 1/2 + u_{.05}/20, or 0.582. The power of this test against the alternative p = 0.6 is approximately

    \Phi\left( \frac{\sqrt{100}(0.6 - 0.5)}{0.5} - 1.645 \right) = \Phi(0.355) = 0.639.

Compare this asymptotic approximation with the exact power: the probability of at least 59 successes out of 100 for a binomial(100, 0.6) random variable, roughly 0.62.

As seen in Example 9.10, the asymptotic result of Theorem 9.9 may be applied to a fixed sample size n and a fixed alternative \theta as follows:

    \beta_n(\theta) \approx \Phi\left( \frac{\sqrt{n}\{\mu(\theta) - \mu(\theta_0)\}}{\tau(\theta_0)} - u_\alpha \right).    (9.20)

There is an alternative formulation that yields a slightly different approximation. Starting from

    \beta_n(\theta) = P_\theta\left\{ \frac{\sqrt{n}\{T_n - \mu(\theta)\}}{\tau(\theta)} \ge \frac{\tau(\theta_0)}{\tau(\theta)} u_\alpha - \frac{\sqrt{n}\{\mu(\theta) - \mu(\theta_0)\}}{\tau(\theta)} \right\},

we obtain

    \beta_n(\theta) \approx \Phi\left( \frac{\sqrt{n}\{\mu(\theta) - \mu(\theta_0)\}}{\tau(\theta)} - \frac{\tau(\theta_0)}{\tau(\theta)} u_\alpha \right).    (9.21)

Applying approximation (9.21) to the binomial case of Example 9.10, we obtain 0.641 instead of 0.639 for the approximate power.
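The coin-flipping calculation in Example 9.10 can be reproduced numerically. The sketch below (plain Python; the helper names phi and binom_tail are ours) computes the exact binomial tail probability alongside the Theorem 9.9 approximation.

```python
from math import comb, erf, sqrt

def phi(z):
    """Standard normal CDF via the error function."""
    return 0.5 * (1 + erf(z / sqrt(2)))

def binom_tail(k, n, p):
    """P(X >= k) for X ~ binomial(n, p), computed exactly."""
    return sum(comb(n, j) * p**j * (1 - p)**(n - j) for j in range(k, n + 1))

n, p0, p1, u_alpha = 100, 0.5, 0.6, 1.645
cutoff = p0 + u_alpha * sqrt(p0 * (1 - p0) / n)  # test (9.18): reject when T_n >= 0.582
exact = binom_tail(59, n, p1)                    # exact power: P(X >= 59) under p = 0.6
approx = phi(sqrt(n) * (p1 - p0) / sqrt(p0 * (1 - p0)) - u_alpha)  # Theorem 9.9 limit
```

Both power values land in the low 0.6 range, so the asymptotic approximation is quite close even at n = 100.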

We may invert approximations (9.20) and (9.21) to obtain approximate sample sizes required to achieve desired power \beta against alternative \theta. From (9.20) we obtain

    \sqrt{n} \approx \frac{(u_\alpha - u_\beta)\,\tau(\theta_0)}{\mu(\theta) - \mu(\theta_0)},    (9.22)

and from (9.21) we obtain

    \sqrt{n} \approx \frac{u_\alpha \tau(\theta_0) - u_\beta \tau(\theta)}{\mu(\theta) - \mu(\theta_0)}.    (9.23)

We may compare tests by considering the relative sample sizes necessary to achieve the same power at the same level against the same alternative.

Definition 9.11 Given tests 1 and 2 of the same hypotheses with asymptotic level \alpha and a sequence of alternatives \{\theta_k\}, suppose that \beta^{(1)}_{m_k}(\theta_k) \to \beta and \beta^{(2)}_{n_k}(\theta_k) \to \beta as k \to \infty for some sequences \{m_k\} and \{n_k\} of sample sizes. Then the asymptotic relative efficiency (ARE) of test 1 with respect to test 2 is

    e_{1,2} = \lim_{k\to\infty} \frac{n_k}{m_k},

assuming this limit exists.

In Examples 9.12 and 9.13, we consider two different tests for the same hypotheses. Then, in Example 9.15, we compute their asymptotic relative efficiency.

Example 9.12 Suppose we have paired data (X_1, Y_1), ..., (X_n, Y_n). Let Z_i = Y_i - X_i for all i. Assume that the Z_i are independent and identically distributed with distribution function P(Z_i \le z) = F(z - \theta) for some \theta, where f(z) = F'(z) exists and is symmetric about 0. Let W_1, ..., W_n be a permutation of Z_1, ..., Z_n such that |W_1| \le |W_2| \le ... \le |W_n|. We wish to test H_0: \theta = 0 against H_1: \theta > 0. First, consider the Wilcoxon signed rank test. Define

    R_n = \sum_{i=1}^n i\,I\{W_i > 0\}.

Then under H_0, the I\{W_i > 0\} are independent Bernoulli(1/2) random variables. Thus,

    E\,R_n = \sum_{i=1}^n \frac{i}{2} = \frac{n(n+1)}{4}  and  Var\,R_n = \sum_{i=1}^n \frac{i^2}{4} = \frac{n(n+1)(2n+1)}{24}.
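Approximations (9.22) and (9.23) invert directly into a small sample-size calculator. The sketch below (plain Python; the helper name n_required is ours) implements both; note that u_\beta is the 1 - \beta quantile, hence negative whenever the desired power exceeds 1/2.

```python
from math import sqrt

def n_required(u_alpha, u_beta, mu0, mu1, tau0, tau1=None):
    """Approximate sample size from (9.22), or from (9.23) when tau1 is given.
    u_beta is the 1 - beta quantile (negative for power beta > 1/2)."""
    if tau1 is None:
        root_n = (u_alpha - u_beta) * tau0 / (mu1 - mu0)         # (9.22)
    else:
        root_n = (u_alpha * tau0 - u_beta * tau1) / (mu1 - mu0)  # (9.23)
    return root_n ** 2

# Coin-flipping setting of Example 9.10: level .05, power .80 against p = 0.6,
# so u_alpha = 1.645 and u_.80 = -0.8416.
n22 = n_required(1.645, -0.8416, 0.5, 0.6, tau0=0.5)                   # about 155
n23 = n_required(1.645, -0.8416, 0.5, 0.6, tau0=0.5, tau1=sqrt(0.24))  # about 152
```

The two formulas disagree only slightly here because \tau(0.6) = \sqrt{0.24} is close to \tau(0.5) = 0.5.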

Furthermore, one may prove (see Exercise 9.3) that

    \frac{R_n - E\,R_n}{\sqrt{Var\,R_n}} \to^d N(0, 1).

Thus, a test with asymptotic level \alpha rejects H_0 when

    R_n \ge \frac{n(n+1)}{4} + \frac{u_\alpha \tau(0)}{\sqrt{n}},  where  \tau(0) = n\sqrt{(n+1)(2n+1)/24}.

Furthermore, we may apply Theorem 9.9. Now we must find E\,R_n under the alternative \theta_n = \Delta/\sqrt{n}. As earlier, we let \mu(\theta) denote E\,R_n under the assumption that \theta is the true value of the parameter. Note that in this case, \mu(\theta) depends on n even though this dependence is not explicit in the notation.

First, we note that since |W_i| \le |W_j| for i \le j, W_i + W_j > 0 if and only if W_j > 0. Therefore, \sum_{i=1}^j I\{W_i + W_j > 0\} = j\,I\{W_j > 0\} and so we may rewrite R_n in the form

    R_n = \sum_{j=1}^n j\,I\{W_j > 0\} = \sum_{j=1}^n \sum_{i=1}^j I\{W_i + W_j > 0\} = \sum_{j=1}^n \sum_{i=1}^j I\{Z_i + Z_j > 0\}.

Therefore, we obtain

    \mu(\theta_n) = \sum_{j=1}^n \sum_{i=1}^j P_{\theta_n}(Z_i + Z_j > 0) = n\,P_{\theta_n}(Z_1 > 0) + \binom{n}{2} P_{\theta_n}(Z_1 + Z_2 > 0).

Since P_{\theta_n}(Z_1 > 0) = P_{\theta_n}(Z_1 - \theta_n > -\theta_n) = 1 - F(-\theta_n) and

    P_{\theta_n}(Z_1 + Z_2 > 0) = P_{\theta_n}\{(Z_1 - \theta_n) + (Z_2 - \theta_n) > -2\theta_n\}
                                = E\,P_{\theta_n}\{Z_1 - \theta_n > -2\theta_n - (Z_2 - \theta_n) \mid Z_2\}
                                = \int_{-\infty}^{\infty} \{1 - F(-2\theta_n - z)\} f(z)\,dz,

we conclude that

    \mu'(\theta) = n f(-\theta) + n(n-1) \int_{-\infty}^{\infty} f(-2\theta - z) f(z)\,dz.

Thus, because f(-z) = f(z) by assumption,

    \mu'(0) = n f(0) + n(n-1) \int_{-\infty}^{\infty} f^2(z)\,dz.
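The identity R_n = \sum_j \sum_{i \le j} I\{Z_i + Z_j > 0\} used above is easy to check numerically when the |Z_i| are distinct and nonzero. A minimal sketch (the function names are ours):

```python
def signed_rank_statistic(z):
    """R_n = sum of the ranks of |Z_i| over indices with Z_i > 0
    (assumes distinct, nonzero |Z_i|)."""
    order = sorted(range(len(z)), key=lambda i: abs(z[i]))  # the ordering W_1, ..., W_n
    return sum(rank for rank, i in enumerate(order, start=1) if z[i] > 0)

def walsh_form(z):
    """Equivalent form: the number of pairs i <= j with Z_i + Z_j > 0."""
    n = len(z)
    return sum(1 for j in range(n) for i in range(j + 1) if z[i] + z[j] > 0)

z = [1.2, -0.5, 0.3, -2.0, 0.8]
# Both forms give 8 for this sample.
```

The second form is the count of positive Walsh averages (Z_i + Z_j)/2, which is the representation that makes the mean calculation under \theta_n tractable.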

Letting

    K = \int_{-\infty}^{\infty} f^2(z)\,dz,

we obtain

    \lim_{n\to\infty} \frac{\mu'(0)}{\tau(0)} = \lim_{n\to\infty} \frac{\sqrt{24}\,\{f(0) + (n-1)K\}}{\sqrt{(n+1)(2n+1)}} = K\sqrt{12}.

Therefore, Theorem 9.9 shows that

    \beta_n(\theta_n) \to \Phi(\Delta K\sqrt{12} - u_\alpha).    (9.24)

Example 9.13 As in Example 9.12, suppose we have paired data (X_1, Y_1), ..., (X_n, Y_n) and Z_i = Y_i - X_i for all i. The Z_i are independent and identically distributed with distribution function P(Z_i \le z) = F(z - \theta) for some \theta, where f(z) = F'(z) exists and is symmetric about 0. Suppose that the variance of Z_i is \sigma^2. Since the t-test (unknown variance) and z-test (known variance) have the same asymptotic properties, let's consider the z-test for simplicity. Then \tau(\theta) = \sigma for all \theta. The relevant statistic is merely \bar Z_n, and the central limit theorem implies \sqrt{n}\,\bar Z_n/\sigma \to^d N(0, 1) under the null hypothesis. Therefore, the z-test in this case rejects H_0: \theta = 0 in favor of H_1: \theta > 0 whenever \bar Z_n > u_\alpha\sigma/\sqrt{n}. The central limit theorem shows that

    \frac{\sqrt{n}(\bar Z_n - \theta_n)}{\sigma} \to^d N(0, 1)

under the alternatives \theta_n = \Delta/\sqrt{n}. Therefore, by Theorem 9.9, we obtain

    \beta_n(\theta_n) \to \Phi(\Delta/\sigma - u_\alpha)    (9.25)

since \mu'(\theta) = 1 for all \theta.

Before finding the asymptotic relative efficiency (ARE) of the Wilcoxon signed rank test and the t-test, we prove a lemma that enables this calculation. Suppose that for two tests, called test 1 and test 2, we use sample sizes m and n, respectively. We want m and n to tend to infinity together, an idea we make explicit by setting m = m_k and n = n_k for k = 1, 2, .... Suppose that we wish to apply both tests to the same sequence of alternative hypotheses \theta_1, \theta_2, .... As usual, we make the assumption that (\theta_k - \theta_0) times the square root of the sample size tends to a finite, nonzero limit as k \to \infty. Thus, we assume

    \sqrt{m_k}(\theta_k - \theta_0) \to \Delta_1  and  \sqrt{n_k}(\theta_k - \theta_0) \to \Delta_2.

Then if Theorem 9.9 may be applied to both tests, define

    c_1 = \lim \mu_1'(\theta_0)/\tau_1(\theta_0)  and  c_2 = \lim \mu_2'(\theta_0)/\tau_2(\theta_0).

The theorem says that

    \beta_{m_k}(\theta_k) \to \Phi\{\Delta_1 c_1 - u_\alpha\}  and  \beta_{n_k}(\theta_k) \to \Phi\{\Delta_2 c_2 - u_\alpha\}.    (9.26)

To find the ARE, then, Definition 9.11 specifies that we assume that the two limits in (9.26) are the same, which implies \Delta_1 c_1 = \Delta_2 c_2, or

    \frac{n_k}{m_k} \to \left(\frac{c_1}{c_2}\right)^2.

Thus, the ARE of test 1 with respect to test 2 equals (c_1/c_2)^2. This result is summed up in the following lemma, which defines a new term, efficacy.

Lemma 9.14 For a test to which Theorem 9.9 applies, define the efficacy of the test to be

    c = \lim_{n\to\infty} \mu'(\theta_0)/\tau(\theta_0).    (9.27)

Suppose that Theorem 9.9 applies to each of two tests, called test 1 and test 2. Then the ARE of test 1 with respect to test 2 equals (c_1/c_2)^2.

Example 9.15 Using the results of Examples 9.12 and 9.13, we conclude that the efficacies of the Wilcoxon signed rank test and the t-test are

    \sqrt{12} \int_{-\infty}^{\infty} f^2(z)\,dz  and  \frac{1}{\sigma},

respectively. Thus, Lemma 9.14 implies that the ARE of the signed rank test to the t-test equals

    \left( \sqrt{12}\,\sigma \int_{-\infty}^{\infty} f^2(z)\,dz \right)^2.

In the case of normally distributed data, we may verify without too much difficulty that the integral above equals (2\sigma\sqrt{\pi})^{-1}, so the ARE is 3/\pi \approx 0.9549. Notice how close this is to one, suggesting that for normal data, we lose very little efficiency by using a signed rank test instead of a t-test. In fact, it may be shown that this asymptotic relative efficiency has a lower bound of 0.864. However, there is no upper bound on the ARE in this case, which means that examples exist for which the t-test is arbitrarily inefficient compared to the signed rank test.

Exercises for Section 9.2
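The two facts used in Example 9.15, that \int f^2 = (2\sigma\sqrt{\pi})^{-1} for normal data and hence that the ARE equals 3/\pi, can be checked by numerical integration. A sketch with \sigma = 1 (the grid and the truncation interval [-8, 8] are arbitrary choices of ours; the tails beyond them are negligible):

```python
from math import exp, pi, sqrt

def normal_pdf(z, sigma=1.0):
    """Density of N(0, sigma^2)."""
    return exp(-z * z / (2 * sigma * sigma)) / (sigma * sqrt(2 * pi))

# K = integral of f^2 over the real line, by a Riemann sum on [-8, 8]
h = 0.001
K = h * sum(normal_pdf(-8 + i * h) ** 2 for i in range(int(16 / h) + 1))

are = 12 * K**2  # ARE of the signed rank test vs. the t-test when sigma = 1
```

K comes out as roughly 0.2821, matching (2\sqrt{\pi})^{-1}, and are as roughly 0.9549, matching 3/\pi.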

Exercise 9.3 Assume that \theta_n \to 0 for some sequence \theta_1, \theta_2, .... In Example 9.12, prove that

    \frac{R_n - E\,R_n}{\sqrt{Var\,R_n}} \to^d N(0, 1)

under the assumption that for each n, the distribution of R_n is determined by the parameter \theta_n. Note that this proof allows us to apply Theorem 9.9, since it covers both the contiguous alternatives and, by taking \theta_n = \theta_0 for all n, the null hypothesis.

Exercise 9.4 For the hypotheses considered in Examples 9.12 and 9.13, the sign test is based on the statistic N^+ = \#\{i : Z_i > 0\}. Since

    2\sqrt{n}(N^+/n - 1/2) \to^d N(0, 1)

under the null hypothesis, the sign test (with continuity correction) rejects H_0 when

    N^+ \ge \frac{n}{2} + \frac{u_\alpha\sqrt{n}}{2} + \frac{1}{2}.

(a) Find the efficacy of the sign test. Make sure to indicate how you go about verifying Equation (9.17).

(b) Find the ARE of the sign test with respect to the signed rank test and the t-test. Evaluate each of these for the case of normal data.

Exercise 9.5 Suppose X_1, ..., X_n are a simple random sample from a uniform(0, \theta) distribution. We wish to test H_0: \theta = \theta_0 against H_1: \theta > \theta_0 at \alpha = .05. Define Q_1 and Q_3 to be the first and third quartiles of the sample. Consider test A, which rejects when

    Q_3 - Q_1 \ge \frac{\theta_0}{2} + A_n,

and test B, which rejects when

    \bar X \ge \frac{\theta_0}{2} + B_n.

Based on the asymptotic distribution of \bar X and the joint asymptotic distribution of (Q_1, Q_3), find the values of A_n and B_n that correspond with the test in (9.18). Then find the asymptotic relative efficiency of test A relative to test B.

Exercise 9.6 Let P_\theta be a family of probability distributions indexed by a real parameter \theta. If X ~ P_\theta, define \mu(\theta) = E(X) and \sigma^2(\theta) = Var(X). Now let \theta_1, \theta_2, ... be a sequence of parameter values such that \theta_n \to \theta_0 as n \to \infty. Suppose that

E_{\theta_n}|X|^{2+\delta} < M for all n for some positive \delta and M. Also suppose that for each n, X_{n1}, ..., X_{nn} are independent with distribution P_{\theta_n} and define \bar X_n = \sum_{i=1}^n X_{ni}/n. Prove that if \sigma^2(\theta_0) < \infty and \sigma^2(\theta) is continuous at the point \theta_0, then

    \sqrt{n}[\bar X_n - \mu(\theta_n)] \to^d N(0, \sigma^2(\theta_0))

as n \to \infty.

Hint: |a + b|^{2+\delta} \le 2^{2+\delta}(|a|^{2+\delta} + |b|^{2+\delta}).

Exercise 9.7 Suppose X_1, X_2, ... are independent exponential random variables with mean \theta. Consider the test of H_0: \theta = 1 vs H_1: \theta > 1 in which we reject H_0 when

    \bar X_n \ge 1 + \frac{u_\alpha}{\sqrt{n}},

where \alpha = .05.

(a) Derive an asymptotic approximation to the power of the test for a fixed sample size n and alternative \theta. Tell where you use the result of Problem 9.6.

(b) Because the sum of independent exponential random variables is a gamma random variable, it is possible to compute the power exactly in this case. Create a table in which you compare the exact power of the test against the alternative \theta = 1.2 to the asymptotic approximation in part (a) for n \in \{5, 10, 15, 20\}.

Exercise 9.8 Let X_1, ..., X_n be independent from Poisson(\lambda). Create a table in which you list the exact power along with approximations (9.20) and (9.21) for the test that rejects H_0: \lambda = 1 in favor of H_1: \lambda > 1 when

    \frac{\sqrt{n}(\bar X_n - \lambda_0)}{\sqrt{\lambda_0}} \ge u_\alpha,

where n = 20 and \alpha = .05, against each of the alternatives 1.1, 1.5, and 2.

Exercise 9.9 Let X_1, ..., X_n be an independent sample from an exponential distribution with mean \lambda, and Y_1, ..., Y_n be an independent sample from an exponential distribution with mean \mu. Assume that X_i and Y_i are independent. We are interested in testing the hypothesis H_0: \lambda = \mu versus H_1: \lambda > \mu. Consider the statistic

    T_n = \frac{1}{\sqrt{n}} \sum_{i=1}^n (I_i - 1/2),

where I_i is the indicator variable I_i = I(X_i > Y_i).

(a) Derive the asymptotic distribution of T_n under the null hypothesis.

(b) Use the Lindeberg Theorem to show that, under the local alternative hypothesis (\lambda_n, \mu_n) = (\lambda + n^{-1/2}\delta, \lambda), where \delta > 0,

    \frac{\sum_{i=1}^n (I_i - \rho_n)}{\sqrt{n\rho_n(1 - \rho_n)}} \to^L N(0, 1),  where  \rho_n = \frac{\lambda_n}{\lambda_n + \mu_n} = \frac{\lambda + n^{-1/2}\delta}{2\lambda + n^{-1/2}\delta}.

(c) Using the conclusion of part (b), derive the asymptotic distribution of T_n under the local alternative specified in (b).

Exercise 9.10 Suppose X_1, ..., X_m is a simple random sample and Y_1, ..., Y_n is another simple random sample independent of the X_i, with P(X_i \le t) = t for t \in [0, 1] and P(Y_i \le t) = t - \theta for t \in [\theta, \theta + 1]. Assume m/(m + n) \to \rho as m, n \to \infty and 0 < \theta < 1. Find the asymptotic distribution of \sqrt{m + n}\,[g(\bar Y - \bar X) - g(\theta)].

9.3 The Wilcoxon Rank-Sum Test

Suppose that X_1, ..., X_m and Y_1, ..., Y_n are two independent simple random samples, with

    P(X_i \le t) = P(Y_j \le t + \theta) = F(t)    (9.28)

for some continuous distribution function F(t) with f(t) = F'(t). Thus, the distribution of the Y_j is shifted by \theta from the distribution of the X_i. We wish to test H_0: \theta = 0 against H_1: \theta > 0. To do the asymptotics here, we will assume that n and m are actually both elements of separate sequences of sample sizes, indexed by a third variable, say k. Thus, m = m_k and n = n_k both go to \infty as k \to \infty, and we suppress the subscript k on m and n for convenience of notation.

Suppose that we combine the X_i and Y_j into a single sample of size m + n. Define the Wilcoxon rank-sum statistic to be

    S_k = \sum_{j=1}^n (\text{rank of } Y_j \text{ among combined sample}).

Letting Y_{(1)}, ..., Y_{(n)} denote the order statistics for the sample of Y_j as usual, we may rewrite S_k in the following way:

    S_k = \sum_{j=1}^n (\text{rank of } Y_{(j)} \text{ among combined sample})

        = \sum_{j=1}^n \left( j + \#\{i : X_i < Y_{(j)}\} \right)
        = \frac{n(n+1)}{2} + \sum_{j=1}^n \sum_{i=1}^m I\{X_i < Y_{(j)}\}
        = \frac{n(n+1)}{2} + \sum_{j=1}^n \sum_{i=1}^m I\{X_i < Y_j\}.    (9.29)

Let N = m + n, and suppose that m/N \to \rho as k \to \infty for some constant \rho \in (0, 1). First, we will establish the asymptotic behavior of S_k under the null hypothesis. Define \mu(\theta) = E_\theta S_k and \tau^2(\theta) = N\,Var_\theta S_k. To evaluate \mu(\theta_0) and \tau(\theta_0), let Z_i = \sum_{j=1}^n I\{X_i < Y_j\}. Then the Z_i are identically distributed but not independent, and we have E_{\theta_0} Z_i = n/2 and

    Var_{\theta_0} Z_i = \frac{n}{4} + n(n-1)\,Cov_{\theta_0}(I\{X_i < Y_1\}, I\{X_i < Y_2\}) = \frac{n}{4} + n(n-1)\left(\frac{1}{3} - \frac{1}{4}\right) = \frac{n(n+2)}{12}.

Furthermore, for i \ne j,

    E_{\theta_0} Z_i Z_j = \sum_{r=1}^n \sum_{s=1}^n P_{\theta_0}(X_i < Y_r \text{ and } X_j < Y_s) = \frac{n}{3} + \frac{n(n-1)}{4},

which implies

    Cov_{\theta_0}(Z_i, Z_j) = \frac{n}{3} + \frac{n(n-1)}{4} - \frac{n^2}{4} = \frac{n}{12}.

Therefore,

    \mu(\theta_0) = \frac{n(n+1)}{2} + \frac{mn}{2} = \frac{n(N+1)}{2}

and

    \tau^2(\theta_0) = Nm\,Var\,Z_1 + Nm(m-1)\,Cov(Z_1, Z_2) = \frac{Nmn(n+2)}{12} + \frac{Nm(m-1)n}{12} = \frac{Nmn(N+1)}{12}.
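Identity (9.29) and the null moments \mu(\theta_0) = n(N+1)/2 and \tau^2(\theta_0)/N = mn(N+1)/12 can be verified by brute force for small m and n, since under H_0 the ranks of the Y's form a uniformly chosen n-subset of \{1, ..., N\}. A sketch (the helper names are ours; all values are assumed distinct):

```python
from itertools import combinations

def rank_sum(x, y):
    """Wilcoxon rank-sum statistic: the sum of the ranks of the Y's in the
    combined sample (assumes all m + n values are distinct)."""
    combined = sorted(x + y)
    return sum(combined.index(v) + 1 for v in y)

def shortcut(x, y):
    """Equivalent form (9.29): n(n+1)/2 + #{(i, j) : X_i < Y_j}."""
    n = len(y)
    return n * (n + 1) // 2 + sum(1 for xi in x for yj in y if xi < yj)

# Null moments by enumeration for m = n = 3, N = 6: every 3-subset of ranks
# {1, ..., 6} is equally likely under H_0.
m, n, N = 3, 3, 6
vals = [sum(c) for c in combinations(range(1, N + 1), n)]
mean = sum(vals) / len(vals)                          # n(N+1)/2 = 10.5
var = sum((v - mean) ** 2 for v in vals) / len(vals)  # mn(N+1)/12 = 5.25
```

The enumeration agrees exactly with the closed-form moments, which is a useful sanity check before invoking the asymptotics below.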

Let \theta_1, \theta_2, ... be a sequence of alternatives such that \sqrt{N}(\theta_k - \theta_0) \to \Delta for a positive, finite constant \Delta. It is possible to show that

    \frac{\sqrt{N}\{S_k - \mu(\theta_0)\}}{\tau(\theta_0)} \to^d N(0, 1)    (9.30)

under H_0 and

    \frac{\sqrt{N}\{S_k - \mu(\theta_k)\}}{\tau(\theta_0)} \to^d N(0, 1)    (9.31)

under the alternatives \{\theta_k\} (see Exercise 9.11). By expression (9.30), the test based on S_k with asymptotic level \alpha rejects H_0: \theta = 0 in favor of H_1: \theta > 0 whenever S_k \ge \mu(\theta_0) + u_\alpha\tau(\theta_0)/\sqrt{N}, or

    S_k \ge \frac{n(N+1)}{2} + u_\alpha\sqrt{\frac{mn(N+1)}{12}}.

The Wilcoxon rank-sum test is sometimes referred to as the Mann-Whitney test. This alternative name helps to distinguish this test from the similarly named Wilcoxon signed rank test.

To find the limiting power of the rank-sum test, we may use Theorem 9.9 to conclude that

    \beta_k(\theta_k) \to \lim_{k\to\infty} \Phi\left( \frac{\Delta\,\mu'(\theta_0)}{\tau(\theta_0)} - u_\alpha \right).    (9.32)

According to expression (9.32), we must evaluate \mu'(\theta_0). To this end, note that

    P_\theta(X_1 < Y_1) = E_\theta\{P_\theta(X_1 < Y_1 \mid Y_1)\} = E_\theta F(Y_1) = \int_{-\infty}^{\infty} F(y) f(y - \theta)\,dy = \int_{-\infty}^{\infty} F(y + \theta) f(y)\,dy.

Therefore,

    \frac{d}{d\theta} P_\theta(X_1 < Y_1) = \int_{-\infty}^{\infty} f(y + \theta) f(y)\,dy.

This gives

    \mu'(0) = mn \int_{-\infty}^{\infty} f^2(y)\,dy.

Thus, the efficacy of the Wilcoxon rank-sum test is

    \lim_{k\to\infty} \frac{\mu'(\theta_0)}{\tau(\theta_0)} = \lim_{k\to\infty} \frac{mn\int f^2(y)\,dy}{\sqrt{Nmn(N+1)/12}} = \sqrt{12\rho(1-\rho)} \int_{-\infty}^{\infty} f^2(y)\,dy.

The asymptotic power of the test follows immediately from (9.32).

Exercises for Section 9.3

Exercise 9.11 (a) Prove expression (9.30) under the null hypothesis H_0: \theta = \theta_0 (where \theta_0 = 0).

(b) Prove expression (9.31) under the sequence of alternatives \theta_1, \theta_2, ... satisfying \sqrt{N}(\theta_k - \theta_0) \to \Delta.

Hint: In both (a) and (b), start by verifying either the Lindeberg condition or the Lyapunov condition.

Exercise 9.12 Suppose Var X_i = Var Y_i = \sigma^2 < \infty and we wish to test the hypotheses H_0: \theta = 0 vs. H_1: \theta > 0 using the two-sample Z-statistic

    \frac{\bar Y - \bar X}{\sigma\sqrt{\frac{1}{m} + \frac{1}{n}}}.

Note that this Z-statistic is s/\sigma times the usual T-statistic, where s is the pooled sample standard deviation, so the asymptotic properties of the T-statistic are the same as those of the Z-statistic.

(a) Find the efficacy of the Z test. Justify your use of Theorem 9.9.

(b) Find the ARE of the Z test with respect to the rank-sum test for normally distributed data.

(c) Find the ARE of the Z test with respect to the rank-sum test if the data come from a double exponential distribution with f(t) = \frac{1}{2\lambda} e^{-|t|/\lambda}.

(d) Prove by example that the ARE of the Z-test with respect to the rank-sum test can be arbitrarily close to zero.


More information

Chapter 4 HOMEWORK ASSIGNMENTS. 4.1 Homework #1

Chapter 4 HOMEWORK ASSIGNMENTS. 4.1 Homework #1 Chapter 4 HOMEWORK ASSIGNMENTS These homeworks may be modified as the semester progresses. It is your responsibility to keep up to date with the correctly assigned homeworks. There may be some errors in

More information

STAT 461/561- Assignments, Year 2015

STAT 461/561- Assignments, Year 2015 STAT 461/561- Assignments, Year 2015 This is the second set of assignment problems. When you hand in any problem, include the problem itself and its number. pdf are welcome. If so, use large fonts and

More information

MISCELLANEOUS TOPICS RELATED TO LIKELIHOOD. Copyright c 2012 (Iowa State University) Statistics / 30

MISCELLANEOUS TOPICS RELATED TO LIKELIHOOD. Copyright c 2012 (Iowa State University) Statistics / 30 MISCELLANEOUS TOPICS RELATED TO LIKELIHOOD Copyright c 2012 (Iowa State University) Statistics 511 1 / 30 INFORMATION CRITERIA Akaike s Information criterion is given by AIC = 2l(ˆθ) + 2k, where l(ˆθ)

More information

STAT 135 Lab 5 Bootstrapping and Hypothesis Testing

STAT 135 Lab 5 Bootstrapping and Hypothesis Testing STAT 135 Lab 5 Bootstrapping and Hypothesis Testing Rebecca Barter March 2, 2015 The Bootstrap Bootstrap Suppose that we are interested in estimating a parameter θ from some population with members x 1,...,

More information

Master s Written Examination - Solution

Master s Written Examination - Solution Master s Written Examination - Solution Spring 204 Problem Stat 40 Suppose X and X 2 have the joint pdf f X,X 2 (x, x 2 ) = 2e (x +x 2 ), 0 < x < x 2

More information

Topic 15: Simple Hypotheses

Topic 15: Simple Hypotheses Topic 15: November 10, 2009 In the simplest set-up for a statistical hypothesis, we consider two values θ 0, θ 1 in the parameter space. We write the test as H 0 : θ = θ 0 versus H 1 : θ = θ 1. H 0 is

More information

Chapter 4. Theory of Tests. 4.1 Introduction

Chapter 4. Theory of Tests. 4.1 Introduction Chapter 4 Theory of Tests 4.1 Introduction Parametric model: (X, B X, P θ ), P θ P = {P θ θ Θ} where Θ = H 0 +H 1 X = K +A : K: critical region = rejection region / A: acceptance region A decision rule

More information

Chapter 3. Point Estimation. 3.1 Introduction

Chapter 3. Point Estimation. 3.1 Introduction Chapter 3 Point Estimation Let (Ω, A, P θ ), P θ P = {P θ θ Θ}be probability space, X 1, X 2,..., X n : (Ω, A) (IR k, B k ) random variables (X, B X ) sample space γ : Θ IR k measurable function, i.e.

More information

Lecture 1: Introduction

Lecture 1: Introduction Principles of Statistics Part II - Michaelmas 208 Lecturer: Quentin Berthet Lecture : Introduction This course is concerned with presenting some of the mathematical principles of statistical theory. One

More information

Final Examination Statistics 200C. T. Ferguson June 11, 2009

Final Examination Statistics 200C. T. Ferguson June 11, 2009 Final Examination Statistics 00C T. Ferguson June, 009. (a) Define: X n converges in probability to X. (b) Define: X m converges in quadratic mean to X. (c) Show that if X n converges in quadratic mean

More information

STAT 514 Solutions to Assignment #6

STAT 514 Solutions to Assignment #6 STAT 514 Solutions to Assignment #6 Question 1: Suppose that X 1,..., X n are a simple random sample from a Weibull distribution with density function f θ x) = θcx c 1 exp{ θx c }I{x > 0} for some fixed

More information

Exercises and Answers to Chapter 1

Exercises and Answers to Chapter 1 Exercises and Answers to Chapter The continuous type of random variable X has the following density function: a x, if < x < a, f (x), otherwise. Answer the following questions. () Find a. () Obtain mean

More information

f(x θ)dx with respect to θ. Assuming certain smoothness conditions concern differentiating under the integral the integral sign, we first obtain

f(x θ)dx with respect to θ. Assuming certain smoothness conditions concern differentiating under the integral the integral sign, we first obtain 0.1. INTRODUCTION 1 0.1 Introduction R. A. Fisher, a pioneer in the development of mathematical statistics, introduced a measure of the amount of information contained in an observaton from f(x θ). Fisher

More information

Asymptotic Statistics-VI. Changliang Zou

Asymptotic Statistics-VI. Changliang Zou Asymptotic Statistics-VI Changliang Zou Kolmogorov-Smirnov distance Example (Kolmogorov-Smirnov confidence intervals) We know given α (0, 1), there is a well-defined d = d α,n such that, for any continuous

More information

McGill University. Faculty of Science. Department of Mathematics and Statistics. Part A Examination. Statistics: Theory Paper

McGill University. Faculty of Science. Department of Mathematics and Statistics. Part A Examination. Statistics: Theory Paper McGill University Faculty of Science Department of Mathematics and Statistics Part A Examination Statistics: Theory Paper Date: 10th May 2015 Instructions Time: 1pm-5pm Answer only two questions from Section

More information

Math 494: Mathematical Statistics

Math 494: Mathematical Statistics Math 494: Mathematical Statistics Instructor: Jimin Ding jmding@wustl.edu Department of Mathematics Washington University in St. Louis Class materials are available on course website (www.math.wustl.edu/

More information

The purpose of this section is to derive the asymptotic distribution of the Pearson chi-square statistic. k (n j np j ) 2. np j.

The purpose of this section is to derive the asymptotic distribution of the Pearson chi-square statistic. k (n j np j ) 2. np j. Chapter 9 Pearson s chi-square test 9. Null hypothesis asymptotics Let X, X 2, be independent from a multinomial(, p) distribution, where p is a k-vector with nonnegative entries that sum to one. That

More information

Greene, Econometric Analysis (6th ed, 2008)

Greene, Econometric Analysis (6th ed, 2008) EC771: Econometrics, Spring 2010 Greene, Econometric Analysis (6th ed, 2008) Chapter 17: Maximum Likelihood Estimation The preferred estimator in a wide variety of econometric settings is that derived

More information

4.5.1 The use of 2 log Λ when θ is scalar

4.5.1 The use of 2 log Λ when θ is scalar 4.5. ASYMPTOTIC FORM OF THE G.L.R.T. 97 4.5.1 The use of 2 log Λ when θ is scalar Suppose we wish to test the hypothesis NH : θ = θ where θ is a given value against the alternative AH : θ θ on the basis

More information

Fall 2017 STAT 532 Homework Peter Hoff. 1. Let P be a probability measure on a collection of sets A.

Fall 2017 STAT 532 Homework Peter Hoff. 1. Let P be a probability measure on a collection of sets A. 1. Let P be a probability measure on a collection of sets A. (a) For each n N, let H n be a set in A such that H n H n+1. Show that P (H n ) monotonically converges to P ( k=1 H k) as n. (b) For each n

More information

STAT 135 Lab 6 Duality of Hypothesis Testing and Confidence Intervals, GLRT, Pearson χ 2 Tests and Q-Q plots. March 8, 2015

STAT 135 Lab 6 Duality of Hypothesis Testing and Confidence Intervals, GLRT, Pearson χ 2 Tests and Q-Q plots. March 8, 2015 STAT 135 Lab 6 Duality of Hypothesis Testing and Confidence Intervals, GLRT, Pearson χ 2 Tests and Q-Q plots March 8, 2015 The duality between CI and hypothesis testing The duality between CI and hypothesis

More information

f(y θ) = g(t (y) θ)h(y)

f(y θ) = g(t (y) θ)h(y) EXAM3, FINAL REVIEW (and a review for some of the QUAL problems): No notes will be allowed, but you may bring a calculator. Memorize the pmf or pdf f, E(Y ) and V(Y ) for the following RVs: 1) beta(δ,

More information

Stat 411 Lecture Notes 03 Likelihood and Maximum Likelihood Estimation

Stat 411 Lecture Notes 03 Likelihood and Maximum Likelihood Estimation Stat 411 Lecture Notes 03 Likelihood and Maximum Likelihood Estimation Ryan Martin www.math.uic.edu/~rgmartin Version: August 19, 2013 1 Introduction Previously we have discussed various properties of

More information

LECTURE 10: NEYMAN-PEARSON LEMMA AND ASYMPTOTIC TESTING. The last equality is provided so this can look like a more familiar parametric test.

LECTURE 10: NEYMAN-PEARSON LEMMA AND ASYMPTOTIC TESTING. The last equality is provided so this can look like a more familiar parametric test. Economics 52 Econometrics Professor N.M. Kiefer LECTURE 1: NEYMAN-PEARSON LEMMA AND ASYMPTOTIC TESTING NEYMAN-PEARSON LEMMA: Lesson: Good tests are based on the likelihood ratio. The proof is easy in the

More information

Chapter 7. Hypothesis Testing

Chapter 7. Hypothesis Testing Chapter 7. Hypothesis Testing Joonpyo Kim June 24, 2017 Joonpyo Kim Ch7 June 24, 2017 1 / 63 Basic Concepts of Testing Suppose that our interest centers on a random variable X which has density function

More information

Theoretical Statistics. Lecture 23.

Theoretical Statistics. Lecture 23. Theoretical Statistics. Lecture 23. Peter Bartlett 1. Recall: QMD and local asymptotic normality. [vdv7] 2. Convergence of experiments, maximum likelihood. 3. Relative efficiency of tests. [vdv14] 1 Local

More information

The University of Hong Kong Department of Statistics and Actuarial Science STAT2802 Statistical Models Tutorial Solutions Solutions to Problems 71-80

The University of Hong Kong Department of Statistics and Actuarial Science STAT2802 Statistical Models Tutorial Solutions Solutions to Problems 71-80 The University of Hong Kong Department of Statistics and Actuarial Science STAT2802 Statistical Models Tutorial Solutions Solutions to Problems 71-80 71. Decide in each case whether the hypothesis is simple

More information

Stat 5102 Lecture Slides Deck 3. Charles J. Geyer School of Statistics University of Minnesota

Stat 5102 Lecture Slides Deck 3. Charles J. Geyer School of Statistics University of Minnesota Stat 5102 Lecture Slides Deck 3 Charles J. Geyer School of Statistics University of Minnesota 1 Likelihood Inference We have learned one very general method of estimation: method of moments. the Now we

More information

STA 260: Statistics and Probability II

STA 260: Statistics and Probability II Al Nosedal. University of Toronto. Winter 2017 1 Properties of Point Estimators and Methods of Estimation 2 3 If you can t explain it simply, you don t understand it well enough Albert Einstein. Definition

More information

Theory of Statistics.

Theory of Statistics. Theory of Statistics. Homework V February 5, 00. MT 8.7.c When σ is known, ˆµ = X is an unbiased estimator for µ. If you can show that its variance attains the Cramer-Rao lower bound, then no other unbiased

More information

7. Estimation and hypothesis testing. Objective. Recommended reading

7. Estimation and hypothesis testing. Objective. Recommended reading 7. Estimation and hypothesis testing Objective In this chapter, we show how the election of estimators can be represented as a decision problem. Secondly, we consider the problem of hypothesis testing

More information

STATS 200: Introduction to Statistical Inference. Lecture 29: Course review

STATS 200: Introduction to Statistical Inference. Lecture 29: Course review STATS 200: Introduction to Statistical Inference Lecture 29: Course review Course review We started in Lecture 1 with a fundamental assumption: Data is a realization of a random process. The goal throughout

More information

Testing Statistical Hypotheses

Testing Statistical Hypotheses E.L. Lehmann Joseph P. Romano Testing Statistical Hypotheses Third Edition 4y Springer Preface vii I Small-Sample Theory 1 1 The General Decision Problem 3 1.1 Statistical Inference and Statistical Decisions

More information

14.30 Introduction to Statistical Methods in Economics Spring 2009

14.30 Introduction to Statistical Methods in Economics Spring 2009 MIT OpenCourseWare http://ocw.mit.edu 4.0 Introduction to Statistical Methods in Economics Spring 009 For information about citing these materials or our Terms of Use, visit: http://ocw.mit.edu/terms.

More information

Chapter 6. Order Statistics and Quantiles. 6.1 Extreme Order Statistics

Chapter 6. Order Statistics and Quantiles. 6.1 Extreme Order Statistics Chapter 6 Order Statistics and Quantiles 61 Extreme Order Statistics Suppose we have a finite sample X 1,, X n Conditional on this sample, we define the values X 1),, X n) to be a permutation of X 1,,

More information

Central Limit Theorem ( 5.3)

Central Limit Theorem ( 5.3) Central Limit Theorem ( 5.3) Let X 1, X 2,... be a sequence of independent random variables, each having n mean µ and variance σ 2. Then the distribution of the partial sum S n = X i i=1 becomes approximately

More information

Lecture 32: Asymptotic confidence sets and likelihoods

Lecture 32: Asymptotic confidence sets and likelihoods Lecture 32: Asymptotic confidence sets and likelihoods Asymptotic criterion In some problems, especially in nonparametric problems, it is difficult to find a reasonable confidence set with a given confidence

More information

Ch. 5 Hypothesis Testing

Ch. 5 Hypothesis Testing Ch. 5 Hypothesis Testing The current framework of hypothesis testing is largely due to the work of Neyman and Pearson in the late 1920s, early 30s, complementing Fisher s work on estimation. As in estimation,

More information

Final Exam. 1. (6 points) True/False. Please read the statements carefully, as no partial credit will be given.

Final Exam. 1. (6 points) True/False. Please read the statements carefully, as no partial credit will be given. 1. (6 points) True/False. Please read the statements carefully, as no partial credit will be given. (a) If X and Y are independent, Corr(X, Y ) = 0. (b) (c) (d) (e) A consistent estimator must be asymptotically

More information

Statistics GIDP Ph.D. Qualifying Exam Theory Jan 11, 2016, 9:00am-1:00pm

Statistics GIDP Ph.D. Qualifying Exam Theory Jan 11, 2016, 9:00am-1:00pm Statistics GIDP Ph.D. Qualifying Exam Theory Jan, 06, 9:00am-:00pm Instructions: Provide answers on the supplied pads of paper; write on only one side of each sheet. Complete exactly 5 of the 6 problems.

More information

Maximum Likelihood Tests and Quasi-Maximum-Likelihood

Maximum Likelihood Tests and Quasi-Maximum-Likelihood Maximum Likelihood Tests and Quasi-Maximum-Likelihood Wendelin Schnedler Department of Economics University of Heidelberg 10. Dezember 2007 Wendelin Schnedler (AWI) Maximum Likelihood Tests and Quasi-Maximum-Likelihood10.

More information

Statistics 3858 : Maximum Likelihood Estimators

Statistics 3858 : Maximum Likelihood Estimators Statistics 3858 : Maximum Likelihood Estimators 1 Method of Maximum Likelihood In this method we construct the so called likelihood function, that is L(θ) = L(θ; X 1, X 2,..., X n ) = f n (X 1, X 2,...,

More information

Asymptotic Statistics-III. Changliang Zou

Asymptotic Statistics-III. Changliang Zou Asymptotic Statistics-III Changliang Zou The multivariate central limit theorem Theorem (Multivariate CLT for iid case) Let X i be iid random p-vectors with mean µ and and covariance matrix Σ. Then n (

More information

Statistics and econometrics

Statistics and econometrics 1 / 36 Slides for the course Statistics and econometrics Part 10: Asymptotic hypothesis testing European University Institute Andrea Ichino September 8, 2014 2 / 36 Outline Why do we need large sample

More information

1. Fisher Information

1. Fisher Information 1. Fisher Information Let f(x θ) be a density function with the property that log f(x θ) is differentiable in θ throughout the open p-dimensional parameter set Θ R p ; then the score statistic (or score

More information

Statistics Ph.D. Qualifying Exam

Statistics Ph.D. Qualifying Exam Department of Statistics Carnegie Mellon University May 7 2008 Statistics Ph.D. Qualifying Exam You are not expected to solve all five problems. Complete solutions to few problems will be preferred to

More information

TABLE OF CONTENTS CHAPTER 1 COMBINATORIAL PROBABILITY 1

TABLE OF CONTENTS CHAPTER 1 COMBINATORIAL PROBABILITY 1 TABLE OF CONTENTS CHAPTER 1 COMBINATORIAL PROBABILITY 1 1.1 The Probability Model...1 1.2 Finite Discrete Models with Equally Likely Outcomes...5 1.2.1 Tree Diagrams...6 1.2.2 The Multiplication Principle...8

More information

Introduction to Estimation Methods for Time Series models Lecture 2

Introduction to Estimation Methods for Time Series models Lecture 2 Introduction to Estimation Methods for Time Series models Lecture 2 Fulvio Corsi SNS Pisa Fulvio Corsi Introduction to Estimation () Methods for Time Series models Lecture 2 SNS Pisa 1 / 21 Estimators:

More information

Let us first identify some classes of hypotheses. simple versus simple. H 0 : θ = θ 0 versus H 1 : θ = θ 1. (1) one-sided

Let us first identify some classes of hypotheses. simple versus simple. H 0 : θ = θ 0 versus H 1 : θ = θ 1. (1) one-sided Let us first identify some classes of hypotheses. simple versus simple H 0 : θ = θ 0 versus H 1 : θ = θ 1. (1) one-sided H 0 : θ θ 0 versus H 1 : θ > θ 0. (2) two-sided; null on extremes H 0 : θ θ 1 or

More information

Statistics. Statistics

Statistics. Statistics The main aims of statistics 1 1 Choosing a model 2 Estimating its parameter(s) 1 point estimates 2 interval estimates 3 Testing hypotheses Distributions used in statistics: χ 2 n-distribution 2 Let X 1,

More information

Lecture 17: Likelihood ratio and asymptotic tests

Lecture 17: Likelihood ratio and asymptotic tests Lecture 17: Likelihood ratio and asymptotic tests Likelihood ratio When both H 0 and H 1 are simple (i.e., Θ 0 = {θ 0 } and Θ 1 = {θ 1 }), Theorem 6.1 applies and a UMP test rejects H 0 when f θ1 (X) f

More information

1 Likelihood. 1.1 Likelihood function. Likelihood & Maximum Likelihood Estimators

1 Likelihood. 1.1 Likelihood function. Likelihood & Maximum Likelihood Estimators Likelihood & Maximum Likelihood Estimators February 26, 2018 Debdeep Pati 1 Likelihood Likelihood is surely one of the most important concepts in statistical theory. We have seen the role it plays in sufficiency,

More information

Mathematical statistics

Mathematical statistics October 4 th, 2018 Lecture 12: Information Where are we? Week 1 Week 2 Week 4 Week 7 Week 10 Week 14 Probability reviews Chapter 6: Statistics and Sampling Distributions Chapter 7: Point Estimation Chapter

More information

Testing Statistical Hypotheses

Testing Statistical Hypotheses E.L. Lehmann Joseph P. Romano, 02LEu1 ttd ~Lt~S Testing Statistical Hypotheses Third Edition With 6 Illustrations ~Springer 2 The Probability Background 28 2.1 Probability and Measure 28 2.2 Integration.........

More information

Statistics 135 Fall 2008 Final Exam

Statistics 135 Fall 2008 Final Exam Name: SID: Statistics 135 Fall 2008 Final Exam Show your work. The number of points each question is worth is shown at the beginning of the question. There are 10 problems. 1. [2] The normal equations

More information

Unbiased Estimation. Binomial problem shows general phenomenon. An estimator can be good for some values of θ and bad for others.

Unbiased Estimation. Binomial problem shows general phenomenon. An estimator can be good for some values of θ and bad for others. Unbiased Estimation Binomial problem shows general phenomenon. An estimator can be good for some values of θ and bad for others. To compare ˆθ and θ, two estimators of θ: Say ˆθ is better than θ if it

More information

Cherry Blossom run (1) The credit union Cherry Blossom Run is a 10 mile race that takes place every year in D.C. In 2009 there were participants

Cherry Blossom run (1) The credit union Cherry Blossom Run is a 10 mile race that takes place every year in D.C. In 2009 there were participants 18.650 Statistics for Applications Chapter 5: Parametric hypothesis testing 1/37 Cherry Blossom run (1) The credit union Cherry Blossom Run is a 10 mile race that takes place every year in D.C. In 2009

More information

Statistical Inference: Estimation and Confidence Intervals Hypothesis Testing

Statistical Inference: Estimation and Confidence Intervals Hypothesis Testing Statistical Inference: Estimation and Confidence Intervals Hypothesis Testing 1 In most statistics problems, we assume that the data have been generated from some unknown probability distribution. We desire

More information

Chapters 10. Hypothesis Testing

Chapters 10. Hypothesis Testing Chapters 10. Hypothesis Testing Some examples of hypothesis testing 1. Toss a coin 100 times and get 62 heads. Is this coin a fair coin? 2. Is the new treatment on blood pressure more effective than the

More information

Parametric Inference

Parametric Inference Parametric Inference Moulinath Banerjee University of Michigan April 14, 2004 1 General Discussion The object of statistical inference is to glean information about an underlying population based on a

More information

Mathematics Ph.D. Qualifying Examination Stat Probability, January 2018

Mathematics Ph.D. Qualifying Examination Stat Probability, January 2018 Mathematics Ph.D. Qualifying Examination Stat 52800 Probability, January 2018 NOTE: Answers all questions completely. Justify every step. Time allowed: 3 hours. 1. Let X 1,..., X n be a random sample from

More information

Introducing the Normal Distribution

Introducing the Normal Distribution Department of Mathematics Ma 3/13 KC Border Introduction to Probability and Statistics Winter 219 Lecture 1: Introducing the Normal Distribution Relevant textbook passages: Pitman [5]: Sections 1.2, 2.2,

More information

Unbiased Estimation. Binomial problem shows general phenomenon. An estimator can be good for some values of θ and bad for others.

Unbiased Estimation. Binomial problem shows general phenomenon. An estimator can be good for some values of θ and bad for others. Unbiased Estimation Binomial problem shows general phenomenon. An estimator can be good for some values of θ and bad for others. To compare ˆθ and θ, two estimators of θ: Say ˆθ is better than θ if it

More information

exp{ (x i) 2 i=1 n i=1 (x i a) 2 (x i ) 2 = exp{ i=1 n i=1 n 2ax i a 2 i=1

exp{ (x i) 2 i=1 n i=1 (x i a) 2 (x i ) 2 = exp{ i=1 n i=1 n 2ax i a 2 i=1 4 Hypothesis testing 4. Simple hypotheses A computer tries to distinguish between two sources of signals. Both sources emit independent signals with normally distributed intensity, the signals of the first

More information

Problems ( ) 1 exp. 2. n! e λ and

Problems ( ) 1 exp. 2. n! e λ and Problems The expressions for the probability mass function of the Poisson(λ) distribution, and the density function of the Normal distribution with mean µ and variance σ 2, may be useful: ( ) 1 exp. 2πσ

More information

STA 732: Inference. Notes 4. General Classical Methods & Efficiency of Tests B&D 2, 4, 5

STA 732: Inference. Notes 4. General Classical Methods & Efficiency of Tests B&D 2, 4, 5 STA 732: Inference Notes 4. General Classical Methods & Efficiency of Tests B&D 2, 4, 5 1 Going beyond ML with ML type methods 1.1 When ML fails Example. Is LSAT correlated with GPA? Suppose we want to

More information

Statistical Theory MT 2006 Problems 4: Solution sketches

Statistical Theory MT 2006 Problems 4: Solution sketches Statistical Theory MT 006 Problems 4: Solution sketches 1. Suppose that X has a Poisson distribution with unknown mean θ. Determine the conjugate prior, and associate posterior distribution, for θ. Determine

More information

Statistical Theory MT 2007 Problems 4: Solution sketches

Statistical Theory MT 2007 Problems 4: Solution sketches Statistical Theory MT 007 Problems 4: Solution sketches 1. Consider a 1-parameter exponential family model with density f(x θ) = f(x)g(θ)exp{cφ(θ)h(x)}, x X. Suppose that the prior distribution has the

More information

The outline for Unit 3

The outline for Unit 3 The outline for Unit 3 Unit 1. Introduction: The regression model. Unit 2. Estimation principles. Unit 3: Hypothesis testing principles. 3.1 Wald test. 3.2 Lagrange Multiplier. 3.3 Likelihood Ratio Test.

More information

MAS223 Statistical Inference and Modelling Exercises

MAS223 Statistical Inference and Modelling Exercises MAS223 Statistical Inference and Modelling Exercises The exercises are grouped into sections, corresponding to chapters of the lecture notes Within each section exercises are divided into warm-up questions,

More information

Hypothesis Testing: The Generalized Likelihood Ratio Test

Hypothesis Testing: The Generalized Likelihood Ratio Test Hypothesis Testing: The Generalized Likelihood Ratio Test Consider testing the hypotheses H 0 : θ Θ 0 H 1 : θ Θ \ Θ 0 Definition: The Generalized Likelihood Ratio (GLR Let L(θ be a likelihood for a random

More information

Statistics Ph.D. Qualifying Exam: Part I October 18, 2003

Statistics Ph.D. Qualifying Exam: Part I October 18, 2003 Statistics Ph.D. Qualifying Exam: Part I October 18, 2003 Student Name: 1. Answer 8 out of 12 problems. Mark the problems you selected in the following table. 1 2 3 4 5 6 7 8 9 10 11 12 2. Write your answer

More information

Probability and Statistics

Probability and Statistics Probability and Statistics 1 Contents some stochastic processes Stationary Stochastic Processes 2 4. Some Stochastic Processes 4.1 Bernoulli process 4.2 Binomial process 4.3 Sine wave process 4.4 Random-telegraph

More information

Summary of Chapters 7-9

Summary of Chapters 7-9 Summary of Chapters 7-9 Chapter 7. Interval Estimation 7.2. Confidence Intervals for Difference of Two Means Let X 1,, X n and Y 1, Y 2,, Y m be two independent random samples of sizes n and m from two

More information

For iid Y i the stronger conclusion holds; for our heuristics ignore differences between these notions.

For iid Y i the stronger conclusion holds; for our heuristics ignore differences between these notions. Large Sample Theory Study approximate behaviour of ˆθ by studying the function U. Notice U is sum of independent random variables. Theorem: If Y 1, Y 2,... are iid with mean µ then Yi n µ Called law of

More information

Nonparametric hypothesis tests and permutation tests

Nonparametric hypothesis tests and permutation tests Nonparametric hypothesis tests and permutation tests 1.7 & 2.3. Probability Generating Functions 3.8.3. Wilcoxon Signed Rank Test 3.8.2. Mann-Whitney Test Prof. Tesler Math 283 Fall 2018 Prof. Tesler Wilcoxon

More information

Solution: First note that the power function of the test is given as follows,

Solution: First note that the power function of the test is given as follows, Problem 4.5.8: Assume the life of a tire given by X is distributed N(θ, 5000 ) Past experience indicates that θ = 30000. The manufacturere claims the tires made by a new process have mean θ > 30000. Is

More information

MATH4427 Notebook 2 Fall Semester 2017/2018

MATH4427 Notebook 2 Fall Semester 2017/2018 MATH4427 Notebook 2 Fall Semester 2017/2018 prepared by Professor Jenny Baglivo c Copyright 2009-2018 by Jenny A. Baglivo. All Rights Reserved. 2 MATH4427 Notebook 2 3 2.1 Definitions and Examples...................................

More information

Practice Problems Section Problems

Practice Problems Section Problems Practice Problems Section 4-4-3 4-4 4-5 4-6 4-7 4-8 4-10 Supplemental Problems 4-1 to 4-9 4-13, 14, 15, 17, 19, 0 4-3, 34, 36, 38 4-47, 49, 5, 54, 55 4-59, 60, 63 4-66, 68, 69, 70, 74 4-79, 81, 84 4-85,

More information

Statistical Methods for Handling Incomplete Data Chapter 2: Likelihood-based approach

Statistical Methods for Handling Incomplete Data Chapter 2: Likelihood-based approach Statistical Methods for Handling Incomplete Data Chapter 2: Likelihood-based approach Jae-Kwang Kim Department of Statistics, Iowa State University Outline 1 Introduction 2 Observed likelihood 3 Mean Score

More information

Relative efficiency. Patrick Breheny. October 9. Theoretical framework Application to the two-group problem

Relative efficiency. Patrick Breheny. October 9. Theoretical framework Application to the two-group problem Relative efficiency Patrick Breheny October 9 Patrick Breheny STA 621: Nonparametric Statistics 1/15 Relative efficiency Suppose test 1 requires n 1 observations to obtain a certain power β, and that test

More information

STAT 512 sp 2018 Summary Sheet

STAT 512 sp 2018 Summary Sheet STAT 5 sp 08 Summary Sheet Karl B. Gregory Spring 08. Transformations of a random variable Let X be a rv with support X and let g be a function mapping X to Y with inverse mapping g (A = {x X : g(x A}

More information