BEST TESTS

Abstract. We will discuss the Neyman-Pearson theorem and certain best tests, in which the power function is optimized.

1. Most powerful tests

Let {f_θ}_{θ ∈ Θ} be a family of pdfs. We will consider the simple case where Θ = {θ_0, θ_1}, so that the family contains only two pdfs. Let θ ∈ Θ, where θ is unknown. Consider the following simple hypothesis test with null hypothesis H_0 : θ = θ_0 and critical region C, so that we reject H_0 if X ∈ C. Suppose α = P_{θ_0}(X ∈ C). We say that the test is a best test, at size α, if for any other region A with P_{θ_0}(X ∈ A) ≤ α, we have P_{θ_1}(X ∈ C) ≥ P_{θ_1}(X ∈ A). Best tests are sometimes also called most powerful tests. In these notes we will sometimes write P_0 = P_{θ_0} and P_1 = P_{θ_1}.

Exercise 1. Suppose that f_0 is the pdf of a N(0, 1) random variable and f_1 is the pdf of a N(1, 1) random variable. We wish to test the hypothesis H_0 : µ = 0 versus H_1 : µ = 1, and we have only one single sample X. Consider the set C = {x ∈ R : 1 ≤ x ≤ 2}. Show that, as a critical region, the set C does not correspond to a best test.

Solution. Let b be such that P_0(X ≥ b) = P_0(X ∈ C), and set A := {x ∈ R : x ≥ b}. In fact, P_0(X ∈ C) ≈ 0.1359 and b ≈ 1.099. We claim that A gives a rejection region with more power. We have that

P_1(X ∈ C) = P_0(0 ≤ X ≤ 1) ≈ 0.34134,

whereas

P_1(X ∈ A) = P_0(X ≥ b − 1) ≈ 0.4606.

2. Randomization

Consider the following simple test. We have one sample X ∼ N(µ, 1) and we want to test H_0 : µ = 0 against H_1 : µ = 1. It is easy to define a test with exact size α, for any α ∈ (0, 1); in particular, we even have notation for this: P_0(X > z_α) = α, so that we can consider the
test φ(x) = 1[x > z_α], where we reject H_0 if φ(x) = 1. This is possible since X is a continuous random variable. When X is a discrete random variable, this is no longer possible without additional randomization.

Consider the following randomized test. We have one sample X ∼ Bern(p), and we want to test H_0 : p = 1/2 against H_1 : p = 3/4. Any regular non-randomized test will reject H_0, when H_0 is true, with probability 0, 1/2, or 1. However, we can get other values of α in the following way. Suppose α < 1/2. Let φ(x) = 1[x = 1] · 2α. Let U ∼ Unif[0, 1] be independent of X. We reject H_0 if U ≤ φ(X); in other words, if X = 1, we reject H_0 with probability 2α, so that E_0 φ(X) = α is the probability that we reject H_0 when H_0 is true.

Let X = (X_1, . . . , X_n) be a random sample from f_θ, where θ ∈ {θ_0, θ_1}. Consider the hypothesis test of H_0 : θ = θ_0 with critical function φ. In the randomized setting, the power function is given by β_φ(θ) = E_θ φ(X). The power of a test is given by β_φ(θ_1). We say that a critical function φ defines a best test at level α if E_0 φ(X) ≤ α and, for all critical functions φ′ with E_0 φ′(X) ≤ α, we have β_φ(θ_1) ≥ β_{φ′}(θ_1).

Theorem 2 (Neyman-Pearson). Let X = (X_1, . . . , X_n) be a random sample from f_θ, where θ ∈ {θ_0, θ_1}. Consider the null hypothesis H_0 : θ = θ_0. Let α ∈ (0, 1). Set

R(X) := L(X; θ_0) / L(X; θ_1).

There exist a critical function φ and a constant k > 0 such that

(a) E_0 φ(X) = α, and

(b) φ(x) = 1 when R(x) < k, and φ(x) = 0 when R(x) > k.

Moreover, if a critical function satisfies both conditions, then it is a most powerful (randomized) test at level α. In addition, if φ′ is (another) most powerful (randomized) test at level α, then it satisfies the second condition, and it also satisfies the first condition, except in the case where there is a test of size α′ < α with power 1.

Note that if x is such that L(x; θ_1) = 0 and L(x; θ_0) > 0, then it does not make any practical sense to reject H_0 if x is observed.
Similarly, if L(x; θ_0) = 0 and L(x; θ_1) > 0, then we should reject H_0 if x is observed.
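The numerical claims of Exercise 1 can be checked directly, and the likelihood-ratio criterion of Theorem 2 explains why the one-sided region beats the interval: for these two normals, R(x) < k is equivalent to x exceeding a threshold. The following Python sketch is an added illustration (not part of the formal development), using only the standard library; the normal cdf Φ is computed via math.erf, and the threshold b is found by bisection.

```python
import math

def norm_cdf(x):
    # Standard normal cdf via the error function.
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def lik_ratio(x):
    # R(x) = f0(x)/f1(x) for f0 = N(0,1), f1 = N(1,1):
    # exp(-x^2/2) / exp(-(x-1)^2/2) = exp(1/2 - x),
    # so R(x) < k iff x > 1/2 - log(k): an upper-tail region.
    return math.exp(0.5 - x)

# Size of the interval test C = [1, 2] under H0.
alpha = norm_cdf(2.0) - norm_cdf(1.0)        # about 0.1359

# Find b with P0(X >= b) = alpha by bisection on the cdf.
lo, hi = 0.0, 10.0
for _ in range(100):
    mid = (lo + hi) / 2.0
    if 1.0 - norm_cdf(mid) > alpha:
        lo = mid
    else:
        hi = mid
b = (lo + hi) / 2.0                          # about 1.099

# Powers under H1 (shift by the mean 1).
power_C = norm_cdf(1.0) - norm_cdf(0.0)      # P1(1 <= X <= 2), about 0.3413
power_A = 1.0 - norm_cdf(b - 1.0)            # P1(X >= b),      about 0.4606
print(alpha, b, power_C, power_A)
```

The one-sided region has the same size but strictly more power, as the solution of Exercise 1 asserts.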
Let us remark that in Theorem 2 we do not specify what happens to φ when R(X) = k; it is on this event that randomization is necessary: on this event φ may take values in (0, 1). Often, when the random variables involved are continuous, R(X) = k happens with probability zero, and when the random variables involved are discrete, we may require additional randomization.

The idea of the proof of Theorem 2 is nice. Consider the case where the X_i take values in a countable set A; we want to find C ⊆ A^n that maximizes

P_1(X ∈ C) = Σ_{x ∈ C} L(x; θ_1)

subject to

P_0(X ∈ C) = Σ_{x ∈ C} L(x; θ_0) ≤ α.

Which elements of A^n should be allowed in the set C? Think of each element x ∈ C as having a cost L(x; θ_0) and a value L(x; θ_1). One guess would be the elements x ∈ A^n that have high relative value; that is, x ∈ A^n where R(x) is small; how small depends on α. So, one way to build the set C is to order the elements of A^n in terms of R(x), and add elements to C starting from high relative value to low. However, as the cumulative cost approaches α, we may be forced to break the order, that is, choose an element of lower relative value, and/or stop before reaching the spending limit α. Randomization solves this problem, as it allows us to spend the maximum limit α.

Exercise 3. Let X be an integer-valued random variable with pdf f ∈ {f_0, f_1}, where f_0 is the discrete uniform distribution on the 13 numbers {0, 1, 2, . . . , 12} and f_1 is the tent function given by f_1(x) = x/36 for all x ∈ {0, 1, . . . , 6} and f_1(x) = 1/3 − x/36 for all x ∈ {7, 8, . . . , 12}. Consider the null hypothesis H_0 : f = f_0. On the basis of one single observation X, find the best test at significance level α = 3/13 ≈ 0.23. Define a randomized best test at level α = 0.25. Find the power of your tests; that is, compute the power function at the alternative hypothesis.

Solution. Consider the set C = {5, 6, 7} and the critical function 1[X ∈ C], so that we reject H_0 if X ∈ C. We have that P_0(X ∈ C) = 3/13 = α.
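This critical region is exactly what the cost-versus-value ordering produces. As an added check (not part of the original solution), the following Python sketch orders the sample points by R(x) and recovers C and its size; the fractions module keeps the arithmetic exact.

```python
from fractions import Fraction as F

# Exercise 3: f0 uniform on {0,...,12}, f1 the tent pdf.
f0 = {x: F(1, 13) for x in range(13)}
f1 = {x: F(x, 36) if x <= 6 else F(1, 3) - F(x, 36) for x in range(13)}

# Likelihood ratio R(x) = f0(x)/f1(x); None encodes R(x) = infinity
# (where f1(x) = 0, we should never reject).
R = {x: f0[x] / f1[x] if f1[x] > 0 else None for x in range(13)}

# Order states by R, smallest first, as in the cost-vs-value heuristic.
finite = sorted((x for x in range(13) if R[x] is not None), key=lambda x: R[x])
C = finite[:3]                     # the three smallest-ratio states
size = sum(f0[x] for x in C)
print(sorted(C), size)             # prints: [5, 6, 7] 3/13
```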
The power of this test is given by P_1(X ∈ C) = 5/36 + 6/36 + 5/36 = 4/9. In order to show that it is a best test, we appeal to Theorem 2. We need to find a k such that R(X) < k if and only if X ∈ C. Note that R(6) = (1/13)/(6/36) ≈ 0.461, R(5) = R(7) = (1/13)/(5/36) ≈ 0.553,
and R(4) = R(8) = (1/13)/(4/36) ≈ 0.69. Take k = 0.6; then R(X) < k if and only if X ∈ {5, 6, 7}.

If α = 0.25, then we can apply randomization in the following way. Note that P_0(X = 4) = 1/13, so if we expanded our critical set to contain 4, then P_0(X ∈ {4, 5, 6, 7}) = 4/13 ≈ 0.308 > 0.25. Moreover, if we wanted to apply Theorem 2, we would be forced to also include 8, since R(4) = R(8). Consider the test that is exactly the same as before, except that when X = 4, we reject H_0 with probability 1/4; that is, set φ(x) = 1 if x ∈ {5, 6, 7}, φ(x) = 0 if x ∈ {0, 1, 2, 3, 8, 9, 10, 11, 12}, and φ(4) = 1/4. Clearly, E_0 φ(X) = 3/13 + (1/13)(1/4) = 1/4 = α. The power is given by 4/9 + (1/9)(1/4) = 17/36. Notice that we can take k = R(4); then Theorem 2 applies.

Let us remark that, referring to Exercise 3, in practice one would prefer the non-randomized best test at level α = 3/13 over the randomized best test at level α = 0.25.

Exercise 4. Let X be a continuous random variable with pdf f ∈ {f_0, f_1}, where f_0 is the pdf of a uniform distribution on [0, 1] and f_1 is the pdf of a uniform distribution on [0, 2]. Consider the null hypothesis H_0 : f = f_0. On the basis of one single observation X, find the best test at significance level α.

Exercise 5. Let X be a continuous random variable with pdf given by f(x; θ) = θ x^{θ−1} 1[x ∈ (0, 1)], where θ ∈ {1, 2}. Consider the null hypothesis H_0 : θ = 1. On the basis of one single observation X, find the best test at significance level α.

Solution. By Theorem 2, we want to find k > 0 so that P_0(R < k) = α; we reject H_0 if R < k. Notice that under H_0, X is uniformly distributed on [0, 1]. We have that R(X) = 1/(2X), so that

P_0(R < k) = P_0(1/(2k) < X) = 1 − 1/(2k) = α;

thus k = 1/(2(1 − α)).

Exercise 6. Let X = (X_1, . . . , X_n) be a random sample where X_1 ∼ N(µ, 1), with µ ∈ {0, 1}. Consider the null hypothesis H_0 : µ = 0. On the basis of the random sample X, find the best test at significance level α.

Solution.
By Theorem 2, we want to find k > 0 so that P_0(R < k) = α. Let T = X_1 + · · · + X_n. We know that T ∼ N(nµ, n) and that T is a sufficient
statistic for µ; in particular, we know that

L(x; µ) = g(t(x); µ) h(x),

where h does not depend on µ and g(t; µ) is the pdf of T. So that

R(X) = g(T; 0)/g(T; 1) = exp[−T + n/2].

Thus R(X) < k if and only if −T + n/2 < log k, if and only if

Z := T/√n > (n/2 − log k)/√n =: c(k).

Notice that under H_0, we have Z ∼ N(0, 1). Choose k so that c(k) = z_α.

Exercise 7. Let X = (X_1, . . . , X_n) be a random sample where X_1 is an exponential random variable with mean µ ∈ {2, 3}. Consider the null hypothesis H_0 : µ = 2. On the basis of the random sample X, find the best test at significance level α.

Proof of Theorem 2. Let F(t) = P_0(R(X) ≤ t) be the cdf of R(X) under H_0. We have that lim_{t→−∞} F(t) = 0 and lim_{t→∞} F(t) = 1 (assuming that P_0(R(X) = ∞) = 0). Recall that F is right-continuous, so that lim_{t→a+} F(t) = F(a); however, it may not be left-continuous. We set

F(a−) := lim_{t→a−} F(t) = P_0(R(X) < a).

We have that F(a−) ≤ F(a) and P_0(R(X) = a) = F(a) − F(a−). Given α ∈ (0, 1), let k > 0 be a point such that F(k−) ≤ α ≤ F(k). (We may be forced to take k = ∞ if P_0(R(X) = ∞) > 0.) Set

φ(x) := 1[R(x) < k] + [(α − F(k−)) / P_0(R(X) = k)] · 1[R(x) = k]

if P_0(R(X) = k) > 0; otherwise, set φ(x) := 1[R(x) < k]. Clearly, E_0 φ(X) = α, so that we have constructed a critical function with the required two properties. Moreover, our φ has the property that it is constant when R(x) = k.
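This construction can be carried out mechanically in the discrete case. The Python sketch below is an added illustration (standard library only, exact arithmetic via fractions): it reuses the distributions of Exercise 3 with target level α = 1/4, scans the values of R in increasing order to locate k with F(k−) ≤ α ≤ F(k), and sets the randomization probability (α − F(k−))/P_0(R = k) on the boundary. Note that, unlike the test in Exercise 3 which randomizes only at the point 4, this construction randomizes at both boundary points 4 and 8 with probability 1/8; both tests have size 1/4 and power 17/36.

```python
from fractions import Fraction as F

# Distributions of Exercise 3; target level alpha = 1/4.
f0 = {x: F(1, 13) for x in range(13)}
f1 = {x: F(x, 36) if x <= 6 else F(1, 3) - F(x, 36) for x in range(13)}
alpha = F(1, 4)

def ratio(x):
    # R(x) = f0(x)/f1(x); None encodes R(x) = infinity (never reject there).
    return f0[x] / f1[x] if f1[x] > 0 else None

# Scan distinct values of R in increasing order, accumulating
# F(r-) = P0(R < r), until F(k-) <= alpha <= F(k).
support = [x for x in range(13) if ratio(x) is not None]
levels = sorted({ratio(x) for x in support})
F_minus = F(0)
for r in levels:
    mass_at = sum(f0[x] for x in support if ratio(x) == r)  # P0(R = r)
    if F_minus + mass_at >= alpha:
        k = r
        break
    F_minus += mass_at

# Randomize on the boundary {R = k} so that the size is exactly alpha.
gamma = (alpha - F_minus) / mass_at

def phi(x):
    r = ratio(x)
    if r is None or r > k:
        return F(0)
    return F(1) if r < k else gamma

size = sum(phi(x) * f0[x] for x in range(13))   # equals alpha = 1/4
power = sum(phi(x) * f1[x] for x in range(13))  # equals 17/36
print(k, gamma, size, power)
```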
Suppose now that φ is a critical function that satisfies the two properties; we will show that φ is a best test at level α. Let φ′ be another critical function with E_0 φ′(X) ≤ α. Note that

[φ(x) − φ′(x)][L(x; θ_0) − kL(x; θ_1)] ≤ 0,

since the two factors never have the same strict sign: when the second factor is negative, φ(x) = 1 ≥ φ′(x), and when it is positive, φ(x) = 0 ≤ φ′(x). Write dx = dx_1 · · · dx_n. In the case that the X_i are continuous random variables, we have that

∫ [φ(x) − φ′(x)][L(x; θ_0) − kL(x; θ_1)] dx ≤ 0.

This gives us a bound on the difference of the powers, since it implies that

β_φ(θ_1) − β_{φ′}(θ_1) = ∫ [φ(x) − φ′(x)] L(x; θ_1) dx ≥ (1/k) ∫ [φ(x) − φ′(x)] L(x; θ_0) dx = (1/k)[α − E_0 φ′(X)] ≥ 0.

In the discrete case, one replaces the integrals by sums.

Finally, suppose φ′ is a most powerful test. Let φ be a most powerful test satisfying the two conditions. We will show that φ and φ′ are equal on the set {x : R(x) ≠ k}. Towards a contradiction, let

D = {x : φ(x) − φ′(x) ≠ 0} ∩ {x : R(x) ≠ k}.

For the continuous case, assume that D has positive Lebesgue measure; that is, ∫ 1[x ∈ D] dx > 0. Now we have that for x ∈ D,

[φ(x) − φ′(x)][L(x; θ_0) − kL(x; θ_1)] < 0,

from which we deduce that φ is strictly more powerful than φ′, a contradiction. In the discrete case, we need only assume that D is non-empty for a similar contradiction. Thus φ′ satisfies the second condition. In order to argue that E_0 φ′(X) = α, we note that if E_0 φ′(X) < α and E_1 φ′(X) < 1, then we could include more points to be (randomly) rejected, thereby
increasing the power. Thus we must have that the power is 1 or the size is α.

Exercise 8. Find an example where the power is 1 and the size is not 1.

Exercise 9. Let X = (X_1, . . . , X_n) be a random sample from f_θ, where θ ∈ {θ_0, θ_1}. Consider the null hypothesis H_0 : θ = θ_0. Let α ∈ (0, 1). Suppose φ is a critical function with E_0 φ(X) < α and E_1 φ(X) < 1. Show that there exists a critical function φ′ with φ′ ≥ φ, E_0 φ′(X) ≤ α, and E_1 φ′(X) > E_1 φ(X).

Corollary 10. In the context of Theorem 2, if b is the power of a most powerful test at level α ∈ (0, 1), then α < b, unless we are in the trivial case that f_{θ_0} = f_{θ_1}.

Proof of Corollary 10. Consider the test which ignores the data: D(x) = α for all x. Clearly, E_0 D(X) = α and E_1 D(X) = α, so the critical function D gives a test of size α with power α. Hence we must have α ≤ b. Moreover, if α = b, then D is also a most powerful test, so that D satisfies the second condition of Theorem 2; since α ∈ (0, 1), this forces the condition that L(X; θ_0) = kL(X; θ_1) for some k, which in turn forces k = 1, from which we deduce that f_{θ_0} = f_{θ_1}.

Sometimes a test with the property that the significance level is no greater than the power is called unbiased. Corollary 10 gives that a best test is unbiased.
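Corollary 10 can be illustrated with the best test of Exercise 3: its power 4/9 strictly exceeds its significance level 3/13, while the data-ignoring test D has power equal to its level. A short Python sketch (an added illustration, standard library only, exact arithmetic via fractions):

```python
from fractions import Fraction as F

# Distributions of Exercise 3.
f0 = {x: F(1, 13) for x in range(13)}
f1 = {x: F(x, 36) if x <= 6 else F(1, 3) - F(x, 36) for x in range(13)}

# Trivial test D: reject with probability alpha regardless of the data.
alpha = F(3, 13)
size_D = alpha           # E0 D(X) = alpha
power_D = alpha          # E1 D(X) = alpha: level equals power, no better

# Best test from Exercise 3: reject iff X in {5, 6, 7}.
C = {5, 6, 7}
size_C = sum(f0[x] for x in C)   # 3/13, same significance level
power_C = sum(f1[x] for x in C)  # 4/9, strictly larger than the level

print(power_C > power_D)  # prints: True (the best test is unbiased)
```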