
Nonparametric one-sided testing for the mean and related extremum problems

Norbert Gaffke
University of Magdeburg, Faculty of Mathematics
D-39016 Magdeburg, PF 4120, Germany
E-mail: norbert.gaffke@mathematik.uni-magdeburg.de

Abstract

We consider the nonparametric model of $n$ i.i.d. nonnegative real random variables whose distribution is unknown. An interesting parameter of that distribution is its expectation $\mu$. Wang & Zhao (2003) studied the problem of testing the one-sided hypotheses $H_0 : \mu \le \mu_0$ vs. $H_1 : \mu > \mu_0$ (with a given $\mu_0 > 0$, where w.l.g. one may take $\mu_0 = 1$). For $n = 1$ there is a UMP nonrandomized level-$\alpha$ test. Somewhat surprisingly, for $n = 2$ Wang & Zhao obtained a UMP nonrandomized monotone symmetric level-$\alpha$ test. However, they conjectured that the result does not carry over to larger sample sizes $n \ge 3$. Unfortunately, their conjecture is true, as we will show. Also, we present an alternative proof of their (positive) result for $n = 2$. Our derivations are based on a study of related classes of extremum problems on products of probability measures.

Key words: Monotone symmetric test, UMP test, order statistics, probability measures, weak topology, semi-continuity.

2000 Mathematics Subject Classification: Primary 62G10, secondary 28C15.

1 Introduction

Let $X_1, \ldots, X_n$ be nonnegative real i.i.d. random variables whose distribution $P^{X_i}$ is unknown. We are interested in the expectation parameter $\mu = E(X_i)$. Note that $\mu \in [0, \infty]$. Consider the one-sided testing problem $H_0 : \mu \le \mu_0$ vs. $H_1 : \mu > \mu_0$, for a given $\mu_0 > 0$. W.l.g. we may assume that $\mu_0 = 1$, since for any given value $\mu_0 > 0$ we can transform to that case via $\tilde X_i = X_i/\mu_0$ ($1 \le i \le n$). So in the following we will restrict to the testing problem

$$H_0 : \mu \le 1 \quad \text{vs.} \quad H_1 : \mu > 1. \qquad (1)$$

The paper [1] deals with testing the one-sided hypotheses (1). For $n = 1$ the UMP nonrandomized level-$\alpha$ test is given by $\phi = 1_{[1/\alpha,\, \infty)}$. For $n = 2$, a UMP test was derived in [1] within a class of tests slightly larger than the class of all nonrandomized monotone (nondecreasing) symmetric level-$\alpha$ tests (see Section 3).

This is a highly nontrivial and somewhat surprising result. Note that a test $\phi$ is said to be symmetric iff it is a symmetric function of the observations $x_1, \ldots, x_n$, and $\phi$ is said to be monotone (nondecreasing) iff $x, z \in [0, \infty)^n$ and $x \le z$ (w.r.t. the componentwise semi-ordering) imply $\phi(x) \le \phi(z)$. Unfortunately, that result cannot be extended to larger sample sizes $n \ge 3$, as we will show in Section 5 and as was conjectured in [1], p. 76. Also, we will give a proof of the positive result for $n = 2$ which, as we think, is more transparent than the one presented in [1]. Our derivations are based on a study of related classes of extremum problems. E.g., an extremum problem arises when the true level of a given test for (1) has to be determined. The auxiliary results on such extremum problems presented in Section 2 might also be of some independent interest.

2 Related extremum problems

Let $S$ be a nonempty compact subset of $\mathbb{R}^n$, and let $\mathrm{Prob}(S)$ denote the set of all Borel probability measures on $S$. Endowed with the weak topology (inducing weak convergence), $\mathrm{Prob}(S)$ is a compact space (cf. [2], Satz 31.2 and Bemerkung 1 on p. 237). Clearly, $\mathrm{Prob}(S)$ is a convex set, i.e., if $Q_1, Q_2 \in \mathrm{Prob}(S)$ and $0 \le \lambda \le 1$ then $\lambda Q_1 + (1-\lambda)Q_2 \in \mathrm{Prob}(S)$. A real function $g : S \to \mathbb{R}$ is said to be upper semi-continuous (u.s.c.) iff for any given $u \in \mathbb{R}$ the set $\{ z \in S : g(z) \ge u \}$ is closed. Equivalently, $g$ is u.s.c. iff for any sequence $z_k \in S$ ($k \in \mathbb{N}$) converging to some $z_0 \in S$ one has $\limsup_k g(z_k) \le g(z_0)$. Similarly, we have the notion of upper semi-continuity of a real function on $\mathrm{Prob}(S)$: a function $G : \mathrm{Prob}(S) \to \mathbb{R}$ is said to be u.s.c. iff for any given $u \in \mathbb{R}$ the set $\{ Q \in \mathrm{Prob}(S) : G(Q) \ge u \}$ is a closed set w.r.t. the weak topology. Equivalently, the u.s.c. property of $G$ means that whenever $Q_k \in \mathrm{Prob}(S)$ ($k \in \mathbb{N}$) is a sequence converging weakly to some $Q_0 \in \mathrm{Prob}(S)$ then $\limsup_k G(Q_k) \le G(Q_0)$.

Lemma 2.1 Let $S$ be a nonempty compact subset of $\mathbb{R}^n$ and $g : S \to \mathbb{R}$ be a nonnegative u.s.c. function. Then, by $G(Q) = E_Q(g)$ (the expectation of $g$ w.r.t. $Q$), for all $Q \in \mathrm{Prob}(S)$, we have a nonnegative real u.s.c. function on $\mathrm{Prob}(S)$.

Proof. Firstly, we consider the special case that $g$ is the indicator function of a closed subset $C$ of $S$, i.e., $g = 1_C$ on $S$. Then $G(Q) = Q(C)$ for all $Q \in \mathrm{Prob}(S)$. If $Q_k$ ($k \in \mathbb{N}$) is a sequence in $\mathrm{Prob}(S)$ which converges weakly to some $Q_0 \in \mathrm{Prob}(S)$, then by [3], Theorem 2.1, we have $\limsup_k Q_k(C) \le Q_0(C)$, i.e., the function $G$ is u.s.c.

Now let $g$ be an arbitrary nonnegative u.s.c. function on $S$. As a u.s.c. function on the compact set $S$, $g$ is bounded above, and hence $0 \le g(z) \le v$ for all $z \in S$ with some finite $v > 0$. In particular, $G(Q) = E_Q(g)$ is nonnegative and finite for all $Q \in \mathrm{Prob}(S)$. Also, by a well-known formula for expectations of nonnegative functions, we have for all $Q \in \mathrm{Prob}(S)$, abbreviating $C_g(u) = \{ z \in S : g(z) \ge u \}$,

$$E_Q(g) = \int_0^\infty Q\big(C_g(u)\big)\,du = \int_0^v Q\big(C_g(u)\big)\,du.$$

Let $Q_k$ be a sequence in $\mathrm{Prob}(S)$ which converges weakly to $Q_0 \in \mathrm{Prob}(S)$. Since for any $0 \le u \le v$ the set $C_g(u)$ is closed, we have by the above $\limsup_k Q_k(C_g(u)) \le Q_0(C_g(u))$ for all $0 \le u \le v$, and hence (by a result from integration theory, cf. [2], p. 100, Ex. 1),

$$\limsup_k \int_0^v Q_k\big(C_g(u)\big)\,du \le \int_0^v \limsup_k Q_k\big(C_g(u)\big)\,du \le \int_0^v Q_0\big(C_g(u)\big)\,du,$$

i.e., $\limsup_k G(Q_k) \le G(Q_0)$, and so $G$ is u.s.c.

The support of a probability measure $Q \in \mathrm{Prob}(S)$, denoted by $\mathrm{supp}(Q)$, is the smallest closed subset of $S$ which has $Q$-probability equal to one. That is, $\mathrm{supp}(Q)$ is a closed subset of $S$ with $Q(\mathrm{supp}(Q)) = 1$, and if $A$ is any closed subset of $S$ with $Q(A) = 1$ then $\mathrm{supp}(Q) \subseteq A$. More explicitly, $\mathrm{supp}(Q)$ consists of all points $z \in S$ such that for each open neighborhood $U_z$ of $z$ in $\mathbb{R}^n$ one has $Q(U_z \cap S) > 0$. Clearly, if $Q$ is an $r$-point distribution (where $r \in \mathbb{N}$) giving probabilities $q_1, \ldots, q_r > 0$ to distinct points $z_1, \ldots, z_r \in S$, resp. (where $\sum_{i=1}^r q_i = 1$), for short

$$Q = \begin{pmatrix} z_1 & \ldots & z_r \\ q_1 & \ldots & q_r \end{pmatrix},$$

then $\mathrm{supp}(Q) = \{ z_1, \ldots, z_r \}$.

Theorem 2.2 For a given real number $c > 1$ consider the compact interval $I = [0, c]$ of the real line, and let $g : I \to \mathbb{R}$ be a nonnegative u.s.c. function. Consider the extremum problem

(EP1) maximize $G(Q) = E_Q(g)$ s.t. $Q \in \mathrm{Prob}(I)$, $E[Q] = 1$,

where $E[Q] = \int_I z\,dQ(z)$ denotes the expectation of $Q$. Then:

(i) There exists an optimal solution $Q_0$ to problem (EP1).

(ii) Let $Q_0 \in \mathrm{Prob}(I)$ with $E[Q_0] = 1$ be given. $Q_0$ is an optimal solution to (EP1) if and only if there exist $\rho \in [0, \infty)$ and $\tau \in \mathbb{R}$ such that

$$g(z) \le \rho + \tau z \ \ \forall\, z \in I \quad \text{and} \quad g(z) = \rho + \tau z \ \ \forall\, z \in \mathrm{supp}(Q_0).$$

(iii) There exists a one- or two-point distribution $Q_0$ which is an optimal solution to (EP1).

Proof. The feasible region of (EP1) is compact w.r.t. the weak topology and, by Lemma 2.1, the objective function $G$ is u.s.c. w.r.t. the weak topology. This proves (i).

To prove the "only if" part of (ii), let $Q_0$ be an optimal solution to (EP1). Consider the subset of $\mathbb{R}^2$,

$$C = \big\{ \big( E[Q],\, G(Q) \big) : Q \in \mathrm{Prob}(I) \big\}.$$

Because of $E[\lambda Q_1 + (1-\lambda)Q_2] = \lambda E[Q_1] + (1-\lambda)E[Q_2]$ and $G(\lambda Q_1 + (1-\lambda)Q_2) = \lambda G(Q_1) + (1-\lambda)G(Q_2)$ for all $Q_1, Q_2 \in \mathrm{Prob}(I)$ and all $0 \le \lambda \le 1$, the set $C$ is convex. Denote by $\gamma$ the optimum value of (EP1), i.e., $\gamma = G(Q_0)$. So the point $(1, \gamma)$ belongs to $C$ and, clearly, it must be a boundary point of $C$. So there is a supporting straight line for $C$ at $(1, \gamma)$, i.e., there exist $a, b \in \mathbb{R}$, not both equal to zero, such that

$$a\,E[Q] + b\,G(Q) \le a + b\gamma \quad \forall\, Q \in \mathrm{Prob}(I). \qquad (2)$$

Specializing to one-point distributions, (2) yields

$$a z + b\,g(z) \le a + b\gamma \quad \forall\, z \in I. \qquad (3)$$

The expectation w.r.t. $Q_0$ of the l.h.s. of (3) equals $a + b\gamma$, and hence

$$a z + b\,g(z) = a + b\gamma \quad Q_0\text{-a.s.} \qquad (4)$$

Case 1: $b < 0$. Then (3) rewrites as

$$g(z) \ge \frac{a}{b} + \gamma - \frac{a}{b}\,z \quad \forall\, z \in I.$$

Suppose that there is some $z_0 \in I$ such that the strict inequality holds, i.e., $g(z_0) > a/b + \gamma - (a/b)z_0$. Then we can construct a one- or a two-point distribution $Q_1 \in \mathrm{Prob}(I)$ which has $z_0$ as a support point and with $E[Q_1] = 1$. So $Q_1$ is feasible for (EP1) and $G(Q_1) = E_{Q_1}(g) > \gamma$, which is a contradiction. Hence it follows that

$$g(z) = \frac{a}{b} + \gamma - \frac{a}{b}\,z \quad \forall\, z \in I,$$

and thus, taking $\rho = a/b + \gamma$ and $\tau = -a/b$, we have $g(z) = \rho + \tau z$ for all $z \in I$. In particular, $\rho = g(0) \ge 0$ and $Q_0$ satisfies the necessary condition of (ii).

Case 2: $b = 0$. Then $a \ne 0$ and (3) rewrites as $a z \le a$ for all $z \in I$. This yields either $z \le 1$ for all $z \in I$ or $z \ge 1$ for all $z \in I$, which is a contradiction ($I = [0, c]$ and $c > 1$). So actually, case 2 cannot occur.

Case 3: $b > 0$. Then, with $\rho = a/b + \gamma$ and $\tau = -a/b$, (3) and (4) rewrite as

$$g(z) \le \rho + \tau z \ \ \forall\, z \in I, \quad \text{and} \quad g(z) = \rho + \tau z \ \ Q_0\text{-a.s.}$$

The latter can be stated as $Q_0\big( \{ z \in I : g(z) \ge \rho + \tau z \} \big) = 1$, and since $g$ is u.s.c. the set of all $z \in I$ with $g(z) \ge \rho + \tau z$ is closed. So the support of $Q_0$ must be a subset of that set, and we have $g(z) = \rho + \tau z$ for all $z \in \mathrm{supp}(Q_0)$. Clearly, $\rho \ge 0$ follows from $0 \le g(0) \le \rho$.

To prove the "if" part of (ii), let $Q_0 \in \mathrm{Prob}(I)$ with $E[Q_0] = 1$, and $\rho \ge 0$, $\tau \in \mathbb{R}$ such that $g(z) \le \rho + \tau z$ for all $z \in I$ and $g(z) = \rho + \tau z$ for all $z \in \mathrm{supp}(Q_0)$. Then, for any $Q \in \mathrm{Prob}(I)$ with $E[Q] = 1$, we obtain $E_Q(g) \le \rho + \tau$ and $E_{Q_0}(g) = \rho + \tau$, hence $G(Q) \le G(Q_0)$, and so $Q_0$ is an optimal solution to (EP1).

It remains to prove (iii). Choose any optimal solution $Q_0$ of (EP1) according to (i), and choose $\rho$ and $\tau$ according to (ii). If $1 \in \mathrm{supp}(Q_0)$ then the one-point distribution at 1, $\delta_1$ say, is trivially feasible for (EP1), and $G(\delta_1) = g(1) = \rho + \tau = E_{Q_0}(g) = G(Q_0)$, i.e., $\delta_1$ is also an optimal solution to (EP1). Now let $1 \notin \mathrm{supp}(Q_0)$. Because of $E[Q_0] = 1$, the support of $Q_0$ is neither a subset of the subinterval $[0, 1]$ nor a subset of the subinterval $[1, c]$. Hence there exist $z_1, z_2 \in \mathrm{supp}(Q_0)$ with $z_1 < 1 < z_2$. Choose $0 < \lambda < 1$ with $\lambda z_1 + (1-\lambda)z_2 = 1$. So the two-point distribution

$$Q_1 = \begin{pmatrix} z_1 & z_2 \\ \lambda & 1-\lambda \end{pmatrix}$$

is feasible for (EP1) and $g(z) = \rho + \tau z$ on the support $\{ z_1, z_2 \}$ of $Q_1$. By (ii), $Q_1$ is also an optimal solution to (EP1).
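Part (iii) of Theorem 2.2 is convenient numerically: when solving an (EP1) instance it suffices to search over one- and two-point distributions. The following Python sketch is our own illustration (the helper name `ep1_two_point_max` and the grid-search strategy are not from [1]); the example objective $g = 1_{[2,\infty)}$ has the known optimum value $1/2$ by the Tchebychev-Markov inequality.

```python
import numpy as np

def ep1_two_point_max(g, c, grid=801):
    """Brute-force sketch of (EP1): maximize E_Q(g) over Q in Prob([0, c]) with
    E[Q] = 1, searching only one- and two-point distributions, which is
    sufficient by Theorem 2.2 (iii)."""
    zs = np.linspace(0.0, c, grid)
    best, best_Q = g(1.0), ((1.0,), (1.0,))     # the one-point distribution at 1
    for z1 in zs[zs < 1.0]:
        for z2 in zs[zs > 1.0]:
            lam = (z2 - 1.0) / (z2 - z1)        # weight on z1 enforcing E[Q] = 1
            val = lam * g(z1) + (1.0 - lam) * g(z2)
            if val > best:
                best, best_Q = val, ((z1, z2), (lam, 1.0 - lam))
    return best, best_Q

# Example: g = 1_{[2, infinity)}; the optimum value is 1/2 (Markov's inequality),
# attained by Q({0}) = Q({2}) = 1/2. The grid search reproduces this up to
# discretization error.
val, Q = ep1_two_point_max(lambda z: float(z >= 2.0), c=3.0)
print(val, Q)
```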

Now, for $I = [0, c]$ with a given finite $c > 1$ as above, we consider the $n$-dimensional cube $I^n$, where $n \in \mathbb{N}$, $n \ge 2$. For $Q_1, \ldots, Q_n \in \mathrm{Prob}(I)$ we denote by $\bigotimes_{i=1}^n Q_i$ the product of $Q_1, \ldots, Q_n$, i.e., the joint distribution of $n$ stochastically independent random variables $X_1, \ldots, X_n$ with distributions $Q_1, \ldots, Q_n$, resp., which is a member of $\mathrm{Prob}(I^n)$. If $Q_1 = \ldots = Q_n = Q$, say, then we write $Q^n$ instead of $\bigotimes_{i=1}^n Q$. Let $g : I^n \to \mathbb{R}$ be a given nonnegative u.s.c. function on the cube $I^n$. Firstly, we will restrict to the case that $g$ is symmetric, i.e., permutationally invariant: $g(z_{\pi(1)}, \ldots, z_{\pi(n)}) = g(z_1, \ldots, z_n)$ for all $(z_1, \ldots, z_n) \in I^n$ and all permutations $\pi$ of $1, \ldots, n$. We consider the following extremum problem.

(EP2) maximize $G_n(Q) = E_{Q^n}(g)$ s.t. $Q \in \mathrm{Prob}(I)$, $E[Q] = 1$.

A necessary condition for a given feasible (for (EP2)) $Q_0$ to be an optimal solution to (EP2) is that the directional derivatives of the objective function at $Q_0$ for all feasible directions are nonpositive, i.e.,

$$\lim_{\varepsilon \to 0+} \frac{1}{\varepsilon}\Big( G_n\big( (1-\varepsilon)Q_0 + \varepsilon Q \big) - G_n(Q_0) \Big) \le 0 \quad \forall\, Q \in \mathrm{Prob}(I) \text{ with } E[Q] = 1. \qquad (5)$$

In fact, the directional derivative (the limit on the l.h.s. of (5)) exists and is given by

$$n\Big( E_{Q_0^{n-1} \otimes Q}(g) - G_n(Q_0) \Big), \qquad (6)$$

which can be seen as follows. By the symmetry of $g$ we get for any $0 < \varepsilon < 1$,

$$G_n\big( (1-\varepsilon)Q_0 + \varepsilon Q \big) = E_{((1-\varepsilon)Q_0 + \varepsilon Q)^n}(g) = (1-\varepsilon)^n E_{Q_0^n}(g) + \sum_{j=1}^{n-1} \binom{n}{j} (1-\varepsilon)^j \varepsilon^{n-j} E_{Q_0^j \otimes Q^{n-j}}(g) + \varepsilon^n E_{Q^n}(g).$$

Subtracting $G_n(Q_0) = E_{Q_0^n}(g)$, dividing by $\varepsilon$, and letting $\varepsilon \to 0+$, the limit becomes

$$\lim_{\varepsilon \to 0+} \frac{(1-\varepsilon)^n - 1}{\varepsilon}\, G_n(Q_0) + \binom{n}{n-1} E_{Q_0^{n-1} \otimes Q}(g),$$

and formula (6) follows from $\lim_{\varepsilon \to 0+} \big( (1-\varepsilon)^n - 1 \big)/\varepsilon = -n$ and $\binom{n}{n-1} = n$.

We note that the expectation in (6) can be written as

$$E_{Q_0^{n-1} \otimes Q}(g) = E_Q(g_{Q_0}), \quad \text{where} \qquad (7)$$

$$g_{Q_0}(t) = \int_{I^{n-1}} g(z_1, \ldots, z_{n-1}, t)\, dQ_0^{n-1}(z_1, \ldots, z_{n-1}), \quad t \in I. \qquad (8)$$

Theorem 2.3 Let $I = [0, c]$ with a real $c > 1$, let $n \in \mathbb{N}$, $n \ge 2$, and let $g : I^n \to \mathbb{R}$ be nonnegative, symmetric, and u.s.c. Then, for the extremum problem (EP2) above one has:

(i) There exists an optimal solution to (EP2).

(ii) If $Q_0$ is an optimal solution to (EP2) then there exist $\rho \in [0, \infty)$ and $\tau \in \mathbb{R}$ such that the function $g_{Q_0}$ from (8) satisfies

$$g_{Q_0}(t) \le \rho + \tau t \ \ \forall\, t \in I \quad \text{and} \quad g_{Q_0}(t) = \rho + \tau t \ \ \forall\, t \in \mathrm{supp}(Q_0),$$

and for the optimum value of (EP2) we have $G_n(Q_0) = \rho + \tau$.

Proof. By Lemma 2.1 the function $G : \mathrm{Prob}(I^n) \to \mathbb{R}$, $G(R) = E_R(g)$, is u.s.c. w.r.t. the weak topology on $\mathrm{Prob}(I^n)$. Since $Q \mapsto Q^n$ is a continuous mapping from $\mathrm{Prob}(I)$ to $\mathrm{Prob}(I^n)$ (w.r.t. the weak topologies), we see from $G_n(Q) = G(Q^n)$, $Q \in \mathrm{Prob}(I)$, that the objective function $G_n$ of (EP2) is u.s.c. By the compactness of the feasible region of (EP2), statement (i) follows.

To prove (ii), let $Q_0$ be an optimal solution to (EP2). Then, for any $Q \in \mathrm{Prob}(I)$ with $E[Q] = 1$ and any $0 < \varepsilon < 1$, the convex combination $(1-\varepsilon)Q_0 + \varepsilon Q$ is again feasible for (EP2) and hence $G_n\big( (1-\varepsilon)Q_0 + \varepsilon Q \big) \le G_n(Q_0)$. So the directional derivatives from (5) must be nonpositive, and together with (6), (7), (8) we get $E_Q(g_{Q_0}) \le E_{Q_0}(g_{Q_0})$. In other words, $Q_0$ is also an optimal solution to the extremum problem

maximize $G_{Q_0}(Q) = E_Q(g_{Q_0})$ s.t. $Q \in \mathrm{Prob}(I)$, $E[Q] = 1$,

which is of type (EP1) considered in Theorem 2.2. To apply Theorem 2.2 we have to verify that $g_{Q_0}$ is nonnegative real and u.s.c. As a nonnegative u.s.c. function, $g$ satisfies $0 \le g(z) \le v$ for all $z \in I^n$ with some finite $v > 0$, from which we immediately get $0 \le g_{Q_0}(t) \le v$ for all $t \in I$. Let $t_k \in I$ ($k \in \mathbb{N}$) be a sequence converging to some $t_0 \in I$. The u.s.c. property of $g$ implies $\limsup_k g(z_1, \ldots, z_{n-1}, t_k) \le g(z_1, \ldots, z_{n-1}, t_0)$ for all $(z_1, \ldots, z_{n-1}) \in I^{n-1}$, and thus

$$\limsup_k g_{Q_0}(t_k) = \limsup_k \int_{I^{n-1}} g(z_1, \ldots, z_{n-1}, t_k)\, dQ_0^{n-1} \le \int_{I^{n-1}} \limsup_k g(z_1, \ldots, z_{n-1}, t_k)\, dQ_0^{n-1} \le \int_{I^{n-1}} g(z_1, \ldots, z_{n-1}, t_0)\, dQ_0^{n-1} = g_{Q_0}(t_0).$$

This proves the u.s.c. property of $g_{Q_0}$. Now we can apply Theorem 2.2, part (ii), which shows that $Q_0$ satisfies the condition in (ii) of Theorem 2.3, where $G_n(Q_0) = \rho + \tau$ follows from $G_n(Q_0) = E_{Q_0}(g_{Q_0}) = \rho + \tau E[Q_0] = \rho + \tau$.

We consider still another extremum problem,

(EP3) maximize $\tilde G_n(Q_1, \ldots, Q_n) = E_{\bigotimes_{i=1}^n Q_i}(g)$ s.t. $Q_1, \ldots, Q_n \in \mathrm{Prob}(I)$, $E[Q_i] = 1$, $1 \le i \le n$.

Again, the given function $g : I^n \to \mathbb{R}$ is assumed to be nonnegative and u.s.c., but we do not impose here a symmetry condition on $g$.

Theorem 2.4 Let $I = [0, c]$ with a real $c > 1$, let $n \in \mathbb{N}$, $n \ge 2$, and let $g : I^n \to \mathbb{R}$ be nonnegative and u.s.c. Then for the extremum problem (EP3) from above there exists an optimal solution $(Q_1^*, \ldots, Q_n^*)$ such that each $Q_i^*$ ($1 \le i \le n$) is a one- or two-point distribution.

Proof. By Lemma 2.1 the function $G : \mathrm{Prob}(I^n) \to \mathbb{R}$, $G(R) = E_R(g)$, is u.s.c. w.r.t. the weak topology on $\mathrm{Prob}(I^n)$. Since $(Q_1, \ldots, Q_n) \mapsto \bigotimes_{i=1}^n Q_i$ is a continuous mapping from $(\mathrm{Prob}(I))^n$ to $\mathrm{Prob}(I^n)$ (w.r.t. the weak topologies), we see from $\tilde G_n(Q_1, \ldots, Q_n) = G\big( \bigotimes_{i=1}^n Q_i \big)$ that the objective function $\tilde G_n$ of (EP3) is u.s.c. By the compactness of the feasible region of (EP3) it follows that there exists an optimal solution $(Q_1^*, \ldots, Q_n^*)$ to (EP3). It remains to show that each $Q_i^*$ can be chosen as a one- or two-point distribution. To this end let $(Q_1^*, \ldots, Q_n^*)$ be any optimal solution to (EP3). For a given $i \in \{1, \ldots, n\}$ we can write

$$\tilde G_n(Q_1^*, \ldots, Q_n^*) = \int_I g_i^*\, dQ_i^*, \quad \text{where}$$

$$g_i^*(t) = \int_{I^{n-1}} g(z_1, \ldots, z_{i-1}, t, z_{i+1}, \ldots, z_n)\, d\Big( \bigotimes_{\substack{j=1 \\ j \ne i}}^n Q_j^* \Big)(z_1, \ldots, z_{i-1}, z_{i+1}, \ldots, z_n) \quad \text{for all } t \in I.$$

Along similar lines as in the proof of Theorem 2.3 one can see that $g_i^*$ is a nonnegative real u.s.c. function on $I$. Considering the (EP1)

maximize $G_i(Q) = E_Q(g_i^*)$ s.t. $Q \in \mathrm{Prob}(I)$, $E[Q] = 1$,

we get from Theorem 2.2, part (iii), the existence of a one- or two-point distribution $Q_i$ which is an optimal solution to that (EP1). Observing that

$$E_{Q_i}(g_i^*) = \tilde G_n(Q_1^*, \ldots, Q_{i-1}^*, Q_i, Q_{i+1}^*, \ldots, Q_n^*), \qquad E_{Q_i^*}(g_i^*) = \tilde G_n(Q_1^*, \ldots, Q_n^*),$$

we have

$$\tilde G_n(Q_1^*, \ldots, Q_{i-1}^*, Q_i, Q_{i+1}^*, \ldots, Q_n^*) \ge \tilde G_n(Q_1^*, \ldots, Q_n^*),$$

and, since $(Q_1^*, \ldots, Q_n^*)$ is an optimal solution to our original (EP3), the inequality is actually an equality. So we have shown that if $(Q_1^*, \ldots, Q_n^*)$ is an optimal solution to (EP3), then for any given $i \in \{1, \ldots, n\}$ there is a one- or two-point distribution $Q_i$ such that $(Q_1^*, \ldots, Q_{i-1}^*, Q_i, Q_{i+1}^*, \ldots, Q_n^*)$ is also an optimal solution to (EP3). Applying this successively for $i = 1, \ldots, n$, the result follows.

We close this section by adding a technical lemma, which will enable us, for certain extremum problems below, to cut down probability distributions on the real half line $[0, \infty)$ to a compact interval $I = [0, c]$ (with a $c > 1$) as considered in the above extremum problems. By $\mathrm{Prob}([0, \infty))$ we denote the set of all Borel probability measures on $[0, \infty)$. A function $g : [0, \infty)^n \to \mathbb{R}$ is said to be nondecreasing iff $x, y \in [0, \infty)^n$ and $x \le y$ (componentwise) imply $g(x) \le g(y)$.

Lemma 2.5 Let $g : [0, \infty)^n \to \mathbb{R}$ be nonnegative and nondecreasing (and Borel-measurable). Assume that there is a real constant $c > 1$ such that

$$g(z_1, \ldots, z_n) = g\big( h_c(z_1), \ldots, h_c(z_n) \big) \quad \forall\, z_1, \ldots, z_n \in [0, \infty),$$

where we denote $h_c : [0, \infty) \to [0, c]$, $h_c(t) = \min\{ t, c \}$. Let $Q_1, \ldots, Q_n \in \mathrm{Prob}([0, \infty))$ with $E[Q_i] \le 1$ ($1 \le i \le n$) be given. Denote by $Q_i^{h_c} \in \mathrm{Prob}([0, c])$ the distribution of $h_c$ under $Q_i$ ($1 \le i \le n$), by $\delta_c \in \mathrm{Prob}([0, c])$ the one-point distribution at $c$, and define $Q_{i,c} \in \mathrm{Prob}([0, c])$ ($1 \le i \le n$) by

$$Q_{i,c} = (1 - \lambda_i)\,Q_i^{h_c} + \lambda_i\,\delta_c, \quad \text{where } \lambda_i = \frac{1 - E[Q_i^{h_c}]}{c - E[Q_i^{h_c}]} \quad (1 \le i \le n).$$

Then: $E[Q_{i,c}] = 1$ ($1 \le i \le n$) and

$$\int_{[0,\infty)^n} g\, d\bigotimes_{i=1}^n Q_i \ \le \ \int_{[0,c]^n} g\, d\bigotimes_{i=1}^n Q_{i,c}.$$

Proof. Obviously, $E[Q_i^{h_c}] \le E[Q_i] \le 1$, hence $0 \le \lambda_i < 1$, and the $Q_{i,c}$ are in fact probability distributions on $[0, c]$. Also, by the definition of $\lambda_i$, $E[Q_{i,c}] = (1 - \lambda_i)E[Q_i^{h_c}] + \lambda_i c = 1$. Clearly, for each $i = 1, \ldots, n$, the distribution $Q_i^{h_c}$ is stochastically smaller than or equal to $Q_{i,c}$, and since $g$ is nondecreasing we have (cf. [4], Theorem 4.B.10, part (b))

$$\int_{[0,c]^n} g\, d\bigotimes_{i=1}^n Q_i^{h_c} \ \le \ \int_{[0,c]^n} g\, d\bigotimes_{i=1}^n Q_{i,c}.$$

The integral on the l.h.s. equals

$$\int_{[0,\infty)^n} g\big( h_c(z_1), \ldots, h_c(z_n) \big)\, d\bigotimes_{i=1}^n Q_i,$$

which is the same as $\int_{[0,\infty)^n} g\, d\bigotimes_{i=1}^n Q_i$ because of $g( h_c(z_1), \ldots, h_c(z_n) ) = g(z_1, \ldots, z_n)$ by the assumed condition on $g$.

Remark. If in the lemma the $Q_i$ are all equal, $Q_1 = \ldots = Q_n = Q$ say, then the $Q_{i,c}$ are all equal as well, $Q_{1,c} = \ldots = Q_{n,c} = Q_c$ say.
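The truncation construction of Lemma 2.5 is easy to visualize on an empirical distribution: push $Q$ forward under $h_c$, then move just enough mass to the atom at $c$ to restore expectation 1. The following small sketch is our own illustration (the function name `truncate_to_c` is hypothetical) and checks the identity $E[Q_c] = 1$ numerically.

```python
import numpy as np

def truncate_to_c(samples, c):
    """Lemma 2.5 on an empirical distribution Q with E[Q] <= 1: push Q forward
    under h_c(t) = min(t, c) and mix with the one-point distribution at c,
    choosing the weight lambda so that the resulting mean is exactly 1."""
    hc = np.minimum(np.asarray(samples, dtype=float), c)   # distribution Q^{h_c}
    m = hc.mean()                                          # E[Q^{h_c}] <= E[Q] <= 1
    lam = (1.0 - m) / (c - m)                              # weight of the atom at c
    return hc, lam

rng = np.random.default_rng(0)
samples = rng.exponential(0.8, size=200_000)    # E[Q] = 0.8 <= 1, support [0, inf)
hc, lam = truncate_to_c(samples, c=3.0)
print((1.0 - lam) * hc.mean() + lam * 3.0)      # E[Q_c]: prints 1.0 up to rounding
```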

3 An important statistic

We return to the nonparametric statistical model of Section 1, i.e., we suppose that the values $x_1, \ldots, x_n$ of $n$ i.i.d. nonnegative real random variables $X_1, \ldots, X_n$ are observed whose distribution is unknown, and we consider the one-sided hypotheses (1) about the expectation $\mu = E(X_i)$. Let $0 < \alpha < 1$ be a given level. For $n = 1$, it was shown by Wang & Zhao in [1], pp. 82-83, that $\phi = 1_{[1/\alpha,\, \infty)}$ is UMP level-$\alpha$ among all nonrandomized level-$\alpha$ tests. Also, they showed that this is no longer true if one admits randomized tests. For $n = 2$, they constructed a UMP test within the class of all strongly monotone and symmetric level-$\alpha$ tests (p. 84 of [1]), which is a highly nontrivial result. We employ the notion of a strongly monotone test from [1] for arbitrary $n$: a test $\phi : [0, \infty)^n \to [0, 1]$ is said to be strongly monotone (nondecreasing) iff, firstly, $\phi(x) > 0$, $x \le z$, $x \ne z$ imply $\phi(z) = 1$ and, secondly, $\phi(x) < 1$, $x \ge z$, $x \ne z$ imply $\phi(z) = 0$. Clearly, strong monotonicity of $\phi$ implies monotonicity of $\phi$, i.e., $\phi$ is nondecreasing ($x \le z$ implies $\phi(x) \le \phi(z)$). For a nonrandomized test $\phi$, strong monotonicity of $\phi$ is the same as monotonicity of $\phi$. So, the class of all strongly monotone symmetric level-$\alpha$ tests for $H_0$ is slightly larger than the class of all nonrandomized monotone symmetric level-$\alpha$ tests for $H_0$.

The following statistic, introduced by Wang & Zhao in [1] for $n = 2$, is meaningful for the testing problem (1). We introduce that statistic for arbitrary $n$:

$$W(x) = \sup_{H_0} P\big( X_{(i)} \ge x_{(i)} \ \ i = 1, \ldots, n \big), \quad x = (x_1, \ldots, x_n) \in [0, \infty)^n, \qquad (9)$$

where $X_{(1)} \le X_{(2)} \le \ldots \le X_{(n)}$ denote the order statistics of the random variables $X_1, \ldots, X_n$ and $x_{(1)} \le x_{(2)} \le \ldots \le x_{(n)}$ denote the nondecreasingly rearranged components of $x = (x_1, \ldots, x_n)$. Our notation in (9) is somewhat loose; the sup on the r.h.s. of (9) is to mean that one takes into account all possible distributions $Q = P^{X_i}$ of $n$ i.i.d. nonnegative real random variables $X_1, \ldots, X_n$ with $E(X_i) \le 1$ and takes the supremum of all resulting probabilities $P( X_{(i)} \ge x_{(i)} \ \ i = 1, \ldots, n )$, for fixed $x = (x_1, \ldots, x_n) \in [0, \infty)^n$. That is, the value $W(x)$ is the optimum value of the extremum problem

$$\text{maximize } G_{x,n}(Q) = Q^n(C_x) \quad \text{s.t. } Q \in \mathrm{Prob}([0, \infty)),\ E[Q] \le 1, \qquad (10)$$

where $x = (x_1, \ldots, x_n) \in [0, \infty)^n$ is given and $C_x$ denotes the set

$$C_x = \big\{ z = (z_1, \ldots, z_n) \in [0, \infty)^n : z_{(i)} \ge x_{(i)} \ \ i = 1, \ldots, n \big\}. \qquad (11)$$

Obviously, the function $W$ on $[0, \infty)^n$ is symmetric and nonincreasing. Below we will see that $W$ is continuous (see Lemma 3.5). So the following result is obtained by arguments analogous to those in [1], Corollary A.1 and its proof on pp. 91-92.

Lemma 3.1 Consider the (nonrandomized) test $\psi = 1_{\{ W \le \alpha \}}$. Then, for any strongly monotone symmetric level-$\alpha$ test $\phi$ for $H_0$ we have $\phi \le \psi$. In particular, if $\psi$ keeps level $\alpha$ on $H_0$ then $\psi$ is UMP within the class of all strongly monotone symmetric level-$\alpha$ tests for $H_0$ vs. $H_1$.

As it will turn out, the test $\psi$ keeps level $\alpha$ on $H_0$ only for $n \le 2$; moreover, for $n \ge 3$ no UMP strongly monotone symmetric level-$\alpha$ test for $H_0$ vs. $H_1$ exists. To see this we need a more explicit description of the statistic $W$ than that from the definition (9) or (10), resp. For $n = 1$ we have $C_x = [x, \infty)$, i.e.,

$$W(x) = \sup\big\{ Q([x, \infty)) : Q \in \mathrm{Prob}([0, \infty)),\ E[Q] \le 1 \big\},$$

from which one easily obtains (using the Tchebychev-Markov inequality in case $x > 1$)

$$W(x) = \begin{cases} 1, & \text{if } x \le 1 \\ 1/x, & \text{if } x > 1. \end{cases}$$

So for $n = 1$ we have $\psi = 1_{[1/\alpha,\, \infty)}$, which keeps level $\alpha$ on $H_0$, again by the Tchebychev-Markov inequality.

Now let $n \ge 2$. From the definition (9) or (10) we have trivially $W(x) = 1$ for all $x = (x_1, \ldots, x_n) \in [0, \infty)^n$ with $x_{(n)} \le 1$. In the nontrivial case that $x_{(n)} > 1$, the extremum problem (10) can be brought into an (EP2) problem from Section 2 by using Lemma 2.5. The function $g = 1_{C_x}$ satisfies the assumptions of Lemma 2.5 with $c = x_{(n)}$. By that lemma the optimum value $W(x)$ of the extremum problem (10) remains the same when we restrict in (10) to probability distributions $Q$ on $[0, x_{(n)}]$ with $E[Q] = 1$. So we can apply Theorem 2.3 to an (EP2), which yields the following (cp. [1], Lemma A.1, for the case $n = 2$).

Lemma 3.2 Let $n \ge 2$. Then, for a given $x = (x_1, \ldots, x_n) \in [0, \infty)^n$ with $x_{(n)} > 1$, the maximum value $W(x)$ of (10) is attained by some $Q_x$ for which $E[Q_x] = 1$ and $\mathrm{supp}(Q_x) = \{ 0, x_{(1)}, \ldots, x_{(n)} \}$ or $\mathrm{supp}(Q_x) = \{ x_{(1)}, \ldots, x_{(n)} \}$.

Proof. Theorem 2.3 with $c = x_{(n)}$, $g = 1_{C_x}$, and $C_x$ from (11) yields the existence of an optimal solution $Q_x \in \mathrm{Prob}([0, x_{(n)}])$ to (10), and the existence of $\rho \ge 0$, $\tau \in \mathbb{R}$ such that

$$g_{Q_x}(t) \le \rho + \tau t \ \ \forall\, t \in [0, x_{(n)}] \quad \text{and} \quad g_{Q_x}(t) = \rho + \tau t \ \ \forall\, t \in \mathrm{supp}(Q_x), \qquad (12)$$

where

$$g_{Q_x}(t) = Q_x^{n-1}\big( C_x(t) \big), \qquad C_x(t) = \{ s \in [0, x_{(n)}]^{n-1} : (s, t) \in C_x \}. \qquad (13)$$

The distribution $Q_x$ must assign positive probability to the value $x_{(n)}$, since otherwise we would have $Q_x^n(C_x) = 0$, which cannot be true (any probability distribution $Q \in \mathrm{Prob}([0, x_{(n)}])$ with $E[Q] = 1$ which assigns positive probability to $x_{(n)}$ has $Q^n(C_x) > 0$). Clearly, the sets $C_x(t)$ are nondecreasing in $t \in [0, x_{(n)}]$, and hence the function $g_{Q_x}$ is nondecreasing. Also, $g_{Q_x}$ is constant on the subinterval $0 \le t < x_{(1)}$ (if this is nonempty), and on the subinterval $x_{(k)} \le t < x_{(k+1)}$ (if this is nonempty) for each $k = 1, \ldots, n-1$. More explicitly, it is easily seen that

$$C_x(t) = \emptyset, \quad \text{if } 0 \le t < x_{(1)}; \qquad (14)$$

$$C_x(t) = \big\{ s \in [0, x_{(n)}]^{n-1} : s_{(i)} \ge x_{(i)} \ (1 \le i \le k-1),\ s_{(i)} \ge x_{(i+1)} \ (k \le i \le n-1) \big\}, \qquad (15)$$

if $x_{(k)} \le t < x_{(k+1)}$ and $1 \le k \le n-1$;

$$C_x(x_{(n)}) = \big\{ s \in [0, x_{(n)}]^{n-1} : s_{(i)} \ge x_{(i)} \ (1 \le i \le n-1) \big\}. \qquad (16)$$

Since $g_{Q_x}$ is nondecreasing, we must have $\tau \ge 0$ in (12) (otherwise $x_{(n)}$ would be the only support point of $Q_x$, contradicting $E[Q_x] = 1$ and $x_{(n)} > 1$).

Suppose $\tau = 0$. Then $g_{Q_x}(t) \le \rho$ for all $t \in [0, x_{(n)}]$ and the equality $g_{Q_x}(t) = \rho$ holds for all $t \in \mathrm{supp}(Q_x)$. The point $s$ in $\mathbb{R}^{n-1}$ having all components equal to $x_{(n)}$ belongs to $C_x(x_{(n)})$ by (16), and hence $\rho = g_{Q_x}(x_{(n)}) \ge \big( Q_x(\{x_{(n)}\}) \big)^{n-1} > 0$. So we have $\rho > 0$ and, because of $g_{Q_x}(t) = 0$ if $0 \le t < x_{(1)}$, we have $\mathrm{supp}(Q_x) \subseteq [x_{(1)}, x_{(n)}]$, in particular $x_{(1)} \le 1 < x_{(n)}$. Let $t_0$ be the smallest support point of $Q_x$. There is a $k \in \{1, \ldots, n-1\}$ with $x_{(k)} \le t_0 < x_{(k+1)}$. The interval $I_k = [x_{(k)}, x_{(k+1)})$ has positive $Q_x$-probability, since it contains the smallest support point $t_0$. Consider the set

$$D = \{ s = (s_1, \ldots, s_{n-1}) : s_i \in I_k \ (1 \le i \le k),\ s_i = x_{(n)} \ (k+1 \le i \le n-1) \}.$$

Then $Q_x^{n-1}(D) = (Q_x(I_k))^k\, (Q_x(\{x_{(n)}\}))^{n-1-k} > 0$. For each $s \in D$ we see from (15) that $s \in C_x(x_{(k+1)})$, but $s \notin C_x(x_{(k)})$. Hence $D \subseteq C_x(x_{(k+1)}) \setminus C_x(x_{(k)})$ and $Q_x^{n-1}(D) \le g_{Q_x}(x_{(k+1)}) - g_{Q_x}(x_{(k)})$, in particular $g_{Q_x}(x_{(k)}) < g_{Q_x}(x_{(k+1)})$. On the other hand we have $\rho = g_{Q_x}(t_0) = g_{Q_x}(x_{(k)})$ and $\rho = g_{Q_x}(x_{(n)})$, hence by monotonicity $g_{Q_x}(x_{(k)}) = g_{Q_x}(x_{(k+1)})$, and thus a contradiction, i.e., $\tau = 0$ cannot hold.

So we have $\tau > 0$. By (12) the increasing straight line $\rho + \tau t$ can touch the graph of the nondecreasing step function $g_{Q_x}(t)$ only at the very left points of its plateaus, the abscissae of which are $0, x_{(1)}, \ldots, x_{(n)}$. Hence

$$\mathrm{supp}(Q_x) \subseteq \{ 0, x_{(1)}, \ldots, x_{(n)} \}. \qquad (17)$$

It remains to show that each $x_{(k)}$ ($k = 1, \ldots, n$) is in fact a support point of $Q_x$. We know that this is true for $k = n$. Suppose that for some $k \in \{2, \ldots, n\}$ we have $x_{(k)} \in \mathrm{supp}(Q_x)$, but $x_{(k-1)} \notin \mathrm{supp}(Q_x)$, and thus, in particular, $x_{(k-1)} < x_{(k)}$. Clearly, the $Q_x^{n-1}$-probability of a set $C_x(t)$ remains the same when we restrict to the subset of those points $s$ in $C_x(t)$ with components in $\mathrm{supp}(Q_x)$. By (15) we have

$$C_x(x_{(k)}) = \{ s : s_{(i)} \ge x_{(i)} \ (1 \le i \le k-1),\ s_{(i)} \ge x_{(i+1)} \ (k \le i \le n-1) \},$$
$$C_x(x_{(k-1)}) = \{ s : s_{(i)} \ge x_{(i)} \ (1 \le i \le k-2),\ s_{(i)} \ge x_{(i+1)} \ (k-1 \le i \le n-1) \}.$$

But for $s$ with components in $\mathrm{supp}(Q_x)$ we get from (17) and $x_{(k-1)} \notin \mathrm{supp}(Q_x)$ that the inequality $s_{(k-1)} \ge x_{(k-1)}$ is equivalent to $s_{(k-1)} \ge x_{(k)}$, which shows that the sets $C_x(x_{(k)})$ and $C_x(x_{(k-1)})$ contain the same set of points $s$ with components in $\mathrm{supp}(Q_x)$. Hence it follows that

$$g_{Q_x}(x_{(k)}) = Q_x^{n-1}(C_x(x_{(k)})) = Q_x^{n-1}(C_x(x_{(k-1)})) = g_{Q_x}(x_{(k-1)}),$$

and by (12),

$$\rho + \tau x_{(k)} = g_{Q_x}(x_{(k)}) = g_{Q_x}(x_{(k-1)}) \le \rho + \tau x_{(k-1)},$$

which is a contradiction to $\tau > 0$ and $x_{(k)} > x_{(k-1)}$. Hence we must have $x_{(k-1)} \in \mathrm{supp}(Q_x)$ as well. This completes the proof of our lemma.

From Lemma 3.2 it follows that the value $W(x)$ (in the nontrivial case $x_{(n)} > 1$) equals the maximum of $Q^n(C_x)$, taken over all discrete probability distributions $Q$ with $\{ x_{(1)}, \ldots, x_{(n)} \} \subseteq \mathrm{supp}(Q) \subseteq \{ 0, x_{(1)}, \ldots, x_{(n)} \}$ and $E[Q] = 1$.
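Lemma 3.2 thus reduces the computation of $W(x)$ to a finite-dimensional optimization: maximize $Q^n(C_x)$ over the weights of a distribution on $\{0, x_{(1)}, \ldots, x_{(n)}\}$ with mean 1. For small $n$ this is easy to carry out numerically. The following sketch is our own (the helper names are hypothetical, and a local optimizer with random restarts is of course only a heuristic, not a certified global solver).

```python
import itertools
import numpy as np
from scipy.optimize import minimize

def prob_Cx(q, support, xs):
    """P(X_(i) >= x_(i) for all i) for i.i.d. X_j with P(X_j = support[k]) = q[k],
    by brute-force enumeration of all len(support)**n outcomes (small n only)."""
    n = len(xs)
    total = 0.0
    for idx in itertools.product(range(len(support)), repeat=n):
        z = sorted(support[k] for k in idx)
        if all(z[i] >= xs[i] for i in range(n)):
            p = 1.0
            for k in idx:
                p *= q[k]
            total += p
    return total

def W(x, starts=20, seed=0):
    """Approximate W(x): by Lemma 3.2 it suffices to optimize over distributions
    Q on {0, x_(1), ..., x_(n)} with E[Q] = 1 (for x_(n) > 1)."""
    xs = sorted(x)
    if xs[-1] <= 1.0:
        return 1.0
    support = sorted(set([0.0] + xs))
    m = len(support)
    s = np.array(support)
    cons = [{'type': 'eq', 'fun': lambda q: q.sum() - 1.0},   # total mass 1
            {'type': 'eq', 'fun': lambda q: q @ s - 1.0}]     # E[Q] = 1
    rng = np.random.default_rng(seed)
    best = 0.0
    for _ in range(starts):
        res = minimize(lambda q: -prob_Cx(q, support, xs),
                       rng.dirichlet(np.ones(m)), method='SLSQP',
                       bounds=[(0.0, 1.0)] * m, constraints=cons)
        if res.success:
            best = max(best, -res.fun)
    return best

print(W([0.5, 2.0]))   # approx. 0.5556; cf. the closed form for n = 2 below
```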

For $n = 2$ that maximum was explicitly determined in [1], Lemma A.1, giving the following.

Lemma 3.3 (Wang & Zhao) Let $n = 2$. Then, for all $x = (x_1, x_2) \in [0, \infty)^2$, we have

$$W(x) = \begin{cases} 1, & \text{if } x_{(2)} \le 1 \\[4pt] \dfrac{1}{x_{(1)}\,(2x_{(2)} - x_{(1)})}, & \text{if } x_{(2)} > 1,\ x_{(1)} > x_{(2)} - \sqrt{x_{(2)}^2 - x_{(2)}} \\[4pt] 1 - \dfrac{(x_{(2)} - 1)^2}{(x_{(2)} - x_{(1)})^2}, & \text{if } x_{(2)} > 1,\ x_{(1)} \le x_{(2)} - \sqrt{x_{(2)}^2 - x_{(2)}}. \end{cases}$$
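For reference, here is the closed form of Lemma 3.3 as a small function (our transcription; `W2` is a hypothetical helper name). The two nontrivial branches correspond to the optimal $Q_x$ of Lemma 3.2 having support $\{0, x_{(1)}, x_{(2)}\}$ and $\{x_{(1)}, x_{(2)}\}$, respectively.

```python
import math

def W2(x1, x2):
    """Closed-form W for n = 2, following Lemma 3.3 (Wang & Zhao)."""
    lo, hi = min(x1, x2), max(x1, x2)
    if hi <= 1.0:
        return 1.0
    if lo > hi - math.sqrt(hi * hi - hi):
        # three-point support {0, x_(1), x_(2)} is optimal
        return 1.0 / (lo * (2.0 * hi - lo))
    # two-point support {x_(1), x_(2)} is optimal
    return 1.0 - (hi - 1.0) ** 2 / (hi - lo) ** 2

print(W2(0.5, 2.0))    # 0.5555..., matching the numerical value of W([0.5, 2.0])
```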

For $n \ge 3$ no explicit description of $W$ seems to be possible. However, an alternative ("implicit") representation of the statistic will be useful; in particular, it will help to prove continuity of $W$ (for arbitrary $n$) below.

Lemma 3.4 Let $U_{(1)} \le U_{(2)} \le \ldots \le U_{(n)}$ be the order statistics of $n$ i.i.d. standard uniformly distributed (on the interval $(0, 1)$) random variables $U_1, \ldots, U_n$. Then, for any $x = (x_1, \ldots, x_n) \in [0, \infty)^n$ with $x_{(n)} > 1$, we have

$$W(x) = \max\Big\{ P\big( U_{(i)} \ge u_i \ \ i = 1, \ldots, n \big) : 0 \le u_1 \le \ldots \le u_n \le 1,\ \sum_{i=1}^n u_i\,(x_{(i)} - x_{(i-1)}) = x_{(n)} - 1 \Big\},$$

where $x_{(0)} := 0$. Also, as we already know, $W(x) = 1$ whenever $x_{(n)} \le 1$.

Proof. Firstly, we note that if $X_1, \ldots, X_n$ are i.i.d. real random variables, $F$ denoting the (right continuous) c.d.f. of $X_i$, and if $a_1, \ldots, a_n \in \mathbb{R}$, then

$$P\big( X_{(i)} \ge a_i \ \ i = 1, \ldots, n \big) = P\big( U_{(i)} \ge F_l(a_i) \ \ i = 1, \ldots, n \big), \qquad (18)$$

where $U_1, \ldots, U_n$ are i.i.d. standard uniform random variables and $F_l(t) = \lim_{v \to t,\, v < t} F(v)$ for $t \in \mathbb{R}$ (the left continuous c.d.f. of $X_i$). Eq. (18) can be seen as follows. Consider the pseudo-inverse of $F$,

$$F^*(u) = \min\{ t \in \mathbb{R} : F(t) \ge u \}, \quad u \in (0, 1).$$

Then, as is well known, we may write $X_i = F^*(U_i)$ ($1 \le i \le n$), with some i.i.d. standard uniform random variables $U_1, \ldots, U_n$. It is easily seen that for any $a \in \mathbb{R}$ and any $u \in (0, 1)$ the following implications are true:

$$u > F_l(a) \implies F^*(u) \ge a \implies u \ge F_l(a).$$

This gives, observing that $X_{(i)} = F^*(U_{(i)})$ ($1 \le i \le n$),

$$\{ U_{(i)} > F_l(a_i) \ \forall i \} \subseteq \{ X_{(i)} \ge a_i \ \forall i \} \subseteq \{ U_{(i)} \ge F_l(a_i) \ \forall i \},$$

and since the event $U_{(i)} = F_l(a_i)$ has probability zero for all $i$, we conclude (18).

Now let $x = (x_1, \ldots, x_n) \in [0, \infty)^n$ with $x_{(n)} > 1$ be given. Denote by $m(x)$ the maximum on the r.h.s. of the asserted formula of the lemma. Note that the term "maximum" (instead of "supremum") is correct, since the probability $P( U_{(i)} \ge u_i \ \ i = 1, \ldots, n )$ is a continuous function of $u = (u_1, \ldots, u_n)$ and the set of $u$'s over which the supremum is taken is a compact (and nonempty) set. We have to show that $m(x) = W(x)$. By definition of $W(x)$ through (9) or (10) and by Lemma 3.2, $W(x)$ is the maximum of all probabilities

$$P\big( X_{(i)} \ge x_{(i)} \ \ i = 1, \ldots, n \big),$$

taken over all i.i.d. random variables $X_1, \ldots, X_n$ with values in $\{ 0, x_{(1)}, \ldots, x_{(n)} \}$ and with $E(X_i) = 1$. For any such random variables $X_1, \ldots, X_n$ denote by $F$ the (right continuous) c.d.f. of $X_i$ and by $F_l$ its left continuous version. A well-known formula for the expectation of a nonnegative random variable gives $1 = E(X_1) = \int_0^\infty (1 - F_l(t))\,dt$. Since $X_1$ has its values in $\{ 0, x_{(1)}, \ldots, x_{(n)} \}$ we can write

$$1 - F_l(t) = \sum_{i=1}^n \big( 1 - F_l(x_{(i)}) \big)\, 1_{(x_{(i-1)},\, x_{(i)}]}(t), \quad t > 0,$$

and hence

$$1 = \sum_{i=1}^n \big( 1 - F_l(x_{(i)}) \big)(x_{(i)} - x_{(i-1)}), \quad \text{i.e.,} \quad \sum_{i=1}^n F_l(x_{(i)})\,(x_{(i)} - x_{(i-1)}) = x_{(n)} - 1.$$

So, choosing $u_i = F_l(x_{(i)})$ ($1 \le i \le n$), we have $0 \le u_1 \le \ldots \le u_n \le 1$, $\sum_{i=1}^n u_i(x_{(i)} - x_{(i-1)}) = x_{(n)} - 1$, and, together with (18),

$$P\big( X_{(i)} \ge x_{(i)} \ \ i = 1, \ldots, n \big) = P\big( U_{(i)} \ge u_i \ \ i = 1, \ldots, n \big),$$

which proves $W(x) \le m(x)$.

To prove the reverse inequality, let $0 \le u_1 \le \ldots \le u_n \le 1$ with $\sum_{i=1}^n u_i(x_{(i)} - x_{(i-1)}) = x_{(n)} - 1$ be given. Then, obviously, the function

$$F(t) = \sum_{i=1}^n u_i\, 1_{[x_{(i-1)},\, x_{(i)})}(t) + 1_{[x_{(n)},\, \infty)}(t), \quad t \in \mathbb{R},$$

is the (right continuous) c.d.f. of some probability distribution $Q$ with support in the set $\{ 0, x_{(1)}, \ldots, x_{(n)} \}$. The left continuous version $F_l$ satisfies $F_l(x_{(i)}) \le u_i$ ($1 \le i \le n$), and for the expectation of $Q$ we have

$$E[Q] = \int_0^\infty (1 - F(t))\,dt = \sum_{i=1}^n (1 - u_i)(x_{(i)} - x_{(i-1)}) = 1.$$

Let $X_1, \ldots, X_n$ be i.i.d. $Q$-distributed random variables. From (18), together with $F_l(x_{(i)}) \le u_i$ ($1 \le i \le n$), we get

$$P\big( X_{(i)} \ge x_{(i)} \ \ i = 1, \ldots, n \big) \ge P\big( U_{(i)} \ge u_i \ \ i = 1, \ldots, n \big),$$

which shows $W(x) \ge m(x)$.
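The representation of Lemma 3.4 is easy to check by simulation. For $x = (0.5, 2)$ the optimal $Q_x$ from Lemma 3.2 puts mass $2/3$ on $0.5$ and $1/3$ on $2$, so $u = (F_l(0.5), F_l(2)) = (0, 2/3)$, which satisfies $\sum_i u_i(x_{(i)} - x_{(i-1)}) = (2/3)\cdot 1.5 = 1 = x_{(2)} - 1$. A Monte Carlo estimate of $f(u) = P(U_{(i)} \ge u_i\ \forall i)$ at this point (a sketch; `f_mc` is our hypothetical helper) should then reproduce $W(0.5, 2) = 5/9$.

```python
import numpy as np

def f_mc(u, n_sim=200_000, seed=1):
    """Monte Carlo estimate of P(U_(i) >= u_i for all i), where U_(1) <= ... <= U_(n)
    are the order statistics of n i.i.d. standard uniform random variables."""
    rng = np.random.default_rng(seed)
    U = np.sort(rng.random((n_sim, len(u))), axis=1)   # each row: sorted uniforms
    return float(np.mean(np.all(U >= np.asarray(u), axis=1)))

print(f_mc([0.0, 2.0 / 3.0]))   # approx. 5/9 = 0.5555..., i.e. W(0.5, 2)
```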

Lemma 3.5 The function $W$ is continuous on $[0, \infty)^n$.

Proof. Since $W$ is constantly equal to 1 on the cube $C = [0, 1]^n$, it suffices to prove (i) continuity of $W$ on the set $C^c = [0, \infty)^n \setminus C$, and (ii) $W(x) \to 1$ whenever $x \in C^c$ approaches the boundary of $C$. In view of Lemma 3.4, the following abbreviations will be convenient:

$$f(u) = P\big( U_{(i)} \ge u_i \ \ i = 1, \ldots, n \big) \quad \text{for } u = (u_1, \ldots, u_n) \in \mathbb{R}^n,$$

which is a continuous function on $\mathbb{R}^n$,

$$D = \{ u \in \mathbb{R}^n : 0 \le u_1 \le \ldots \le u_n \le 1 \}, \qquad D(x) = \Big\{ u \in D : \sum_{i=1}^n u_i\,(x_{(i)} - x_{(i-1)}) = x_{(n)} - 1 \Big\} \quad \text{for } x \in C^c.$$

Clearly, $D$ and $D(x)$ (for $x \in C^c$) are nonempty compact and convex sets. It is not hard to see that the vertices of $D(x)$ are given by

$$(\underbrace{\xi_i, \ldots, \xi_i}_{i \text{ times}}, \underbrace{1, \ldots, 1}_{n-i \text{ times}}) \quad \text{for } 1 \le i \le n \text{ with } x_{(i)} \ge 1, \quad \text{where } \xi_i = \frac{x_{(i)} - 1}{x_{(i)}};$$

$$(\underbrace{0, \ldots, 0}_{i \text{ times}}, \underbrace{\xi_{ij}, \ldots, \xi_{ij}}_{j-i \text{ times}}, \underbrace{1, \ldots, 1}_{n-j \text{ times}}) \quad \text{for } 1 \le i < j \le n \text{ with } x_{(i)} < 1 < x_{(j)}, \quad \text{where } \xi_{ij} = \frac{x_{(j)} - 1}{x_{(j)} - x_{(i)}}.$$

By Lemma 3.4 we have

$$W(x) = \max_{u \in D(x)} f(u) \quad \text{for } x \in C^c. \qquad (19)$$

Ad (i): Let $x^{(0)} \in C^c$ and $x^{(k)} \in C^c$ ($k \in \mathbb{N}$) be a sequence with $\lim_k x^{(k)} = x^{(0)}$. For each $k \in \mathbb{N}$ let $u^{(k)} \in D(x^{(k)})$ with $W(x^{(k)}) = f(u^{(k)})$. Denote $a = \limsup_k f(u^{(k)})$. Choose an infinite subsequence $u^{(k)}$, $k \in K \subseteq \mathbb{N}$, such that $f(u^{(k)})$ ($k \in K$) converges to $a$, and, due to the compactness of $D$, choose a further convergent subsequence $u^{(k)}$, $k \in K' \subseteq K$, with some limit $u^{(0)} \in D$. From $\sum_{i=1}^n u_i^{(k)}(x_{(i)}^{(k)} - x_{(i-1)}^{(k)}) = x_{(n)}^{(k)} - 1$ for all $k \in K'$ we get, by taking the limits, $\sum_{i=1}^n u_i^{(0)}(x_{(i)}^{(0)} - x_{(i-1)}^{(0)}) = x_{(n)}^{(0)} - 1$, i.e., $u^{(0)} \in D(x^{(0)})$. Hence, by continuity of $f$ and by (19),

$$a = \lim_{k,\, k \in K'} f(u^{(k)}) = f(u^{(0)}) \le W(x^{(0)}),$$

which shows that $\limsup_k W(x^{(k)}) \le W(x^{(0)})$.

It remains to prove the inequality $\liminf_k W(x^{(k)}) \ge W(x^{(0)})$. To this end it suffices to show that for any given point $u^{(0)} \in D(x^{(0)})$ one can find a sequence $u^{(k)} \in D(x^{(k)})$ ($k \in \mathbb{N}$) converging to $u^{(0)}$. Then, taking $u^{(0)} \in D(x^{(0)})$ with $f(u^{(0)}) = W(x^{(0)})$, we get $\lim_k f(u^{(k)}) = f(u^{(0)}) = W(x^{(0)})$ and $f(u^{(k)}) \le W(x^{(k)})$ for all $k$, from which $W(x^{(0)}) \le \liminf_k W(x^{(k)})$ follows. Now, since $D(x^{(k)})$ is a convex set for each $k$, the set of all limits of convergent sequences $u^{(k)} \in D(x^{(k)})$ ($k \in \mathbb{N}$) is again a convex set, and it is a subset of $D(x^{(0)})$ (see our arguments above). Thus, that set of limits coincides with $D(x^{(0)})$ iff each vertex of $D(x^{(0)})$ is the limit of some sequence $u^{(k)} \in D(x^{(k)})$. But this is readily verified in view of our description of the vertices of a set $D(x)$ from above.

In fact, the vertices of $D(x^{(0)})$ are the limits of the corresponding vertices of $D(x^{(k)})$, with one exception: if $x_{(i)}^{(0)} = 1$ for some $i \in \{1, \ldots, n-1\}$ and $x_{(i)}^{(k)} < 1$ for infinitely many $k$, then the vertex $(0, \ldots, 0, 1, \ldots, 1)$ (where 0 appears $i$ times) of $D(x^{(0)})$ is the limit of $u^{(k)} \in D(x^{(k)})$ for $k \to \infty$, where

$$u^{(k)} = (\underbrace{\xi_i^{(k)}, \ldots, \xi_i^{(k)}}_{i \text{ times}}, 1, \ldots, 1) \ \text{ with } \ \xi_i^{(k)} = \frac{x_{(i)}^{(k)} - 1}{x_{(i)}^{(k)}}, \quad \text{if } x_{(i)}^{(k)} \ge 1,$$

$$u^{(k)} = (\underbrace{0, \ldots, 0}_{i \text{ times}}, \xi_{in}^{(k)}, \ldots, \xi_{in}^{(k)}) \ \text{ with } \ \xi_{in}^{(k)} = \frac{x_{(n)}^{(k)} - 1}{x_{(n)}^{(k)} - x_{(i)}^{(k)}}, \quad \text{if } x_{(i)}^{(k)} < 1.$$

Ad (ii): Let $x^{(k)} \in C^c$ ($k \in \mathbb{N}$) converge to some boundary point $x^{(0)}$ of $C$, hence in particular $\lim_k x_{(n)}^{(k)} = 1$. For each $k$ consider the point $u^{(k)}$ with components all equal to $(x_{(n)}^{(k)} - 1)/x_{(n)}^{(k)}$, which belongs to $D(x^{(k)})$. Obviously, $\lim_k u^{(k)} = (0, \ldots, 0)$. Again by (19), $W(x^{(k)}) \ge f(u^{(k)})$ for all $k$ and thus, by continuity of $f$,

$$\liminf_k W(x^{(k)}) \ge f(0, \ldots, 0) = 1, \quad \text{i.e.,} \quad \lim_k W(x^{(k)}) = 1.$$

4 The Wang-Zhao result for n = 2

Here we present an alternative proof of [1], Theorem 5.2.

Theorem 4.1 (Wang & Zhao) Let $n = 2$. Then, for any given $\alpha \in (0, 1)$, we have

$$\sup_{Q \in H_0} Q^2( W \le \alpha ) = \alpha,$$

where $Q \in H_0$ means that $Q \in \mathrm{Prob}([0, \infty))$ with $E[Q] \le 1$. So, by Lemma 3.1, the test $\psi = 1_{\{ W \le \alpha \}}$ is UMP among all strongly monotone symmetric level-$\alpha$ tests for $H_0$ vs. $H_1$.

Proof. In what follows we denote $c_\alpha = \big( 1 - \sqrt{1 - \alpha} \big)^{-1}$ and $I_\alpha = [0, c_\alpha]$. By the formula of Lemma 3.3, $W(0, c_\alpha) = \alpha$. Since $W$ is symmetric and nonincreasing, we see that the indicator function $g = 1_{\{ W \le \alpha \}}$ satisfies the conditions of Lemma 2.5 (with $c = c_\alpha$). By that lemma the sup in the assertion equals the maximum value of the (EP2)

maximize $Q^2( W \le \alpha )$ s.t. $Q \in \mathrm{Prob}(I_\alpha)$, $E[Q] = 1$. $\qquad (20)$

By Lemma 3.2, resolving the inequality $W(x_1, x_2) \le \alpha$ for $x_1, x_2 \in I_\alpha$ yields (after some lengthy but elementary calculations, cp. [1], p. 84)

$$W(x_1, x_2) \le \alpha \iff \frac{1}{\sqrt{\alpha}} \le x_{(2)} \le c_\alpha,\ \ x_{(1)} \ge w_\alpha(x_{(2)}), \qquad (x_1, x_2 \in I_\alpha), \qquad (21)$$

where we have introduced the function $w_\alpha : [1/\sqrt{\alpha},\, c_\alpha] \to [0,\, 1/\sqrt{\alpha}]$,

$$w_\alpha(t) = \begin{cases} t - \sqrt{t^2 - 1/\alpha}, & \text{if } 1/\sqrt{\alpha} \le t \le 1/\alpha \\ t - (t - 1)/\sqrt{1 - \alpha}, & \text{if } 1/\alpha < t \le c_\alpha. \end{cases} \qquad (22)$$

Also we get

$$W(x_1, x_2) = \alpha \iff \frac{1}{\sqrt{\alpha}} \le x_{(2)} \le c_\alpha,\ \ x_{(1)} = w_\alpha(x_{(2)}), \qquad (x_1, x_2 \in I_\alpha). \qquad (23)$$

It is easily seen that the function $w_\alpha$ given by (22) is a continuous and decreasing one-to-one function from $[1/\sqrt{\alpha},\, c_\alpha]$ onto $[0,\, 1/\sqrt{\alpha}]$, and so there is the inverse function $w_\alpha^{-1} : [0,\, 1/\sqrt{\alpha}] \to [1/\sqrt{\alpha},\, c_\alpha]$.

We firstly show that there are feasible probability distributions $Q$ for the (EP2) from (20) with $Q^2( W \le \alpha ) = \alpha$, namely the following. Let $x^* = (x_1^*, x_2^*) \in I_\alpha^2$ be such that $W(x^*) = \alpha$ and choose $Q_{x^*}$ according to Lemma 3.2, i.e.,

$$\{ x_{(1)}^*, x_{(2)}^* \} \subseteq \mathrm{supp}(Q_{x^*}) \subseteq \{ 0, x_{(1)}^*, x_{(2)}^* \}, \qquad E[Q_{x^*}] = 1, \qquad (24)$$

and

$$W(x^*) = Q_{x^*}^2\big( \{ z \in I_\alpha^2 : z_{(1)} \ge x_{(1)}^*,\ z_{(2)} \ge x_{(2)}^* \} \big). \qquad (25)$$

From (21), (23), (24) we see that an $x = (x_1, x_2) \in \mathrm{supp}(Q_{x^*}^2)$ satisfies $W(x) \le \alpha$ ($= W(x^*)$) if and only if $x_{(1)} = x_{(1)}^*,\ x_{(2)} = x_{(2)}^*$ or $x_{(1)} = x_{(2)} = x_{(2)}^*$, i.e., if and only if $x_{(1)} \ge x_{(1)}^*,\ x_{(2)} \ge x_{(2)}^*$. Hence, together with (25), we obtain

$$Q_{x^*}^2( W \le \alpha ) = Q_{x^*}^2\big( \{ x \in \mathrm{supp}(Q_{x^*}^2) : x_{(1)} \ge x_{(1)}^*,\ x_{(2)} \ge x_{(2)}^* \} \big) = W(x^*) = \alpha.$$

So we have proved that for any $x^* \in I_\alpha^2$ with $W(x^*) = \alpha$ the probability distribution $Q_{x^*}$ from Lemma 3.2 satisfies $Q_{x^*}^2( W \le \alpha ) = \alpha$. The rest of our proof will show, in particular, that those distributions $Q_{x^*}$ are the optimal solutions to the (EP2) from (20).

Let $Q_0$ be any optimal solution to (20) (which exists by Theorem 2.3, part (i)). By part (ii) of Theorem 2.3 there exist real numbers $\rho \ge 0$ and $\tau$ such that

$$g_{Q_0}(t) \le \rho + \tau t \ \ \forall\, t \in I_\alpha \quad \text{and} \quad g_{Q_0}(t) = \rho + \tau t \ \ \forall\, t \in \mathrm{supp}(Q_0), \qquad (26)$$

where

$$g_{Q_0}(t) = Q_0\big( \{ s \in I_\alpha : W(s, t) \le \alpha \} \big), \quad t \in I_\alpha, \qquad (27)$$

and then $Q_0^2( W \le \alpha ) = \rho + \tau$. So we have to conclude that $\rho + \tau \le \alpha$. From (27) and (21) we obtain

$$g_{Q_0}(t) = \begin{cases} Q_0\big( [\, w_\alpha(t),\, c_\alpha \,] \big), & \text{if } 1/\sqrt{\alpha} \le t \le c_\alpha \\ Q_0\big( [\, w_\alpha^{-1}(t),\, c_\alpha \,] \big), & \text{if } 0 \le t < 1/\sqrt{\alpha}. \end{cases} \qquad (28)$$

The function $g_{Q_0}$ is nondecreasing, and thus (26) forces $\tau \ge 0$. Suppose $\tau = 0$. Since $0 \le g_{Q_0}(t) \le 1$ for all $t \in [0, c_\alpha]$ and $g_{Q_0}(c_\alpha) = 1$, (26) forces $\rho \ge 1$ and hence $\rho = 1$. The minimum support point $s_0$ of $Q_0$ must be less than or equal to 1 (because of $E[Q_0] = 1$), in particular $s_0 < 1/\sqrt{\alpha}$; hence by (26) and (28), $Q_0\big( [\, w_\alpha^{-1}(s_0),\, c_\alpha \,] \big) = 1$. But $w_\alpha^{-1}(s_0) > 1/\sqrt{\alpha} > s_0$, and hence $s_0 \notin \mathrm{supp}(Q_0)$, which is a contradiction. So we have $\tau > 0$.

For the support of $Q_0$ we conclude further:

$$\text{if } t \in \mathrm{supp}(Q_0) \text{ and } 1/\sqrt{\alpha} \le t \le c_\alpha, \text{ then } w_\alpha(t) \in \mathrm{supp}(Q_0); \qquad (29)$$

$$\text{if } s \in \mathrm{supp}(Q_0) \text{ and } 0 < s < 1/\sqrt{\alpha}, \text{ then } w_\alpha^{-1}(s) \in \mathrm{supp}(Q_0). \qquad (30)$$

This can be seen as follows. Let $t \in \mathrm{supp}(Q_0)$ and $1/\sqrt{\alpha} \le t \le c_\alpha$, but suppose that $w_\alpha(t) \notin \mathrm{supp}(Q_0)$ (hence $t > 1/\sqrt{\alpha}$, since $w_\alpha(1/\sqrt{\alpha}) = 1/\sqrt{\alpha}$). So there is an $\varepsilon > 0$ such that $w_\alpha(t) + \varepsilon \le 1/\sqrt{\alpha}$ and $Q_0\big( [\, w_\alpha(t),\, w_\alpha(t) + \varepsilon \,] \big) = 0$. Since $w_\alpha$ is continuous and decreasing, there is a $\delta > 0$ such that $t - \delta > 1/\sqrt{\alpha}$ and $w_\alpha(t) < w_\alpha(t - \delta) \le w_\alpha(t) + \varepsilon$, hence $Q_0\big( [\, w_\alpha(t),\, w_\alpha(t - \delta) \,] \big) = 0$, and we get from (26), (28),

$$\rho + \tau t = g_{Q_0}(t) = Q_0\big( [\, w_\alpha(t),\, c_\alpha \,] \big) = Q_0\big( [\, w_\alpha(t - \delta),\, c_\alpha \,] \big) = g_{Q_0}(t - \delta) \le \rho + \tau (t - \delta),$$

which is a contradiction in view of $\tau > 0$. The arguments are similar for the case that $s \in \mathrm{supp}(Q_0)$ and $0 < s < 1/\sqrt{\alpha}$: suppose that $w_\alpha^{-1}(s) \notin \mathrm{supp}(Q_0)$. Then there is an $\varepsilon > 0$ such that $w_\alpha^{-1}(s) + \varepsilon \le c_\alpha$ and $Q_0\big( [\, w_\alpha^{-1}(s),\, w_\alpha^{-1}(s) + \varepsilon \,] \big) = 0$. Since $w_\alpha^{-1}$ is continuous and decreasing, there is a $\delta > 0$ such that $s - \delta \ge 0$ and $w_\alpha^{-1}(s) < w_\alpha^{-1}(s - \delta) \le w_\alpha^{-1}(s) + \varepsilon$, hence $Q_0\big( [\, w_\alpha^{-1}(s),\, w_\alpha^{-1}(s - \delta) \,] \big) = 0$, and we get from (26), (28),

$$\rho + \tau s = g_{Q_0}(s) = Q_0\big( [\, w_\alpha^{-1}(s),\, c_\alpha \,] \big) = Q_0\big( [\, w_\alpha^{-1}(s - \delta),\, c_\alpha \,] \big) = g_{Q_0}(s - \delta) \le \rho + \tau (s - \delta),$$

which is a contradiction in view of $\tau > 0$.

Let $t_0$ denote the maximum support point of $Q_0$; by (29), (30), $1/\sqrt{\alpha} \le t_0 \le c_\alpha$. Denote $s_0 = w_\alpha(t_0)$. By (26), (28), and (29),

$$\rho + \tau s_0 = Q_0(t_0) \quad \text{and} \quad \rho + \tau t_0 = Q_0\big( [\, s_0,\, t_0 \,] \big), \qquad (31)$$

where for short we write $Q_0(t_0)$ instead of $Q_0(\{ t_0 \})$. We have to distinguish three cases to prove $\rho + \tau \le \alpha$.

Case 1: $t_0 = c_\alpha$. Case 2: $t_0 < c_\alpha$ and $0 \in \mathrm{supp}(Q_0)$. Case 3: $t_0 < c_\alpha$ and $0 \notin \mathrm{supp}(Q_0)$.

Ad case 1: Then $s_0 = 0$ and (31) rewrites as

$$\rho = Q_0(c_\alpha) \quad \text{and} \quad \rho + \tau c_\alpha = 1,$$

hence $c_\alpha \rho = c_\alpha Q_0(c_\alpha) \le E[Q_0] = 1$ and

$$\rho + \tau = \rho + \frac{1 - \rho}{c_\alpha} = \Big( 1 - \frac{1}{c_\alpha} \Big)\rho + \frac{1}{c_\alpha} \le \Big( 1 - \frac{1}{c_\alpha} \Big)\frac{1}{c_\alpha} + \frac{1}{c_\alpha} = \alpha.$$

Ad case 2: Then, by (26) and (28), $\rho = g_{Q_0}(0) = Q_0(c_\alpha) = 0$, and together with (31), $\tau t_0 = Q_0([\, s_0,\, t_0 \,]) \le 1$. If $t_0 \ge 1/\alpha$ this yields $\rho + \tau = \tau \le 1/t_0 \le \alpha$. Now let $t_0 < 1/\alpha$. From

$$1 = E[Q_0] \ge s_0\, Q_0\big( [\, s_0,\, t_0 \,) \big) + t_0\, Q_0(t_0) = s_0\, Q_0\big( [\, s_0,\, t_0 \,] \big) + (t_0 - s_0)\, Q_0(t_0) \qquad (32)$$

we obtain, observing (31) and $\rho = 0$,

$$1 \ge s_0 \cdot \tau t_0 + (t_0 - s_0) \cdot \tau s_0 = \tau\, s_0 (2 t_0 - s_0).$$

Because of $t_0 < 1/\alpha$ and $s_0 = w_\alpha(t_0)$ we see from (22) that $s_0 (2 t_0 - s_0) = 1/\alpha$, and we conclude $\rho + \tau = \tau \le \alpha$.

Ad case 3: Then, by (29), (30), $s_0$ is the minimum support point of $Q_0$ and hence $Q_0([\, s_0,\, t_0 \,]) = 1$. For short let us write $\lambda_0 = Q_0(t_0)$. So (31) rewrites as

$$\rho + \tau s_0 = \lambda_0 \quad \text{and} \quad \rho + \tau t_0 = 1.$$

Resolving for $\rho$ and $\tau$ yields

$$\rho = \frac{t_0 \lambda_0 - s_0}{t_0 - s_0}, \qquad \tau = \frac{1 - \lambda_0}{t_0 - s_0}.$$

Since $\rho \ge 0$, we have $\lambda_0 \ge s_0/t_0$; since (32) is generally true, we have $s_0 + (t_0 - s_0)\lambda_0 \le 1$, hence $\lambda_0 \le (1 - s_0)/(t_0 - s_0)$. Thus $s_0/t_0 \le (1 - s_0)/(t_0 - s_0)$, which after straightforward calculations gives $s_0 \le t_0 - \sqrt{t_0^2 - t_0}$. Since $s_0 = w_\alpha(t_0)$, we see from (22) that $t_0 \ge 1/\alpha$, and hence $s_0 = w_\alpha(t_0) = t_0 - (t_0 - 1)/\sqrt{1 - \alpha}$. So we obtain

$$\rho + \tau = \frac{t_0 \lambda_0 - s_0}{t_0 - s_0} + \frac{1 - \lambda_0}{t_0 - s_0} = \frac{(t_0 - 1)\lambda_0 + 1 - s_0}{t_0 - s_0} \le (t_0 - 1)\frac{1 - s_0}{(t_0 - s_0)^2} + \frac{1 - s_0}{t_0 - s_0} = \frac{1 - s_0}{t_0 - s_0}\Big( \frac{t_0 - 1}{t_0 - s_0} + 1 \Big).$$

Now, $\dfrac{1 - s_0}{t_0 - s_0} = 1 - \sqrt{1 - \alpha}$ and $\dfrac{t_0 - 1}{t_0 - s_0} = \sqrt{1 - \alpha}$, and thus

$$\rho + \tau \le \big( 1 - \sqrt{1 - \alpha} \big)\big( 1 + \sqrt{1 - \alpha} \big) = \alpha.$$
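The quantities $c_\alpha$ and $w_\alpha$ appearing in the proof are simple to evaluate, and (23) can be verified numerically along the boundary curve, reusing the `W2` helper from the sketch after Lemma 3.3 (again our own illustration, not code from [1]; run the two sketches together).

```python
import math

def c_alpha(alpha):
    """Right endpoint of I_alpha = [0, c_alpha]; W(0, c_alpha) = alpha."""
    return 1.0 / (1.0 - math.sqrt(1.0 - alpha))

def w_alpha(t, alpha):
    """The boundary function of eq. (22), for 1/sqrt(alpha) <= t <= c_alpha."""
    if t <= 1.0 / alpha:
        return t - math.sqrt(t * t - 1.0 / alpha)
    return t - (t - 1.0) / math.sqrt(1.0 - alpha)

alpha = 0.05
c = c_alpha(alpha)
# By (23), W equals alpha along the curve x_(1) = w_alpha(x_(2)):
for t in [1.0 / math.sqrt(alpha), 1.0 / alpha, 0.5 * (1.0 / alpha + c), c]:
    print(round(t, 4), round(w_alpha(t, alpha), 4), W2(w_alpha(t, alpha), t))
# The last column is 0.05 in every row.
```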

5 The negative result for n >= 3

As announced earlier, we will prove that if $n \ge 3$ then the test $\psi$ from Lemma 3.1 does not keep the level $\alpha$ on $H_0$, i.e.,

$$\sup_{Q \in H_0} Q^n( W \le \alpha ) > \alpha,$$

and, moreover, there does not exist a UMP strongly monotone symmetric level-$\alpha$ test for $H_0$ vs. $H_1$. To this end we need the following lemma.

Lemma 5.1 Let $n \ge 3$ and $\alpha \in (0, 1)$ be given. There exist real numbers $b > a > 1$ such that

$$W(\underbrace{0, \ldots, 0}_{n-3 \text{ times}}, a, a, a) = W(\underbrace{0, \ldots, 0}_{n-2 \text{ times}}, a, b) = \alpha.$$

Proof. For any $\beta > 1$ and $1 \le j \le n$ we get from Lemma 3.2

$$W(\underbrace{0, \ldots, 0}_{n-j \text{ times}}, \underbrace{\beta, \ldots, \beta}_{j \text{ times}}) = \sum_{i=j}^n \binom{n}{i} \Big( \frac{1}{\beta} \Big)^i \Big( 1 - \frac{1}{\beta} \Big)^{n-i}. \qquad (33)$$

This shows, in particular, that for any sequence $x^{(k)} \in [0, \infty)^n$ ($k \in \mathbb{N}$) with $x_{(n)}^{(k)} \to \infty$ for $k \to \infty$ we have $\lim_k W(x^{(k)}) = 0$, since with $\beta_k = x_{(n)}^{(k)}$ and $j = 1$ in (33),

$$W(x^{(k)}) \le W(\underbrace{0, \ldots, 0}_{n-1 \text{ times}}, \beta_k) = 1 - \Big( 1 - \frac{1}{\beta_k} \Big)^n \to 0 \quad (k \to \infty).$$

Consider $W(0, \ldots, 0, \beta, \beta, \beta)$ (zero appearing $n-3$ times) for $1 \le \beta < \infty$. For $\beta = 1$, $W(0, \ldots, 0, 1, 1, 1) = 1$, and for $\beta \to \infty$, $W(0, \ldots, 0, \beta, \beta, \beta) \to 0$. By the continuity of $W$ there exists an $a > 1$ such that $W(0, \ldots, 0, a, a, a) = \alpha$. The r.h.s. of (33) decreases when $j$ increases, and in particular

$$\gamma := W(\underbrace{0, \ldots, 0}_{n-2 \text{ times}}, a, a) > W(\underbrace{0, \ldots, 0}_{n-3 \text{ times}}, a, a, a) = \alpha.$$

Consider $W(0, \ldots, 0, a, \beta)$ (zero appearing $n-2$ times) for $a \le \beta < \infty$. For $\beta = a$, $W(0, \ldots, 0, a, a) = \gamma > \alpha$, and for $\beta \to \infty$, $W(0, \ldots, 0, a, \beta) \to 0$. By the continuity of $W$ there exists a $b > a$ such that $W(0, \ldots, 0, a, b) = \alpha$.
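The constants $a$ and $b$ of Lemma 5.1 can be computed numerically by combining formula (33) with the Lemma 3.2 reduction of $W(0, \ldots, 0, a, b)$ to a one-parameter maximization (the mean constraint fixes $q_a$ once $q_b$ is chosen). The sketch below is our own; the helper names are hypothetical, and the inner grid search is only a crude substitute for an exact optimization.

```python
import math
from scipy.optimize import brentq

def W_equal(beta, n, j):
    """Formula (33): W(0,...,0, beta,...,beta) with j copies of beta > 1."""
    p = 1.0 / beta
    return sum(math.comb(n, i) * p**i * (1.0 - p)**(n - i) for i in range(j, n + 1))

def W_0ab(a, b, n, grid=2001):
    """W(0,...,0,a,b) for 1 < a < b via Lemma 3.2: maximize
    P(X_(n-1) >= a, X_(n) >= b) over Q on {0, a, b} with E[Q] = 1."""
    best = 0.0
    for k in range(grid):
        qb = (k / (grid - 1)) / b              # q_b in [0, 1/b]
        qa = (1.0 - qb * b) / a                # mean constraint E[Q] = 1
        if qa + qb > 1.0:
            continue
        q0 = 1.0 - qa - qb
        p = 0.0                                # P(#b >= 1 and #a + #b >= 2)
        for i in range(1, n + 1):              # i = number of observations equal to b
            for j in range(max(0, 2 - i), n - i + 1):   # j = number equal to a
                p += (math.comb(n, i) * math.comb(n - i, j)
                      * qb**i * qa**j * q0**(n - i - j))
        best = max(best, p)
    return best

n, alpha = 3, 0.05
a = brentq(lambda t: W_equal(t, n, 3) - alpha, 1.0 + 1e-9, 1e6)
b = brentq(lambda t: W_0ab(a, t, n) - alpha, a, 1e6)
print(a, b)   # for n = 3, alpha = 0.05: a = 0.05**(-1/3) = 2.714..., and b > a
```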

Theorem 5.2 Let $n \ge 3$ and $\alpha \in (0, 1)$. Choose $b > a > 1$ as in Lemma 5.1. Then:

(i) For $x^* = (0, \ldots, 0, a, b)$ (with $n-2$ zeros) and $Q_{x^*}$ from Lemma 3.2 we have $Q_{x^*}^n( W \le \alpha ) > \alpha$. In particular, the test $\psi$ does not keep the level $\alpha$ on $H_0$.

(ii) The tests $\phi_1$ and $\phi_2$, defined by

$$\phi_1(x) = \begin{cases} 1, & \text{if } x_{(n-2)} \ge a \\ 0, & \text{else} \end{cases} \qquad \phi_2(x) = \begin{cases} 1, & \text{if } x_{(n-1)} \ge a,\ x_{(n)} \ge b \\ 0, & \text{else} \end{cases}$$

are nonrandomized monotone symmetric level-$\alpha$ tests for $H_0$, and no strongly monotone symmetric level-$\alpha$ test for $H_0$ can be at least as powerful (uniformly on $H_1$) as both $\phi_1$ and $\phi_2$. In particular, there does not exist a UMP strongly monotone symmetric level-$\alpha$ test for $H_0$ vs. $H_1$.

Proof. (i) Because of $E[Q_{x^*}] = 1$ and $b > a > 1$ we have $\mathrm{supp}(Q_{x^*}) = \{ 0, a, b \}$. The set $\{ W \le \alpha \}$ has as a subset

$$C_{x^*} = \big\{ x \in [0, \infty)^n : x_{(i)} \ge x_{(i)}^* \ \ i = 1, \ldots, n \big\},$$

since $W$ is symmetric and nonincreasing and $W(x^*) = \alpha$. Another subset of $\{ W \le \alpha \}$ is given by

$$D^* = \big\{ x \in [0, \infty)^n : x_{(i)} = 0,\ 1 \le i \le n-3, \text{ and } x_{(n-2)} = x_{(n-1)} = x_{(n)} = a \big\},$$

again by the symmetry of $W$ and by $W(0, \ldots, 0, a, a, a) = \alpha$. Clearly, $C_{x^*}$ and $D^*$ are disjoint, and hence $Q_{x^*}^n( W \le \alpha ) \ge Q_{x^*}^n( C_{x^*} ) + Q_{x^*}^n( D^* )$, and the assertion follows from

$$Q_{x^*}^n( C_{x^*} ) = W(x^*) = \alpha \quad \text{and} \quad Q_{x^*}^n( D^* ) = \binom{n}{3} (Q_{x^*}(0))^{n-3} (Q_{x^*}(a))^3 > 0.$$

(ii) The tests $\phi_1$ and $\phi_2$ are clearly nonrandomized, monotone, and symmetric. By the definition of $W$ we have, with $x^* = (0, \ldots, 0, a, b)$ (zero appearing $n-2$ times) as above, $y^* = (0, \ldots, 0, a, a, a)$ (zero appearing $n-3$ times), and the notation $C_{x^*}$, $C_{y^*}$ from (11),

$$\sup_{Q \in H_0} E_{Q^n}(\phi_1) = \sup_{Q \in H_0} Q^n( C_{y^*} ) = W(y^*) = \alpha, \qquad \sup_{Q \in H_0} E_{Q^n}(\phi_2) = \sup_{Q \in H_0} Q^n( C_{x^*} ) = W(x^*) = \alpha.$$

So $\phi_1$ and $\phi_2$ are nonrandomized monotone symmetric level-$\alpha$ tests for $H_0$. Now let $\phi$ be any strongly monotone symmetric level-$\alpha$ test for $H_0$ which is uniformly at least as powerful on $H_1$ as $\phi_1$, i.e.,

$$E_{Q^n}(\phi) \ge E_{Q^n}(\phi_1) \quad \forall\, Q \in \mathrm{Prob}([0, \infty)) \text{ with } E[Q] > 1. \qquad (34)$$

We have to show that $\phi$ is not uniformly at least as powerful on $H_1$ as $\phi_2$. To this end we specialize (34) to the two-point distributions $Q = \begin{pmatrix} 0 & a \\ 1-\lambda & \lambda \end{pmatrix}$ with $1/a < \lambda < 1$, which yields

$$E_{Q^n}(\phi) = \sum_{i=0}^n \varphi_i \binom{n}{i} \lambda^i (1-\lambda)^{n-i} \ \ge \ E_{Q^n}(\phi_1) = \sum_{i=3}^n \binom{n}{i} \lambda^i (1-\lambda)^{n-i}, \qquad (35)$$

where we have denoted

$$\varphi_i = \phi(\underbrace{0, \ldots, 0}_{n-i \text{ times}}, \underbrace{a, \ldots, a}_{i \text{ times}}), \quad 0 \le i \le n.$$

Hence it follows that $\varphi_3 = 1$, i.e.,

$$\phi(0, \ldots, 0, a, a, a) = 1. \qquad (36)$$

For, suppose that $\varphi_3 < 1$. Then, by strong monotonicity of $\phi$, $\varphi_i = 0$ for $i = 0, 1, 2$, which contradicts (35) (the l.h.s. of (35) would then be strictly smaller than the r.h.s.).

As a next step we show that

$$\theta := \phi(0, \ldots, 0, a, b) < 1. \qquad (37)$$

Suppose that $\theta = 1$. Then $\phi \ge \phi_2$ by the symmetry and monotonicity of $\phi$. By Lemma 3.2 there is a probability distribution $Q_0$ with support equal to $\{ 0, a, b \}$ and with $E[Q_0] = 1$ such that $\alpha = W(0, \ldots, 0, a, b) = E_{Q_0^n}(\phi_2)$.

But $E_{Q_0^n}(\phi) \le \alpha$, and thus we must have $\phi(x) = \phi_2(x)$ for all $x$ from the support of $Q_0^n$, i.e., for all $x$ with components in $\{ 0, a, b \}$, and in particular $\phi(0, \ldots, 0, a, a, a) = \phi_2(0, \ldots, 0, a, a, a) = 0$, contradicting (36). Hence (37) follows.

Consider the sets

$$C = \{ x = (x_1, \ldots, x_n) : x_i \in \{ 0, a, b \} \ (1 \le i \le n),\ x_{(n-2)} = 0,\ x_{(n-1)} = a,\ x_{(n)} = b \},$$
$$D = \{ x = (x_1, \ldots, x_n) : x_i \in \{ 0, a, b \} \ (1 \le i \le n),\ x_{(n)} \le a \}.$$

Then we have

$$\phi(x) + (1 - \theta)\, 1_C(x) \le \phi_2(x) + 1_D(x) \qquad (38)$$

for all $x = (x_1, \ldots, x_n)$ with $x_i \in \{ 0, a, b \}$ ($1 \le i \le n$), which can be seen as follows. The l.h.s. of (38) does not exceed 1 because of $\phi(x) = \theta$ on $C$, while the r.h.s. of (38) yields either 0, 1, or 2. So it remains to verify inequality (38) for all $x$ with components in $\{ 0, a, b \}$ and $\phi_2(x) = 1_D(x) = 0$. The latter means that $x_{(n-1)} = 0$ and $x_{(n)} = b$, hence $\phi(x) = \phi(0, \ldots, 0, b)$, which equals zero by (37) and the strong monotonicity of $\phi$, and clearly $1_C(x) = 0$. This shows (38). Hence it follows that for any probability distribution $Q$ with $\mathrm{supp}(Q) = \{ 0, a, b \}$ we have

$$E_{Q^n}(\phi_2) - E_{Q^n}(\phi) \ \ge \ (1 - \theta)\, Q^n(C) - Q^n(D) = n(n-1)(1 - \theta)\, Q(b)\, Q(a)\, (Q(0))^{n-2} - (1 - Q(b))^n.$$

Taking $Q(0) = Q(a) = \varepsilon$, $Q(b) = 1 - 2\varepsilon$ with $0 < \varepsilon < 1/2$ and $E[Q] > 1$, i.e., $\varepsilon < (b-1)/(2b - a)$, one gets

$$E_{Q^n}(\phi_2) - E_{Q^n}(\phi) \ \ge \ n(n-1)(1 - \theta)(1 - 2\varepsilon)\varepsilon^{n-1} - (2\varepsilon)^n = \varepsilon^{n-1}\Big( n(n-1)(1 - \theta) - \big[\, 2n(n-1)(1 - \theta) + 2^n \,\big]\varepsilon \Big).$$

The latter expression is positive for some small positive $\varepsilon$. So we have shown that there exists a probability distribution $Q$ on $[0, \infty)$ with $E[Q] > 1$ and $E_{Q^n}(\phi) < E_{Q^n}(\phi_2)$, i.e., $\phi$ is not uniformly at least as powerful on $H_1$ as $\phi_2$.

References

[1] Wang, W., Zhao, L.H.: Nonparametric tests for the mean of a non-negative population. J. Statist. Plann. Inference 110 (2003), pp. 75-96.

[2] Bauer, H.: Maß- und Integrationstheorie. [In German.] De Gruyter, Berlin, 1992.

[3] Billingsley, P.: Weak Convergence of Measures: Applications in Probability. Society for Industrial and Applied Mathematics, Philadelphia, 1971.

[4] Shaked, M., Shanthikumar, J.G.: Stochastic Orders and Their Applications. Academic Press, San Diego, 1994.