Zhishui Hu 1, John Robinson 2 and Qiying Wang 2

ESAIM: PS 6 (202) 425 435 DOI: 0.05/ps/200027 ESAIM: Probability and Statistics www.esaim-ps.org TAIL APPROXIMATIOS FOR SAMPLES FROM A FIITE POPULATIO WITH APPLICATIOS TO PERMUTATIO TESTS Zhishui Hu, John Robinson 2 and Qiying Wang 2 Abstract. This paper derives an explicit approximation for the tail probability of a sum of sample values taken without replacement from an unrestricted finite population. The approximation is shown to hold under no conditions in a wide range with relative error given in terms of the standardized absolute third moment of the population, β 3. This approximation is used to obtain a result comparable to the well-known Cramér large deviation result in the independent case, but with no restrictions on the sampled population and an error term depending only on β 3. Application to permutation tests is investigated giving a new limit result for the tail conditional probability of the statistic given order statistics under mild conditions. Some numerical results are given to illustrate the accuracy of the approximation by comparing our results to saddlepoint approximations requiring strong conditions. Mathematics Subject Classification. 62E20, 60F05. Received September, 2009. Revised June 7, 200.. Introduction Let X,X 2,...,X n be a simple random sample drawn without replacement from a finite population {a} = {a,...,a } with a k =0and a 2 k =, wheren< and throughout the paper without limits denotes summation over k from to. Let S n = n j= X j, p = n/, q = p, ω 2 = pq, b =max a k. k Under appropriate conditions, the so-called finite central limit theorem (see [7,8]) states that, as n, sup P (Sn xω ) { Φ(x)} 0, x where Φ(x) is the distribution function of a standard normal variate. The central limit theorem is useful when x is not too large. There are two approaches for estimating the error of the normal approximation. One approach eywords and phrases. Cramér large deviation, saddlepoint approximations, moderate deviations, finite population, permutation tests. Department of Statistics and Finance, University of Science and Technology of China, Hefei 230026, Anhui, P.R. China. huzs@ustc.edu.cn 2 School of Mathematics and Statistics The University of Sydney, SW, 2006, Australia. johnr@maths.usyd.edu.au; qiying@maths.usyd.edu.au Article published by EDP Sciences c EDP Sciences, SMAI 202

426 Z. HU ET AL. is to investigate the absolute error via Berry Esseen bounds and Edgeworth expansions. This has been done by many researchers. We only refer to Bikelis [3] andhöglund [9] for the rates in the Erdös and Rényi central limit theorem, and Bickel and van Zwet [2], Robinson [2], Babu and Bai [] as well as Bloznelis [4,5] forthe Edgeworth expansions. Another approach is to investigate the relative error of P ( ) S n xω to Φ(x). In this direction, Hu et al. (HRW) [0] derived the following result: there is an absolute constant A>0 such that exp { A( + x) 3 } P ( ) S n xω β 3 /ω exp { A( + x) 3 } β 3 /ω, (.) Φ(x) for 0 x (/A)ω /b,whereβ 3 = E X 3 = a k 3 /. As a direct consequence of (.), HRW [0] also established the following Cramér-type large deviation result: there exists an absolute constant A>0 such that P ( ) S n xω = +O()( + x) 3 β 3 /ω, (.2) Φ(x) for 0 x (/A) min { ω /b, (ω /β 3 ) /3}. Results (.) and(.2) are useful because they provide not only the relative error but also a Berry Esseen rate of convergence. It is also interesting to note that the results depend only on β 3 with an absolute constant and we have no restrictions on the {a} and p. As mentioned in Remark.4 of HRW [0], however, the results (.) and(.2) do not provide sufficiently precise results on the accuracy of the normal approximation in the large deviation region. In the present paper, an analogue of Cramér s large deviation result in the independent case (see Chap. 8 of Petrov (975), for example) is obtained in Theorem 2.2, which essentially improves the results (.) and(.2) in the situation that an explicit large deviation range for x is provided and the error term is improved by reducing the power of x. Rather than studying the accuracy of the normal approximation, we derive in Theorem 2. an approximation for the tail probability P ( S n xω ) with no restrictions on the population or on p, witha relative error depending only on β 3, and giving an explicit large deviation range for x. This result is comparable to those of Robinson [3], Robinson et al. [4] and Wang [5] which require a number of very strong restrictive conditions. Furthermore, numerical examples show that the tail approximation of Theorem 2. is effectively as accurate as the saddle-point approximations of Booth and Butler [6] and Wang [5]. This indicates that the tail approximation of Theorem 2. can be used to replace the saddle-point approximations in many applications since these require a number of very strong restrictive conditions on the sampled data. In Section 2 we present, in Theorem 2., a tail probability approximation and from this we derive, in Theorem 2.2, a result analogousto that of Cramér for independent and identically distributed random variables. The result of Theorem 2. may be considered a first order saddlepoint approximation, which can be applied with no conditions. It gives a precise region for existence of a unique solution of the saddlepoint equations and a relative error for an explicit large deviation region. Theorem 2.2 improves essentially Theorem of Robinson [] in two aspects. First we provide an explicit range for the x and, second, the error term in Robinson [] depends on p, which limits essential applications of the result, whereas our results hold true with no restrictions on the {a} and p. Section 3 applies our main results to two-sample permutation tests, obtaining in Theorem 3. an approximation for the conditional probability given the order statistics with small relative error almost surely. Some numerical comparisons are presented in Section 4. Finally, in Section 5, we present the proofs of the main results. Throughout A, A,A 2,... are absolute constants. 2. Main results The complex moment generating function of S n is shown in () of Robinson [] tobe π Q n (u + iv) =(2πB n ) exp[ ((u + iv)a k (pq) /2 + α + iθ)]dθ, π

TAIL APPROXIMATIOS FOR FIITE POPULATIOS 427 for any α, whereb n = ( n) p n q n, (z) =log(p e qz + q e pz ), p, q > 0andp + q =. otingthat (t) =pq( e t ) / (p e t + q) and (t) =pq e t/ (p e t + q) 2, (2.) we have that (t) > 0 for any t R and (t) > 0or< 0ift>0ort<0. These facts imply that, for each u R, (ua k + α) is strictly increasing in α and the equation (ua k + α) = 0 (2.2) has a unique solution α (u). Furthermore, α (0) = 0 and α (u) <C if 0 <u<c/b. In the following, let a () a (2)... a () be the ordered values of a,...,a,and k (u), k (u) and k (u) bethevaluesof(t), (t) and (t) evaluated at t = ua k + α (u). Throughout we take x>0and u>0. Our main result is given as follows. Theorem 2.. (i) Assume that min{a (k+) a (k) : k =,..., } >δ > 0, we have α (u)/u (a ( n+) + a ( n) )/2 as u. (ii) The function m (u) = a k k (u) is strictly increasing, m (u) a () +...+ a ( n+) as u, and the equation m (u) =xω, (2.3) has a unique solution u x whenever 0 <x< [ ] a () +...+ a ( n+) /ω. (iii) For any given constant C>0, if0 <x<m (C/b ) / ω,then ) P (S n >xω ) = ( Φ(u x σ )) Λ n (x) (+O ( + u x σ )β 3 /ω, (2.4) where ω { Λ n (x) = k (u x) exp k (u x ) u x xω + u 2 xσ /2} 2, u x is the solution of equation (2.3), σ 2 σ2 (u x)= ( a 2 k k (u ) 2/ x) ak k (u x) k (u x), β 3 = E X 3 and O is bounded by a constant depending only on C. (iv) The result (2.4) holds true if 0 <x<ω /(2eb ). Theorem 2. provides an approximation for the tail probability P (S n >xω ) in an explicit wide large deviation range for the x. As mentioned in Section, the tail approximation of Theorem 2. is comparable in accuracy to the well-known saddle-point approximations developed in [6,3,5]. The numerical examples to illustrate this statement are given in Section 4. However, unlike these results which require a number of very strong restrictive conditions on the sampled data, our result (2.4) is established under no conditions since we always have E X 3 = a k 3 / <. The error can be improved under a smoothness condition and further terms to improve accuracy can then be obtained. As a consequence of this result, we also have the following Theorem 2.2, which is comparable to the classical Cramér large deviation result as in the independent case. This result also provides an approximation for the tail probability P (S n >xω ) under no conditions and has theoretical interest. This approximation gives less accuracy compared to the saddle-point approximation (2.4) as illustrated in Section 4, and its purpose is to give a rate of convergence for the Central Limit Theorem rather than to provide an approximation.

428 Z. HU ET AL. Theorem 2.2. For 0 <x<ω /(2eb ), we have P (S n >xω ) Φ(x) { ) = exp x 3 λ (x) }(+O 2 ( + x)β 3 /ω, (2.5) where x 3 λ (x) = k (u x ) u x xω + x 2 /2, u x is the solution of equation (2.3) and O 2 is bounded by an absolute positive constant. Remark 2.3. As in [], λ (x) may be represented as a power series of x with coefficients not depending on p and convergent in some circle. For more details on these lines, we refer to (45) and (46) in []. Remark 2.4. For an asymptotic version of these results, consider a sequence of finite populations {a,...,a }, =, 2, 3,... The results of Theorems 2. and 2.2 hold for each member of this sequence. The relative error term in (2.4), O ( + u x σ )β 3 /ω can then be replaced by an order term O(( + u x σ )β 3 /ω ) and for the results to give appropriate limit theorems we need to assume that ( + u x σ )β 3 /ω 0for 0 <x<ω /2eb as. In the particular example of the Wilcoxon two sample statistic considered in Section 4, b = 3( )/( +), 0 <u x <C/b, σ can be bounded above and below by D pq and d pq, whered and D depend only on C, and β 3 <, so the relative error is O(/ω ). The range of x in this case is 0 <x<cω. Here then, the relative error term is O(( + x)/ n)soitispossibletohavep 0so long as n = p. Further asymptotic results are given in Section 3. 3. Applications to permutation tests Suppose that we observe the random variables X,...,X n and X n+,...,x. Consider the model: X,...,X n and X n+,...,x are independent and identically distributed (iid) with distribution functions F (x θ )andf (x θ 2 ), respectively. We are interested in the hypothesis H : θ θ 2 = δ 0.Write ã k =(Y k Ȳ ) / [ (Yi Ȳ )2 ] /2, k =,...,, (3.) where Ȳ = Yk and { Xi δ Y i = 0, i =,...,n, X i, i = n +,...,. The typical permutation test statistic for the hypothesis H is given by T = (pq) /2 n ã Ri, where (R,...,R ) is a random vector, independent of all preceding random variables, taking each permutation of (,...,) with equal probability. Given the order statistics Y (),...,Y (), the conditional distribution of T has been investigated widely. For instance, Bickel and van Zwet [2] and Robinson [2] investigated Edgeworth expansions. Robinson [3] discussed the saddle point approximation under smoothness conditions. By using Theorems 2. and 2.2, this section derives explicit approximations of P (T x) in the large deviation range assuming only the existence of a finite moment of the sampled population, where P indicates the conditional distribution given Y (),...,Y (). Simulations in Section 4 show that the approximation of Theorem 2. provides accuracy which is comparable to the saddlepoint approximation given in Booth and Butler [6] and Wang [5]. i=

TAIL APPROXIMATIOS FOR FIITE POPULATIOS 429 In order to use Theorems 2. and 2.2, we still adopt the notation as in Section 2 except that a k = ã k,where ã k is given by (3.). The main result in this section is as follows. Theorem 3.. Under the assumption that the hypothesis H is true, if E X m < for some m 3, then for j =, 2, P (T >x) Ψ j (x), a.s. as, (3.2) holds true uniformly in 0 <x O ( ) { } pq/τ,whereτ / /m, Ψ (x) =( Φ(u x σ ))Λ n (x) and Ψ 2 (x) =( Φ(x)) exp x 3 λ (x) are defined as in Theorems 2. and 2.2 respectively. Proof. By the result(iv) of Theorem 2. and (5.) below, we have P (T n >x) Ψ (x) = (+O ( + x)β 3 /ω ), (3.3) [ for 0 <x<ω /(2eb ), where b =max k ã k =max k Y k Ȳ / ] /2, (Yk Ȳ )2 β 3 = Yk Ȳ 3/[ (Yk Ȳ )2 ] 3/2 and O is bounded by an absolute positive constant. oting that W = Y EY,...,W = Y EY are iid random variables with E Y m <, under the hypothesis H and the assumption E X m <, form 3, it is readily seen from the law of large numbers that, as, β3 E W 3 /(EW 2 ) 3/2 a.s. and b 2 max Y k Ȳ k /(EW2 ) /2 a.s. (3.4) From (3.3) and(3.4), Theorem 3. for j = will follow if we prove, as, τ max Y k Ȳ 0, k a.s. This follows from the law of large numbers, since E Y m AE X m < implies that τ max Y k EY k /m k τ [ ] /m Yk EY k m 0, a.s., whenever τ / /m.theorem3. for j = 2 is similar except that we replace the result (iv) of Theorem 2. by Theorem 2.2 and hence the details are omitted. This also completes the proof of Theorem 3.. Remark 3.2. ote that in the asymptotic case for the bound 0 <x< pq/τ to be non-trivial we require that p does not decrease as fast as τ 2 /. 4. umerical comparisons We will present two examples to illustrate the accuracy of the results of Theorems 2. and 2.2, comparing these to the full saddlepoint given by Booth and Butler [6] (with no proof of the bounds) and to the approximation of Wang [5], obtained under very strong conditions. We look at the data set used by Wang [5] inhistable where =36andn = 5, and we also consider the approximation for the Wilcoxon rank sum statistic for

430 Z. HU ET AL. Table. Approximations to P ( 5 j= d R j >y)forwang[5] datasetd,d 2,...,d 36. y.2 3.6 6.0 8.8 2.8 MC 0.3350 0.523 0.0523 0.00 0.0004 Wang93 0.337 0.53 0.0532 0.009 0.0004 BB 0.3387 0.523 0.0534 0.009 0.0004 Theorem 2. 0.363 0.56 0.0528 0.004 0.0004 Theorem 2.2 0.3602 0.503 0.0470 0.0079 0.0002 ormal 0.3589 0.39 0.0353 0.0040 0.000 Table 2. Approximations to P ( 0 i= R i >y) for random permutation R,R 2,...,R 6 of, 2,...,6. y 5 4 3 2 0 09 08 Exact 0.0002 0.00025 0.00050 0.00087 0.0050 0.00237 0.00387 0.00562 BB 0.000 0.00025 0.00049 0.00088 0.0049 0.00240 0.00370 0.00552 Theorem 2. 0.00009 0.0002 0.00042 0.00077 0.0032 0.0024 0.00333 0.00499 Theorem 2.2 0.00002 0.00006 0.0007 0.00036 0.00068 0.0022 0.00204 0.00326 ormal 0.00048 0.00070 0.0003 0.0050 0.0024 0.00303 0.00424 0.00586 P (S n y) where =6andn = 0. In the first case, in Table, we look at Monte Carlo estimates based on 000 000 samples (MC), then the results taken from Wang [5] Table2, an approximation based on the full conditional saddlepoint of Booth and Butler [6] (BB), the approximations from Theorems 2. and 2.2 and the normal approximation. For the Wilcoxon, in Table 2, we give the exact probabilities in place of a Monte Carlo approximation, and all other approximations but those of Wang [5], using a continuity correction in this lattice case to improve the approximations. In each case the results are similar. The results of Wang [5] and of the method of Booth and Butler [6] are remarkably close to the Monte Carlo values throughout the range. The methods of Theorem 2. give quite good results throughout the range, again as might be expected since this gives a first order approximation to the full conditional saddlepoint. The second order approximations of Booth and Butler [6] and of Wang [5] have second order relative accuracy only under a smoothness condition, whereas no such restriction applies to the results of Theorem 2.. Theorem 2.2 does not give particularly good approximations. We might remark that, although the errors in Theorems 2. and 2.2 are the same, the result (2.4) oftheorem2. (iii) can be extended, under a smoothness condition, to an indirect Edgeworth approximation with a smaller relative error, as in Robinson [3], but such an improvement is not possible for the approximation from (2.5). Proof of Theorem 2.. (i). From (3) (t)+p = 5. Proofs of theorems pet pe t + q and (t) q = qe t (5.) p + qe t Let Δ = a ( n+) a ( n).ifα/u = (a ( n+) + a ( n) )/2 ɛ, for arbitrarily small Δ/2 >ɛ>0, then for k =,..., n, for k = n +2,...,, 0 < (ua (k) + α)+p =e uδ/2 uɛ O(e uδ ), 0 < (ua (k) + α)+q =e uδ/2+uɛ O(e uδ )

TAIL APPROXIMATIOS FOR FIITE POPULATIOS 43 and (ua ( n) + α)+p + (ua ( n+) + α) q = which is less than 0 for u large. So, noting that ( n)p nq =0, n ( (ua (k) + α)+p)+ k= k= n+ p 2 e uδ/2 uɛ q 2 e uδ/2+uɛ (pe ua ( n)+α + q)(p + qe ua ( n+) α ), ( (ua (k) + α) q) < 0, for u large. In the same way, putting α/u = (a ( n+) + a ( n) )/2+ɛ, we can show that n ( (ua (k) + α)+p)+ k= k= n+ ( (ua (k) + α) q) > 0. Since ɛ is chosen arbitrarily, this implies α (u)/u (a ( n+) + a ( n) )/2 asu. (ii). Since α (u) = a k k (u)/ k (u) by differentiating (2.2), we have m (u) = [ a k ak + α (u)] k (u) = [a k + α (u)]2 k (u) =σ2 (u) > 0, (5.2) for u R. Som (u) is strictly increasing. Then, as u,using(i) and a k =0, m (u) = pq e q(ua k+α (u)) e p(ua k+α (u)) a k pe q(ua k+α (u)) + qe p(ua k+α (u)) pq[ ak I(ua k + α (u) > 0)/p ] a k I(ua k + α (u) < 0)/q = a k I(ua k + α (u) > 0) a () +...+ a ( n+). This, together with the fact m (0) = 0 (recalling α (0) = 0), yields that for 0 <xω <a () +...+ a ( n+), there is a unique solution of (2.3). This is as might be expected as a () +...+ a ( n+) is the maximum value for S n. (iii). From (ii), for any 0 <x<m (C/b ) /ω,thesolutionu x of equation (2.3) is unique and 0 <u x C/b.LetH n (t; u x )=Ee uxsn I(S n t)/ee uxsn.wehave P (S n >xω ) = Ee uxsn e uxt dh n (t; u x ) xω ( ) t = Ee uxsn e uxt m (u x ) dφ xω σ ( ( )) t + Ee uxsn e uxt m (u x ) d H n (t; u x ) Φ xω σ := Ee uxsn (I + I 2 ). (5.3) Recalling 0 <u x C/b, it follows from Theorem 3. of HRW (2007) that ( ) I 2 2e xuxω sup t H m (u x ) n(t; u x ) Φ A e xuxω β 3 /ω, (5.4) t σ

432 Z. HU ET AL. and Ee ux Sn = = G n (p) ω k (u x) exp { k (u x) exp k (u x )} ( + O 2 /ω ) { k (u x )} ( + O 3 /ω ), (5.5) where O 2 and O 3 are constants depending only on C, and we have used the estimate, G n (p) = 2π ( n) p n q n = ω ( + O 6/ω 2 ), where O 6 /6, which follows from Stirling s formula. Also, I =(2π) /2 e xuxω e uxσ t t2 /2 dt =e uxxω +u2 x σ2 /2 ( Φ(u x σ )). (5.6) Clearly I e ux xω ( Φ()) if u x σ. By noting that x 2 x 0 e y2 /2 dy> x y 2 e y2 /2 dy = x e x2 /2 x e y2 /2 dy, we have Φ(x) >xφ (x)/( + x 2 )forx and hence I 2 2π e uxxω (u x σ ) if u x σ >. From these facts and (5.3) (5.6), we obtain P (S n >xω ) = Ee uxsn e uxxω +u2 x σ2 /2 ( Φ(u x σ )) ( + I 2 /I ) = ( Φ(u x σ )) Λ n (x) (+O ( + u x σ )β 3 /ω, O is bounded by a constant depending only on C, which completes the proof of (iii). (iv). We first show that, for any given C>0, if 0 <u<c/b,then pqe 2C m (u) =σ2 (u) pqe2c. (5.7) In fact, by recalling α (u) C and ua k + α (u) < 2C if 0 <u<c/b, it follows from (2.) that,foreach k and 0 <u<c/b, pqe 2C k (u) pqe2c, (5.8) whereweusepqe t (pe t + q) 2 = pqe t (p + qe t ) 2. This, together with the first equality of (5.2), yields that m (u) =σ 2 (u) a 2 k k (u) pqe 2C. Similarly it follows from a k = 0, the second equality of (5.2) and(5.8) that m (u) =σ2 (u) pq e 2C (a k + α (u))2 pqe 2C. This proves (5.7). From (5.7), if 0 <x<ω /(2eb ), then Thus (2.4) holds from (iii). m (/2b ) = /2b 0 σ 2 (u)du pq/(2eb ) xω. (5.9)

TAIL APPROXIMATIOS FOR FIITE POPULATIOS 433 Proof of Theorem 2.2. It follows from Theorem 2. (iv) that, for 0 <x<ω /(2eb ), P (S n >xω ) Φ(x) = ω ψ(u x σ ) { } exp k (u x ) u x m (u x )+x 2 /2 k (u ψ(x) x) ) (+O ( + u x σ )β 3 /ω, (5.0) where ψ(x) =( Φ(x))/Φ (x), u x is the solution of (2.3) ando is bounded by an absolute constant. ow the theorem follows from (5.0), if we show that, for 0 <x<ω /(2eb ), e 3/2 x u x σ (u x ) e 3/2 x (5.) and ω ψ(u x σ ) =+O 4 ( + x)β 3 /ω, (5.2) k (u ψ(x) x) where O 4 is bounded by an absolute constant. We will complete the proof by establishing these two results. First note that 0 <u x /(2b )from(5.9) and Theorem 2. (ii). It follows from (5.7) withc =/2 that u x ω 2 e <xω = m (u x )=m (0) + u x m (u ) <u x ω 2 e (5.3) where 0 <u <u x. Thus e ω u x/x eω. This, together with (5.7) withc =/2 again, implies (5.). We next prove (5.2). Write α α (u). Recalling (a k + α ) k (u) =0,wehave (a k + α )2 k (u)+ α k (u) =0. Soα = (a k + α )2 k (u)/ k (u). This, together with α (u) = a k k (u)/ k (u) and(5.2), yields that m (u) = a k α k (u)+ a k (a k + α ) 2 k (u) = (a k + α ) 3 k (u). (5.4) If 0 <u C/b, by recalling ua k +α (u) 2C,notingthat k (u) = k (u)(q peua k+α (u) )/(pe ua k+α (u) + q) and using (5.8), it is readily seen that k (u) pqe2c. On the other hand, α (u) e4c a k / by (5.8) whenever 0 <u C/b.Takingthesefactsinto(5.4), we obtain that [ m (u) <A pq ak 3 + ( ak 2 ) ] 3 A 2 ω 2 β 3, (5.5) for 0 <u /(2b ). Using (5.5), Taylor s expansion and the facts that m (0) = 0, m (0) = ω2 and u x /x e/ω,wehave x u x ω = ω m (u x ) m (0) u x m (0) where 0 u 2 u x /(2b ). Similarly, u2 x ω m (u 2 ) A 3 x 2 β 3 /ω, (5.6) u x σ (u x ) ω u x σ 2 (u x) ω 2 /ω eu x x m (u x) m (0) /ω2 eu x x m (u 3) /ω 2 A 3 x 2 β 3 /ω, (5.7)

434 Z. HU ET AL. where 0 u 3 u x /(2b ). From (5.6) and(5.7), we obtain x ux σ (u x ) x ux ω + ux σ (u x ) ω A4 x 2 β 3 /ω. This, together with (5.), yields that ψ(u xσ ) = u xσ (u x ) x ψ (θ) ψ(x) ψ(x) A 5 ( + x)β 3 /ω, (5.8) where x θ u x σ (x) and we have used the following estimates: ψ(t) min{(2t), Φ()} and ψ (t) = tψ(t) t 2 for t>0. From (5.8), the result (5.2) will follow if we prove ω k (u x) Ax/ω. (5.9) In fact, by noting that for any 0 <u< d k (u) = (a k + α (u)) k (u), du it follows from Taylor s expansion that k (u x) = (0) + u x R, where R Apq a k. Therefore, by recalling u x ex/ω and (0) = pq, for0<x<ω /(2eb ), k (u x) ω 2 Au x pq a k Axpq a 2 k /ω Axω. This, together with (5.8), implies that, for 0 <x<ω /(2eb ), ω k (u x) k (u x) ω 2 k (u Ax/ω, x) which yields (5.9). Proof of Theorem 2.2 is now complete. Acknowledgements. The authors wish to thank the referees for comments which have led to improvements in the paper. Hu is partially supported by SFC (o. 08022). Robinson and Wang are partially supported by the Australian Research Council. References [] G.J. Babu and Z.D. Bai, Mixtures of global and local Edgeworth expansions and their applications. J. Multivariate Anal. 59 (996) 282 307. [2] P.J. Bickel and W.R. van Zwet, Asymptotic expansions for the power of distribution-free tests in the two-sample problem. Ann. Statist. 6 (978) 937 004. [3] A. Bikelis, On the estimation of the remainder term in the central limit theorem for samples from finite populations. Stud. Sci. Math. Hung. 4 (969) 345 354 (in Russian).

TAIL APPROXIMATIOS FOR FIITE POPULATIOS 435 [4] M. Bloznelis, One and two-term Edgeworth expansion for finite population sample mean. Exact results I. Lith. Math. J. 40 (2000) 23 227. [5] M. Bloznelis, One and two-term Edgeworth expansion for finite population sample mean. Exact results II. Lith. Math. J. 40 (2000) 329 340. [6] J.G. Booth and R.W. Butler, Randomization distributions and saddlepoint approximations in generalized linear models. Biometrika 77 (990) 787 796. [7] P. Erdös and A. Rényi, On the central limit theorem for samples from a finite population. Publ. Math. Inst. Hungarian Acad. Sci. 4 (959) 49 6. [8] J. Hájek, Limiting distributions in simple random sampling for a finite population. Publ. Math. Inst. Hugar. Acad. Sci. 5 (960) 36 374. [9] T. Höglund, Sampling from a finite population. A remainder term estimate. Scand. J. Stat. 5 (978) 69 7. [0] Z. Hu, J. Robinson and Q. Wang, Crameŕ-type large deviations for samples from a finite population. Ann. Statist. 35 (2007) 673 696. [] J. Robinson, Large deviation probabilities for samples from a finite population. Ann. Probab. 5 (977) 93 925. [2] J. Robinson, An asymptotic expansion for samples from a finite population. Ann. Statist. 6 (978) 004 0. [3] J. Robinson, Saddlepoint approximations for permutation tests and confidence intervals. J. R. Statist. Soc. B 20 (982) 9 0. [4] J. Robinson, T. Hoglund, L. Holst and M.P. Quine, On approximating probabilities for small and large deviations in R d. Ann. Probab. 8 (990) 727 753. [5] S. Wang, Saddlepoint expansions in finite population problems. Biometrika 80 (993) 583 590.