POLYNOMIAL AND SPLINE ESTIMATORS OF THE DISTRIBUTION FUNCTION WITH PRESCRIBED ACCURACY

APPLICATIONES MATHEMATICAE 36,1 (2009), pp. 1–12

Zbigniew Ciesielski (Sopot)
Ryszard Zieliński (Warszawa)

Abstract. Dvoretzky–Kiefer–Wolfowitz type inequalities for some polynomial and spline estimators of distribution functions are constructed. Moreover, hints on the corresponding algorithms are given as well.

2000 Mathematics Subject Classification: Primary 62G07; Secondary 62G30, 60E15.
Key words and phrases: nonparametric estimation, distribution estimation, Dvoretzky–Kiefer–Wolfowitz inequality, polynomial estimators, spline estimators, smoothing of empirical distribution function.
DOI: 10.4064/am36-1-1. © Instytut Matematyczny PAN, 2009.

1. Introduction. The family of all continuous probability distribution functions on the real line is denoted by $\mathcal{F}$. If the probability corresponding to $F \in \mathcal{F}$ is supported on the interval $I$ then we write $F \in \mathcal{F}(I)$. Let $X_1,\dots,X_n$ be a simple sample corresponding to a distribution $F \in \mathcal{F}$ and let

$$F_n(x) = \frac{1}{n}\sum_{j=1}^{n} \mathbf{1}_{(-\infty,x]}(X_j). \tag{1.1}$$

The Dvoretzky–Kiefer–Wolfowitz (DKW) inequality in its final form (see [6]),

$$P\{\|F_n - F\|_\infty \ge \varepsilon\} \le 2e^{-2n\varepsilon^2} \quad \text{for all } F \in \mathcal{F}, \tag{1.2}$$

where $\|F_n - F\|_\infty = \sup_{x\in\mathbb{R}} |F_n(x) - F(x)|$, gives us a powerful tool for statistical applications (testing hypotheses concerning the unknown $F$, confidence intervals for $F$, etc.).

It seems, however, unnatural to estimate a continuous $F \in \mathcal{F}$ by the step function $F_n$. In the abundant literature of the subject one can find different approaches to smoothing empirical distribution functions, but it turns out that smoothing may spoil the estimator. For classical kernel estimators (see e.g. [8]) no inequality of DKW type exists (Zieliński [9]). Consequently, one cannot tell how many observations $X_1,\dots,X_n$ are needed to guarantee the prescribed accuracy of the kernel estimator. Using the method from [9] one can prove a similar negative result for some polynomial estimators (see Section 4 below).

In what follows we discuss three estimators and the DKW type inequalities for them. The first one (see Section 3), to be denoted by $\Phi_{m,n}$, is a smooth modification of $F_n$. It is a piecewise polynomial estimator of the class $C^m \cap \mathcal{F}$ with knots placed at the sample points $X_{1:n},\dots,X_{n:n}$. The smoothness parameter $m = 1, 2, \dots$ can be fixed by the statistician according to a priori knowledge of the smoothness of the estimated $F \in \mathcal{F}$. In this case the DKW type inequality holds in the class $\mathcal{F}$ of all continuous probability distribution functions on the real line. The second one (see Section 4) is the polynomial estimator constructed in Ciesielski [3], and the third one (see Section 5) is a spline estimator with equally spaced knots (Ciesielski [2], [4]). In both cases subclasses of $\mathcal{F}$ are explicitly specified for which the DKW type inequalities are proved. We start in Section 2 with preliminaries on specific approximations by algebraic polynomials and by splines with equally spaced knots.

2. Preliminaries from approximation theory. We recall the necessary ingredients from approximation theory, in particular the basic properties of the Bernstein polynomials and of the related B-splines. The relation is direct: the Bernstein polynomials are simply the degenerate B-splines.

Let us start with the Bernstein polynomials. For a given integer $r \ge 1$ let $\Pi_m$ denote the space of real polynomials of degree at most $m = r - 1$ (i.e. of order $r$). For convenience, both parameters $r$ and $m$ will be used; $m$ is more natural for polynomials and $r$ for splines. The basic Bernstein polynomials of degree $m$ are linearly independent and given by the formula

$$N_{i,m}(x) = \binom{m}{i} x^i (1-x)^{m-i} \quad \text{with } i = 0,\dots,m. \tag{2.1}$$

Consequently,

$$\Pi_m = \operatorname{span}[N_{i,m} : i = 0,\dots,m], \tag{2.2}$$

and therefore each $w \in \Pi_m$ has a unique representation

$$w(x) = \sum_{i=0}^{m} w_i N_{i,m}(x). \tag{2.3}$$
Now, given the coefficients $w_i$ and the point $x \in (0,1)$ we can calculate the value $w(x)$ using the de Casteljau algorithm based on the identity

$$N_{i,m}(x) = (1-x)\,N_{i,m-1}(x) + x\,N_{i-1,m-1}(x). \tag{2.4}$$

The Bernstein polynomials are positive on $(0,1)$ and they form a partition of unity,

$$\sum_{i=0}^{m} N_{i,m}(x) = 1 \quad \text{for } x \in (0,1). \tag{2.5}$$

On the other hand, the modified Bernstein polynomials

$$M_{i,m}(x) = (m+1)\,N_{i,m}(x) \tag{2.6}$$

are normalized in $L^1(0,1)$, i.e.

$$\int_0^1 M_{i,m}(x)\,dx = 1. \tag{2.7}$$

In our construction of the polynomial estimator, the Durrmeyer kernel

$$R_m(x,y) = \sum_{i=0}^{m} M_{i,m}(x)\,N_{i,m}(y) \tag{2.8}$$

plays an essential role. The kernel is symmetric and it has the following spectral representation:

$$R_m(x,y) = \sum_{i=0}^{m} \lambda_{i,m}\, l_i(x)\, l_i(y), \tag{2.9}$$

where the eigenfunctions are the orthonormal Legendre polynomials $l_i$ and for the corresponding eigenvalues $\lambda_{i,m}$ we have (cf. e.g. [5])

$$\lambda_{0,m} = 1, \qquad \lambda_{i,m} = \prod_{j=1}^{i} \frac{m+1-j}{m+1+j} \quad \text{for } i = 1,\dots,m, \tag{2.10}$$

whence, with the notation $\lambda_k = k(k+1)$,

$$\frac{2}{5}\,\frac{\lambda_j}{\lambda_{k+1}} \le 1 - \lambda_{j,m} \le \frac{\lambda_j}{m+2} \quad \text{for } \lambda_k \le m < \lambda_{k+1},\ j \le k. \tag{2.11}$$

Now, for a given function $F$ which is either of bounded variation or continuous on $[0,1]$, the value of the underlying smoothing operator is a polynomial of degree $m+1$ defined by the formula

$$T_m F(x) = F(0) + \int_0^x \left[\int_0^1 R_m(y,z)\,dF(y)\right] dz. \tag{2.12}$$

Now, since

$$\int_0^x M_{i,m}(y)\,dy = \sum_{j=i+1}^{m+1} N_{j,m+1}(x) \quad \text{for } i = 0,1,\dots,m, \tag{2.13}$$

the de Casteljau algorithm, in case $F = F_n$, can be applied to calculate the value $T_m F(x)$. It also follows from (2.12) that the integrated Legendre polynomials

$$L_j(x) = \int_0^x l_j(y)\,dy \quad \text{for } j = 0,1,\dots \tag{2.14}$$

are the eigenfunctions for the operators $T_m$ with the corresponding eigenvalues (2.10). For more details on the eigenvalues $\lambda_{i,m}$ we refer to [3]. An elementary argument gives, for any probability distribution functions $F$ and $G$, both supported on $[0,1]$, the important inequality

$$\|T_m F - T_m G\|_\infty \le \|F - G\|_\infty. \tag{2.15}$$

Moreover, if $F$ is a continuous function on $[0,1]$, then

$$\|T_m F\|_\infty \le 3\|F\|_\infty. \tag{2.16}$$

Proposition 2.1. For each $F \in C[0,1]$ we have

$$\|F - T_m F\|_\infty \to 0 \quad \text{as } m \to \infty. \tag{2.17}$$

Proof. The family $\{T_m\}$ of operators on $C[0,1]$ is by (2.16) uniformly bounded and therefore by the Banach–Steinhaus theorem it is sufficient to check (2.17) for $F = L_j$ with any fixed $j$. However, in this case

$$F - T_m F = L_j - T_m L_j = (1 - \lambda_{j,m})\,L_j,$$

and an application of (2.11) completes the proof.

In what follows we use the symbol $W_p^m(I)$ for the Sobolev space of order of smoothness $m$ and with the integrability exponent $p$.

Theorem 2.2. Let $F \in W_2^2[0,1] \cap \mathcal{F}[0,1]$, $D = d/dx$ and $M = \|D^2 F\|_2$. Then

$$\|F - T_m F\|_\infty \le \frac{M}{m^{1/4}} \quad \text{for } m = 2, 3, \dots. \tag{2.18}$$

Proof. For the density $f = DF$ we have

$$(f, 1) = 1 \tag{2.19}$$

with the usual scalar product $(f,g)$ in $L^2[0,1]$, and

$$F(x) - T_m F(x) = \int_0^x (f - R_m f)(z)\,dz, \tag{2.20}$$

where $R_m$ is the Durrmeyer operator corresponding to (2.8). It now follows from (2.20) that

$$\|F - T_m F\|_\infty \le \|f - R_m f\|_2. \tag{2.21}$$

Using the spectral representation of $R_m$ we find that

$$\|f - R_m f\|_2^2 = \sum_{i=1}^{\infty} (1 - \lambda_{i,m})^2 a_i^2, \tag{2.22}$$

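where the $a_i$'s are the Fourier–Legendre coefficients and by definition each $\lambda_{i,m}$ for $i > m$ is assumed to be $0$. Moreover,

$$a_i = (f, l_i). \tag{2.23}$$

Using the notation from (2.14) we get

$$|a_i| = |(D^2 F, L_i)| \le M\,\|L_i\|_2. \tag{2.24}$$

It remains to estimate $\|L_i\|_2$. To this end apply the identity

$$2L_i = \frac{l_{i+1}}{\sqrt{(2i+1)(2i+3)}} - \frac{l_{i-1}}{\sqrt{(2i+1)(2i-1)}}, \quad i \ge 1, \tag{2.25}$$

to get

$$\|L_i\|_2^2 = \frac{1}{8\lambda_i}\left(1 + \frac{3}{4\lambda_i - 3}\right) \le \frac{1}{8\lambda_i}\left(1 + \frac{3}{4\lambda_1 - 3}\right) = \frac{1}{5\lambda_i}, \quad i \ge 1. \tag{2.26}$$

Thus, for the unique $k$ satisfying the inequalities $\lambda_k \le m < \lambda_{k+1}$ we obtain

$$\sum_{i=k+1}^{\infty} |a_i|^2 \le M^2 \sum_{i=k+1}^{\infty} \frac{1}{5\lambda_i} = \frac{M^2}{5(k+1)} \le \frac{M^2}{5k}. \tag{2.27}$$

By the monotonicity in $m$ of $\|f - R_m f\|_2$ and by (2.11) we get

$$\|f - R_m f\|_2^2 \le \sum_{i=1}^{k} \left(\frac{\lambda_i}{m+2}\right)^2 |a_i|^2 + \frac{M^2}{5k}. \tag{2.28}$$

Now, $\lambda_1 = 2$, $\|L_1\|_2^2 = 1/10$, $\sqrt{m}/2 \le k < \sqrt{m}$, and therefore

$$\sum_{i=1}^{k} \left(\frac{\lambda_i}{m+2}\right)^2 |a_i|^2 \le \frac{3M^2}{5\sqrt{m}}. \tag{2.29}$$

Combining (2.29) and (2.28) we get (2.18).

In the construction below of the spline estimator with random knots, a particular role is played by the following polynomial density function on $[0,1]$:

$$\varphi_m(x) := M_{m,2m}(x) = (2m+1)\binom{2m}{m} x^m (1-x)^m, \tag{2.30}$$

where the positive integer $m$ is the given parameter of smoothness. The corresponding polynomial distribution function is

$$\Phi_m(x) = \int_0^x \varphi_m(y)\,dy, \tag{2.31}$$

or else (see (2.13))

$$\Phi_m(x) = \sum_{i=m+1}^{2m+1} N_{i,2m+1}(x), \tag{2.32}$$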

which permits calculating $\Phi_m(x)$ with the help of the de Casteljau algorithm. It is also important that $\Phi_m(x)$ solves the two-point Hermite interpolation problem of order $m$, i.e.

$$D^k \Phi_m(0) = 0 \quad \text{and} \quad D^k \Phi_m(1) = \delta_{k,0} \quad \text{for } k = 0,\dots,m. \tag{2.33}$$

For later convenience we also introduce the transformed polynomial distribution

$$\Phi_m(x; [a,b]) = \Phi_m\!\left(\frac{x-a}{b-a}\right). \tag{2.34}$$

Clearly, $\Phi_m(a; [a,b]) = 0$ and $\Phi_m(b; [a,b]) = 1$.

On the real line $\mathbb{R}$ similar roles to polynomials and Bernstein polynomials on $[0,1]$ are played by the cardinal splines and cardinal B-splines, respectively. For a given integer $r \ge 1$ the space $S^r$ of cardinal splines of order $r$ is the space of all functions on $\mathbb{R}$ of class $C^{r-2}$ whose restrictions to $(j, j+1)$ are polynomials of order $r$, i.e. $D^r S = 0$ on each $(j, j+1)$ for $j = 0, \pm 1, \dots$. It is known that there is a unique $B^{(r)} \in S^r$ (up to a multiplicative constant) positive on $(0,r)$ and with support $[0,r]$. The function $B^{(r)}$ becomes unique if normalized e.g. so that

$$\int_{\mathbb{R}} B^{(r)}(y)\,dy = 1. \tag{2.35}$$

The density $B^{(r)}$ can as well be defined probabilistically by the formula

$$P\{U_1 + \dots + U_r < x\} = \int_{-\infty}^{x} B^{(r)}(y)\,dy, \tag{2.36}$$

where $U_1,\dots,U_r$ are independent uniformly distributed r.v. on $[0,1]$. We also need the rescaled cardinal B-splines

$$N^{(r)}_{i,h}(x) = B^{(r)}\!\left(\frac{x}{h} - i\right) \quad \text{and} \quad M^{(r)}_{i,h}(x) = \frac{1}{h}\,B^{(r)}\!\left(\frac{x}{h} - i\right) \quad \text{for } i \in \mathbb{Z},$$

where $h > 0$ is the so called window parameter. They share the nice properties

$$\sum_{i\in\mathbb{Z}} N^{(r)}_{i,h}(x) = 1 \quad \text{and} \quad \int_{\mathbb{R}} M^{(r)}_{i,h}(y)\,dy = 1. \tag{2.37}$$

Clearly, $\operatorname{supp} N^{(r)}_{i,h} = \operatorname{supp} M^{(r)}_{i,h} = [ih, (i+r)h]$. Moreover, the recurrent formula

$$N^{(r)}_{i,h}(x) = \frac{x - ih}{(r-1)h}\,N^{(r-1)}_{i,h}(x) + \frac{(i+r)h - x}{(r-1)h}\,N^{(r-1)}_{i+1,h}(x) \tag{2.38}$$

supplies an algorithm for calculating at a given $x$ the value of the cardinal spline

$$\sum_{i} a_i N^{(r)}_{i,h}(x) \tag{2.39}$$

given the coefficients $(a_i)$. The same algorithm can be used to calculate the value at $x$ of the distribution function

$$\int_{-\infty}^{x} M^{(r)}_{i,h}(y)\,dy = \sum_{j=i}^{\infty} N^{(r+1)}_{j,h}(x). \tag{2.40}$$

Now, just as in the polynomial case (cf. (2.8)) we introduce the spline kernel

$$R^{(k,r)}_h(x,y) = \sum_{i=-\infty}^{\infty} M^{(k)}_{i+\nu,h}(x)\,N^{(r)}_{i,h}(y). \tag{2.41}$$

In what follows it is assumed that $1 \le k \le r$ and that $r - k$ is even, i.e. $r - k = 2\nu$ with $\nu$ being a non-negative integer. It then follows that the supports of $M^{(k)}_{i+\nu,h}$ and of $N^{(r)}_{i,h}$ are concentric. Denote by $C_\pm(\mathbb{R})$ the linear space of all right-continuous functions on $\mathbb{R}$ with finite limits at $\pm\infty$, and by $BV(\mathbb{R})$ the linear space of all continuous functions of finite total variation on $\mathbb{R}$. Now, for $F \in C_\pm(\mathbb{R}) \cup BV(\mathbb{R})$ the value of the operator $T^{(k,r)}_h F$ is defined by the formula

$$T^{(k,r)}_h F(x) = \int_{-\infty}^{x}\left(\int_{\mathbb{R}} R^{(k,r)}_h(z,y)\,F(dz)\right) dy = \sum_{i=-\infty}^{\infty} \int_{\mathbb{R}} M^{(k)}_{i+\nu,h}\,dF \int_{-\infty}^{x} N^{(r)}_{i,h}(y)\,dy. \tag{2.42}$$

To recall from [4] the direct approximation theorem by the operators $T^{(k,r)}_h$, the notion of modulus of smoothness of a given order $m$ is needed:

$$\omega_m(F;\delta) = \sup_{|t|<\delta} \|\Delta^m_t F\|_\infty,$$

where $\Delta^m_t$ is the $m$-th order progressive difference with step $t$, i.e.

$$\Delta^m_t f(x) = \sum_{j=0}^{m} (-1)^{j+m}\binom{m}{j} f(x + jt).$$

Now, the following theorem, important for spline estimation, was proved in [4]:

Theorem 2.3. Let $1 \le k \le r$ with $r - k$ even and let $h > 0$. Then for $F \in C_\pm(\mathbb{R}) \cup BV(\mathbb{R})$,

$$\|T^{(k,r)}_h F\|_\infty \le \|F\|_\infty, \tag{2.42}$$

and for $F \in C_\pm(\mathbb{R})$,

$$\|F - T^{(k,r)}_h F\|_\infty \le 2\big(4 + (r+k)^2\big)\,\omega_2(F; h), \tag{2.43}$$

and in particular for $k = 1$,

$$\|F - T^{(1,r)}_h F\|_\infty \le \omega_1\!\left(F; \frac{r+1}{2}\,h\right). \tag{2.44}$$

Consequently, for each continuous probability distribution $F$,

$$\|F - T^{(k,r)}_h F\|_\infty \to 0 \quad \text{as } h \to 0+. \tag{2.45}$$

3. A smooth modification of $F_n$. We start, for a given positive integer $m$, with the polynomial density (2.30) and its polynomial probability distribution (2.31). Let now $X_{1:n},\dots,X_{n:n}$ with $X_{1:n} \le \dots \le X_{n:n}$ be the order statistics from the sample $X_1,\dots,X_n$. For technical reasons we assume that $X_{1:n} < \dots < X_{n:n}$; otherwise one should consider multiple knots. Moreover, define

$$X_{0:n} = \max\{0,\ X_{1:n} - (X_{2:n} - X_{1:n})\} = \max\{0,\ 2X_{1:n} - X_{2:n}\},$$

$$X_{n+1:n} = \min\{X_{n:n} + (X_{n:n} - X_{n-1:n}),\ 1\} = \min\{2X_{n:n} - X_{n-1:n},\ 1\}.$$

Now, for the given sample $X_1,\dots,X_n$ and for given $m$ a new estimator $\Phi_{m,n}(x)$ for $F$ is defined as follows: it equals $0$ for $x < X_{0:n}$, $1$ for $x \ge X_{n+1:n}$, and for $i = 1,\dots,n+1$ and $X_{i-1:n} \le x < X_{i:n}$,

$$\Phi_{m,n}(x) = \frac{1}{n}\,\Phi_m(x; [X_{i-1:n}, X_{i:n}]) + F_n(X_{i-1:n}) - \frac{1}{2n}. \tag{3.1}$$

We consider $\Phi_{m,n}(x)$ as an estimator of the unknown distribution function $F \in \mathcal{F}$ which generates the sample $X_1,\dots,X_n$. Given the sample, the estimator $\Phi_{m,n}(x)$ may be easily calculated by the de Casteljau algorithm or by using the standard incomplete beta functions or distribution functions of a beta random variable; these functions are available in numerical computer packages. Summarizing we get

Proposition 3.1. The estimator $\Phi_{m,n}$ has the following properties:

1. It is a piecewise polynomial probability distribution supported on $[X_{0:n}, X_{n+1:n}]$ with knots at the sample points.
2. The function $\Phi_{m,n}(x) + 1/(2n)$ interpolates $F_n$ at the sample points, i.e. $\Phi_{m,n}(X_{i:n}) + \frac{1}{2n} = F_n(X_{i:n})$ for $i = 1,\dots,n$.
3. $\Phi_{m,n} \in C^m(\mathbb{R})$.
4. $D^k \Phi_{m,n}(X_{i:n}) = 0$ for $k = 1,\dots,m$ and $i = 1,\dots,n$.
5. $\|\Phi_{m,n} - F\|_\infty \le \|F_n - F\|_\infty + 1/(2n)$.

6. Moreover, the following DKW type inequality holds:

$$P\{\|\Phi_{m,n} - F\|_\infty \ge \varepsilon\} \le 2e^{-2n(\varepsilon - 1/(2n))^2}, \quad n > \frac{1}{2\varepsilon},\ F \in \mathcal{F}. \tag{3.2}$$

4. Estimating by polynomials on $[0,1]$. A polynomial estimator on $[0,1]$ (see [3] and cf. (2.12)) can be defined by

$$F_{m,n}(x) = T_m F_n(x), \tag{4.1}$$

where $T_m$ transforms distributions on $[0,1]$, continuous or not, into distribution functions on $[0,1]$ which are polynomials of degree $m+1$. It turns out that for the family $\mathcal{F}[0,1]$ of all continuous distributions supported on $[0,1]$, the DKW type inequality for the estimator $F_{m,n}$ does not hold. More precisely, we have the following negative result:

Theorem 4.1. There are $\varepsilon > 0$ and $\eta > 0$ such that for any $m$ and $n$ one can find a distribution function $F \in \mathcal{F}[0,1]$ such that

$$P\{\|F_{m,n} - F\|_\infty > \varepsilon\} > \eta. \tag{4.2}$$

To prove (4.2) it is enough to demonstrate that for some $\varepsilon, \eta > 0$ and for every $n$ and for every odd $m$ the following inequality holds:

$$P\{F_{m,n}(1/2) > F(1/2) + \varepsilon\} > \eta. \tag{4.3}$$

We start the proof by writing formula (4.1) in the form

$$F_{m,n}(x) = \frac{1}{n}\sum_{j=1}^{n}\sum_{i=0}^{m} N_{i,m}(X_j) \int_0^x M_{i,m}(y)\,dy, \tag{4.4}$$

whence by (2.13),

$$F_{m,n}(x) = \frac{1}{n}\sum_{j=1}^{n}\sum_{i=0}^{m} N_{i,m}(X_j) \sum_{k=i+1}^{m+1} N_{k,m+1}(x). \tag{4.5}$$

Let now $\{\varepsilon_k : k = 0, 1, \dots\}$ be a sequence of independent Bernoulli r.v. with $P\{\varepsilon_k = 1\} = 1/2 = P\{\varepsilon_k = 0\}$. Moreover, let $\zeta_{m+1} = \varepsilon_0 + \dots + \varepsilon_{m+1}$ be the binomial r.v. Then for odd $m = 2\nu - 1$ and $i \le \nu$,

$$\sum_{k=i+1}^{m+1} N_{k,m+1}(1/2) = P\{\zeta_{m+1} > i\} \ge P\{\zeta_{m+1} > \nu\} = P\{\zeta_{m+1} \le \nu\} = 1/2. \tag{4.6}$$

Consequently, by (2.13) we obtain, for odd $m$,

$$F_{m,n}(1/2) \ge \frac{1}{2n}\sum_{j=1}^{n}\sum_{i=0}^{\nu} N_{i,m}(X_j) \ge \frac{1}{2n}\sum_{j=1}^{n}\int_{X_j}^{1} M_{\nu,m}(y)\,dy. \tag{4.7}$$
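Formula (4.5) expresses $F_{m,n}(x)$ through Bernstein polynomials only, so it can be evaluated without numerical integration. The following is a minimal sketch in Python, not the authors' implementation: the names `bernstein` and `poly_estimator` are hypothetical, and the Bernstein polynomials are evaluated directly from (2.1) rather than by the de Casteljau recursion (2.4).

```python
from math import comb


def bernstein(i, m, x):
    """Basic Bernstein polynomial N_{i,m}(x) from (2.1)."""
    return comb(m, i) * x**i * (1.0 - x) ** (m - i)


def poly_estimator(sample, m, x):
    """Evaluate F_{m,n}(x) by the double sum (4.5).

    Assumes every observation in `sample` lies in [0, 1]
    and that 0 <= x <= 1 (hypothetical helper, for illustration).
    """
    n = len(sample)
    # tail[i] = sum_{k=i+1}^{m+1} N_{k,m+1}(x), accumulated from k = m+1 down to 1
    tail = [0.0] * (m + 1)
    acc = 0.0
    for k in range(m + 1, 0, -1):
        acc += bernstein(k, m + 1, x)
        tail[k - 1] = acc
    total = 0.0
    for xj in sample:
        total += sum(bernstein(i, m, xj) * tail[i] for i in range(m + 1))
    return total / n
```

For a sample in $[0,1]$, `poly_estimator(sample, m, x)` is a polynomial of degree $m+1$ in `x` that increases from 0 at `x = 0` to 1 at `x = 1`.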

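Now, the continuous function

$$M_m(x) = \int_x^1 M_{\nu,m}(y)\,dy$$

is decreasing on $[0,1]$, $M_m(0) = 1$, and $M_m(1/2) = 1/2$. Consequently, for $\varepsilon \in (0, 1/16)$ there is $\delta > 0$ such that $M_m(\delta + 1/2) > 4\varepsilon$. Choose now $F \in \mathcal{F}[0,1]$ such that $F(1/2) < \varepsilon$ and $F(1/2 + \delta) > \eta^{1/n}$. Thus, by our hypothesis we get, for $j = 1,\dots,n$,

$$P_F\{X_j < 1/2 + \delta\} > \eta^{1/n} \quad \text{and} \quad P_F\{M_m(X_j) > 4\varepsilon\} > \eta^{1/n}.$$

Taking into account that

$$\bigcap_{j=1}^{n} \{M_m(X_j) > 4\varepsilon\} \subset \left\{\frac{1}{n}\sum_{j=1}^{n} M_m(X_j) > 4\varepsilon\right\}$$

and that the observations are independent, one obtains

$$P_F\left\{\frac{1}{n}\sum_{j=1}^{n} M_m(X_j) > 4\varepsilon\right\} > \eta,$$

which proves (4.3).

To obtain a positive DKW type inequality for the above polynomial estimators we reduce the space $\mathcal{F}$ of all continuous distribution functions to a smaller subclass. We shall discuss, for a given constant $M$, the subclass $W_M$ of $\mathcal{F}[0,1]$ such that $F \in W_M$ if and only if the density $f = F'$ is absolutely continuous and its derivative satisfies

$$\int_0^1 |f'(x)|^2\,dx \le M^2. \tag{4.8}$$

Theorem 4.2. Given constants $M > 0$, $\varepsilon > 0$ and $\eta > 0$, one can find explicit values of $m$ and $n$ such that

$$P\{\|F_{m,n} - F\|_\infty > \varepsilon\} < \eta \quad \text{for all } F \in W_M. \tag{4.9}$$

Specifically, it is enough to choose $m$ and $n$ so that

$$\frac{2M}{m^{1/4}} < \varepsilon \quad \text{and} \quad 2\exp\!\left(-\frac{2nM^2}{m^{1/2}}\right) < \eta.$$

Proof. Let us start with the identity

$$F - F_{m,n} = (F - T_m F) + T_m(F - F_n).$$

Now (see [3]) the triangle inequality and the contraction property (2.15) imply

$$\|F - F_{m,n}\|_\infty \le \|F - F_n\|_\infty + \|F - T_m F\|_\infty, \tag{4.10}$$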

whence by Theorem 2.2 and by the DKW inequality we get

$$\begin{aligned}
P\{\|F - F_{m,n}\|_\infty > \varepsilon\} &\le P\{\|F - F_{m,n}\|_\infty > 2M/m^{1/4}\} \\
&\le P\{\|F - F_n\|_\infty + \|F - T_m F\|_\infty > 2M/m^{1/4}\} \\
&\le P\{\|F - F_n\|_\infty + M/m^{1/4} > 2M/m^{1/4}\} \\
&= P\{\|F - F_n\|_\infty > M/m^{1/4}\} \le 2\exp\!\left(-\frac{2nM^2}{m^{1/2}}\right) < \eta,
\end{aligned}$$

and this completes the proof.

5. Estimating by splines with equally spaced knots. In this section we exhibit a rich family of subclasses of continuous probability distributions on $\mathbb{R}$ for which the DKW type inequality holds. The general scheme is very much like that in Section 4 (cf. the proof of Theorem 4.2). We start by recalling the generalized Hölder classes. A function $\omega(h)$ on $\mathbb{R}_+$ is said to be a modulus of smoothness if it is bounded, continuous, vanishing at $0$, non-decreasing and subadditive, e.g. concave. Suppose we are given integers $(k,r)$ satisfying the conditions formulated just below formula (2.41). Taking into account what we already know it is reasonable to consider the following Hölder classes of probability distributions:

$$H^{(k,r)}_{\omega,2} = \{F \in \mathcal{F} : 2\big(4 + (r+k)^2\big)\,\omega_2(F; h) \le \omega(h) \text{ for all } h > 0\}, \tag{5.1}$$

$$H^{(k,r)}_{\omega,1} = \left\{F \in \mathcal{F} : \omega_1\!\left(F; \frac{r+k}{2}\,h\right) \le \omega(h) \text{ for all } h > 0\right\}. \tag{5.2}$$

According to our scheme, the spline estimator with window parameter $h$ and with the given $(k,r)$ is defined by the formula

$$F_{h,n} = F^{(k,r)}_{h,n} = T^{(k,r)}_h F_n. \tag{5.3}$$

For these estimators we also have the basic inequality

$$\|F - F_{h,n}\|_\infty \le \|F - F_n\|_\infty + \|F - T^{(k,r)}_h F\|_\infty. \tag{5.4}$$

We are now in a position to state the DKW type inequality for the generalized Hölder classes of continuous distributions.

Theorem 5.1. Let $i = 1, 2$ and $1 \le k \le r$ with $r - k$ even. Then for all $\varepsilon > 0$ and $\eta > 0$ there are $h > 0$ and $n \ge 1$ such that

$$P\{\|F_{h,n} - F\|_\infty > \varepsilon\} < \eta \quad \text{for all } F \in H^{(k,r)}_{\omega,i}. \tag{5.5}$$

It suffices to choose $h$ and $n$ such that $\omega(h) < \varepsilon/2$ and $2\exp(-n\varepsilon^2/2) < \eta$.

The proof is immediate from the DKW inequality, (5.4) and Theorem 2.3.
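The sufficient conditions in Theorem 5.1 translate into a short calculation. The sketch below is an illustration under stated assumptions, not part of the original paper: the function names `dkw_sample_size` and `holder_window` are hypothetical, and the modulus is assumed to behave like $\omega(h) = C h^{\alpha}$ (with $C > 0$, $0 < \alpha \le 1$) for the small values of $h$ of interest, which is only one admissible choice.

```python
import math


def dkw_sample_size(eps, eta):
    """Smallest integer n >= 1 with 2*exp(-n*eps**2/2) < eta.

    The condition is equivalent to n > 2*log(2/eta)/eps**2.
    """
    n = math.floor(2.0 * math.log(2.0 / eta) / eps**2) + 1
    return max(n, 1)


def holder_window(eps, C, alpha):
    """A window h with C*h**alpha < eps/2.

    Assumes (for illustration only) omega(h) = C*h**alpha;
    the factor 0.5 keeps the inequality strict.
    """
    return 0.5 * (eps / (2.0 * C)) ** (1.0 / alpha)


if __name__ == "__main__":
    eps, eta = 0.1, 0.05
    print(dkw_sample_size(eps, eta), holder_window(eps, C=1.0, alpha=1.0))
```

Under these assumptions, any $F$ in the corresponding Hölder class satisfies (5.5) with the returned pair $(h, n)$; a different modulus $\omega$ only changes how $h$ is obtained from $\omega(h) < \varepsilon/2$.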

References

[1] C. de Boor, Linear combinations of B-splines, in: Approximation Theory II, Academic Press, 1976, 1–47.
[2] Z. Ciesielski, Local spline approximation and nonparametric density estimation, in: Constructive Theory of Functions '87, Sofia, 1988, 79–84.
[3] Z. Ciesielski, Nonparametric polynomial density estimation, Probab. Math. Statist. 9 (1988), 1–10.
[4] Z. Ciesielski, Asymptotic nonparametric spline density estimation, Probab. Math. Statist. 12 (1991), 1–24.
[5] Z. Ciesielski and J. Domsta, The degenerate B-splines as basis in the space of algebraic polynomials, Ann. Polon. Math. 46 (1985), 71–79.
[6] P. Massart, The tight constant in the Dvoretzky–Kiefer–Wolfowitz inequality, Ann. Probab. 18 (1990), 1269–1283.
[7] I. J. Schoenberg, Cardinal interpolation and spline functions, J. Approx. Theory 2 (1969), 167–206.
[8] E. J. Wegman, Kernel estimators, in: Encyclopedia of Statistical Sciences, 2nd ed., Vol. 6, Wiley-Interscience, 2006.
[9] R. Zieliński, Kernel estimators and the Dvoretzky–Kiefer–Wolfowitz inequality, Appl. Math. (Warsaw) 34 (2007), 401–404.

Instytut Matematyczny PAN
Abrahama 18
81-825 Sopot, Poland
E-mail: Z.Ciesielski@impan.gda.pl

Instytut Matematyczny PAN
Śniadeckich 8
00-956 Warszawa, Poland
E-mail: R.Zielinski@impan.pl

Received in July 2008 (945)