NEAREST NEIGHBOR DENSITY ESTIMATES 537 THEOREM. If f is uniformly continuous on IEgd and if k(n) satisfies (2b) and (3) then If SUpx I,/n(x) - f(x)l _

Similar documents
EXPONENTIAL INEQUALITIES IN NONPARAMETRIC ESTIMATION

Optimal global rates of convergence for interpolation problems with random design

NONPARAMETRIC REGRESSION ESTIMATION 1311 then the nearest neighbor estimate is universally consistent, that is, (1.5) E ( m(x) - m (X) p) - 0 as n - o

This chapter contains a very bare summary of some basic facts from topology.

Problem set 1, Real Analysis I, Spring, 2015.

SHARP BOUNDARY TRACE INEQUALITIES. 1. Introduction

Convergence rates in weighted L 1 spaces of kernel density estimators for linear processes

MAT 570 REAL ANALYSIS LECTURE NOTES. Contents. 1. Sets Functions Countability Axiom of choice Equivalence relations 9

PROBABILITIES OF MISCLASSIFICATION IN DISCRIMINATORY ANALYSIS. M. Clemens Johnson

INTRODUCTION TO REAL ANALYSIS II MATH 4332 BLECHER NOTES

MATH 202B - Problem Set 5

The Heine-Borel and Arzela-Ascoli Theorems

CORRECTION OF DENSITY ESTIMATORS WHICH ARE NOT DENSITIES

Lecture Notes on Metric Spaces

Support Vector Method for Multivariate Density Estimation

LARGE DEVIATIONS OF TYPICAL LINEAR FUNCTIONALS ON A CONVEX BODY WITH UNCONDITIONAL BASIS. S. G. Bobkov and F. L. Nazarov. September 25, 2011

FINITE BOREL MEASURES ON SPACES OF CARDINALITY LESS THAN c

Chapter 4. Measure Theory. 1. Measure Spaces

THEOREMS, ETC., FOR MATH 515

A PROOF OF THE UNIFORMIZATION THEOREM FOR ARBITRARY PLANE DOMAINS

Lecture 2: Uniform Entropy

The information-theoretic value of unlabeled data in semi-supervised learning

l(y j ) = 0 for all y j (1)

Analysis Qualifying Exam

REVIEW OF ESSENTIAL MATH 346 TOPICS

3 (Due ). Let A X consist of points (x, y) such that either x or y is a rational number. Is A measurable? What is its Lebesgue measure?

Real Analysis Problems

Assignment 8. Section 3: 20, 24, 25, 30, Show that the sum and product of two simple functions are simple. Show that A\B = A B

2.1 Sets. Definition 1 A set is an unordered collection of objects. Important sets: N, Z, Z +, Q, R.

Density estimators for the convolution of discrete and continuous random variables

Introduction to Empirical Processes and Semiparametric Inference Lecture 12: Glivenko-Cantelli and Donsker Results

A GENERALIZATION OF THE FLAT CONE CONDITION FOR REGULARITY OF SOLUTIONS OF ELLIPTIC EQUATIONS

PERIODS IMPLYING ALMOST ALL PERIODS FOR TREE MAPS. A. M. Blokh. Department of Mathematics, Wesleyan University Middletown, CT , USA

NOWHERE LOCALLY UNIFORMLY CONTINUOUS FUNCTIONS

REAL AND COMPLEX ANALYSIS

THE HARDY LITTLEWOOD MAXIMAL FUNCTION OF A SOBOLEV FUNCTION. Juha Kinnunen. 1 f(y) dy, B(x, r) B(x,r)

An Alternative Algorithm for Classification Based on Robust Mahalanobis Distance

PREVALENCE OF SOME KNOWN TYPICAL PROPERTIES. 1. Introduction

1 Topology Definition of a topology Basis (Base) of a topology The subspace topology & the product topology on X Y 3


MOMENTS OF THE LIFETIME OF CONDITIONED BROWNIAN MOTION IN CONES BURGESS DAVIS AND BIAO ZHANG. (Communicated by Lawrence F. Gray)

UNIFORMLY DISTRIBUTED MEASURES IN EUCLIDEAN SPACES

Stanford Mathematics Department Math 205A Lecture Supplement #4 Borel Regular & Radon Measures

General Theory of Large Deviations

A GENERAL THEOREM ON APPROXIMATE MAXIMUM LIKELIHOOD ESTIMATION. Miljenko Huzak University of Zagreb,Croatia

for all subintervals I J. If the same is true for the dyadic subintervals I D J only, we will write ϕ BMO d (J). In fact, the following is true

arxiv: v2 [math.ca] 4 Jun 2017

Week 5 Lectures 13-15

Random Variable. Pr(X = a) = Pr(s)

Introduction to Proofs in Analysis. updated December 5, By Edoh Y. Amiran Following the outline of notes by Donald Chalice INTRODUCTION

Variable Selection and Weighting by Nearest Neighbor Ensembles

BALANCING GAUSSIAN VECTORS. 1. Introduction

ON THE DEFINITION OF RELATIVE PRESSURE FOR FACTOR MAPS ON SHIFTS OF FINITE TYPE. 1. Introduction

EXISTENCE OF SOLUTIONS TO ASYMPTOTICALLY PERIODIC SCHRÖDINGER EQUATIONS

Daniel M. Oberlin Department of Mathematics, Florida State University. January 2005

7 Convergence in R d and in Metric Spaces

MATH5011 Real Analysis I. Exercise 1 Suggested Solution

SPACES IN WHICH COMPACT SETS HAVE COUNTABLE LOCAL BASES

Rademacher Averages and Phase Transitions in Glivenko Cantelli Classes

("-1/' .. f/ L) I LOCAL BOUNDEDNESS OF NONLINEAR, MONOTONE OPERA TORS. R. T. Rockafellar. MICHIGAN MATHEMATICAL vol. 16 (1969) pp.

On the Sample Complexity of Noise-Tolerant Learning

Doubly Indexed Infinite Series

Metric Spaces and Topology

I-CONVERGENT TRIPLE SEQUENCE SPACES OVER n-normed SPACE

Math 5051 Measure Theory and Functional Analysis I Homework Assignment 2

f f(x)dlz(x)<= f t(x)d.(x)<= f t_(x),~.(x)

Bounded and continuous functions on a locally compact Hausdorff space and dual spaces

LOCALLY UNIFORMLY CONTINUOUS FUNCTIONS

lim f f(kx)dp(x) = cmf

U e = E (U\E) e E e + U\E e. (1.6)

Problem Set 2: Solutions Math 201A: Fall 2016

A Measure and Integral over Unbounded Sets

Elementary Analysis Math 140D Fall 2007

Nonparametric Bayes-Risk Estimation

ASYMPTOTIC STRUCTURE FOR SOLUTIONS OF THE NAVIER STOKES EQUATIONS. Tian Ma. Shouhong Wang

Adaptive Nonparametric Density Estimators

Lecture Notes in Advanced Calculus 1 (80315) Raz Kupferman Institute of Mathematics The Hebrew University

Yale University Department of Computer Science. The VC Dimension of k-fold Union

Introduction and Preliminaries

Generalization theory

An Introduction to Statistical Theory of Learning. Nakul Verma Janelia, HHMI

Finite-dimensional spaces. C n is the space of n-tuples x = (x 1,..., x n ) of complex numbers. It is a Hilbert space with the inner product

The ordering on permutations induced by continuous maps of the real line

On Estimates of Biharmonic Functions on Lipschitz and Convex Domains

Chapter 5. Measurable Functions

THE RANGE OF A VECTOR-VALUED MEASURE

Real Analysis Notes. Thomas Goller

Spring 2014 Advanced Probability Overview. Lecture Notes Set 1: Course Overview, σ-fields, and Measures

NECESSARY CONDITIONS FOR WEIGHTED POINTWISE HARDY INEQUALITIES

Solutions Final Exam May. 14, 2014

11691 Review Guideline Real Analysis. Real Analysis. - According to Principles of Mathematical Analysis by Walter Rudin (Chapter 1-4)

STAT 6385 Survey of Nonparametric Statistics. Order Statistics, EDF and Censoring

Probability and Measure

Probability for Statistics and Machine Learning

van Rooij, Schikhof: A Second Course on Real Functions

ASYMPTOTIC BEHAVIOR OF SOLUTIONS OF DISCRETE VOLTERRA EQUATIONS. Janusz Migda and Małgorzata Migda

Bahadur representations for bootstrap quantiles 1

One-Dimensional Diffusion Operators

02. Measure and integral. 1. Borel-measurable functions and pointwise limits

Definitions & Theorems

Transcription:

The Annals of Statistics 977, Vol. 5, No. 3, 536-540 (q THE STRONG UNIFORM CONSISTENCY OF NEAREST NEIGHBOR DENSITY ESTIMATES BY Luc P. DEVROYE AND T. J. WAGNER' The University of Texas at Austin Let X,,, Xn be independent, identically distributed random vectors with values in ]Rd and with a common probability density f. If Vk(x) is the volume o f the smallest sphere centered at x and containing at least k of the Xi,, Xn then fn(x) = k/(nvk (x)) is a nearest neighbor density estimate of f. We show that if k = k(n) satisfies k(n)/n --~ 0 and k(n)/log n --* 00 then 5 x I fn(x) - f(x)j --~ 0 w.p. l when f is uniformly continuous on Rd. Introduction. Suppose that Xi,, Xn are independent, identically distributed random vectors with values inw and with a common probability density f. If Vk (x) is the volume of the smallest sphere centered at x and containing at least k of the random vectors X,,, Xn, then Loftsgaarden and Quesenberry (965), to estimate f(x) from X,,, Xn, let f%(x) = kl(rv.(x)) where k - k(n) is a sequence of positive integers satisfying (2) (a) k(n)too (b) k(n)/n -* 0. (The factor k - was used instead of k by Loftsgaarden and Quesenberry ; this has no effect on any of the asymptotic results stated here.) They showed that fn(x) is a consistent estimate of f(x) at each point where f is continuous and positive. This result can also easily be inferred from the work of Fix and Hodges (95). For d -, Moore and Henrichon (969) showed that supx If%( x) - f(x)i --> 0 in probability if f is uniformly continuous and positive on P and if, additionally, (3) k(n)/log n -~ oo. Wagner (973) showed that fn(x) is a strongly consistent estimate of f(x) at each continuity point of f if, in addition to (2b), (4) ~i e-ak(n) < 00 for all a>0. (Notice that (4) is always implied by (3) but (2 a) and (4) are needed (3).) The result of this paper is the following theorem. 536 to imply Received February 976 ; revised October 976. Research of the authors was sponsored by AFOSR GRANT 72-237. AMS 970 subject classifications. 60F5, 62G05. Key words and phrases. Nunparametric density estimation, multivariate density estimation, uniform consistency, consistency.

NEAREST NEIGHBOR DENSITY ESTIMATES 537 THEOREM. If f is uniformly continuous on IEgd and if k(n) satisfies (2b) and (3) then If SUpx I,/n(x) - f(x)l _ % O W.p.. f%(x) _ ~ n= K((x - x2)/r(n))/nr(n)d where K is the uniform probability density for the unit sphere in Pd is a sequence of positive numbers, the recent results of Moore (977) (see Theorem 3.) and the above theorem immediately yield that sup x Ifx n () -f(x)i _*O w.p. and {r(n)} and Yackel whenever f is uniformly continuous on IRd and r(n) -k 0, nr(n)dllog n -> oo. This fact, an improvement over the previously published convergence results for the kernel estimate with a uniform kernel (e.g., see Theorem 2. of Moore and Yackel (977)), also is a special case of Theorem 4.9 of Devroye (976) who proves the same statement for all kernels K which are bounded probability densities with compact support and whose discontinuity points have a closure with Lebesgue measure 0. To simplify notation we assume below that multiplications are always PROOF. carried out before division. Let > 0 and choose o > 0 such that f(y) -f(x)i </2 whenever x and y are within a sphere of volume ~. Deferring measurability arguments for the moment, P{SUpx If% (x) - J ( 'X)I > ~} = P{U= [Vk (x) < k/n(f(x) + E)} + P{U=:.rc=»E [Vk(x) > k/n(f(x) - The event U x [Vk (x) < k/n(f(x) + s)] implies that, for some x, there must be a sphere centered at x with volume less than k/n( f (x) + e) and containing k of the random vectors X, X,,. If k/ns < ~ then the probability measure of such a sphere must be less than k(f(x) + s/2)/n(f(x) + e) so that, for one of these spheres S, ~n(s) - u( s) > k - k(f(x) -I- ~/ 2) n n( f (x) + s) ke > ke 2n( f (x) + s) - 2n(F + e) where F is the maximum of f on W,,u is the measure on the Borel subsets of W corresponding to f and,un is the empirical measure on the Borel subsets of I$d for Xl,, Xn. Thus, for k/ns < ~, (5) P{U x [v k (x) < k/n(f(x) + )]} G P{supsE,,, n Ii(S) ~ n - ~( S)I > ks/2n(f + s)} where.sfn is the class of all spheres in JI$ d whose volume is less than 4k/ns.

538 LUC P. DEVROYE AND T. J. WAGNER Next, with 4k/ne < o, Ux :fcx)>e [Vk(x) > k/n(f(x) - ~)} C U x :fcx)>e [Vk(x) > k/n(f(x) - ( 3 ~/4 ))] which implies that, for some x with f(x) > s, there is a sphere S centered at x, with volume < 4k/ne, and Thus le(s) f~(s) ~ k(.f(x) - ~/2)/n(f(x) - ( a)~),u n (S) c k/n, and -,un(s) > ke/4n(f(x) - (4)e) (6) P[U= :,(.)> =[Vx(x) J kln(j(=) - E)) so that P{SUpSe,~n J,(S) C- fln \'S )I ~ - ke/4nf} s P{supx If%(x) -- f(x)i >_ ~} ~ 2P{sup se n J(S) ~n,(s)j > ks/4n(f + ~)}. The proof will be completed if we show that for each E > 0 (7) n P{sup~E n Ii-(5) - B(S)I > k~/4n(f + ~)} <. To prove (7) we employ a variation of the argument used by Vapnik and Chervonenkis (97). In this variation use will be made of the following result. If Yl,..., Yn represent independent drawings without replacement from a population of k 0's and l's then, for > 0 and k >_ n, (8) P[IQ Y2 )/n - ~, ~ 2 e -ne 2/(2P +e) where ;u, the {number of 's}/k, is assumed to be < 2. Additionally (8) holds when Y,..., Yn are Bernoulli random variables with parameter ~ 2. (Use the two-sided version of Theorem 3 of Hoeffding (963) along with 2 and log ( + (s/,4) >_ 2E/(2 ;u + ~). See also Section 6 of this paper.) Now, if sup er ;u (A) <_ M and n >_ 8M/d2, an easy modification of Lemma of Vapnik and Chervonenkis (97) yields (9) P[sup,,, Ii(A) un - l~(a)i > o] G 2P[sup,,, Ii(A) ~n - f~n'(a)i ~ ~/2 where ~e n '(A) is the empirical measure for A with Xn+,..., X2n and is any class of Borel sets in Pd for which sup Ii(A) - and super Ii(A) un - are random variables. Putting $Y -,, we see that M can be taken to be 4kF/n~. Since, for a > 0, P[sup n J(A) ;u n - len'(a)i >_ o/ 2 ] (0) P[sup n J(A) l~n - l~n'(a)i o/2 ; sup, 2% (A) < am] + P[sup n /2%(A) > am] we see, using (3) and putting o = k~/4n(f + E), that (7) follows whenever both

NEAREST NEIGHBOR DENSITY ESTIMATES 539 terms of the right-hand side of (0) are summable for some a > 0. Looking at the first term, we note that it equals S pend! I[sup n ~u n (A)-un (A)I?b/2] I[supJ %p2n(a)<_am] dq (2n). where IE is the indicator of the set E c W and Q is the probability measure on R 2 '"d for X,.., X2n and where the inner summation is taken over all (2n)! permutations of x,..., x2n. But this last integral equals S 2nd (2n)(. '[ sup,n u2n(a)5_am] sup~ n [lu n (A)-un(A)I~a/2] dq Spend t 2n). [sup nu2n(a)~am] 5U[lun(A)-un(A)I?8/2] dq < Spend LJAe ''[sup np2%(a)5am] (2n) ' [Jp%(A)-P2n(A)I~a/4] dq. where $f' _ Sf'(xl,..., x2n) is any finite subclass of Jn which yields the same class of intersections with {x,..., x2n } and where the inner summation is again taken over the (2n)! permutations of x,..., x 2n. The quantity within { } is bounded above, using (8), by A) +4a) whenever ;i 2% (A) ~ Z. Since M = 4kF/n~ we see, from (3), that for all n sufficiently large the last integral is upper-bounded by 2 52%d e-n82/(32am+48)( Ae ' ) dq. Choosing Jf' to be a smallest possible subclass, we have (Vapnik and Chervonenkis (97), Cover (965)) that ( A ' ) < + (2n)d+ 3 and, using (3) again, that the first term of (0) is summable for all a > 0. Looking at the second term of (0), let r be the radius of a sphere in W whose volume is 4k/ne. If some sphere of radius r contains I of the points X,..., X2n then there must be at least one sphere or radius 2r, centered at one of the points X,..., X2n which contains at least l points. Thus P[sup~n ;u 2% (A) > am] < 2nP[~c 2n (SXl (2r)) > am] where S(t) denotes the sphere of radius t centered at x. But P[ ;u2n(sx l( 2r)) > am] max xej d P[~2n_l(Sx(2r)) > (a2nm - )/(2n -- )] < maxx a ~d P[;u2n _ l (Sx (2r)) - p(sx (2r)) > [(a2nm -)/(2n -)] - 2d4kF/n~]. At this point it is not difficult, using (3) and (8), to show that the second term of (9) is summable as long as a > 2d. Finally, to complete the proof, it is easy to see that all of the uncountable unions over x are indeed events and that the various supremums over Jn are indeed random variables.

540 LUC P. DEVROYE AND T. J. WAGNER REFERENCES COVER, T. (965). Geometrical and statistical properties of systems of linear inequalities with applications in pattern recognition. IEEE Trans. Computers 0 326-334. DEVROYE, L. P. (976). Nonparametric discrimination and density estimation. Ph. D. thesis, Univ. of Texas, Austin. Fix, E. and HoDGES, J. L. (95). Discriminatory analysis. Nonparametric discrimination : consistency properties. Report 4, Project Number 2-49-004, USAF School of Aviation Medicine, Randolph Field, Texas. HOEFFDING, W. (963). Probability inequalities for sums of bounded random variables. J. Amer. Statist. Assoc. 583-30. LOFTSGAARDEN, D. 0. and QUESENBERRY, C. P. (965). A nonparametric estimate of a multivariate density function. Ann. Math. Statist. 36049-05. MOORS, D. S. and HENRICHON, E. G. (969). Uniform consistency of some estimates of a density function. Ann. Math. Statist. 40499-502. MOORS, D. S. and YACKEL, J. W. (977). Consistency properties of nearest neighbor density estimates. Ann. Statist. 543-54. VAPNIK, V. N. and CHERVONENKIS, A. YA. (97). On the uniform convergence of relative frequencies of events to their probabilities. Theory Probability Appi.6 264-280. WAGNER, T. J. (973). Strong consistency of a nonparametric estimate of a density function. IEEE Trans. Systems, Man, and Cybernet. 3, 289-290. DEPARTMENT OF ELECTRICAL ENGINEERING THE UNIVERSITY OF TEXAS AT AUSTIN COLLEGE OF ENGINEERING AUSTIN, TEXAS 7872