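arXiv: 1606.00304v2

SUPPLEMENTARY MATERIAL TO "EFFICIENT MULTIVARIATE ENTROPY ESTIMATION VIA K-NEAREST NEIGHBOUR DISTANCES"

By Thomas B. Berrett, Richard J. Samworth and Ming Yuan

University of Cambridge and University of Wisconsin–Madison

APPENDIX

This is the supplementary material to Berrett, Samworth and Yuan (2017), hereafter referred to as the main text.

A.1. Proofs of auxiliary results.

Proof of Proposition 9 in the main text. Fix $\tau \in \bigl(\frac{d}{\alpha+d}, 1\bigr]$. We first claim that given any $\epsilon > 0$, there exists $A_\epsilon > 0$ such that $a(\delta) \le A_\epsilon \delta^{-\epsilon}$ for all $\delta \in (0, \gamma]$. To see this, observe that there exists $\delta_0 \in (0, \gamma]$ such that $a(\delta) \le \delta^{-\epsilon}$ for $\delta \le \delta_0$. But then
\[
\sup_{\delta \in (0,\gamma]} \delta^{\epsilon} a(\delta) \le \max\{1, \gamma^{\epsilon} a(\delta_0)\} \le \gamma^{\epsilon} \delta_0^{-\epsilon},
\]
which establishes the claim, with $A_\epsilon := \gamma^{\epsilon} \delta_0^{-\epsilon}$. Now choose $\epsilon = \frac{1}{3}\bigl(\tau - \frac{d}{\alpha+d}\bigr)$ and let $\tau' := \frac{\tau}{3} + \frac{2d}{3(\alpha+d)} \in \bigl(\frac{d}{\alpha+d}, 1\bigr)$, so that $\tau - \epsilon = \tau' + \epsilon$. Then, by Hölder's inequality, and since $\alpha\tau'/(1-\tau') > d$,
\[
\sup_{f \in \mathcal{F}_{d,\theta}} \int_{\{x : f(x) < \delta\}} a(f(x)) f(x)^{\tau} \, dx
\le A_\epsilon \delta^{\epsilon} \sup_{f \in \mathcal{F}_{d,\theta}} \int_{\{x : f(x) < \delta\}} f(x)^{\tau'} \, dx
\le A_\epsilon \delta^{\epsilon} (1+\nu)^{\tau'} \biggl\{ \int_{\mathbb{R}^d} (1 + \|x\|^{\alpha})^{-\frac{\tau'}{1-\tau'}} \, dx \biggr\}^{1-\tau'} \to 0
\]
as $\delta \searrow 0$, as required.

Footnotes:
Research supported by a PhD scholarship from the SIMS fund.
Research supported by an EPSRC Early Career Fellowship and a grant from the Leverhulme Trust.
Research supported by NSF FRG Grant DMS-1265202 and NIH Grant 1-U54AI117924-01.
AMS 2000 subject classifications: 62G05, 62G20.
Keywords and phrases: efficiency, entropy estimation, Kozachenko–Leonenko estimator, weighted nearest neighbours.

imsart-aos ver. 2014/10/16 file: WKLAoSFialSupptex date: January 28, 2018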
2 T B BERRETT, R J SAMWORTH AND M YUAN For the seco part, fix ρ > 0, set ɛ := 1 2 τ sup f F,θ X f F,θ X α+ α+, 1 The, by Höler s iequality agai, a fx ρ fx τ x A ɛ/ρ sup fx τ x as require A ɛ/ρ 1 + ν τ R 1 + x α a τ := τ 2 + 2α+ } τ 1 τ 1 τ x <, Proof of Lemma 10 i The lower bou is immeiate from the fact that h x r V f r for ay r > 0 For the upper bou, observe that by Markov s iequality, for ay r > 0, h x x + r = B x x +r fy y B 0 r fy y 1 µ αf r α The result follows o substitutig r = µ αf 1/α 1 s for s 0, 1 ii We first prove this result i the case β 2, 4], givig the state form of b 1 Let C := 4V β/ / + β, a let y := Cafx β/2 ss/fx} β/ Now, by the mea value theorem, we have for r r a x that h xr V r V fx 2 + 2 r+2 fx afxfx V 2 + β r+β It is coveiet to write s 1+2/ fx s x,y := s 2 + 2V 2/ + y fx 1+2/ The, provie s x,y 0, V raxfx], we have s 1/ x,y h x V fx} 1/ fx s x,y + 2 + 2fx V 2/ Now, by our hypothesis, we kow that sup sup f F,θ sup max s S x X s1+2/ 1+2/ x,y 2/ V s 2/ fx 2 + 2fx 1+2/, y s 1/2 V 2/ max 2 + 2 β/ afxv s1+β/ 2 + βfx β/ x,y } C 2/ }, CC β/ 0 imsart-aos ver 2014/10/16 file: WKLAoSFialSupptex ate: Jauary 28, 2018
EFFICIENT ENTROPY ESTIMATION 3 as Hece there exists 1 = 1, θ N such that for all 1, all f F,θ, s S a x X, we have 1 2 + 2 s1+2/ x,y s 1+2/ s1+2/ 1/2 V 2/ afxs 2/ 2 2 + 2fx 2/ + y } s Moreover, there exists 2 = 2, θ N such that for all 2, all s S, x X a f F,θ we have s x,y 1+β/ 2s 1+β/ Fially, we ca choose 3 = 3, θ N such that max C 4 β/ 4 + 2V 4 β/, 2 1/2 C 2/ + βv 2/, 3/2 C 2/ } 2 + 2 + βv 2/ + β a such that C 8 1/2 V /2 for 3 It follows that for max 1, 2, 3 =:, for f F,θ, s S a for x X, we have that s x,y 0, V raxfx] a h x y s 1/ x,y V fx} 1/ s afxs1+2/ 2 1/2 V 2/ fx 2/ afxβ/2 s 1+β/ fx β/ 1/2 V 2/ afxs 2/ 2 + 2fx 2/ + y } afxs1+β/ s [C afx2 β/2 4 + 2V 4/ Cafx 2 1/2 V 2/ s fx } 4 β/ } s 2/ fx + βv β fx β β/ ] V 0 + β The lower bou is prove by very similar calculatios, a the result for the case β 2, 4] follows The geeral case ca be prove usig very similar argumets, a is omitte for brevity A2 Auxiliary results for the proof of Theorem 2 i the mai text the efiitio of V f give i the statemet of Theorem 1 Recall Lemma 1 For each N a θ Θ a m N, we have i sup f F,θ X fx logm fx x < ; ii if f F,θ V f > 0; imsart-aos ver 2014/10/16 file: WKLAoSFialSupptex ate: Jauary 28, 2018
4 T B BERRETT, R J SAMWORTH AND M YUAN Proof of Lemma 1 Fix N a θ = α, β, γ, ν, a Θ i For ɛ 0, 1 a t 0, 1], we have log 1 t 1 ɛ t ɛ α α1 mɛ Let ɛ = mα+2, so that mɛ = 2 The, by Höler s iequality, for ay f F,θ, fx log m fx x 2 m 1 fx log m f x + 2 m 1 log m f X X fx 2m 1 f mɛ ɛ m fx 1 mɛ x + 2 m 1 log m f X } 1 + x α 1 mɛ mɛ mɛ x 2m 1 γ mɛ ɛ m 1 + ν 1 mɛx + 2 m 1 max log m γ, 1 V α α m logm ν α + α+ α α where the bou o log m f comes from 13 i the mai text ii Now efie A,θ := max sup Hf, 1 } f F,θ 2 log if f, 1 f F,θ }, a the set S,θ := x X : e 4A,θ fx e 2A,θ} For f F,θ, x S,θ a y B x 8 1/2 ae 4A,θ} 1/β 1 we have by Lemma 2 below that 1 fy fx 151/2 ae 4A,θ e 2A,θ y x β 1 7 By the cotiuity of f, there exists x 0 S,θ such that fx 0 = 1 2 e 2A,θ1+ e 2A,θ Thus, by 1, we have that B x0 r,θ S,θ, where Hece 71 e 2A,θ } 1/β 1 1 r,θ := 30 1/2 ae 4A,θ 8 1/2 ae 4A,θ} 1/β 1 V f = E f [log fx 1 + Hf} 2 ] A 2,θ P f X 1 S,θ A 2,θ e 4A,θ V r,θ, as require The followig auxiliary result provies cotrol o eviatios of the esity arisig from the smoothess coitio of our F,θ classes imsart-aos ver 2014/10/16 file: WKLAoSFialSupptex ate: Jauary 28, 2018
EFFICIENT ENTROPY ESTIMATION 5 Lemma 2 For θ = α, β, γ, ν, a Θ, m := β 1, f F,θ a y B x ra x, we have, for multi-iices t with t m, that f t f t y xt x t x 15 1/2 a fx fx y x miβ t,1 7 Proof If t = m the the result follows immeiately from the efiitio of F,θ Heceforth, therefore, assume that m 1 a t m 1 Writig here for the largest absolute etry of a array, we have for y B x ra x that f t f t y xt x t x y x sup f t z B x y x x t z y x f t +1 x + 1/2 y x sup f t +1 z f t +1 x m t l=1 l 1/2 y x l f t +l x z B x y x + m t /2 y x m t sup z B x y x f m z f m x } 1 afxfx y x 1 1/2 y x + β t /2 y x β t 1 151/2 afxfx y x, 7 as require Lemma 3 that Uer the coitios of Theorem 1 i the mai text, we have sup k k 0,,k 1 } sup E f [Ṽ w V f} 2 ] 0 f F,θ Proof For w = w 1,, w k T W k, write suppw := j : w j 0} The E f Ṽ w V f k w j E f log 2 ξ j,1 j=1 w 1 max j suppw X E f log 2 ξ j,1 f log 2 f + E f Ĥw 2 } Hf 2 X f log 2 f + Var f Ĥw + E f Ĥ w 2 Hf 2 imsart-aos ver 2014/10/16 file: WKLAoSFialSupptex ate: Jauary 28, 2018
6 T B BERRETT, R J SAMWORTH AND M YUAN Thus, by Theorem 1 i the mai text, 18 i the proof of that result a Lemma 1i, we have that sup k k 0,,k 1 } sup f F,θ E f Ṽ w V f 0 Now, 2 Var f Ṽ w w 2 1 max j suppw Var f log 2 ξ j,1 + w 2 1 max j,l suppw Cov f log 2 ξ j,1, log 2 ξ l,2 Let a,j := j 3j1/2 log 1/2 0 a a +,j := j + 3j1/2 log 1/2 1 Mimickig argumets i the proof of Theorem 1, for ay m N, j suppw a ɛ > 0, E f log m ξ j,1 fx 1 } = = X fx a+,j 1 a,j 1 0 log m V 1fxh 1 x log m 1s e Ψj e Ψj B j, j s s s k β/ + O max β/ logm 1, k B j, j s s x α α+ ɛ α α+ ɛ } 0, uiformly for j suppw, k k 0,, k 1 } a f F,θ Moreover, by Cauchy Schwarz, we ca ow show, for example, that E f log 4 ξ j,1 = E f [logξ j,1 fx 1 log fx 1 } 4 ] E f log 4 fx 1 uiformly for j suppw, k k 0,, k 1 } a f F,θ The result follows upo otig that we may use a similar approach for the covariace term i 2 to see that sup k k 0,,k 1 } sup f F,θ Var Ṽ w 0 A3 Proof of Propositio 5 i the mai text Proof of Propositio 5 i the mai text I each of the three examples, we provie θ = α, β, γ, ν, a Θ such that f F,θ I fact, β > 0 may be chose arbitrarily i each case i We may choose ay α > 0, a the set ν = 2 α/2 Γ α 2 + 2 /Γ/2 We may also set γ = 2π /2 It remais to fi a A such that 6 i the mai text hols Write H r y := 1 r e y2 /2 r y r e y2 /2 for the rth Hermite polyomial, a ote that H r y p r y, where p r is a polyomial of egree r with o-egative coefficiets Usig multi-iex otatio for partial imsart-aos ver 2014/10/16 file: WKLAoSFialSupptex ate: Jauary 28, 2018
EFFICIENT ENTROPY ESTIMATION 7 erivatives, if t = t 1,, t 0, 1,, } with t := t 1 + + t, we have f t x x t = fx H tj x j fx j=1 p tj x fxq t x, for some polyomial q r of egree r, with o-egative coefficiets It follows that if y Bx1, the for ay β > 0 with m = β 1, f m x f m y fx y x β m m/2 fx y x β m max f t x t: t =m x t f t y x t m+1/2 max sup f t x + w fx t: t =m+1 x t m+1/2 sup j=1 w B 0 1 w B 0 1 m+1/2 e x q m+1 x + 1 fx + wq m+1 x + w fx Similarly, f r x max r=1,,m fx m/2 max q r x r=1,,m Write gδ := 2 log δ2π /2} 1/2 a efie a A by settig aδ := max1, ãδ}, where } ãδ := m/2 sup max max q r x, 1/2 e x q m+1 x + 1 x: x gδ r=1,,m = m/2 max max q r gδ, 1/2 e gδ } q m+1 gδ + 1 r=1,,m The sup x:fx δ M f,a,β x aδ a aδ = oδ ɛ for every ɛ > 0, so 6 i the mai text hols ii We may choose ay α < ρ, a set ν = 2 α/2 Γ α 2 + 2 ρ/2 α/2 Γ ρ α 2 Γ 2 Γ ρ 2 We may also set γ = Γ ρ 2 + 2 To verify 6 i the mai text for suitable Γρ/2ρ /2 π /2 a A, we ote by iuctio, that if t = t 1,, t 0, 1,, } with t := t 1 + + t, the f t x x t fxq t x 1 + x 2 /ρ t, imsart-aos ver 2014/10/16 file: WKLAoSFialSupptex ate: Jauary 28, 2018
8 T B BERRETT, R J SAMWORTH AND M YUAN where q r is a polyomial of egree r with o-egative coefficiets Thus, similarly to the Gaussia example, for ay β > 0 with m = β 1, f m x f m y sup sup y Bx 1 fx y x β m x R m+1/2 sup sup x R w B 0 1 say, where A 1,m,ρ [0, Similarly, fx + wq m+1 x + w fx1 + x 2 /ρ m+1 =: A 1,m,ρ, sup max x R r=1,,m f r x fx m/2 sup max x R r=1,,m q r x 1 + x 2 =: A2 /ρ r,m,ρ, say, where A 2,m,ρ [0, Now efiig a A to be the costat fuctio aδ := max1, A 1,m,ρ, A2,m,ρ }, we agai have that sup x:fx δ M f,a,β x aδ, so 6 i the mai text hols iii We may take ay α > 0 a ν = 1, γ = 3 To verify 6 i the mai text, fix β > 0, set m := β 1, a efie a A by aδ := A m max 1, log 2m+1 1 }, δ for some A m 1 epeig oly o m The, by iuctio, we fi that for some costats A m, B m > 0 epeig oly o m, a x 1, 1 M f,a,β x max max r=1,,m A r 1 x 2, sup 2r y:0< y x r ax B m+1 1 x 2 2m+1 a fx, A m+1 fy 1 y 2 2m+1 fx provie A m i the efiitio of a is chose sufficietly large Hece 6 i the mai text agai hols A4 Proof of Propositio 6 i the mai text Proof of Propositio 6 i the mai text To eal with the itegrals over X c, we first observe that by 13 i the mai text there exists a } imsart-aos ver 2014/10/16 file: WKLAoSFialSupptex ate: Jauary 28, 2018
EFFICIENT ENTROPY ESTIMATION 9 costat C,f > 0, epeig oly o a f, such that X c 3 fx C,f 1 0 X c B k, k s log u x,s s x fx log + log 1 + x µ 1/α α f for every ɛ > 0 Moreover, 4 fx log fx x = Oq1 ɛ, X c } x = O maxq log, q 1 ɛ }, for every ɛ > 0 Now, a slightly simpler argumet tha that use i the proof of Lemma 10ii i the mai text gives that for r 0, r x ], we have h x r V fxr V + β C β, β xr+ We euce, agai usig a slightly simplifie versio of the argumet i Lemma 10ii i the mai text, that there exists 0 N such that for a 0, s [0, 1 ] a x X, we have 5 V fxh 1 x s s 2V β/ + β s1+ β/ C, βx fx 1+ β/ s 2 It follows from 3, 4, 5 a a almost ietical argumet to that leaig to 15 i the mai text that for every 0 a ɛ > 0, a 1 E f Ĥ H V fxh 1 x s fx B k, k s log s x X 0 s + O maxq 1 ɛ, q log, 1 } a 1 2 fx B k, k s V fxh 1 x s s X 0 s s x + O maxq 1 ɛ, q log, 1 } β/ 4V B k+ β/, k C, βx + β B k, k X fx β/ x + O maxq 1 ɛ, q log, 1 }, as require imsart-aos ver 2014/10/16 file: WKLAoSFialSupptex ate: Jauary 28, 2018
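A.5. Completion of the proof of Lemma 7 in the main text. To prove Lemma 7 in the main text, it remains to bound several error terms arising from arguments that approximate the variance of the unweighted Kozachenko–Leonenko estimator $\hat{H}_n$, and then to show how these arguments may be adapted to yield the desired asymptotic expansion for $\mathrm{Var}(\hat{H}_n^w)$.

A.5.1. Bounds on $S_1, \ldots, S_5$.

To bound $S_1$: By similar methods to those used to bound $R_1$ in the proof of Lemma 3 in the main text, it is straightforward to show that for every $\epsilon > 0$, we have
\[
S_1 = \int_{\mathcal{X}_n^c} f(x) \int_0^1 B_{k,n-k}(s) \log^2(u_{x,s}) \, ds \, dx = O\biggl( \frac{k^{\frac{\alpha}{\alpha+d} - \epsilon}}{n^{\frac{\alpha}{\alpha+d} - \epsilon}} \biggr).
\]

To bound $S_2$: For every $\epsilon > 0$, we have that
\[
S_2 = \int_{\mathcal{X}_n} f(x) \int_{\frac{a_n}{n-1}}^1 B_{k,n-k}(s) \log^2(u_{x,s}) \, ds \, dx = o(n^{-(3-\epsilon)}),
\]
by very similar arguments to those used to bound $R_2$ in the proof of Lemma 3 in the main text.

To bound $S_3$: We have
\[
\log^2(u_{x,s}) - \log^2\biggl( \frac{(n-1)s}{e^{\Psi(k)} f(x)} \biggr)
= \biggl\{ 2 \log\biggl( \frac{(n-1)s}{e^{\Psi(k)} f(x)} \biggr) + \log\biggl( \frac{V_d f(x) h_x^{-1}(s)^d}{s} \biggr) \biggr\} \log\biggl( \frac{V_d f(x) h_x^{-1}(s)^d}{s} \biggr).
\]
It therefore follows from Lemma 10(ii) in the main text that for every $\epsilon > 0$,
\[
S_3 = \int_{\mathcal{X}_n} f(x) \int_0^{\frac{a_n}{n-1}} B_{k,n-k}(s) \biggl\{ \log^2(u_{x,s}) - \log^2\biggl( \frac{(n-1)s}{e^{\Psi(k)} f(x)} \biggr) \biggr\} \, ds \, dx
= O\biggl( \max\biggl\{ \frac{k^{\beta/d}}{n^{\beta/d}} \log n, \ \frac{k^{\frac{\alpha}{\alpha+d} - \epsilon}}{n^{\frac{\alpha}{\alpha+d} - \epsilon}} \biggr\} \biggr).
\]

To bound $S_4$: A simplified version of the argument used to bound $R_4$ in Lemma 3 of the main text shows that for every $\epsilon > 0$,
\[
S_4 = \int_{\mathcal{X}_n} f(x) \int_{\frac{a_n}{n-1}}^1 B_{k,n-k}(s) \log^2\biggl( \frac{(n-1)s}{e^{\Psi(k)} f(x)} \biggr) \, ds \, dx = o(n^{-(3-\epsilon)}).
\]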
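For orientation, the quantity whose variance is being expanded here can be sketched in a few lines. The block below is a minimal, unoptimised illustration, assuming the unweighted estimator takes the standard form $\hat{H}_n = n^{-1} \sum_{i=1}^n \log \xi_{(k),i}$ with $\xi_{(k),i} = e^{-\Psi(k)} V_d (n-1) \rho_{(k),i}^d$, where $\rho_{(k),i}$ is the distance from $X_i$ to its $k$th nearest neighbour. The function names (`digamma_int`, `unit_ball_volume`, `kl_entropy`) are hypothetical and not taken from the paper.

```python
import math

EULER_GAMMA = 0.5772156649015329


def digamma_int(k):
    """Digamma function Psi(k) for a positive integer k: -gamma + sum_{j<k} 1/j."""
    return -EULER_GAMMA + sum(1.0 / j for j in range(1, k))


def unit_ball_volume(d):
    """Volume V_d of the unit Euclidean ball in R^d."""
    return math.pi ** (d / 2) / math.gamma(1 + d / 2)


def kl_entropy(points, k):
    """Unweighted k-nearest-neighbour entropy estimate (brute force, O(n^2 log n)).

    points: list of equal-length tuples of floats, assumed to be distinct
    (a repeated point would give a zero distance and log(0)).
    """
    n, d = len(points), len(points[0])
    # log of e^{-Psi(k)} * V_d * (n - 1), common to every summand
    log_const = -digamma_int(k) + math.log(unit_ball_volume(d)) + math.log(n - 1)
    total = 0.0
    for i, x in enumerate(points):
        dists = sorted(math.dist(x, y) for j, y in enumerate(points) if j != i)
        rho_k = dists[k - 1]  # distance to the k-th nearest neighbour
        total += log_const + d * math.log(rho_k)
    return total / n
```

On four equally spaced points on the line with $k = 1$, every nearest-neighbour distance equals 1, so the estimate reduces to $\log(V_1 (n-1)) - \Psi(1) = \log 6 + \gamma$, which gives a quick sanity check of the constants.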
EFFICIENT ENTROPY ESTIMATION 11 To bou S 5 : Very similar argumets to those use to bou R 1 i Lemma 3 i the mai text show that for every ɛ > 0, α k S 5 = fx log 2 α+ ɛ fx x = O X c α α+ ɛ A52 Bous o T 1, T 2 a T 3 To bou T 1 : Let B Betak 1, k 1 By 13 i the mai text, for every ɛ > 0, T 11 := fxfy log fy logufx F,x F,xu y x X X c c ũ,x,y 2 fxfy log fy k 1 X c X c 1 logu x,s fx B k 1, k 1 s 2s 1 0 k 1 s y x [ fxfy log fy E log 1 } X c X c B + log 1 2B 1 1 B k 1 + log + log fx + log 1 + x } ] µ 1/α E 2B 1 α f k 1 y x k 1 2 + 2α α+ ɛ = o, 2α α+ ɛ where we use the Cauchy Schwarz iequality a elemetary properties of beta raom variables to obtai the fial bou Now let u x := u x,a/ 1 = V 1h 1 x a 1 e Ψk, a cosier T 12 := fxfy log fy logufx F,x F,xu y x X ũ,x,y X c If ũ,x,y u x, the by very similar argumets to those use to bou R 1 a R 2 cf 13 a 14 i the mai text, together with Cauchy Schwarz, logufx F,x F,xu ũ,x,y 6 1 a 1 logu x,s fx B k 1, k s + B k, k 1 s} s log + log fx + log 1 + x µ 1/α α f 3 ɛ, imsart-aos ver 2014/10/16 file: WKLAoSFialSupptex ate: Jauary 28, 2018
12 T B BERRETT, R J SAMWORTH AND M YUAN for every ɛ > 0 O the other ha, if ũ,x,y < u x, the x y < r,u x +r,u y, where we have ae the r,u y term to ai a calculatio later i the proof Defie the sequece From Lemma 10ii i the mai text, ρ := [ c log 1/ 1 ] 1 sup r,u y = sup h 1 a } k log 1/ k log 1/ y sup = oρ y X y X 1 y X fy δ Now suppose that x X c a y X satisfy y x ρ Choose 0 N large eough that r,u y ρ /2 for all y X, a that log 1 max3/2 8 1/2 /β, 12V 1 2 } for all 0 a k k0,, k 1 } The whe β 0, 1] a 0, usig the fact that B x ρ /2 B y 3ρ /2, we have fw w V fyρ /2 V afyfyρ /2 3ρ /2 β 7 B xρ /2 V fyρ /2 1 3c ρ /2 β } 1 2 V ρ /2 δ a 1 Hece, for all 0, x X c, y X with y x ρ a k k 0,, k 1 }, 8 r,u x + r,u y ρ O other ha, suppose istea that x X c a ρ x := if y X y x ρ Sice X is a close subset of R, we ca fi y X such that y x = ρ x, a set x := ρ ρ x + 1 ρ x ρ y The x y = ρ, so from 7, we x have r,u x ρ /2 for 0 a k k0,, k 1 } Sice B xρ /2 B x ρ x ρ /2, we euce that r,u x ρ x ρ /2 a 9 y X : x y < r,u x + r,u y} = for 0 a k k 0,, k 1 } But for 0, 10 sup 1 sup x X c y X : y x ρ fy 151/2 fx fy c ρ β < 1 7 2, so that if x X c, y X a x y ρ, the fy < 2δ for 0 a k k 0,, k 1 } imsart-aos ver 2014/10/16 file: WKLAoSFialSupptex ate: Jauary 28, 2018
EFFICIENT ENTROPY ESTIMATION 13 It therefore follows from 6, 8, 9, 10 a the argumet use to bou T 11 that for each ɛ > 0 a 0, T 12 fxfy log fy 1 x y <r,u X x +r,u y} X c X c 0 logufx F,x F,xu y x + o 2 y:fy<2δ fxfy log fy 0 k 1 2 + 2α = o α+ ɛ 2α α+ ɛ logufx F,x F,xu y x + o 2 Fially for T 1, we efie T 13 := r,1 fxfy log fy X Bx c fx 1/ log ufx F,x F,xu y x ũ,x,y By Lemma 10ii i the mai text, we ca fi 1 N such that for 1, k k0,, k 1 }, x X a s a / 1, we have V fxh 1 x s 2s Thus, for 1, k k0,, k 1 }, x X a y Bx c r,1, fx 1/ ũ,x,y 24 log fx 2a fxe Ψk u x Thus, from 6, T 13 = O 2 log We coclue that for every ɛ > 0, k 1 2 + 2α T 1 T 11 + T 12 + T 13 = o α+ ɛ 2α α+ ɛ To bou T 2 : Fix x X a z B 0 Choosig 2 N large eough that r,1 8 1/2 1/β c 1 δ 1/ for 2, we have by Lemma 2 that sup fy fx 1 1 2 y B x r,1 δ 1/ imsart-aos ver 2014/10/16 file: WKLAoSFialSupptex ate: Jauary 28, 2018
14 T B BERRETT, R J SAMWORTH AND M YUAN for 2, k k0,, k 1 } Also, for all 2, k k0,, k 1 }, we have fy x,z log fy x,z fx log fx fy x,z logfy x,z /fx + log fx fy x,z fx afxfx y x,z x β log fx + 4} Moreover, by argumets use to bou T 11, logufx F,x F,xu E 2B logb 1 z /fx k 1 + log + log fx + log 1 + x } E 2B 1 f k 1, µ 1/α α where B Betak 1, k 1 It follows that for every ɛ > 0, T 2 = eψk fy x,z log fy x,z fx log fx} V 1 X B 0 z /fx α k 1/2 k = O max α+ ɛ } kβ/, α ɛ α+ β/ log2+β/ logufx F,x F,xu z x To bou T 3 : Note that by Fubii s theorem, fx log fx logufx F,x F,xu z x z X B 0 = V X fx log fx = V X fx log fx e Ψk fx 0 u x 0 miufx, } logufx F,x F,xu x ufx logufx F,x F,xu x + O 3 ɛ, for all ɛ > 0, where the orer of the error term follows from a similar argumet to that use to obtai 6 a Lemma 10i Thus, for every ɛ > 0, T 3 = k 1 a 1 V fxh 1 x s fx log fx logu x,s fx k 1 X 0 s } } 1s 2s log B k, k 1 s 1 s x + O 3 ɛ k 1 k 1/2 = O max k α α+ ɛ α α+ kβ/, log ɛ β/ } imsart-aos ver 2014/10/16 file: WKLAoSFialSupptex ate: Jauary 28, 2018
EFFICIENT ENTROPY ESTIMATION 15 A53 Bous o U 1 a U 2 To bou U 1 : Usig Lemma 10i a 13 i the mai text as i our bous o T 11 we have that for every ɛ > 0, u x U 11 := fx log ufx F,x F,x u x X c 0 a 1 fx logu x,s fx B k, k 1 s 1s k X c 0 k 1 s x 1 k 2 + α α+ ɛ 11 = o 1+ α α+ ɛ Moreover, usig argumets similar to those use to bou R 2 i the proof of Lemma 3 i the mai text, for every ɛ > 0, 12 U 12 := fx log ufx F X u x,x F,x u x = o 3 ɛ From 11, a 12, we have for every ɛ > 0 that 1 k 2 + α U 1 U 11 + U 12 = o α+ ɛ 1+ α α+ ɛ To bou U 2 : By Lemma 10ii a lettig B Betak + β/, k 1, we have that for every ɛ > 0, a 1 U 21 := V fxh 1 x s 1s k } fx log B k, k 1 s s x X 0 s k 1 kβ/ β/ E 1B k k 1 afxfx 1 β/ x X k 1/2 k β/ = O max β/, k α α+ ɛ } α+ ɛ α Moreover, we ca use similar argumets to those use to bou R 4 i the proof of Lemma 3 i the mai text to show that for every ɛ > 0, 1 } U 22 := 1s 1s k fx log X e Ψk B k, k 1 s s x k 1 = o 3 ɛ a 1 We euce that for every ɛ > 0, k 1/2 U 2 U 21 + U 22 = O k β/ max β/, k α α+ ɛ α α+ ɛ } imsart-aos ver 2014/10/16 file: WKLAoSFialSupptex ate: Jauary 28, 2018
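A.5.4. Bounds on $W_1, \ldots, W_4$.

To bound $W_1$: We partition the region $([l_x, v_x] \times [l_y, v_y])^c$ into eight rectangles as follows:
\[
\bigl([l_x, v_x] \times [l_y, v_y]\bigr)^c = \bigl([0, l_x) \times [0, l_y)\bigr) \cup \bigl([0, l_x) \times [l_y, v_y]\bigr) \cup \bigl([0, l_x) \times (v_y, \infty)\bigr) \cup \bigl([l_x, v_x] \times [0, l_y)\bigr)
\]
\[
\cup \bigl([l_x, v_x] \times (v_y, \infty)\bigr) \cup \bigl((v_x, \infty) \times [0, l_y)\bigr) \cup \bigl((v_x, \infty) \times [l_y, v_y]\bigr) \cup \bigl((v_x, \infty) \times (v_y, \infty)\bigr).
\]
Recall our shorthand $h(u, v) = \log(u f(x)) \log(v f(y))$. By Lemma 10(i) in the main text and the Cauchy–Schwarz inequality, as well as very similar arguments to those used to bound $R_2$ in the proof of Lemma 3 in the main text, we can bound the contributions from each rectangle individually, to obtain that for every $\epsilon > 0$,
\[
W_1 = \int_{\mathcal{X} \times \mathcal{X}} f(x) f(y) \int_{([l_x, v_x] \times [l_y, v_y])^c} h(u, v) \, d(F_{n,x,y} - F_{n,x} F_{n,y})(u, v) \, dx \, dy = o(n^{-(9/2 - \epsilon)}).
\]

To bound $W_2$: We have
\[
W_2 = \int_{\mathcal{X} \times \mathcal{X}} f(x) f(y) \int_{l_x}^{v_x} \int_{l_y}^{v_y} h(u, v) \, d(G_{n,x,y} - F_{n,x} F_{n,y})(u, v) \, dx \, dy + \frac{1}{n}.
\]
We write $B_{a,b,c} := \frac{\Gamma(a)\Gamma(b)\Gamma(c)}{\Gamma(a+b+c)}$, and, for $s, t > 0$ with $s + t < 1$, let
\[
B_{a,b,c}(s, t) := \frac{s^{a-1} t^{b-1} (1 - s - t)^{c-1}}{B_{a,b,c}} \qquad (13)
\]
denote the density of a $\mathrm{Dirichlet}(a, b, c)$ random vector at $(s, t)$. For $a, b > -1$, writing $I_n := [a_n^-/(n-1), a_n^+/(n-1)]$, let
\[
B^{(n)}_{k+a,n-k} := \int_{I_n} s^{k+a-1} (1-s)^{n-k-1} \, ds, \qquad
B^{(n)}_{k+a,n-k}(s) := \frac{s^{k+a-1} (1-s)^{n-k-1}}{B^{(n)}_{k+a,n-k}},
\]
\[
B^{(n)}_{k+a,k+b,n-2k-1} := \int_{I_n \times I_n} s^{k+a-1} t^{k+b-1} (1 - s - t)^{n-2k-2} \, ds \, dt, \qquad
B^{(n)}_{k+a,k+b,n-2k-1}(s, t) := \frac{s^{k+a-1} t^{k+b-1} (1 - s - t)^{n-2k-2}}{B^{(n)}_{k+a,k+b,n-2k-1}}.
\]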
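The Dirichlet density in (13) is easy to evaluate numerically, which can help when checking normalising constants such as $B_{a,b,c}$ by hand. The following is a minimal sketch only; the function names are hypothetical, and log-gamma is used (an implementation choice, not something specified in the text) so that parameters of the order of $n$ do not overflow.

```python
import math


def log_dirichlet_norm(a, b, c):
    """log B_{a,b,c} = log{Gamma(a) Gamma(b) Gamma(c) / Gamma(a + b + c)}."""
    return math.lgamma(a) + math.lgamma(b) + math.lgamma(c) - math.lgamma(a + b + c)


def dirichlet_density(s, t, a, b, c):
    """Density of a Dirichlet(a, b, c) random vector at (s, t); zero off the open simplex."""
    if s <= 0.0 or t <= 0.0 or s + t >= 1.0:
        return 0.0
    log_density = ((a - 1) * math.log(s) + (b - 1) * math.log(t)
                   + (c - 1) * math.log(1.0 - s - t) - log_dirichlet_norm(a, b, c))
    return math.exp(log_density)


def total_mass(a, b, c, grid=400):
    """Midpoint-rule approximation to the integral of the density over the simplex."""
    h = 1.0 / grid
    return sum(dirichlet_density((i + 0.5) * h, (j + 0.5) * h, a, b, c)
               for i in range(grid) for j in range(grid)) * h * h
```

With $a = b = c = 1$ the density is the constant $\Gamma(3) = 2$ on the simplex, and with $(a, b, c) = (2, 1, 1)$ it equals $6s$; `total_mass` gives a crude check that the density integrates to one for smooth parameter choices.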
EFFICIENT ENTROPY ESTIMATION 17 The by the triagle a Pisker s iequalities, a Beta tail bous similar to those use previously, we have that B k+a,k+b, 2k 1 s, t B k+a, k sb k+b, k t s t I I B k+a,k+b, 2k 1 1 B k+a,k+b, 2k 1 + B k+a, k B k+b, k 1 B k+a, k B k+b, k B + 2 k+a,k+b, 2k 1 s, t log k+a,k+b, 2k 1 s, t = 2 B I I 1 1 t 0 0 B k+a, k sb k+b, k t Bk+a,k+b, 2k 1 s, t B k+a,k+b, 2k 1 s, t log B k+a, k sb k+b, k t } 1/2 s t } 1/2 s t + o 2 Γ + a + b 1Γ k = 2 [log 1/2 2 + 2k 2ψ 2k 1 Γ 2k 1Γ + aγ + b 14 = k 1 + o1} k 1ψ + b k 1 + ψ + a k 1} + ψ + a + b 1] 1/2 + o 2 As a first step towars bouig W 2 ote that vx vy W 21 := fxfy hu, v G,x,y F,x F,y u, v x y X X l x l y = fxfy logu x,s fx logu y,t fy X X I I Bk,k, 2k 1 s, t B k, k sb k, k t } s t x y 1s 1t = fxfy log X X I I e Ψk log e Ψk Bk,k, 2k 1 s, t B k, k sb k, k t } s t x y + W 211 15 = 1 + O k α α+ ɛ 1+ α α+ ɛ + O 2 + W 211, for every ɛ > 0 But, by Lemma 10ii i the mai text a 14, for every imsart-aos ver 2014/10/16 file: WKLAoSFialSupptex ate: Jauary 28, 2018
18 T B BERRETT, R J SAMWORTH AND M YUAN ɛ > 0, W 211 = V h 1 x s fx 1t fxfy 2 log log X X I I s e Ψk V h 1 x s fx V h 1 y t } fy + log log s t Bk,k, 2k 1 s, t B k, k sb k, k t } s t x y 2 V h 1 x s fx fxfy log X X I s [ log } 1 Ψ k 1 + log1 s Bk, k 1 s log 1 Ψ } ] B k, k s s x y k 1+ 2β 2α k1+ α+ ɛ } + O max, 16 k 1/2 k β/ = O max β/, k 1+ 2β α α+ ɛ α α+ ɛ } 2α 1+ α+ ɛ Moreover, by Lemma 10i a ii i the mai text a very similar argumets, for every ɛ > 0, vx vy W 22 := fxfy hu, v G,x,y F,x F,y u, v x y X X c l x l y k 1+ α α+ ɛ α k α+ ɛ } kβ/ = O 1+ max, α ɛ α+ α ɛ α+ β/, 1 k 1/2 vx vy W 23 := fxfy hu, v G,x,y F,x F,y u, v x y 17 X c X c k 1+ 2α = O α+ ɛ 2α 1+ α+ ɛ l x l y Icorporatig our restrictios o k, we coclue from 15, 16 a 17 that for every ɛ > 0, W 2 W 21 + 1 k 1/2 k β/ + 2 W 22 + W 23 = O max β/, k α α+ ɛ } α+ ɛ α To bou W 3 : We write h u, h v a h uv for the partial erivatives of hu, v a write, for example, h u F u, v = h u u, vf u, v We fi o itegratig imsart-aos ver 2014/10/16 file: WKLAoSFialSupptex ate: Jauary 28, 2018
EFFICIENT ENTROPY ESTIMATION 19 by parts that, writig F = F,x,y G,x,y, vx vy h F u, v h uv F u, v u v [l x,v x] [l y,v y] vx = 18 l x [ hu F u, l y h u F u, v y ] u + l x l y vy l y [ hv F l x, v h v F v x, v ] v + hf v x, v y + hf l x, l y hf v x, l y hf l x, v y Usig staar biomial tail bous as use to bou W 1 together with 13 i the mai text we therefore see that for every ɛ > 0, v x vy vx vy } W 31 := fxfy h F u, v h uv F u, v u v x y 19 = X X X X fxfy l x l y vx l x h u F u, v y u + + o 9/2 ɛ l x l y vy l y } h v F v x, v v x y Now, uiformly for u [l x, v x ] a x, y X X a for every ɛ > 0, 2 F u, v y = 1 x y r,u} p k 1 k 1,x,u1 p,x,u k 1 + o 9/2 ɛ B k, k p,x,u = 1 x y r,u} + o 9/2 ɛ 1 1 20 1 x y r,vx } 2πk 1/2 1 + o1} + o 9/2 ɛ By 10 a the argumets leaig up to it, we have 21 sup sup x X c y X B xr,vx +r,vy fx fy 1 0 We therefore have by 13 i the mai text that, for every ɛ > 0, 22 X c X fxfy vx l x k 1 2 + 2α h u F u, v y u y x = O Now, usig Lemma 10ii i the mai text, for x X, α+ ɛ 2α α+ ɛ k β/ 23 max l x fx 1, v x fx 1 } afx + log1/2 fx k 1/2 imsart-aos ver 2014/10/16 file: WKLAoSFialSupptex ate: Jauary 28, 2018
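The final step of (20) uses the fact that a binomial point probability of the form $\binom{n-2}{k-1} p^{k-1} (1-p)^{n-k-1}$ is at most $(2\pi k)^{-1/2}\{1 + o(1)\}$, whatever the value of $p$. The sketch below illustrates this numerically under the assumption that the worst case over $p$ is the mode $p = (k-1)/(n-2)$, which follows from differentiating the log-probability in $p$. The helper names are hypothetical.

```python
import math


def binom_pmf(m, N, p):
    """P(Bin(N, p) = m), computed on the log scale to avoid overflow."""
    log_pmf = (math.lgamma(N + 1) - math.lgamma(m + 1) - math.lgamma(N - m + 1)
               + m * math.log(p) + (N - m) * math.log1p(-p))
    return math.exp(log_pmf)


def peak_ratio(n, k):
    """Largest value over p of P(Bin(n-2, p) = k-1), divided by (2 pi k)^{-1/2}."""
    N, m = n - 2, k - 1
    p_star = m / N  # maximiser of p^m (1 - p)^(N - m)
    return binom_pmf(m, N, p_star) * math.sqrt(2.0 * math.pi * k)
```

For example, `peak_ratio(10002, 101)` is within a couple of percent of 1, consistent with the $\{1 + o(1)\}$ factor when $k$ is large and $k/n$ is small.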
20 T B BERRETT, R J SAMWORTH AND M YUAN We also ee some cotrol over vfy By 10 a the work leaig up to it, for max 0, 5, x X a y x r,vx + r,vy, fy 1 151/2 c ρ β} δ δ /2 k/ 1 7 Thus afy c β a usig 21 we may apply Lemma 10ii i the mai text to the set X = X y : y x r,vx + r,vy for some x X } From this a 21, for ay x X a y B x r,vx + r,vy, k β/ 24 max l y fy 1, v y fy 1 afy + log1/2 fx k 1/2 Usig 21 agai, we have that afy x,z fx ɛ for each ɛ > 0, uiformly for x X a z v x fx} 1/ + v y fx} 1/ From 20, 23 a 24 we therefore have that vx fxfy h u F u, v y u y x X X l x k 1/2 fxfy1 x y <r,vx } logv y fy logv x /l x y x 25 X X k 1/2+2β/ = O max 1+2β/ k 1/2, k 1 2 + α, log α+ ɛ 1+ α α+ ɛ } for every ɛ > 0 By 19, 22 a 25, it follows that k 1/2+2β/ 26 W 31 = O max 1+2β/, log k 2α k 1/2+ α+ ɛ, 1/2 2α α+ ɛ } Fially, by 13 i the mai text a 21, we have sice F = 0 whe x y > r,u + r,v that 27 W 32 := fxfy X c X vx vy Combiig 26 a 27 we have that l x l y 2α k h uv F u, v u v x y = O k 1/2+2β/ W 3 = W 31 + W 32 = O max 1+2β/, log k 1/2, k 2α α+ ɛ 2α α+ ɛ α+ ɛ 2α α+ ɛ } imsart-aos ver 2014/10/16 file: WKLAoSFialSupptex ate: Jauary 28, 2018
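To bound $W_4$: Let $p_\cap := \int_{B_x(r_{n,u}) \cap B_y(r_{n,v})} f(w) \, dw$ and let
\[
(N_1, N_2, N_3, N_4) \sim \mathrm{Multi}\bigl(n-2; \ p_{n,x,u} - p_\cap, \ p_{n,y,v} - p_\cap, \ p_\cap, \ 1 - p_{n,x,u} - p_{n,y,v} + p_\cap\bigr).
\]
Further, let
\[
F^{(1)}_{n,x,y}(u, v) := P(N_1 + N_3 \ge k, \ N_2 + N_3 \ge k),
\]
so that
\[
(F_{n,x,y} - F^{(1)}_{n,x,y})(u, v) = P(N_1 + N_3 = k-1, \ N_2 + N_3 \ge k) \mathbb{1}_{\{\|x - y\| \le r_{n,u}\}} + P(N_2 + N_3 = k-1, \ N_1 + N_3 \ge k) \mathbb{1}_{\{\|x - y\| \le r_{n,v}\}}
\]
\[
+ P(N_1 + N_3 = k-1, \ N_2 + N_3 = k-1) \mathbb{1}_{\{\|x - y\| \le r_{n,u} \wedge r_{n,v}\}}.
\]
Now
\[
P(N_1 + N_3 = k-1) = \binom{n-2}{k-1} p_{n,x,u}^{k-1} (1 - p_{n,x,u})^{n-k-1} \le (2\pi k)^{-1/2} \{1 + o(1)\}
\]
and $F_{n,x,y}(u, v) = G_{n,x,y}(u, v)$ if $\|x - y\| > r_{n,u} + r_{n,v}$, and so, by (23) and (24), we have that
\[
\int_{\mathcal{X}_n \times \mathcal{X}} f(x) f(y) \int_{l_x}^{v_x} \int_{l_y}^{v_y} \frac{(F_{n,x,y} - G_{n,x,y})(u, v)}{uv} \, du \, dv \, dx \, dy
= \int_{\mathcal{X}_n \times \mathcal{X}} f(x) f(y) \int_{l_x}^{v_x} \int_{l_y}^{v_y} \frac{(F^{(1)}_{n,x,y} - G_{n,x,y})(u, v)}{uv} \, du \, dv \, dx \, dy
\]
\[
+ O\biggl( \max\biggl\{ \frac{\log n}{n k^{1/2}}, \ \frac{k^{\frac{1}{2} + \frac{2\beta}{d}}}{n^{1 + \frac{2\beta}{d}}}, \ \frac{k^{\frac{1}{2} + \frac{\alpha}{\alpha+d} - \epsilon}}{n^{1 + \frac{\alpha}{\alpha+d} - \epsilon}} \biggr\} \biggr). \qquad (28)
\]
We can now approximate $F^{(1)}_{n,x,y}(u, v)$ by $\Phi_\Sigma\bigl(k^{1/2}\{u f(x) - 1\}, k^{1/2}\{v f(x) - 1\}\bigr)$ and $G_{n,x,y}(u, v)$ by $\Phi\bigl(k^{1/2}\{u f(x) - 1\}\bigr) \Phi\bigl(k^{1/2}\{v f(x) - 1\}\bigr)$. To avoid repetition, we focus on the former of these terms. To this end, for $i = 3, \ldots, n$, let
\[
Y_i := \begin{pmatrix} \mathbb{1}_{\{X_i \in B_x(r_{n,u})\}} \\ \mathbb{1}_{\{X_i \in B_y(r_{n,v})\}} \end{pmatrix},
\qquad \text{so that} \qquad
\sum_{i=3}^n Y_i = \begin{pmatrix} N_1 + N_3 \\ N_2 + N_3 \end{pmatrix}.
\]
We also define
\[
\mu := E(Y_i) = \begin{pmatrix} p_{n,x,u} \\ p_{n,y,v} \end{pmatrix},
\qquad
V := \mathrm{Cov}(Y_i) = \begin{pmatrix} p_{n,x,u}(1 - p_{n,x,u}) & p_\cap - p_{n,x,u} p_{n,y,v} \\ p_\cap - p_{n,x,u} p_{n,y,v} & p_{n,y,v}(1 - p_{n,y,v}) \end{pmatrix}.
\]
When $x \in \mathcal{X}_n$ and $y \in B_x(r_{n,v_x} + r_{n,v_y})$ we have that, writing $\Delta$ for the symmetric difference and using (21), $P\bigl(X_1 \in B_x(r_{n,u}) \,\Delta\, B_y(r_{n,v})\bigr) > 0$ and so $V$ is invertible for such $x$ and $y$. We may therefore set $Z_i := V^{-1/2}(Y_i - \mu)$.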
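The role of the symmetric difference in the invertibility of $V$ can be seen in a small numerical sketch. The code below builds the $2 \times 2$ covariance matrix of the pair of ball-membership indicators from the three probabilities $p_{n,x,u}$, $p_{n,y,v}$ and $p_\cap$ (passed as plain numbers; the function names are hypothetical) and computes its determinant.

```python
def indicator_covariance(p_x, p_y, p_cap):
    """Covariance matrix of (1{X in B_x}, 1{X in B_y}) with P(B_x) = p_x,
    P(B_y) = p_y and P(B_x and B_y) = p_cap."""
    cross = p_cap - p_x * p_y
    return [[p_x * (1.0 - p_x), cross],
            [cross, p_y * (1.0 - p_y)]]


def det2(m):
    """Determinant of a 2 x 2 matrix given as nested lists."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]
```

With $(p_x, p_y, p_\cap) = (0.3, 0.4, 0.1)$ the determinant is $0.05 > 0$, whereas when the two balls coincide, say $(0.3, 0.3, 0.3)$, the two indicators are equal almost surely and the determinant vanishes, so $V^{-1/2}$ is not defined.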
22 T B BERRETT, R J SAMWORTH AND M YUAN The by the Berry Essee bou of Götze 1991, writig C for the set of close, covex subsets of R 2 a lettig Z N 2 0, I, there exists a uiversal costat C 2 > 0 such that 29 sup P 1 2 1/2 C C i=3 Z i C PZ C C 2E Z 3 3 2 1/2 The istributio of Z 3 epes o x, y, u a v, but, recallig the substitutio y = y x,z as efie i 22 i the mai text, we claim that for x X, y = y x,z B x r,u + r,v, u [l x, v x ] a v [l y, v y ], 30 E Z 3 3 1/2 k z To establish this, ote that for x X a y x r,vx + r,vy, we have k by 21, 23 a 24 that y x fx 1/ Thus, for v [l y, v y ], a usig Lemma 2, we also have that 31 vfx 1 max v y fy 1, l y fy 1 + v y fy fx k β/ afx fy + log1/2 fx k 1/2 Now, by the efiitio of l x a v x, 32 max p,x,u k/ 1, p,y,v k/ 1 } 3k1/2 log 1/2 1 for all x, y X a u [l x, v x ], v [l y, v y ] Next, we bou 2 k p α z for x X a y = y x,z with z v x fx} 1/ + v y fx} 1/ First suppose that u v We may write B x r,u B y r,v = B x r,v B y r,v } [B x r,u \B x r,v } B y r,v ], where this is a isjoit uio Writig I a,b x := x 0 B a,bs s for the regularise icomplete beta fuctio a recallig that µ eotes Lebesgue measure o R, we have µ Bx r,v B y r,v = V r,vi +1 2, 1 2 = veψk 1 I +1 2, 1 2 1 1 x y 2 4r,v 2 z 2 4vfx} 2/ imsart-aos ver 2014/10/16 file: WKLAoSFialSupptex ate: Jauary 28, 2018
a Now, r I +1 2, 1 2 EFFICIENT ENTROPY ESTIMATION 23 α z = I +1 1 2, 1 z 2 2 4 1 r2 = 1 r2 /4 1 2 4 B +1/2,1/2 Hece by the mea value iequality, µ Bx r,v B y r,v eψk α z 1fx eψk 1 1 B +1/2,1/2 [ v z 1 vfx} 1/ + α z 1 vfx B +1/2,1/2 fx It follows that for all x X, y B x r,vx + r,vy a v [l y, v y ], B xr,v B yr,v fw w eψk α z 1 k k β/ afx fy + k1/2 log 1/2 fx usig 31 a Lemma 2 We also have by 32 that fw w p,x,u p,x,v B xr,u\b xr,v} B yr,v ] k k β/ afx fy + k1/2 log 1/2 fx Thus, whe x X, y = y x,z B x r,vx + r,vy, u [l x, v x ], v [l y, v y ] a u v, 33 2 k p α z k β/ afx fy + log1/2 fx k 1/2 We ca prove the same bou whe v > u similarly, usig 23, 31 a Lemma 2 We will also require a lower bou o p,x,u + p,y,v 2p i the regio where B x r,u B y r,v, ie, z ufx} 1/ + vfx} 1/ By the mea value theorem, } 2 1 I +1 2, 1 1 δ 2 2 1/2 /2 δ max, 1 I +1 2 B +1/2,1/2 2, 1 1/2 2 imsart-aos ver 2014/10/16 file: WKLAoSFialSupptex ate: Jauary 28, 2018
24 T B BERRETT, R J SAMWORTH AND M YUAN for all δ [0, 1] Thus, for u v, with v [l y, v y ], x X, a y = y x,z with z 2vfx} 1/, by 31 we have, µ Bx r,u B y r,v c µ Bx r,v B y r,v c } = V r,v x y 2 1 I +1 1 2, 1 2 4r,v 2 k z fx Whe z > 2vfx} 1/ we simply have µ Bx r,v B y r,v c = V r,v a the same overall bou applies Moreover, the same lower bou for µ By r,v B x r,u c hols whe u < v, u [l x, v x ], x X, a y = y x,z B x r,vx +r,vy We euce that for all x X, y = y x,z B x r,vx + r,vy, u [l x, v x ] a v [l y, v y ], 34 p,x,u + p,y,v 2p maxp,x,u p, p,y,v p } k z We are ow i a positio to bou E Z 3 3 above for x X, y = y x,z B x r,vx + r,vy, u [l x, v x ], v [l y, v y ] We write E Z 3 3 = p V 1/2 1 p,x,u 3 + p 1 p,x,u p,y,v V 1/2 1 p,x,u 3 p,y,v + p,y,v p V 1/2 p,x,u 3 1 p,y,v + 1 p,x,u p,y,v + p V 1/2 p,x,u 3 35, a bou each of these terms i tur First, p V 1/2 1 p,x,u 3 1 p,y,v 36 p,y,v = p V 3/2 1 p,x,u 1 p,y,v p,x,u + p,y,v 2p } 3/2 } 3/2 1 p,x,u 1 p,y,v = p p p,x,u p,y,v + p,x,u p p,y,v p p,x,u+p,y,v 2p } p,x,u + p,y,v 1 3/2 p mi, 1/2 /k 1/2, V p p,x,u p,y,v usig 32 a 33, a where we erive the fial bou from the left ha sie of the miimum if z 1 a the right ha sie if z < 1 Similarly, 37 p,x,u p V 1/2 1 p,x,u 3 1/2, p,x,u p p,y,v V 3/2 3/2 k z p,y,v imsart-aos ver 2014/10/16 file: WKLAoSFialSupptex ate: Jauary 28, 2018
EFFICIENT ENTROPY ESTIMATION 25 where we have use 34 for the fial bou By symmetry, the same bou hols for the thir term o the right-ha sie of 35 Fially, very similar argumets yiel 38 1 p,x,u p,y,v + p V 1/2 p,x,u 3 k/ 3/2 p,y,v Combiig 36, 37 a 38 gives 30 Writig Φ A for the measure associate with the N 2 0, A istributio for ivertible A, a φ A for the correspoig esity, we have by Pisker s iequality a a Taylor expasio of the log-etermiat fuctio that 2 sup Φ A C Φ B C 2 C C R 2 φ A log φ A φ B = 1 2 log B log A + trb 1 A B} B 1/2 A BB 1/2 2, provie B 1/2 A BB 1/2 1/2 Hece sup Φ A C Φ B C mi1, 2 B 1/2 A BB 1/2 } C C We ow take A = 2V/k, B = Σ a use the submultiplicativity of the Frobeius orm alog with 32 a 33 a the fact that Σ 1/2 = 1 + α z 1 + 1 α z 1 } 1/2 to euce that 39 sup Φ A C Φ B C 1 k β/ } afx fy + log1/2 C C z fx k 1/2 for x X, y Bxr,vx + r,vy, u [l x, v x ], v [l y, v y ] Now let u = fx 1 1+k 1/2 s a v = fx 1 1+k 1/2 t By the mea value theorem, 23 a 31, } Φ Σ k 1/2 k 2µ Φ k Σ s, t } 1 2p,x,u k 2π 1/2 k 1/2 s + 2p,y,v k k 1/2 t k β/ 40 k 1/2 afx fy + k 1/2 fx It follows by 29, 30, 39 a 40 that for x X a y B xr,vx + imsart-aos ver 2014/10/16 file: WKLAoSFialSupptex ate: Jauary 28, 2018
26 T B BERRETT, R J SAMWORTH AND M YUAN r,vy, sup u [l x,v x],v [l y,v y] F 1,x,yu, v Φ Σ s, t mi 1, log1/2 k k 1/2 + afx fy z fx β/ k 1/2 + 1 } z Therefore, by 23 a 24, a sice fy fx/2 for x X, y B x r,vx + r,vy a 0, we coclue that for each ɛ > 0 a 0 41 X X vx vy 1 F,xu, v Φ Σ s, t fxfy 1 l x l y uv x y r,u+r,v} u v y x k log 1/2 k β/ } 2 fx X k 1/2 + afx/2 fx sup F,x,y 1 x,z u, v Φ Σ s, t z x k log 5/2 = O max k 3/2 B 0 3 u [l x,v x],v [l yx,z,v yx,z ], k 1 2 + α α+ ɛ α α+ ɛ, k 1/2+β/ log β/, k1/2+2β/ 2β/ } By similar i fact, rather simpler meas we ca establish the same bou for the approximatio of G,x,y by Φk 1/2 ufx 1}Φk 1/2 vfx 1} To coclue the proof for the uweighte case, we write X = X 1 X 2, where X 1 := x : fx k 2β δ }, X 2 := x : δ fx < k 2β δ }, a eal with these two regios separately We have by Slepia s iequality that Φ Σ s, t ΦsΦt for all s a t Hece, recallig that s = s x,u = k 1/2 ufx 1} a t = t x,v = k 1/2 vfx 1}, by 21, 23 a 31, for every ɛ > 0, vx vy Φ Σ s, t ΦsΦt 42 X 2 X fxfy 1 l x e Ψk V 1k X 2 l y uv X 2 1 x y r,u+r,v}u v y x fy x,z 1 x yx,z r,vx+r,vyx,z } R fx 2 l x l yx,z Φ Σ s, t ΦsΦt} s t z x 1+ k 2β α fx α z z x = o B 0 2 α+ ɛ 1+ α α+ ɛ, imsart-aos ver 2014/10/16 file: WKLAoSFialSupptex ate: Jauary 28, 2018
X 1 X l x EFFICIENT ENTROPY ESTIMATION 27 where to obtai the fial error term, we have use the fact that B 0 2 α z z = V By 23 a 24 we have, for each ɛ > 0, vx vy Φ Σ s, t ΦsΦt fxfy 1 uv x y r,u+r,v} u v y x e Ψk V 1k = eψk 1k X 1 X 1 l y fy x,z 1 x yx,z r,vx+r,vyx,z } R fx 2 α z z x l x l yx,z log 1/2 fx x + O max k 1/2, k β/ α 1+β/, k 43 = eψk log 1/2 1k + O max k 1/2, k β/ k1+ 2β α α+ ɛ, 1+β/ 1+ α α+ ɛ α+ ɛ 1+ α α+ ɛ } By applyig Lemma 10ii i the mai text as for 31 we have, for x X 1 a y B x r,vx + r,vy, that 44 max v v vfx 1 3k 1/2 log 1/2 afx fy x,v y} } k β/ = ok 1/2, fx with similar bous holig for l x a l y A correspoig lower bou of the same orer for the left-ha sie of 43 follows from 44 a the fact that 2 log 2 log 2 log 2 log Φ Σ s, t ΦsΦt} s t = α z + O 2 uiformly for z R It ow follows from 28, 41, 42 a 43 that for each ɛ > 0, log 5/2 W 4 = O max k 1/2 as require, k 3 2 + α ɛ α+ α ɛ 1+ α+, k3/2+2β/ 1+2β/, k1+ 2β α ɛ α+ α ɛ 1+ α+, k 1 2 + β } log, 1+ β We ow tur our attetio to the variace of the weighte Kozacheko Leoeko estimator Ĥw We first claim that 45 k Var w j log ξ j,1 = j=1 k w j w l Covlog ξ j,1, log ξ l,1 = V f + o1 j,l=1 imsart-aos ver 2014/10/16 file: WKLAoSFialSupptex ate: Jauary 28, 2018
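The overlap quantity $\alpha_z$ and the ball-intersection formula $\mu_d\bigl(B_x(r) \cap B_y(r)\bigr) = V_d r^d I_{\frac{d+1}{2}, \frac{1}{2}}\bigl(1 - \frac{\|x-y\|^2}{4r^2}\bigr)$ used in the bounds on $W_4$ can be checked numerically in low dimensions. The sketch below assumes $\alpha_z = V_d^{-1} \mu_d\bigl(B_0(1) \cap B_z(1)\bigr)$, which is how the quantity appears to be used here, and approximates the regularised incomplete beta function by a plain midpoint rule; this is a simple stand-in chosen for self-containedness, not a recommended numerical method. All function names are hypothetical.

```python
import math


def unit_ball_volume(d):
    """Volume V_d of the unit Euclidean ball in R^d."""
    return math.pi ** (d / 2) / math.gamma(1 + d / 2)


def reg_inc_beta(x, a, b, steps=20000):
    """Midpoint-rule approximation to I_{a,b}(x), the integral of the Beta(a, b) density over [0, x]."""
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    h = x / steps
    total = sum(((i + 0.5) * h) ** (a - 1) * (1.0 - (i + 0.5) * h) ** (b - 1)
                for i in range(steps))
    beta_const = math.gamma(a) * math.gamma(b) / math.gamma(a + b)
    return total * h / beta_const


def alpha(z_norm, d, steps=20000):
    """Fraction of the unit ball B_0(1) covered by B_z(1), where z_norm = ||z||."""
    if z_norm >= 2.0:
        return 0.0
    return reg_inc_beta(1.0 - z_norm ** 2 / 4.0, (d + 1) / 2, 0.5, steps)
```

In $d = 1$ this gives $\alpha_z = 1 - |z|/2$, whose integral over $[-2, 2]$ is $2 = V_1$; in $d = 2$ with $\|z\| = 1$, $V_2 \alpha_z$ matches the classical lens area $2\pi/3 - \sqrt{3}/2$.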
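By (18), (19) and Lemma 3 in the main text, for $j$ such that $w_j \neq 0$,
\[
\mathrm{Var} \log \xi_{(j),1} = V(f) + o(1)
\]
as $n \to \infty$. For $l > j$, using similar arguments to those used in the proof of Lemma 3 in the main text, and writing $u^{(k)}_{x,s} := u_{x,s} = V_d (n-1) h_x^{-1}(s)^d e^{-\Psi(k)}$ for clarity, we have
\[
E(\log \xi_{(j),1} \log \xi_{(l),1}) = \int_{\mathcal{X}} f(x) \int_0^1 \int_0^{1-s} \log(u^{(j)}_{x,s}) \log(u^{(l)}_{x,s+t}) B_{j,l-j,n-l}(s, t) \, dt \, ds \, dx
\]
\[
= \int_{\mathcal{X}} f(x) \int_0^1 \int_0^{1-s} \log\biggl( \frac{(n-1)s}{f(x) e^{\Psi(j)}} \biggr) \log\biggl( \frac{(n-1)(s+t)}{f(x) e^{\Psi(l)}} \biggr) B_{j,l-j,n-l}(s, t) \, dt \, ds \, dx + o(1)
\]
\[
= \int_{\mathcal{X}} f(x) \log^2 f(x) \, dx + o(1)
\]
as $n \to \infty$, uniformly for $1 \le j < l \le k_1^*$. Now (45) follows on noting that $\sup_{k \ge k_d} \|w^{(k)}\| < \infty$.

Next we claim that
\[
\mathrm{Cov}\biggl( \sum_{j=1}^k w_j \log \xi_{(j),1}, \ \sum_{l=1}^k w_l \log \xi_{(l),2} \biggr) = o(n^{-1}) \qquad (46)
\]
as $n \to \infty$. In view of (20) in the main text and the fact that $\sup_{k \ge k_d} \|w^{(k)}\| < \infty$, it is sufficient to show that
\[
\mathrm{Cov}\bigl( \log(f(X_1) \xi_{(j),1}), \ \log(f(X_2) \xi_{(l),2}) \bigr) = o(n^{-1})
\]
as $n \to \infty$, whenever $w_j, w_l \neq 0$. We suppose without loss of generality here that $j < l$, since the $j = l$ case is dealt with in (27). We broadly follow the same approach used to bound $W_1, \ldots, W_4$, though we require some new (similar) notation. Let $F'_{n,x,y}$ denote the conditional distribution function of $(\xi_{(j),1}, \xi_{(l),2})$ given $X_1 = x$, $X_2 = y$ and let $F^{(j)}_{n,x}$ denote the conditional distribution function of $\xi_{(j),1}$ given $X_1 = x$. Let
\[
r^{(j)}_{n,u} := \biggl\{ \frac{u e^{\Psi(j)}}{V_d (n-1)} \biggr\}^{1/d}, \qquad p^{(j)}_{n,x,u} := h_x(r^{(j)}_{n,u}).
\]
Recall the definitions of $a^{\pm}_{n,j}$ given in the proof of Lemma 3, and let
\[
v_{x,j} := \inf\{u \ge 0 : (n-1) p^{(j)}_{n,x,u} = a^+_{n,j}\} \quad \text{and} \quad l_{x,j} := \inf\{u \ge 0 : (n-1) p^{(j)}_{n,x,u} = a^-_{n,j}\}.
\]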
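To make the object of this part of the argument concrete, the weighted estimator can be sketched as follows. This is an illustration only, assuming the weighted estimator has the form $\hat{H}_n^w = n^{-1} \sum_{i=1}^n \sum_{j=1}^k w_j \log \xi_{(j),i}$ with $\xi_{(j),i} = e^{-\Psi(j)} V_d (n-1) \rho_{(j),i}^d$, which is consistent with the sums $\sum_j w_j \log \xi_{(j),1}$ appearing in (45) and (46). The function name `weighted_kl_entropy` and its helpers are hypothetical, and no attempt is made to construct weights satisfying the constraints of the main text.

```python
import math

EULER_GAMMA = 0.5772156649015329


def digamma_int(j):
    """Digamma function Psi(j) for a positive integer j."""
    return -EULER_GAMMA + sum(1.0 / i for i in range(1, j))


def weighted_kl_entropy(points, weights):
    """Weighted nearest-neighbour entropy estimate; weights[j-1] multiplies log xi_(j),i.

    points: list of distinct, equal-length tuples of floats; len(weights) = k < len(points).
    """
    n, d = len(points), len(points[0])
    log_vol = math.log(math.pi ** (d / 2) / math.gamma(1 + d / 2))  # log V_d
    total = 0.0
    for i, x in enumerate(points):
        dists = sorted(math.dist(x, y) for m, y in enumerate(points) if m != i)
        for j, w in enumerate(weights, start=1):
            if w != 0.0:  # only the support of w contributes
                log_xi = (-digamma_int(j) + log_vol + math.log(n - 1)
                          + d * math.log(dists[j - 1]))
                total += w * log_xi
    return total / n
```

Taking `weights = [0.0, ..., 0.0, 1.0]` of length $k$ recovers the unweighted estimator based on the $k$th nearest neighbour, which is a convenient consistency check.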
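For pairs $(u, v)$ with $u \leq v_{x,j}$ and $v \leq v_{y,l}$, let $(M_1, M_2, M_3) \sim \mathrm{Multi}\bigl(n-2; p^{(j)}_{n,x,u}, p^{(l)}_{n,y,v}, 1 - p^{(j)}_{n,x,u} - p^{(l)}_{n,y,v}\bigr)$ and write
\[
G_{n,x,y}(u,v) := P(M_1 \geq j, M_2 \geq l).
\]
Also write
\[
\Sigma := \begin{pmatrix} 1 & (j/l)^{1/2}\alpha_z \\ (j/l)^{1/2}\alpha_z & 1 \end{pmatrix},
\]
where $\alpha_z := V_d^{-1}\mu_d\bigl(B_0(1) \cap B_z(\exp\{(\Psi(l) - \Psi(j))/d\})\bigr)$. Writing $W_i'$ for remainder terms to be bounded later, we have
\begin{align*}
\mathrm{Cov}\bigl(\log(f(X_1)\xi_{(j),1}), \log(f(X_2)\xi_{(l),2})\bigr)
&= \int_{\mathcal{X}}\int_{\mathcal{X}} f(x)f(y) \int_{[l_{y,l},v_{y,l}]}\int_{[l_{x,j},v_{x,j}]} h(u,v) \, d\bigl(F_{n,x,y} - F^{(j)}_{n,x}F^{(l)}_{n,y}\bigr)(u,v) \, dx \, dy + W_1' \\
&= \int_{\mathcal{X}}\int_{\mathcal{X}} f(x)f(y) \int_{[l_{y,l},v_{y,l}]}\int_{[l_{x,j},v_{x,j}]} h(u,v) \, d\bigl(F_{n,x,y} - G_{n,x,y}\bigr)(u,v) \, dx \, dy - \frac{1}{n} + \sum_{i=1}^2 W_i' \\
&= \int_{\mathcal{X}}\int_{\mathcal{X}} f(x)f(y) \int_{l_{y,l}}^{v_{y,l}}\int_{l_{x,j}}^{v_{x,j}} \frac{(F_{n,x,y} - G_{n,x,y})(u,v)}{uv} \, du \, dv \, dx \, dy - \frac{1}{n} + \sum_{i=1}^3 W_i' \\
&= \frac{e^{\Psi(j)}}{V_d(n-1)(jl)^{1/2}} \int_{\mathbb{R}^d}\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} \{\Phi_\Sigma(s,t) - \Phi(s)\Phi(t)\} \, ds \, dt \, dz - \frac{1}{n} + \sum_{i=1}^4 W_i' \\
&= \frac{e^{\Psi(j)}}{V_d(n-1)l} \int_{\mathbb{R}^d} \alpha_z \, dz - \frac{1}{n} + \sum_{i=1}^4 W_i' = O\Bigl(\frac{1}{nk}\Bigr) + \sum_{i=1}^4 W_i' \tag{47}
\end{align*}
as $n \to \infty$. The final equality here follows from the fact that, for Borel measurable sets $K, L \subseteq \mathbb{R}^d$,
\[
\int_{\mathbb{R}^d} \mu_d\bigl((K + z) \cap L\bigr) \, dz = \mu_d(K)\mu_d(L), \tag{48}
\]
so that $\int_{\mathbb{R}^d} \alpha_z \, dz = V_d e^{\Psi(l) - \Psi(j)}$.

To bound $W_1'$: Very similar arguments to those used to bound $W_1$ show that $W_1' = o(n^{-(9/2 - \epsilon)})$ as $n \to \infty$, for every $\epsilon > 0$.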
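Identity (48) and its consequence $\int_{\mathbb{R}^d} \alpha_z\,dz = V_d e^{\Psi(l)-\Psi(j)}$ are easy to check numerically in dimension $d = 1$, where balls are intervals and $V_1 = 2$. The following sketch is illustrative only; the function names, the choice $j = 2$, $l = 5$ and the midpoint rule are assumptions made for the example.

```python
import math


def overlap_length(lo1, hi1, lo2, hi2):
    """Lebesgue measure of the intersection of [lo1, hi1] and [lo2, hi2]."""
    return max(0.0, min(hi1, hi2) - max(lo1, lo2))


def integral_of_overlap(K, L, grid=200_000):
    """Midpoint rule for the integral over z of mu_1((K + z) intersect L)."""
    (k_lo, k_hi), (l_lo, l_hi) = K, L
    z_min = l_lo - k_hi  # for smaller z the translate K + z lies left of L
    z_max = l_hi - k_lo  # for larger z the translate K + z lies right of L
    h = (z_max - z_min) / grid
    total = 0.0
    for i in range(grid):
        z = z_min + (i + 0.5) * h
        total += overlap_length(k_lo + z, k_hi + z, l_lo, l_hi)
    return total * h


def digamma_difference(j, l):
    """Psi(l) - Psi(j) for integers 1 <= j <= l, equal to sum_{i=j}^{l-1} 1/i."""
    return sum(1.0 / i for i in range(j, l))


if __name__ == "__main__":
    j, l = 2, 5
    r = math.exp(digamma_difference(j, l))        # radius of B_z(.) when d = 1
    lhs = integral_of_overlap((-r, r), (-1.0, 1.0)) / 2.0  # integral of alpha_z
    print(lhs, 2.0 * r)                            # V_1 * exp(Psi(l) - Psi(j))
```

Here $K = B_0(r)$, so that $K + z = B_z(r)$, and $L = B_0(1)$; dividing by $V_1 = 2$ turns the overlap measure into $\alpha_z$.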
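To bound $W_2'$: Similar to our work used to bound $W_2$, we may show that
\[
\int_{a^-_{n,j}/(n-1)}^{a^+_{n,j}/(n-1)}\int_{a^-_{n,l}/(n-1)}^{a^+_{n,l}/(n-1)} \bigl|\mathrm{B}_{j+a,l+b,n-j-l-1}(s,t) - \mathrm{B}_{j+a,n-j}(s)\mathrm{B}_{l+b,n-l}(t)\bigr| \, dt \, ds \leq \frac{(jl)^{1/2}}{n}\{1 + o(1)\}
\]
as $n \to \infty$, for fixed $a, b > -1$. Also,
\[
\int_0^1\int_0^{1-s} \log\biggl(\frac{(n-1)s}{e^{\Psi(j)}}\biggr) \log\biggl(\frac{(n-1)t}{e^{\Psi(l)}}\biggr) \bigl\{\mathrm{B}_{j,l,n-j-l-1}(s,t) - \mathrm{B}_{j,n-j}(s)\mathrm{B}_{l,n-l}(t)\bigr\} \, dt \, ds = -\frac{1}{n} + O(n^{-2})
\]
as $n \to \infty$. Using these facts and very similar arguments to those used to bound $W_2$ we have for every $\epsilon > 0$ that
\[
W_2' = O\biggl(\frac{k^{1/2}}{n}\max\biggl\{\frac{k^{\beta/d}}{n^{\beta/d}}, \frac{k^{\frac{\alpha}{\alpha+d}-\epsilon}}{n^{\frac{\alpha}{\alpha+d}-\epsilon}}\biggr\}\biggr).
\]

To bound $W_3'$: Similarly to (18) and the surrounding work, we can show that for every $\epsilon > 0$,
\[
W_3' = O\biggl(\max\biggl\{\frac{\log n}{n k^{1/2}}, \frac{k^{\frac{1}{2}+\frac{2\beta}{d}}}{n^{1+\frac{2\beta}{d}}}, \frac{k^{\frac{2\alpha}{\alpha+d}-\epsilon}}{n^{\frac{2\alpha}{\alpha+d}-\epsilon}}\biggr\}\biggr).
\]

To bound $W_4'$: Let $(N_1, N_2, N_3, N_4) \sim \mathrm{Multi}\bigl(n-2; p^{(j)}_{n,x,u} - p_\cap, p^{(l)}_{n,y,v} - p_\cap, p_\cap, 1 - p^{(j)}_{n,x,u} - p^{(l)}_{n,y,v} + p_\cap\bigr)$, where $p_\cap := \int_{B_x(r^{(j)}_{n,u}) \cap B_y(r^{(l)}_{n,v})} f(w) \, dw$. Further, let
\[
F_{n,1,x,y}(u,v) := P(N_1 + N_3 \geq j, N_2 + N_3 \geq l).
\]
Then, as in (28), we have
\begin{align*}
&\int_{\mathcal{X}}\int_{\mathcal{X}} f(x)f(y) \int_{l_{x,j}}^{v_{x,j}}\int_{l_{y,l}}^{v_{y,l}} \frac{(F_{n,x,y} - G_{n,x,y})(u,v)}{uv} \, dv \, du \, dx \, dy \\
&\quad = \int_{\mathcal{X}}\int_{\mathcal{X}} f(x)f(y) \int_{l_{x,j}}^{v_{x,j}}\int_{l_{y,l}}^{v_{y,l}} \frac{(F_{n,1,x,y} - G_{n,x,y})(u,v)}{uv} \, dv \, du \, dx \, dy \\
&\qquad + O\biggl(\max\biggl\{\frac{\log n}{n k^{1/2}}, \frac{k^{\frac{1}{2}+\frac{2\beta}{d}}}{n^{1+\frac{2\beta}{d}}}, \frac{k^{\frac{1}{2}+\frac{\alpha}{\alpha+d}-\epsilon}}{n^{1+\frac{\alpha}{\alpha+d}-\epsilon}}\biggr\}\biggr).
\end{align*}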
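We can now approximate $F_{n,1,x,y}(u,v)$ by $\Phi_\Sigma\bigl(j^{1/2}\{uf(x) - 1\}, l^{1/2}\{vf(x) - 1\}\bigr)$ and $G_{n,x,y}(u,v)$ by $\Phi\bigl(j^{1/2}\{uf(x) - 1\}\bigr)\Phi\bigl(l^{1/2}\{vf(x) - 1\}\bigr)$. This is rather similar to the corresponding approximation in the bounds on $W_4$, so we only present the main differences. First, let
\[
Y_i := \begin{pmatrix} \mathbb{1}_{\{X_i \in B_x(r^{(j)}_{n,u})\}} \\ \mathbb{1}_{\{X_i \in B_y(r^{(l)}_{n,v})\}} \end{pmatrix}.
\]
We also define
\[
\mu := E(Y_i) = \begin{pmatrix} p^{(j)}_{n,x,u} \\ p^{(l)}_{n,y,v} \end{pmatrix}
\quad \text{and} \quad
V := \mathrm{Cov}(Y_i) = \begin{pmatrix} p^{(j)}_{n,x,u}(1 - p^{(j)}_{n,x,u}) & p_\cap - p^{(j)}_{n,x,u}p^{(l)}_{n,y,v} \\ p_\cap - p^{(j)}_{n,x,u}p^{(l)}_{n,y,v} & p^{(l)}_{n,y,v}(1 - p^{(l)}_{n,y,v}) \end{pmatrix},
\]
and set $Z_i := V^{-1/2}(Y_i - \mu)$. Our aim is to provide a bound on $p_\cap$. Since the function
\[
(r, s) \mapsto \mu_d\bigl(B_0(r^{1/d}) \cap B_z(s^{1/d})\bigr)
\]
is Lipschitz we have for $x \in \mathcal{X}_n$, $y = x + f(x)^{-1/d} r^{(j)}_{n,1} z \in B_x\bigl(r^{(j)}_{n,v_{x,j}} + r^{(l)}_{n,v_{y,l}}\bigr)$, $u \in [l_{x,j}, v_{x,j}]$ and $v \in [l_{y,l}, v_{y,l}]$ that
\[
\biggl|\frac{(n-2)p_\cap}{e^{\Psi(j)}} - \alpha_z\biggr| \lesssim a\bigl(f(x) \wedge f(y)\bigr)\biggl\{\frac{k}{n f(x)}\biggr\}^{\beta/d} + \frac{\log^{1/2} n}{k^{1/2}}, \tag{49}
\]
using similar equations to (23), (24) and (31). From this and similar bounds to (32), we find that $|V| \gtrsim k^2/n^2$ and $\|V^{-1/2}\| \lesssim n^{1/2}/k^{1/2}$. We therefore have
\[
E\|Z_3\|^3 \leq \|V^{-1/2}\|^3 \, E\|Y_3 - \mu\|^3 \lesssim n^{1/2}/k^{1/2},
\]
which is as in the $l = j$ case except with the factor of $\|z\|^{-1/2}$ missing. Note now that
\[
\limsup_{n \to \infty} \sup_{j,l : j < l,\ w_j, w_l \neq 0} \ \sup_{z \in B_0(1 + e^{(\Psi(l) - \Psi(j))/d})} \|\Sigma^{-1/2}\| < \infty.
\]
Hence, using (49), similar bounds to (32) and the same arguments as leading up to (39),
\[
\sup_{C \in \mathcal{C}} |\Phi_A(C) - \Phi_B(C)| \lesssim a\bigl(f(x) \wedge f(y)\bigr)\biggl\{\frac{k}{n f(x)}\biggr\}^{\beta/d} + \frac{\log^{1/2} n}{k^{1/2}}, \tag{50}
\]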
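where $B := \Sigma$ and
\[
A := (n-2)\begin{pmatrix} j^{-1} p^{(j)}_{n,x,u}(1 - p^{(j)}_{n,x,u}) & j^{-1/2}l^{-1/2}\bigl(p_\cap - p^{(j)}_{n,x,u}p^{(l)}_{n,y,v}\bigr) \\ j^{-1/2}l^{-1/2}\bigl(p_\cap - p^{(j)}_{n,x,u}p^{(l)}_{n,y,v}\bigr) & l^{-1} p^{(l)}_{n,y,v}(1 - p^{(l)}_{n,y,v}) \end{pmatrix}.
\]
Now let $u := f(x)^{-1}(1 + j^{-1/2}s)$ and $v := f(x)^{-1}(1 + l^{-1/2}t)$. Similarly to (40), we have
\[
\biggl|\Phi_\Sigma\biggl(\frac{(n-2)p^{(j)}_{n,x,u} - j}{j^{1/2}}, \frac{(n-2)p^{(l)}_{n,y,v} - l}{l^{1/2}}\biggr) - \Phi_\Sigma(s,t)\biggr| \lesssim k^{1/2} a\bigl(f(x) \wedge f(y)\bigr)\biggl\{\frac{k}{n f(x)}\biggr\}^{\beta/d} + k^{-1/2}.
\]
Similarly to the arguments leading up to (41), it follows that
\begin{align*}
&\int_{\mathcal{X}}\int_{\mathcal{X}} f(x)f(y) \int_{l_{x,j}}^{v_{x,j}}\int_{l_{y,l}}^{v_{y,l}} \frac{|F_{n,1,x,y}(u,v) - \Phi_\Sigma(s,t)|}{uv} \, \mathbb{1}_{\{\|x-y\| \leq r^{(j)}_{n,u} + r^{(l)}_{n,v}\}} \, dv \, du \, dy \, dx \\
&\quad = O\biggl(\max\biggl\{\frac{\log^{3/2} n}{n k^{1/2}}, \frac{k^{\frac{1}{2}+\frac{\alpha}{\alpha+d}-\epsilon}}{n^{1+\frac{\alpha}{\alpha+d}-\epsilon}}, \frac{k^{1/2+\beta/d}\log n}{n^{1+\beta/d}}, \frac{k^{1/2+2\beta/d}}{n^{1+2\beta/d}}\biggr\}\biggr),
\end{align*}
where the power on the first logarithmic factor is smaller because of the absence of the factor of the $\|z\|^{-1}$ term in (50). The remainder of the work required to bound $W_4'$ is very similar to the work done from (42) to (43), using also (48), so is omitted. We conclude that
\[
W_4' = O\biggl(\max\biggl\{\frac{\log^{3/2} n}{n k^{1/2}}, \frac{k^{\frac{3}{2}+\frac{\alpha-\epsilon}{\alpha+d}}}{n^{1+\frac{\alpha-\epsilon}{\alpha+d}}}, \frac{k^{\frac{3}{2}+\frac{2\beta}{d}}}{n^{1+\frac{2\beta}{d}}}, \frac{k^{(1+\frac{d}{2\beta})\frac{\alpha-\epsilon}{\alpha+d}}}{n^{1+\frac{\alpha-\epsilon}{\alpha+d}}}, \frac{k^{\frac{1}{2}+\frac{\beta}{d}}\log n}{n^{1+\frac{\beta}{d}}}\biggr\}\biggr).
\]
The equation (47), together with the bounds on $W_1', \ldots, W_4'$ just proved, establishes the claim (46). We finally conclude from (45) and (46) that
\[
\mathrm{Var}\,\hat{H}_n^w = \frac{1}{n}\mathrm{Var}\biggl(\sum_{j=1}^k w_j \log \xi_{(j),1}\biggr) + \Bigl(1 - \frac{1}{n}\Bigr)\mathrm{Cov}\biggl(\sum_{j=1}^k w_j \log \xi_{(j),1}, \sum_{l=1}^k w_l \log \xi_{(l),2}\biggr) = \frac{V(f)}{n} + o\Bigl(\frac{1}{n}\Bigr),
\]
as required.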
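For orientation, the estimator whose variance has just been computed can be sketched in a few lines. This is a minimal brute-force illustration, assuming the form $\hat{H}_n^w = n^{-1}\sum_{i=1}^n\sum_{j=1}^k w_j \log \xi_{(j),i}$ with $\xi_{(j),i} = e^{-\Psi(j)}V_d(n-1)\rho_{(j),i}^d$, where $\rho_{(j),i}$ is the distance from $X_i$ to its $j$th nearest neighbour. The function names are hypothetical, and the weights in the usage example (all mass on $j = 3$) are chosen for simplicity; they are not the bias-cancelling weights of the main text.

```python
import heapq
import math
import random

EULER_GAMMA = 0.5772156649015329


def digamma_int(j):
    """Psi(j) for a positive integer j."""
    return -EULER_GAMMA + sum(1.0 / i for i in range(1, j))


def unit_ball_volume(d):
    """V_d, the volume of the unit Euclidean ball in R^d."""
    return math.pi ** (d / 2) / math.gamma(1 + d / 2)


def weighted_kl_entropy(points, weights):
    """Weighted Kozachenko-Leonenko estimate from a list of d-dimensional tuples.

    weights[j-1] is w_j; the weights are assumed to sum to one.
    """
    n, d, k = len(points), len(points[0]), len(weights)
    log_const = math.log(unit_ball_volume(d)) + math.log(n - 1)
    total = 0.0
    for i, x in enumerate(points):
        # Distances from x to its k nearest neighbours, in increasing order.
        rho = heapq.nsmallest(
            k, (math.dist(x, y) for m, y in enumerate(points) if m != i))
        for j, w in enumerate(weights, start=1):
            if w != 0.0:
                # log xi_{(j),i} = log V_d + log(n-1) + d log rho_{(j),i} - Psi(j)
                total += w * (log_const + d * math.log(rho[j - 1]) - digamma_int(j))
    return total / n


if __name__ == "__main__":
    random.seed(1)
    sample = [(random.gauss(0.0, 1.0),) for _ in range(400)]
    print(weighted_kl_entropy(sample, [0.0, 0.0, 1.0]))
    print(0.5 * math.log(2 * math.pi * math.e))  # entropy of N(0, 1)
```

The quadratic-time neighbour search keeps the sketch dependency-free; a k-d tree would be the natural replacement for larger samples.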
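A. Proof of Theorem 8

Proof of Theorem 8. For the first part of the theorem we aim to apply Theorem 25.21 of van der Vaart (1998), and follow the notation used there. With $\dot{\mathcal{P}} := \{\lambda(\log f + H(f)) : \lambda \in \mathbb{R}\}$ we will first show that the entropy functional $H$ is differentiable at $f$ relative to the tangent set $\dot{\mathcal{P}}$, with efficient influence function $\psi_f = -\log f - H(f)$. Following Example 25.16 in van der Vaart (1998), for $g \in \dot{\mathcal{P}}$, the paths $\{f_{t,g}\}$ defined in (10) of the main text are differentiable in quadratic mean at $t = 0$ with score function $g$. Note that $\int_{\mathcal{X}} g f = 0$ and $\int_{\mathcal{X}} g^2 f < \infty$ for all $g \in \dot{\mathcal{P}}$. It is convenient to define, for $t \geq 0$, the set $A_t := \{x \in \mathcal{X} : 8t|g(x)| \leq 1\}$, on which we may expand $e^{-2tg}$ easily as a Taylor series. By Hölder's inequality, for $\epsilon \in (0, 1/2)$,
\[
\int_{A_t^c} f|\log f| \leq (8t)^{2(1-\epsilon)}\int_{\mathcal{X}} f|g|^{2(1-\epsilon)}|\log f| \leq (8t)^{2(1-\epsilon)}\biggl\{\int_{\mathcal{X}} g^2 f\biggr\}^{1-\epsilon}\biggl\{\int_{\mathcal{X}} f|\log f|^{1/\epsilon}\biggr\}^{\epsilon} = o(t)
\]
as $t \searrow 0$. Moreover,
\[
\int_{A_t^c} f\log(1 + e^{-2tg}) \leq \int_{A_t^c} (\log 2 + 2t|g|) f \leq 16t^2(4\log 2 + 1)\int_{\mathcal{X}} g^2 f.
\]
We also have that
\begin{align*}
|c(t)^{-1} - 1| &= \biggl|\int_{\mathcal{X}} \Bigl(\frac{2}{1 + e^{-2tg}} - 1 - tg\Bigr) f\biggr| \\
&\leq \int_{A_t} \frac{|e^{-2tg} - 1 + 2tg + tg(e^{-2tg} - 1)|}{1 + e^{-2tg}} f + \int_{A_t^c} (1 + t|g|) f \\
&\leq \frac{16}{3}t^2\int_{A_t} g^2 f + 72t^2\int_{A_t^c} g^2 f \leq 72t^2\int_{\mathcal{X}} g^2 f. \tag{51}
\end{align*}
It follows that
\begin{align*}
&\biggl|t^{-1}\{H(f_{t,g}) - H(f)\} + \int_{\mathcal{X}} \{\log f + H(f)\} f g\biggr| \\
&\quad = \biggl|\frac{1}{t}\int_{\mathcal{X}} \biggl[\Bigl\{1 - \frac{2c(t)}{1 + e^{-2tg}}\Bigr\}\log f - \frac{2c(t)}{1 + e^{-2tg}}\log\Bigl(\frac{2c(t)}{1 + e^{-2tg}}\Bigr) + tg(1 + \log f)\biggr] f\biggr| \\
&\quad \leq \frac{1}{t}\int_{A_t} \frac{\bigl|\{e^{-2tg} - 1 + 2tg + tg(e^{-2tg} - 1)\}\log f - 2\log\bigl(\frac{2}{1 + e^{-2tg}}\bigr) + tg(1 + e^{-2tg})\bigr|}{1 + e^{-2tg}} f + o(1) \\
&\quad \leq \frac{16}{3}t\int_{\mathcal{X}} g^2 f|\log f| + 22t\int_{\mathcal{X}} g^2 f + o(1) \to 0.
\end{align*}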
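The limit just displayed says that $t \mapsto H(f_{t,g})$ has derivative $-\int_{\mathcal{X}} \{\log f + H(f)\} f g$ at $t = 0$, which equals $-\lambda V(f)$ when $g = \lambda(\log f + H(f))$. A small numerical illustration follows, under assumptions made only for this example: $f$ is the standard normal density (so $V(f) = 1/2$), the path is taken to be $f_{t,g} = 2c(t) f/(1 + e^{-2tg})$, integrals are approximated by a midpoint rule on $[-12, 12]$, and the derivative by a central difference. All function names are hypothetical.

```python
import math

LOW, HIGH, GRID = -12.0, 12.0, 6000
STEP = (HIGH - LOW) / GRID
NODES = [LOW + (i + 0.5) * STEP for i in range(GRID)]
H_NORMAL = 0.5 * math.log(2 * math.pi * math.e)  # entropy of N(0, 1)


def normal_pdf(x):
    return math.exp(-0.5 * x * x) / math.sqrt(2 * math.pi)


def score(x, lam):
    """g(x) = lam * (log f(x) + H(f)) for the standard normal density f."""
    return lam * (math.log(normal_pdf(x)) + H_NORMAL)


def path_entropy(t, lam):
    """Entropy of f_{t,g}, normalised numerically on the grid."""
    unnorm = [normal_pdf(x) / (1.0 + math.exp(-2.0 * t * score(x, lam)))
              for x in NODES]
    mass = sum(unnorm) * STEP  # plays the role of 1 / (2 c(t))
    dens = [u / mass for u in unnorm]
    return -sum(p * math.log(p) for p in dens if p > 0.0) * STEP


def entropy_slope(lam, t=0.01):
    """Central-difference approximation to d/dt H(f_{t,g}) at t = 0."""
    return (path_entropy(t, lam) - path_entropy(-t, lam)) / (2.0 * t)


if __name__ == "__main__":
    print(entropy_slope(1.0), -0.5)  # numerical slope versus -lam * V(f)
```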
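The conclusion (11) in the main text therefore follows from van der Vaart (1998, Theorem 25.21).

We now establish the second part of the theorem. First, by our previous bound on $c(t)$ in (51), for $12t < \{\int_{\mathcal{X}} g^2 f\}^{-1/2}$ we have that
\[
\|f_{t,g}\|_\infty \leq 2c(t)\|f\|_\infty \leq \frac{2\|f\|_\infty}{1 - 72t^2\int_{\mathcal{X}} g^2 f} \leq 4\|f\|_\infty,
\]
and $\mu_\alpha(f_{t,g}) \leq 4\mu_\alpha(f)$. We now study the smoothness properties of $f_{t,g}$. This requires some involved calculations, because we first need to understand corresponding properties of $g$. To this end, for an $m$ times differentiable function $g : \mathbb{R}^d \to \mathbb{R}$, define
\[
M_g(x) := \max\biggl\{\max_{t=1,\ldots,m} \|g^{(t)}(x)\|, \sup_{y \in B_x^\circ(r_a(x))} \frac{\|g^{(m)}(y) - g^{(m)}(x)\|}{\|y - x\|^{\beta - m}}\biggr\}
\quad \text{and} \quad
D_g := \max\biggl\{1, \sup_{\delta \in (0, \|f\|_\infty)} \sup_{x : f(x) \geq \delta} \frac{M_g(x)}{a(\delta)^{m+1}}\biggr\}.
\]
Let $\mathcal{J}_m$ denote the set of multisets of elements of $\{1, \ldots, d\}$ of cardinality at most $m$, and for $J = \{j_1, \ldots, j_s\} \in \mathcal{J}_m$, define $g_J(x) := \frac{\partial^s g}{\prod_{l=1}^s \partial x_{j_l}}(x)$. Moreover, for $i \in \{1, \ldots, s\}$, let $\mathcal{P}_i(J)$ denote the set of partitions of $J$ into $i$ non-empty multisets. As an illustration, if $d = 2$, then
\[
\mathcal{J}_3 = \bigl\{\emptyset, \{1\}, \{2\}, \{1,1\}, \{1,2\}, \{2,1\}, \{2,2\}, \{1,1,1\}, \{1,1,2\}, \{1,2,1\}, \{1,2,2\}, \{2,1,1\}, \{2,1,2\}, \{2,2,1\}, \{2,2,2\}\bigr\}.
\]
Moreover, if $J = \{1, 1, 2\} \in \mathcal{J}_3$, then
\[
\mathcal{P}_2(J) = \Bigl\{\bigl\{\{1,1\}, \{2\}\bigr\}, \bigl\{\{1,2\}, \{1\}\bigr\}, \bigl\{\{1,2\}, \{1\}\bigr\}\Bigr\}.
\]
Then, by induction, and writing $g^* := g_1 = \log f + H(f)$, it may be shown that
\[
g^*_J(x) = \sum_{i=1}^{\mathrm{card}(J)} \frac{(-1)^{i-1}(i-1)!}{f^i} \sum_{\{P_1, \ldots, P_i\} \in \mathcal{P}_i(J)} f_{P_1} \cdots f_{P_i}.
\]
Now, the cardinality of $\mathcal{P}_i(J)$ is given by a Stirling number of the second kind:
\[
\mathrm{card}\,\mathcal{P}_i(J) = \frac{1}{i!}\sum_{l=0}^i (-1)^{i-l}\binom{i}{l} l^{\mathrm{card}(J)} =: S(\mathrm{card}(J), i),
\]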
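say. Thus, if $\mathrm{card}(J) \leq m$, then
\[
|g^*_J(x)| \leq \sum_{i=1}^{\mathrm{card}(J)} (i-1)!\, S(\mathrm{card}(J), i)\, a(f(x))^i \leq \frac{1}{2} m^{m+1} m!\, a(f(x))^m. \tag{52}
\]
Moreover, if $\|y - x\| \leq r_a(x)$ and $m \geq 1$, then
\[
|g^*_J(y) - g^*_J(x)| \leq \sum_{i=1}^{\mathrm{card}(J)} (i-1)! \sum_{\{P_1, \ldots, P_i\} \in \mathcal{P}_i(J)} \biggl\{\frac{|f_{P_1} \cdots f_{P_i}(y) - f_{P_1} \cdots f_{P_i}(x)|}{f^i(y)} + \frac{|f_{P_1} \cdots f_{P_i}(x)|}{f^i(y)}\biggl|\frac{f^i(y)}{f^i(x)} - 1\biggr|\biggr\}.
\]
Now, by Lemma 2,
\[
\biggl|\frac{f^i(y)}{f^i(x)} - 1\biggr| \leq i\biggl|\frac{f(y)}{f(x)} - 1\biggr|\biggl(1 + \biggl|\frac{f(y)}{f(x)} - 1\biggr|\biggr)^{i-1} \leq i\Bigl(\frac{71}{56}\Bigr)^{i-1}\biggl|\frac{f(y)}{f(x)} - 1\biggr|.
\]
Moreover, by induction and Lemma 2 again,
\[
|f_{P_1} \cdots f_{P_i}(y) - f_{P_1} \cdots f_{P_i}(x)| \leq 8d^{1/2}\Bigl(\frac{71}{56}\Bigr)^{i-1} a(f(x))^i f^i(x)\|y - x\|^{\beta - m}.
\]
We deduce that, even when $m = 0$,
\[
|g^*_J(y) - g^*_J(x)| \leq 8d^{1/2}\Bigl(\frac{71}{41}\Bigr)^m m!\,(m+1)^{m+2} a(f(x))^{m+1}\|y - x\|^{\beta - m}. \tag{53}
\]
Comparing (52) and (53), we see that
\[
D_{g^*} \leq 8d^{1/2}\Bigl(\frac{71}{41}\Bigr)^m m!\,(m+1)^{m+2} =: D. \tag{54}
\]
Now let $q(y) := (1 + e^{-2ty})^{-1}$, so that $f_{t,g^*}(x) = 2c(t)\,q(g^*(x))\,f(x)$. Similar inductive arguments to those used above yield that when $J \in \mathcal{J}_m$ with $m \geq 1$ and $g$ is $m$ times differentiable,
\[
(q \circ g)_J(x) = \sum_{i=1}^{\mathrm{card}(J)} q^{(i)}(g(x)) \sum_{\{P_1, \ldots, P_i\} \in \mathcal{P}_i(J)} g_{P_1} \cdots g_{P_i}(x),
\]
and we now bound the derivatives of $q$. By induction,
\[
q^{(i)}(y) = (2t)^i \sum_{l=1}^i (-1)^{i-l} a^{(i)}_l \frac{e^{-2tly}}{(1 + e^{-2ty})^{l+1}},
\]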
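where for each $i \in \mathbb{N}$, we have $a^{(i)}_1 = 1$, $a^{(i)}_i = i!$ and $a^{(i)}_l = l\bigl(a^{(i-1)}_l + a^{(i-1)}_{l-1}\bigr)$ for $l \in \{2, \ldots, i-1\}$. Since $\max_{1 \leq l \leq i} a^{(i)}_l \leq (2i)^{i-1}$, again by induction, we deduce that
\[
(1 + e^{-2ty})|q^{(i)}(y)| \leq 2^{2i-1} i^i t^i. \tag{55}
\]
Writing $s := \mathrm{card}(J)$, it follows that
\begin{align*}
|(q \circ g)_J(x)| &\leq q(g(x)) \sum_{i=1}^s 2^{2i-1} i^i t^i S(s, i)\, a(f(x))^{i(m+1)} D_g^i \\
&\leq q(g(x))\, s^{s+1} 2^{2s-1} \max(1, t^s)\, B_s\, a(f(x))^{s(m+1)} D_g^s, \tag{56}
\end{align*}
where $B_s := \sum_{i=1}^s S(s, i)$ denotes the $s$th Bell number. We can now apply the multivariate Leibniz rule, so that for a multi-index $\omega = (\omega_1, \ldots, \omega_d)$ with $|\omega| \leq m$, and for $t \leq 1$ and $m \geq 1$,
\begin{align*}
\biggl|\frac{\partial^\omega f_{t,g^*}(x)}{\partial x^\omega}\biggr| &= 2c(t)\biggl|\sum_{\nu : \nu \leq \omega} \binom{\omega}{\nu} \frac{\partial^\nu (q \circ g^*)(x)}{\partial x^\nu} \frac{\partial^{\omega - \nu} f(x)}{\partial x^{\omega - \nu}}\biggr| \\
&\leq 2^{3m-1} m^{m+1} B_m D_{g^*}^m\, a(f(x))^{m^2 + m} f_{t,g^*}(x). \tag{57}
\end{align*}
Now, in order to control $\bigl|\frac{\partial^\omega f_{t,g^*}(y)}{\partial x^\omega} - \frac{\partial^\omega f_{t,g^*}(x)}{\partial x^\omega}\bigr|$, we first note that by (53) and (54), we have for $\|y - x\| \leq r_a(x)$, $i \in \mathbb{N}$, $J \in \mathcal{J}_m$ with $\mathrm{card}(J) = s$ and $\{P_1, \ldots, P_i\} \in \mathcal{P}_i(J)$,
\[
|g^*_{P_1} \cdots g^*_{P_i}(y) - g^*_{P_1} \cdots g^*_{P_i}(x)| \leq 2D^i a(f(x))^{i(m+1)}\|y - x\|^{\beta - m}. \tag{58}
\]
Thus, by (55), (58), the mean value theorem and Lemma 2, for $t \leq 1$, $\|y - x\| \leq r_a(x)$ and $m \geq 1$,
\begin{align*}
|(q \circ g^*)_J(y) - (q \circ g^*)_J(x)| &\leq \sum_{i=1}^s |q^{(i)}(g^*(x))| \sum_{\{P_1, \ldots, P_i\} \in \mathcal{P}_i(J)} |g^*_{P_1} \cdots g^*_{P_i}(y) - g^*_{P_1} \cdots g^*_{P_i}(x)| \\
&\quad + \sum_{i=1}^s |q^{(i)}(g^*(y)) - q^{(i)}(g^*(x))| \sum_{\{P_1, \ldots, P_i\} \in \mathcal{P}_i(J)} |g^*_{P_1} \cdots g^*_{P_i}(y)| \\
&\leq B_m 2^{3m+5} d^{1/2} (m+1)^{m+1} D^m\, q(g^*(x))\, a(f(x))^{m^2 + m + 1}\|y - x\|^{\beta - m}\, \frac{1 + e^{-2tg^*(x)}}{e^{-2tg^*(x)} + e^{-2t|g^*(y) - g^*(x)|}} \\
&\leq B_m 2^{3m+5} d^{1/2} (m+1)^{m+1} \Bigl(\frac{56}{41}\Bigr)^{2t} D^m\, q(g^*(x))\, a(f(x))^{m^2 + m + 1}\|y - x\|^{\beta - m}. \tag{59}
\end{align*}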
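The combinatorial quantities entering (52) to (56) are easy to tabulate, which gives a quick check on the constants. The sketch below is illustrative only and all function names are hypothetical. It computes $S(s, i)$ from the alternating-sum formula, confirms it against a brute-force enumeration of partitions of $s$ labelled positions, forms the Bell numbers $B_s$, and generates the coefficients $a^{(i)}_l$ from the recursion $a^{(i)}_l = l(a^{(i-1)}_l + a^{(i-1)}_{l-1})$, taking $a^{(i-1)}_l = 0$ outside $1 \leq l \leq i-1$ so that the boundary values $a^{(i)}_1 = 1$ and $a^{(i)}_i = i!$ are reproduced.

```python
import math
from math import comb, factorial


def stirling2(s, i):
    """S(s, i) = (1/i!) * sum_{l=0}^{i} (-1)^(i-l) * C(i, l) * l^s."""
    total = sum((-1) ** (i - l) * comb(i, l) * l ** s for l in range(i + 1))
    return total // factorial(i)


def set_partitions(items):
    """Yield every partition of a list of labelled positions into blocks."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for part in set_partitions(rest):
        for idx in range(len(part)):  # put `first` into an existing block
            yield part[:idx] + [[first] + part[idx]] + part[idx + 1:]
        yield [[first]] + part        # or open a new block


def count_partitions(s, i):
    """Number of partitions of s labelled positions into i non-empty blocks."""
    return sum(1 for p in set_partitions(list(range(s))) if len(p) == i)


def bell(s):
    """B_s = sum_{i=1}^{s} S(s, i)."""
    return sum(stirling2(s, i) for i in range(1, s + 1))


def q_coefficients(i):
    """[a_1^(i), ..., a_i^(i)] from a_l^(i) = l * (a_l^(i-1) + a_{l-1}^(i-1))."""
    a = [1]
    for r in range(2, i + 1):
        prev = [0] + a + [0]  # prev[l] = a_l^(r-1), zero outside 1..r-1
        a = [l * (prev[l] + prev[l - 1]) for l in range(1, r + 1)]
    return a


def q_derivative(i, y, t):
    """i-th derivative of q(y) = 1 / (1 + exp(-2 t y)) via the displayed sum."""
    e = math.exp(-2.0 * t * y)
    a = q_coefficients(i)
    return (2.0 * t) ** i * sum(
        (-1) ** (i - l) * a[l - 1] * e ** l / (1.0 + e) ** (l + 1)
        for l in range(1, i + 1))


if __name__ == "__main__":
    print([stirling2(3, i) for i in (1, 2, 3)], bell(3), q_coefficients(4))
```

With $s = 3$ and $i = 2$ the enumeration returns three partitions, matching the three elements of $\mathcal{P}_2(\{1, 1, 2\})$ listed earlier.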
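Here, to obtain the final inequality, we use the fact that $\frac{1+a}{b+a} \leq \frac{1}{b}$ for $a, b > 0$ and $b < 1$, and the fact that
\[
e^{2t|g^*(y) - g^*(x)|} = \max\biggl\{\frac{f(y)}{f(x)}, \frac{f(x)}{f(y)}\biggr\}^{2t} \leq \Bigl(\frac{56}{41}\Bigr)^{2t}
\]
for $\|y - x\| \leq r_a(x)$. Using the multivariate Leibniz rule again, together with (56), (59) and Lemma 2, for $t \leq 1$, $\|y - x\| \leq r_a(x)$ and $|\omega| = m \geq 1$,
\begin{align*}
\biggl|\frac{\partial^\omega f_{t,g^*}(y)}{\partial x^\omega} - \frac{\partial^\omega f_{t,g^*}(x)}{\partial x^\omega}\biggr|
&\leq 2c(t)\sum_{\nu : \nu \leq \omega} \binom{\omega}{\nu}\biggl\{\biggl|\frac{\partial^{\omega - \nu} f(y)}{\partial x^{\omega - \nu}}\biggr|\biggl|\frac{\partial^\nu (q \circ g^*)(y)}{\partial x^\nu} - \frac{\partial^\nu (q \circ g^*)(x)}{\partial x^\nu}\biggr| \\
&\qquad\qquad + \biggl|\frac{\partial^\nu (q \circ g^*)(x)}{\partial x^\nu}\biggr|\biggl|\frac{\partial^{\omega - \nu} f(y)}{\partial x^{\omega - \nu}} - \frac{\partial^{\omega - \nu} f(x)}{\partial x^{\omega - \nu}}\biggr|\biggr\} \\
&\leq 2^{4m+9} d^{1/2} B_m (m+1)^{m+1} D^m a(f(x))^{m^2 + m + 1} f_{t,g^*}(x)\|y - x\|^{\beta - m} \\
&=: C_m D^m a(f(x))^{m^2 + m + 1} f_{t,g^*}(x)\|y - x\|^{\beta - m}. \tag{60}
\end{align*}
This also holds in the case $m = 0$. Now note that if $12t < \{\int_{\mathcal{X}} (g^*)^2 f\}^{-1/2}$ we have
\[
f(x) = \frac{1 + e^{-2tg^*(x)}}{2c(t)} f_{t,g^*}(x) \geq \frac{f_{t,g^*}(x)}{4}.
\]
Finally, define the function
\[
\tilde{a}(\delta) := d^{m/2} C_m D^m a(\delta/4)^{m^2 + m + 1}. \tag{61}
\]
Then $\tilde{a} \in \mathcal{A}$ and from (57) and (60), we have $M_{f_{t,g^*}, \tilde{a}, \beta}(x) \leq \tilde{a}(f_{t,g^*}(x))$. We conclude that for $t < \min\bigl(1, \{144\int_{\mathcal{X}} (g^*)^2 f\}^{-1/2}\bigr)$, we have that $f_{t,g^*} \in \mathcal{F}_{d,\theta'}$, where $\theta' = (\alpha, \beta, 4\gamma, 4\nu, \tilde{a}) \in \Theta$. The result follows on noting that $f_{t,g_\lambda} = f_{t\lambda,g^*}$.

References

Berrett, T. B., Samworth, R. J. and Yuan, M. (2017). Efficient multivariate entropy estimation via k-nearest neighbour distances. Submitted.

Götze, F. (1991). On the rate of convergence in the multivariate CLT. Ann. Probab., 19, 724–739.

van der Vaart, A. W. (1998). Asymptotic Statistics. Cambridge University Press, Cambridge.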
Statistical Laboratory
Wilberforce Road
Cambridge CB3 0WB
United Kingdom
E-mail: r.samworth@statslab.cam.ac.uk
E-mail: t.berrett@statslab.cam.ac.uk
URL: http://www.statslab.cam.ac.uk/~rjs57
URL: http://www.statslab.cam.ac.uk/~tbb26

Department of Statistics
University of Wisconsin–Madison
Medical Sciences Center
1300 University Avenue
Madison, WI 53706
United States of America
E-mail: myuan@stat.wisc.edu
URL: http://pages.stat.wisc.edu/~myuan/