MINIMAX RATES OF CONVERGENCE AND OPTIMALITY OF BAYES FACTOR WAVELET REGRESSION ESTIMATORS UNDER POINTWISE RISKS

Size: px

Start display at page:

Download "MINIMAX RATES OF CONVERGENCE AND OPTIMALITY OF BAYES FACTOR WAVELET REGRESSION ESTIMATORS UNDER POINTWISE RISKS"

Julie Blankenship
5 years ago
Views:

1 Statistica Siica 19 (2009) : Supplemet S1 S17 1 MINIMAX RATES OF ONVERGENE AND OPTIMALITY OF BAYES FATOR WAVELET REGRESSION ESTIMATORS UNDER POINTWISE RISKS Natalia Bochkia ad Theofais Sapatias Uiversity of Ediburgh ad Uiversity of yprus Supplemetary Material This ote cotais detailed proofs for Theorems 1-6 ad Propositios 1-2. Proofs of most auxiliary statemets are also icluded for completeess. Throughout this ote we use a geeric positive costat which is ot ecessarily the same eve withi a sigle equatio. S1. No-adaptive ad adaptive miimax rates of covergece uder poitwise l u -risks (1 u < ) i the stadard oparametric regressio model. First we prove Theorem 1 (o-adaptive case). Proof of Theorem 1. Lower boud] R u ( f Bpq(A) r t 0 ) u(r 1/p) estimator f ad ay sequece B as ] lim sup u(r 1/p) B sup E f(t 0 ) f(t 0 ) u f Bpq r (A) as is equivalet to the followig: for ay =. (7.1) Suppose ow that (7.1) does ot hold for some estimator f ad some sequece B such that B ad / log (B ) as i.e. lim sup u(r 1/p) B Hece for ay f 0 B r pq(a ) A < A sup f B r pq(a) ] lim sup u(r 1/p) B E f(t 0 ) f 0 (t 0 ) u <. E f(t 0 ) f(t 0 ) u ] <. (7.2) Thus Theorem S1.2 yields that for B ad / log (B ) as lim if (/ log (B ) ) ] u(r 1/p) sup E f(t 0 ) f(t 0 ) u > 0 f Bpq(A) r

2 2 NATALIA BOHKINA AND THEOFANIS SAPATINAS ad hece lim sup u(r 1/p) ] sup E f(t 0 ) f(t 0 ) u = f Bpq(A) r which cotradicts the assumptio made i (7.2). This completes the proof for the lower boud of Theorem 1. Upper boud] The validity of the upper bouds follows from the followig theorem. (Note that Theorems S1.1 ad S1.3 are proved uder a slightly more geeral coditio o the error distributio.) Theorem S1.1. Assume (S1) (S2) ad (S3) 1 u < ad f B r pq(a) with 1 p q A > 0 ad 1/p < r < s. Let ϕ be the pdf of N(0 σ 2 /2) 0 < σ σ σ < ad let ˆf be a hard thresholdig wavelet estimator with threshold t : σ 1/2 for = L L t = σ 1/2 log 2 (2u)] 1/2 for = J 1 1 where 1 = log 2. The for ay t 0 (0 1) ( ) R( u ˆf Bpq(A) r t 0 ) = O u(r 1/p) as. Proof of Theorem S1.1. It is easily see that the risk is bouded by R( u ˆf Bpq(A) r t 0 ) 2 L/2 E(ˆθ k θ k ) u ] 1/u φ k K L 1 (t 0 ) k K L 1 (t 0 ) k K (t 0 ) Term Q 11 + Q 12 i (7.3) is bouded by k K (t 0 ) =J k K (t 0 ) 2 L/2 θ k θ k φ 2 /2 E(ˆθ k θ k ) u ] 1/u ψ 2 /2 θ k θ k ψ 2 /2 θ k ψ = Q 11 + Q 12 + Q 21 + Q 22 + Q 3 ] u. (7.3) 2 L/2 V(ˆθ k )] 1/2 + 2 L/2 θ k θ k 1/2 σ L 1 + r k K L 1 k K L 1 ( = O 1/2) ( + o (r 1/p)) ( ) = o (r 1/p) u

3 POINTWISE OPTIMALITY UNDER GENERAL LOSSES 3 due to Lemma A.4 i Bochkia ad Sapatias (2006) ad the fact that V(ˆθ k ) = O( 1 ). O the other had term Q 3 i (7.3) is bouded by =J k K (t 0 ) 2 /2 θ k = O =J 2 /2 2 (r 1/p+1/2) = O (2 J(r 1/p)) ( (r 1/p)) ( = o (r 1/p) due to Lemma A.4 i Bochkia ad Sapatias (2006) which also implies that term Q 22 i (7.3) is domiated by (r 1/p) (log ) I(p= ). Thus we have that R( u ˆf ] 1/u Bpq(A) r t 0 ) (r 1/p) + k K (t 0 ) ) 2 /2 E θ k ˆθ k u] 1/u. (7.4) osider separately the sums for low ad high resolutio levels usig Lemma A.3 i Bochkia ad Sapatias (2006) to boud the summad i the first sum ad Lemma S5.6 to boud the summad i the secod sum: R u ( ˆf B r pq(a) t 0 )] 1/u (r 1/p)/() /2 k K (t 0 ) 1 2 /2 t E d k θ k u I( d k > t )] 1/u. The first sum is easy to calculate sice by defiitio t is idepedet of for 1 ad it is (r 1/p) equal to. Sice for > 1 t by Lemma A.4 i Bochkia ad Sapatias (2006) ad Lemma S5.7 the secod sum is bouded from above by 1/2 2 /2 (u+1)/2u 2 + 1/2 log ] (u+1)/2u 2 1/2 + k K (t 0 ) 2 /2 θ k 2 (r 1/p) = O ( (r 1/p) ). Thus for ay t 0 (0 1) R u ( ˆf B r pq(a) t 0 ) u(r 1/p) as. This completes the proof of Theorem S1.1. proved. Thus the proof for the upper boud of Theorem 1 is completed ad hece Theorem 1 is We ow prove Theorem 2 (adaptive case). I order to do that we eed some prelimiary results.

4 4 NATALIA BOHKINA AND THEOFANIS SAPATINAS Theorem S1.2. Take (r p q) such that r > 1 p 1 p q ad a sequece B such that B / log (B ) as. Let f be a estimator of f based o observatios from the stadard oparametric regressio model (2.1). If f 0 Bpq(A r ) with 0 < A < A satisfyig ] lim sup u(r 1/p) B E f(t 0 ) f 0 (t 0 ) u < the lim if (/ log (B ) ) ] u(r 1/p) sup E f(t 0 ) f(t 0 ) u > 0. f Bpq(A) r Proof of Theorem S1.2. Let X be a radom variable havig either distributio P θ0 with desity f θ0 or distributio P θ1 with desity f θ1 with respect to some domiatig measure. For ay estimator δ = δ(x) of θ {θ 0 θ 1 } its l u -risk (1 u < ) is defied by R u (δ θ) = E δ(x) θ u. Deote by κ(x) = f θ1 (x)/f θ0 (x) the ratio of the two desities. (κ(x) = for some x is possible with the obvious iterpretatio κ(x)f θ0 (x) = f θ1 (x).) For 1 u < deote by u the value satisfyig 1/u + 1/u = 1 (i.e. u ad u are cougate umbers). Let I u := I u (θ 0 θ 1 ) = E θ0 (κ(x)) u ] 1/u with obvious chage for u = (this is a measure of distace betwee the two distributios P θ0 ad P θ1 ). Take f (t) = γ 1 g(β (t t 0 )) + f 0 (t) t 0 1] where 1) g is a compactly supported ad mootoically decreasig fuctio satisfyig (i) g B r pq(a A ) (ii) g(x) 0 for x (0 1] g(0) > 0 ad (iii) g 2 2 > 0 (such a fuctio is easy to costruct either directly or by usig wavelets); deote by b = uc u g 2 2 ] 1 > 0 where c u = 1/2(u 1)] for u > 1 ad c u = 1 for u = 1; 2) γ = (/(b log (B ))) (r 1/p) ad β = (/(b log (B ))) 1. Note that 2) above implies that γβ 2 = /(b log (B )) ad γ 1 β r 1/p = 1. The i view of Lemma 1 i ai (2003) f B r pq(a). Write P 0 ad P 1 for the probability measures o R geerated from the stadard oparametric regressio model (2.1) with f = f 0 ad f = f respectively. Sice σ 2 is assumed kow we take σ 2 = 1 without loss of geerality. The a sufficiet statistic for the family of probability measures {P 0 P 1 } is give by the log likelihood ratio T = log(dp 1 /dp 0 ) (see e.g. Brow ad Low (1996b)). Set ρ = { } i=1 where t i := i/ i = The g 2 (β (t i t 0 )) γ 2 N( ρ /2 ρ ) uder P 0 T N( ρ /2 ρ ) uder P 1

5 POINTWISE OPTIMALITY UNDER GENERAL LOSSES 5 ad κ(x) = e x. Note that for large ad r > 1/p + 1/2 ρ ρ := 1 0 ρ = f f = γ 2 (1996b)). Now defie ḡ by β 1 g 2 2 g 2 (β (t t 0 )) dt where γ 2 (by followig the argumets of Sectio 5 i Brow ad Low ḡ(β (t t 0 )) = g(β (t i t 0 )) t i 1 < t t i i = with ḡ(0) = g(0). The ρ = 1 0 ḡ 2 (β (t t 0 )) dt. γ 2 Followig the argumets i the proof of Theorem 4 i ai (2003) we ca rewrite I u = ρ e ρ (u 1)/2 = e 2(u 1) for 1 < u < ad for u = Ĩ = κ(x)i(x ρ ) = e ρ. We ca uify the two cases by writig I u = e γ 2 β 1 values of β ad γ i the defiitio of g we get that I u = B 1/u /(ub) e µcu where µ = ρ ρ. Substitutig the e µcu sice γ 2 β 1 /b = log (B ). Sice µ does ot ecessarily goes to zero whe 1/p < r < 1/p + 1/2 we use the assumptio that g is o-egative ad decreasig to obtai the iequality that µ 0 for r > 1/p which is sufficiet for our purpose. Ideed sice g is o-egative ad decreasig for t t i 1 t i ] we get ḡ 2 (β (t t 0 )) = g 2 (β (t i t 0 )) g 2 (β (t t 0 )) implyig that ρ ρ (i.e. µ 0) ad thus that I u = e ρ c u e ρcu = B 1/u. Let ow δ = f (t 0 ) θ 0 = f 0 (t 0 ) ad θ 1 = f (t 0 ). Note that for sufficietly large = f (t 0 ) f 0 (t 0 ) = g(0)γ 1. For some > 0 ad large eough Lemma 2(i) i ai (2003) states that R u (δ θ 1 ) u (1 uε u I u / ) where R u (δ θ 0 ) ε u u; i our case it is give that ε u = 1/u value of ε u ad the upper boud for I u we get γ (r 1/p) B 1/u. Substitutig the ( ) R u (δ θ 1 ) = E f g(0) u { } (t 0 ) f (t 0 ) u 1 u 1/u (r 1/p) B 1/u B 1/u (1 + o(1))γ (g(0)) 1 This completes the proof of Theorem S1.2. ( =b g(0)] u(r 1/p) log B ) u(r 1/p) { } 1 u 1/u (b log(b )) (r 1/p) (1 + o(1))(g(0)) 1 ( ) u(r 1/p) =b g(0)] u(r 1/p) log B (1 + o(1)).

6 6 NATALIA BOHKINA AND THEOFANIS SAPATINAS We are ow ready to prove Theorem 2 (adaptive case). Proof of Theorem 2 Lower boud] osider two Besov classes B r i p i q i (A i ) with r i > 1/p i for i = 1 2. Let 0 < r 2 1/p 2 < r 1 1/p 1. Applyig Theorem S1.2 it immediately follows that if a estimator ˆf satisfies as lim sup l sup f B r 1 p 1 q 1 (A 1 ) E ˆf (t 0 ) f(t 0 ) u ] < for some l > u(r 2 1/p 2 )/(1 + 2(r 2 1/p 2 ) the ] (/ ) lim if u(r 2 1/p 2 ) log 2(r 2 1/p 2 )+1 sup E ˆf (t 0 ) f(t 0 ) u > 0. f B r 2 p 2 q 2 (A 2 ) This completes the proof for the lower boud of Theorem 2. Upper boud] The validity of the upper boud follows from the followig theorem. Theorem S1.3. Assume (S1) (S2) ad (S3) 1 u < ad f B r pq(a) with 1 p q A > 0 ad 1/p < r < s. Let ϕ be the pdf of N(0 σ 2 /2) 0 < σ σ σ < ad let ˆf be a hard thresholdig wavelet estimator with threshold t = σ 1/2 (u log 2) 1/2 for = L L J 1. The for ay t 0 (0 1) ( R( u ˆf Bpq(A) r t 0 ) = O log ) u(r 1/p) as. ( ) 1 Proof of Theorem S1.3. Let 2 be such that 2 2 = 2ν+1 log. From the proof of Theorem S1.1 it follows that we eed to fid a upper boud o (7.4) which we cosider separately for low ad high resolutio levels: 2 k K (t 0 ) 2 /2 E(ˆθ k θ k ) u ] 1/u + k K (t 0 ) 2 /2 E(ˆθ k θ k ) u ] 1/u = R 1 + R 2. By Lemma A.3 i Bochkia ad Sapatias (2006) the first sum is bouded from above by /2 t + O( 1/2 2 2/2 ) = 1/2 2 /2 σ 1/2 + O( 1/2 2 2/2 ) 1/2 log 2 ] 1/2 2 2/2 + O( 1/2 2 2/2 ) ( ) ν/(2ν+1). log I the proof of Theorem S1.1 we showed that if t 1/2 as E d k θ k u I( d k > t ) σ u u/2 t /σ u+1 exp{ (t /σ ) 2 /2}

7 POINTWISE OPTIMALITY UNDER GENERAL LOSSES 7 ( ) which holds here sice for > 2 t 1/2 1/2 2 = log 2 log as. Note that sice 2 < 1 ad t 1/2 for > 2 as we ca apply Lemma S5.7 to obtai a upper boud o the secod sum: 1/2 = ( log 2 /2 σ 1/2 t /σ 1+1/u exp{ (t /σ ) 2 /2u} 2 /2 1/2+1/2u exp{ log 2/2} 1/2 (log ) 1/2+1/2u ) 1/2 (log ) 1+1/2u = o This completes the proof of Theorem S1.3. proved. ( ) ν/(2ν+1). log Thus the proof for the upper boud of Theorem 2 is completed ad hece Theorem 2 is S2. Proof of theorems for o-adaptive Bayes factor wavelet estimators Set ε 1 = 1. Proof of Theorem 3. Followig the proof of Theorem S1.1 we oly eed to cosider (7.4): R = 2 /2 k K (t 0 ) E θ k ˆθ k u] 1/u = Slow + S high where S low represets the sum over idices L 1 ad S high represets the sum over 1 < J 1. Note that for the low resolutio levels 1 ν / = 2 m 1 1/2 = 2 m 1( 1 ) m 1ε 1 1/2 0 as (7.5) sice m 1 ε 1 1/2 = m 1 (r 1/p+1/2) 0. Similarly for the high resolutio levels ν / = 2 m 2 1/2 = 2 m 2( 1 ) m 2ε 1 1/2 as (7.6) sice m 2 ε 1 1/2 = m 2 (r 1/p+1/2) 0. Low resolutio levels L 1. We use Lemma A.3 i Bochkia ad Sapatias (2006) to boud S low from above: S low = 1 2 /2 k K (t 0 ) mi(t θ k ) + O(2 1/2 1/2 ) sice κ u = + x u ϕ (x)dx = cσ u + z u e z β dz < σ u < (uiformly).

8 8 NATALIA BOHKINA AND THEOFANIS SAPATINAS Sice ν / 0 due to (7.5) ad prior h satisfies the assumptios of Lemma S5.5 we ca apply Lemma S5.5 together with the upper boud for the threshold uder power-expoetial errors (Lemma 4(i) i Pesky ad Sapatias (2007)) to obtai the followig boud: S low 1 1 ( 2 /2 t + O(2 1/2 1/2 ) 1/2 2 /2 )] 1/β log β ν 1 + O(2 1 /2 1/2 ) 1 ( 1/2 2 /2 )] 1/β log β ν 1 + O(2 1 /2 1/2 ) = O ( r 1/p ) sice β ν 1 = 2 (a 1 m 1 ) b 1+1/2 b 1+1/2+ε 1 (a 1 m 1 ) + uder the assumptios of the theorem. Thus for the low level sum the rate is optimal. High resolutio levels J 1. We eed to show that the followig sum is bouded by ν/(2ν+1) : S high = 2 /2 k K (t 0 ) E θ k ˆθ k u] 1/u. By Lemma S5.6 S high 2 2 /2 k K (t 0 ) E θ k d k u I( d k > t )] 1/u /2 k K (t 0 ) Due to Lemma A.4 i Bochkia ad Sapatias (2006) the secod sum is bouded by 2 /2 2 (r 1/p+1/2) = 2 (r 1/p) = O( ν/(2ν+1) ) as. Note also that ν / due to (7.6). Now we cosider the first term. osider separately 2 cases: β 1 ad β > β 1. I this case λ ϕ = 0 ad if β > 1 by Lemma 2(ii) i Pesky (2006) θ k. ζ = 1 + o(ν 1 ) < β x implyig that I( d k > t k ) = I(ζ (d k ) > β ) = 0 ad thus the first sum is zero. S5.3: Note that β > 1 sice β = 2 a 2 b 2 b 2+ε 1 (a 2 ) + sice b 2 + ε 1 (a 2 ) + > β > 1. I this case we boud the summads i the first sum usig Lemma S5.7 ad Lemma E θ k d k u I( d k > t ) u/2 t /σ ] u+1 e (t /σ ) β = u/2 log β ] (u+1)/β β 1

9 POINTWISE OPTIMALITY UNDER GENERAL LOSSES 9 sice β for > 1 due to b 2 + ε 1 (a 2 ) + > 0 ad thus t 1/2. Thus the secod sum i this case is bouded by The last sum is equal to = 1/2 2 /2 1/2 log β ] (u+1)/uβ β 1/u 2 /2 b 2 log + a 2 log 2] (u+1)/uβ b 2/u 2 a 2/u 1/2 b2/u (b 2 + ε 1 (a 2 ) + ) log ] (u+1)/uβ 2 (1/2 a 2/u). (b 2+a 2 )/u log ] (u+1)/uβ 1/2 a 2 /u > 0 1/2 b 2/u log ] 1+(u+1)/uβ 1/2 a 2 /u = 0 (1 ε 1)/2 (b 2+ε 1 a 2 )/u log ] (u+1)/uβ 1/2 a 2 /u < 0. Thus this sum coverges to zero at a rate ot slower tha the optimal if b 2 + a 2 > u(1 ε 1 )/2 if a 2 u/2 b 2 + ε 1 a 2 > 0 if a 2 > u/2 which ca be rewritte as b 2 + ε 1 a 2 > (u/2 a 2 )(1 ε 1 ) if a 2 u/2 b 2 + ε 1 a 2 > 0 if a 2 > u/2 or b 2 + ε 1 a 2 > (u/2 a 2 ) + (1 ε 1 ). It is easy to check that this coditio implies b 2 + ε 1 (a 2 ) + > 0 required earlier. This completes the proof of Theorem 3. Proof of Theorem 4. Followig the proof of Theorem S1.1 we oly eed to cosider (7.4) which we cosider separately for low ad high resolutio levels. Low resolutio levels L 1. Sice m 1 r 1/p + 1/2 ν / 0 followig the proof for the low levels of Theorem 3 the sum for the low levels is bouded by the optimal rate plus 1 2 t which achieves the optimal rate if β /ν for the cosidered (Lemma S5.5) i.e. give 1/2+b 1 2 (a 1 m 1 ) 1 if a 1 m 1 > 0 β /ν = 1/2+b 1 2 (a 1 m 1 ) 1/2+b 1 1 if a 1 m 1 = 0 1/2+b 1 if a 1 m 1 < 0

10 10 NATALIA BOHKINA AND THEOFANIS SAPATINAS or equivaletly if 1/2 + b 1 + ε 1 (a 1 m 1 ) 0 if a 1 m 1 > 0 1/2 + b 1 < 0 if a 1 m 1 = 0 1/2 + b 1 0 if a 1 m 1 < 0. High resolutio levels J 1. Sice ϕ is a heavy tailed desity (3.3) by Lemma 2 (ii) i Pesky (2006) ζ (x) = I (x) ϕ ( x) = 1 + o(1) < β sice β = 2 a 2 b 2 b 2+ε 1 (a 2 ) + as due to b 2 + ε 1 (a 2 ) + > 0. Thus I(ζ ( d k ) > β ) = 0 ad the secod sum is zero. This completes the proof of Theorem 4. S3. Proof of theorems for adaptive Bayes factor wavelet estimators To prove adaptive error rates we use aother divisio of the resolutio levels with the critical level 2 defied by 2 = ( ) 1 2(r 1/p) + 1 log 2. (7.7) log Note that this adaptive critical level is smaller tha the o-adaptive critical level 1. Lemma S3.1. Assume (S1) (S2) ad (S3) 1 u < ad f B r pq(a) with 1 p q A > 0 ad 1/p < r < s. Let ϕ be the pdf of N(0 σ 2 /2) 0 < σ σ σ < ad let ˆf BF is the correspodig Bayes Factor estimator. We assume that h is such that ζ (x) icreases for x > 0. Deote f = max(1 ν / ) v = β f /ν as. The for ay t 0 (0 1) R u ( ˆf BF B r pq(a) t 0 )] 1/u ( ) (r 1/p) () log + 1/2 + 1/2 2 2 /2 f log( ϕh v )] 1/2 2 /2 v 1/u log v ] (u+1)/2u. (7.8) Proof of Lemma S3.1. Followig the proof of Theorem S1.1 we oly eed to cosider (7.4) ad usig Lemma A.3 i Bochkia ad Sapatias (2006) ad Lemma S5.6 we have that R u ( ˆf BF B r pq(a) t 0 )] 1/u ν/(2ν+1) /2 2 k K (t 0 ) 2 /2 t E θ k d k u I( d k > t )] 1/u.

11 POINTWISE OPTIMALITY UNDER GENERAL LOSSES 11 By Lemma S5.4 the first sum is bouded by 2 2 /2 t 2 2 /2 σ 1/2 ( 1 + ν2 2 1/2 2 /2 σ f log(ϕh β f /ν ) ] 1/2. ) 1/2 log( ϕh β 1 + /ν 2 ) ] 1/2 By Lemma S5.7 ad Lemma S5.2 the secod sum is bouded by 1/2 1/2 1/2 2 /2 t σ / ] 1+1/u exp{ t σ / ] 2 /u} 2 /2 log(β max(1 /ν ))] (u+1)/2u β max(1 /ν )] 1/u 2 /2 log(β f /ν )] (u+1)/2u β f /ν ] 1/u. Thus Lemma S3.1 is proved. If we boud the logarithmic term i the sums from above we obtai the followig corollary. orollary S3.1. Assume that ν ad β are such that for all = L L+1... J 1 ν c ad β /ν as such that log(β /ν ) B log for some c B > 0. The uder assumptios of Lemma 1 for ay t 0 (0 1) R u ( ˆf BF B r pq(a) t 0 )] 1/u ( ) r 1/p ( + log We are ow ready to prove Theorems 5 ad 6. log ) 1/2 1/2u 2 /2 (β /ν ) 1/u. (7.9) Proof of Theorem 5. By substitutig assumptio (3) of the theorem ito orollary S3.1 we obtai the followig boud: R u ( ˆf BF B r pq(a) t 0 )] 1/u ( ) r 1/p ( + log log ) 1/2 1/2u 2 /2 b/u 2 a/u.

12 12 NATALIA BOHKINA AND THEOFANIS SAPATINAS The latter sum is bouded by = ( ) 1/2 (log ) 1/2u log ( ) 1/2 (log ) 1/2u log b/u 1/2u ε 1(1/2 a/u) 1/2 a/u < 0 b/u 1/2u log 1/2 a/u = 0 b/u 1/2u (1/2 a/u) 1/2 a/u > 0 b/u 1/2u+ε 1(1/2 a/u) a > u/2 b/u 1/2u log a = u/2 b/u 1/2u+1/2 a/u a < u/2 ( ) 1/2 (log ) 1/2u+1 b/u 1/2u+(u/2 a) +/u log ( ) 1/2 ( ) r 1/p (log ) 1/2u+1 log log due to assumptio b + 1/2 (u/2 a) + 0. Thus Theorem 5 is proved. Proof of Theorem 6. Followig the proof of Theorem S1.1 we oly eed to cosider (7.4) ad usig Lemma A.3 i Bochkia ad Sapatias (2006) ad Lemma S5.6 we have that R u ( ˆf BF B r pq(a) t 0 )] 1/u ν/(2ν+1) /2 k K (t 0 ) 2 /2 t E θ k d k u I( d k > t )] 1/u. a) Low resolutio levels. Sice ν / 0 ad h is either heavy tailed (3.3) or ormal we ca apply Lemma 4(i) i Pesky ad Sapatias (2007) to boud the first sum by 1/2 2 2 /2 log(β /ν )] 1/β 1/2 (log ) 1/2 2 2/2 = sice we assume that log(β /ν ) Blog ] β/2. b) High resolutio levels. By Lemma S5.7 we have ( ) (r 1/p)/() log E ˆθ k θ k u I( d k > t ) u/2 t /σ ] u+1 e t /σ ] β ad by Lemma S5.2 we have ϕ (t ) ϕ (0)(β /ν ) 1 sice ν / 0. Hece t σ 1/2 log(β /ν )] 1/β with β /ν implyig E ˆθ k θ k u I( d k > t ) u/2 log(β /ν )] (u+1)/β β /ν ] 1

13 POINTWISE OPTIMALITY UNDER GENERAL LOSSES 13 ad = 2 k K (t 0 ) 2 /2 E ˆθ k θ k u I( d k > t )] 1/u 1/2 1/2u log ] (u+1)/(2u) = 2 2 /2 β /ν ] 1/u. We have the same sum as i the proof of Theorem 5 but with a differet power of the logarithmic factor. Therefore uder the same assumptios as i Theorem 5 we obtai the optimal rate of covergece. Hece Theorem 6 is proved. S4. Proof of propositios for a priori Besov regularity Proof of Propositio 1. Accordig to Theorem 3 i Pesky ad Sapatias (2007) we eed to check that lim 2 (r+1/2) ν 1 β 1/p ] mi(p q) <. Deote κ = mi(p q) 1 ) the the sum ca be writte as κb 1/p 1 2 κ(r+1/2 m 1 a 1 /p) + κb 2/p = κb 1/p+κ(r+1/2 m 1 a 1 /p) + log ] I(r+1/2 1/p m 1 a 1 /p=0) + κb 2/p+κZ(r+1/2 m 2 a 2 /p) log ] I(r+1/2 1/p m 2 a 2 /p=0) 2 κ(r+1/2 m 2 a 2 /p) where Z = 1 if r+1/2 1/p m 2 a 2 /p 0 ad Z = 1 if r+1/2 1/p m 2 a 2 /p < 0. This sum is fiite as if b 1 /p (r + 1/2 m 1 a 1 /p) + 0 ad b 2 /p Z(r + 1/2 m 2 a 2 /p) 0 the iequalities are strict if r + 1/2 m 1 a 1 /p = 0 or r + 1/2 m 2 a 2 /p = 0 respectively. Proof of Propositio 2. Accordig to Theorem 3 i Pesky ad Sapatias (2007) we eed to check that lim 2 (r+1/2) ν 1 β 1/p ] mi(p q) <. Deote κ = mi(p q) 1 ) the the sum ca be writte as κb/p 2 κ(r+1/2 m (a+m)/p) = κb/p+κ(r+1/2 m (a+m)/p) + log ] I(r+1/2 1/p m (a+m)/p=0) which is fiite as if b/p (r + 1/2 m (a + m)/p) + 0 ad the iequality is strict if r + 1/2 m (a + m)/p = 0. S5. Auxiliary lemmas. Lemma S5.2. If ϕ ad h are symmetric uimodal desity fuctios the ϕ ( ( t ) β 1 mi ϕ (0) h(0)ν ).

14 14 NATALIA BOHKINA AND THEOFANIS SAPATINAS Proof of Lemma S5.2. Note that the symmetry ad uimodality of ϕ implies that ϕ (x) ϕ (0) for ay x. Therefore the equatio for the threshold t (see expressio above (3.11)) ca be rewritte as follows β = ζ (t ) = + ϕ ( (t x))ν h(ν x)dx ϕ ( t ) + ϕ (0)ν h(ν x)dx ϕ ( = t ) Similarly by symmetry ad uimodality of h we have β = ζ (t ) = + ϕ ( x)ν h(ν (t x))dx ϕ ( t ) ϕ (0) ϕ ( t ). + ϕ ( x)ν h(0)dx ϕ ( = t ) ν h(0) ϕ ( t ). Rearragig the terms we have ϕ ( { t ) mi β 1 ϕ (0) β 1 h(0)ν / }. (7.10) Thus Lemma S5.2 is proved. The followig lemma is a obvious corollary from Lemma S5.2. Lemma S5.3. If ϕ (x) = c β σ 1 e x/σ β β > 0 ad h is a symmetric uimodal desity the { ( )] } 1/β β t σ max log log (β )] 1/β. h(0)ν Lemma S5.4. Take ϕ (x) N(0 σ 2 /2) ϕ (x)/h(x) ϕh ad let ζ (x) be icreasig for x > 0. The t σ 1/2 ( 1 + ν2 ) 1/2 log ϕh β 1 + ν 2 1/2. Proof of Lemma S5.4. Sice ζ (x) icreases for x > 0 ζ (x) β if ad oly if x t. We ca fid the followig lower boud o ζ (x) usig h(x) 1 ϕh ϕ (x). More precisely ζ (x) = ϕ (x )] 1 ν ϕ ((x y) )h(ν y)dy 1 ϕh ϕ (x )] 1 ν = 1 ϕh = 1 ϕh ν πσ ϕ ((x y) )ϕ (ν y)dy exp { ( + ν 2 )y 2 /σ 2 + 2xy/σ 2 } dy { } ν 2 x 2 exp ν 2 + ( + ν 2. )σ2

15 POINTWISE OPTIMALITY UNDER GENERAL LOSSES 15 { } Take x > 0 such that 1 ν ϕh exp 2 x 2 = β + (+ν 2. The )σ2 ν 2 t x = σ ( 1 + ν2 ) 1/2 log ϕh β 1 + ν 2 1/2. Thus Lemma S5.4 is proved. Lemma S5.5. Let ϕ ad h be symmetric uimodal desities ad ϕ have fiite variaces σ 2 0 < σ σ σ < h has a bouded secod derivative ad ζ (x) icreases for x > 0. The if ν / 0 as β /ν 1 if ad oly if t 0 1/2 for some 1 0 > 0 which deped o ϕ ad h but ot o ν or. Remark S5.1. If the coditio that h has a bouded secod derivative is replaced with the coditios that ϕ (x)/h(x) ϕh ad ϕ have bouded secod derivatives (uiformly i ) the β /ν 1 implies that t 0 1/2. Proof of Lemma S5.5. osider the fuctio ζ (x) at poit x = v where v is idepedet of. The ζ (x ) = ϕ ( + x )] 1 ν ϕ ( y)h(ν (x y))dy = ϕ (v)] 1 ν + ϕ (z)h = ϕ (v)] 1 ν + ϕ (z) h(0) + ( ) ν (v z) = ϕ (v)] 1 ν ( ) ] 2 ν h(0) + O (v 2 + σ 2 ). dz ( ν ) 2 (v z) 2 h (c(v z))/2 ] dz Sice ζ (x) icreases t x if ad oly if ζ (x ) ζ (t ) = β. The last iequality ca be writte as ( ) ] 2 ϕ (v)] h(0) 1 ν + O (v 2 + σ 2 ) β /ν. As ν / 0 the left had side teds to = h(0)/ϕ (v) uiformly i sice σ 2 are bouded from above ad below implyig that β /ν 0 if ad oly if t 1 1/2 where 1 = v is a costat. Thus Lemma S5.5 is proved. We state the obvious lemma below due to its frequet use i the proofs.

16 16 NATALIA BOHKINA AND THEOFANIS SAPATINAS Lemma S5.6. Let ˆθ k be a hard thresholdig estimator of θ k with threshold t based o observatio d k ad 1 u <. The E ˆθ k θ k u = θ k u I( d k t ) + E d k θ k u I( d k > t ) θ k u + E d k θ k u I( d k > t ). Lemma S5.7. If ϕ (x) = β σ 1 exp{ x/σ β } ad t for > 1 as the E θ k d k u I( d k > t )] 1/u β t σ / ] u+1 exp{ t σ / ] β }. Proof of Lemma S5.7. It is easily see that E d k θ k u I( d k > t ) = x θ k u exp{ (x θ k )/σ β }dx σ 2π x >t = σu u/2 2π yσ / +θ k >t y u exp{ y β }dy. Sice for B > max(0 (u + 1)/β 1] 1/β ) we have the followig boud: B z u e z β dz 1 β Bu+1 e Bβ (7.11) E d k θ k u I( d k > t ) σ u u/2 t /σ u+1 exp{ (t /σ ) β } = σ u u/2 (2u log 2) (u+1)/2 e u log 2 by Lemma S5.6 ad for θk A 2 (r 1/p+1/2) A 2 1(r 1/p+1/2) = A (7.12) accordig to Lemma A.4 i Bochkia ad Sapatias (2006). Now we prove (7.11). B z u e zβ dz = x = 1/z] = 1/B 0 x u 2 e x β dx Fuctio x u 2 e x β icreases for x < β/(u + 2)] 1/β sice the Thus for all x 1/B < β/(u + 2)] 1/β B d dx (x u 2 e x β ) = e x β x u 3 (u + 2) + βx β ] < 0. z u e zβ dz = 1/B 0 x u 2 e x β dx B 1 B u+2 e Bβ = B u+1 e Bβ.

17 Thus Lemma S5.7 is proved. POINTWISE OPTIMALITY UNDER GENERAL LOSSES 17 Lemma S5.8. Let ϕ ad h be Studet s t distributios with desity (4.2) with ρ ad γ degrees of freedom respectively 0 < γ < ρ ϕ (x) = σ 1 ϕ(x/σ ) 0 < σ σ σ <. The if ν / 0 as the threshold t defied by (3.10) asymptotically satisfies 1 t /σ = (1 + o(1)) 0 1 β 1 0 β ] 1/(ρ+1) ν σ ( ) ν σ γ ] 1/(ρ γ) for some costats 0 1 > 0 depedig oly o γ ad ρ. β < 0 ν σ ν σ 0 β 0 ( ν σ ) ρ β > 0 ( ν σ ) ρ Proof of Lemma S5.8. We ca obtai the asymptotic expressio for the threshold followig the proof of Lemma 4(ii) i Pesky ad Sapatias (2007) uder the assumptios of the lemma that for some costat 0 > 0 ζ (x) = 0 ν σ F (x) with F (x) = (γ +ν 2 x2 ) (γ+1)/2 (ρ+x 2 /σ 2 )(ρ+1)/2 (1+ o(1)) (due to Lemma 2 i Pesky ad Sapatias (2007)) ad c ρ c 1 γ (1 + o(1)) x < σ / F (x) = c 1 γ ( x/σ ) ρ o(1)] σ / x 1/ν ( x/σ ) ρ+1 (ν x) γ o(1)] x > 1/ν where c k = k (k+1)/2. Sice the threshold satisfies β = ζ (t ) = 0 ν σ F (t ) we obtai that 1 t /σ = (1 + o(1)) 0 1 β 1 0 β ν σ which is the statemet of the lemma. β ν σ < 0 ] 1/(ρ+1) ν σ 0 β ( ) ] ν σ γ+1 1/(ρ γ) β ν σ ν σ 0 ( ν σ ) ρ+1 > 0 ( ν σ ) ρ+1

7.1 Convergence of sequences of random variables

7.1 Convergence of sequences of random variables Chapter 7 Limit Theorems Throughout this sectio we will assume a probability space (, F, P), i which is defied a ifiite sequece of radom variables (X ) ad a radom variable X. The fact that for every ifiite