Block thresholding for density estimation: local and global adaptivity

Journal of Multivariate Analysis

Block thresholding for density estimation: local and global adaptivity

Eric Chicken^a,*, T. Tony Cai^b,1

^a Department of Statistics, Florida State University, Tallahassee, FL, USA
^b Department of Statistics, The Wharton School, University of Pennsylvania, Philadelphia, PA 19104, USA

Received 18 August 2003; available online 11 September 2004

Abstract

We consider the wavelet block thresholding method for density estimation. A block-thresholded density estimator is proposed and is shown to achieve the optimal global rate of convergence over Besov spaces and simultaneously attain the optimal adaptive pointwise convergence rate as well. These results are obtained in part through the determination of an optimal block length. © 2004 Elsevier Inc. All rights reserved.

AMS 1991 subject classification: primary 62G07; secondary 62G20

Keywords: Adaptive estimation; Besov space; Block thresholding; Density estimation; Hölder space

1. Introduction

In nonparametric function estimation the performance of an estimator is typically measured under one of two commonly used losses: squared error at each point and integrated squared error over the whole interval. The first is a measure of the accuracy of an estimator locally at a point and the second provides a global measure of precision. Minimax and adaptation theories have been well developed for both the local and global losses. See for example [7,12–14,21,23]. See also the references in Efromovich [16].

* Corresponding author. E-mail address: chicken@stat.fsu.edu (E. Chicken).
1 Research supported in part by NSF Grant DMS.

It has been noted in the literature that a local or global risk measure alone does not fully capture the performance of an estimator. For functions of spatial inhomogeneity, the local smoothness of the functions varies significantly from point to point, and a globally rate-optimal estimator can have erratic local behavior. On the other hand, an estimator which is locally rate optimal at each point can be far from optimal under the global loss; see [8]. More recent focus has been on a simultaneously local and global analysis of the performance of an estimator. The goal is to construct adaptive estimators which are near optimal simultaneously under both pointwise loss and global loss over a collection of function classes. Such an estimator permits the trade-off between variance and bias to be varied along the curve in an optimal way, resulting in spatially adaptive smoothing in the classical sense. This approach has been used, for example, in Cai [4,6] and Efromovich [17] in the context of nonparametric regression and in Cai [5] for inverse problems. Wavelet methodology has demonstrated considerable success in terms of spatial adaptivity and asymptotic optimality. In particular, block thresholding rules have been shown to possess impressive properties. The estimators make simultaneous decisions to retain or to discard all the coefficients within a block and increase estimation accuracy by utilizing information about neighboring coefficients. The idea of block thresholding can be traced back to Efromovich [15] in orthogonal series estimators. In the context of nonparametric regression, local block thresholding has been studied, for example, in Hall et al. [19], Cai [4,6], Cai and Silverman [9] and Efromovich [17]. Block thresholding rules for inverse problems were considered in Cai [5]. In particular, it is shown in Cai [4,6] that there are conflicting demands on the block size for achieving the global and local adaptivity.
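A block thresholding rule of the kind discussed above can be sketched in a few lines. The function name, arguments, and the simple mean-energy criterion below are illustrative assumptions, not any specific estimator from the literature:

```python
import numpy as np

def block_threshold(coeffs, block_len, thresh):
    """Keep or kill entire blocks of empirical wavelet coefficients.

    A block survives when the average squared coefficient within it
    exceeds `thresh`; otherwise every coefficient in the block is zeroed.
    Hypothetical helper for illustration only.
    """
    out = np.zeros_like(coeffs)
    for start in range(0, len(coeffs), block_len):
        blk = coeffs[start:start + block_len]
        if np.mean(blk ** 2) > thresh:
            out[start:start + len(blk)] = blk   # retain the whole block
    return out

# a large-energy block is retained, a small-energy block is discarded
kept = block_threshold(np.array([3.0, 3.0, 3.0, 0.1, 0.1, 0.1]), 3, 1.0)
```

Because the decision pools information across a whole block, a coefficient that is individually below the noise level can still be retained when its neighbors carry signal; this pooling is the source of the improved convergence rates discussed above.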
To achieve the optimal global adaptivity the block size needs to be large, and to achieve the optimal local adaptivity the block size must be small. An optimal choice of block size is given and the resulting estimator is shown to attain the adaptive minimax rate of convergence simultaneously under both the pointwise and global losses. In the present paper we consider block thresholding for density estimation. In this context, Kerkyacharian et al. [22] proposed a wavelet block thresholding estimator which uses an entire resolution level as a block. The thresholding rule is not local and so does not enjoy a high degree of spatial adaptivity. A local version of the block thresholding density estimator was introduced in Hall et al. [20]. The block size is chosen to be of the order (log n)², where n is the sample size. The estimator is shown to enjoy a number of advantages over the conventional term-by-term thresholding estimators. The global properties of the estimator were studied. The estimator adaptively attains the global optimal rate of convergence over a range of function classes of inhomogeneous smoothness under integrated squared error. However, as shown in this paper, the estimator does not achieve the optimal local adaptivity under pointwise squared error. The block size is too large to fully capture subtle spatial changes in the curvature of the underlying function. In the present paper, we propose a block thresholding density estimator and give a simultaneously local and global analysis for the estimator. Let X₁, X₂, ..., X_n be a random sample from a distribution with density function f. We wish to estimate the density f based on the sample. The estimation accuracy is measured both globally by the mean integrated squared error

R(f̂, f) = E‖f̂ − f‖², (1)

and locally by the expected squared error loss at each given point x₀,

R(f̂(x₀), f(x₀)) = E(f̂(x₀) − f(x₀))². (2)

Our block thresholding procedure first divides the empirical coefficients at each resolution level into nonoverlapping blocks and then simultaneously keeps or kills all the coefficients within a block, based on the magnitude of the sum of the squared empirical coefficients within that block. Motivated by the analysis of block thresholding rules for nonparametric regression in Cai [6], the block size is chosen to be log n. It is shown that the block thresholding estimator adaptively achieves not only the optimal global rate over Besov spaces, but simultaneously attains the adaptive local convergence rate as well. These results are obtained in part through the determination of the optimal block length. The paper is organized as follows. After Section 2, in which background information on wavelets and the function spaces of interest is given, we discuss block thresholding rules for density estimation in Section 3. The asymptotic properties of the block thresholding estimator are set forth in Section 4, along with a related theorem on a block thresholded convolution kernel estimator. Simulation results for the proposed estimator are found in Section 5, and proofs of the theorems are given in Section 6.

2. Wavelets and function spaces

An orthonormal wavelet basis is generated from dilation and translation of a father wavelet φ and a mother wavelet ψ. In this paper, the functions φ and ψ are assumed to be compactly supported and ∫φ = 1. We call a wavelet ψ r-regular if ψ has r vanishing moments and r continuous derivatives. Let φ_{ij}(t) = 2^{i/2}φ(2^i t − j), ψ_{ij}(t) = 2^{i/2}ψ(2^i t − j). The collection {φ_{mj}, ψ_{ij}; i ≥ m, j ∈ Z} is then an orthonormal basis of L₂(R); see [11,26]. Besov spaces arise naturally in many fields of analysis. They contain a number of traditional function spaces such as Hölder and Sobolev spaces as special cases.
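The wavelet-coefficient sequence norm that defines the spaces B^s_{p,q} in this section can be computed directly for finitely many coefficient levels. A minimal sketch, assuming finite p and q and finitely many levels; the function and variable names are hypothetical:

```python
import numpy as np

def besov_seq_norm(alpha, betas, s, p, q, m=0):
    """Sequence norm of B^s_{p,q} from wavelet coefficients (finite p, q).

    alpha: coarse coefficients at level m; betas[k]: detail coefficients
    at level i = m + k. The finite truncation is an assumption for
    illustration -- the true norm sums over all levels i >= m.
    """
    coarse = np.sum(np.abs(alpha) ** p) ** (1.0 / p)
    detail = 0.0
    for k, b in enumerate(betas):
        i = m + k
        # level weight 2^{i(s + 1/2 - 1/p)} times the lp norm of level i
        level = 2.0 ** (i * (s + 0.5 - 1.0 / p)) * np.sum(np.abs(b) ** p) ** (1.0 / p)
        detail += level ** q
    return coarse + detail ** (1.0 / q)
```

For p = q = 2 the detail part reduces to a weighted ℓ² combination of the per-level energies, which is the form used repeatedly in the proofs of Section 6.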
A Besov space B^s_{p,q} has three parameters: s measures the degree of smoothness, and p and q specify the type of norm used to measure the smoothness. Besov spaces can be defined by the sequence norm of wavelet coefficients. For a given function f, denote α_{mj} = ∫ f(t)φ_{mj}(t) dt and β_{ij} = ∫ f(t)ψ_{ij}(t) dt. Define the sequence norm of the wavelet coefficients of f by

‖f‖_{B^s_{p,q}} = ‖α_{m·}‖_{ℓ^p} + (Σ_{i=m}^∞ [2^{i(s+1/2−1/p)} (Σ_j |β_{ij}|^p)^{1/p}]^q)^{1/q}. (3)

The standard modification applies for the cases p, q = ∞. Let the wavelet ψ be r-regular. For s < r, the Besov space B^s_{p,q} is defined to be the Banach space consisting of functions with finite Besov norm ‖·‖_{B^s_{p,q}}. The Hölder space Λ^s is a special case of a Besov space B^s_{p,q} with p = q = ∞. See Triebel [28,29] and Meyer [26] for more on the properties of Besov spaces. We shall measure the global adaptivity of an estimator over two families of rich function classes which were used in Hall et al. [19]. The classes contain functions of inhomogeneous

smoothness and are different from other traditional smoothness classes. Functions in these classes can be regarded as the superposition of smooth functions with irregular perturbations such as jump discontinuities and high-frequency oscillations. Let

F^s_{p,q}(M, L) = {f ∈ B^s_{p,q} : supp(f) ⊆ [−L, L], ‖f‖_{B^s_{p,q}} ≤ M} (4)

be the collection of functions with support contained in [−L, L] and Besov norm bounded by M. Following the notation of Hall et al. [19], the first function space of interest is Ṽ^{s₁}(F^s_{2,q}(M, L)), which consists of functions which are the sum of a function in the space F^s_{2,q}(M, L), q ≥ 1, and an irregular function in F^{s₁}_{(s+1/2)^{−1},q}(M, L). A second space of interest, denoted by V_{d,τ}(F^s_{2,q}(M, L)), consists of functions which can be represented as the sum of a function in the space F^s_{2,q}(M, L), q ≥ 1, and a function in P_{d,τ,L}, which is the set of piecewise polynomials of degree d, with support in [−L, L], and with the number of discontinuities no more than τ. Local adaptivity of an estimator is measured over the local Hölder classes Λ^s(M, x₀, δ), defined as follows. For 0 < s ≤ 1,

Λ^s(M, x₀, δ) = {f : |f(x) − f(x₀)| ≤ M|x − x₀|^s, x ∈ (x₀ − δ, x₀ + δ)}

and for s > 1,

Λ^s(M, x₀, δ) = {f : |f^{(⌊s⌋)}(x) − f^{(⌊s⌋)}(x₀)| ≤ M|x − x₀|^t, x ∈ (x₀ − δ, x₀ + δ)},

where ⌊s⌋ is the largest integer strictly less than s and t = s − ⌊s⌋.

3. The estimators

3.1. Wavelet and convolution kernels

Let X₁, X₂, ..., X_n be a random sample of size n from a distribution with density function f. The objective is to estimate the density function f based on the sample. We shall use similar notation as in Hall et al. [19]. Let K(x, y) be a kernel function on R², and define K_i(x, y) = 2^i K(2^i x, 2^i y), i = 0, 1, 2, .... Additionally, K_i f will be the integral operator defined as K_i f(x) = ∫ K_i(x, y)f(y) dy. Note that

K̂_i(x) = n^{−1} Σ_{m=1}^n K_i(x, X_m)

is an unbiased estimate of K_i f(x) for all x. If using a convolution kernel, K(x, y) = K(x − y).
For wavelets, K(x, y) = Σ_j φ(x − j)φ(y − j), where φ is the father wavelet. We impose the following conditions on the kernel K. First, there exists a compactly supported Q ∈ L₂ such that

Q(x) = 0 when |x| > q₀ (5)

and

|K(x, y)| ≤ Q(x − y) for all x and y. (6)

Next, K satisfies the moment condition of order N:

∫ |x|^{N+1} Q(x) dx < ∞ and ∫ K(x, y)(y − x)^k dy = δ_{k0} for k = 0, 1, ..., N. (7)

Conditions (5)–(7) are met in the wavelet case if the mother wavelet ψ has N vanishing moments. As in Hall et al. [19], define the innovation kernel by D_i(x, y) = K_{i+1}(x, y) − K_i(x, y) for i = 0, 1, 2, .... Let D_i f be the integral operator K_{i+1}f − K_i f. Then, similarly to K̂_i, an unbiased estimator of D_i f(x) is

D̂_i(x) = n^{−1} Σ_{m=1}^n D_i(x, X_m).

For the wavelet kernel defined above, K and D_i can be associated with the projection operators of the multiresolution analysis. K(x, y) is the projection operator onto the space spanned by φ and its integer translates. D_i(x, y) is, then, the operator projecting onto the detail spaces of the multiresolution analysis. K and D_i perform similar tasks in the convolution case: namely, projection operators onto coarse and detail spaces. This innovation kernel will be used to define the density estimator.

3.2. Block thresholded estimators

The density f may be written as

f(x) = K₀f(x) + Σ_{i=0}^∞ D_i f(x). (8)

The linear part, K₀f(x), will be estimated by K̂₀(x). The remaining part will be estimated using thresholding methods, and hence is nonlinear in nature. Let φ and ψ be compactly supported father and mother wavelets satisfying conditions (5)–(7). We shall write φ_j for φ_{0j}. Then unbiased estimates of α_j = ⟨f, φ_j⟩ and β_{ij} = ⟨f, ψ_{ij}⟩ are

α̂_j = n^{−1} Σ_{m=1}^n φ_j(X_m) and β̂_{ij} = n^{−1} Σ_{m=1}^n ψ_{ij}(X_m).

Note that the linear part K₀f(x) can be written as K₀f(x) = ∫ K(x, y)f(y) dy = Σ_j α_j φ_j(x). The estimate of K₀f(x) is then K̂₀(x) = Σ_j α̂_j φ_j(x) = n^{−1} Σ_{m=1}^n

K(x, X_m). Similarly, note that D_i(x, y) = K_{i+1}(x, y) − K_i(x, y) = Σ_j ψ_{ij}(x)ψ_{ij}(y). Therefore, in a manner similar to that used above on K₀f(x), D_i f(x) = Σ_j β_{ij}ψ_{ij}(x). The estimate of the detail part D_i f(x) is, then, D̂_i(x) = Σ_j β̂_{ij}ψ_{ij}(x). We can then rewrite (8) as

f(x) = Σ_j α_j φ_j(x) + Σ_{i=0}^∞ Σ_j β_{ij}ψ_{ij}(x), (9)

and estimate (9) as

f̂(x) = Σ_j α̂_j φ_j(x) + Σ_{i=0}^R Σ_j β̂_{ij}ψ_{ij}(x) = K̂₀(x) + Σ_{i=0}^R D̂_i(x), (10)

where R is a finite truncation value for the infinite series. An adaptive density estimator will be constructed by applying a block thresholding rule as follows. In each resolution level i, the indices j are divided up into nonoverlapping blocks of length l = log n. Within each block k, the average estimated squared bias l^{−1} Σ_{j∈B_k(i)} β̂²_{ij} will be compared to the threshold. Here, B_k(i) refers to the set of indices j in block k. By estimating all of these squared coefficients together, the additional information allows a better comparison to the threshold, and hence a better convergence rate than the more conventional term-by-term thresholding estimators. If the average squared bias is larger than the threshold, all coefficients in the block will be kept. Otherwise, all coefficients will be discarded. Letting B_{ik} = l^{−1} Σ_{j∈B_k(i)} β²_{ij} and estimating this with B̂_{ik} = l^{−1} Σ_{j∈B_k(i)} β̂²_{ij}, the block thresholding wavelet estimator of f becomes

f̂(x) = Σ_j α̂_j φ_j(x) + Σ_{i=0}^R Σ_k Σ_{j∈B_k(i)} β̂_{ij}ψ_{ij}(x) I(B̂_{ik} > cn^{−1}). (11)

This may also be written as

f̂(x) = K̂₀(x) + Σ_{i=0}^R Σ_k D̂_{ik}(x) I(x ∈ J_{ik}) I(B̂_{ik} > cn^{−1}),

where D̂_{ik}(x) = Σ_{j∈B_k(i)} β̂_{ij}ψ_{ij}(x) is an estimate of D_{ik}f(x) = Σ_{j∈B_k(i)} β_{ij}ψ_{ij}(x), and

J_{ik} = {x : Σ_{j∈B_k(i)} ψ_{ij}(x) ≠ 0} = ∪_{j∈B_k(i)} {supp ψ_{ij}}.

Note that if the support of ψ is of length σ, then the length of J_{ik} is (σ + l − 1)/2^i ≈ l/2^i, and these intervals overlap each other at either end by 2^{−i}(σ − 1).
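A minimal numerical sketch of an estimator of the form (10)–(11), using Haar wavelets on [0, 1). The truncation level R, the default threshold constant, and all names are illustrative assumptions, not the constants prescribed by the theorems of Section 4:

```python
import numpy as np

def haar_block_density(x, sample, R=4, c=None):
    """Sketch of a block thresholded Haar wavelet density estimator on [0, 1).

    l = floor(log n) is the block length from the paper; the default c is a
    rough assumed constant, not the theorems' threshold.
    """
    n = len(sample)
    l = max(1, int(np.log(n)))
    if c is None:
        c = 2.0 * np.log(n)          # assumed threshold constant (illustrative)
    phi = lambda t: ((t >= 0) & (t < 1)).astype(float)   # Haar father wavelet
    psi = lambda t: phi(2 * t) - phi(2 * t - 1)          # Haar mother wavelet
    # linear part K_0 f: a single coarse coefficient for the Haar basis
    est = np.mean(phi(sample)) * phi(x)
    for i in range(R):
        js = np.arange(2 ** i)
        # empirical coefficients beta_hat_ij = n^-1 sum_m psi_ij(X_m)
        beta = np.array([2 ** (i / 2) * np.mean(psi(2 ** i * sample - j)) for j in js])
        for start in range(0, len(js), l):
            blk = slice(start, min(start + l, len(js)))
            # block rule: keep every coefficient in the block when the
            # average squared coefficient exceeds c/n, otherwise kill it
            if np.mean(beta[blk] ** 2) > c / n:
                for j in js[blk]:
                    est = est + beta[j] * 2 ** (i / 2) * psi(2 ** i * x - j)
    return est

rng = np.random.default_rng(0)
data = rng.uniform(0.0, 1.0, 500)
grid = np.linspace(0.0, 1.0, 64, endpoint=False)
fhat = haar_block_density(grid, data)   # should hover near 1 for uniform data
```

For a uniform sample the detail coefficients are pure noise, so most blocks fall below the threshold and the estimate stays close to the flat coarse projection, which is exactly the behavior the block rule is designed to produce.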

The equivalent, block-thresholded convolution kernel estimator is

f̂(x) = K̂₀(x) + Σ_{i=0}^R Σ_k D̂_i(x) I(x ∈ I_{ik}) I(Â_{ik} > cn^{−1}), (12)

where the I_{ik} are nonoverlapping intervals of length 2^{−i}l, and

A_{ik} = l^{−1} ∫_{I_{ik}} (D_i f(x))² dx is estimated by Â_{ik} = l^{−1} ∫_{I_{ik}} D̂_i(x)² dx.

Remark. As mentioned in the introduction, block size plays a crucial role in the performance of the resulting block thresholding estimator. It determines the degree of adaptivity. The block size of l = log n is chosen so that the estimator achieves both the optimal global and local adaptivity.

Remark. Hall et al. [19] choose block size l = C(log n)² and show that the block thresholding estimator is adaptively rate optimal under the global mean integrated squared error. However, as shown in the next section, this choice of block size is too large to achieve the optimal local adaptivity.

4. Local and global adaptivity

The global minimax rate of convergence of an estimate of a density in a Besov class F^s_{p,q}(M, L) to the true underlying density is O(n^{−2s/(2s+1)}). This minimax rate of convergence can be achieved adaptively, without knowing the smoothness parameters. For the wavelet kernel density estimator (11) with block length l = log n and appropriate choice of the series truncation parameter R, the optimal rate of convergence is achieved adaptively over the space Ṽ^{s₁}(F^s_{2,q}(M, L)) ∩ B_∞(A), where B_∞(A) is the space of all functions f with ‖f‖_∞ ≤ A.

Theorem 1. Let f̂ be the wavelet kernel density estimator (11) with the block length l = log n, R = log₂(Dnl^{−1}) where D is a constant given in (35), and

c = A(0.08)^{−1}(C₂‖Q‖₂ + ‖Q‖₁C₁^{−1/2})²,

where C₁ and C₂ are the universal constants from Talagrand [27]. Suppose that the wavelets φ and ψ are compactly supported and r-regular with r > max(s₁, N − 1) (i.e., conditions (6), (7) with order N ≥ 1, and (5) are met). Then there exists a positive constant C such that for all 1/2 < s < N, q ≥ 1, and s₁ − s > s/(2s + 1),

sup E‖f̂ − f‖² ≤ Cn^{−2s/(2s+1)},

the supremum being over
f ∈ Ṽ^{s₁}(F^s_{2,q}(M, L)) ∩ B_∞(A). In this theorem (and the rest in this section) the function space parameters M, L and A are arbitrary finite constants. The bound constant C is dependent on them as well as on the choice of the kernel function K (and hence on Q).

The convolution kernel estimator (12) also achieves the global, optimal minimax convergence rate with this smaller block length, although over a different space of irregular Besov functions.

Theorem 2. Let f̂ be the convolution kernel density estimator (12). Let l, R and c be as in Theorem 1. Let τ_n be a sequence of positive numbers such that for all ζ > 0, τ_n = O(n^{ζ+1/(2N+1)}). If K satisfies (6), (7) with order N ≥ 1, and (5), and 1/2 < s < N, then there exists a positive constant C such that

sup_{d<N, τ≤τ_n} sup_{f∈V_{d,τ}(F^s_{2,q}(M,L)) ∩ B_∞(A)} E‖f̂ − f‖² ≤ Cn^{−2s/(2s+1)}.

Here, it can be seen that the number of discontinuities that can be handled by the estimator is on the order of the sample size n to a power. One of the differences between Theorems 1 and 2 and those set forth in Hall et al. [19] is in regard to the block length l. In this paper, l = log n is used instead of their value (log n)². This choice of block length is crucial in estimators of the form given in the above two theorems. In fact, larger block lengths are unsatisfactory in that they preclude local convergence optimality. When attention is focused on adaptive estimation there are some striking differences between local and global theories. Under integrated squared error loss there are many situations where rate adaptive estimators can be constructed. When attention is focused on estimating a function at a given point, rate optimal adaptive procedures typically do not exist. A penalty, usually a logarithmic factor, must be paid for not knowing the smoothness. Important work in this area began with Lepski [23], where attention focused on a collection of Lipschitz classes. See also Brown and Low [1], Efromovich and Low [18] and Lepski and Spokoiny [25]. Connections between local and global parameter space adaptation can be found in Lepski et al. [24], Cai [3] and Efromovich [17]. We shall use the local Hölder class Λ^s(M, x₀, δ) defined in Section 2 to measure local adaptivity.
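The logarithmic penalty for pointwise adaptation mentioned above can be made concrete with a short calculation (an illustration only; the function names are hypothetical):

```python
import numpy as np

# The pointwise minimax rate n^{-2s/(2s+1)} versus the optimal adaptive
# pointwise rate (log n / n)^{2s/(2s+1)}: adapting to unknown smoothness s
# costs exactly a factor (log n)^{2s/(2s+1)}.
def minimax_rate(n, s):
    return n ** (-2 * s / (2 * s + 1))

def adaptive_rate(n, s):
    return (np.log(n) / n) ** (2 * s / (2 * s + 1))

n, s = 10_000, 1.0
penalty = adaptive_rate(n, s) / minimax_rate(n, s)
assert np.isclose(penalty, np.log(n) ** (2 * s / (2 * s + 1)))
```

The penalty grows without bound in n, but only logarithmically, which is why the adaptive rate is still considered optimal for pointwise estimation with unknown s.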
The minimax rate of convergence for estimating f(x₀) over Λ^s(M, x₀, δ) is n^{−2s/(2s+1)}. Lepski [23] and Brown and Low [2] showed that in adaptive pointwise estimation, where the smoothness parameter s is unknown, the optimal adaptive rate of convergence over Λ^s(M, x₀, δ) is (n^{−1} log n)^{2s/(2s+1)}. By using a block length of l = log n in the wavelet kernel estimator, this optimal adaptive rate of convergence is achieved simultaneously over a range of local Hölder classes.

Theorem 3. Let f̂ be the wavelet kernel density estimator (11). Let R, l and c be as in Theorem 1, and suppose φ and ψ are bounded. If 1/2 < s < N, and ψ has N ≥ 1 vanishing moments, then there exists a positive constant C such that for any x₀ in the support of f,

sup_{f∈Λ^s(M,x₀,δ) ∩ B_∞(A)} E(f̂(x₀) − f(x₀))² ≤ C(log n/n)^{2s/(2s+1)}.

By combining Theorems 1 and 3 we see that the block thresholding density estimator (11) adaptively achieves not only the optimal global rate of convergence over a wide range

of perturbed Besov spaces, but simultaneously attains the adaptive local convergence rate as well. The optimal global and local adaptivity cannot be attained if a larger block size is used. With a block length of order larger than log n (for example, l = (log n)²), the global rate may still be attained, but the local rate will not:

Theorem 4. Let f̂ be the wavelet kernel density estimator (11). Let R, l and c be as in Theorem 1, and suppose φ and ψ are as in Theorem 3. If 1/2 < s < N and l = (log n)^{1+r} for some r > 0, then for some x₀ in the support of f and some constant C,

sup_{f∈Λ^s(M,x₀,δ)} E(f̂(x₀) − f(x₀))² ≥ C(log n/n)^{2s/(2s+1)} (log n)^{2rs/(2s+1)}.

5. Simulation results

In this section we compare the block thresholded wavelet estimator from this paper with various other estimators via a simulation study. We will refer to the estimator at (11) simply as DenBlock. The threshold c supplied by the theorems for DenBlock is useful for theoretical purposes, but it is not practical for implementation. Since the thresholding in the estimator is essentially a bias-variance comparison, we keep the estimated wavelet coefficients when the average squared bias of the coefficients in a block exceeds the variance of those coefficients. Therefore, c is replaced with an estimate of the variance of the coefficients in a block. This variance is approximated by forming a pilot estimate of the density f and evaluating it at the center of the block. Two of the competing estimators examined come from Cai and Silverman [9]. These estimators, NeighCoeff and NeighBlock, are wavelet estimators where the comparison against a threshold is not based on a single coefficient (as is done in VisuShrink, for example) or on a single block of coefficients (as is done with DenBlock). Rather, neighboring coefficients or blocks are considered when making the threshold comparison for a particular coefficient or block.
These estimators were devised for nonparametric regression settings, but they are easily modified to the density estimation problem at hand. For NeighBlock, the block length is l = log n/2. However, the variance and squared bias used for thresholding are computed not just from the current block, but include information from its neighbors to the immediate left and right (when possible). The total block size used for making the thresholding decision is log n when the neighboring blocks are added in. The variance of the coefficients in the extended block is replaced by the pilot estimate of f evaluated at the center of the block, as before. NeighCoeff is NeighBlock with block length l = 1. The extended block is of length 3. Again, the appropriate substitution is made in the threshold comparison as before. For more information on these estimators and the thresholds used, see Cai and Silverman [9] and Chicken [10].
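The variance-based data-driven threshold described in this section can be sketched as follows. Since Var(β̂_{ij}) is approximately f(x)/n near the support of ψ_{ij}, a kernel pilot estimate of f at the block center stands in for the theoretical constant; the bandwidth, the Gaussian kernel, and all names are assumptions for illustration:

```python
import numpy as np

def data_driven_threshold(sample, block_center, bandwidth=0.1):
    """Pilot-based stand-in for the theoretical threshold c/n (Section 5).

    A Gaussian-kernel pilot estimate of the density f at the block's
    center approximates the variance f(x)/n of the empirical wavelet
    coefficients in that block. Illustrative sketch only.
    """
    n = len(sample)
    u = (block_center - sample) / bandwidth
    pilot = np.mean(np.exp(-0.5 * u ** 2) / np.sqrt(2.0 * np.pi)) / bandwidth
    return pilot / n
```

For a uniform sample on [0, 1] the pilot estimate at an interior point is close to 1, so the threshold is roughly 1/n, matching the c/n scale of the theoretical rule.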

Fig. 1. Test densities. Solid line is saw, dashed line is mixnorm, and dotted line is the double exponential.

Additionally, other estimators were also looked at. Biased cross-validation and unbiased cross-validation kernel estimators with normal and triangle kernels were all implemented. These estimators were compared against one another in terms of mean squared error on the three test densities given in Fig. 1. Saw is a combination of sums of uniform random variables, mixnorm is a mixture of three normal densities, and the last is a double exponential random variable. Formulas for these densities are in Chicken [10]. Results of simulations for some of these estimators on various sample sizes are given in Tables 1 and 2. In each table, the MSE of the estimate is given from a repetition of size 60. Only one of the 4 kernel methods is reported here, the unbiased cross-validation normal (UCVN) kernel estimator. The other kernel estimators mentioned above generally performed worse than this one, and the results are not included here. See [10] for additional simulation results. Table 1 shows MSEs for sample sizes n = 20, 50, 100, 500, 1000 and 2000. For saw, the wavelet estimators have lower MSEs than UCVN with the exception of sample size 100. Once the sample size hits 100, all three of the wavelet estimators have the same MSE. Examination of the thresholded coefficients reveals that for large sample sizes, all the detail coefficients calculated are 0. Since the coarse coefficients are the same for each wavelet estimate, the wavelet estimates are identical as well. At the lower sample sizes, NeighCoeff seems preferable, then DenBlock. This agrees well with the simulation results from Cai and Silverman [9]. For the mixnorm density, the UCVN is generally the best. This is perhaps not surprising given that the kernel for UCVN is from the same family as the density being estimated.
Again, NeighCoeff is the best of the wavelet methods, while there is no clear distinction between NeighBlock and DenBlock. On the final density, the double exponential, the UCVN is the worst of the estimates. The three wavelet estimators are approximately equivalent, with the exception of the lowest sample size. Since the wavelet estimators are approximately equivalent in terms of MSE at large sample sizes (n ≥ 50), it is instructive to examine how the estimators work with respect to small sample sizes. The results are given in Table 2. Here, the sample sizes are n = 10, 15, 20, 25, 30 and 40.

Table 1. MSE for the saw, mixnorm and double exponential densities, for the estimators DenBlock, NeighBlock, NeighCoeff and UCVN.

For saw, NeighCoeff is clearly the best estimator. It is only surpassed by UCVN at the very low size n = 10. NeighBlock has lower MSE than DenBlock for the lower sample sizes, while DenBlock takes the lead for the larger sample sizes. As with the sample sizes in Table 1, the UCVN is generally best at approximating mixnorm. NeighCoeff is the next best, while DenBlock and NeighBlock follow the same relation as they did for saw. On the double exponential, NeighCoeff is clearly best over the sample sizes in Table 2, followed by NeighBlock, DenBlock, and lastly, UCVN. The theorems in this paper show that DenBlock attains optimal convergence rates asymptotically. For the sample sizes considered here, however, NeighCoeff seems superior in terms of MSE. In particular, NeighCoeff is better than DenBlock at low sample sizes. The distinction between the wavelet estimators becomes blurred as the sample size increases. This suggests that NeighCoeff should be used for sample sizes under 50, and any of the three estimators are acceptable for larger n. Some example reconstructions are given in Figs. 2 and 3. Fig. 2 shows a comparison of DenBlock and UCVN with a sample size of 100 on the saw density. DenBlock does well at attaining the peaks and valleys of the density. UCVN clearly shows a density with four modes, but does not capture the same highs and lows that DenBlock does. Fig. 3 is a typical reconstruction of the mixnorm density. Here, DenBlock does a good job at estimating the peak on the left, but is too irregular over the central smooth portion. UCVN underestimates the peak, but outperforms DenBlock on the smoother portion of the density.

Table 2. MSE for the saw, mixnorm and double exponential densities at small sample sizes; columns as in Table 1.

Fig. 2. Typical reconstruction of saw, n = 100. Solid line is DenBlock estimate, dashed line is UCVN estimate, and dotted line is the actual density.

6. Proofs of theorems

In this section, proofs are given for Theorems 1, 3 and 4. Theorem 2's proof is omitted due to its similarity to the proof of Theorem 1. Before beginning, several preliminary results are necessary.

Fig. 3. Typical reconstruction of mixnorm, n = 100. Solid line is DenBlock estimate, dashed line is UCVN estimate and dotted line is the actual density.

6.1. Preliminaries

First, a simple lemma based on Minkowski's inequality:

Lemma 1. Let Y₁, Y₂, ..., Y_n be random variables. Then

E(Σ_{i=1}^n Y_i)² ≤ [Σ_{i=1}^n (EY_i²)^{1/2}]².

Second, a theorem from Talagrand [27] as stated in Hall et al. [19].

Theorem 5. Let U₁, U₂, ..., U_n be independent and identically distributed random variables. Let ε₁, ε₂, ..., ε_n be independent Rademacher random variables that are also independent of the U_i. Let G be a class of functions uniformly bounded by M. If there exist v, H > 0 such that for all n,

sup_{g∈G} var g(U) ≤ v and E sup_{g∈G} |Σ_{m=1}^n ε_m g(U_m)| ≤ nH,

then there exist universal constants C₁ and C₂ such that for all λ > 0,

P[sup_{g∈G} |n^{−1} Σ_{m=1}^n g(U_m) − Eg(U)| ≥ λ + C₂H] ≤ exp{−nC₁((λ²v^{−1}) ∧ (λM^{−1}))}.

Finally, a lemma from Hall et al. [3].

Lemma 2. If K(x, y) is a kernel satisfying condition (6), Q ∈ L₂, and J is a compact interval, then

E ∫_J (K̂₀(x) − K₀f(x))² dx ≤ ‖f‖_∞‖Q‖₂²|J|/n

and

E ∫_J (D̂_i(x) − D_i f(x))² dx ≤ 4‖f‖_∞‖Q‖₂² 2^i |J|/n,

where |J| is the length of the interval J.

6.2. Proof of Theorem 1

We will prove this theorem for q = ∞. For general q ≥ 1, the results will hold since B^s_{p,q} ⊆ B^s_{p,∞}. Let i_s be the integer such that 2^{i_s} ≤ n^{1/(2s+1)} < 2^{i_s+1}. Minkowski's inequality implies that

E‖f̂ − f‖² ≤ 4E‖K̂₀ − K₀f‖² + 4E‖Σ_{i=0}^{i_s} [D̂_i I(J_{ik})I(B̂_{ik} > cn^{−1}) − D_i f]‖² + 4E‖Σ_{i=i_s+1}^R [D̂_i I(J_{ik})I(B̂_{ik} > cn^{−1}) − D_i f]‖² + 4‖Σ_{i=R+1}^∞ D_i f‖² = T₁ + T₂ + T₃ + T₄.

T₁ is bounded by Lemma 2:

T₁ ≤ Cn^{−1}. (13)

Each of the remaining pieces T_i will be treated individually in their own sections.

6.2.1. Bound on T₂

To bound the nonlinear part T₂, note that Lemma 1 and Minkowski's inequality give

T₂ ≤ C[Σ_{i=0}^{i_s} E^{1/2} ∫ (D̂_i(x)I(B̂_{ik} > cn^{−1}) − D_i f(x))² dx]².

For a fixed i ≤ i_s, the orthogonality of wavelets gives

E ∫ (D̂_i(x)I(B̂_{ik} > cn^{−1}) − D_i f(x))² dx ≤ E ∫ (D̂_i(x) − D_i f(x))² dx + E Σ_k ∫_{J_{ik}} (D_i f(x))² dx I(B_{ik} ≤ 2cn^{−1}) + E Σ_k ∫_{J_{ik}} (D_i f(x))² dx I(B̂_{ik} ≤ cn^{−1})I(B_{ik} > 2cn^{−1}) = T₂₁ + T₂₂ + T₂₃.

As in Hall et al. [19], T₂₁ is bounded by Lemma 2. T₂₂ is bounded by the size of the indicator function and the fact that the number of intervals overlapping the support of f is no more than a constant times 2^i/l:

T₂₁, T₂₂ ≤ C2^i/n.

To bound T₂₃, the following lemma from Hall et al. [19] is useful.

Lemma 3. If ∫_{J_{ik}} (D_i f(x))² dx ≤ 2lc/n, then

{∫_{J_{ik}} D̂_i(x)² dx ≥ 2lc/n} ⊆ {∫_{J_{ik}} (D̂_i(x) − D_i f(x))² dx ≥ 0.08lc/n},

and if ∫_{J_{ik}} (D_i f(x))² dx > 2lc/n, then

{∫_{J_{ik}} D̂_i(x)² dx ≤ 2lc/n} ⊆ {∫_{J_{ik}} (D̂_i(x) − D_i f(x))² dx ≥ 0.16lc/n}.

Using this lemma,

T₂₃ ≤ E Σ_k ∫_{J_{ik}} (D_i f(x))² dx I(∫_{J_{ik}} (D̂_i(x) − D_i f(x))² dx ≥ 0.16lc/n).

Using Young's inequality with (6), and the fact that the length of the interval J_{ik} is a constant times l/2^i,

∫_{J_{ik}} (D_i f(x))² dx ≤ |J_{ik}| ‖D_i f‖²_∞ ≤ C‖f‖²_∞‖Q‖₁² l/2^i.

So,

T₂₃ ≤ Cl/2^i Σ_k P[(∫_{J_{ik}} (D̂_i(x) − D_i f(x))² dx)^{1/2} ≥ (0.16lc/n)^{1/2}]. (14)

To bound the above probability, Talagrand's theorem (Theorem 5) will be used. Similar to Hall et al. [19],

{∫_{J_{ik}} (D̂_i(x) − D_i f(x))² dx}^{1/2} = sup_{g∈G} |n^{−1} Σ_{m=1}^n ∫_{J_{ik}} g(x)D_i(x, X_m) dx − E ∫_{J_{ik}} g(x)D_i(x, X₁) dx|,

where

G = {∫_{J_{ik}} g(x)D_i(x, ·)I(j ∈ B_k(i)) dx : ‖g‖₂ ≤ 1}.

Talagrand's theorem will be used with

M = 2^{i/2}‖Q‖₂, v = ‖f‖_∞‖Q‖₁², H = (‖Q‖₁² l‖f‖_∞/n)^{1/2},

and

λ = (0.16lc/n)^{1/2} − C₂(‖Q‖₁² l‖f‖_∞/n)^{1/2} > 0.

The probability at (14) is then bounded by

P[(∫_{J_{ik}} (D̂_i(x) − D_i f(x))² dx)^{1/2} ≥ λ + C₂(‖Q‖₁² l‖f‖_∞/n)^{1/2}] ≤ exp{−nC₁((λ²/(‖f‖_∞‖Q‖₁²)) ∧ (λ/(2^{i/2}‖Q‖₂)))}.

For 0 ≤ i ≤ i_s and c and λ positive, the bound l ≤ n^{2s/(2s+1)} implies

λ²/(‖f‖_∞‖Q‖₁²) < λ/(2^{i/2}‖Q‖₂). (15)

Thus, for large enough n,

P[(∫_{J_{ik}} (D̂_i(x) − D_i f(x))² dx)^{1/2} ≥ (0.16lc/n)^{1/2}] ≤ Cn^{−δ}, (16)

where δ is the constant

δ = C₁((0.16c)^{1/2} − C₂‖Q‖₁A^{1/2})²/(A‖Q‖₁²)

and c is large enough to make δ > 0. Putting (14) and (16) together with the fact that the number of intervals J_{ik} that intersect the support of f is no more than C2^i/l,

T₂₃ ≤ Cn^{−δ}. (17)

All pieces are now available to bound T₂:

T₂ ≤ C[Σ_{i=0}^{i_s} (T₂₁ + T₂₂ + T₂₃)^{1/2}]² ≤ C[Σ_{i=0}^{i_s} (2^i/n + n^{−δ})^{1/2}]² ≤ C(2^{i_s}n^{−1} + i_s² n^{−δ}) = C(n^{−2s/(2s+1)} + (log₂ n^{1/(2s+1)})² n^{−δ}). (18)

6.2.2. Bound on T₃

As with T₂ before, write

T₃ ≤ C[Σ_{i=i_s+1}^R E^{1/2} ∫ (D̂_i(x)I(B̂_{ik} > cn^{−1}) − D_i f(x))² dx]².

For a fixed i, i_s + 1 ≤ i ≤ R,

E ∫ [D̂_i(x)I(x ∈ J_{ik})I(B̂_{ik} > cn^{−1}) − D_i f(x)]² dx ≤ E Σ_k ∫_{J_{ik}} (D̂_i(x) − D_i f(x))² dx I(B̂_{ik} > cn^{−1})I(B_{ik} > c/(2n)) + E Σ_k ∫_{J_{ik}} (D̂_i(x) − D_i f(x))² dx I(B̂_{ik} > cn^{−1})I(B_{ik} ≤ c/(2n)) + E Σ_k ∫_{J_{ik}} (D_i f(x))² dx I(B̂_{ik} ≤ cn^{−1})I(B_{ik} ≤ 2cn^{−1}) + E Σ_k ∫_{J_{ik}} (D_i f(x))² dx I(B̂_{ik} ≤ cn^{−1})I(B_{ik} > 2cn^{−1}) = T₃₁ + T₃₂ + T₃₃ + T₃₄.

By Lemma 3,

T₃₂ = E Σ_k ∫_{J_{ik}} (D̂_i(x) − D_i f(x))² dx I(B̂_{ik} > cn^{−1})I(B_{ik} ≤ c/(2n)) ≤ E Σ_k [∫_{J_{ik}} (D̂_i(x) − D_i f(x))² dx I(∫_{J_{ik}} (D̂_i(x) − D_i f(x))² dx ≥ 0.08lc/n)].

To bound this, we use the fact that for any nonnegative random variable Y,

E[Y I(Y > a)] = aP(Y > a) + ∫_a^∞ P(Y > y) dy.

The integrals in T₃₂ are of this form with

Y = ∫_{J_{ik}} (D̂_i(x) − D_i f(x))² dx and a = 0.08lc/n.

Using Talagrand's theorem as was done for the piece T₂₃,

P(Y > y) ≤ exp[−nC₁(((y^{1/2} − C₂H)²/(‖f‖_∞‖Q‖₁²)) ∧ ((y^{1/2} − C₂H)/(2^{i/2}‖Q‖₂)))],

and therefore

E[Y I(Y > a)] ≤ a exp[−nC₁(((a^{1/2} − C₂H)²/(‖f‖_∞‖Q‖₁²)) ∧ ((a^{1/2} − C₂H)/(2^{i/2}‖Q‖₂)))] + ∫_a^∞ P(Y > y) dy = T₃₂₁ + T₃₂₂.

For i_s + 1 ≤ i ≤ R and a^{1/2} − C₂H positive, the restriction on R below at (26) gives

(a^{1/2} − C₂H)²‖f‖_∞^{−1}‖Q‖₁^{−2} ≤ (a^{1/2} − C₂H)2^{−i/2}‖Q‖₂^{−1}. (19)

Note that a^{1/2} − C₂H > 0 implies that λ at (15) is positive as well. Therefore,

T₃₂₁ ≤ C(l/n) exp[−C₁((0.08c)^{1/2} − C₂‖Q‖₁A^{1/2})² l/(A‖Q‖₁²)] ≤ Cn^{−γ−1} log n, (20)

where γ is the constant

γ = C₁((0.08c)^{1/2} − C₂‖Q‖₁A^{1/2})²/(A‖Q‖₁²) − 1 (21)

and c is large enough to make γ positive. For T₃₂₂, let a₀ = (‖f‖_∞‖Q‖₁² 2^{−i/2}‖Q‖₂^{−1} + C₂H)² > 0. Then, if a ≤ a₀,

T₃₂₂ = ∫_a^{a₀} exp[−nC₁(y^{1/2} − C₂H)²/(‖f‖_∞‖Q‖₁²)] dy + ∫_{a₀}^∞ exp[−nC₁(y^{1/2} − C₂H)/(2^{i/2}‖Q‖₂)] dy = T₃₂₂₁ + T₃₂₂₂.

To bound T_{3221}, note that by a change of variables and an increase in the upper limit of integration,

T_{3221} ≤ ‖f‖_∞‖Q‖_1² (nC_1)^{−1} exp[ −nC_1 (a − CH)²/(‖f‖_∞‖Q‖_1²) ] + 2CH ∫_{a−CH}^∞ y(1/y) exp[ −nC_1 y²/(‖f‖_∞‖Q‖_1²) ] dy.  (22)

The first term on the right of (22) is bounded by Cn^{−1−γ}, where γ is the constant in (21). To bound the second term, use integration by parts:

2CH ∫_{a−CH}^∞ y(1/y) exp[ −nC_1 y²/(‖f‖_∞‖Q‖_1²) ] dy
  = CH ‖f‖_∞‖Q‖_1² [ nC_1 (a − CH) ]^{−1} exp[ −nC_1 (a − CH)²/(‖f‖_∞‖Q‖_1²) ]
  − CH ‖f‖_∞‖Q‖_1² (nC_1)^{−1} ∫_{a−CH}^∞ y^{−2} exp[ −nC_1 y²/(‖f‖_∞‖Q‖_1²) ] dy.

Since the integrand in the second term on the right side above is strictly positive, this integral is also bounded by Cn^{−γ−1}.

Using integration by parts on T_{3222},

T_{3222} ≤ C ( n^{−2} + n^{−1} 2^{i/2} (log n/n)^{1/2} + 2^i n^{−2} ) e^{−nd/2^i},

where d is the constant

d = C_1 ‖f‖_∞ ‖Q‖_1² / (2‖Q‖).  (23)

If a > a_0, then T_{322} ≤ T_{3222}. Therefore,

T_{32} = T_{321} + T_{3221} + T_{3222}
  ≤ ( 2^i/(2 log n) ) ( n^{−γ−1} log n ) + ( 2^i/(2 log n) ) ( n^{−2} + n^{−1} 2^{i/2} (log n/n)^{1/2} + 2^i n^{−2} ) e^{−nd/2^i}.  (24)

To bound T_{34}, observe that the only difference between T_{32} and T_{34} is the range of the index i. Therefore, by repeating the argument for T_{23}, the bound for T_{34} is the same as at (17):

T_{34} ≤ Cn^{−δ}.  (25)

This bound requires that

2^R ≤ (2L)² ‖Q‖² · 0.16c · ( 4C² ‖Q‖_1² (2L)^{−1} ‖Q‖² )^{−1} n (log n)^{−1} = D_c² n (log n)^{−1},  (26)

which implies the condition at (19).

The bound on T_3 is found in a similar manner to (18):

T_3 ≤ C Σ_{i=i_s+1}^{R} ( T_{31} + T_{32} + T_{33} + T_{34} )^{1/2}
  ≤ C Σ_{i=i_s+1}^{R} ( T_{31} + T_{33} )^{1/2} + Σ_{i=i_s+1}^{R} T_{32}^{1/2} + Σ_{i=i_s+1}^{R} T_{34}^{1/2}.

Observe that for i ≤ R,

2^i e^{−nd/2^i} ≤ 2^R e^{−nd/2^R} ≤ D_c² n (log n)^{−1} e^{−d log n/D_c²} = D_c² n (log n)^{−1} n^{−d/D_c²}.

Therefore, Σ 2^i e^{−nd/2^i} is less than or equal to some constant if d ≥ D_c². This condition is met if

c ≤ 0.32 (2L)^{−1} ( C‖Q‖_1 + ‖Q‖_1 C_1^{−1/2} )^{−2}.  (27)

Therefore, using the bound for T_{32} given at (24),

Σ_{i=i_s+1}^{R} T_{32}^{1/2} ≤ C ( n^{−γ/2} + n^{−1/2} log n ).  (28)

Using the bound for T_{34} found at (25),

Σ_{i=i_s+1}^{R} T_{34}^{1/2} ≤ C ( log 2^R ) n^{−δ/2}.  (29)

In Hall et al. [19] it is shown that

Σ_{i=i_s+1}^{R} ( T_{31} + T_{33} )^{1/2} ≤ Cn^{−s/(2s+1)}.  (30)

Therefore, using (28)–(30),

T_3 ≤ C [ n^{−s/(2s+1)} + n^{−γ/2} + ( log 2^R ) n^{−δ/2} ].  (31)

6.2.3. Bound on T_4

The final piece T_4 is easily bounded:

T_4 = C Σ_{i=R+1}^{∞} ‖D_i f‖² = C Σ_{i=R+1}^{∞} Σ_j β_ij².  (32)

Since f = f_1 + f_2, where f_1 ∈ B^s_{2,∞} and f_2 ∈ B^{s_1}_{(s+1/2)^{−1},∞},

β_ij = ∫ f(x) ψ_ij(x) dx = ∫ ( f_1(x) + f_2(x) ) ψ_ij(x) dx = β_{1ij} + β_{2ij},

and (32) becomes

T_4 ≤ C Σ_{i=R+1}^{∞} Σ_j ( β_{1ij}² + β_{2ij}² ).

From the bounds on wavelet coefficients given at (3) and (4),

Σ_j β_{1ij}² ≤ C 2^{−2is}  and  Σ_j β_{2ij}² ≤ C 2^{−2i(s_1 − 2s)}.

Therefore, using 2^R = D_c² n/l,

T_4 ≤ Cn^{−2s/(2s+1)},  (33)

provided that s_1 − 2s > s/(2s+1).

6.2.4. Determination of constants γ, δ, D, and c

Using the bounds from (13), (18), (31), and (33),

E‖f̂ − f‖² ≤ C [ n^{−2s/(2s+1)} + ( log 2^R ) n^{−δ} + n^{−γ} ].

For γ, n^{−γ} ≤ n^{−2s/(2s+1)} for all 1/2 < s < N if and only if γ ≥ 2N/(2N+1). The above constraint is met for all f in the space of interest if the value of the threshold c is set accordingly:

c ≥ A (0.08)^{−1} ( ( 2N/(C_1(2N+1)) )^{1/2} + C‖Q‖_1 )².  (34)

Note that the conditions at (19) and (15), that a − CH and λ be positive, are met if (34) holds. Since c must satisfy both (34) and (27), and (2L)^{−1} ≤ ‖f‖_∞ ≤ A, the value may be set as

c = A (0.08)^{−1} ( C_1^{−1/2} + C‖Q‖_1 )².

Let

D = ‖Q‖² ( 4‖Q‖_1² (2L) )^{−1} { C ( 4‖Q‖² A (2L)^{−1} )^{1/2} + (2A)^{1/2} ‖Q‖_1 C_1^{−1/2} }^{−2}.  (35)

The value for the constant D_c can then be taken to be D_c = D^{−1}. For δ, note that δ − γ is a positive constant, so

( log 2^R ) n^{−δ} = ( log 2^R ) n^{−γ} n^{−(δ−γ)} ≤ Cn^{−2s/(2s+1)}.

Therefore, using the bound for c at (27),

E‖f̂ − f‖² ≤ Cn^{−2s/(2s+1)},

and Theorem 1 is proved.

Proof of Theorem 3

To simplify the proof, assume that f is in Λ^s(M) rather than in the local Hölder classes Λ^s(M, x_0, δ) for points x_0 in the support of f. Write f̂(x_0) − f(x_0) as

f̂(x_0) − f(x_0) = Σ_j ( α̂_j − α_j ) φ_j(x_0)
  + Σ_{i≤i_s} Σ_{j∈B(·)} ( β̂_ij ψ_ij(x_0) I(B̂_i > cn^{−1}) − β_ij ψ_ij(x_0) )
  + Σ_{i=i_s+1}^{R} Σ_{j∈B(·)} ( β̂_ij ψ_ij(x_0) I(B̂_i > cn^{−1}) − β_ij ψ_ij(x_0) )
  − Σ_{i=R+1}^{∞} Σ_j β_ij ψ_ij(x_0)
  = L_1 + L_2 + L_3 + L_4,

where i_s is as before. Then

E( f̂(x_0) − f(x_0) )² ≤ C ( EL_1² + EL_2² + EL_3² + EL_4² ).

In each of these sums, the total number of indices j where the support of ψ_ij or φ_j intersects the point x_0 is no more than 2q_0 + 1, where q_0 is as in (5). This fact will be used several times in the following proof.

Bound on L_1

Recalling that ‖φ‖_2 = 1 and that φ is bounded,

EL_1² ≤ CE Σ_j [ ( α̂_j − α_j ) φ_j(x_0) ]² ≤ C Σ_j E ( α̂_j − α_j )² = CE ∫ ( Σ_j ( α̂_j − α_j ) φ_j(x) )² dx.

Using the orthogonality of the φ_j,

EL_1² ≤ CE ∫ ( Σ_j α̂_j φ_j(x) − Σ_j α_j φ_j(x) )² dx = CE ∫ { K̂_0(x) − K_0 f(x) }² dx.

By applying Lemma 2,

EL_1² ≤ Cn^{−1}.  (36)

Bound on L_2

To bound L_2, break it into the following sums:

EL_2² = E ( Σ_{i≤i_s} Σ_{j∈B(·)} ( β̂_ij − β_ij ) ψ_ij(x_0) I(B̂_i > cn^{−1}) − Σ_{i≤i_s} Σ_j β_ij ψ_ij(x_0) I(B̂_i ≤ cn^{−1}) )²
  = E( L_{21} + L_{22} )² ≤ CEL_{21}² + CEL_{22}².

To bound L_{21}, first apply Lemma 1:

EL_{21}² = E ( Σ_{i≤i_s} Σ_{j∈B(·)} ( β̂_ij − β_ij ) ψ_ij(x_0) I(B̂_i > cn^{−1}) )²
  ≤ ( Σ_{i≤i_s} E^{1/2} ( Σ_{j∈B(·)} ( β̂_ij − β_ij ) ψ_ij(x_0) I(B̂_i > cn^{−1}) )² )²
  ≤ ( Σ_{i≤i_s} ‖ψ‖_∞ 2^{i/2} ( E Σ_j ( β̂_ij − β_ij )² )^{1/2} )².  (37)

Now, E Σ_j ( β̂_ij − β_ij )² is of order n^{−1}:

E Σ_j ( β̂_ij − β_ij )² = Σ_j var β̂_ij ≤ Σ_j n^{−1} var ψ_ij(X_1) ≤ Cn^{−1} Σ_j ∫ ψ_ij²(x) dx.  (38)
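The order-n^{−1} claim at (38) can be checked directly. Assuming β̂_ij is the usual empirical coefficient n^{−1} Σ_{m=1}^n ψ_ij(X_m) (the standard construction for density estimation, as used throughout), a one-line verification:

```latex
% Variance of the empirical wavelet coefficient
% \hat\beta_{ij} = n^{-1}\sum_{m=1}^n \psi_{ij}(X_m), with X_m i.i.d. with density f.
\begin{align*}
\operatorname{var}\hat\beta_{ij}
  &= n^{-1}\operatorname{var}\psi_{ij}(X_1)
   \le n^{-1}\,E\,\psi_{ij}^2(X_1)
   = n^{-1}\int \psi_{ij}^2(x)\,f(x)\,dx \\
  &\le n^{-1}\,\|f\|_\infty \int \psi_{ij}^2(x)\,dx
   = n^{-1}\,\|f\|_\infty ,
\end{align*}
```

using ∫ψ_ij² = 1; since at most 2q_0 + 1 indices j are active at any point, the sum over j remains of order n^{−1}.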

Using this result, (37) becomes

EL_{21}² ≤ C [ Σ_{i≤i_s} 2^{i/2} ( n^{−1} )^{1/2} ]² ≤ Cn^{−2s/(2s+1)}.

The bound for L_{22} is found by breaking it into two pieces:

L_{22} = Σ_{i≤i_s} Σ_{j∈B(·)} β_ij ψ_ij(x_0) I(B̂_i ≤ cn^{−1}) I(B_i > 2cn^{−1}) + Σ_{i≤i_s} Σ_{j∈B(·)} β_ij ψ_ij(x_0) I(B̂_i ≤ cn^{−1}) I(B_i ≤ 2cn^{−1})
  = L_{221} + L_{222}.

The piece L_{221} is bounded using Talagrand's theorem. First, note that by Lemma 3 and the fact that f ∈ Λ^s(M) implies β_ij ≤ C 2^{−i(s+1/2)}, we have

E ( Σ_{j∈B(·)} β_ij ψ_ij(x_0) I(B̂_i ≤ cn^{−1}) I(B_i > 2cn^{−1}) )²
  ≤ CE ( Σ_{j∈B(·)} 2^{−i(s+1/2)} 2^{i/2} ‖ψ‖_∞ I(B̂_i ≤ cn^{−1}) I(B_i > 2cn^{−1}) )²
  ≤ C 2^{−2is} P ( ∫ ( D̂_i(x) − D_i f(x) )² dx > 0.16c log n/n ).

Then

P ( ∫ ( D̂_i(x) − D_i f(x) )² dx > 0.16c log n/n ) ≤ Cn^{−δ},

where δ is as before. Therefore, using this bound on the probability and Lemma 1,

EL_{221}² ≤ ( Σ_{i≤i_s} E^{1/2} ( Σ_{j∈B(·)} β_ij ψ_ij(x_0) I(B̂_i ≤ cn^{−1}) I(B_i > 2cn^{−1}) )² )²
  ≤ C ( Σ_{i≤i_s} 2^{−is} n^{−δ/2} )² ≤ Cn^{−δ}.

To bound L_{222}, observe that

E ( Σ_{j∈B(·)} β_ij ψ_ij(x_0) I(B̂_i ≤ cn^{−1}) I(B_i ≤ 2cn^{−1}) )² ≤ C 2^i ‖ψ‖_∞² Σ_{j∈B(·)} β_ij² I(B_i ≤ 2cn^{−1}).

Now, B_i ≤ 2cn^{−1} implies that

Σ_{j∈B(·)} β_ij² ≤ C l/n.

By virtue of f being in Λ^s(M),

Σ_{j∈B(·)} β_ij² ≤ C 2^{−2i(s+1/2)}.

Therefore,

Σ_{j∈B(·)} β_ij² I(B_i ≤ 2cn^{−1}) ≤ C ( n^{−1} log n ∧ 2^{−2i(s+1/2)} ),

and so

E ( Σ_{j∈B(·)} β_ij ψ_ij(x_0) I(B̂_i ≤ cn^{−1}) I(B_i ≤ 2cn^{−1}) )² ≤ C 2^i ( n^{−1} log n ∧ 2^{−2i(s+1/2)} ).

Therefore, the bound on L_{222} is (after an application of Lemma 1)

EL_{222}² ≤ C [ Σ_{i≤i_s} 2^{i/2} ( n^{−1} log n ∧ 2^{−2i(s+1/2)} )^{1/2} ]².

Now, n^{−1} log n ≤ 2^{−2i(s+1/2)} whenever 2^i ≤ ( n (log n)^{−1} )^{1/(2s+1)}. Therefore, letting i* be the integer such that 2^{i*} ≤ ( n (log n)^{−1} )^{1/(2s+1)} < 2^{i*+1},

EL_{222}² ≤ C [ Σ_{i≤i*} 2^{i/2} ( log n/n )^{1/2} + Σ_{i=i*+1}^{i_s} 2^{i/2} 2^{−i(s+1/2)} ]² ≤ C ( log n/n )^{2s/(2s+1)}.

The bound on EL_2² is therefore C ( n^{−δ} + ( n^{−1} log n )^{2s/(2s+1)} ),
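As a check on the stated rate, with 2^{i*} ≍ (n(log n)^{−1})^{1/(2s+1)} the two sums in the bound on EL_{222}² balance at the same order:

```latex
% Balancing the two geometric sums at the cut level 2^{i^*} \asymp (n/\log n)^{1/(2s+1)}.
\begin{align*}
\sum_{i\le i^*} 2^{i/2}\Big(\frac{\log n}{n}\Big)^{1/2}
  &\le C\,2^{i^*/2}\Big(\frac{\log n}{n}\Big)^{1/2}
   \asymp \Big(\frac{\log n}{n}\Big)^{\frac{1}{2}-\frac{1}{2(2s+1)}}
   = \Big(\frac{\log n}{n}\Big)^{s/(2s+1)}, \\
\sum_{i> i^*} 2^{i/2}\,2^{-i(s+1/2)}
  &= \sum_{i>i^*} 2^{-is}
   \le C\,2^{-i^* s}
   \asymp \Big(\frac{\log n}{n}\Big)^{s/(2s+1)},
\end{align*}
```

so both pieces are of order (log n/n)^{s/(2s+1)}, and squaring gives EL_{222}² ≤ C(log n/n)^{2s/(2s+1)} as claimed.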

and hence

EL_2² ≤ C [ n^{−δ} + ( n^{−1} log n )^{2s/(2s+1)} ].  (39)

Bound on L_3

As with L_2, break L_3 into the following parts:

EL_3² = E ( Σ_{i=i_s+1}^{R} Σ_{j∈B(·)} ( β̂_ij − β_ij ) ψ_ij(x_0) I(B̂_i > cn^{−1}) − Σ_{i=i_s+1}^{R} Σ_{j∈B(·)} β_ij ψ_ij(x_0) I(B̂_i ≤ cn^{−1}) )²
  = E( L_{31} + L_{32} )² ≤ CEL_{31}² + CEL_{32}².

Additionally, L_{31} must be divided as well:

EL_{31}² ≤ CE ( Σ_{i=i_s+1}^{R} Σ_{j∈B(·)} ( β̂_ij − β_ij ) ψ_ij(x_0) I(B̂_i > cn^{−1}) I(B_i > cn^{−1}/2) )²
  + CE ( Σ_{i=i_s+1}^{R} Σ_{j∈B(·)} ( β̂_ij − β_ij ) ψ_ij(x_0) I(B̂_i > cn^{−1}) I(B_i ≤ cn^{−1}/2) )²
  = CEL_{311}² + CEL_{312}².

To take care of L_{311}, notice that

E ( Σ_{j∈B(·)} ( β̂_ij − β_ij ) ψ_ij(x_0) I(B̂_i > cn^{−1}) I(B_i > cn^{−1}/2) )²
  ≤ C n c^{−1} B_i E ( Σ_{j∈B(·)} ( β̂_ij − β_ij ) ψ_ij(x_0) )².

As in (38),

E ( Σ_{j∈B(·)} ( β̂_ij − β_ij ) ψ_ij(x_0) )² ≤ 2^i ‖ψ‖_∞² E Σ_{j∈B(·)} ( β̂_ij − β_ij )² ≤ C 2^i/n.

Since B_i = l^{−1} Σ_{j∈B(·)} β_ij² and β_ij ≤ C 2^{−i(s+1/2)},

the bound for EL_{311}² then follows from an application of Lemma 1:

EL_{311}² ≤ C ( Σ_{i=i_s+1}^{R} ( n c^{−1} 2^{−2i(s+1/2)} 2^i n^{−1} )^{1/2} )² ≤ C ( Σ_{i=i_s+1}^{R} 2^{−is} )² ≤ Cn^{−2s/(2s+1)}.

To bound EL_{312}², Talagrand's theorem will be used. To begin, note that by Lemma 3,

E ( Σ_{j∈B(·)} ( β̂_ij − β_ij ) ψ_ij(x_0) I(B̂_i > cn^{−1}) I(B_i ≤ cn^{−1}/2) )²
  ≤ C 2^i ‖ψ‖_∞² E ( Σ_{j∈B(·)} ( β̂_ij − β_ij )² I(B̂_i > cn^{−1}) I(B_i ≤ cn^{−1}/2) )
  ≤ C 2^i E [ ∫_{J_i} ( D̂_i(x) − D_i f(x) )² dx · I ( ∫_{J_i} ( D̂_i(x) − D_i f(x) )² dx > 0.08c log n/n ) ]
  = C 2^i T_{32}.

This is bounded just as T_{32} was at (24). The number of indices here is no more than a constant, giving a bound of

C 2^i [ n^{−γ−1} log n + ( n^{−2} + n^{−1} 2^{i/2} ( log n/n )^{1/2} + 2^i n^{−2} ) exp( −nd/2^i ) ],

where γ is as in (21) and d is as in (23). Therefore, repeating the argument for the piece T_{32} at (28),

EL_{312}² ≤ C ( n^{−γ} + n^{−2} 2^R ).

Only L_{32} still needs bounding:

EL_{32}² ≤ C ( Σ_{i=i_s+1}^{R} E^{1/2} ( Σ_{j∈B(·)} β_ij ψ_ij(x_0) )² )² ≤ C ( Σ_{i=i_s+1}^{R} 2^{i/2} ‖ψ‖_∞ 2^{−i(s+1/2)} )² ≤ Cn^{−2s/(2s+1)}.

The bound for L_3 is then

EL_3² ≤ C ( n^{−2s/(2s+1)} + n^{−γ} ).  (40)

Bound on L_4

L_4 is bounded much like L_3 was. The only difference is the range of the index i and the lack of an indicator function:

EL_4² = E ( Σ_{i=R+1}^{∞} Σ_{j∈B(·)} β_ij ψ_ij(x_0) )² ≤ C ( Σ_{i=R+1}^{∞} 2^{−is} )² ≤ Cn^{−2s/(2s+1)}.  (41)

Determination of constants γ, δ, and c

From the bounds derived at (36), (39), (40), and (41),

E( f(x_0) − f̂(x_0) )² ≤ C ( n^{−δ} + ( log n/n )^{2s/(2s+1)} + n^{−γ} ).

As before, we need γ and δ to be larger than 2N/(2N+1) and (27) to hold, so

c ≥ A (0.08)^{−1} ( C_1^{−1/2} + C‖Q‖_1 )²

will suffice, as well as imply the necessary conditions on γ and δ. This implies

E( f(x_0) − f̂(x_0) )² ≤ C ( log n/n )^{2s/(2s+1)}

and the proof is complete.

Proof of Theorem 4

Suppose the block length l in the wavelet estimator (11) is taken to be of order larger than log n, say l = (log n)^{1+r} for some r > 0. Then, assume that f is a density function with one detail coefficient, β_{i*j*}, which is as large as possible, and no other non-zero coefficients β_ij overlapping the support of ψ_{i*j*}. Outside this support, f has sufficient mass to ensure it integrates to one. This function f is desired to be in the space Λ^s(M, x_0, δ), so let β_{i*j*} = 2^{−i*(s+1/2)}. Let i* be such that 2^{i*} = (n/l)^{1/(2s+1)}, and j* an integer such that

ψ_{i*j*}(x_0) ≥ c_2 > 0 for some constant c_2. Let

S = sup_{f∈Λ^s(M)} ( E( f̂(x_0) − f(x_0) )² )^{1/2} ≥ ( E( f̂(x_0) − f(x_0) )² )^{1/2}

= ( E [ ( β̂_{i*j*} I(B̂* > c/n) − β_{i*j*} ) ψ_{i*j*}(x_0)
  + Σ_j ( α̂_j − α_j ) φ_j(x_0)
  + Σ_{i≤R} Σ_{j∈B(·)} ( β̂_ij I(B̂_i > c/n) − β_ij ) ψ_ij(x_0)
  + Σ_{i>R} Σ_j β_ij ψ_ij(x_0) ]² )^{1/2},

where B* is the block containing the nonzero detail coefficient, and the third sum is over the remaining blocks. Since ( E(Y + Σ X_i)² )^{1/2} ≥ ( EY² )^{1/2} − Σ_i ( EX_i² )^{1/2} for random variables X_i and Y [19],

S ≥ E^{1/2} ( ( β̂_{i*j*} I(B̂* > c/n) − β_{i*j*} ) ψ_{i*j*}(x_0) )²
  − E^{1/2} ( Σ_j ( α̂_j − α_j ) φ_j(x_0) )²
  − E^{1/2} ( Σ_{i≤R} Σ_{j∈B(·)} ( β̂_ij − β_ij ) ψ_ij(x_0) I(B̂_i > c/n) )²
  − E^{1/2} ( Σ_{i>R} Σ_j β_ij ψ_ij(x_0) )²
  = U_1 − U_2 − U_3 − U_4.

Now, U_2 ≤ C'n^{−1/2} by Lemma 2. U_3 is easily seen to be bounded by C'n^{−s/(2s+1)} by noting that, in the previous sections, EL_{21}² ≤ Cn^{−2s/(2s+1)}, EL_{31}² ≤ C( n^{−γ} ((log n)^{1+r})^{−1} +

n^{−2s/(2s+1)} ), and the other relevant pieces are zero. The values of γ and δ may be taken the same as in the case where l = log n. U_4 ≤ C'n^{−s/(2s+1)} by repeating the argument for L_4. For γ large enough (i.e., threshold c as chosen), U_2 + U_3 + U_4 ≤ C'n^{−s/(2s+1)}. For U_1,

U_1 = [ E( β̂_{i*j*} − β_{i*j*} )² ψ²_{i*j*}(x_0) I(B̂* > c/n) + Eβ²_{i*j*} ψ²_{i*j*}(x_0) I(B̂* ≤ c/n) ]^{1/2}
  ≥ ( Eβ²_{i*j*} ψ²_{i*j*}(x_0) I(B̂* < c/n) I(B* ≤ 2c/n) )^{1/2}
  = [ β²_{i*j*} ψ²_{i*j*}(x_0) E( I(B̂* < c/n) I(B* ≤ 2c/n) ) ]^{1/2},

where B* is the mean of the true squared coefficients in the block containing the nonzero coefficient. By Talagrand's theorem and Lemma 3,

E( I(B̂* ≥ c/n) I(B* ≤ 2c/n) ) ≤ Cn^{−γ},

where γ is as before. For suitable n, this is less than 1/2. Therefore, the expectation of the indicators in the lower bound for U_1 is at least 1/2. So,

U_1 ≥ C ( β²_{i*j*} ψ²_{i*j*}(x_0) )^{1/2} = C ( l/n )^{s/(2s+1)}.

Therefore,

S ≥ U_1 − U_2 − U_3 − U_4 ≥ C ( l/n )^{s/(2s+1)},

or

sup_{f∈Λ^s(M,x_0,δ)} E( f̂(x_0) − f(x_0) )² ≥ C ( log n/n )^{2s/(2s+1)} ( log n )^{2rs/(2s+1)}.

References

[1] L. Brown, M. Low, Asymptotic equivalence of nonparametric regression and white noise, Ann. Statist. 24 (1996) 2384–2398.
[2] L. Brown, M. Low, A constrained risk inequality with applications to nonparametric functional estimation, Ann. Statist. 24 (1996) 2524–2535.
[3] T. Cai, Adaptive wavelet estimation: a block thresholding and oracle inequality approach, Ann. Statist. 27 (1999) 898–924.
[4] T. Cai, Wavelet regression via block thresholding: adaptivity and the choice of block size and threshold level, Technical Report #99-14, Department of Statistics, Purdue University, 1999.
[5] T. Cai, On adaptive estimation of a derivative and other related linear inverse problems, J. Statist. Plann. Inference 108 (2002) 329–349.
[6] T. Cai, On block thresholding in wavelet regression: adaptivity, block size and threshold level, Statist. Sinica 12 (2002) 1241–1273.
[7] T. Cai, M. Low, Nonparametric function estimation over shrinking neighborhoods: superefficiency and adaptation, Technical Report, Department of Statistics, University of Pennsylvania, 2002.
[8] T. Cai, M. Low, L.
Zhao, Tradeoffs between global and local risks in nonparametric function estimation, Technical Report, Department of Statistics, University of Pennsylvania, 2001.

[9] T. Cai, B. Silverman, Incorporating information on neighboring coefficients into wavelet estimation, Sankhyā Ser. B 63 (2001) 127–148.
[10] E. Chicken, Simulation results for wavelet-based density estimators, Technical Report M968, Department of Statistics, Florida State University, 2004.
[11] I. Daubechies, Ten Lectures on Wavelets, SIAM, Philadelphia, 1992.
[12] D. Donoho, I. Johnstone, Adapting to unknown smoothness via wavelet shrinkage, J. Amer. Statist. Assoc. 90 (1995) 1200–1224.
[13] D. Donoho, I. Johnstone, Minimax estimation via wavelet shrinkage, Ann. Statist. 26 (1998) 879–921.
[14] D. Donoho, R. Liu, Geometrizing rates of convergence III, Ann. Statist. 19 (1991) 668–701.
[15] S. Efromovich, Nonparametric estimation of a density of unknown smoothness, Theory Probab. Appl. 30 (1985) 557–568.
[16] S. Efromovich, Nonparametric Curve Estimation, Springer, New York, 1999.
[17] S. Efromovich, On blockwise shrinkage estimation, Technical Report, 2002.
[18] S. Efromovich, M. Low, Adaptive estimates of linear functionals, Probab. Theory Related Fields 98 (1994) 261–275.
[19] P. Hall, G. Kerkyacharian, D. Picard, Block threshold rules for curve estimation using kernel and wavelet methods, Ann. Statist. 26 (1998) 922–942.
[20] P. Hall, G. Kerkyacharian, D. Picard, On the minimax optimality of block thresholded wavelet estimators, Statist. Sinica 9 (1999) 33–49.
[21] I. Ibragimov, R. Hasminskii, Nonparametric estimation of the values of a linear functional in Gaussian white noise, Theory Probab. Appl. 29 (1984) 18–32.
[22] G. Kerkyacharian, D. Picard, K. Tribouley, L^p adaptive density estimation, Bernoulli 2 (1996) 229–247.
[23] O. Lepski, On a problem of adaptive estimation in white Gaussian noise, Theory Probab. Appl. 35 (1990) 454–466.
[24] O. Lepski, E. Mammen, V. Spokoiny, Optimal spatial adaptation to inhomogeneous smoothness: an approach based on kernel estimates with variable bandwidth selectors, Ann. Statist. 25 (1997) 929–947.
[25] O. Lepski, V. Spokoiny, Optimal pointwise adaptive methods in nonparametric estimation, Ann. Statist. 25 (1997) 2512–2546.
[26] Y. Meyer, Wavelets and Operators, Cambridge University Press, Cambridge, 1992.
[27] M.
Talagrand, Sharper bounds for Gaussian and empirical processes, Ann. Probab. 22 (1994) 28–76.
[28] H. Triebel, Theory of Function Spaces, Birkhäuser, Basel, 1983.
[29] H. Triebel, Theory of Function Spaces II, Birkhäuser, Basel, 1992.


More information

41903: Introduction to Nonparametrics

41903: Introduction to Nonparametrics 41903: Notes 5 Introduction Nonparametrics fundamentally about fitting flexible models: want model that is flexible enough to accommodate important patterns but not so flexible it overspecializes to specific

More information

Wavelets and Image Compression. Bradley J. Lucier

Wavelets and Image Compression. Bradley J. Lucier Wavelets and Image Compression Bradley J. Lucier Abstract. In this paper we present certain results about the compression of images using wavelets. We concentrate on the simplest case of the Haar decomposition

More information

Vectors in Function Spaces

Vectors in Function Spaces Jim Lambers MAT 66 Spring Semester 15-16 Lecture 18 Notes These notes correspond to Section 6.3 in the text. Vectors in Function Spaces We begin with some necessary terminology. A vector space V, also

More information

Nonparametric Estimation of Functional-Coefficient Autoregressive Models

Nonparametric Estimation of Functional-Coefficient Autoregressive Models Nonparametric Estimation of Functional-Coefficient Autoregressive Models PEDRO A. MORETTIN and CHANG CHIANN Department of Statistics, University of São Paulo Introduction Nonlinear Models: - Exponential

More information

4 Sums of Independent Random Variables

4 Sums of Independent Random Variables 4 Sums of Independent Random Variables Standing Assumptions: Assume throughout this section that (,F,P) is a fixed probability space and that X 1, X 2, X 3,... are independent real-valued random variables

More information

Let p 2 ( t), (2 t k), we have the scaling relation,

Let p 2 ( t), (2 t k), we have the scaling relation, Multiresolution Analysis and Daubechies N Wavelet We have discussed decomposing a signal into its Haar wavelet components of varying frequencies. The Haar wavelet scheme relied on two functions: the Haar

More information

Analysis of Redundant-Wavelet Multihypothesis for Motion Compensation

Analysis of Redundant-Wavelet Multihypothesis for Motion Compensation Analysis of Redundant-Wavelet Multihypothesis for Motion Compensation James E. Fowler Department of Electrical and Computer Engineering GeoResources Institute GRI Mississippi State University, Starville,

More information

On the fast algorithm for multiplication of functions in the wavelet bases

On the fast algorithm for multiplication of functions in the wavelet bases Published in Proceedings of the International Conference Wavelets and Applications, Toulouse, 1992; Y. Meyer and S. Roques, edt., Editions Frontieres, 1993 On the fast algorithm for multiplication of functions

More information

Comments on \Wavelets in Statistics: A Review" by. A. Antoniadis. Jianqing Fan. University of North Carolina, Chapel Hill

Comments on \Wavelets in Statistics: A Review by. A. Antoniadis. Jianqing Fan. University of North Carolina, Chapel Hill Comments on \Wavelets in Statistics: A Review" by A. Antoniadis Jianqing Fan University of North Carolina, Chapel Hill and University of California, Los Angeles I would like to congratulate Professor Antoniadis

More information

AN INEQUALITY FOR TAIL PROBABILITIES OF MARTINGALES WITH BOUNDED DIFFERENCES

AN INEQUALITY FOR TAIL PROBABILITIES OF MARTINGALES WITH BOUNDED DIFFERENCES Lithuanian Mathematical Journal, Vol. 4, No. 3, 00 AN INEQUALITY FOR TAIL PROBABILITIES OF MARTINGALES WITH BOUNDED DIFFERENCES V. Bentkus Vilnius Institute of Mathematics and Informatics, Akademijos 4,

More information

Existence and Uniqueness

Existence and Uniqueness Chapter 3 Existence and Uniqueness An intellect which at a certain moment would know all forces that set nature in motion, and all positions of all items of which nature is composed, if this intellect

More information

Energy method for wave equations

Energy method for wave equations Energy method for wave equations Willie Wong Based on commit 5dfb7e5 of 2017-11-06 13:29 Abstract We give an elementary discussion of the energy method (and particularly the vector field method) in the

More information

Finite-dimensional spaces. C n is the space of n-tuples x = (x 1,..., x n ) of complex numbers. It is a Hilbert space with the inner product

Finite-dimensional spaces. C n is the space of n-tuples x = (x 1,..., x n ) of complex numbers. It is a Hilbert space with the inner product Chapter 4 Hilbert Spaces 4.1 Inner Product Spaces Inner Product Space. A complex vector space E is called an inner product space (or a pre-hilbert space, or a unitary space) if there is a mapping (, )

More information

Sobolev spaces. May 18

Sobolev spaces. May 18 Sobolev spaces May 18 2015 1 Weak derivatives The purpose of these notes is to give a very basic introduction to Sobolev spaces. More extensive treatments can e.g. be found in the classical references

More information

Almost sure limit theorems for U-statistics

Almost sure limit theorems for U-statistics Almost sure limit theorems for U-statistics Hajo Holzmann, Susanne Koch and Alesey Min 3 Institut für Mathematische Stochasti Georg-August-Universität Göttingen Maschmühlenweg 8 0 37073 Göttingen Germany

More information

Function spaces on the Koch curve

Function spaces on the Koch curve Function spaces on the Koch curve Maryia Kabanava Mathematical Institute Friedrich Schiller University Jena D-07737 Jena, Germany Abstract We consider two types of Besov spaces on the Koch curve, defined

More information

CLASSIFYING THE SMOOTHNESS OF IMAGES: THEORY AND APPLICATIONS TO WAVELET IMAGE PROCESSING

CLASSIFYING THE SMOOTHNESS OF IMAGES: THEORY AND APPLICATIONS TO WAVELET IMAGE PROCESSING CLASSIFYING THE SMOOTHNESS OF IMAGES: THEORY AND APPLICATIONS TO WAVELET IMAGE PROCESSING RONALD A. DeVORE 1, University of South Carolina, and BRADLEY J. LUCIER 2, Purdue University Abstract Devore, Jawerth,

More information

Lecture 3: Statistical Decision Theory (Part II)

Lecture 3: Statistical Decision Theory (Part II) Lecture 3: Statistical Decision Theory (Part II) Hao Helen Zhang Hao Helen Zhang Lecture 3: Statistical Decision Theory (Part II) 1 / 27 Outline of This Note Part I: Statistics Decision Theory (Classical

More information

Supplementary Material for Nonparametric Operator-Regularized Covariance Function Estimation for Functional Data

Supplementary Material for Nonparametric Operator-Regularized Covariance Function Estimation for Functional Data Supplementary Material for Nonparametric Operator-Regularized Covariance Function Estimation for Functional Data Raymond K. W. Wong Department of Statistics, Texas A&M University Xiaoke Zhang Department

More information

Nonparametric Methods

Nonparametric Methods Nonparametric Methods Michael R. Roberts Department of Finance The Wharton School University of Pennsylvania July 28, 2009 Michael R. Roberts Nonparametric Methods 1/42 Overview Great for data analysis

More information

KUMMER S LEMMA KEITH CONRAD

KUMMER S LEMMA KEITH CONRAD KUMMER S LEMMA KEITH CONRAD Let p be an odd prime and ζ ζ p be a primitive pth root of unity In the ring Z[ζ], the pth power of every element is congruent to a rational integer mod p, since (c 0 + c 1

More information

A DATA-DRIVEN BLOCK THRESHOLDING APPROACH TO WAVELET ESTIMATION

A DATA-DRIVEN BLOCK THRESHOLDING APPROACH TO WAVELET ESTIMATION The Annals of Statistics 2009, Vol. 37, No. 2, 569 595 DOI: 10.1214/07-AOS538 Institute of Mathematical Statistics, 2009 A DATA-DRIVEN BLOCK THRESHOLDING APPROACH TO WAVELET ESTIMATION BY T. TONY CAI 1

More information

GIOVANNI COMI AND MONICA TORRES

GIOVANNI COMI AND MONICA TORRES ONE-SIDED APPROXIMATION OF SETS OF FINITE PERIMETER GIOVANNI COMI AND MONICA TORRES Abstract. In this note we present a new proof of a one-sided approximation of sets of finite perimeter introduced in

More information

Gaussian Processes. Le Song. Machine Learning II: Advanced Topics CSE 8803ML, Spring 2012

Gaussian Processes. Le Song. Machine Learning II: Advanced Topics CSE 8803ML, Spring 2012 Gaussian Processes Le Song Machine Learning II: Advanced Topics CSE 8803ML, Spring 01 Pictorial view of embedding distribution Transform the entire distribution to expected features Feature space Feature

More information

Construction of Multivariate Compactly Supported Orthonormal Wavelets

Construction of Multivariate Compactly Supported Orthonormal Wavelets Construction of Multivariate Compactly Supported Orthonormal Wavelets Ming-Jun Lai Department of Mathematics The University of Georgia Athens, GA 30602 April 30, 2004 Dedicated to Professor Charles A.

More information

Estimation of cumulative distribution function with spline functions

Estimation of cumulative distribution function with spline functions INTERNATIONAL JOURNAL OF ECONOMICS AND STATISTICS Volume 5, 017 Estimation of cumulative distribution function with functions Akhlitdin Nizamitdinov, Aladdin Shamilov Abstract The estimation of the cumulative

More information

THE LINDEBERG-FELLER CENTRAL LIMIT THEOREM VIA ZERO BIAS TRANSFORMATION

THE LINDEBERG-FELLER CENTRAL LIMIT THEOREM VIA ZERO BIAS TRANSFORMATION THE LINDEBERG-FELLER CENTRAL LIMIT THEOREM VIA ZERO BIAS TRANSFORMATION JAINUL VAGHASIA Contents. Introduction. Notations 3. Background in Probability Theory 3.. Expectation and Variance 3.. Convergence

More information

Uniform Confidence Sets for Nonparametric Regression with Application to Cosmology

Uniform Confidence Sets for Nonparametric Regression with Application to Cosmology Uniform Confidence Sets for Nonparametric Regression with Application to Cosmology Christopher R. Genovese Department of Statistics Carnegie Mellon University http://www.stat.cmu.edu/ ~ genovese/ Larry

More information

Introduction to Machine Learning

Introduction to Machine Learning Introduction to Machine Learning 3. Instance Based Learning Alex Smola Carnegie Mellon University http://alex.smola.org/teaching/cmu2013-10-701 10-701 Outline Parzen Windows Kernels, algorithm Model selection

More information

Regularity of the density for the stochastic heat equation

Regularity of the density for the stochastic heat equation Regularity of the density for the stochastic heat equation Carl Mueller 1 Department of Mathematics University of Rochester Rochester, NY 15627 USA email: cmlr@math.rochester.edu David Nualart 2 Department

More information

UNIFORM BOUNDS FOR BESSEL FUNCTIONS

UNIFORM BOUNDS FOR BESSEL FUNCTIONS Journal of Applied Analysis Vol. 1, No. 1 (006), pp. 83 91 UNIFORM BOUNDS FOR BESSEL FUNCTIONS I. KRASIKOV Received October 8, 001 and, in revised form, July 6, 004 Abstract. For ν > 1/ and x real we shall

More information

Asymptotically sufficient statistics in nonparametric regression experiments with correlated noise

Asymptotically sufficient statistics in nonparametric regression experiments with correlated noise Vol. 0 0000 1 0 Asymptotically sufficient statistics in nonparametric regression experiments with correlated noise Andrew V Carter University of California, Santa Barbara Santa Barbara, CA 93106-3110 e-mail:

More information

Multiscale Frame-based Kernels for Image Registration

Multiscale Frame-based Kernels for Image Registration Multiscale Frame-based Kernels for Image Registration Ming Zhen, Tan National University of Singapore 22 July, 16 Ming Zhen, Tan (National University of Singapore) Multiscale Frame-based Kernels for Image

More information

Smooth functions and local extreme values

Smooth functions and local extreme values Smooth functions and local extreme values A. Kovac 1 Department of Mathematics University of Bristol Abstract Given a sample of n observations y 1,..., y n at time points t 1,..., t n we consider the problem

More information

On the local existence for an active scalar equation in critical regularity setting

On the local existence for an active scalar equation in critical regularity setting On the local existence for an active scalar equation in critical regularity setting Walter Rusin Department of Mathematics, Oklahoma State University, Stillwater, OK 7478 Fei Wang Department of Mathematics,

More information

Lecture Notes 5: Multiresolution Analysis

Lecture Notes 5: Multiresolution Analysis Optimization-based data analysis Fall 2017 Lecture Notes 5: Multiresolution Analysis 1 Frames A frame is a generalization of an orthonormal basis. The inner products between the vectors in a frame and

More information

WAVELET EXPANSIONS OF DISTRIBUTIONS

WAVELET EXPANSIONS OF DISTRIBUTIONS WAVELET EXPANSIONS OF DISTRIBUTIONS JASSON VINDAS Abstract. These are lecture notes of a talk at the School of Mathematics of the National University of Costa Rica. The aim is to present a wavelet expansion

More information

Maximum Likelihood Diffusive Source Localization Based on Binary Observations

Maximum Likelihood Diffusive Source Localization Based on Binary Observations Maximum Lielihood Diffusive Source Localization Based on Binary Observations Yoav Levinboo and an F. Wong Wireless Information Networing Group, University of Florida Gainesville, Florida 32611-6130, USA

More information

BAND-LIMITED REFINABLE FUNCTIONS FOR WAVELETS AND FRAMELETS

BAND-LIMITED REFINABLE FUNCTIONS FOR WAVELETS AND FRAMELETS BAND-LIMITED REFINABLE FUNCTIONS FOR WAVELETS AND FRAMELETS WEIQIANG CHEN AND SAY SONG GOH DEPARTMENT OF MATHEMATICS NATIONAL UNIVERSITY OF SINGAPORE 10 KENT RIDGE CRESCENT, SINGAPORE 119260 REPUBLIC OF

More information

Wavelet Thresholding for Non Necessarily Gaussian Noise: Functionality

Wavelet Thresholding for Non Necessarily Gaussian Noise: Functionality Wavelet Thresholding for Non Necessarily Gaussian Noise: Functionality By R. Averkamp 1 and C. Houdré 2 Freiburg University and Université Paris XII and Georgia Institute of Technology For signals belonging

More information

1 Review of di erential calculus

1 Review of di erential calculus Review of di erential calculus This chapter presents the main elements of di erential calculus needed in probability theory. Often, students taking a course on probability theory have problems with concepts

More information

Uniform Confidence Sets for Nonparametric Regression with Application to Cosmology

Uniform Confidence Sets for Nonparametric Regression with Application to Cosmology Uniform Confidence Sets for Nonparametric Regression with Application to Cosmology Christopher R. Genovese Department of Statistics Carnegie Mellon University http://www.stat.cmu.edu/ ~ genovese/ Larry

More information

1 Continuity Classes C m (Ω)

1 Continuity Classes C m (Ω) 0.1 Norms 0.1 Norms A norm on a linear space X is a function : X R with the properties: Positive Definite: x 0 x X (nonnegative) x = 0 x = 0 (strictly positive) λx = λ x x X, λ C(homogeneous) x + y x +

More information

and finally, any second order divergence form elliptic operator

and finally, any second order divergence form elliptic operator Supporting Information: Mathematical proofs Preliminaries Let be an arbitrary bounded open set in R n and let L be any elliptic differential operator associated to a symmetric positive bilinear form B

More information