CONVERGENCE RATES OF GENERAL REGULARIZATION METHODS FOR STATISTICAL INVERSE PROBLEMS AND APPLICATIONS


BY N. BISSANTZ, T. HOHAGE, A. MUNK AND F. RUYMGAART

UNIVERSITY OF GÖTTINGEN AND TEXAS TECH UNIVERSITY

Abstract. In the past, the convergence analysis for linear statistical inverse problems has mainly focused on spectral cut-off and Tikhonov-type estimators. Spectral cut-off estimators achieve minimax rates for a broad range of smoothness classes and operators, but their practical usefulness is limited by the fact that they require a complete spectral decomposition of the operator. Tikhonov estimators are simpler to compute, but still involve the inversion of an operator and achieve minimax rates only in very restricted smoothness classes. In this paper we introduce a unifying technique to study the mean square error of a large class of regularization methods (spectral methods) including the aforementioned estimators as well as many iterative methods, such as ν-methods and the Landweber iteration. The latter estimators converge at the same rate as spectral cut-off, but only require matrix-vector products. Our results are applied to various problems; in particular we obtain precise convergence rates for satellite gradiometry, L2-boosting, and errors in variable problems.

AMS subject classifications. 62G05, 62J05, 62P35, 65J10, 35R30

Key words. Statistical inverse problems, iterative regularization methods, Tikhonov regularization, nonparametric regression, minimax convergence rates, satellite gradiometry, Hilbert scales, boosting, errors in variable.

1. Introduction. This paper is concerned with estimating an element f of a Hilbert space H_1 from indirect noisy measurements

    Y = Kf + noise                                                    (1.1)

related to f by a (known) operator K : H_1 → H_2 mapping H_1 to another Hilbert space H_2. The operator K is assumed to be linear, bounded, and injective, but not necessarily compact. We are interested in the case that the operator equation (1.1)
is ill-posed in the sense that the Moore-Penrose inverse of K is unbounded. The analysis of regularization methods for the stable solution of (1.1) depends on the mathematical model for the noise term on the right-hand side of (1.1): If the noise is considered as a deterministic quantity, it is natural to study the worst case error. In the literature a number of efficient methods for the solution of (1.1) have been developed, and it has been shown under certain conditions that the worst case error converges of optimal order as the noise level tends to 0 (see Engl et al. [3]). If the noise is modeled as a random quantity, the convergence of estimators f̂ of f should be studied in statistical terms, e.g. the expected square error E‖f̂ − f‖², also called mean integrated square error (MISE). This problem has also been studied extensively in the statistical literature, but the numerical efficiency has not been a major issue so far. It is the purpose of this paper to provide an analysis of a class of computationally efficient regularization methods including Landweber iteration, ν-methods, and iterated Tikhonov regularization, which is applicable to linear inverse problems with random noise as they occur, for example, in parameter identification problems in partial differential equations, deconvolution or errors in variable models. There exists a considerable amount of literature on regularization methods for linear inverse problems with random noise. For surveys we refer to O'Sullivan [35], Nychka & Cox [34], Evans & Stark [4] and Kaipio & Somersalo [24]. A large part of the literature focuses on methods which require the explicit knowledge of a spectral decomposition of the operator K*K. The simplest of these methods is spectral cut-off (or truncated singular value decomposition for compact operators), where an estimator is constructed by a truncated expansion of f with respect to the eigenfunctions of K*K (e.g. Diggle & Hall [9], Healy, Hendriks & Kim [9]).
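As a concrete illustration of the spectral cut-off idea, here is a minimal numerical sketch (ours, not from the paper): the operator is discretized as a matrix, so its SVD plays the role of the spectral decomposition of K*K. All names and constants below are illustrative assumptions.

```python
import numpy as np

def spectral_cutoff(K, Y, alpha):
    """Spectral cut-off (truncated SVD) estimator.

    Keeps the singular components of K whose squared singular values
    exceed the cut-off level alpha; needs the full SVD of K, which is
    exactly the expensive ingredient discussed in the text.
    """
    U, s, Vt = np.linalg.svd(K, full_matrices=False)
    keep = s ** 2 >= alpha
    return Vt[keep].T @ ((U[:, keep].T @ Y) / s[keep])

# Toy mildly ill-posed problem: a discretized integration operator,
# (Kf)(x) ~ integral_0^x f(t) dt, perturbed by white observation noise.
n = 200
K = np.tril(np.ones((n, n))) / n
x = np.linspace(0.0, 1.0, n)
f_true = np.sin(2 * np.pi * x)
rng = np.random.default_rng(0)
Y = K @ f_true + 1e-3 * rng.standard_normal(n)
f_hat = spectral_cutoff(K, Y, alpha=1e-4)
```

Computing the full SVD of the discretized operator is precisely the cost that iterative methods, which the paper analyzes later, avoid.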
It has been shown in a number of papers that spectral cut-off estimators are order optimal in a minimax sense under certain conditions (e.g. Mair & Ruymgaart [28], Efromovich [2], Kim & Koo [25]). Based on a singular value decomposition (SVD) of K it is also possible to construct exact minimax estimators for given smoothness classes (see Johnstone & Silverman [23]). Another major approach are wavelet-vaguelette (and vaguelette-wavelet) based methods, which lead to estimators of a similar functional form as SVD methods. However, in general these estimators are based on expansions of f and Kf with respect to different bases of the respective function spaces than those provided by the SVD of K (e.g. Donoho [], Abramovich & Silverman [], Johnstone et al. [22]). A well-known method both in the statistical and the deterministic inverse problems literature is Tikhonov regularization. This has been studied for certain classes of linear statistical inverse problems by Cox [8], Nychka & Cox [34] and Mathé & Pereverzev [29, 3]. The main restriction of the usefulness of spectral cut-off and related estimators is the need of the spectral data of the operator (i.e. an SVD if K is compact) to implement these estimators. This is known explicitly only in a limited number of special cases, and numerical computation of the spectral data is prohibitively expensive for many real-world applications. Although Tikhonov regularization does not require the spectral data of the operator, there is still a need of setting up and inverting a matrix representing the operator. For iterative regularization methods such as Landweber iteration or ν-methods (see Nemirovskii & Polyak [33], Brakhage [5], and Engl et al. [3]) only matrix-vector multiplications are required. Therefore, iterative regularization methods are the methods of choice for many inverse problems in partial differential equations (e.g. the backwards heat equation discussed in section 5). Here K is a so-called parameter-to-solution operator, i.e. an application of this operator to a vector f simply means solving a differential equation with a parameter f. On the other hand, setting up the matrix for K is often not feasible. Furthermore, it is known that Tikhonov regularization achieves minimax rates of convergence only in a very restricted number of smoothness classes, which is highlighted by the fact that its qualification number is 1, whereas Landweber iteration has infinitely large qualification, and ν-methods with qualification ν are available for every ν > 0 (see [3]).
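A minimal sketch of the Landweber iteration on a toy discretized problem (our illustration; the step length and iteration count are assumed values, not the paper's). The point is that only products with K and its transpose are needed, never an SVD or a matrix inversion; the iteration count plays the role of the regularization parameter (α ≈ 1/(k+1)).

```python
import numpy as np

def landweber(K, Y, beta, n_iter):
    """Landweber iteration f_{k+1} = f_k + beta * K^T (Y - K f_k).

    Uses only products with K and K^T -- no SVD, no matrix inversion.
    The iteration count regularizes: alpha ~ 1/n_iter.
    """
    f = np.zeros(K.shape[1])
    for _ in range(n_iter):
        f += beta * (K.T @ (Y - K @ f))
    return f

# Toy problem: discretized integration operator plus white noise.
n = 200
K = np.tril(np.ones((n, n))) / n
x = np.linspace(0.0, 1.0, n)
f_true = np.sin(2 * np.pi * x)
rng = np.random.default_rng(0)
Y = K @ f_true + 1e-3 * rng.standard_normal(n)

beta = 1.0 / np.linalg.norm(K, 2) ** 2    # safe step length, beta <= 1/||K||^2
f_hat = landweber(K, Y, beta, n_iter=5000)
```

Stopping the iteration early regularizes: more iterations reduce the bias but amplify the noise, in line with the bias-variance analysis of section 4.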
In this paper we will show that general spectral regularization methods achieve the same rates of convergence of the MISE as spectral cut-off, which is known to be optimal in most cases (see above). Whereas the bias or approximation error is exactly the same in a deterministic and a statistical framework, the analysis differs significantly in the estimation of the noise term. In spectral cut-off for compact operators, the noise (or variance) part of the estimator f̂ belongs to a finite-dimensional space of low frequencies. The main difficulty in the analysis of general spectral regularization methods is the estimation of the high frequency components of the noise. Unlike in a deterministic framework, the bound on the noise term depends not only on the regularization parameter, but also on the distribution of the singular values of K (if K is compact). Therefore, a statistical analysis has to impose conditions on the operator. We will verify these conditions for several important problems including inverse problems in partial differential equations and errors in variable models. As an example of particular interest in the machine learning context we obtain optimal rates of convergence of L2-boosting by interpreting L2-boosting as a Landweber iteration (see also Bühlmann & Yu [6], Yao et al. [4]). The plan of this paper is as follows: In section 2 we introduce an abstract noise model and demonstrate how other commonly used statistical noise models fit into this general framework. Section 3 gives a brief overview of regularization methods and source conditions. The following section contains the main results of this paper on the rates of convergence of general spectral regularization methods. Finally, in section 5 we discuss the application of our results to the backwards heat equation, satellite gradiometry, errors in variable models with dependent random variables, L2-boosting, and operators in Hilbert scales. Proofs of section 4 are collected in section 6.

2. Noise model and spectral theorem.
2.1. Spectral theorem. Halmos' version of the spectral theorem (see, for instance, Halmos [8], Taylor [38]) turns out to be particularly convenient for the construction and statistical analysis of regularized inverses of a self-adjoint operator. This has been demonstrated by Mair & Ruymgaart [28] for the spectral cut-off estimator. The theorem states that for a (not necessarily bounded) self-adjoint operator A : D(A) → H defined on a dense subset D(A) of a separable Hilbert space H there exist a σ-compact space S, a Borel measure Σ on S, a unitary operator U : H → L²(Σ), and a measurable function ρ : S → R such that

    UAf = ρ · Uf,   Σ-almost everywhere,                              (2.1)

for all f ∈ D(A). Introducing the multiplication operator M_ρ : D(M_ρ) → L²(Σ), M_ρϕ := ρ · ϕ, defined on D(M_ρ) := {ϕ ∈ L²(Σ) : ρϕ ∈ L²(Σ)}, we can rewrite (2.1) as A = U* M_ρ U, i.e. A is unitarily equivalent to a multiplication operator. The essential range of ρ is the spectrum σ(A) of A. If A is bounded and positive definite as in sections 3-4, then 0 < ρ ≤ ‖A‖, Σ-a.e.

Remark 1. In the special case that A is compact, a well-known version of the spectral theorem states that A has a complete orthonormal system of eigenvectors u_i with corresponding eigenvalues ρ_i, and Af = ∑_{j=1}^∞ ρ_j ⟨u_j, f⟩ u_j. This can be rewritten in the multiplicative form (2.1) by choosing Σ as the counting measure on S = N, i.e. L²(Σ) = l²(N), the multiplier function as ρ(i) = ρ_i, i ∈ N, and defining the unitary operator U : H → l²(N) by (Uf)(i) := ⟨f, u_i⟩, i ∈ N.

2.2. Noise models. Now we will discuss several noise models commonly encountered in statistical modelling. First we introduce the general framework which will be used in the proof of our main result. Thereafter several examples will be discussed. Following Mathé & Pereverzev [29] we assume that our given data can be written as

    Y = g + σε + τξ,   g := Kf,                                       (2.2)

where ξ ∈ H_2, ‖ξ‖ ≤ 1, is a deterministic error, ε is a stochastic error, and τ, σ > 0 are the corresponding noise levels. Note that model (2.2) allows for stochastic and deterministic noise simultaneously. Often, the stochastic error will be a Hilbert-space valued random variable Ξ, i.e. a measurable function Ξ : Ω → H_2, where (Ω, 𝒜, P) is the underlying probability space. However, we will assume more generally that it is a Hilbert-space process, i.e. a continuous linear operator ε : H_2 → L²(Ω, 𝒜, P). Every Hilbert-space valued random variable Ξ with finite second moments, E‖Ξ‖² < ∞, can be identified with a Hilbert-space process ϕ ↦ ⟨Ξ, ϕ⟩, ϕ ∈ H_2, but not vice versa. We will use the notation ⟨ε, ϕ⟩ := εϕ, ϕ ∈ H_2.
The covariance Cov_ε : H_2 → H_2 of a Hilbert-space process ε : H_2 → L²(Ω, 𝒜, P) is the bounded linear operator defined implicitly by ⟨Cov_ε ϕ_1, ϕ_2⟩ = Cov(⟨ε, ϕ_1⟩, ⟨ε, ϕ_2⟩), ϕ_1, ϕ_2 ∈ H_2. We assume that

    for all ϕ ∈ H_2 :  E⟨ε, ϕ⟩ = 0,   ‖Cov_ε‖ ≤ 1.                    (2.3)

We call ε a white noise process if Cov_ε = I. Note that a Gaussian white noise process in an infinite-dimensional Hilbert space cannot be identified with a Hilbert-space valued random variable. If ξ : H_2 → L²(Ω, 𝒜, P) is a Hilbert-space process and A : H_2 → H_1 is a bounded linear operator, we define the Hilbert-space process Aξ : H_1 → L²(Ω, 𝒜, P) by ⟨Aξ, ϕ⟩ := ⟨ξ, A*ϕ⟩, ϕ ∈ H_1. Its covariance operator is given by Cov_{Aξ} = A Cov_ξ A*.

Assumption 1. For the noise model (2.2) the assumptions (2.3) and ‖ξ‖ ≤ 1 hold true. Moreover, K*ε is a Hilbert-space valued random variable satisfying

    E‖K*ε‖² < ∞,                                                      (2.4)

and there exists a spectral decomposition (2.1) of K*K such that for almost all s ∈ S

    Var((UK*ε)(s)) ≤ ρ(s).                                            (2.5)

In the remainder of this section we show that several commonly used noise models fit into the general framework (2.2). We start with an (infinite dimensional) white noise model, and then continue with several models based on finitely many observations.

2.2.1. White noise. A frequently used model is to assume that ε in (2.2) is a white noise process in H_2 (see e.g. Donoho [, ], Mathé & Pereverzev [29]). Moreover, we assume that K*K is a trace-class operator, i.e. it is compact and the eigenvalues ρ_j of K*K satisfy tr(K*K) := ∑_{j=1}^∞ ρ_j < ∞. Then Cov_{K*ε} = K*K, so E‖K*ε‖² = tr(Cov_{K*ε}) = tr(K*K) < ∞.

Therefore, K*ε can be identified with a Hilbert-space valued random variable. Using the notation introduced in Remark 1 and defining e_j : N → R by e_j(k) := δ_jk, u_j = U*e_j ∈ H_1 is a unit-length eigenvector of K*K to the eigenvalue ρ_j, and

    Var((UK*ε)(j)) = Var(⟨UK*ε, e_j⟩) = Var(⟨ε, Ku_j⟩) = ‖Ku_j‖² = ρ_j,

for j = 1, 2, .... Therefore, (2.5) is satisfied with equality.

2.2.2. Quasi-deconvolution, errors in variable, noncompact operators. Suppose we want to estimate the density f of a random variable Y with values in R^d, but we can only observe a random variable X = Y + W perturbed by a random variable W. Hence, our data are

    X_1, ..., X_n  i.i.d. ∼ X = Y + W.                                (2.6)

The density g of X is given by

    g = ∫_{R^d} h(· − y | y) f(y) dy =: Kf,                           (2.7)

where h(· | y) is the conditional density of W given Y = y. If Y and W are stochastically independent, K is a convolution operator. Recovering f is known as a deconvolution problem and has been studied extensively (e.g. Stefanski & Carroll [36], Fan [5] and Diggle & Hall [9]). Dependent Y and W in (2.6) occur in many scientific applications, e.g. brightness determination of extragalactic star clusters in astrophysics, where the variance σ² of the noise W increases monotonically with decreasing brightness of the object Y. Here, a reasonable model is described by h(z | y) = (2πσ²(y))^{−1/2} exp(−z²/(2σ²(y))) (see Bissantz [3]). We assume that f ∈ L²(R^d) and that K is a bounded, injective operator in L²(R^d). As opposed to the previous section, in general, K is not compact here. Obviously, an unbiased estimator of q := K*g is given by

    q̂_n(z) := (1/n) ∑_{j=1}^n h(X_j − z | z).                         (2.8)

To fit this into our general framework, we show that q̂_n = q + K*ε for a Hilbert-space process ε : L²(R^d) → L²(Ω, 𝒜, P) defined by

    ⟨ε, ϕ⟩ := (1/n) ∑_{j=1}^n ϕ(X_j) − ⟨g, ϕ⟩.                        (2.9)

In fact, for ψ ∈ L²(R^d),

    ⟨K*ε, ψ⟩ = ⟨ε, Kψ⟩ = (1/n) ∑_{j=1}^n ∫_{R^d} h(X_j − z | z) ψ(z) dz − ⟨K*g, ψ⟩ = ⟨q̂_n − q, ψ⟩.

The next result states that Assumption 1 is satisfied:

Proposition 2.
Assume that the operator K defined by (2.7) is injective and satisfies ‖K‖_{2,2} < ∞ and ‖K‖_{2,∞} < ∞, where ‖K‖_{r,s} is defined as the operator norm of K : L^r(R^d) → L^s(R^d). Moreover, let q̂_n and ε be defined by (2.8) and (2.9), and let

    σ := n^{−1/2} (‖g‖_{L^∞} + ‖g‖²_{L²})^{1/2}   and   ε̄ := ε/σ.    (2.10)

Then ε̄ satisfies Assumption 1, and q̂_n = q + σK*ε̄.

Proof. We have to show that (2.3)-(2.5) hold true. Since the X_j are assumed to be independent, it suffices to consider the case n = 1. The first part of (2.3), i.e. E⟨ε, ϕ⟩ = 0 for ϕ ∈ L²(R^d), follows from Eϕ(X) = ∫ ϕ g dx. Since

    Cov(⟨ε, ϕ_1⟩, ⟨ε, ϕ_2⟩) = ∫_{R^d} ϕ_1 ϕ_2 g dx − ⟨g, ϕ_1⟩⟨g, ϕ_2⟩

for all ϕ_1, ϕ_2 ∈ H_2,

the covariance operator of ε is given by Cov_ε = M_g − g ⊗ g, where M_g means multiplication by g, and g ⊗ g : L²(R^d) → L²(R^d) is the rank-1 operator defined by (g ⊗ g)ϕ := g⟨ϕ, g⟩. Now ‖Cov_ε̄‖ ≤ 1 follows from the estimate ‖Cov_ε‖ ≤ ‖g‖_{L^∞} + ‖g‖²_{L²}, which completes the proof of (2.3). To show (2.4), i.e. E‖q̂_n − q‖² < ∞, note that Cov_{q̂_1} = K* Cov_ε K = K* M_g K − (K*g) ⊗ (K*g). We have to show that this is a trace class operator. Obviously (K*g) ⊗ (K*g) is trace class as a rank-1 operator. It is not obvious, however, that K* M_g K is trace class, since neither K nor M_g are even compact in general. To show this, we rewrite the kernel of K as k(x, y) := h(x − y | y) and note that ess sup_x ‖k(x, ·)‖_{L²} = ‖K‖_{2,∞} < ∞. Since g ≥ 0, the operator K* M_g K is self-adjoint and positive semi-definite. Let {ϕ_j : j ∈ N} be a complete orthonormal system in the separable Hilbert space L²(R^d). The Beppo Levi theorem yields

    ∑_{j∈N} ⟨ϕ_j, K* M_g K ϕ_j⟩ = ∑_{j∈N} ∫ g(x) |(Kϕ_j)(x)|² dx
        = ∫ g(x) ∑_{j∈N} |⟨k(x, ·), ϕ_j⟩|² dx ≤ ‖g‖_{L^1} ess sup_x ‖k(x, ·)‖²_{L²} < ∞,

which implies that K* M_g K is trace class with tr(K* M_g K) ≤ ‖K‖²_{2,∞}. Finally, (2.5) follows from Lemma 3. □

If K is a convolution operator with convolution kernel w(x − y), then the canonical choice of the unitary operator U in the Halmos decomposition is the Fourier transform

    (Uϕ)(ξ) = (Fϕ)(ξ) = ∫_{R^d} ϕ(x) e^{−2πi ξ·x} dx,                (2.11)

and the multiplier function is then ρ = |Fw|². In this case the condition (2.5) in Assumption 1 can be verified explicitly; see Mair & Ruymgaart [28]. However, for a general integral operator K as in (2.7) we know neither U nor ρ explicitly. Therefore, in the proof of Proposition 2 we rely on the following lemma to show condition (2.5). It implies that (2.5) is a condition on the choice of U in the Halmos representation (2.1) rather than a condition on the noise model. The assertion that ρ ∈ L¹(Σ) will be required in section 4.

Lemma 3. If ε is a Hilbert space process satisfying (2.3), K*ε is a Hilbert space valued random variable satisfying (2.4), and K is injective, then there exists a spectral decomposition (2.1)
of K*K such that (2.5) holds true, and ρ ∈ L¹(Σ).

Proof. According to Halmos' spectral theorem there exist a Borel measure Σ̃ on a σ-compact space S and a unitary operator Ũ : L²(R^d) → L²(Σ̃) such that K*K = Ũ* M_ρ Ũ. For any Σ̃-measurable function χ > 0 on S we can construct another Halmos representation of K*K by introducing the Borel measure Σ := χ Σ̃ on S and the mapping U : L²(R^d) → L²(Σ), Uf := χ^{−1/2} Ũf, since U is unitary and UK*Kf = ρ · Uf Σ-a.e. for all f ∈ H_1. In particular, we may define

    χ(s) := Var((ŨK*ε)(s)) / ρ(s)  for s ∈ M,   M := {s ∈ S : Var((ŨK*ε)(s)) > 0}.   (2.12)

Here we use that ρ > 0 Σ̃-a.e. since K, and hence K*K, is injective by assumption. We first consider the case Σ̃(M^c) = 0, where M^c := S \ M. Then (2.5) holds true for s ∈ M as Var((UK*ε)(s)) = χ(s)^{−1} Var((ŨK*ε)(s)) = ρ(s). Moreover,

    ∫ ρ dΣ = ∫ Var(UK*ε) dΣ = E ∫ |UK*ε|² dΣ = E‖K*ε‖² < ∞,          (2.13)

which is the assertion. Now assume that Σ̃(M^c) > 0. Let ψ be an arbitrary strictly positive function in L¹(Σ̃), e.g. ψ(s) := j(s)^{−2} Σ̃(A_{j(s)})^{−1}, where j(s) := min{j : s ∈ A_j} for a sequence

A_1 ⊂ A_2 ⊂ ... ⊂ S with Σ̃(A_j) < ∞ and Σ̃(S \ ∪_j A_j) = 0. Such a sequence exists because Σ̃ is σ-finite. We define χ(s) by (2.12) for s ∈ M and χ(s) := ψ(s)/ρ(s) for s ∈ M^c. Then (2.5) is trivially satisfied for s ∈ M^c, and ρ ∈ L¹(Σ) since ∫_M ρ dΣ < ∞ as in (2.13) and ∫_{M^c} ρ dΣ ≤ ∫ ψ dΣ̃ < ∞. This finishes the proof. □

2.2.3. Inverse regression. We now review another commonly used noise model (see Wahba [39], O'Sullivan [35], Nychka & Cox [34], Bissantz et al. [4]) and show how it is related to the model (2.2). Suppose that H_i = L²(µ_i) are L²-spaces with respect to measure spaces (X_i, 𝒳_i, µ_i), i = 1, 2, H_1 is separable, and that K : L²(µ_1) → L²(µ_2) is an integral operator

    (Kf)(x) := ∫_{X_1} k(x, y) f(y) dµ_1(y),   x ∈ X_2,               (2.14)

with kernel k. Recall that K*K is trace class if and only if K is Hilbert-Schmidt, and that K is a Hilbert-Schmidt operator if and only if k ∈ L²(µ_2 ⊗ µ_1) (see Taylor [37]). The latter condition is easy to verify in most applications. We will assume in the following that the measure space (X_2, 𝒳_2, µ_2) is finite. Then we can arrange that µ_2(X_2) = 1. We consider the regression model

    Y_i = (Kf)(X_i) + ε_i,   f ∈ H_1,   i = 1, ..., n,                (2.15)

where we assume for simplicity that the random variables X_i ∈ X_2 have uniform distribution on X_2 (see also Remark 5). Moreover, we assume that (Y_i, X_i) ∼ (Y, X), i = 1, ..., n, are i.i.d. random variables with values in R × X_2 such that

    E[Y | X] = (Kf)(X),                                               (2.16)

and hence E[ε | X] = 0 for ε := Y − (Kf)(X). Finally we assume that v(X) := E[ε² | X] satisfies

    0 < C_{v,l} ≤ v(X) ≤ C_{v,u} < ∞   a.s.,                          (2.17)

for some constants C_{v,l}, C_{v,u} > 0. A straightforward computation shows that

    q̂_n = (1/n) ∑_{i=1}^n Y_i k(X_i, ·)                               (2.18)

is an unbiased estimator of the vector q := K*Kf. To fit the inverse regression model with random design into our general framework, we introduce the Hilbert-space (noise) process ε : H_2 → L²(Ω, 𝒜, P) by

    ⟨ε, ϕ⟩ := (1/n) ∑_{j=1}^n Y_j ϕ(X_j) − ⟨g, ϕ⟩,   ϕ ∈ H_2,         (2.19)

and show that

    ⟨K*ε, ψ⟩ = ⟨ε, Kψ⟩ = (1/n) ∑_{j=1}^n Y_j ∫_{X_1} k(X_j, y) ψ(y) dµ_1(y) − ⟨K*g, ψ⟩ = ⟨q̂_n − q, ψ⟩

for all ψ ∈ H_1, i.e. q̂_n = q + K*ε.

Proposition 4.
Assume the inverse regression model (2.14)-(2.17), and let q̂_n and ε be defined by (2.18) and (2.19). Moreover, let K : L²(µ_1) → L²(µ_2) be Hilbert-Schmidt, and µ_2-ess sup_x ‖k(x, ·)‖_{L²(µ_1)} < ∞. Define

    σ := (C/n)^{1/2}   and   ε̄ := ε/σ,

with C := C_{v,u} + ‖g‖²_{L^∞(µ_2)} + 2‖g‖²_{L²(µ_2)}. Then ε̄ satisfies Assumption 1 for the unitary transform U defined in Remark 1, and q̂_n = q + σK*ε̄. Moreover,

    (C_{v,l}/n) ρ(j) ≤ Var((U q̂_n)(j)),   j = 1, 2, ....              (2.20)

Proof. It suffices to prove this for n = 1. Since X is uniformly distributed and (2.16) holds true, we have

    E(Y ϕ(X)) = E(E[ε | X] ϕ(X)) + E(g(X) ϕ(X)) = ∫ g ϕ dµ_2 = ⟨g, ϕ⟩

for all ϕ ∈ H_2, and hence the first part of eq. (2.3) holds true. Using once more the same properties of X and Y we find that

    Cov(⟨ε, ϕ_1⟩, ⟨ε, ϕ_2⟩) = E{Y² ϕ_1(X) ϕ_2(X)} − ⟨g, ϕ_1⟩⟨g, ϕ_2⟩
        = E{(ε² + 2εg(X) + g(X)²) ϕ_1(X) ϕ_2(X)} − ⟨g, ϕ_1⟩⟨g, ϕ_2⟩
        = ∫ ϕ_1 (v + g²) ϕ_2 dµ_2 − ⟨g, ϕ_1⟩⟨g, ϕ_2⟩

for all ϕ_1, ϕ_2 ∈ H_2. Hence, Cov_ε = M_{v+g²} − g ⊗ g, where M_{v+g²}ϕ := (v + g²)ϕ and (g ⊗ g)ϕ := ⟨g, ϕ⟩g. This implies ‖Cov_ε‖ ≤ C and finishes the proof of (2.3). Using the notation of Remark 1, condition (2.4) can be seen as follows:

    E‖q̂_1 − q‖² = tr Cov(q̂_1 − q) = ∑_{j=1}^∞ ⟨Ku_j, Cov_ε Ku_j⟩ ≤ C ∑_{j=1}^∞ ‖Ku_j‖² = C tr(K*K) < ∞.

Since

    Var((U q̂_1)(j)) = ⟨u_j, Cov_{q̂_1} u_j⟩ = ⟨Ku_j, Cov_ε Ku_j⟩ ≤ C‖Ku_j‖² = C ρ_j,

we obtain the bound (2.5). The lower bound in (2.20) holds true since the operator M_{g²} − g ⊗ g is positive semi-definite as the covariance operator of ε for the case ε ≡ 0. □

As opposed to [28] we do not need the assumption that the singular vectors u_j ∈ H_1 and v_j ∈ H_2 in the singular value decomposition Kf = ∑_{j=1}^∞ √ρ_j ⟨f, u_j⟩_{H_1} v_j be uniformly bounded sequences in L^∞(µ_1) and L^∞(µ_2), respectively. We only require that µ_2-ess sup_x ‖k(x, ·)‖_{L²(µ_1)} < ∞. This condition is often less restrictive and easier to verify.

Remark 5 (Generalizations). 1. We can replace L²(µ_1) by an arbitrary Hilbert space H_1 (e.g. a Sobolev space) by replacing k(x, ·) by k(x) := ∑_{j=1}^∞ √ρ_j v_j(x) u_j, x ∈ X_2. Then (2.14) and (2.18) read (Kf)(x) = ⟨k(x), f⟩_{H_1} and q̂_n = (1/n) ∑_{i=1}^n Y_i k(X_i), respectively. Proposition 4 remains valid with literally the same proof if L²(X_1) is replaced by H_1.

2. (Deterministic and nonuniform design.)
The noise model (2.2) also allows to treat models of the form (2.15) where the measurement points are either nonuniformly distributed on X_2 or x_i = x_i^{(n)} are deterministic quantities (see, for instance, Nychka & Cox [34], O'Sullivan [35]). For conditions on the design density see Munk [32].

3. Regularization methods. We now review some basic notions of regularization theory.

3.1. Regularized estimators. Recall Halmos' spectral theorem from section 2.1. For a self-adjoint operator A : D(A) → H and a bounded, measurable function Φ : σ(A) → R one defines an operator Φ(A) ∈ L(H) by

    Φ(A) = U* M_{Φ(ρ)} U,                                             (3.1)

(see e.g. Taylor [38]). The mapping Φ ↦ Φ(A), called the functional calculus at A, is an algebra homomorphism from the algebra of bounded measurable functions on σ(A) to the algebra L(H) of bounded linear operators on H, and

    ‖Φ(A)‖ ≤ sup_{λ ∈ σ(A)} |Φ(λ)|,                                   (3.2)

with equality if Φ is continuous. We will construct estimators of the input function by regularization methods of the form

    f̂_{α,σ} = Φ_α(K*K) K* Y.                                          (3.3)

Here (Φ_α)_{α>0} is a collection of bounded filter functions Φ_α : σ(K*K) → R approximating the unbounded function t ↦ 1/t on σ(K*K), parametrized by a regularization parameter α > 0. A particular example of a regularization method of the form (3.3) is the spectral cut-off estimator (also known as truncated singular value decomposition) described by the functions

    Φ_α^SC(t) := 1/t for t ≥ α,   Φ_α^SC(t) := 0 for t < α.

As explained in the introduction, we will focus on regularization methods which can be implemented without explicit knowledge of the spectral decomposition of the operator K*K. This includes both implicit methods such as Tikhonov regularization (Φ_α(t) = 1/(α + t)), iterated Tikhonov regularization and Lardy's method, which involve the inversion of an operator, and explicit methods such as Landweber iteration (Φ_{1/(k+1)}(t) = β ∑_{j=0}^k (1 − βt)^j, where β ∈ (0, ‖K‖^{−2}] is a step-length parameter) and ν-methods, which require only matrix-vector products in a discrete setting. For a derivation and discussion of these methods we refer to the monograph [3].

3.2. Smoothness classes. We will measure the smoothness of the input function f relative to the smoothing properties of K in terms of source conditions: Let Λ : [0, ∞) → [0, ∞) be a continuous, strictly increasing function with Λ(0) = 0, and assume that there exists a source w ∈ H_1 such that

    f = Λ(K*K) w.                                                     (3.4)

The set of all f satisfying this condition with ‖w‖ ≤ w̄, w̄ > 0, will be denoted by

    F_{Λ,w̄,K*K} := {Λ(K*K)w : w ∈ H_1, ‖w‖ ≤ w̄}.

We will shortly write F_{Λ,w̄} := F_{Λ,w̄,K*K} if there is no ambiguity. The most common choice, which is usually appropriate for finitely smoothing operators K, is

    Λ(t) = t^ν,   ν > 0.
(3.5)

In particular, (3.4) with Λ(t) = t^{1/2} is equivalent to f = K*v, v ∈ H_2 (see Engl et al. [3, Prop. 2.8]). For exponentially ill-posed problems such as the backwards heat equation, (3.5) is usually too restrictive, and logarithmic source conditions corresponding to the choice

    Λ(t) = (−ln t)^{−p},   p > 0,                                     (3.6)

are more appropriate (see Hohage [2], Mair [27]). Since Λ is singular at t = 1, we assume that the norms in H_1 and H_2 are scaled such that ‖K*K‖ < 1 in this case. For a further discussion of source conditions and interpretations as smoothness conditions in Sobolev spaces we refer to the applications in section 5. If f belongs to the smoothness class F_{Λ,w̄} and we are given exact data Y = g, then the error is bounded by

    ‖Φ_α(K*K)K*g − f‖ = ‖(Φ_α(K*K)K*K − I)Λ(K*K)w‖ ≤ sup_{t ∈ σ(K*K)} |(Φ_α(t)t − 1)Λ(t)| ‖w‖,   (3.7)

where we have used (3.2).
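The approximation-error bound (3.7) can be explored numerically. The sketch below is our own illustration, not from the paper: a grid stands in for σ(K*K), scaled so that ‖K*K‖ ≤ 1, and we evaluate the residual factor 1 − tΦ_α(t) for the spectral cut-off, Tikhonov, and Landweber filters under the Hölder source function Λ(t) = t.

```python
import numpy as np

alpha = 1e-2
t = np.linspace(1e-6, 1.0, 100_000)   # stands in for sigma(K*K), ||K*K|| <= 1

phi_sc = np.where(t >= alpha, 1.0 / t, 0.0)        # spectral cut-off
phi_tik = 1.0 / (alpha + t)                        # Tikhonov
k = int(1 / alpha) - 1                             # Landweber with alpha = 1/(k+1)
beta = 1.0
phi_lw = (1.0 - (1.0 - beta * t) ** (k + 1)) / t   # closed form of beta*sum_j (1-beta*t)^j

# The residual factor 1 - t*phi_alpha(t) governs the approximation error (3.7).
# For Lambda(t) = t the Hoelder-type bound reads
#     sup_t |t * (1 - t*phi_alpha(t))| <= gamma_1 * alpha,
# i.e. all three filters have qualification at least 1 (here with gamma_1 = 1).
for name, phi in [("cutoff", phi_sc), ("tikhonov", phi_tik), ("landweber", phi_lw)]:
    print(name, np.max(np.abs(t * (1.0 - t * phi))) / alpha)
```

The printed ratios all stay below 1; for Tikhonov the bound is nearly attained, reflecting its qualification 1, while Landweber's residual is markedly smaller.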

3.3. Assumptions. In the following we discuss a number of standard assumptions on the filter functions Φ_α, which are satisfied for all commonly used regularization methods. First, we assume that there exists a constant C_2 > 0 such that

    sup_{t ∈ σ(K*K)} |tΦ_α(t)| ≤ C_2,   uniformly in α > 0.           (3.8a)

To bound the so-called propagated deterministic noise error τ‖Φ_α(K*K)K*ξ‖, we impose the condition

    there exists C_3 > 0 :  sup_{α > 0} sup_{t ∈ σ(K*K)} α|Φ_α(t)| ≤ C_3.   (3.8b)

In view of the bound (3.7) on the approximation error, we also assume that there exist a number ν_0 > 0, called the qualification of the method, and constants γ_ν > 0 such that

    sup_{t ∈ σ(K*K)} |t^ν (1 − tΦ_α(t))| ≤ γ_ν α^ν,   for all α > 0 and all 0 < ν ≤ ν_0.   (3.8c)

The qualification of a method is a measure of the maximal degree of smoothness in terms of the Hölder-type conditions (3.4), (3.5) under which the approximation error (3.7) converges of optimal order. The qualification of some commonly used methods is as follows: Tikhonov regularization: 1; K-times iterated Tikhonov regularization: K; Landweber iteration: ∞ (in the sense that it is greater than any real number); ν-methods: ν (where ν > 0 is a parameter in the method); see references in the Introduction. Note that the condition (3.8c) with ν > 0 implies that lim_{α→0} Φ_α(t) = 1/t for all t ∈ σ(K*K). For ν = 0 the condition (3.8c) implies (3.8a) with C_2 = 1 + γ_0. However, this value of C_2 is usually not optimal, as for most regularization methods (3.8a) holds true with C_2 = 1. For general source conditions we assume that there exists a constant γ_Λ such that

    sup_{t ∈ σ(K*K)} |Λ(t)(1 − tΦ_α(t))| ≤ γ_Λ Λ(α),   α → 0.         (3.9)

Under Hölder-type source conditions (3.5) this holds true for ν ≤ ν_0 by assumption (3.8c). For the choice Λ(t) = (−ln t)^{−p}, it has been shown in Hohage [2] that (3.8c) with ν > 0 implies (3.9). For more general functions Λ we refer to Mathé & Pereverzev [3] for similar implications.

4. MISE estimates. In this section the main results of this paper are presented. Recall the definition of the estimator f̂_{α,σ} of the input function f in (3.3).
Since E Φ_α(K*K)K*ε = 0, the MISE satisfies the bias-variance decomposition

    E‖f̂_{α,σ} − f‖² = B(f̂_{α,σ})² + E‖f̂_{α,σ} − E f̂_{α,σ}‖²,          (4.1)

with the bias term B(f̂_{α,σ}) := ‖E f̂_{α,σ} − f‖.

4.1. Estimation of the bias. The bias in our model coincides with the error in a deterministic setting and can be estimated by standard techniques (see [3]). Using the triangle inequality, the noise model (2.2), (2.3), and the definition (3.3) of f̂_{α,σ}, we get

    B(f̂_{α,σ}) ≤ ‖Φ_α(K*K)K*Kf − f‖ + τ‖Φ_α(K*K)K*ξ‖.

The first term (called approximation error) is bounded by γ_Λ Λ(α)‖w‖ due to (3.7) and (3.9). For the second term (called propagated deterministic noise error) we obtain the bound

    ‖Φ_α(K*K)K*ξ‖² = ⟨Φ_α(KK*)ξ, KK*Φ_α(KK*)ξ⟩ ≤ C_2 C_3 / α,        (4.2)

using the identity Φ_α(K*K)K* = K*Φ_α(KK*) (see [3, eq. (2.43)]) and (3.8). Hence,

    B(f̂_{α,σ}) ≤ γ_Λ Λ(α)‖w‖ + (C_2 C_3 / α)^{1/2} τ.                 (4.3)
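The decomposition (4.1) can be checked by simulation in the eigenbasis of K*K, where everything diagonalizes. The following Monte-Carlo sketch is our own illustration (the eigenvalue sequence, source element, and noise level are assumed choices): it compares the empirical MISE of a Tikhonov estimator with the exact sum of squared bias and integrated variance.

```python
import numpy as np

# Diagonalized toy model: rho_j are the eigenvalues of K*K, and in the
# eigenbasis the estimator (3.3) acts componentwise. All choices below
# (spectrum, source element, noise level) are illustrative assumptions.
rng = np.random.default_rng(1)
m = 200
rho = 1.0 / np.arange(1, m + 1) ** 2          # mildly ill-posed spectrum
w = np.ones(m) / np.sqrt(m)                   # source element, ||w|| = 1
nu = 0.5
f = rho ** nu * w                             # Hoelder source f = (K*K)^nu w
sigma, alpha, n_mc = 1e-3, 1e-2, 2000

phi = 1.0 / (alpha + rho)                     # Tikhonov filter
# In the eigenbasis: f_hat_j = phi(rho_j) * (rho_j f_j + sigma sqrt(rho_j) xi_j),
# with xi white noise, matching Var((U K* eps)(j)) = rho_j from (2.5).
xi = rng.standard_normal((n_mc, m))
est = phi * (rho * f + sigma * np.sqrt(rho) * xi)
mise = np.mean(np.sum((est - f) ** 2, axis=1))   # empirical MISE
bias2 = np.sum((phi * rho * f - f) ** 2)         # exact squared bias
var = sigma ** 2 * np.sum(phi ** 2 * rho)        # exact integrated variance
print(mise, bias2 + var)
```

Up to Monte-Carlo error, `mise` agrees with `bias2 + var`, which is the content of (4.1).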

Since we aim to show optimality of general regularization methods by comparison to spectral cut-off (see Introduction and section 4.3), we now compare the approximation errors of general regularization methods and spectral cut-off. To this end, we introduce the following notations.

Notation: For two real-valued functions f, g defined on an interval (0, ᾱ], we write f(α) ∼ g(α) (or f(α) ≲ g(α)), α → 0, if f(α)/g(α) is well defined for α in some neighborhood of 0 and lim_{α→0} f(α)/g(α) = 1 (or lim sup_{α→0} f(α)/g(α) ≤ 1, respectively). Furthermore, we write f(α) ≍ g(α), α → 0, if there exist constants ᾱ > 0 and Cᾱ ≥ 1 such that (1/Cᾱ) f(α) ≤ g(α) ≤ Cᾱ f(α) for 0 < α ≤ ᾱ.

Recall that Λ : [0, ∞) → [0, ∞) is assumed to be a strictly increasing, continuous function with Λ(0) = 0 and that tΦ_α^SC(t) = χ_{[α,∞)}(t), i.e. I − K*KΦ_α^SC(K*K) is an orthogonal projection operator. Therefore,

    sup_{f ∈ F_{Λ,w̄}} ‖(I − K*KΦ_α^SC(K*K))f‖ = sup_{t ∈ σ(K*K)} |(1 − tΦ_α^SC(t))Λ(t)| w̄ ∼ Λ(α) w̄,   α → 0.

The last relation holds since 0 is not an isolated point of the spectrum σ(K*K) for ill-posed operator equations. Using (3.7) and (3.9) we obtain the estimate

    sup_{f ∈ F_{Λ,w̄}} ‖(I − K*KΦ_α(K*K))f‖ ≤ γ_Λ Λ(α) w̄ ∼ γ_Λ sup_{f ∈ F_{Λ,w̄}} ‖(I − K*KΦ_α^SC(K*K))f‖   (4.4)

as α → 0. For many regularization methods and smoothness classes we have γ_Λ = 1.

4.2. Estimation of the integrated variance and rate of convergence of the MISE. The more difficult part is the estimation of the integrated variance of the error f̂_{α,σ} − f. Under Assumption 1 we have

    E‖f̂_{α,σ} − E f̂_{α,σ}‖² = σ² E‖Φ_α(ρ) UK*ε‖² ≤ σ² ∫ Φ_α(ρ)² ρ dΣ.   (4.5)

A crucial point in the following analysis is the estimation of the tails of the spectral function ρ. To this end, we bound the variance in terms of the function

    R(α) := Σ({ρ ≥ α}),   α > 0.                                      (4.6)

In order to control the MISE of f̂_{α,σ} as α → 0, it is tempting to assume that R is smooth in a neighborhood around 0. However, this is not true in general. Therefore, we will pose instead that R can be approximated suitably by a smooth function S with similar properties as R, as α → 0. Obviously, R is monotonically decreasing (see (4.8a) below).
If ρ belongs to L¹(Σ), then −∫_0^ᾱ β dR(β) ≤ ∫_S ρ dΣ < ∞ (see (4.8b)), and it follows from Lebesgue's dominated convergence theorem that lim_{α→0} αR(α) = lim_{α→0} ∫_S α 1_{{ρ≥α}} dΣ = 0 (see (4.8c)).

Assumption 2. There exist a constant ᾱ ∈ (0, ‖ρ‖_∞] and a function S ∈ C²((0, ᾱ]) such that

    R(α) ∼ S(α),   α → 0,                                             (4.7)

with R defined by (4.6) in terms of the spectral decomposition (2.1), and S satisfies

    S' ≤ 0 < S on (0, ᾱ],                                             (4.8a)
    β ↦ βS'(β) is integrable on (0, ᾱ],                               (4.8b)
    lim_{α→0} αS(α) = 0,                                              (4.8c)
    there exists γ_S ∈ (0, 2) such that −αS'(α) ≤ γ_S S(α) for all α ∈ (0, ᾱ].   (4.8d)

We will show in section 5 for a number of examples that this assumption is satisfied. Now we are in the position to give an estimate of the MISE. The estimate of the MISE in the image space H_2 in (4.10) is needed in the analysis of L2-boosting (section 5.4) and for nonlinear inverse problems.

Theorem 6. Consider the model (2.2), and let Assumptions 1 and 2 hold true. We define a general spectral estimator f̂_{α,σ} by (3.3) and assume that Φ_α satisfies (3.8).

1. If condition (3.9) is satisfied for the function Λ defining the smoothness class F_{Λ,w̄,K*K}, then for all f ∈ F_{Λ,w̄,K*K} the MISE can be asymptotically bounded by

    E‖f̂_{α,σ} − f‖²_{H_1} ≲ (γ_Λ Λ(α) w̄ + (C_2 C_3/α)^{1/2} τ)² + (C_2² + C_3²) σ² α^{−2} ∫_0^α S(β) dβ,   α → 0.   (4.9)

2. Assume that g ∈ F_{Λ̃,w̄,KK*} ⊂ H_2 and that Λ̃ satisfies (3.9). (If g = Kf with f ∈ F_{Λ,w̄,K*K}, then Λ̃(t) := t^{1/2}Λ(t); but we do not assume g ∈ R(K) here!) Then

    E‖Kf̂_{α,σ} − g‖²_{H_2} ≲ (γ_Λ̃ Λ̃(α) w̄ + C_2 τ)² + (C_2² + C_3²) σ² α^{−2} ∫_0^α β S(β) dβ,   α → 0.   (4.10)

Note that for statistical inverse problems, as opposed to deterministic inverse problems, the estimates of the noise term and hence the rates of convergence of the MISE depend not only on the relative smoothness of the solution (i.e. on Λ), but also on the operator (i.e. on S).

Remark 7. We comment on the choice of the regularization parameter α > 0. If the noise levels σ and τ, the spectral properties of K*K (i.e. S) and the smoothness of f (i.e. Λ) are known, one can choose α by minimizing the right-hand side of (4.9). Since typically the smoothness of the solution is not known a priori, so-called adaptive methods must be employed for the selection of α. We do not intend to review the considerable amount of literature on this topic here, but want to mention that the explicit bounds on the variance given in Theorem 6 allow the application of the Lepskij balancing principle as proposed for inverse problems by Mathé & Pereverzev [3, 3] and Bauer & Pereverzev [2]. We will discuss this in more detail elsewhere. With this method one typically loses a log factor in the asymptotic rates of convergence.
In most cases this can be avoided by using Akaike s method as studied for spectral cut-off and related methods by Cavalier et al. [7]. Unfortunately, Assumption 2 in this paper is not satisfied for the methods discussed here Comparison with spectral cut-off. To show that with an optimal choice of our estimators can achieve the best possible order of convergence among all estimators as σ, we compare them to the spectral cut-off estimator for which minimax results are known in many situations see references in the introduction). Since we are mainly interested in the case that the statistical noise is asymptotically dominant, we will assume that τ = for simplicity. Moreover, we assume in addition to 2.5) that the lower bound Var UK εs)) γ var ρs). 4.) holds true for some constant γ var >. For the white noise model this is satisfied with γ var = and for the inverse regression model with γ var = C v,l /C see 2.2)). Moreover, we need the following assumption to prove optimal rates in many mildly ill-posed problems. Assumption 3. There exists a constant C 4 > such that for all, ᾱ] C 4 S ) S). 4.2) Theorem 8. Let Assumptions and 2 and the lower bound 4.) hold true and assume that the family of functions {Φ } satisfies 3.8). Moreover, assume that either S = R, or Assumption 3 holds true. Then the integrated variance of the estimator ˆf,σ is bounded by the integrated variance of the spectral cut-off estimator ˆf SC,σ E ˆf,σ E ˆf,σ 2 < C2 2 + κc 2 3 γ var SC SC E ˆf,σ E ˆf,σ 2,, 4.3)

with C₂ and C₃ as in Theorem 6 and κ := γ_S/(2 − γ_S), where γ_S is defined in (4.8d). Moreover, if condition (3.9) is satisfied for the function Λ defining the smoothness class F_{Λ,w}, and if τ = 0, then there exists a constant C > 0 such that

    sup_{f ∈ F_{Λ,w}} E‖f̂_{α,σ} − f‖² ≤ C sup_{f ∈ F_{Λ,w}} E‖f̂^SC_{α,σ} − f‖²    (4.14)

for all σ > 0 and all α > 0 sufficiently small.

Whereas condition (4.12) is usually satisfied for mildly ill-posed problems, it is not satisfied for exponentially ill-posed problems, where S(α) ~ c (ln(1/α))^q for constants c, q > 0. Nevertheless, the error bounds in Theorem 6 yield optimal rates of convergence in the limit σ → 0 for logarithmic source conditions after taking the infimum over all α. This is made precise in the following result, which relies on a comparison of the rates for general regularization methods with bounds on the spectral cut-off rates; the latter are known to be optimal in many situations (e.g. Mair & Ruymgaart [28]).

Theorem 9. Under the assumptions of Theorem 6, Part 1, with τ = 0, define the functions

    γ₁(α) := ∫_α^ᾱ β⁻¹ dR(β)   and   γ₂(α) := (1/α²) ∫_0^α S(β) dβ,

and assume that

    Λ( γ₂⁻¹(γ₁(α)) ) ≤ C Λ(α),   0 < α ≤ ᾱ,    (4.15)

with the inverse function γ₂⁻¹ of γ₂ and a constant C > 0. Then

    inf_{α>0} E‖f̂_{α,σ} − f‖² ≲ inf_{α>0} { C (γ Λ(α) w)² + (C₂² + C₃²) σ² γ₁(α) },   σ → 0,

i.e. if we choose the optimal value of α for every noise level σ, all spectral regularization methods achieve the same rate of convergence of the MISE as spectral cut-off. Assumption (4.15) is satisfied if Λ(t) = (−ln t)^{−p} and γ₁(α) ≥ γ₂(α²), since then Λ(γ₂⁻¹(γ₁(α))) ≤ Λ(α²) = (2 ln(1/α))^{−p} = 2^{−p} Λ(α).

5. Applications. In this section we discuss how Assumption 2 of our main result (Theorem 8) can be verified for some specific operators of practical interest. A remarkable number of interesting inverse problems can be expressed in the form

    K*K = Θ(−Δ)    (5.1)

in terms of the Laplace operator Δ on some compact, smooth d-dimensional Riemannian manifold M with a (possibly empty) boundary ∂M. Our first three examples are of this form. Here Θ : [0, ∞) → (0, ∞) is a function satisfying lim_{λ→∞} Θ(λ) = 0.

Under the given assumptions the operator −Δ defined on D(−Δ) := H¹₀(M) ∩ H²(M) ⊂ L²(M) (i.e. with a Dirichlet condition on ∂M) is a positive, self-adjoint operator which has a complete orthonormal system of eigenvectors u_i in L²(M) with corresponding eigenvalues λ_i (see e.g. Taylor [38, Chap. 8.2]). Hence the operator on the right hand side of (5.1), defined as in (3.1), can be written as Θ(−Δ)f = Σ_i Θ(λ_i) ⟨f, u_i⟩ u_i for f ∈ L²(M). Due to a famous result of Weyl (see Taylor [38, Ch. 8, Thm 3.1 and Cor. 3.5]), the distribution of the eigenvalues has the asymptotic behavior

    N(λ) := #{λ_i : λ_i ≤ λ} ~ c_M λ^{d/2} vol M,   c_M := 1 / ( Γ(d/2 + 1) (4π)^{d/2} ),    (5.2)

as λ → ∞, where vol M = ∫_M dx denotes the volume of M. Under the given assumptions the operator Θ(−Δ) is compact as the operator norm limit of the finite rank operators Σ_{i=1}^k Θ(λ_i) ⟨·, u_i⟩ u_i as k → ∞. Assume that Θ(λ) is monotonically decreasing for λ ≥ λ₀ and that Θ(λ) > α₀ := Θ(λ₀) for λ < λ₀. As lim_{λ→∞} Θ(λ) = 0, the inverse function Θ⁻¹ : (0, α₀] → [λ₀, ∞) satisfies lim_{α↘0} Θ⁻¹(α) = ∞. If the spectral decomposition is chosen as in Remark 1, then the function R defined in (4.6) satisfies

    R(α) = #{λ_i : Θ(λ_i) ≥ α} = N(Θ⁻¹(α)) ~ c_M (Θ⁻¹(α))^{d/2} vol M,   α ↘ 0.    (5.3)
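Weyl's law (5.2) can be made concrete in the simplest case M = (0, π) with d = 1, where the Dirichlet Laplacian has eigenvalues λ_j = j² and the counting function is N(λ) = ⌊√λ⌋, so that c_M λ^{1/2} vol M = (1/π)·√λ·π = √λ. A quick numerical confirmation:

```python
import math

# Weyl's law (5.2) in the simplest case: M = (0, pi), d = 1, where the
# Dirichlet Laplacian has eigenvalues lambda_j = j**2, j = 1, 2, ...
d, vol_M = 1, math.pi
c_M = 1.0 / (math.gamma(d / 2 + 1) * (4 * math.pi) ** (d / 2))  # = 1/pi

lam = 1.0e6
N_exact = sum(1 for j in range(1, 2000) if j * j <= lam)  # counting function
N_weyl = c_M * lam ** (d / 2) * vol_M                      # = sqrt(lam) here

print(N_exact, round(N_weyl))  # 1000 1000
```

The agreement is exact in this one-dimensional example; on a general manifold (5.2) holds only asymptotically as λ → ∞.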

5.1. Backwards heat equation. We consider the inverse problem of reconstructing the temperature at time t = 0 on M from measurements of the temperature at time t = T. The forward problem is described by the parabolic equation

    ∂u/∂t (x, t) = Δu(x, t),   x ∈ M, t ∈ (0, T),
    u(x, t) = 0,               x ∈ ∂M, t ∈ (0, T],
    u(x, 0) = f(x),            x ∈ M,                (5.4)

with an initial temperature f ∈ L²(M) and the final temperature g(x) := u(x, T), x ∈ M. We have g = exp(TΔ)f, i.e. K = exp(TΔ) ∈ L(L²(M)) and K*K = exp(2TΔ). Hence Θ(λ) = exp(−2Tλ) in (5.1). By virtue of (5.3) the condition R ≲ S is satisfied for

    S(α) := c_M vol M ( (1/(2T)) ln(1/α) )^{d/2}.

It is easy to check that this function satisfies the conditions (4.8). In particular,

    −α S''(α)/S'(α) = 1 + (d − 2)/(2 ln(1/α)),    (5.5)

so (4.8d) holds with any γ_S ∈ (1, 2) for sufficiently small ᾱ if d ≥ 3, and with γ_S = 1 for all ᾱ < 1 if d ≤ 2. If M is a compact Riemannian manifold without boundary, then the smoothness class F_{Λ,1} for a logarithmic source condition (3.6) is the unit ball in the Sobolev space H^{2p}(M) with respect to some equivalent norm (see Hohage [2]). Similar results hold true if M has a boundary with a Dirichlet or Neumann condition; in this case we additionally need to impose boundary conditions. Hence, if the initial temperature is bounded in some Sobolev norm, ‖f‖_{H^s} ≤ C, s = 2p > 0, if the regularization parameter is chosen such that α ~ σ, and if τ = O(σ^µ) with µ > 1/2 as σ → 0, then it follows from Theorem 6, after some elementary computations, that the MISE decays like

    E‖f̂_{α,σ} − f‖²_{L²} = O( (−ln σ)^{−s} ),   σ → 0,

for all regularization methods satisfying (3.8).

5.2. Satellite gradiometry. In satellite gradiometry, measurements of the gravitational force of the earth at a distance a > 1 from the center are used to reconstruct the gravitational potential u at the surface of the earth (see Hohage [2], Bauer & Pereverzev [2] and the references therein). Let the earth be described by E := {x : |x| < 1}, and let M := ∂E denote the surface of the earth. Then u satisfies the Laplace equation Δu(x) = 0 for x ∈ R³ \ E̅ and decays like u(x) = O(|x|⁻¹) as |x| → ∞.
The available data consist of noisy measurements of the rate of change of the gravitational force ∂u/∂r in the radial direction r = |x|, i.e. of g(x) := (∂²u/∂r²)(x) for |x| = a. A discussion of the measurement errors shows that they are mainly of a random nature (see [2]). The problem is to determine the potential f = u|_M at the surface M of the earth. Introducing the operator K : L²(M) → L²(aM) mapping the solution f to the data g, we can write K*K in the form (5.1) with Θ(λ) = Φ(Λ(λ)) and

    Φ(t) := c (1 + t)² (2 + t)² a^{−2t},   Λ(λ) := √(λ + 1/4) − 1/2

14 see Hohage [2]). It is easy to show that Φt) is decreasing for sufficiently large t and that Λλ) is monotonic increasing for all λ >. Obviously, Θ) = Λ Φ) ) = Φ) 2 2. The function Φ) cannot be computed explicitly, but we can estimate its asymptotic behavior as. Writing t = Φ) for sufficiently small and pt) := c 2 + t) t) 2 we obtain ) t Φ) = log a log a = log t a log a pt)a 2t ) = log t a ln log a pt)) + 2t 2 ln a, as. Therefore, using 5.3), we get R) = c M Θ) ln ) 2,. 2 ln a The function S) := ln 2 ln a) 2 satisfies the conditions 4.8) see 5.5)). Moreover, the smoothness classes F Λ, for logarithmic source conditions 3.6) are unit balls in the Sobolev spaces H p M) w.r.t. equivalent norms see Hohage [2]). Since the gravitational potential satisfies the Poisson equation u = φ in R 3 and since the mass density φ of the earth E belongs to L 2 E), it follows from elliptic regularity results that u H 2 E), so f = u M H 3/2 M) in the sense of the trace operator see e.g. Taylor [37]). Therefore, if τ = Oσ) and if we choose σ. E ˆf,σ f 2 L 2 = O ln σ) 3), σ, 5.3. Operators in Hilbert scales. In the following we show that our assumptions are satisfied for operators acting in Hilbert scales see Mair & Ruymgaart [28], Mathé & Pereverzev [29]). Hence, spectral regularization methods yield optimal rates of convergence for this class of operators. Let L : DL) H H be an unbounded, positive, self-adjoint operator defined on a dense domain DL) H, and assume the inverse L : H H is bounded. Then L generates a scale of Hilbert spaces H µ, µ R defined as completion of n N DLn ) under the norm generated by the inner product f, g µ := L µ f, L µ g. We have H µ H λ for µ, λ R with µ > λ. A prototype is L = I with the Laplace operator on a closed manifold M, which leads to the usual Sobolev spaces on M. We assume that K is a-times smoothing a > ) in part of) the Hilbert scale H µ ), i.e. K : H µ a H µ is a bounded operator for all µ [µ, µ] which has a bounded inverse K : H µ H µ a. 
This is equivalent to C µ f µ a Kf µ C µ f µ a for all f H µ a and some constants C µ. Such conditions are satisfied for many boundary integral operators, multiplication operators, convolution operators and compositions of such operators see also the discussion after 5.)). We do not assume here that K is self-adjoint or that K K and L commute, i.e. that they can be diagonalized by the same unitary operator U. Usually the nature of the noise dictates the choice H 2 = H, and one is interested in error bounds for the estimator in positive norm, i.e. H = H µ a for µ a. Then the operator equation Kf = g is ill-posed with K = K µ a considered as an operator from H µ a to H. To verify Assumptions 2 and 3 with R S in 4.7) replaced by R S see Remark 4), we assume that L has a complete orthonormal system of eigenvectors with eigenvalues < λ λ λ 2... tending to infinity. Then the embedding operator J : H µ H is compact, and its singular values are given by σ j J) = λ µ j. It follows from the decomposition K µ a = JK µ µ a and the Courant mini-max characterization of the singular values σ j = σ j K µ a ) see e.g. Kreß [26]) that K µ a µ λ µ j σ j K µ a ) K µ µ a λ µ j, j =,,... Hence if Nλ) := #{λ j : λ j λ} and C := max K µ µ a, K µ a µ ), then R) := {σ j : σj 2 } satisfies N /C 2 ) /2µ) R) N C 2 ) /2µ). 4
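The two-sided bound above is easiest to see in a commuting diagonal toy model, where it holds with C = 1. The following sketch (an illustrative assumption, not the general non-commuting setting of this subsection) takes L = diag(λ_j) with λ_j = j and K = L^(−a):

```python
import numpy as np

# Commuting diagonal toy model: L = diag(lambda_j), lambda_j = j, and a
# hypothetical a-times smoothing operator K = L**(-a). In Hilbert-scale
# coordinates g_j = lambda_j**(mu - a) * f_j, the map K_{mu-a}: H_{mu-a} -> H
# is diagonal with entries lambda_j**(-mu), so here sigma_j = lambda_j**(-mu)
# and the two-sided singular value bound holds with C = 1.
a, mu, n = 1.0, 1.5, 50
lam = np.arange(1, n + 1, dtype=float)

sigma = lam ** (-a) * lam ** (-(mu - a))
assert np.allclose(sigma, lam ** (-mu))

# R(alpha) = #{ sigma_j^2 >= alpha } behaves like alpha**(-d/(2*mu)), d = 1:
alpha = 1e-4
R = int(np.sum(sigma ** 2 >= alpha))
print(R, round(alpha ** (-1.0 / (2 * mu))))  # 21 22
```

The counting function R(α) computed here matches the power-law prediction α^(−d/(2µ)) up to the discretization, in line with the next paragraph.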

If the counting function has the asymptotic behavior N(λ) ≍ λ^d for some d > 0, then R(α) ≍ α^{−d/(2µ)}. For the case L = (I − Δ)^{1/2}, d is the space dimension (see (5.2)). A straightforward computation shows that S(α) := α^{−d/(2µ)} satisfies (4.8) and (4.12) in Assumptions 2 and 3 if and only if d/(2µ) ∈ (0, 1). Under this condition, it follows from Remark 4 that Theorems 6 and 8 hold true with different constants.

It remains to discuss the Hölder-type source conditions (3.5) in this setting. To do this we assume for simplicity that H₁ = H₂ = H. Let K* denote the adjoint of K with respect to the inner product in H. It is easy to show that K* : H_µ → H_{µ+a} is bounded and boundedly invertible for all µ ∈ [µ̲, µ̄]. Let l ∈ N be such that [−2al, 2al] ⊂ [µ̲, µ̄]. Then there exists a constant γ ≥ 1 such that

    γ⁻¹ ‖L^{2al} f‖_H ≤ ‖(K*K)^{−l} f‖_H ≤ γ ‖L^{2al} f‖_H   for all f ∈ H_{2al}.

It follows from the Heinz inequality (see [3], Heinz [2]) that

    γ^{−σ} ‖L^{2aσl} f‖_H ≤ ‖(K*K)^{−σl} f‖_H ≤ γ^{σ} ‖L^{2aσl} f‖_H   for all σ ∈ [0, 1] and f ∈ H_{2aσl}.

Therefore, the source condition f = (K*K)^ν w, w ∈ H, is equivalent to f ∈ H_{2aν}. Let u := 2aν and f ∈ H_u. Then

    E‖f̂_{α,σ} − f‖²_H = O( σ^{2u/(u+a+d/2)} ),   σ → 0,

for the choice α ~ σ^{2a/(u+a+d/2)}, provided τ = O( σ^{(u+a)/(u+a+d/2)} ) and µ̄ ≥ u/(2a).

5.4. L²-Boosting. Boosting algorithms comprise a large class of iterative procedures which stagewise improve the performance of estimators. They have attracted significant interest in the machine learning context and, more recently, in statistics (see Freund & Schapire [6] or Friedman [7], among many others). One of the main challenges is to provide a proper convergence analysis and proper stopping rules for the iteration depth (see Zhang & Yu [4]). L²-boosting has been introduced in the context of regression analysis by Bühlmann & Yu [6]; boosting algorithms were originally designed for classification and more general learning problems. We consider the inverse regression problem described in Section 2.5 when K is an embedding operator and X₂ is a d-dimensional smooth, compact Riemannian manifold (e.g. a smooth compact subset of R^d). Consider a weak learner of the form

    f̂_{0,n} = (1/n) Σ_{j=1}^n Y_j k(·, X_j),    (5.6)

with a continuous, symmetric kernel k : X₂ × X₂ → R such that the integral operator K : L²(X₂) → L²(X₂) with kernel k is compact and strictly positive definite with eigenvalues κ₀ ≥ κ₁ ≥ ... and satisfies

    ess sup_{x∈X₂} k(x, x) < ∞   and   #{κ_j ≥ α} ≍ α^{−d/(2µ)} as α ↘ 0,    (5.7)

for some µ > 0. Further, let H_µ, µ ∈ R, be the Hilbert scale generated by the operator L := K^{−1/(2µ)} as described in Section 5.3. If we set H₁ := H_µ and H₂ = H := L²(X₂), then H₁ ⊂ H₂, and the adjoint of the embedding operator H₁ → H₂ is given by ϕ ↦ Kϕ, since ⟨ϕ, Kψ⟩_{H₁} = ⟨L^µ ϕ, L^µ Kψ⟩_{L²} = ⟨ϕ, ψ⟩_{L²} for all ψ ∈ L²(X₂) and all ϕ ∈ H₁. By a similar reasoning one can show that H₁ is a reproducing kernel Hilbert space (RKHS) with reproducing kernel k(·, x). A typical example of a weak learner is a spline smoother, which leads to Sobolev spaces H_µ (see [6]). Note that the weak learner (5.6) can shortly be written as f̂_{0,n} = K*Y. Boosting this learner results in the recursive iteration

    f̂_{j+1,n} = f̂_{j,n} + β K*(Y − K f̂_{j,n}),   j = 0, 1, 2, ...,    (5.8)

which is in fact the Landweber iteration (see Section 3). Hence, Theorem 6 gives the following bound.
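The recursion (5.8) can be sketched in a few lines on synthetic data; the kernel matrix, the noise level, and the test function below are illustrative assumptions, not the paper's setting:

```python
import numpy as np

# Synthetic sketch of the boosting recursion (5.8), i.e. the Landweber
# iteration f_{j+1} = f_j + beta * K^T (Y - K f_j). The kernel matrix and
# the data below are illustrative assumptions only.
rng = np.random.default_rng(0)
n = 200
x = np.linspace(0.0, 1.0, n)
K = np.exp(-((x[:, None] - x[None, :]) ** 2) / 0.02) / n  # "weak learner"
f_true = np.sin(2 * np.pi * x)
Y = K @ f_true + 0.01 * rng.standard_normal(n)

beta = 1.0 / np.linalg.norm(K, 2) ** 2   # step size beta <= 1/||K||^2
f = np.zeros(n)
res = []
for j in range(500):
    f = f + beta * K.T @ (Y - K @ f)     # one boosting / Landweber step
    res.append(np.linalg.norm(Y - K @ f))

# For this step size the residual norms are non-increasing
assert all(r2 <= r1 + 1e-9 for r1, r2 in zip(res, res[1:]))
```

The stopping index j plays the role of the regularization parameter, α = (j + 1)⁻¹: iterating too long inflates the variance, which is exactly the trade-off quantified next.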

Corollary 10. Assume that k satisfies (5.7) with µ > d/2, let g ∈ H_s with s > 0, and let β ∈ (0, ‖K‖⁻²]. Then the MISE is bounded by

    E‖f̂_{j,n} − g‖²_{L²(X₂)} ≤ C ( (j + 1)^{−s/µ} + n⁻¹ (j + 1)^{d/(2µ)} ).    (5.9)

For the optimal stopping index j*(n) ≍ n^{2µ/(2s+d)} we obtain the rate E‖f̂_{j*(n),n} − g‖²_{L²(X₂)} ≤ C n^{−2s/(2s+d)}, which is the well-known minimax rate in the case of Sobolev spaces.

Proof. It follows easily from the definitions that g ∈ H_s is equivalent to g ∈ F_{Λ,w} with Λ(t) = t^{s/(2µ)} for some w > 0. Since the Landweber iteration has infinite qualification (see [3]), Λ satisfies (3.9). Moreover, as the singular values of K are σ_j(K) = √κ_j, (5.7) implies that R(α) = #{σ_j(K)² ≥ α} ≍ S(α) with S(α) := α^{−d/(2µ)}, and S satisfies (4.8) in Assumption 2 for µ > d/2. To verify the assumptions of Prop. 4, we note that tr(K*K) = ∫ β dR(β) < ∞ for µ > d/2 (i.e. K is Hilbert–Schmidt) and that ess sup_{x∈X₂} ‖k(x, ·)‖²_{H₁} = ess sup_{x∈X₂} ⟨k(x, ·), k(x, ·)⟩_{H₁} = ess sup_{x∈X₂} k(x, x) < ∞. Therefore, Prop. 4 and Remark 5 imply that Assumption 1 is satisfied with σ ~ n^{−1/2}. Hence (5.9) follows from (4.10) in Theorem 6 with α = (j + 1)⁻¹ and Remark 4.

Corollary 10 immediately applies to all other regularization methods covered by Theorem 6. In particular, ν-methods require only the square root of the number of Landweber iterations to achieve the optimal rate, but they seem to be unknown in statistics and machine learning. Often a discretized sample variant of the iteration (5.8) is considered. Convergence of this algorithm has been analyzed by Yao et al. [4], but without optimal rates. It is still an open problem whether this discretized version achieves the minimax rates of Corollary 10 in the general context of RKHS, as has been shown in [6] for the particular case of spline learning.

5.5. Errors in variable problems. We now further discuss the errors-in-variables problem introduced in Section 2. Our aim is to establish rates of convergence of estimators of the density f on R^d as the sample size n tends to infinity. Therefore, with a slight abuse of notation, we will write f̂_{α,n} = f̂_{α,σ(n,g)} in this context. It follows from the definition of σ in Section 2 and the boundedness of ‖Λ(K*K)‖_{2,2} that

    sup_{f ∈ F_{Λ,w}} σ(n, Kf) = sup_{‖v‖ = w} σ(n, K Λ(K*K) v) ≤ w n^{−1/2} ( ‖K Λ(K*K)‖_{2,∞} + ‖K Λ(K*K)‖²_{2,2} )^{1/2},

where the expression in parentheses is finite under the assumptions of Proposition 2. We first consider the two important special cases

    h₁(z | y) = w₁(z) := exp(−π ‖z‖₂²),   h₂(z | y) = w₂(z) := c_d exp(−‖z‖₂),   y, z ∈ R^d,

with the normalization constant c_d := π^{−d/2} Γ(d/2 + 1)/Γ(d + 1), corresponding to an error variable W independent of Y. Here K is a convolution operator, the canonical unitary transformation U in the spectral decomposition is the Fourier transform F defined in Section 2, and the multiplier function is ρ_j = |F w_j|², i.e.

    ρ₁(ξ) = exp(−2π ‖ξ‖₂²)   and   ρ₂(ξ) = (1 + 4π² ‖ξ‖₂²)^{−(d+1)}.

Hence, the corresponding functions R are given by

    R₁(α) = V_d ( (1/(2π)) ln(1/α) )^{d/2},   R₂(α) ~ V_d (2π)^{−d} α^{−d/(2d+2)},   0 < α < 1,

where V_d denotes the volume of the unit ball in R^d. Hence, Assumption 2 is satisfied for R₁ with S₁ = R₁ (cf. (5.5)) and for R₂ with S₂(α) := V_d (2π)^{−d} α^{−d/(2d+2)}. Since the norm of the Sobolev space H^s(R^d) is defined by ‖f‖_{H^s(R^d)} = ( ∫ (1 + ‖ξ‖²)^s |Ff(ξ)|² dξ )^{1/2}, a simple computation shows that in the first case a logarithmic source condition (3.6) is equivalent to f ∈ H^{2p}(R^d), and in the second case a Hölder-type source condition (3.5) is equivalent to f ∈ H^{2(d+1)ν}(R^d). Suppose that f ∈ H^s(R^d). Then we find in the first case, for the choice α ~ n^{−1/2}, the asymptotic rate

    E‖f̂_{α,n} − f‖²_{L²} = O( (ln n)^{−s} ),   n → ∞,

and in the second case the rate

    E‖f̂_{α,n} − f‖²_{L²} = O( n^{−s/(s+3d/2+1)} ),   n → ∞,

for the choice α ~ n^{−(d+1)/(s+3d/2+1)}. This generalizes results of Mair & Ruymgaart [28] for spectral cut-off to arbitrary regularization methods and to the multivariate setting.

We now consider the case that the random variables Y and W are not stochastically independent. We assume that the conditional density h is of the form

    h(x − y | y) = w(x − y) + p(x, y),    (5.10)

where c (1 + ‖ξ‖₂²)^{−a} ≤ |Fw(ξ)|² ≤ c' (1 + ‖ξ‖₂²)^{−a} for some constants a, c, c' > 0, and p is C^∞-smooth and decays exponentially as ‖x‖, ‖y‖ → ∞. Then the convolution operator K₀ with kernel w is bounded and boundedly invertible from H^{µ−a}(R^d) to H^µ(R^d) for all µ ∈ R, and the integral operator P with kernel p is compact from H^{µ−a}(R^d) to H^µ(R^d) for all µ ∈ R. Under our general assumption that K = K₀ + P is injective, it follows from Riesz theory that K : H^{µ−a}(R^d) → H^µ(R^d) has a bounded inverse. Hence, it follows from the arguments of the previous paragraph that a Hölder source condition (3.5) for K is equivalent to f ∈ H^{2aν}(R^d). If we additionally assume periodicity of w and p, with an arbitrary size of the periodicity interval, then it follows from our results on operators in Hilbert scales that Assumptions 1 and 2 are also satisfied.

6. Appendix: Proofs and auxiliary results. This section contains the proofs of our main results on the MISE. First, we require some technical lemmas.

Lemma 11. If Assumption 1 holds true and the family of functions {Φ_α} satisfies (3.8), then

    E‖f̂_{α,σ} − E f̂_{α,σ}‖² ≤ (σ C₃/α)² ∫_0^α β dR(β) + (σ C₂)² ∫_α^∞ β⁻¹ dR(β),    (6.1a)

    E‖K f̂_{α,σ} − E K f̂_{α,σ}‖² ≤ (σ C₃/α)² ∫_0^α β² dR(β) + (σ C₂)² R(α).    (6.1b)

Proof. Recall the bound (4.5) on E‖f̂_{α,σ} − E f̂_{α,σ}‖². We split the integral of the variance on the right hand side of (4.5) over the frequency domain S into the low frequency components {ρ ≥ α} and the high frequency components {ρ < α}. The low frequency components are bounded by

    ∫_{ρ≥α} |Φ_α(ρ)|² ρ dΣ ≤ C₂² ∫_{ρ≥α} ρ⁻¹ dΣ = C₂² ∫_α^∞ β⁻¹ dR(β),

where the latter equality holds by a transformation of the integral on the left hand side to an integral with respect to the image measure Σ ∘ ρ⁻¹ and a subsequent reformulation as the Lebesgue–Stieltjes integral given on the right hand side of the equation. Similarly, the high frequency components of the variance can be estimated by

    ∫_{ρ<α} |Φ_α(ρ)|² ρ dΣ ≤ (C₃/α)² ∫_{ρ<α} ρ dΣ = (C₃/α)² ∫_0^α β dR(β),

using (3.8b). This completes the proof of (6.1a). In analogy to (4.5) we have

    E‖K f̂_{α,σ} − E K f̂_{α,σ}‖² ≤ σ² ∫_S |Φ_α(ρ)|² ρ² dΣ,

and the right hand side of this inequality can be estimated as above to establish the bound (6.1b).

The next lemma shows that, for R = S, the high frequency components of the variance are asymptotically bounded by the low frequency components, and that the relative magnitude of these components is determined by the constant γ_S in (4.8d).

Lemma 12. Assume that S ∈ C²((0, ᾱ]) satisfies (4.8), and define κ := γ_S/(2 − γ_S), i.e. 2κ/(κ + 1) = γ_S. Then

    (1/α²) ∫_0^α β (−dS(β)) ≤ κ ( ∫_α^ᾱ β⁻¹ (−dS(β)) + S(ᾱ)/ᾱ ),   α ∈ (0, ᾱ].    (6.2)
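The low/high frequency split of the variance used in the proof above can be illustrated numerically. The sketch below uses the Tikhonov filter Φ_α(ρ) = 1/(ρ + α), which satisfies the bounds ρ|Φ_α(ρ)| ≤ C₂ and |Φ_α(ρ)| ≤ C₃/α with C₂ = C₃ = 1, together with a model point spectrum ρ_j = j⁻²; both choices are illustrative assumptions.

```python
import numpy as np

# Tikhonov filter Phi_alpha(rho) = 1/(rho + alpha): rho*Phi <= 1 and
# Phi <= 1/alpha, so C2 = C3 = 1. Model point spectrum rho_j = j**(-2).
rho = np.arange(1, 10_001, dtype=float) ** (-2.0)
alpha, C2, C3 = 1e-4, 1.0, 1.0

phi = 1.0 / (rho + alpha)
variance = np.sum(phi ** 2 * rho)   # integrated variance, up to sigma^2

low = rho >= alpha                   # low frequencies (large rho)
high = ~low                          # high frequencies (small rho)
bound = C2 ** 2 * np.sum(1.0 / rho[low]) + (C3 / alpha) ** 2 * np.sum(rho[high])

assert variance <= bound
print(variance <= bound)  # True
```

The inequality holds termwise here: each summand Φ_α(ρ_j)²ρ_j is dominated by ρ_j⁻¹ on the low frequencies and by ρ_j/α² on the high frequencies, mirroring the two estimates in the proof.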


The Dirichlet-to-Neumann operator

The Dirichlet-to-Neumann operator Lecture 8 The Dirichlet-to-Neumann operator The Dirichlet-to-Neumann operator plays an important role in the theory of inverse problems. In fact, from measurements of electrical currents at the surface

More information

Some Background Material

Some Background Material Chapter 1 Some Background Material In the first chapter, we present a quick review of elementary - but important - material as a way of dipping our toes in the water. This chapter also introduces important

More information

i=1 α i. Given an m-times continuously

i=1 α i. Given an m-times continuously 1 Fundamentals 1.1 Classification and characteristics Let Ω R d, d N, d 2, be an open set and α = (α 1,, α d ) T N d 0, N 0 := N {0}, a multiindex with α := d i=1 α i. Given an m-times continuously differentiable

More information

Learning gradients: prescriptive models

Learning gradients: prescriptive models Department of Statistical Science Institute for Genome Sciences & Policy Department of Computer Science Duke University May 11, 2007 Relevant papers Learning Coordinate Covariances via Gradients. Sayan

More information

2 Tikhonov Regularization and ERM

2 Tikhonov Regularization and ERM Introduction Here we discusses how a class of regularization methods originally designed to solve ill-posed inverse problems give rise to regularized learning algorithms. These algorithms are kernel methods

More information

Here we used the multiindex notation:

Here we used the multiindex notation: Mathematics Department Stanford University Math 51H Distributions Distributions first arose in solving partial differential equations by duality arguments; a later related benefit was that one could always

More information

The spectral zeta function

The spectral zeta function The spectral zeta function Bernd Ammann June 4, 215 Abstract In this talk we introduce spectral zeta functions. The spectral zeta function of the Laplace-Beltrami operator was already introduced by Minakshisundaram

More information

Necessary conditions for convergence rates of regularizations of optimal control problems

Necessary conditions for convergence rates of regularizations of optimal control problems Necessary conditions for convergence rates of regularizations of optimal control problems Daniel Wachsmuth and Gerd Wachsmuth Johann Radon Institute for Computational and Applied Mathematics RICAM), Austrian

More information

L p -boundedness of the Hilbert transform

L p -boundedness of the Hilbert transform L p -boundedness of the Hilbert transform Kunal Narayan Chaudhury Abstract The Hilbert transform is essentially the only singular operator in one dimension. This undoubtedly makes it one of the the most

More information

56 4 Integration against rough paths

56 4 Integration against rough paths 56 4 Integration against rough paths comes to the definition of a rough integral we typically take W = LV, W ; although other choices can be useful see e.g. remark 4.11. In the context of rough differential

More information

TOEPLITZ OPERATORS. Toeplitz studied infinite matrices with NW-SE diagonals constant. f e C :

TOEPLITZ OPERATORS. Toeplitz studied infinite matrices with NW-SE diagonals constant. f e C : TOEPLITZ OPERATORS EFTON PARK 1. Introduction to Toeplitz Operators Otto Toeplitz lived from 1881-1940 in Goettingen, and it was pretty rough there, so he eventually went to Palestine and eventually contracted

More information

On some weighted fractional porous media equations

On some weighted fractional porous media equations On some weighted fractional porous media equations Gabriele Grillo Politecnico di Milano September 16 th, 2015 Anacapri Joint works with M. Muratori and F. Punzo Gabriele Grillo Weighted Fractional PME

More information

Gaussian Random Fields

Gaussian Random Fields Gaussian Random Fields Mini-Course by Prof. Voijkan Jaksic Vincent Larochelle, Alexandre Tomberg May 9, 009 Review Defnition.. Let, F, P ) be a probability space. Random variables {X,..., X n } are called

More information

9 Radon-Nikodym theorem and conditioning

9 Radon-Nikodym theorem and conditioning Tel Aviv University, 2015 Functions of real variables 93 9 Radon-Nikodym theorem and conditioning 9a Borel-Kolmogorov paradox............. 93 9b Radon-Nikodym theorem.............. 94 9c Conditioning.....................

More information

Hilbert Spaces. Hilbert space is a vector space with some extra structure. We start with formal (axiomatic) definition of a vector space.

Hilbert Spaces. Hilbert space is a vector space with some extra structure. We start with formal (axiomatic) definition of a vector space. Hilbert Spaces Hilbert space is a vector space with some extra structure. We start with formal (axiomatic) definition of a vector space. Vector Space. Vector space, ν, over the field of complex numbers,

More information

Inverse Scattering with Partial data on Asymptotically Hyperbolic Manifolds

Inverse Scattering with Partial data on Asymptotically Hyperbolic Manifolds Inverse Scattering with Partial data on Asymptotically Hyperbolic Manifolds Raphael Hora UFSC rhora@mtm.ufsc.br 29/04/2014 Raphael Hora (UFSC) Inverse Scattering with Partial data on AH Manifolds 29/04/2014

More information

u xx + u yy = 0. (5.1)

u xx + u yy = 0. (5.1) Chapter 5 Laplace Equation The following equation is called Laplace equation in two independent variables x, y: The non-homogeneous problem u xx + u yy =. (5.1) u xx + u yy = F, (5.) where F is a function

More information

THE SINGULAR VALUE DECOMPOSITION MARKUS GRASMAIR

THE SINGULAR VALUE DECOMPOSITION MARKUS GRASMAIR THE SINGULAR VALUE DECOMPOSITION MARKUS GRASMAIR 1. Definition Existence Theorem 1. Assume that A R m n. Then there exist orthogonal matrices U R m m V R n n, values σ 1 σ 2... σ p 0 with p = min{m, n},

More information

Finite-dimensional spaces. C n is the space of n-tuples x = (x 1,..., x n ) of complex numbers. It is a Hilbert space with the inner product

Finite-dimensional spaces. C n is the space of n-tuples x = (x 1,..., x n ) of complex numbers. It is a Hilbert space with the inner product Chapter 4 Hilbert Spaces 4.1 Inner Product Spaces Inner Product Space. A complex vector space E is called an inner product space (or a pre-hilbert space, or a unitary space) if there is a mapping (, )

More information

Brownian motion. Samy Tindel. Purdue University. Probability Theory 2 - MA 539

Brownian motion. Samy Tindel. Purdue University. Probability Theory 2 - MA 539 Brownian motion Samy Tindel Purdue University Probability Theory 2 - MA 539 Mostly taken from Brownian Motion and Stochastic Calculus by I. Karatzas and S. Shreve Samy T. Brownian motion Probability Theory

More information

Estimates for probabilities of independent events and infinite series

Estimates for probabilities of independent events and infinite series Estimates for probabilities of independent events and infinite series Jürgen Grahl and Shahar evo September 9, 06 arxiv:609.0894v [math.pr] 8 Sep 06 Abstract This paper deals with finite or infinite sequences

More information

The circular law. Lewis Memorial Lecture / DIMACS minicourse March 19, Terence Tao (UCLA)

The circular law. Lewis Memorial Lecture / DIMACS minicourse March 19, Terence Tao (UCLA) The circular law Lewis Memorial Lecture / DIMACS minicourse March 19, 2008 Terence Tao (UCLA) 1 Eigenvalue distributions Let M = (a ij ) 1 i n;1 j n be a square matrix. Then one has n (generalised) eigenvalues

More information

10.1. The spectrum of an operator. Lemma If A < 1 then I A is invertible with bounded inverse

10.1. The spectrum of an operator. Lemma If A < 1 then I A is invertible with bounded inverse 10. Spectral theory For operators on finite dimensional vectors spaces, we can often find a basis of eigenvectors (which we use to diagonalize the matrix). If the operator is symmetric, this is always

More information

3 Compact Operators, Generalized Inverse, Best- Approximate Solution

3 Compact Operators, Generalized Inverse, Best- Approximate Solution 3 Compact Operators, Generalized Inverse, Best- Approximate Solution As we have already heard in the lecture a mathematical problem is well - posed in the sense of Hadamard if the following properties

More information

Some lecture notes for Math 6050E: PDEs, Fall 2016

Some lecture notes for Math 6050E: PDEs, Fall 2016 Some lecture notes for Math 65E: PDEs, Fall 216 Tianling Jin December 1, 216 1 Variational methods We discuss an example of the use of variational methods in obtaining existence of solutions. Theorem 1.1.

More information

The Laplacian PDF Distance: A Cost Function for Clustering in a Kernel Feature Space

The Laplacian PDF Distance: A Cost Function for Clustering in a Kernel Feature Space The Laplacian PDF Distance: A Cost Function for Clustering in a Kernel Feature Space Robert Jenssen, Deniz Erdogmus 2, Jose Principe 2, Torbjørn Eltoft Department of Physics, University of Tromsø, Norway

More information

SELF-ADJOINTNESS OF SCHRÖDINGER-TYPE OPERATORS WITH SINGULAR POTENTIALS ON MANIFOLDS OF BOUNDED GEOMETRY

SELF-ADJOINTNESS OF SCHRÖDINGER-TYPE OPERATORS WITH SINGULAR POTENTIALS ON MANIFOLDS OF BOUNDED GEOMETRY Electronic Journal of Differential Equations, Vol. 2003(2003), No.??, pp. 1 8. ISSN: 1072-6691. URL: http://ejde.math.swt.edu or http://ejde.math.unt.edu ftp ejde.math.swt.edu (login: ftp) SELF-ADJOINTNESS

More information

Solving the 3D Laplace Equation by Meshless Collocation via Harmonic Kernels

Solving the 3D Laplace Equation by Meshless Collocation via Harmonic Kernels Solving the 3D Laplace Equation by Meshless Collocation via Harmonic Kernels Y.C. Hon and R. Schaback April 9, Abstract This paper solves the Laplace equation u = on domains Ω R 3 by meshless collocation

More information

Functional Analysis Exercise Class

Functional Analysis Exercise Class Functional Analysis Exercise Class Week 9 November 13 November Deadline to hand in the homeworks: your exercise class on week 16 November 20 November Exercises (1) Show that if T B(X, Y ) and S B(Y, Z)

More information

A RECONSTRUCTION FORMULA FOR BAND LIMITED FUNCTIONS IN L 2 (R d )

A RECONSTRUCTION FORMULA FOR BAND LIMITED FUNCTIONS IN L 2 (R d ) PROCEEDINGS OF THE AMERICAN MATHEMATICAL SOCIETY Volume 127, Number 12, Pages 3593 3600 S 0002-9939(99)04938-2 Article electronically published on May 6, 1999 A RECONSTRUCTION FORMULA FOR AND LIMITED FUNCTIONS

More information

Regularization and Inverse Problems

Regularization and Inverse Problems Regularization and Inverse Problems Caroline Sieger Host Institution: Universität Bremen Home Institution: Clemson University August 5, 2009 Caroline Sieger (Bremen and Clemson) Regularization and Inverse

More information

POINTWISE BOUNDS ON QUASIMODES OF SEMICLASSICAL SCHRÖDINGER OPERATORS IN DIMENSION TWO

POINTWISE BOUNDS ON QUASIMODES OF SEMICLASSICAL SCHRÖDINGER OPERATORS IN DIMENSION TWO POINTWISE BOUNDS ON QUASIMODES OF SEMICLASSICAL SCHRÖDINGER OPERATORS IN DIMENSION TWO HART F. SMITH AND MACIEJ ZWORSKI Abstract. We prove optimal pointwise bounds on quasimodes of semiclassical Schrödinger

More information

Spectral theory for compact operators on Banach spaces

Spectral theory for compact operators on Banach spaces 68 Chapter 9 Spectral theory for compact operators on Banach spaces Recall that a subset S of a metric space X is precompact if its closure is compact, or equivalently every sequence contains a Cauchy

More information

11. Spectral theory For operators on finite dimensional vectors spaces, we can often find a basis of eigenvectors (which we use to diagonalize the

11. Spectral theory For operators on finite dimensional vectors spaces, we can often find a basis of eigenvectors (which we use to diagonalize the 11. Spectral theory For operators on finite dimensional vectors spaces, we can often find a basis of eigenvectors (which we use to diagonalize the matrix). If the operator is symmetric, this is always

More information

COMPOSITION OPERATORS ON HARDY-SOBOLEV SPACES

COMPOSITION OPERATORS ON HARDY-SOBOLEV SPACES Indian J. Pure Appl. Math., 46(3): 55-67, June 015 c Indian National Science Academy DOI: 10.1007/s136-015-0115-x COMPOSITION OPERATORS ON HARDY-SOBOLEV SPACES Li He, Guang Fu Cao 1 and Zhong Hua He Department

More information

Spectral Continuity Properties of Graph Laplacians

Spectral Continuity Properties of Graph Laplacians Spectral Continuity Properties of Graph Laplacians David Jekel May 24, 2017 Overview Spectral invariants of the graph Laplacian depend continuously on the graph. We consider triples (G, x, T ), where G

More information

4 Hilbert spaces. The proof of the Hilbert basis theorem is not mathematics, it is theology. Camille Jordan

4 Hilbert spaces. The proof of the Hilbert basis theorem is not mathematics, it is theology. Camille Jordan The proof of the Hilbert basis theorem is not mathematics, it is theology. Camille Jordan Wir müssen wissen, wir werden wissen. David Hilbert We now continue to study a special class of Banach spaces,

More information

Gaussian Processes. 1. Basic Notions

Gaussian Processes. 1. Basic Notions Gaussian Processes 1. Basic Notions Let T be a set, and X : {X } T a stochastic process, defined on a suitable probability space (Ω P), that is indexed by T. Definition 1.1. We say that X is a Gaussian

More information

9 Brownian Motion: Construction

9 Brownian Motion: Construction 9 Brownian Motion: Construction 9.1 Definition and Heuristics The central limit theorem states that the standard Gaussian distribution arises as the weak limit of the rescaled partial sums S n / p n of

More information

285K Homework #1. Sangchul Lee. April 28, 2017

285K Homework #1. Sangchul Lee. April 28, 2017 285K Homework #1 Sangchul Lee April 28, 2017 Problem 1. Suppose that X is a Banach space with respect to two norms: 1 and 2. Prove that if there is c (0, such that x 1 c x 2 for each x X, then there is

More information

Real Analysis Notes. Thomas Goller

Real Analysis Notes. Thomas Goller Real Analysis Notes Thomas Goller September 4, 2011 Contents 1 Abstract Measure Spaces 2 1.1 Basic Definitions........................... 2 1.2 Measurable Functions........................ 2 1.3 Integration..............................

More information

A COUNTEREXAMPLE TO AN ENDPOINT BILINEAR STRICHARTZ INEQUALITY TERENCE TAO. t L x (R R2 ) f L 2 x (R2 )

A COUNTEREXAMPLE TO AN ENDPOINT BILINEAR STRICHARTZ INEQUALITY TERENCE TAO. t L x (R R2 ) f L 2 x (R2 ) Electronic Journal of Differential Equations, Vol. 2006(2006), No. 5, pp. 6. ISSN: 072-669. URL: http://ejde.math.txstate.edu or http://ejde.math.unt.edu ftp ejde.math.txstate.edu (login: ftp) A COUNTEREXAMPLE

More information

ON PARABOLIC HARNACK INEQUALITY

ON PARABOLIC HARNACK INEQUALITY ON PARABOLIC HARNACK INEQUALITY JIAXIN HU Abstract. We show that the parabolic Harnack inequality is equivalent to the near-diagonal lower bound of the Dirichlet heat kernel on any ball in a metric measure-energy

More information

Part II Probability and Measure

Part II Probability and Measure Part II Probability and Measure Theorems Based on lectures by J. Miller Notes taken by Dexter Chua Michaelmas 2016 These notes are not endorsed by the lecturers, and I have modified them (often significantly)

More information

Chapter 1. Measure Spaces. 1.1 Algebras and σ algebras of sets Notation and preliminaries

Chapter 1. Measure Spaces. 1.1 Algebras and σ algebras of sets Notation and preliminaries Chapter 1 Measure Spaces 1.1 Algebras and σ algebras of sets 1.1.1 Notation and preliminaries We shall denote by X a nonempty set, by P(X) the set of all parts (i.e., subsets) of X, and by the empty set.

More information

Scattered Data Interpolation with Polynomial Precision and Conditionally Positive Definite Functions

Scattered Data Interpolation with Polynomial Precision and Conditionally Positive Definite Functions Chapter 3 Scattered Data Interpolation with Polynomial Precision and Conditionally Positive Definite Functions 3.1 Scattered Data Interpolation with Polynomial Precision Sometimes the assumption on the

More information

THEOREMS, ETC., FOR MATH 515

THEOREMS, ETC., FOR MATH 515 THEOREMS, ETC., FOR MATH 515 Proposition 1 (=comment on page 17). If A is an algebra, then any finite union or finite intersection of sets in A is also in A. Proposition 2 (=Proposition 1.1). For every

More information

NOTES ON FRAMES. Damir Bakić University of Zagreb. June 6, 2017

NOTES ON FRAMES. Damir Bakić University of Zagreb. June 6, 2017 NOTES ON FRAMES Damir Bakić University of Zagreb June 6, 017 Contents 1 Unconditional convergence, Riesz bases, and Bessel sequences 1 1.1 Unconditional convergence of series in Banach spaces...............

More information

In this chapter we study elliptical PDEs. That is, PDEs of the form. 2 u = lots,

In this chapter we study elliptical PDEs. That is, PDEs of the form. 2 u = lots, Chapter 8 Elliptic PDEs In this chapter we study elliptical PDEs. That is, PDEs of the form 2 u = lots, where lots means lower-order terms (u x, u y,..., u, f). Here are some ways to think about the physical

More information

MATH MEASURE THEORY AND FOURIER ANALYSIS. Contents

MATH MEASURE THEORY AND FOURIER ANALYSIS. Contents MATH 3969 - MEASURE THEORY AND FOURIER ANALYSIS ANDREW TULLOCH Contents 1. Measure Theory 2 1.1. Properties of Measures 3 1.2. Constructing σ-algebras and measures 3 1.3. Properties of the Lebesgue measure

More information

Analysis Comprehensive Exam Questions Fall 2008

Analysis Comprehensive Exam Questions Fall 2008 Analysis Comprehensive xam Questions Fall 28. (a) Let R be measurable with finite Lebesgue measure. Suppose that {f n } n N is a bounded sequence in L 2 () and there exists a function f such that f n (x)

More information

Takens embedding theorem for infinite-dimensional dynamical systems

Takens embedding theorem for infinite-dimensional dynamical systems Takens embedding theorem for infinite-dimensional dynamical systems James C. Robinson Mathematics Institute, University of Warwick, Coventry, CV4 7AL, U.K. E-mail: jcr@maths.warwick.ac.uk Abstract. Takens

More information

Regularization via Spectral Filtering

Regularization via Spectral Filtering Regularization via Spectral Filtering Lorenzo Rosasco MIT, 9.520 Class 7 About this class Goal To discuss how a class of regularization methods originally designed for solving ill-posed inverse problems,

More information

Regularizations of Singular Integral Operators (joint work with C. Liaw)

Regularizations of Singular Integral Operators (joint work with C. Liaw) 1 Outline Regularizations of Singular Integral Operators (joint work with C. Liaw) Sergei Treil Department of Mathematics Brown University April 4, 2014 2 Outline 1 Examples of Calderón Zygmund operators

More information

Statistical inference on Lévy processes

Statistical inference on Lévy processes Alberto Coca Cabrero University of Cambridge - CCA Supervisors: Dr. Richard Nickl and Professor L.C.G.Rogers Funded by Fundación Mutua Madrileña and EPSRC MASDOC/CCA student workshop 2013 26th March Outline

More information

arxiv: v5 [math.na] 16 Nov 2017

arxiv: v5 [math.na] 16 Nov 2017 RANDOM PERTURBATION OF LOW RANK MATRICES: IMPROVING CLASSICAL BOUNDS arxiv:3.657v5 [math.na] 6 Nov 07 SEAN O ROURKE, VAN VU, AND KE WANG Abstract. Matrix perturbation inequalities, such as Weyl s theorem

More information

SHARP BOUNDARY TRACE INEQUALITIES. 1. Introduction

SHARP BOUNDARY TRACE INEQUALITIES. 1. Introduction SHARP BOUNDARY TRACE INEQUALITIES GILES AUCHMUTY Abstract. This paper describes sharp inequalities for the trace of Sobolev functions on the boundary of a bounded region R N. The inequalities bound (semi-)norms

More information

ON MATRIX VALUED SQUARE INTEGRABLE POSITIVE DEFINITE FUNCTIONS

ON MATRIX VALUED SQUARE INTEGRABLE POSITIVE DEFINITE FUNCTIONS 1 2 3 ON MATRIX VALUED SQUARE INTERABLE POSITIVE DEFINITE FUNCTIONS HONYU HE Abstract. In this paper, we study matrix valued positive definite functions on a unimodular group. We generalize two important

More information

The following definition is fundamental.

The following definition is fundamental. 1. Some Basics from Linear Algebra With these notes, I will try and clarify certain topics that I only quickly mention in class. First and foremost, I will assume that you are familiar with many basic

More information