PhD Course: Introduction to Inverse Problems. Salvatore Frandina. Siena, August 19, 2012


1 PhD Course: Introduction to Inverse Problem Theory
Salvatore Frandina
Department of Information Engineering, Siena, Italy
Siena, August 19, 2012
1 / 68

2 An overview of the theory
2 / 68

3 Direct and Inverse Problems
- Direct problem: given a known model x in a model space X and an operator K, evaluate K(x)
- Inverse problem: given y and an operator K, solve the equation K(x) = y to determine the model x
- Note:
* The formulation of an inverse problem requires the definition of the operator K together with its domain and range
* The formulation as an operator equation allows one to distinguish among finite-, semi-finite-, and infinite-dimensional problems, and between linear and nonlinear problems
* The evaluation of K(x) requires solving a boundary value problem for a differential equation or evaluating an integral
3 / 68

4 Well-Posedness I
- A mathematical model of a physical problem is well-posed, in the sense of Hadamard, if it satisfies the following properties:
1. there exists a solution of the problem (existence)
2. there is at most one solution of the problem (uniqueness)
3. the solution depends continuously on the data (stability)
- Typically, the direct problem is well-posed while the inverse problem is ill-posed
- Note:
* The existence of a solution can be enforced by enlarging the solution space
* If a problem has more than one solution, then information about the model is missing
* If the solution of a problem does not depend continuously on the data, then in general the computed solution has nothing to do with the true solution
4 / 68

5 Well-Posedness II
Definition 1
Let X and Y be normed spaces and K : X → Y a (linear or nonlinear) mapping. The equation Kx = y is called well-posed if the following holds:
1. Existence: for every y ∈ Y there is (at least) one x ∈ X such that Kx = y
2. Uniqueness: for every y ∈ Y there is at most one x ∈ X with Kx = y
3. Stability: the solution x depends continuously on y; that is, for every sequence (x_n) ⊂ X with Kx_n → Kx (n → ∞), it follows that x_n → x (n → ∞)
- Equations for which at least one of these properties does not hold are called ill-posed
5 / 68

6 Well-Posedness III
- Existence and uniqueness depend only on the algebraic nature of the spaces and the operator, i.e. on whether the operator is onto (surjective) or one-to-one (injective)
- Stability depends also on the topologies of the spaces, i.e. on whether the inverse operator K⁻¹ : Y → X is continuous
- These requirements are not independent of each other: a theorem states that the inverse operator K⁻¹, when it exists, is automatically continuous if K is linear and continuous and X and Y are Banach spaces
6 / 68

7 Compact Operator
- Integral operators are compact operators in many natural topologies under very weak conditions on the kernels
Theorem 1
Let X, Y be normed spaces and K : X → Y a linear compact operator with kernel N(K) := {x ∈ X : Kx = 0}. Let the dimension of the factor space X/N(K) be infinite. Then there exists a sequence (x_n) ⊂ X such that Kx_n → 0 but (x_n) does not converge. We can even choose (x_n) such that ‖x_n‖ → ∞. In particular, if K is one-to-one, the inverse K⁻¹ : R(K) ⊂ Y → X is unbounded. Here, R(K) := {Kx ∈ Y : x ∈ X} denotes the range of K
- Linear equations of the form Kx = y with compact operators K are always ill-posed
7 / 68
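
As a minimal numerical illustration of this theorem (my own sketch, not from the slides): discretizing an integral operator with a smooth Gaussian kernel gives a matrix whose singular values decay extremely fast, so a naive solve amplifies even a tiny data perturbation.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100
t = np.linspace(0.0, 1.0, n)
# Discretization of (Kf)(s) = ∫ k(s,t) f(t) dt with a smooth Gaussian kernel;
# smooth kernels yield compact operators with rapidly decaying singular values.
K = np.exp(-((t[:, None] - t[None, :]) ** 2) / 0.02) * (t[1] - t[0])

sigma = np.linalg.svd(K, compute_uv=False)
print("largest / smallest singular value:", sigma[0], sigma[-1])

x_true = np.sin(2 * np.pi * t)
y = K @ x_true
y_delta = y + 1e-10 * rng.standard_normal(n)  # tiny perturbation of the data
x_naive = np.linalg.solve(K, y_delta)         # unregularized "inverse"
print("naive reconstruction error:", np.linalg.norm(x_naive - x_true))
```

The kernel width (0.02) and noise level (1e-10) are arbitrary choices for the demonstration; the reconstruction error comes out many orders of magnitude larger than the perturbation.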

8 Concept of Regularization I
- Many inverse problems can be formulated as operator equations of the form Kx = y, where K is a linear compact operator between Hilbert spaces X and Y over the field K = R or C
- A successful reconstruction strategy requires additional a priori information about the solution
- Regularization strategies are optimal if they have the same asymptotic order as the worst-case error
8 / 68

9 Concept of Regularization II
- The Tikhonov method and the Landweber iteration are two of the most important regularization strategies
- The regularization parameter α = α(δ) is chosen a priori, i.e. before starting to compute the regularized solution
- The optimal regularization parameter α depends on bounds for the exact solution and is not known in advance
9 / 68

10 Assumptions of the problem I
- The compact operator K is one-to-one
* this is not a restriction, since the domain X can be replaced by the orthogonal complement of the kernel of K, i.e. N(K)⊥ = {x ∈ X : ⟨x, y⟩ = 0 ∀ y ∈ N(K)}
- There exists a solution x ∈ X of the unperturbed equation Kx = y, i.e. y ∈ R(K) ⊂ Y
- The injectivity of K implies that this solution is unique
- The right-hand side y ∈ Y is never known exactly but only up to an error δ > 0. The assumption is that δ > 0 and y^δ ∈ Y are known, with ‖y − y^δ‖ ≤ δ
10 / 68

11 Assumptions of the problem II
- The aim is to solve the perturbed equation Kx^δ = y^δ
- In general, this problem is not solvable, because it is not known whether the measured data y^δ are in the range R(K) of K
- An approximation x^δ ∈ X to the exact solution x must have the same asymptotic error as the worst-case error
- The approximate solution x^δ should depend continuously on the data y^δ
11 / 68

12 Regularization I
- The purpose is to construct a suitable bounded approximation R : Y → X of the (unbounded) inverse operator K⁻¹ : R(K) → X
Definition 2
A regularization strategy is a family of linear and bounded operators R_α : Y → X, α > 0, such that for all x ∈ X
lim_{α→0} R_α K x = x
i.e. the operators R_α K converge pointwise to the identity
12 / 68

13 Regularization II
- The following theorem is a consequence of Definition 2 and of the compactness of K
Theorem 2
Let R_α be a regularization strategy for a compact operator K : X → Y, where dim X = ∞. Then:
* the operators R_α are not uniformly bounded, i.e. there exists a sequence (α_j) with ‖R_{α_j}‖ → ∞ for j → ∞
* the sequence (R_α K x) does not converge uniformly on bounded subsets of X, i.e. there is no convergence of R_α K to the identity I in the operator norm
- N.B. the definition is based on unperturbed data, i.e. the regularizer R_α y converges to x for the exact right-hand side y = Kx
13 / 68

14 Regularization III
- An approximation of the solution x of Kx = y is defined as x^{α,δ} = R_α y^δ, where y ∈ K(X) is the exact right-hand side and y^δ ∈ Y is the measured data with ‖y − y^δ‖ ≤ δ
- The error can be split into two parts using the triangle inequality:
‖x^{α,δ} − x‖ ≤ ‖R_α y^δ − R_α y‖ + ‖R_α y − x‖ ≤ ‖R_α‖ ‖y^δ − y‖ + ‖R_α K x − x‖
and finally
‖x^{α,δ} − x‖ ≤ δ ‖R_α‖ + ‖R_α K x − x‖   (1)
where δ ‖R_α‖ is the error in the data (with ‖R_α‖ the condition number) and ‖R_α K x − x‖ is the approximation error
14 / 68

15 Total Error Trend
Figure: Trend of the total error
- The first term is the error in the data multiplied by the condition number ‖R_α‖ of the regularized problem. By theorem 2, this term tends to infinity as α tends to zero
- The second term is the approximation error ‖(R_α − K⁻¹)y‖ at the exact right-hand side y = Kx. By the definition of a regularization strategy, this term tends to zero with α
- The aim is to choose α = α(δ), depending on δ, in order to minimize the total error δ ‖R_α‖ + ‖R_α K x − x‖
15 / 68
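
This trade-off is easy to reproduce numerically. A toy sketch (my own, reusing the discretized Gaussian-kernel operator from the earlier snippet and borrowing the Tikhonov-type regularizer R_α = (αI + K*K)⁻¹K* that appears later in theorem 8): the first term grows and the second shrinks as α → 0, and the actual error is smallest at an intermediate α.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100
t = np.linspace(0.0, 1.0, n)
K = np.exp(-((t[:, None] - t[None, :]) ** 2) / 0.02) * (t[1] - t[0])
x_true = np.sin(2 * np.pi * t)
y = K @ x_true
delta = 1e-4
noise = rng.standard_normal(n)
y_delta = y + delta * noise / np.linalg.norm(noise)   # ||y - y_delta|| = delta

for alpha in [1e-2, 1e-4, 1e-6, 1e-8, 1e-10]:
    R = np.linalg.solve(alpha * np.eye(n) + K.T @ K, K.T)  # R_alpha
    data_term = delta * np.linalg.norm(R, 2)               # grows as alpha -> 0
    approx_term = np.linalg.norm(R @ y - x_true)           # shrinks as alpha -> 0
    total = np.linalg.norm(R @ y_delta - x_true)
    print(f"alpha={alpha:.0e}: bound {data_term:.2e} + {approx_term:.2e}, "
          f"actual error {total:.2e}")
```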

16 Admissible Regularization Strategies
Definition 3
A regularization strategy α = α(δ) is called admissible if, for every x ∈ X,
α(δ) → 0 and sup{‖R_{α(δ)} y^δ − x‖ : ‖Kx − y^δ‖ ≤ δ} → 0 as δ → 0
- A method to construct classes of admissible regularization strategies is given by filtering singular systems
- Let K : X → Y be a linear compact operator, and let (µ_j, x_j, y_j) be a singular system for K. The solution x of Kx = y is given by Picard's theorem [1] as
x = Σ_{j=1}^∞ (1/µ_j) ⟨y, y_j⟩ x_j
provided the series converges, i.e. y ∈ R(K)
16 / 68

17 Regularizing Filter I
Theorem 3
Let K : X → Y be compact with singular system (µ_j, x_j, y_j) and let q : (0, ∞) × (0, ‖K‖] → R be a function with the following properties:
- (1) for every α > 0 and for all 0 < µ ≤ ‖K‖:
* |q(α, µ)| ≤ 1
* there exists c(α) such that |q(α, µ)| ≤ c(α) µ
- (2) for every 0 < µ ≤ ‖K‖: lim_{α→0} q(α, µ) = 1
Then the operator R_α : Y → X, α > 0, defined by
R_α y = Σ_{j=1}^∞ (q(α, µ_j)/µ_j) ⟨y, y_j⟩ x_j,  y ∈ Y,
is a regularization strategy with ‖R_α‖ ≤ c(α). A choice α = α(δ) is admissible if α(δ) → 0 and δ c(α(δ)) → 0 as δ → 0. The function q is called a regularizing filter for K
- The theorem implies that R_α y converges to the solution x
17 / 68

18 Regularizing Filter II
- An optimal strategy in the sense of the worst-case error can be achieved by strengthening the previous theorem 3
Theorem 4
Let assumption (1) of the previous theorem 3 hold. Then (2) can be replaced by the stronger assumptions:
- (2a) there exists c₁ > 0 such that for all α > 0 and 0 < µ ≤ ‖K‖: |q(α, µ) − 1| ≤ c₁ √α / µ
* If, furthermore, x ∈ K*(Y), then ‖R_α K x − x‖ ≤ c₁ √α ‖z‖, where x = K* z
- (2b) there exists c₂ > 0 such that for all α > 0 and 0 < µ ≤ ‖K‖: |q(α, µ) − 1| ≤ c₂ α / µ²
* If, furthermore, x ∈ K*K(X), then ‖R_α K x − x‖ ≤ c₂ α ‖z‖, where x = K*K z
18 / 68

19 Regularizing Filter III
Theorem 5
The following three functions q satisfy the assumptions (1), (2), and (2a-b) of the previous theorems 3 and 4:
- (a) q(α, µ) = µ²/(α + µ²). This satisfies (1) with c(α) = 1/(2√α), while assumptions (2a) and (2b) hold with c₁ = 1/2 and c₂ = 1, respectively
- (b) q(α, µ) = 1 − (1 − aµ²)^{1/α} for some 0 < a < 1/‖K‖². In this case (1) holds with c(α) = √(a/α), while (2a) and (2b) are satisfied with c₁ = 1/√(2a) and c₂ = 1/a, respectively
- (c) q(α, µ) = 1 if µ² ≥ α, and q(α, µ) = 0 if µ² < α. In this case (1) holds with c(α) = 1/√α, while (2a) and (2b) are satisfied with c₁ = c₂ = 1
All of the functions q defined in (a), (b), and (c) are regularizing filters that lead to optimal regularization strategies
19 / 68
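
These filters translate directly into code. A minimal sketch (my own), using numpy's SVD of a discretized operator as a stand-in for the singular system (µ_j, x_j, y_j):

```python
import numpy as np

def q_tikhonov(alpha, mu):
    """Filter (a)."""
    return mu**2 / (alpha + mu**2)

def q_landweber(alpha, mu, a=0.5):
    """Filter (b); requires 0 < a < 1/||K||^2, and alpha plays the role of 1/m."""
    return 1.0 - (1.0 - a * mu**2) ** (1.0 / alpha)

def q_cutoff(alpha, mu):
    """Filter (c), the spectral cutoff."""
    return np.where(mu**2 >= alpha, 1.0, 0.0)

def filtered_solution(K, y, q, alpha):
    """R_alpha y = sum_j q(alpha, mu_j)/mu_j * <y, y_j> x_j, where the SVD
    K = U diag(mu) V^T provides y_j = U[:, j] and x_j = V[:, j]."""
    U, mu, Vt = np.linalg.svd(K)
    return Vt.T @ (q(alpha, mu) / mu * (U.T @ y))
```

For instance, filtered_solution(K, y_delta, q_cutoff, alpha=1e-8) reconstructs the spectral cutoff solution of the next slides; the three filters differ only in how sharply they damp the components with small µ_j.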

20 Spectral Cutoff
- For the first two choices of q there exists a characterization that avoids knowledge of the singular system
- The choice (c) of q is called the spectral cutoff. The spectral cutoff solution x^{α,δ} ∈ X is therefore defined by
x^{α,δ} = Σ_{µ_j² ≥ α} (1/µ_j) ⟨y^δ, y_j⟩ x_j
- From the fundamental estimate (1) and the previous theorem, the following results for the cutoff solution hold
20 / 68

21 Spectral Cutoff I
Theorem 6
Let y^δ ∈ Y be such that ‖y^δ − y‖ ≤ δ, where y = Kx denotes the exact right-hand side
- (a) let K : X → Y be a compact and injective operator with singular system (µ_j, x_j, y_j). The operators
R_α y = Σ_{µ_j² ≥ α} (1/µ_j) ⟨y, y_j⟩ x_j,  y ∈ Y,
define a regularization strategy with ‖R_α‖ ≤ 1/√α. This strategy is admissible if α(δ) → 0 (δ → 0) and δ²/α(δ) → 0 (δ → 0)
- (b) let x = K* z ∈ K*(Y) with ‖z‖ ≤ E and c > 0. For the choice α(δ) = c δ/E, we have the estimate
‖x^{α(δ),δ} − x‖ ≤ (1/√c + √c) √(δE)
- (c) let x = K*K z ∈ K*K(X) with ‖z‖ ≤ E and c > 0. The choice α(δ) = c (δ/E)^{2/3} leads to the estimate
‖x^{α(δ),δ} − x‖ ≤ (1/√c + c) δ^{2/3} E^{1/3}
21 / 68

22 Spectral Cutoff II
- The spectral cutoff is optimal for the information ‖(K*)⁻¹ x‖ ≤ E or ‖(K*K)⁻¹ x‖ ≤ E, respectively (if K is one-to-one)
22 / 68

23 Introduction to Tikhonov Regularization I
- In general, for concrete integral operators it is recommended to avoid the computation of a singular system
- Given an overdetermined finite linear system of the form Kx = y, a common method to solve it is to determine the best fit that minimizes the defect ‖Kx − y‖ with respect to x ∈ X, for some norm in Y
23 / 68

24 Introduction to Tikhonov Regularization II
- If X is infinite-dimensional and K is compact, this minimization problem is also ill-posed, by the following lemma
Lemma 1
Let X and Y be Hilbert spaces, K : X → Y linear and bounded, and y ∈ Y. There exists x̂ ∈ X with ‖Kx̂ − y‖ ≤ ‖Kx − y‖ for all x ∈ X if and only if x̂ ∈ X solves the normal equation K*K x̂ = K*y. Here, K* : Y → X denotes the adjoint of K
24 / 68

25 Tikhonov functional I
- The aim is to:
* penalize the defect, in terms of optimization theory
* replace the equation of the first kind K*K x̂ = K*y with an equation of the second kind, in the language of integral equation theory
- Both viewpoints lead to the following minimization problem: given the linear, bounded operator K : X → Y and y ∈ Y, determine x^α ∈ X that minimizes the Tikhonov functional
J_α(x) = ‖Kx − y‖² + α‖x‖²,  x ∈ X
25 / 68

26 Tikhonov functional II
Theorem 7
Let K : X → Y be a linear and bounded operator between Hilbert spaces and α > 0. Then the Tikhonov functional J_α has a unique minimum x^α ∈ X. This minimum x^α is the unique solution of the normal equation
α x^α + K*K x^α = K*y
- The solution x^α can be written in the form x^α = R_α y with
R_α = (αI + K*K)⁻¹ K* : Y → X   (2)
26 / 68
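
In a discretization, equation (2) is one linear solve. A minimal sketch (mine, not from the slides):

```python
import numpy as np

def tikhonov_solve(K, y, alpha):
    """Minimizer of ||Kx - y||^2 + alpha*||x||^2: the unique solution of the
    normal equation alpha*x + K^T K x = K^T y of theorem 7."""
    n = K.shape[1]
    return np.linalg.solve(alpha * np.eye(n) + K.T @ K, K.T @ y)
```

On a discretized operator this agrees (up to rounding) with filtered_solution(K, y, q_tikhonov, alpha) from the earlier sketch, since filter (a) and the normal equation describe the same R_α.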

27 Tikhonov functional III
- Choosing a singular system (µ_j, x_j, y_j) for the compact operator K, R_α y has the representation
R_α y = Σ_{j=1}^∞ (µ_j/(α + µ_j²)) ⟨y, y_j⟩ x_j = Σ_{j=1}^∞ (q(α, µ_j)/µ_j) ⟨y, y_j⟩ x_j,  y ∈ Y
with q(α, µ) = µ²/(α + µ²)
- Note: this function q is exactly the filter function (a) of theorem 5
27 / 68

28 Tikhonov regularization method I
Theorem 8
Let K : X → Y be a linear, compact operator and α > 0.
- (a) the operator αI + K*K is boundedly invertible. The operators R_α : Y → X from (2) form a regularization strategy with ‖R_α‖ ≤ 1/(2√α). It is called the Tikhonov regularization method. x^{α,δ} = R_α y^δ is determined as the unique solution of the equation of the second kind
α x^{α,δ} + K*K x^{α,δ} = K* y^δ
Every choice α(δ) → 0 (δ → 0) with δ²/α(δ) → 0 (δ → 0) is admissible
- (b) let x = K*z ∈ K*(Y) with ‖z‖ ≤ E. Choose α(δ) = c δ/E for some c > 0. Then the following estimate holds:
‖x^{α(δ),δ} − x‖ ≤ (1/2)(1/√c + √c) √(δE)
- (c) let x = K*Kz ∈ K*K(X) with ‖z‖ ≤ E. The choice α(δ) = c (δ/E)^{2/3} for some c > 0 leads to the error estimate
‖x^{α(δ),δ} − x‖ ≤ (1/(2√c) + c) δ^{2/3} E^{1/3}
The Tikhonov regularization method is optimal for the information ‖(K*)⁻¹x‖ ≤ E or ‖(K*K)⁻¹x‖ ≤ E, respectively (provided K is one-to-one)
28 / 68

29 Tikhonov regularization method II
- The eigenvalues of K*K tend to zero, while the eigenvalues of αI + K*K are bounded away from zero by α > 0 [1]
- By the previous theorem 8, α(δ) has to be chosen in such a way that it converges to zero as δ tends to zero, but not as fast as δ²
- From parts (b) and (c): the smoother the solution x, the slower α has to tend to zero
- N.B. the convergence can be arbitrarily slow if no a priori assumption about the solution x is available [1]
29 / 68

30 Tikhonov regularization method III
Theorem 9
Let K : X → Y be linear, compact, and one-to-one, such that the range R(K) is infinite-dimensional. Furthermore, let x ∈ X, and assume that there exists a continuous function α : [0, ∞) → [0, ∞) with α(0) = 0 such that
lim_{δ→0} ‖x^{α(δ),δ} − x‖ / δ^{2/3} = 0
for every y^δ ∈ Y with ‖y^δ − Kx‖ ≤ δ, where x^{α(δ),δ} ∈ X solves α x^{α,δ} + K*K x^{α,δ} = K* y^δ. Then x = 0
30 / 68

31 Consideration
- The Tikhonov regularization method is not optimal under stronger smoothness assumptions on the solution x, i.e. under the assumption x ∈ (K*K)^r(X) for some r ∈ N, r ≥ 2
* this is in contrast to Landweber's method or the conjugate gradient method
- The choice of α in theorem 8 is made a priori, that is, before starting the computation of x^α by solving the least squares problem
- Note: the regularization is strongly related to the differential operators or norms used [1]
* one example is the interpretation of regularization by smoothing norms in terms of reproducing kernel Hilbert spaces
31 / 68

32 Landweber's Equation
- Landweber suggested rewriting the equation Kx = y in the form x = (I − aK*K)x + aK*y for some a > 0 and iterating this equation:
* x_0 = 0
* x_m = (I − aK*K)x_{m−1} + aK*y for m = 1, 2, ...
- This iterative scheme can be viewed as the steepest descent algorithm applied to the quadratic functional x ↦ ‖Kx − y‖²
32 / 68
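
A direct transcription of the iteration (a sketch of mine; the bound 0 < a < 1/‖K‖² anticipates theorem 10 below):

```python
import numpy as np

def landweber(K, y, m, a=None):
    """Run m Landweber steps x_k = (I - a K^T K) x_{k-1} + a K^T y from x_0 = 0.
    Each step is a steepest descent step for 0.5*||Kx - y||^2 with stepsize a."""
    if a is None:
        a = 0.9 / np.linalg.norm(K, 2) ** 2  # ensures 0 < a < 1/||K||^2
    x = np.zeros(K.shape[1])
    for _ in range(m):
        x -= a * (K.T @ (K @ x - y))
    return x
```

The number of iterations m acts as the regularization parameter, with α = 1/m: stopping early regularizes, while iterating too long lets the data noise in.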

33 Landweber's Iteration I
Lemma 2
Consider the sequence x_m = (I − aK*K)x_{m−1} + aK*y and define the functional ψ : X → R by ψ(x) = (1/2)‖Kx − y‖², x ∈ X. Then ψ is Fréchet differentiable at every z ∈ X and
ψ′(z)x = Re⟨Kz − y, Kx⟩ = Re⟨K*(Kz − y), x⟩,  x ∈ X
Therefore, x_m = x_{m−1} − aK*(Kx_{m−1} − y) is the steepest descent step with stepsize a
33 / 68

34 Landweber's Iteration II
- x_m can now be expressed in the explicit form x_m = R_m y, where the operator R_m : Y → X is defined by
R_m = a Σ_{k=0}^{m−1} (I − aK*K)^k K*  for m = 1, 2, ...
- Choosing a singular system (µ_j, x_j, y_j) for the compact operator K, R_m y has the representation
R_m y = Σ_{j=1}^∞ (q(m, µ_j)/µ_j) ⟨y, y_j⟩ x_j,  y ∈ Y
with q(m, µ) = 1 − (1 − aµ²)^m
- This filter function q is the same as in theorem 5, with α = 1/m
34 / 68

35 Landweber's Iteration III
Theorem 10
- (a) let K : X → Y be a compact operator and let 0 < a < 1/‖K‖². The linear and bounded operators R_m : Y → X define a regularization strategy with discrete regularization parameter α = 1/m, m ∈ N, and ‖R_m‖ ≤ √(am). The sequence x^{m,δ} = R_m y^δ is computed by the iteration
* x^{0,δ} = 0
* x^{m,δ} = (I − aK*K)x^{m−1,δ} + aK*y^δ for m = 1, 2, ...
Every strategy m(δ) → ∞ (δ → 0) with δ² m(δ) → 0 (δ → 0) is admissible
- (b) let x = K*z ∈ K*(Y) with ‖z‖ ≤ E and 0 < c₁ < c₂. For every choice m(δ) with c₁ E/δ ≤ m(δ) ≤ c₂ E/δ, the following estimate holds:
‖x^{m(δ),δ} − x‖ ≤ c₃ √(δE)
for some c₃ depending on c₁, c₂, and a
- (c) let x = K*Kz ∈ K*K(X) with ‖z‖ ≤ E and 0 < c₁ < c₂. For every choice m(δ) with c₁(E/δ)^{2/3} ≤ m(δ) ≤ c₂(E/δ)^{2/3}, we have
‖x^{m(δ),δ} − x‖ ≤ c₃ E^{1/3} δ^{2/3}
35 / 68

36 Consideration
- From the previous slide, Landweber's iteration is optimal for the information ‖(K*)⁻¹x‖ ≤ E or ‖(K*K)⁻¹x‖ ≤ E, respectively
- The choice x_0 = 0 simplifies the analysis. In general, the explicit iterate x_m is given by
x_m = a Σ_{k=0}^{m−1} (I − aK*K)^k K* y + (I − aK*K)^m x_0,  m = 1, 2, ...
- R_m is then affine linear, i.e. of the form R_m y = z_m + S_m y, y ∈ Y, for some z_m ∈ X and some linear operator S_m : Y → X
- Note:
* high precision (ignoring the presence of errors) requires a large number m of iterations
* stability forces us to keep m small enough
* from the theorem, the following general rule also holds: x ∈ (K*K)^r(X), r ∈ N ⟹ ‖x^{m(δ),δ} − x‖ ≤ c E^{1/(2r+1)} δ^{2r/(2r+1)}
36 / 68

37 Regularization I
- Most of the inverse problems in science and engineering are ill-posed:
* computational vision
* system identification
* nonlinear dynamic reconstruction
* density estimation
- Given the available input data, the solution to the problem is nonunique (one-to-many) or unstable
- The classical regularization techniques, due to Tikhonov, are able to make the solution well-posed:
* model selection
* complexity control
37 / 68

38 Regularization II
- Regularization theory was introduced to the machine learning community in the 1990s
- A regularization algorithm for learning is equivalent to a multilayer network with a kernel in the form of a radial basis function (RBF), an RBF network [2]
- The solution of the classical Tikhonov regularization problem can be derived from the regularized functional defined by a:
* linear differential operator in the spatial domain
* linear integral operator in the Fourier domain
- Note: the regularized solution was originally derived via a linear differential operator and its Green's function
38 / 68

39 Regularization III
- One way to solve ill-posed problems is to make them well-posed by incorporating prior knowledge into the solutions
- The forms of prior knowledge vary and are problem dependent
* the most popular and important prior knowledge is the smoothness prior
- A possible way to enforce smoothness is to require that the functional belong to a reproducing kernel Hilbert space (RKHS)
39 / 68

40 Machine Learning Framework
- Given a set of observation data (learning examples) {(x_i, y_i)} ⊂ R^N × R = X × Y, the learning machine f has to find the solution to the inverse problem
- f has to approximate a real function in the hypothesis space satisfying the constraints f(x_i) = y(x_i) ≈ y_i
* where y(x) is supposed to be a deterministic function in the target space
- The learning problem can be viewed as a multivariate functional approximation problem
- Note: this problem is ill-posed, since the approximants satisfying the constraints are not unique
40 / 68

41 Loss Function
- Statistically, the approximation accuracy is measured by the expectation of the approximation error. The expected risk functional is
R = ∫_{X×Y} L(x, y) p(x, y) dx dy
where L(x, y) represents the loss functional
- A common loss function is the mean squared error, defined by the L² norm; the expected risk functional becomes
R = ∫_{X×Y} [y − f(x)]² p(x, y) dx dy
- In practice:
* the joint probability p(x, y) is unknown
* an estimate of R is based on a finite number l of observations, and so the expected risk functional becomes the empirical risk functional
R_emp = Σ_{i=1}^l [y_i − f(x_i)]²
* which introduces an estimate ŷ(x) (ŷ = y − ε = f(x))
41 / 68

42 Tikhonov Regularization
- The expected risk can be decomposed into two parts:
R[f] = R_emp[f] + R_reg[f] = (1/l) Σ_{i=1}^l [y_i − f(x_i)]² + λ‖Df‖²
where R_emp is the empirical risk functional and R_reg the regularizer risk functional
* ‖·‖ is the norm operator
* λ is a regularization parameter that controls the trade-off between a good fit of the data and the smoothness of the solution
* D is a linear differential operator, defined through the Fréchet differential of the Tikhonov functional
- Note: the smoothness prior implicit in D makes the solution stable and insensitive to noise
42 / 68

43 The Fréchet differential and the Riesz Theorem
Definition 4
A function f : X → Y is Fréchet differentiable at a point x if, for every h ∈ X,
lim_{ε→0} [f(x + εh) − f(x)]/ε = df(x, h)
exists and defines a linear bounded transformation (in h) mapping X into Y. df(x, h) = F′(x)h is the Fréchet differential; F′(x) is the Fréchet derivative.
Theorem 11
Let H be a Hilbert space, and let H* be its dual space, consisting of all continuous linear functionals from H into the field R or C. If x is an element of H, then the function ϕ_x defined by
ϕ_x(y) = ⟨y, x⟩,  y ∈ H
where ⟨·,·⟩ denotes the inner product of the Hilbert space, is an element of H*. The theorem states that every element of H* can be written uniquely in this form
43 / 68

44 The Reproducing Kernel Hilbert Space I
Definition 5
Let X be an arbitrary set and H a Hilbert space of complex-valued functions on X. H is a reproducing kernel Hilbert space if every linear functional of the form L_x : f ↦ f(x) (L_x : H → C) is continuous for any x ∈ X. By the Riesz theorem, this implies that for every x ∈ X there exists a unique element K_x of H with the property that:
f(x) = ⟨f, K_x⟩  ∀ f ∈ H
The function K_x is called the point-evaluation functional at the point x. The function K : X × X → C defined by K(x, y) = K_x(y) is called the reproducing kernel for the Hilbert space H, and it is determined entirely by H
44 / 68

45 The Reproducing Kernel Hilbert Space II
- Note:
* H is a space of functions, so the element K_x is itself a function
* the Riesz theorem guarantees that, for every x in X, the element K_x is unique
- The reproducing property:
* K(x, y) = K_x(y) = ⟨K_y, K_x⟩
* K(x, x) = ⟨K_x, K_x⟩ ≥ 0, ∀ x ∈ X
* K_x = 0 if and only if f(x) = 0 ∀ f ∈ H
45 / 68

46 Fréchet Differential in Spatial Domain
- The Fréchet differential of R is
dR(f, h) = [d/dβ R(f + βh)]_{β=0}
where h(x) is a fixed function of x
- dR_emp(f, h) = [d/dβ R_emp(f + βh)]_{β=0} = −Σ_{i=1}^l [y_i − f(x_i)] h(x_i) = −⟨h, Σ_{i=1}^l [y_i − f(x_i)] δ(x − x_i)⟩
- dR_reg(f, h) = [d/dβ R_reg(f + βh)]_{β=0} = ∫ D[f + βh] Dh dx |_{β=0} = ∫ Df Dh dx = ⟨Df, Dh⟩
* δ(·) is the Dirac delta function
* ⟨·,·⟩ is the inner product in the Hilbert space H
46 / 68

47 Integral Equation I
- Given a bounded linear operator A : X → Y, its adjoint operator A* : Y → X is defined by ⟨Ax, y⟩ = ⟨x, A*y⟩ (x ∈ X, y ∈ Y), and ‖A*‖ = ‖A‖
- An operator is self-adjoint if it is equal to its adjoint operator: ⟨Ax, x′⟩ = ⟨x, Ax′⟩ (x, x′ ∈ X)
Definition 6
Given a positive-definite kernel function K:
* an integral operator T is defined by (Tf)(s) = ∫ K(s, x) f(x) dx
* its adjoint T* is defined by (T*f)(x) = ∫ K(s, x) f(s) ds
* T is self-adjoint if and only if K(s, x) = K(x, s) for all s and x
- The integral equation is a Fredholm integral equation of the first kind
47 / 68

48 Integral Equation II
- Note:
* An integral operator with a symmetric kernel K(s, x) is self-adjoint
* K̃ = T*T: (T*Tf)(s) = g(s) = ∫ K̃(s, x) f(x) dx with K̃(s, x) = ∫ K(s, t) K(x, t) dt; K̃ is a self-adjoint integral operator
* L = D*D is a differential operator
48 / 68

49 Green's Function I
- Given a positive integral operator T, it is possible to find a (pseudo-)differential operator D as its inverse
- The operator D corresponds to the inner product of the RKHS with a reproducing kernel K associated to T
* the kernel K is called the Green's function of the differential operator D
Definition 7
Given a linear differential operator L, the function G(x, ξ) is the Green's function for L if it has the following properties:
* for fixed ξ, G(x, ξ) is a function of x and satisfies the given boundary conditions
* except at the point x = ξ, the derivatives of G(x, ξ) with respect to x are all continuous; the number of derivatives is determined by the order of the operator L
* with G(x, ξ) considered as a function of x, it satisfies the partial differential equation LG(x, ξ) = δ(x − ξ)
49 / 68

50 Green's Function II
- The solution of the differential equation LF(x) = ϕ(x) is
F(x) = ∫_{R^N} G(x, ξ) ϕ(ξ) dξ
where ϕ(x) is a continuous function of x ∈ R^N
50 / 68

51 Spectral Operator in Fourier Domain
- A spectral operator is an integral operator T; in the case of the Fourier operator, K(s, x) = exp(−j⟨s, x⟩)
Definition 8
For any functional f(x) ∈ H, the Fourier operator T is defined by
Tf = F(s) = ∫_{−∞}^{+∞} f(x) exp(−jxs) dx,  F(s) ∈ H
Theorem 12
Given two functionals f, g ∈ H and their corresponding Fourier transforms F and G, the Parseval theorem says that ⟨f(x), g(x)⟩ = (1/2π)⟨F(s), G(s)⟩. In operator form it is expressed by ⟨f, g⟩ = ⟨Tf, Tg⟩
51 / 68
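
A quick numerical check of the discrete analogue (my own sketch; for numpy's unnormalized DFT the continuous factor 1/(2π) becomes 1/N):

```python
import numpy as np

rng = np.random.default_rng(0)
N = 256
f = rng.standard_normal(N)
g = rng.standard_normal(N)

# Discrete Parseval relation: <f, g> = (1/N) <Ff, Fg> for the unnormalized DFT.
lhs = np.vdot(f, g)
rhs = np.vdot(np.fft.fft(f), np.fft.fft(g)) / N
print(np.allclose(lhs, rhs))  # True
```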

52 Spectral Operator in Fourier Domain
- The differential operator D = Σ_{n=0}^{+∞} ((−1)^n/n!) dⁿ/dxⁿ and its spectral operator
T_D = Σ_{n=0}^{+∞} ((−1)^n/n!) (js)ⁿ = exp(−js)
* T_D f = T(Df) = D(Tf)
- The kernel function associated with the operator K̃ is given by
K̃(x, x_i) = ∫ exp(jsx) exp(−jsx_i) ds = δ(x − x_i)
- Then K̃f = ∫ K̃(x, x_i) f(s) ds = f(x − x_i)
52 / 68

53 Regularization I
- The regularization strategy for the equation Aϕ = f (A : X → Y) is to find an approximate solution ϕ_ε related to ϕ, such that
* ‖ϕ_ε − ϕ‖ ≤ ε, where ε is a small, positive value
- The two equivalent conditions are to find:
* R_λ : Y → X (λ > 0) such that lim_{λ→0} R_λ A ϕ = ϕ for all ϕ ∈ X (pointwise convergence)
* R_λ f → A⁻¹ f as λ → 0
- Note: the operator R_λ cannot be uniformly bounded with respect to λ, and the operators R_λ A cannot be norm convergent as λ → 0
53 / 68

54 Regularization II
- The approximation error is
‖ϕ_λ^ε − ϕ‖ ≤ ε‖R_λ‖ + ‖R_λ Aϕ − ϕ‖
where the first term is the influence of incorrect data and the second the approximation error between R_λ and A⁻¹
- The regularization parameter λ controls the trade-off between:
* stability (the first term increases as λ → 0)
* accuracy (the second term decreases as λ → 0)
- Note: the aim of the regularization is to find an appropriate λ that achieves the minimum error of the regularized risk functional
54 / 68

55 Regularization III
Definition 9
The regularization strategy R_λ, i.e. the choice of regularization parameter λ = λ(ε), is called regular if for all f ∈ A(X) and all f_ε ∈ Y with ‖f_ε − f‖ ≤ ε, there holds
R_{λ(ε)} f_ε → A⁻¹ f,  ε → 0
Theorem 13
For a bounded linear operator A: A(X)⊥ = N(A*) and N(A*)⊥ = the closure of A(X)
* A(X)⊥ is the orthogonal complement of A(X), i.e. A(X)⊥ = {g ∈ Y : ⟨Aϕ, g⟩ = 0 ∀ ϕ ∈ X}
* N(A*) is the kernel of A*, i.e. N(A*) = {f ∈ Y : A*f = 0}
55 / 68

56 Spectral Theorem I
Theorem 14
Let X be a Hilbert space, and let A : X → X be a self-adjoint compact operator. Then:
* all eigenvalues of A are real
* all eigenspaces N(ξI − A) for nonzero eigenvalues ξ have finite dimension
* all eigenspaces associated with different eigenvalues are orthogonal
Suppose the eigenvalues are ordered such that |ξ₁| ≥ |ξ₂| ≥ ..., and denote by P_n : X → N(ξ_n I − A) the orthogonal projection onto the eigenspace for the eigenvalue ξ_n; then
A = Σ_{n=1}^∞ ξ_n P_n
Let Q : X → N(A) be the orthogonal projection onto the kernel N(A); then
ϕ = Σ_{n=1}^∞ P_n ϕ + Qϕ,  ∀ ϕ ∈ X
56 / 68

57 Spectral Theorem II
- In the case of an orthonormal basis with ⟨ϕ_n, ϕ_k⟩ = δ_{n,k}, the expansion representation is
Aϕ = Σ_{n=1}^∞ ξ_n ⟨ϕ, ϕ_n⟩ ϕ_n
ϕ = Σ_{n=1}^∞ ⟨ϕ, ϕ_n⟩ ϕ_n + Qϕ
57 / 68

58 Singular Value Decomposition
Theorem 15
Let X and Y be Hilbert spaces, A : X → Y a linear compact operator, and A* : Y → X its adjoint. Let σ_n denote the singular values of A, which are the square roots of the eigenvalues of the self-adjoint compact operator A*A : X → X. Let {σ_n} be the ordered sequence of nonzero singular values of A, repeated according to the dimension of the kernel N(σ_n²I − A*A). Then there exist orthonormal sequences {ϕ_n} in X and {g_n} in Y such that
Aϕ_n = σ_n g_n,  A*g_n = σ_n ϕ_n,  ∀ n
- For each ϕ ∈ X, the singular value decomposition (SVD) is
ϕ = Σ_{n=1}^∞ ⟨ϕ, ϕ_n⟩ ϕ_n + Qϕ
* with the orthogonal projection Q : X → N(A), and Aϕ = Σ_{n=1}^∞ σ_n ⟨ϕ, ϕ_n⟩ g_n
58 / 68

59 Picard's Theorem
Theorem 16
Let A : X → Y be a linear compact operator with singular system (σ_n, ϕ_n, g_n). The Fredholm integral equation of the first kind Aϕ = f is solvable if and only if f ∈ N(A*)⊥ and Σ_{n=1}^∞ (1/σ_n²) |⟨f, g_n⟩|² < ∞. Then a solution is given by
ϕ = Σ_{n=1}^∞ (1/σ_n) ⟨f, g_n⟩ ϕ_n
- The Picard theorem describes the ill-posed nature of the integral equation Aϕ = f:
* the perturbation ratio ‖ϕ_ε‖/‖f_ε‖ = ε/σ_n determines the degree of ill-posedness
* the more quickly σ_n decays, the more severe the ill-posedness
59 / 68
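
The degree of ill-posedness can be visualized with a discrete Picard plot (a sketch of mine, reusing the discretized Gaussian-kernel operator from the earlier snippets): for exact data the coefficients |⟨f, g_n⟩| decay along with σ_n, so their ratio stays bounded, while noise makes the ratio blow up for larger n.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100
t = np.linspace(0.0, 1.0, n)
A = np.exp(-((t[:, None] - t[None, :]) ** 2) / 0.02) * (t[1] - t[0])
f_exact = A @ np.sin(2 * np.pi * t)                # f in the range of A
f_noisy = f_exact + 1e-6 * rng.standard_normal(n)

U, s, Vt = np.linalg.svd(A)                        # g_n = U[:, i], sigma_n = s[i]
for i in (0, 5, 10, 15, 20, 25):
    c_exact = abs(U[:, i] @ f_exact)               # |<f, g_n>| for exact data
    c_noisy = abs(U[:, i] @ f_noisy)               # stagnates at the noise level
    print(f"n={i:2d}  sigma={s[i]:.1e}  "
          f"exact/sigma={c_exact / s[i]:.1e}  noisy/sigma={c_noisy / s[i]:.1e}")
```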

60 Green's Identity
- From the Parseval identity (theorem 12), dR_reg(f, h) becomes
dR_reg(f, h) = ∫ Df Dh dx = ⟨Dh, Df⟩ = ∫ T_D f T_D h ds = ⟨T_D h, T_D f⟩
Definition 10
For any pair of functions u(x) and v(x), given a linear differential operator D and its associated Fourier operator T_D, the adjoint operators D* and T*_D are uniquely determined to satisfy the boundary conditions
∫_{R^N} u(x) Dv(x) dx = ∫_{R^N} v(x) D*u(x) dx
∫_Ω u(s) T_D v(s) ds = ∫_Ω v(s) T*_D u(s) ds
where Ω represents the spectrum support in the frequency domain
60 / 68

61 Euler-Lagrange equation of the Tikhonov functional I
- The first equation is Green's identity, and it can be used to write
dR_reg(f, h) = ⟨Dh, Df⟩ = ⟨h, D*Df(x)⟩
with u(x) = Df(x) and Dv(x) = Dh(x)
- From the second equation it follows that
dR_reg(f, h) = ⟨T_D h, T_D f⟩ = ⟨h, T*_D T_D f(s)⟩
with u(s) = T_D f(s) and T_D v(s) = T_D h(s)
- The condition dR(f, h) = dR_emp(f, h) + λ dR_reg(f, h) = 0 becomes
dR(f, h) = ⟨h(x), D*Df(x) − (1/λ) Σ_{i=1}^l (y_i − f(x_i)) δ(x − x_i)⟩
dR(f, h) = ⟨h(s), T*_D T_D f(s) − F{(1/λ) Σ_{i=1}^l (y_i − f(x_i)) δ(x − x_i)}⟩
where F{·} denotes the Fourier transform
61 / 68

62 Euler-Lagrange equation of the Tikhonov functional II
- The two equivalent necessary conditions for f(x) to be an extremum of R(f) are:
* dR(f, h) = 0 for all h ∈ H
* in the distribution sense,
D*Df_λ(x) = (1/λ) Σ_{i=1}^l (y_i − f(x_i)) δ(x − x_i)   (Euler-Lagrange equation of the Tikhonov functional R(f))
and
T*_D T_D f_λ(s) = F{(1/λ) Σ_{i=1}^l (y_i − f(x_i)) δ(x − x_i)} = (1/λ) Σ_{i=1}^l (y_i − f(x_i)) exp(−j x_i s)   (Fourier transform of the Euler-Lagrange equation of the Tikhonov functional R(f))
62 / 68

63 Differential and Integral Operators
- Consider L = D*D and K̃ = T*_D T_D:
* G(x, ξ) is the Green's function of the linear differential operator L: LG(x, ξ) = δ(x − ξ)
* in the frequency domain: K̃G(s, ξ) = exp(−jsξ)
- The operators L and K̃ are defined as:
* L = Σ_{n=0}^∞ ((−1)^n/(n! 2^n)) ∇^{2n}, where ∇² = Σ_{i=1}^l ∂²/∂x_i² is the Laplace operator
* K̃ = Σ_{n=0}^∞ ((−1)^{2n} s^{2n}/(n! 2^n)) = exp(s²/2)
- Since LG(x) = δ(x) and K̃G(s) = 1, it follows that
G(s) = exp(−s²/2),  G(x) = exp(−x²/2)
63 / 68

64 Solution of Regularized Problem I
- The differential equation Lf(x) = ϕ(x) and the integral equation K̃f(x) = φ(s) have the same solution
f(x) = ∫_{R^N} G(x, ξ) ϕ(ξ) dξ
- Note:
* Lf(x) = L ∫_{R^N} G(x, ξ) ϕ(ξ) dξ = ∫_{R^N} δ(x − ξ) ϕ(ξ) dξ = ϕ(x)
* ϕ(ξ) = (1/λ) Σ_{i=1}^l [y_i − f(x_i)] δ(ξ − x_i)
* f_λ(x) = (1/λ) Σ_{i=1}^l [y_i − f(x_i)] ∫_{R^N} G(x, ξ) δ(ξ − x_i) dξ = Σ_{i=1}^l w_i G(x, x_i)
* equivalently in the frequency domain
- The purpose of regularization is to find a subspace, the eigenspace of Lf or K̃f, within which the operator becomes well-posed. The solution in this subspace is unique
64 / 68

65 Solution of Regularized Problem II
- The solution of the regularized problem is f_λ(x) = Σ_{i=1}^l w_i G(x, x_i), with w_i = [y_i − f(x_i)]/λ, where G(x, x_i) is a positive-definite Green's function for all i
- In general, the solution of the Tikhonov regularization problem is f_λ(x) = Σ_{i=1}^l w_i G(x, x_i) + β(x), where:
* β(x) is a term that lies in the kernel of D and satisfies the orthogonality condition Σ_{i=1}^l w_i β(x_i) = 0
* the simplest case is β(x) = const
* the functional space of the solution f_λ is an RKHS given by the direct sum of two orthogonal RKHSs
65 / 68
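
Evaluating w_i = [y_i − f_λ(x_i)]/λ at the sample points gives the linear system (G + λI)w = y, so the whole network is fit with one solve. A minimal sketch (my own), assuming a Gaussian Green's function and 1-D inputs:

```python
import numpy as np

def fit_rbf(x, y, lam, width=1.0):
    """Solve (G + lam*I) w = y for the weights of f(x) = sum_i w_i G(x, x_i),
    which restates w_i = [y_i - f(x_i)]/lam at the sample points."""
    G = np.exp(-((x[:, None] - x[None, :]) ** 2) / (2.0 * width**2))
    return np.linalg.solve(G + lam * np.eye(len(x)), y)

def eval_rbf(x_new, x, w, width=1.0):
    """Evaluate f(x) = sum_i w_i G(x, x_i) at new points."""
    return np.exp(-((x_new[:, None] - x[None, :]) ** 2) / (2.0 * width**2)) @ w

rng = np.random.default_rng(0)
x = np.linspace(-3.0, 3.0, 40)
y = np.sin(x) + 0.1 * rng.standard_normal(x.size)   # noisy samples of sin
w = fit_rbf(x, y, lam=0.1)
print("max error on the grid:", np.max(np.abs(eval_rbf(x, x, w) - np.sin(x))))
```

Here λ plays exactly the role of the regularization parameter of the earlier slides: λ → 0 interpolates the noisy data, while a large λ oversmooths.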

66 Parzen's Window
- Replace the summation by an integral and G(x, x_i) by K(x, x_i), where K is a reproducing kernel invariant to translation, K(x, x_i) = K(x − x_i). The approximated function can then be expressed by the convolution
f(x) = (1/λ) ∫ (y_i − f(x_i)) K(x − x_i) dx_i   (observation × kernel)
     = (1/λ) [ ∫ y_i K(x − x_i) dx_i − ∫ f(x_i) K(x − x_i) dx_i ]   (kernel regression − integral operator)
- f(x) can be reconstructed from the data sample smoothed by an averaged kernel (Parzen's window)
66 / 68

67 Regularized Solution in Matrix Form
- The regularized solution in matrix form is obtained by taking dR(f)/df and setting it to zero, with
R(f) = (1/2)‖y − f‖² + (1/2)λ(Df)ᵀ(Df) = (1/2)(y − f)ᵀ(y − f) + (1/2)λ fᵀKf,  K = DᵀD
The solution is
f = (I + λK)⁻¹ y
where (I + λK)⁻¹ is the smoothing matrix
* K is the quadratic symmetric penalty matrix associated with L
* similarly in the frequency domain, where the regularizer is K̃
67 / 68

68 References
[1] Andreas Kirsch. An Introduction to the Mathematical Theory of Inverse Problems, Second Edition. Springer.
[2] Z. Chen and S. Haykin. On Different Facets of Regularization Theory. Neural Computation, 14.
68 / 68


More information

Partial Differential Equations

Partial Differential Equations Part II Partial Differential Equations Year 2015 2014 2013 2012 2011 2010 2009 2008 2007 2006 2005 2015 Paper 4, Section II 29E Partial Differential Equations 72 (a) Show that the Cauchy problem for u(x,

More information

Polynomial Approximation: The Fourier System

Polynomial Approximation: The Fourier System Polynomial Approximation: The Fourier System Charles B. I. Chilaka CASA Seminar 17th October, 2007 Outline 1 Introduction and problem formulation 2 The continuous Fourier expansion 3 The discrete Fourier

More information

MAT 578 FUNCTIONAL ANALYSIS EXERCISES

MAT 578 FUNCTIONAL ANALYSIS EXERCISES MAT 578 FUNCTIONAL ANALYSIS EXERCISES JOHN QUIGG Exercise 1. Prove that if A is bounded in a topological vector space, then for every neighborhood V of 0 there exists c > 0 such that tv A for all t > c.

More information

f(s) e -i n π s/l d s

f(s) e -i n π s/l d s Pointwise convergence of complex Fourier series Let f(x) be a periodic function with period l defined on the interval [,l]. The complex Fourier coefficients of f( x) are This leads to a Fourier series

More information

Real Analysis Problems

Real Analysis Problems Real Analysis Problems Cristian E. Gutiérrez September 14, 29 1 1 CONTINUITY 1 Continuity Problem 1.1 Let r n be the sequence of rational numbers and Prove that f(x) = 1. f is continuous on the irrationals.

More information

Math 46, Applied Math (Spring 2009): Final

Math 46, Applied Math (Spring 2009): Final Math 46, Applied Math (Spring 2009): Final 3 hours, 80 points total, 9 questions worth varying numbers of points 1. [8 points] Find an approximate solution to the following initial-value problem which

More information

The Learning Problem and Regularization

The Learning Problem and Regularization 9.520 Class 02 February 2011 Computational Learning Statistical Learning Theory Learning is viewed as a generalization/inference problem from usually small sets of high dimensional, noisy data. Learning

More information

Introduction to Optimization Techniques. Nonlinear Optimization in Function Spaces

Introduction to Optimization Techniques. Nonlinear Optimization in Function Spaces Introduction to Optimization Techniques Nonlinear Optimization in Function Spaces X : T : Gateaux and Fréchet Differentials Gateaux and Fréchet Differentials a vector space, Y : a normed space transformation

More information

Review of Some Concepts from Linear Algebra: Part 2

Review of Some Concepts from Linear Algebra: Part 2 Review of Some Concepts from Linear Algebra: Part 2 Department of Mathematics Boise State University January 16, 2019 Math 566 Linear Algebra Review: Part 2 January 16, 2019 1 / 22 Vector spaces A set

More information

The Learning Problem and Regularization Class 03, 11 February 2004 Tomaso Poggio and Sayan Mukherjee

The Learning Problem and Regularization Class 03, 11 February 2004 Tomaso Poggio and Sayan Mukherjee The Learning Problem and Regularization 9.520 Class 03, 11 February 2004 Tomaso Poggio and Sayan Mukherjee About this class Goal To introduce a particularly useful family of hypothesis spaces called Reproducing

More information

Errata Applied Analysis

Errata Applied Analysis Errata Applied Analysis p. 9: line 2 from the bottom: 2 instead of 2. p. 10: Last sentence should read: The lim sup of a sequence whose terms are bounded from above is finite or, and the lim inf of a sequence

More information

Spectral Properties of Elliptic Operators In previous work we have replaced the strong version of an elliptic boundary value problem

Spectral Properties of Elliptic Operators In previous work we have replaced the strong version of an elliptic boundary value problem Spectral Properties of Elliptic Operators In previous work we have replaced the strong version of an elliptic boundary value problem L u x f x BC u x g x with the weak problem find u V such that B u,v

More information

FUNCTIONAL ANALYSIS LECTURE NOTES: ADJOINTS IN HILBERT SPACES

FUNCTIONAL ANALYSIS LECTURE NOTES: ADJOINTS IN HILBERT SPACES FUNCTIONAL ANALYSIS LECTURE NOTES: ADJOINTS IN HILBERT SPACES CHRISTOPHER HEIL 1. Adjoints in Hilbert Spaces Recall that the dot product on R n is given by x y = x T y, while the dot product on C n is

More information

MIT 9.520/6.860, Fall 2017 Statistical Learning Theory and Applications. Class 19: Data Representation by Design

MIT 9.520/6.860, Fall 2017 Statistical Learning Theory and Applications. Class 19: Data Representation by Design MIT 9.520/6.860, Fall 2017 Statistical Learning Theory and Applications Class 19: Data Representation by Design What is data representation? Let X be a data-space X M (M) F (M) X A data representation

More information

Existence and uniqueness: Picard s theorem

Existence and uniqueness: Picard s theorem Existence and uniqueness: Picard s theorem First-order equations Consider the equation y = f(x, y) (not necessarily linear). The equation dictates a value of y at each point (x, y), so one would expect

More information

Starting from Heat Equation

Starting from Heat Equation Department of Applied Mathematics National Chiao Tung University Hsin-Chu 30010, TAIWAN 20th August 2009 Analytical Theory of Heat The differential equations of the propagation of heat express the most

More information

Kernel B Splines and Interpolation

Kernel B Splines and Interpolation Kernel B Splines and Interpolation M. Bozzini, L. Lenarduzzi and R. Schaback February 6, 5 Abstract This paper applies divided differences to conditionally positive definite kernels in order to generate

More information

Scattered Data Interpolation with Polynomial Precision and Conditionally Positive Definite Functions

Scattered Data Interpolation with Polynomial Precision and Conditionally Positive Definite Functions Chapter 3 Scattered Data Interpolation with Polynomial Precision and Conditionally Positive Definite Functions 3.1 Scattered Data Interpolation with Polynomial Precision Sometimes the assumption on the

More information

Elementary linear algebra

Elementary linear algebra Chapter 1 Elementary linear algebra 1.1 Vector spaces Vector spaces owe their importance to the fact that so many models arising in the solutions of specific problems turn out to be vector spaces. The

More information

OPERATOR THEORY ON HILBERT SPACE. Class notes. John Petrovic

OPERATOR THEORY ON HILBERT SPACE. Class notes. John Petrovic OPERATOR THEORY ON HILBERT SPACE Class notes John Petrovic Contents Chapter 1. Hilbert space 1 1.1. Definition and Properties 1 1.2. Orthogonality 3 1.3. Subspaces 7 1.4. Weak topology 9 Chapter 2. Operators

More information