
Abstract

The wavefront distortion caused by turbulence in the Earth's atmosphere has made it necessary to develop tools for ground-based telescopes that compensate for this aberration. To this end, the technology of Adaptive Optics (AO) has been investigated and used in astronomy. However, AO systems require the solution of large, ill-conditioned linear systems. This thesis deals with possible approaches to finding regularized solutions of such systems. We start with an introduction to Adaptive Optics and the underlying mathematical modeling. The second chapter deals with the regularization of ill-posed problems and introduces some regularization methods. In Chapter 3, we discuss whether certain bases are likely to yield sparse representations of the desired solution and present the iterative soft-thresholding algorithm, which promotes sparsity. Finally, we introduce an accelerated version of this method and conclude by giving some numerical results.

Zusammenfassung

Turbulence in the atmosphere close to the Earth's surface, also known as atmospheric seeing, causes errors in the images of celestial objects produced by ground-based telescopes. It is therefore necessary to equip these telescopes with the technology of Adaptive Optics (AO), which corrects the distorted wavefronts that reach the telescope. Part of this correction by AO systems requires the solution of large linear ill-conditioned problems. The present thesis is concerned with possible approaches to obtaining regularized solutions of such problems. We begin with a short introduction to Adaptive Optics and the underlying mathematical model. The second chapter gives an overview of the theory of regularization of ill-posed problems and introduces well-known regularization methods. In Chapter 3 we address the question of whether there is a basis in which the sought solution can be represented well, i.e. with few non-zero coefficients. Furthermore, we present an iterative algorithm based on soft-thresholding that favors sparse solutions. Finally, we discuss an accelerated version of this method and present numerical results.

Acknowledgement

I owe my deepest gratitude to my supervisor Prof. Ronny Ramlau, who supported me throughout my work on this thesis and always took the time to answer my questions and make suggestions. I would also like to thank my colleagues Dr. Mariya Zhariy and Dr. Tapio Helin, who encouraged me and helped me develop an understanding of the subject. Last but not least, I would like to show my gratitude to my family and friends.

List of Figures

1.1 Correction of the incoming wavefront
1.2 Achievable improvement with adaptive optics. On the left: image dominated by atmospheric turbulence. On the right: corrected image clearly shows the binary star
1.3 Configuration of an AO system
1.4 A deformable mirror
1.5 Schematic representation of the Shack-Hartmann WFS
1.6 Discretization of the aperture for n = 10. The circle describes the pupil of the telescope. Shaded squares are the pupil-masked subapertures and dots represent DM actuators
1.7 Condition number w.r.t. matrix size
2.1 Original phase screen
2.2 Reconstructed phase screens for different noise levels, τ = 1.2
2.3 Relative error in φ w.r.t. noise level δ, τ = 1.2
3.1 φ and ψ for Daubechies wavelets
3.2 1D slice of original phase screen
3.3 Sparsity pattern of P for the Haar basis
3.4 Sparsity pattern of P for the Daubechies wavelets
3.5 Reconstructed 1D phase screens for Haar basis, p = 1 and different noise levels
3.6 Error in φ w.r.t. noise level δ. Haar basis, p = 1
3.7 Error in φ w.r.t. noise level δ for FISTA with p = 1
3.8 Performance of ISTA and FISTA for p = 1
3.9 Performance of ISTA and FISTA for p = 1.5 and p = 2
3.10 Distribution of absolute values of wavelet coefficients around zero for different values of p. x-axis: absolute values of coefficients; y-axis: number of coefficients

Contents

1 Introduction
  1.1 Imaging Through the Atmosphere
  1.2 Adaptive Optics Components
  1.3 Mathematical Modeling
  1.4 Bilinear Influence Functions
2 Linear Ill-Posed Problems
  2.1 Compact Operators
  2.2 Regularization Operators
  2.3 CGNE for the Bilinear Ansatz
3 Sparse Reconstruction
  3.1 An Iterative Soft-Thresholding Algorithm
  3.2 Wavelets
    3.2.1 Multiresolution Analysis
    3.2.2 Orthonormal Bases of Compactly Supported Wavelets
  3.3 Implementation
    3.3.1 Choosing the Weights w_γ
    3.3.2 The Shrinkage Function S_{w_γ,p}
    3.3.3 Building the Poke Matrix
    3.3.4 The Regularization Parameter α
  3.4 A Fast Iterative Soft-Thresholding Algorithm
4 Numerical Results and Conclusion
References

1 Introduction

We start with a (very) brief introduction to Adaptive Optics. For a detailed discussion we refer to [8].

1.1 Imaging Through the Atmosphere

When light from an astronomical object, e.g. a star, propagates through the Earth's atmosphere, it is distorted by atmospheric turbulence, which arises from the interaction between layers of different temperature and different wind speeds. An indicator for turbulence in fluids is the Reynolds number: if the Reynolds number is below a critical value, the flow will be laminar; otherwise turbulence will occur. The Reynolds number is a dimensionless parameter given as

$Re = \frac{\text{inertial forces}}{\text{viscous forces}} = \frac{Vl}{k_\nu}$,

where $V$ is a characteristic velocity, $l$ is a characteristic size and $k_\nu$ is the kinematic viscosity of the fluid. Close to Earth, the solar heating of its surface causes convection currents, and the kinematic viscosity of air is $k_\nu \approx 1.5 \cdot 10^{-5}\,\mathrm{m^2\,s^{-1}}$. For typical characteristic velocities ($V > 1$ m/s) and characteristic lengths $l$ of several meters to kilometers, this results in Reynolds numbers $Re > 10^6$ (e.g. $V = 10$ m/s and $l = 15$ m already give $Re \approx 10^7$), which are sufficiently large for turbulence to occur, [12].

Since a beam of light that propagates through the atmosphere suffers from this turbulence, images of astronomical objects taken from Earth are blurred and distorted. This is the biggest challenge for Earth-based astronomy. Adaptive Optics deals with developing devices for ground-based telescopes that can compensate for this distortion. The main idea of Adaptive Optics is the following: since the object of interest is assumed to be far away from Earth, the propagating wavefronts are almost planar. Due to atmospheric turbulence they become perturbed and are no longer planar. A beam of light can be described by an electric field of the form $Ae^{+i\phi_{atm}}$, where $A$ is the amplitude and $\phi_{atm}$ is the phase of the beam. If a mirror can be shaped according to the conjugated phase, i.e. $Ae^{-i\phi_{atm}}$,

Figure 1.1: Correction of the incoming wavefront.

and the light emitted by the astronomical object is reflected at this mirror, one obtains planar wavefronts and has thus corrected the distortion (see Fig. 1.1). Of course, this is only a very basic and schematic description of an Adaptive Optics (AO) system. The main challenge is to achieve a high-speed, real-time correction for the turbulence. Figure 1.2 shows an example of how much Adaptive Optics can improve the quality of an image.

1.2 Adaptive Optics Components

We now want to go into more detail about what an AO system looks like, i.e. what its main components are. Typically, an AO system consists of

1. a deformable mirror (DM),
2. a wavefront sensor (WFS) and
3. a control computer

(see Fig. 1.3). In order to compensate for the atmospheric turbulence, the

Figure 1.2: Achievable improvement with adaptive optics. On the left: image dominated by atmospheric turbulence. On the right: corrected image clearly shows the binary star.

shape of the DM is adapted in real time to follow the wavefront aberrations. The control computer receives measurement signals from the WFS and generates control signals to drive the DM. The WFS is located after the DM in the optical path; thus, it measures the residual wavefront perturbation after the DM correction has been applied. The aim of the AO control loop is to minimize this residual. Eventually, the corrected wavefront is sent to an astronomical instrument, such as an imaging camera, which is located in the focal plane of the telescope.

Deformable Mirrors

A deformable mirror consists of a continuous reflective facesheet that is deformed by a set of actuators placed at its back (see Fig. 1.4). There are several designs for the DM and especially for the actuators: the latter can be electromechanical, electromagnetic, piezoelectric or magnetostrictive units. Most commonly, the actuators are piezoelectric elements. The most important parameters in the design of DMs are

- the number of actuators,
- the spacing between them,
- the maximum stroke,
- the drive voltage levels and

Figure 1.3: Configuration of an AO system.

- the shape of the influence functions (see Section 1.3).

Wavefront Sensors

Figure 1.4: A deformable mirror.

The WFS measures the wavefront distortions caused by atmospheric turbulence. However, most wavefront sensors do not measure the wavefront directly, but rather its first derivative (wavefront slopes) or its second derivative (curvature). The most popular WFS is the Shack-Hartmann type. For the Shack-Hartmann WFS, a lenslet array is optically conjugated to the pupil plane of the telescope. This $n \times n$ grid of tiny lenses spatially samples the distorted wavefront. Each lens forms a small part of the image, corresponding to a part of the aperture (subaperture), onto a detector located in the focal plane of the lenslet array. If the wavefront in the pupil is planar, the subaperture images, called spots, form a regular grid. A distorted wave-

front results in a displacement of the spots (see Fig. 1.5). It is intuitively clear that the displacements of the spots in two orthogonal directions $x$ and $y$ are proportional to the average wavefront slopes $s_x$, $s_y$ in $x$ and $y$ over the corresponding subapertures. Thus, a Shack-Hartmann WFS measures the local gradient error. The wavefront is then reconstructed from the array $s = \begin{bmatrix} s_x \\ s_y \end{bmatrix} \in \mathbb{R}^{2n^2}$ of the measured slopes.

Figure 1.5: Schematic representation of the Shack-Hartmann WFS.

1.3 Mathematical Modeling

Our aim is to reconstruct the incoming wavefront $\phi_{atm} = \phi_{atm}(x) \in H^1(\mathbb{R}^2)$ from given WFS measurements $s \in \mathbb{R}^{2n^2}$ in order to compute the DM commands that are needed to shape the DM such that it is optically conjugated to $\phi_{atm}$. There are different setups of how the DM actuators are arranged. The most commonly used one for Shack-Hartmann WFSs is the Fried geometry, in which the actuators are located at the corners of the subapertures. By introducing a phase-to-WFS interaction operator $G : H^1(\mathbb{R}^2) \to \mathbb{R}^{2n^2}$, the problem can be formulated as follows:

$s = G\phi_{atm}$.    (1.1)

For a Shack-Hartmann WFS, $G\phi_{atm}$ is the gradient of the phase averaged over those subapertures that are located entirely inside the pupil, i.e. $G$ has the following structure:

$G = [G_x, G_y] = [M\Gamma_x, M\Gamma_y]$,

where for the $i$-th subaperture $\Omega_i$

$[\Gamma_x \phi_{atm}]_i = \int_{\Omega_i} \frac{\partial \phi_{atm}}{\partial x}\, dx$,

and similarly for $\Gamma_y$. Furthermore, the mask $M$ is defined as the diagonal matrix whose $i$-th diagonal entry is 1 if $\Omega_i$ is completely contained in the pupil and 0 otherwise.

Since $H^1(\mathbb{R}^2) \subset L^2(\mathbb{R}^2)$ and $L^2(\mathbb{R}^2) = L^2(\mathbb{R}) \otimes L^2(\mathbb{R})$, we can use a basis $\{b_i\}_{i=1}^\infty$ of $L^2(\mathbb{R})$ to represent the wavefront $\phi_{atm} \in H^1(\mathbb{R}^2)$:

$\phi_{atm}(x, y) = \sum_{i,j=1}^\infty \varphi_{i,j}\, b_i(x) b_j(y) = B\varphi$,    (1.2)

where $\varphi_{i,j}$ is the $(i,j)$-th coefficient of $\phi_{atm}$ in this basis representation and $B$ denotes the operator that maps the coefficients to $\phi_{atm} \in H^1(\mathbb{R}^2)$. By inserting (1.2) in (1.1) we get

$s = GB\varphi = P\varphi$,    (1.3)

where $P = GB$ denotes the DM-to-WFS (or Poke) operator and $\varphi = (\varphi_{i,j})_{i,j \in \mathbb{N}}$.

For the actual computation of the corrected wavefront $\phi_{rec}$ generated by the deformable mirror we use a discrete approach. We therefore compute a solution in finite-dimensional spaces $X_k \subset L^2(\mathbb{R}^2)$ for which

$X_1 \subset X_2 \subset X_3 \subset \dots, \qquad \overline{\bigcup_{k \in \mathbb{N}} X_k} = L^2(\mathbb{R}^2)$,

where $k$ is the dimension of $X_k$. If the telescope pupil consists of $n \times n$ subapertures and if we neglect the mask $M$, we end up with $N := (n+1)^2$ DM commands. Hence, we consider the space $X_N$ and represent $\phi_{rec}$ via the basis $\{b_i\}_{i=1}^{n+1}$ of the corresponding subspace of $L^2(\mathbb{R})$ and coefficients $a_{i,j}$:

$\phi_{rec}(x, y) = \sum_{i,j=1}^{n+1} a_{i,j}\, b_i(x) b_j(y)$.    (1.4)

We assume that $\{b_i\}_{i=1}^{n+1}$ is a basis that is shift invariant and locally supported. We can thus define $h_j$ for $j = (j_2 - 1)(n+1) + j_1$ by

$h_j(x, y) := b\Big(\frac{x - x_{j_1}}{\Delta}\Big)\, b\Big(\frac{y - y_{j_2}}{\Delta}\Big)$.

Here, $\Delta := 1/n$ and $b$ is defined such that

$b\Big(\frac{x - x_{j_1}}{\Delta}\Big) = b_{j_1}(x), \qquad b\Big(\frac{y - y_{j_2}}{\Delta}\Big) = b_{j_2}(y)$.

With this new representation we get

$\phi_{rec} = Ha = \sum_{j=1}^N a_j h_j$,    (1.5)

where $a \in \mathbb{R}^N$ is the vector of DM commands, the $h_j$ are the influence functions and $H : \mathbb{R}^N \to H^1(\mathbb{R}^2)$ is the DM-to-phase operator. Note that we need sufficiently smooth influence functions in order to guarantee that $H$ maps to $H^1(\mathbb{R}^2)$. The influence functions are called bilinear if

$b(z) = \begin{cases} 1 - |z| & |z| \le 1, \\ 0 & \text{otherwise}, \end{cases}$

and they are called bicubic if $b$ is a cubic B-spline supported on the interval $[-2, 2]$. By inserting (1.5) in (1.1) we can introduce the DM-to-WFS matrix (Poke matrix) $P$ as the product $GH$, i.e.

$s = GHa = Pa$.    (1.6)

The entries of the Poke matrix $P = [P^x, P^y]$ are then determined by

$P_{i,j}^x = \int_{\Omega_i} \frac{\partial h_j(x,y)}{\partial x}\, d(x,y) = \int_{y_{i_2}}^{y_{i_2+1}} \big( h_j(x_{i_1+1}, y) - h_j(x_{i_1}, y) \big)\, dy$,    (1.7)

$P_{i,j}^y = \int_{\Omega_i} \frac{\partial h_j(x,y)}{\partial y}\, d(x,y) = \int_{x_{i_1}}^{x_{i_1+1}} \big( h_j(x, y_{i_2+1}) - h_j(x, y_{i_2}) \big)\, dx$,    (1.8)

where $\Omega_i = [x_{i_1}, x_{i_1+1}] \times [y_{i_2}, y_{i_2+1}]$. In practice, the slopes $s$ will not be given exactly; instead, perturbed data $s^\delta$ will be available, i.e. $s^\delta = s + \eta^\delta$ for some random noise vector $\eta^\delta$. Then, (1.6) reads

$s^\delta = Pa$.    (1.9)

Note that the total number of slope measurements is approximately twice the number of DM commands. This redundancy in the measurements has the beneficial effect of smoothing random errors, which, in contrast to the case without redundancy, do not accumulate.

1.4 Bilinear Influence Functions

In this section, we assume a discretization with $n \times n$ subapertures (see Fig. 1.6). For bilinear influence functions, (1.7) and (1.8) reduce to

$P_{i,j}^x = \frac{1}{2}\big[ h_j(x_{i_1+1}, y_{i_2}) - h_j(x_{i_1}, y_{i_2}) + h_j(x_{i_1+1}, y_{i_2+1}) - h_j(x_{i_1}, y_{i_2+1}) \big]$,

$P_{i,j}^y = \frac{1}{2}\big[ h_j(x_{i_1}, y_{i_2+1}) - h_j(x_{i_1}, y_{i_2}) + h_j(x_{i_1+1}, y_{i_2+1}) - h_j(x_{i_1+1}, y_{i_2}) \big]$.

Figure 1.6: Discretization of the aperture for n = 10. The circle describes the pupil of the telescope. Shaded squares are the pupil-masked subapertures and dots represent DM actuators.

The operator $P^T P$, where $P$ is the Poke matrix, has zero values in its spectrum, which means that the problem does not have a unique solution. For illustration purposes, we determine $\kappa(P^T P) = \lambda_{max}(P^T P)/\lambda_{min}(P^T P)$ by taking $\lambda_{min}(P^T P)$ as the smallest non-zero eigenvalue in norm; otherwise $\kappa(P^T P) = \infty$ for any $n$ and no comparison would be possible. Figure 1.7 shows how $\kappa(P^T P)$ grows as $n$ gets larger.
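To make the bilinear ansatz concrete, the following MATLAB lines sketch how a single entry of $P^x$ can be evaluated from the reduced formula above. This is our illustration, not code from the thesis; the grid setup (the variables n and xa) is an assumption.

    % Bilinear influence functions on an n x n subaperture grid (Fried geometry).
    n  = 10;                          % subapertures per side (for illustration)
    Delta = 1/n;
    xa = (0:n)*Delta;                 % actuator positions (subaperture corners)
    b  = @(z) max(1 - abs(z), 0);     % bilinear basis function b(z)
    h  = @(x,y,j1,j2) b((x - xa(j1))/Delta) .* b((y - xa(j2))/Delta);

    % Entry P^x for subaperture i = (i1,i2) and actuator j = (j1,j2):
    Px = @(i1,i2,j1,j2) 0.5*( h(xa(i1+1), xa(i2),   j1,j2) - h(xa(i1), xa(i2),   j1,j2) ...
                            + h(xa(i1+1), xa(i2+1), j1,j2) - h(xa(i1), xa(i2+1), j1,j2) );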

14 κ(p T P) n, where pupil has n x n subapertures Figure 1.7: Condition number w.r.t. matrix size. Large condition numbers can cause a problem for noisy data. Consider a linear equation system Ax = b. (1.10) If the data is noisy we need to change the equation to and the change in x is A(x + x) = b + b x = A 1 b. (1.11) The corresponding norm estimates to (1.10) and (1.11) are x 2 A 1 2 b 2, (1.12) b 2 A 2 x 2. (1.13) These two inequations yield an estimate for the relative error in x: x x κ(a) b b. Therefore, if the condition number κ(a) is large, a small perturbation in b can cause a large error in x. 14

But even if $\kappa(A)$ is not large, small eigenvalues of $A$ can still cause problems. If all the eigenvalues of a symmetric matrix $A$ are small, the condition number $\kappa(A) = \lambda_{max}(A)/\lambda_{min}(A)$ will be moderate. Nevertheless, the eigenvalues of $A^{-1}$ will be large, since they are the reciprocals of the eigenvalues of $A$. Since we obtain $\Delta x$ by multiplying with $A^{-1}$ (see (1.11)), the data error can still be amplified.

Systems with large condition numbers are called ill-conditioned. Growing condition numbers are often due to the fact that the underlying continuous problem is ill-posed, a property that is defined in the next chapter. There, we also introduce methods for solving ill-posed problems, which we can use to tackle the ill-conditioned system (1.9).

2 Linear Ill-Posed Problems

The following introduction to ill-posed problems is mainly based on [7]. Let $T : X \to Y$ be a bounded linear operator, where $X$ and $Y$ are Hilbert spaces. We are interested in solving

$Tx = y$    (2.1)

for given $y \in Y$. The problem is called ill-posed if it is not well-posed. The following definition of well-posedness goes back to J. Hadamard:

Definition 2.1. A problem is well-posed if and only if the following properties hold:

(i) For all admissible data, a solution exists: $R(T) = Y$.
(ii) For all admissible data, the solution is unique: $N(T) = \{0\}$.
(iii) The solution depends continuously on the data: $T^{-1} \in L(Y, X)$.

Therefore, a problem is ill-posed if one of these properties is violated. If the last criterion is not fulfilled, serious problems can occur when applying standard numerical methods to the problem, since they become unstable. So-called regularization methods make it possible to recover information about the solution as stably as possible.

Uniqueness can be ensured by reformulating the notion of a solution.

Definition 2.2. Let $T : X \to Y$ be a bounded linear operator. An element $x \in X$ is a least-squares solution of $Tx = y$ if

$\|Tx - y\| = \inf\{\|Tz - y\| : z \in X\}$.

If in addition

$\|x\| = \inf\{\|z\| : z \text{ is a least-squares solution of } Tx = y\}$,

then $x$ is called the best-approximate or generalized solution.

Thus, the best-approximate solution is defined as the least-squares solution of minimal norm. We can now define the Moore-Penrose inverse, which is, roughly speaking, the operator that maps $y$ onto the best-approximate solution of (2.1).
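In finite dimensions, the best-approximate solution is exactly what MATLAB's pinv computes. A tiny sanity check (our example):

    A = [1 0; 0 0];          % N(A) = span{(0,1)'}, R(A) = span{(1,0)'}
    y = [2; 3];              % y does not lie in R(A)
    x_dagger = pinv(A)*y     % returns [2; 0]
    % Every least-squares solution has the form [2; t]'; among these,
    % x_dagger = [2; 0]' is the one of minimal norm.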

Definition 2.3. Let $\tilde{T} := T|_{N(T)^\perp} : N(T)^\perp \to R(T)$. The Moore-Penrose generalized inverse $T^\dagger$ of $T$ is defined as the unique linear extension of $\tilde{T}^{-1}$ such that

$D(T^\dagger) := R(T) \oplus R(T)^\perp$    (2.2)

and

$N(T^\dagger) = R(T)^\perp$.    (2.3)

Moreover, for $y \in D(T^\dagger)$, we define $x^\dagger := T^\dagger y$.

Proposition 2.1. Let $P$ and $Q$ be the orthogonal projectors onto $N(T)$ and $\overline{R(T)}$, respectively. Then the following Moore-Penrose equations hold:

$T T^\dagger T = T$,
$T^\dagger T T^\dagger = T^\dagger$,
$T^\dagger T = I - P$,
$T T^\dagger = Q|_{D(T^\dagger)}$.

Proof. See [7].

Proposition 2.2. The generalized inverse $T^\dagger$ is continuous if and only if $R(T)$ is closed.

Proof. See [7].

The next proposition gives the connection between the generalized inverse and least-squares solutions:

Proposition 2.3. Let $y \in D(T^\dagger)$. Then $x^\dagger$ is the unique best-approximate solution of $Tx = y$. Furthermore, $x^\dagger + N(T)$ is the set of all least-squares solutions.

Proof. See [7].

Let $T^*$ denote the adjoint of $T$, which is defined as the operator mapping from $Y$ to $X$ such that for all $x \in X$ and $y \in Y$

$\langle Tx, y\rangle = \langle x, T^* y\rangle$.

Then, the least-squares solutions can be characterized by the normal equation:

Proposition 2.4. Let $y \in D(T^\dagger)$. Then $x \in X$ is a least-squares solution of (2.1) if and only if it is a solution of the normal equation

$T^*T x = T^* y$,    (2.4)

where $T^*$ is the adjoint of $T$.

Proof. See [7].

2.1 Compact Operators

Let us now consider compact operators, an important class of operators that lead to ill-posed problems. They are of interest because, under suitable assumptions, integral operators are compact, and many problems can be formulated as integral equations. In the following, $K$ always denotes a compact operator.

Definition 2.4. An operator $K \in L(X, Y)$ is compact if for all bounded subsets $B$ of $X$ the closure $\overline{K(B)}$ is a compact set.

A self-adjoint compact linear operator can be represented by its eigensystem, which will help in introducing regularization methods. In the following, $\langle \cdot, \cdot \rangle$ denotes the inner product of the respective Hilbert spaces $X$ and $Y$. By taking all non-zero eigenvalues $\lambda_n$ of $K$ and a corresponding complete system of eigenvectors $v_n$, we obtain the following representation for all $x \in X$:

$Kx = \sum_{n=1}^\infty \lambda_n \langle x, v_n\rangle v_n$.    (2.5)

If $K$ is not self-adjoint, we can find a decomposition by its singular system $(\sigma_n; v_n, u_n)$, which is defined as follows. The operator $K^*K$ is self-adjoint, since

$\langle K^*Kx, y\rangle = \langle Kx, Ky\rangle = \langle x, K^*Ky\rangle$.

Therefore, we can decompose w.r.t. the complete eigensystem $(\lambda_n, v_n)$ of $K^*K$:

$K^*Kx = \sum_{n=1}^\infty \lambda_n \langle x, v_n\rangle v_n$.    (2.6)

Due to

$\langle K^*Kx, x\rangle = \langle Kx, Kx\rangle = \|Kx\|^2 \ge 0$,

we have that $K^*K$ is positive semi-definite. Thus, all the eigenvalues $\lambda_n$ are non-negative and we can define $\sigma_n$ and $u_n$ such that

$\sigma_n := +\sqrt{\lambda_n}$,    (2.7)

$K v_n = \sigma_n u_n$.    (2.8)

Applying $K^*$ to (2.8) and using (2.6), we get $\sigma_n^2 v_n = K^* \sigma_n u_n$, i.e. $\sigma_n v_n = K^* u_n$, since a possible zero eigenvalue $\lambda_n$ is not used for the decomposition. By applying the operator $K$ to the last equation we get

$K \sigma_n v_n = K K^* u_n$,

or equivalently

$\sigma_n^2 u_n = K K^* u_n$,

which means that $u_n$ is an eigenvector of $KK^*$ for the eigenvalue $\sigma_n^2$. Moreover, it is easy to show that $(u_n)_{n\in\mathbb{N}}$ is an orthonormal system:

$\langle u_n, u_m\rangle = \frac{1}{\sigma_n \sigma_m}\langle K v_n, K v_m\rangle = \frac{1}{\sigma_n \sigma_m}\langle K^*K v_n, v_m\rangle = \frac{\sigma_n}{\sigma_m}\langle v_n, v_m\rangle = \delta_{nm}$.

For the proof of the next proposition we need two fundamental theorems of functional analysis, which can be found e.g. in [9]:

Theorem 2.1. Let $K : X \to Y$ be a compact operator and $K^*$ its adjoint. Then

$N(K) = R(K^*)^\perp, \qquad N(K^*) = R(K)^\perp$.

Theorem 2.2. Let $H$ be a Hilbert space and let $S$ be a closed subspace of $H$. Then $H$ is given as the direct sum

$H = S \oplus S^\perp$.

We can now state the following

Proposition 2.5. Let $K : X \to Y$ be a compact operator. The following properties hold:

$\overline{R(K^*K)} = \overline{R(K^*)}$,    (2.9)

$\overline{R(KK^*)} = \overline{R(K)}$.    (2.10)

Proof. It is sufficient to show the first equality, since the second one is its immediate consequence. Due to Theorems 2.1 and 2.2 we only need to show

$N(K) = N(K^*K)$.    (2.11)

For $x \in X$ it is obvious that $Kx = 0$ implies $K^*Kx = 0$, i.e. $N(K) \subseteq N(K^*K)$. Now let $x \in N(K^*K)$. Because of $K^*Kx = 0$, we get that $Kx \in N(K^*) = R(K)^\perp$. But since also $Kx \in R(K)$, it follows that $Kx = 0$, i.e. $x \in N(K)$. Hence

$N(K^*K) \subseteq N(K)$,

and therefore (2.11) holds.

From this proposition we conclude that $(u_n)_{n\in\mathbb{N}}$ and $(v_n)_{n\in\mathbb{N}}$ span $\overline{R(K)}$ and $\overline{R(K^*)}$, respectively. Therefore, for $x \in X$ and $y \in Y$ we get the following singular value decomposition:

$Kx = \sum_{n=1}^\infty \sigma_n \langle x, v_n\rangle u_n$,    (2.12)

$K^*y = \sum_{n=1}^\infty \sigma_n \langle y, u_n\rangle v_n$.    (2.13)

If $R(K)$ is finite-dimensional, then $K$ has only finitely many singular values. Otherwise, there is exactly one accumulation point of the singular values, namely 0: $\lim_{n\to\infty} \sigma_n = 0$. The range $R(K)$ is closed if and only if it is finite-dimensional. Together with Proposition 2.2, this yields

Proposition 2.6. Let $K : X \to Y$ be a compact operator. Then the generalized inverse $K^\dagger$ is continuous if and only if $\dim R(K) < \infty$.

Therefore, the generalized inverse of a compact operator with infinite-dimensional range cannot be continuous. This means that the best-approximate solution does not depend continuously on the right-hand side, which makes the equation ill-posed. The next statement is of central importance:

Proposition 2.7. Let $(\sigma_n; v_n, u_n)$ be a singular system for the compact linear operator $K$ and $y \in Y$. Then

$y \in D(K^\dagger) \iff \sum_{n=1}^\infty \frac{|\langle y, u_n\rangle|^2}{\sigma_n^2} < \infty$    (2.14)

and for $y \in D(K^\dagger)$

$K^\dagger y = \sum_{n=1}^\infty \frac{\langle y, u_n\rangle}{\sigma_n}\, v_n$.    (2.15)

Proof. See [7].

Equivalence (2.14) is called the Picard criterion and gives a necessary and sufficient condition for the existence of a best-approximate solution. It states that the coefficients $(\langle y, u_n\rangle)_{n=1}^\infty$ have to decay fast enough relative to the singular values $\sigma_n$. If such a solution exists, then equation (2.15) yields a formula for computing it. Note that error components corresponding to small singular values can be drastically amplified. If $\dim R(K) < \infty$, there are only finitely many singular values, and the amplification is therefore bounded; but it can still be unacceptably large.

In order to introduce regularization operators we need the notion of a function of a self-adjoint operator. Recall that if $(\sigma_n; v_n, u_n)$ is a singular system for the compact operator $K : X \to Y$, we get for all $x \in X$

$K^*Kx = \sum_{n=1}^\infty \sigma_n^2 \langle x, v_n\rangle v_n$,    (2.16)

since $(\sigma_n^2; v_n)$ is an eigensystem of $K^*K$. For $\lambda \in \mathbb{R}$, $x \in X$ and $P$ the orthogonal projector onto $N(K^*K)$, we define

$E_\lambda x := \sum_{n=1,\ \sigma_n^2 < \lambda}^\infty \langle x, v_n\rangle v_n \ (+\, Px)$,    (2.17)

where the component $Px$ appears only for $\lambda > 0$. The operator $E_\lambda$ is an orthogonal projector onto

$X_\lambda := \mathrm{span}\{v_n : n \in \mathbb{N},\ \sigma_n^2 < \lambda\} \ (+\, N(K^*K), \text{ if } \lambda > 0)$.

Obviously, $E_\lambda = 0$ for $\lambda \le 0$. For $\lambda > \sigma_1^2$ we have $E_\lambda = I$, since $X_\lambda = \overline{R(K^*K)} + N(K^*K) = X$. We can show a monotonicity property of the spectral family $E_\lambda$: for all $\lambda \le \mu$ the following holds:

$\langle E_\lambda x, x\rangle = \sum_{\sigma_n^2 < \lambda} |\langle x, v_n\rangle|^2 \ (+\, \|Px\|^2) \le \sum_{\sigma_n^2 < \mu} |\langle x, v_n\rangle|^2 \ (+\, \|Px\|^2) = \langle E_\mu x, x\rangle$.

Additionally, $E_\lambda$ is piecewise constant with jumps at $\lambda = \sigma_n^2$ (and at $\lambda = 0$ if and only if $N(K^*K) \neq \{0\}$) of magnitude

$\sum_{\sigma_n^2 = \lambda} \langle \cdot, v_n\rangle v_n$.

Recall that the integral w.r.t. a piecewise constant weight function is defined as the sum over all function values at the jumps of the integrand multiplied by the heights of these jumps. Without going into the details of measure theory, we state that the following representation is justified (note that the monotonicity property shown above is crucial):

$K^*Kx = \sum_{n=1}^\infty \sigma_n^2 \langle x, v_n\rangle v_n = \int_{\mathbb{R}^+} \lambda\, dE_\lambda x$.    (2.18)

Moreover, one can define a piecewise continuous function $f$ of a self-adjoint (not necessarily compact) operator as

$f(T^*T)x := \int_0^\infty f(\lambda)\, dE_\lambda x$

and the norm of its evaluation by

$\|f(T^*T)x\|^2 := \int_0^\infty |f(\lambda)|^2\, d\|E_\lambda x\|^2$.

For compact operators this reduces to

$f(K^*K)x := \sum_{n=1}^\infty f(\sigma_n^2) \langle x, v_n\rangle v_n \qquad\text{and}\qquad \|f(K^*K)x\|^2 := \sum_{n=1}^\infty f^2(\sigma_n^2) |\langle x, v_n\rangle|^2$.
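In finite dimensions, these spectral formulas can be evaluated directly from an SVD. A short MATLAB check (ours, not from the thesis) for a Tikhonov-type filter function f, assuming K has full column rank so that N(K'K) = {0}:

    K = randn(5,3); x = randn(3,1);
    f = @(lambda) 1./(lambda + 0.1);        % example filter f(lambda)
    [~,S,V] = svd(K, 'econ');
    lam = diag(S).^2;                       % eigenvalues sigma_n^2 of K'*K
    y1  = V*( f(lam).*(V'*x) );             % f(K'K)x via the singular system
    y2  = (K'*K + 0.1*eye(3)) \ x;          % closed form for this particular f
    norm(y1 - y2)                           % agrees up to rounding error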

2.2 Regularization Operators

Let us go back to our original problem. For a linear bounded operator $T : X \to Y$ and a given right-hand side $y \in Y$, we want to find a solution $x \in X$ such that

$Tx = y$.    (2.19)

We have introduced the notion of a generalized inverse $T^\dagger$. If $x^\dagger$ exists, the best-approximate solution is given as

$x^\dagger = T^\dagger y$.    (2.20)

In most applications, the right-hand side $y$ is not given exactly. Instead, we are only given an approximation $y^\delta$ and a bound $\delta$ on the noise level such that

$\|y - y^\delta\| \le \delta$    (2.21)

is guaranteed. In the ill-posed case $T^\dagger$ is not continuous, and thus $T^\dagger y^\delta$ is not necessarily a good approximation of $T^\dagger y$, even if it exists. Therefore, we introduce the notion of regularization, which, roughly speaking, means approximating an ill-posed problem by a family of well-posed problems. The aim is to find an approximation of $x^\dagger$ that, on the one hand, depends continuously on $y^\delta$, so that it can be computed in a stable way, and that, on the other hand, tends to $x^\dagger$ as the noise level $\delta$ approaches zero. We do not want to determine this approximation only for a specific right-hand side, but rather approximate the unbounded operator $T^\dagger$ by a family of continuous parameter-dependent operators $\{R_\alpha\}$ such that, for an appropriate choice of $\alpha = \alpha(\delta, y^\delta)$,

$x_\alpha^\delta := R_\alpha y^\delta$

tends to $x^\dagger$ for $\delta \to 0$.

Definition 2.5. Let $T : X \to Y$ be a bounded linear operator between Hilbert spaces and $\alpha_0 \in (0, \infty]$. For every $\alpha \in (0, \alpha_0)$, let $R_\alpha : Y \to X$ be a continuous operator. The family $\{R_\alpha\}$ is called a regularization for $T^\dagger$ if for all $y \in D(T^\dagger)$ there exists a parameter choice rule $\alpha = \alpha(\delta, y^\delta)$ such that the following holds:

$\lim_{\delta \to 0} \sup\{\, \|R_{\alpha(\delta, y^\delta)}\, y^\delta - T^\dagger y\| : y^\delta \in Y,\ \|y - y^\delta\| \le \delta \,\} = 0$.    (2.22)

The parameter choice rule $\alpha : \mathbb{R}^+ \times Y \to (0, \alpha_0)$ has to fulfill

$\lim_{\delta \to 0} \sup\{\, \alpha(\delta, y^\delta) : y^\delta \in Y,\ \|y - y^\delta\| \le \delta \,\} = 0$.    (2.23)

Therefore, a regularization method has two components: a regularization operator and a parameter choice rule. If $\alpha$ depends only on $\delta$, we call it an a-priori parameter choice rule, otherwise an a-posteriori one. However, due to a theorem by Bakushinskii, $\alpha$ cannot depend on $y^\delta$ only: the theorem states that in this case, convergence of the regularization method implies the boundedness of $T^\dagger$, i.e. the well-posedness of the problem. Thus, for the regularization of an ill-posed problem, $\alpha$ cannot be chosen independently of $\delta$.

The questions that arise are how to construct a family of regularization operators and how to choose parameter choice rules that yield convergence. The following proposition answers the first question for the case of linear operator equations.

Proposition 2.8. Let $R_\alpha$ be a continuous operator for all $\alpha > 0$. Then the family $\{R_\alpha\}$ is a regularization of $T^\dagger$ if $R_\alpha \to T^\dagger$ pointwise on $D(T^\dagger)$ as $\alpha \to 0$. In this case, for all $y \in D(T^\dagger)$ there exists an a-priori parameter choice rule $\alpha(\delta)$ such that $(R_\alpha, \alpha)$ is a convergent regularization method for $Tx = y$.

Proof. See [7].

Thus, we need to construct the regularization operators $R_\alpha$ such that they converge pointwise towards $T^\dagger$. We now discuss possible ways of constructing the regularization operators in the case that $T$ is linear. One can extend the notion of the spectral family to self-adjoint bounded, but not necessarily compact, operators. Now, let $\{E_\lambda\}$ be the spectral family of $T^*T$. If $T^*T$ is continuously invertible, then

$(T^*T)^{-1} = \int \frac{1}{\lambda}\, dE_\lambda$.

By Proposition 2.4 we get for the best-approximate solution

$x^\dagger = \int_{\mathbb{R}^+} \frac{1}{\lambda}\, dE_\lambda T^* y$.    (2.24)

If $R(T)$ is non-closed, i.e. in the case of ill-posedness, the eigenvalues $\{\lambda\}$ accumulate at 0, which means that the above integral has a pole at 0.

The crucial idea of regularization is to replace $1/\lambda$ by a family of functions $\{g_\alpha(\lambda)\}$ that have to fulfill some continuity conditions. We can now replace (2.24) by

$x_\alpha := \int_{\mathbb{R}^+} g_\alpha(\lambda)\, dE_\lambda T^* y$

and define the family of regularization operators according to

$R_\alpha := \int_{\mathbb{R}^+} g_\alpha(\lambda)\, dE_\lambda T^*$.    (2.25)

The following proposition states under which assumptions on $\{g_\alpha\}$ convergence can be guaranteed:

Proposition 2.9. For all $\alpha > 0$, let $g_\alpha : [0, \|T\|^2] \to \mathbb{R}$ be piecewise continuous and constructed in such a way that for all $\lambda \in (0, \|T\|^2]$

$\lambda |g_\alpha(\lambda)| \le C$

and

$\lim_{\alpha \to 0} g_\alpha(\lambda) = \frac{1}{\lambda}$.

Then we have that for all $y \in D(T^\dagger)$

$\lim_{\alpha \to 0} x_\alpha = x^\dagger$

holds.

Proof. See [7].

Since the operator $R_\alpha$ is continuous, $\|y - y^\delta\| \le \delta$ implies the boundedness of the error between $x_\alpha$ and

$x_\alpha^\delta := \int_{\mathbb{R}^+} g_\alpha(\lambda)\, dE_\lambda T^* y^\delta$.

We define

$r_\alpha(\lambda) := 1 - \lambda g_\alpha(\lambda)$

and state the following convergence result for a-priori parameter choice rules:

Proposition 2.10. Let $g_\alpha$ fulfill the assumptions of Proposition 2.9 and assume that $\mu > 0$. Furthermore, let for all $\alpha \in (0, \alpha_0)$ and $\lambda \in [0, \|T\|^2]$ and some $c_\mu > 0$

$\lambda^\mu |r_\alpha(\lambda)| \le c_\mu \alpha^\mu$    (2.26)

hold. If

$G_\alpha := \sup\{\, |g_\alpha(\lambda)| : \lambda \in [0, \|T\|^2] \,\} = O(\alpha^{-1})$ as $\alpha \to 0$

and the so-called source condition $x^\dagger \in R((T^*T)^\mu)$ is satisfied, then the parameter choice rule $\alpha \sim \delta^{\frac{2}{2\mu+1}}$ yields

$\|x_\alpha^\delta - x^\dagger\| = O\big(\delta^{\frac{2\mu}{2\mu+1}}\big)$.

Proof. See [7].

Source conditions usually imply smoothness and boundary conditions on the exact solution. An example for how to choose $\{g_\alpha\}$ is

$g_\alpha(\lambda) := \begin{cases} 1/\lambda & \lambda \ge \alpha, \\ 0 & \lambda < \alpha. \end{cases}$

This method is called truncated singular value expansion. The assumptions of the previous propositions hold with $C = 1$, $c_\mu = 1$, arbitrary $\mu > 0$ and $G_\alpha = 1/\alpha$.

In general, determining $\mu > 0$ such that the source condition holds is not possible. Therefore, we briefly introduce a-posteriori parameter choice rules. The most common a-posteriori choice rule is Morozov's discrepancy principle, which can be formulated as follows: for $g_\alpha$ fulfilling the same assumptions as in Proposition 2.9 and a constant $\tau$ chosen according to

$\tau > \sup\{\, |r_\alpha(\lambda)| : \alpha > 0,\ \lambda \in [0, \|T\|^2] \,\}$,

the regularization parameter defined by the discrepancy principle is

$\alpha(\delta, y^\delta) := \sup\{\, \alpha > 0 : \|T x_\alpha^\delta - y^\delta\| \le \tau\delta \,\}$.    (2.27)

Remark 2.1. The underlying idea of the discrepancy principle is the fact that, since the right-hand side of (2.19) is only known up to a noise level $\delta$, it does not make sense to search for an approximate solution $\tilde{x}$ with a residual $\|T\tilde{x} - y^\delta\| < \delta$. We should only ask for a solution such that the residual is of the order of $\delta$. In addition, a smaller regularization parameter implies less stability, which is why we take the largest possible value for $\alpha$. This is what the discrepancy principle does.
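For matrices, the truncated singular value expansion introduced above takes only a few lines of MATLAB. A sketch (ours, not from the thesis), with alpha as the truncation parameter:

    function x = tsve(A, y, alpha)
    % Truncated singular value expansion: g_alpha(lambda) = 1/lambda for
    % lambda >= alpha and 0 otherwise, applied with lambda = sigma_n^2.
        [U,S,V] = svd(A, 'econ');
        s = diag(S);
        keep = (s.^2 >= alpha);                  % drop small singular values
        coef = zeros(size(s));
        coef(keep) = (U(:,keep)'*y) ./ s(keep);  % <y,u_n>/sigma_n
        x = V*coef;
    end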

27 Proposition The regularization method (R α, α) where α is defined by (2.27) is convergent for all y R(T). Moreover, we have x δ α(δ,y δ ) 2µ x = O(δ 2µ+1 ) (2.28) for all µ (0, µ 0 1/2]. Here, µ 0 denotes the largest number µ for which (2.26) holds. Proof. See [7], Thm Remark 2.2. It is sufficient, that the parameter choice rule α(δ, y δ ) satisfies Tx δ α yδ τδ Tx δ β yδ for some β with α β 2β. This is crucial for the numerical realization of the discrepancy principle. Finally, we introduce three regularization methods. The most commonly used is Tikhonov regularization for which g α (λ) := 1 λ + α. Due to the definition of x δ α and since {λ+α} are the eigenvalues of T T +αi, we have i.e. x δ α = λ R + g α (λ)de λ T y δ = (T T + αi) 1 T y δ, (2.29) (T T + αi)x δ α = T y δ, which can be regarded as a regularized form of the normal equation. By applying Tikhonov regularization to a compact operator K with singular system (σ n ; v n, u n ), we get x δ α = n=1 σ n σ 2 n + α yδ, u n v n. Compared to the original singular value expansion, the factor 1 σ n is now replaced by σn which is bounded for n. σn 2 +α Proposition Let x δ α be defined as in (2.29). Then it is the unique minimizer of the Tikhonov functional x Tx y δ 2 + α x 2. 27

Proof. See [7].

This proposition clearly shows what regularization does: one tries to find a solution that, on the one hand, minimizes the residual as far as possible and, on the other hand, enforces stability by introducing the penalty term $\|x\|^2$. The factor $\alpha$ ensures that the influence of the second term tends to 0 as the noise vanishes.

Corollary 1. If $\alpha(\delta, y^\delta)$ is chosen according to the discrepancy principle, Tikhonov regularization converges and yields (2.28).

Proof. This is an immediate consequence of Proposition 2.11.

Another widespread regularization method is the so-called Landweber method. It uses only discrete values for $\alpha$ and is therefore an iterative method. Here, the family of functions approximating $1/\lambda$ is defined by

$g_k(\lambda) := \frac{1 - (1 - \lambda)^k}{\lambda}, \qquad k \in \mathbb{N}$.

Finally, we mention a version of the conjugate gradient method. The CG algorithm is a very efficient solver for self-adjoint positive (semi-)definite well-posed linear equations. In the case of an ill-posed equation $Tx = y^\delta$, we apply the CG method to the corresponding normal equation $T^*Tx = T^* y^\delta$ and call this the CGNE method (conjugate gradients for the normal equation). Unlike Landweber iteration, the CGNE method is not based on a fixed sequence of polynomials $\{g_k\}$ and $\{r_k\}$, since these polynomials now depend on the given right-hand side. This ensures higher flexibility; the drawback, however, is that $\{x_k^\delta\}$ depends non-linearly on the data $y^\delta$.

Proposition 2.13. If $k(\delta, y^\delta)$ is chosen according to the discrepancy principle, both CGNE and Landweber iteration converge and yield (2.28).

Proof. See [7].

Remark 2.3. For the CGNE method, in the non-attainable case $y \in D(T^\dagger) \setminus R(T)$, $y^\delta$ needs to be replaced by $Qy^\delta$ in (2.27). Since

$T^*Tx = T^*y \iff Tx = Qy$,    (2.30)

it is sufficient to replace $\|T x_\alpha^\delta - y^\delta\|$ by $\|T^*T x_\alpha^\delta - T^* y^\delta\|$ in (2.27).

Proof of (2.30). By Proposition 2.4, $T^*Tx = T^*y$ holds if and only if $x$ is a least-squares solution of $Tx = y$, which by Propositions 2.1 and 2.3 is equivalent to

$Tx = TT^\dagger y$,    (2.31)

and, since $TT^\dagger = Q$ on $D(T^\dagger)$,

$Qy = Tx$.    (2.32)
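Algorithm 1 below states the CGNE iteration in pseudocode; as a complement, here is a minimal MATLAB transcription for the matrix case (ours; it starts from x = 0 and uses the stopping rule of Remark 2.3):

    function x = cgne(T, y, tau, delta)
    % CG applied to the normal equation T'*T*x = T'*y, stopped by the
    % discrepancy principle.
        x = zeros(size(T,2), 1);
        d = y - T*x;
        r = T'*d;                 % r = T'(y - Tx), so norm(r) = ||T'Tx - T'y||
        p = r;
        while norm(r) > tau*delta
            q     = T*p;
            alpha = norm(r)^2/norm(q)^2;
            x     = x + alpha*p;
            d     = d - alpha*q;
            rnew  = T'*d;
            beta  = norm(rnew)^2/norm(r)^2;
            p     = rnew + beta*p;
            r     = rnew;
        end
    end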

Algorithm 1 CGNE
  $x_0^\delta = \bar{x}$, $d_0 = y^\delta - T x_0^\delta$, $p_1 = r_0 = T^* d_0$, $k = 1$
  while $\|T^*T x_{k-1}^\delta - T^* y^\delta\| > \tau\delta$ do
    $q_k = T p_k$
    $\alpha_k = \|r_{k-1}\|^2 / \|q_k\|^2$
    $x_k^\delta = x_{k-1}^\delta + \alpha_k p_k$
    $d_k = d_{k-1} - \alpha_k q_k$
    $r_k = T^* d_k$
    $\beta_k = \|r_k\|^2 / \|r_{k-1}\|^2$
    $p_{k+1} = r_k + \beta_k p_k$
    $k = k + 1$
  end while

2.3 CGNE for the Bilinear Ansatz

After this brief introduction to ill-posed problems, we return to our original problem (1.9), which is ill-conditioned, as discussed in Section 1.4. It is therefore necessary to apply regularization methods in order to get better results. We try to reconstruct the phase screen $\phi_{atm}$ shown in Figure 2.1, which has a resolution of $256 \times 256$, by applying the CGNE method. For the time being, we are only interested in reconstructing $\phi_{atm}$ from given slope measurements $s$, i.e. we do not consider a telescope with a given number of subapertures. We compute the corresponding Poke matrix $P$ as if the number of subapertures were equal to the resolution of $\phi_{atm}$, neglecting the pupil mask. What needs to be done is to evaluate the operator that maps $\phi_{atm}$ to the slope measurements $s = [s_x, s_y]$. Since we assume the Fried geometry, the components of $s$ are given by

$(s_x)_i = \int_{\Omega_i} \frac{\partial \phi_{atm}}{\partial x}\, d(x, y)$,    (2.33)

$(s_y)_i = \int_{\Omega_i} \frac{\partial \phi_{atm}}{\partial y}\, d(x, y)$.    (2.34)

By applying the theorem of Fubini and the midpoint rule for approximating

Figure 2.1: Original phase screen.

the integrals, we get

$(s_x)_i = \int_{y_{i_2}}^{y_{i_2+1}} \big[\phi_{atm}(x_{i_1+1}, y) - \phi_{atm}(x_{i_1}, y)\big]\, dy \approx \frac{\Delta y_{i_2}}{2} \big[\phi_{atm}(x_{i_1+1}, y_{i_2+1}) - \phi_{atm}(x_{i_1}, y_{i_2+1}) + \phi_{atm}(x_{i_1+1}, y_{i_2}) - \phi_{atm}(x_{i_1}, y_{i_2})\big]$,

$(s_y)_i = \int_{x_{i_1}}^{x_{i_1+1}} \big[\phi_{atm}(x, y_{i_2+1}) - \phi_{atm}(x, y_{i_2})\big]\, dx \approx \frac{\Delta x_{i_1}}{2} \big[\phi_{atm}(x_{i_1+1}, y_{i_2+1}) - \phi_{atm}(x_{i_1+1}, y_{i_2}) + \phi_{atm}(x_{i_1}, y_{i_2+1}) - \phi_{atm}(x_{i_1}, y_{i_2})\big]$,

with $\Delta x_{i_1} := x_{i_1+1} - x_{i_1}$ and $\Delta y_{i_2} := y_{i_2+1} - y_{i_2}$. We then generate $s^\delta$ by adding to $s$ a random noise vector with normally distributed components. The noisy data should satisfy

$\|s - s^\delta\|_{rms} \le \delta$

for a given noise level $\delta$. Here, the norm of a vector $s$ of length $m$ is defined as

$\|s\|_{rms} := \Big(m^{-1} \sum_i s_i^2\Big)^{1/2}$.
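Figures 2.2 and 2.3 quote δ as a percentage, which suggests a noise level relative to $\|s\|_{rms}$. Under that assumption, such data can be generated as in the following sketch (ours; the variables s and delta_rel are assumptions):

    rms = @(v) sqrt(mean(v.^2));
    eta = randn(size(s));                      % i.i.d. Gaussian noise
    eta = eta/rms(eta) * delta_rel*rms(s);     % scale to the desired level
    s_delta = s + eta;                         % rms(s - s_delta) = delta_rel*rms(s)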

Figure 2.2: Reconstructed phase screens for different noise levels ($\delta = 0\%$, $2.5\%$, $5\%$ and $10\%$), $\tau = 1.2$.

Then, we apply the CGNE algorithm with the discrepancy principle to $Pa = s^\delta$ in order to determine the point-wise values $a_j$. The reconstructed wavefront is then given by

$\phi_{rec}^\delta = \sum_j a_j h_j$.

Figure 2.2 shows the reconstructed phase screens for different choices of $\delta$. The crucial point is that the error $\|\phi_{atm} - \phi_{rec}^\delta\|$ tends to zero for $\delta \to 0$ (see Fig. 2.3).

Figure 2.3: Relative error in $\phi$ w.r.t. noise level $\delta$, $\tau = 1.2$ (y-axis: $\|\phi - \phi_{rec}\|/\|\phi\|$; x-axis: $\|s - s^\delta\|/\|s\|$).

3 Sparse Reconstruction

If it is known that the desired solution of $Tx = y$ is likely to be sparse in some basis $\{\varphi_\gamma : \gamma \in \Gamma\}$ of $X$, one could consider regularization methods that yield solutions that are sparse w.r.t. $\{\varphi_\gamma : \gamma \in \Gamma\}$. Therefore, we discuss an additional regularization method for linear inverse problems that promotes sparsity. The underlying idea is to replace the quadratic penalty term in the Tikhonov functional of Proposition 2.12 by a weighted $\ell^p$-norm of the coefficients of $x$ w.r.t. an orthonormal basis $\{\varphi_\gamma : \gamma \in \Gamma\}$ of $X$. Thus, we aim at minimizing

$\Phi_{w,p}(x) = \|Tx - y\|^2 + \sum_{\gamma \in \Gamma} w_\gamma |\langle x, \varphi_\gamma\rangle|^p$,    (3.1)

where $p \in [1, 2]$ and $w = (w_\gamma)_{\gamma \in \Gamma}$ is a sequence of strictly positive weights. For the choice $w \equiv 1$, the penalty term is the ordinary $\ell^p$-norm of the coefficients, raised to the $p$-th power. Another possible choice leads to a penalty term that is equivalent to a Besov norm (see Section 3.3.1). Keeping the weights fixed and decreasing $p$ from 2 to 1 increases the penalization of coefficients that are smaller than 1 and decreases the penalization of those that are larger than 1. Thus, the more we decrease $p$, the more likely we are to obtain a generalized solution that has a sparse expansion w.r.t. $\{\varphi_\gamma : \gamma \in \Gamma\}$.

3.1 An Iterative Soft-Thresholding Algorithm

The following algorithm was introduced in [6]. The minimization of the functional in (3.1) can be rewritten in a variational formulation. Since $\{\varphi_\gamma : \gamma \in \Gamma\}$ is an orthonormal basis, we get

$\Phi_{w,p}(x) = \langle x, T^*Tx\rangle - 2\langle x, T^*y\rangle + \langle y, y\rangle + \sum_{\gamma \in \Gamma} w_\gamma |x_\gamma|^p = \sum_{\gamma \in \Gamma} x_\gamma \langle T^*Tx, \varphi_\gamma\rangle - 2 \sum_{\gamma \in \Gamma} x_\gamma \langle T^*y, \varphi_\gamma\rangle + \langle y, y\rangle + \sum_{\gamma \in \Gamma} w_\gamma |x_\gamma|^p$,

where $x_\gamma$ is a shortcut for $\langle x, \varphi_\gamma\rangle$. Differentiating w.r.t. $x$ yields

$2 \sum_{\gamma \in \Gamma} \langle T^*Tx, \varphi_\gamma\rangle \varphi_\gamma - 2 \sum_{\gamma \in \Gamma} \langle T^*y, \varphi_\gamma\rangle \varphi_\gamma + \sum_{\gamma \in \Gamma} w_\gamma\, p\, |\langle x, \varphi_\gamma\rangle|^{p-1} \mathrm{sign}(\langle x, \varphi_\gamma\rangle)\, \varphi_\gamma = 0$,

which implies that each coefficient is zero, i.e. for all $\gamma \in \Gamma$

$\langle T^*Tx, \varphi_\gamma\rangle - \langle T^*y, \varphi_\gamma\rangle + \frac{w_\gamma p}{2} |\langle x, \varphi_\gamma\rangle|^{p-1} \mathrm{sign}(\langle x, \varphi_\gamma\rangle) = 0$.    (3.2)

Here, we implicitly set the derivative of the absolute value to zero at the origin. Both the coupling of the equations through $T^*Tx$ and the nonlinearity of the equations make the above system hard to solve. Therefore, one introduces a surrogate functional that has nicer properties and minimizes it instead of $\Phi_{w,p}(x)$. As we will state later, the surrogate functional introduced below approximates $\Phi_{w,p}$. The new surrogate functional is constructed by adding an additional functional to $\Phi_{w,p}(x)$:

$\Phi_{w,p}^{SUR}(x; a) := \Phi_{w,p}(x) - \|Tx - Ta\|^2 + C\|x - a\|^2$
$= \|Tx - y\|^2 + \sum_{\gamma \in \Gamma} w_\gamma |x_\gamma|^p - \|Tx - Ta\|^2 + C\|x - a\|^2$
$= C\|x\|^2 - 2\langle x, T^*y - T^*Ta + Ca\rangle + \sum_{\gamma \in \Gamma} w_\gamma |x_\gamma|^p + \|y\|^2 - \|Ta\|^2 + C\|a\|^2$
$= \sum_{\gamma \in \Gamma} \big( C x_\gamma^2 - 2 x_\gamma (Ca + T^*y - T^*Ta)_\gamma + w_\gamma |x_\gamma|^p \big) + \|y\|^2 - \|Ta\|^2 + C\|a\|^2$,    (3.3)

where $C$ is a constant fulfilling $\|T^*T\| < C$. Note that instead of introducing $C$ we could also rescale the equation such that $\|T\| < 1$. Since $\Phi_{w,p}(x)$ and $\Psi(x; a) := C\|x - a\|^2 - \|Tx - Ta\|^2$ are both strictly convex in $x$ (for any $1 \le p \le 2$ and any $a$), $\Phi_{w,p}^{SUR}(x; a)$ is also strictly convex in $x$ and therefore has a unique minimizer for any $a$. The main advantage of this surrogate functional is that the variational equations for the $x_\gamma$ are no longer coupled. We can now define an iterative algorithm that will lead us to a minimizer of the original functional $\Phi_{w,p}(x)$:

$x^0$ arbitrary; $\qquad x^n = \arg\min_{x \in X} \Phi_{w,p}^{SUR}(x; x^{n-1}), \quad n \in \mathbb{N}$.

Thus, we first determine the minimizer $x^1$ of the surrogate functional with $a = x^0$ and then, in each iteration, minimize the functional for $a = x^{n-1}$. Let us now consider the case $p = 1$, which is most likely to yield sparse solutions. Then, the summand in (3.3) is differentiable in $x_\gamma$ only for $x_\gamma \neq 0$. In this case, we end up with the following variational equation:

$2C x_\gamma - 2\big(Ca + T^*(y - Ta)\big)_\gamma + w_\gamma\, \mathrm{sign}(x_\gamma) = 0$.

Thus, for $x_\gamma > 0$ we get

$x_\gamma = a_\gamma + \frac{1}{C}\big(T^*(y - Ta)\big)_\gamma - \frac{w_\gamma}{2C}$,

which is only valid for

$a_\gamma + \frac{1}{C}\big(T^*(y - Ta)\big)_\gamma > \frac{w_\gamma}{2C}$.    (3.4)

For $x_\gamma < 0$, we similarly get

$x_\gamma = a_\gamma + \frac{1}{C}\big(T^*(y - Ta)\big)_\gamma + \frac{w_\gamma}{2C}$,

which is true only if

$a_\gamma + \frac{1}{C}\big(T^*(y - Ta)\big)_\gamma < -\frac{w_\gamma}{2C}$.    (3.5)

If neither (3.4) nor (3.5) holds, we set $x_\gamma = 0$. Let $S_{w,1} : \mathbb{R} \to \mathbb{R}$ be the function defined as

$S_{w,1}(t) := \begin{cases} t - w/2 & \text{if } t \ge w/2, \\ 0 & \text{if } |t| < w/2, \\ t + w/2 & \text{if } t \le -w/2. \end{cases}$

Then

$x_\gamma = S_{w_\gamma/C,\,1}\Big( a_\gamma + \frac{1}{C}\big(T^*(y - Ta)\big)_\gamma \Big)$.

Therefore, for $p = 1$ the iterative method reads as follows: for all $\gamma \in \Gamma$,

$x_\gamma^n = S_{w_\gamma/C,\,1}\Big( x_\gamma^{n-1} + \frac{1}{C}\big(T^*(y - Tx^{n-1})\big)_\gamma \Big)$.

If $p > 1$, the summand in (3.3) is differentiable in $x_\gamma$, and minimization reduces to solving the variational equation

$2C x_\gamma - 2\big(Ca + T^*(y - Ta)\big)_\gamma + p\, w_\gamma\, \mathrm{sign}(x_\gamma)\, |x_\gamma|^{p-1} = 0$.    (3.6)

Since for any $w \ge 0$ and $p > 1$ the real function $F_{w,p}(x) = x + \frac{wp}{2}\,\mathrm{sign}(x)|x|^{p-1}$ is bijective on $\mathbb{R}$, we can define

$S_{w,p} := (F_{w,p})^{-1}$

and can again find the minimizer of (3.3) via

$x_\gamma = S_{w_\gamma,\,p}\Big( a_\gamma + \frac{1}{C}\big(T^*(y - Ta)\big)_\gamma \Big)$.

Unlike for $p = 1$, there is no explicit formula for $S_{w,p}$. Thus, when implementing this method for $p > 1$, we need an algorithm to solve the non-linear equation (3.6), which is discussed in Section 3.3.2.

We now want to turn to convergence results for this method. The following proposition states the existence of a unique minimizer of the surrogate functional:

Proposition 3.1. Let $T$ be an operator mapping from a Hilbert space $X$ to another Hilbert space $Y$, assume $\|T^*T\| < 1$ and $y \in Y$. Additionally, suppose that $\{\varphi_\gamma\}_{\gamma \in \Gamma}$ is an orthonormal basis of $X$ and that $w = \{w_\gamma\}_{\gamma \in \Gamma}$ is a sequence of strictly positive elements. Then, for arbitrarily chosen $a \in X$,

$\Phi_{w,p}^{SUR}(x; a) = \|Tx - y\|^2 + \sum_{\gamma \in \Gamma} w_\gamma |x_\gamma|^p - \|Tx - Ta\|^2 + \|x - a\|^2$

has a unique minimizer in $X$, which is given by

$x = \sum_{\gamma \in \Gamma} S_{w_\gamma,\,p}\big( a_\gamma + (T^*(y - Ta))_\gamma \big)\, \varphi_\gamma$.

Proof. See [6].

The next proposition guarantees that the iterates of the algorithm using surrogate functionals converge to a minimizer of the original functional $\Phi_{w,p}$:

Proposition 3.2. Assume that the conditions of the previous statement hold. If, in addition, the sequence $w = \{w_\gamma\}$ is uniformly bounded from below by a constant $c > 0$, i.e. if $w_\gamma \ge c$ for all $\gamma \in \Gamma$, then for any $x^0 \in X$ the iterates

$x^n = \sum_{\gamma \in \Gamma} S_{w_\gamma,\,p}\big( x_\gamma^{n-1} + (T^*(y - Tx^{n-1}))_\gamma \big)\, \varphi_\gamma$

converge strongly to a minimizer of

$\Phi_{w,p}(x) = \|Tx - y\|^2 + \sum_{\gamma \in \Gamma} w_\gamma |\langle x, \varphi_\gamma\rangle|^p$.    (3.7)

Proof. See [6].
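Propositions 3.1 and 3.2 translate directly into code. A minimal MATLAB sketch of the iteration for $p = 1$ (ours, not the thesis implementation; T is a matrix with $\|T\| < 1$ representing the operator in the chosen orthonormal basis, and w is the vector of weights):

    soft = @(t, thr) sign(t).*max(abs(t) - thr, 0);   % S_{w,1}, threshold w/2
    x = zeros(size(T,2), 1);
    for it = 1:500                      % fixed iteration count for simplicity
        x = soft(x + T'*(y - T*x), w/2);
    end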

So far, it is not clear whether minimizing (3.7) is related to solving $Tx = y$. The question is how the penalized functional (3.7) can lead to a regularization method for the original problem. First of all, we need to introduce an additional parameter $\alpha$ in order to be able to vary the weight of the penalty term. Thus, we consider the functional

$\Phi_{\alpha,w,p}(x) = \|Tx - y\|^2 + \alpha \sum_{\gamma \in \Gamma} w_\gamma |\langle x, \varphi_\gamma\rangle|^p$.

Due to Definition 2.5, the family of operators mapping $y^\delta$ to the minimizers of $\Phi_{\alpha,w,p}$ is a regularization for $T^\dagger$ if for all $y \in D(T^\dagger)$ there exists a parameter choice rule $\alpha = \alpha(\delta, y^\delta)$ such that

$\lim_{\delta \to 0} \sup\{\, \|x_{\alpha(\delta,y^\delta),w,p;y^\delta} - T^\dagger y\| : y^\delta \in Y,\ \|y - y^\delta\| \le \delta \,\} = 0$.

The parameter choice rule has to fulfill

$\lim_{\delta \to 0} \sup\{\, \alpha(\delta, y^\delta) : y^\delta \in Y,\ \|y - y^\delta\| \le \delta \,\} = 0$.

The following proposition states that, with an additional requirement on $\alpha(\delta, y^\delta)$, the functionals $\{\Phi_{\alpha,w,p}\}$ yield a regularization method for our linear system.

Proposition 3.3. Let $T : X \to Y$ be a bounded operator with $\|T\| < 1$, and assume that $p \in [1, 2]$ and that $w = \{w_\gamma\}_{\gamma \in \Gamma}$ is uniformly bounded from below by $c > 0$. For any $y \in D(T^\dagger)$ and any $\alpha > 0$, define $x_{\alpha,w,p;y}$ to be the minimizer of $\Phi_{\alpha,w,p;y}(x)$. If $\alpha = \alpha(\delta)$ satisfies

$\lim_{\delta \to 0} \alpha(\delta) = 0$    (3.8)

and

$\lim_{\delta \to 0} \frac{\delta^2}{\alpha(\delta)} = 0$,    (3.9)

then we have

$\lim_{\delta \to 0} \sup\{\, \|x_{\alpha(\delta),w,p;y^\delta} - T^\dagger y\| : \|y - y^\delta\| \le \delta \,\} = 0$.

(For instance, $\alpha(\delta) = \delta$ satisfies both (3.8) and (3.9).)

Proof. See [6].

Remark 3.1. As proven in [1], if $\alpha$ is chosen according to the discrepancy principle, it satisfies (3.8) and (3.9) and can thus be used here instead of an a-priori choice rule.

3.2 Wavelets

To apply the algorithm discussed in this chapter, the first issue that has to be clarified is the choice of the orthonormal basis $\{\varphi_\gamma\}$ of $X$. In Section 1.4 we used the basis of bilinear influence functions. This basis is not orthogonal and thus cannot be used for the algorithm based on surrogate functionals. As mentioned in [10], distorted incoming wavefronts tend to have a fractal structure. Hence, wavelets might be a good choice for a basis in which to expand the phase screens. Before proceeding any further, we first give a brief introduction to wavelets. For a detailed discussion of wavelets we refer to [5].

Definition 3.1. Let $\psi \in L^2$ be a function mapping from $\mathbb{R}$ to $\mathbb{R}$. If $\psi$ fulfills the admissibility condition

$C_\psi := 2\pi \int_{\mathbb{R}} |\xi|^{-1} |\hat{\psi}(\xi)|^2\, d\xi < \infty$,    (3.10)

where $\hat{\psi}$ denotes the Fourier transform of $\psi$, it is called a wavelet.

From a wavelet $\psi$ we can generate a family of wavelets according to

$\psi_{a,b}(x) := |a|^{-1/2}\, \psi\Big(\frac{x - b}{a}\Big), \qquad a, b \in \mathbb{R},\ a \neq 0$,    (3.11)

and refer to the function $\psi$ as the mother wavelet. The factor $|a|^{-1/2}$ ensures that the $L^2$-norm is independent of $a$ and $b$, i.e.

$\|\psi_{a,b}\| = \|\psi\|$.    (3.12)

The continuous wavelet transform of a function $f \in L^2(\mathbb{R})$ is defined via

$(T^{wav} f)(a, b) := |a|^{-1/2} \int_{\mathbb{R}} f(x)\, \psi\Big(\frac{x - b}{a}\Big)\, dx = \langle f, \psi_{a,b}\rangle$.

Note that, due to (3.12), if $\psi$ is compactly supported, the functions $\psi_{a,b}$ of high frequency (small scale $|a|$) have a small support, whereas the $\psi_{a,b}$ of low frequency (large scale $|a|$) have a large support. This basic property of wavelets is their major advantage in signal processing compared to the Fourier transform, since it allows a good localization in both time and frequency.

We now turn to the question of reconstructing a function from its wavelet transform. We assume that $\|\psi\| = 1$. Since $\psi$ fulfills (3.10), we can recover any $f \in L^2(\mathbb{R})$ from its wavelet transform according to

$f = C_\psi^{-1} \int_{\mathbb{R}} \int_{\mathbb{R}} a^{-2}\, (T^{wav} f)(a, b)\, \psi_{a,b}\, da\, db$.

If $\psi \in L^1(\mathbb{R})$, then condition (3.10) can only hold if

$\int_{\mathbb{R}} \psi(x)\, dx = 0$.

For implementation purposes, we need to discuss how to discretize wavelet transforms. For convenience, we only consider the case in which $f$ can be reconstructed by using only positive values of $a$. W.l.o.g. we fix $a_0 > 1$ and $b_0 > 0$ and restrict $a$ and $b$ to the discrete values

$a = a_0^{-j} \quad\text{and}\quad b = k b_0 a_0^{-j}, \qquad j, k \in \mathbb{Z}$,

which yields

$\psi_{j,k}(x) = \psi_{a_0^{-j},\, k b_0 a_0^{-j}}(x) = a_0^{j/2}\, \psi\Big(\frac{x - k b_0 a_0^{-j}}{a_0^{-j}}\Big) = a_0^{j/2}\, \psi(a_0^j x - k b_0)$.

Here, $b_0$ is chosen such that the functions $\psi(x - k b_0)$ cover the entire real axis. Then, for any fixed $j$, this property also holds for the functions $\psi_{j,k}$. As in the continuous case, the question arises whether or not we can reconstruct $f$ from the $\langle f, \psi_{j,k}\rangle$ in a stable way. Another question, dual to the first one, is whether any function $f$ can be written as a superposition of elementary building blocks $\psi_{j,k}$. If the $\psi_{j,k}$ constitute an orthonormal basis of $L^2(\mathbb{R})$, we can ensure that any $f \in L^2(\mathbb{R})$ is characterized by the coefficients $\langle f, \psi_{j,k}\rangle$. In addition, we can represent the $L^2$-norm of any $f \in L^2(\mathbb{R})$ according to

$\|f\|^2 = \sum_{j,k \in \mathbb{Z}} |\langle f, \psi_{j,k}\rangle|^2$.

Thus, if $f \in L^2(\mathbb{R})$, then $\{\langle f, \psi_{j,k}\rangle\}_{j,k \in \mathbb{Z}} \in \ell^2(\mathbb{Z}^2)$, which guarantees a stable reconstruction. Since $\{\psi_{j,k}\}$ is a basis, the reconstruction is simply given by

$f = \sum_{j,k \in \mathbb{Z}} \langle f, \psi_{j,k}\rangle\, \psi_{j,k}$.

The question that arises is how to construct orthonormal wavelet bases. We introduce multiresolution analysis in order to tackle this issue.

3.2.1 Multiresolution Analysis

A sequence of successive approximation spaces $V_j \subset L^2(\mathbb{R})$ is called a multiresolution analysis if it satisfies

$\cdots \subset V_{-2} \subset V_{-1} \subset V_0 \subset V_1 \subset V_2 \subset \cdots$,    (3.13)

$\overline{\bigcup_{j \in \mathbb{Z}} V_j} = L^2(\mathbb{R})$,    (3.14)

$\bigcap_{j \in \mathbb{Z}} V_j = \{0\}$.    (3.15)

In addition, we require that all the spaces are scaled versions of the space $V_0$, i.e.

$f \in V_j \iff f(2^{-j}\,\cdot) \in V_0 \qquad \text{for all } j \in \mathbb{Z}$,    (3.16)

and that $V_0$ is translation invariant, i.e.

$f \in V_0 \implies f(\cdot - k) \in V_0 \qquad \text{for all } k \in \mathbb{Z}$.    (3.17)

An immediate consequence of (3.16) and (3.17) is that

$f \in V_j \implies f(\cdot - 2^{-j} k) \in V_j \qquad \text{for all } k, j \in \mathbb{Z}$.

The last requirement that has to be met is the existence of $\phi \in V_0$ such that

$\{\phi_{0,k} : k \in \mathbb{Z}\}$ is an orthonormal basis of $V_0$,    (3.18)

where $\phi_{j,k}(x) := 2^{j/2} \phi(2^j x - k)$ for $k, j \in \mathbb{Z}$. Due to (3.16), it follows that $\{\phi_{j,k} : k \in \mathbb{Z}\}$ is an orthonormal basis of $V_j$. We call $\phi$ the scaling function of the multiresolution analysis. If we define $P_j$ to be the orthogonal projector onto $V_j$, then (3.14) ensures that $\lim_{j \to \infty} P_j f = f$ for all $f \in L^2(\mathbb{R})$.

As stated below, conditions (3.13)-(3.18) guarantee the existence of an orthonormal wavelet basis $\{\psi_{j,k} : j, k \in \mathbb{Z}\}$ of $L^2(\mathbb{R})$, where $\psi_{j,k}(x) := 2^{j/2} \psi(2^j x - k)$, such that

$P_{j+1} f = P_j f + \sum_{k \in \mathbb{Z}} \langle f, \psi_{j,k}\rangle\, \psi_{j,k}$.    (3.19)

The mother wavelet $\psi$ can be constructed in the following way. For all $j \in \mathbb{Z}$ we define $W_j$ to be the orthogonal complement of $V_j$ in $V_{j+1}$, i.e. $V_{j+1} = V_j \oplus W_j$. Obviously, we have that

$W_j \perp W_l \qquad \text{for } j \neq l$.

Thus, for $l < n$,

$V_n = V_l \oplus \bigoplus_{j=l}^{n-1} W_j$,

and together with (3.14) and (3.15) this yields

$L^2(\mathbb{R}) = \bigoplus_{j \in \mathbb{Z}} W_j$.    (3.20)

For the spaces $W_j$ we again have the scaling property

$f \in W_j \iff f(2^{-j}\,\cdot) \in W_0$,    (3.21)

which is due to (3.16). Since (3.19) is equivalent to $\{\psi_{j,k} : k \in \mathbb{Z}\}$ being an orthonormal basis of $W_j$, we get with (3.20) that $\{\psi_{j,k} : j, k \in \mathbb{Z}\}$ is an orthonormal basis of $L^2(\mathbb{R})$. Moreover, we can conclude from (3.21) that if $\{\psi_{0,k} : k \in \mathbb{Z}\}$ is an orthonormal basis of $W_0$, then $\{\psi_{j,k} : k \in \mathbb{Z}\}$ is an orthonormal basis of $W_j$. Thus, we need to find $\psi \in W_0$ such that $\{\psi(\cdot - k) : k \in \mathbb{Z}\}$ is an orthonormal basis of $W_0$.

In order to introduce how this can be done, we need some definitions. First of all, since $\phi \in V_0 \subset V_1$ and $\{\phi_{1,k} : k \in \mathbb{Z}\}$ is an orthonormal basis of $V_1$, we can define $h_k = \langle \phi, \phi_{1,k}\rangle$ and represent $\phi$ according to

$\phi = \sum_{k \in \mathbb{Z}} h_k \phi_{1,k}$.    (3.22)

We know that $\phi_{1,k}(x) = \sqrt{2}\, \phi(2x - k)$ and can thus rewrite (3.22) as

$\phi(x) = \sqrt{2} \sum_{k \in \mathbb{Z}} h_k\, \phi(2x - k)$,

or equivalently, on the Fourier side, as

$\hat{\phi}(\xi) = \frac{1}{\sqrt{2}} \sum_{k \in \mathbb{Z}} h_k e^{-ik\xi/2}\, \hat{\phi}\Big(\frac{\xi}{2}\Big)$,

with convergence of the sums in the $L^2$ sense. Therefore, by defining

$m_0(\xi) := \frac{1}{\sqrt{2}} \sum_{k \in \mathbb{Z}} h_k e^{-ik\xi}$,    (3.23)

we get

$\hat{\phi}(\xi) = m_0\Big(\frac{\xi}{2}\Big)\, \hat{\phi}\Big(\frac{\xi}{2}\Big)$.

One possible way to construct $\psi$ is given by the following

Proposition 3.4. Let $(V_j)_{j \in \mathbb{Z}}$ be a sequence of closed subspaces of $L^2(\mathbb{R})$ which satisfies (3.13)-(3.18). Then there exists an orthonormal basis of wavelets $\{\psi_{j,k} : j, k \in \mathbb{Z}\}$ of $L^2(\mathbb{R})$ such that

$P_{j+1} = P_j + \sum_{k \in \mathbb{Z}} \langle \cdot, \psi_{j,k}\rangle\, \psi_{j,k}$.

One possibility for constructing the mother wavelet $\psi$ is

$\hat{\psi}(\xi) = e^{-i\xi/2}\, \overline{m_0\Big(\frac{\xi}{2} + \pi\Big)}\, \hat{\phi}\Big(\frac{\xi}{2}\Big)$,

or equivalently,

$\psi = \sum_{k \in \mathbb{Z}} (-1)^{k-1}\, \overline{h_{-k-1}}\, \phi_{1,k} = \sqrt{2} \sum_{k \in \mathbb{Z}} (-1)^{k-1}\, \overline{h_{-k-1}}\, \phi(2\,\cdot - k)$,

where $m_0$ is defined via (3.23) and $\phi$ is chosen such that (3.18) is satisfied.

Proof. See [5].

Remark 3.2. The orthonormality of the $\phi(\cdot - k)$ leads to the following property of $m_0$:

$|m_0(\zeta)|^2 + |m_0(\zeta + \pi)|^2 = 1$ a.e.    (3.24)

3.2.2 Orthonormal Bases of Compactly Supported Wavelets

If we go back to the problem of phase reconstruction, we see that if the basis functions we choose have compact support, we get the nice property that the Poke matrix will be sparse. Thus, we are interested in wavelets that are compactly supported. In order to ensure that all $\psi_{j,k}$, $j, k \in \mathbb{Z}$, have compact support, we only need the mother wavelet $\psi$ to be compactly supported. This, in turn, is ensured if the scaling function $\phi$ has compact support, since then only finitely many $h_k$ are non-zero and thus $\psi$ reduces to a finite linear combination of compactly supported functions.

For compactly supported $\phi$, the $2\pi$-periodic function $m_0$ becomes a trigonometric polynomial. One can show that if $m_0$ is a trigonometric polynomial with $m_0(0) = 1$ fulfilling (3.24), then under some further assumptions we get the following result: if we define $\phi$ and $\psi$ according to

$\hat{\phi}(\xi) := \frac{1}{\sqrt{2\pi}} \prod_{j=1}^\infty m_0(2^{-j} \xi)$,

$\hat{\psi}(\xi) := e^{-i\xi/2}\, \overline{m_0\Big(\frac{\xi}{2} + \pi\Big)}\, \hat{\phi}\Big(\frac{\xi}{2}\Big)$,

then $\{\psi_{j,k} : j, k \in \mathbb{Z}\}$ is an orthonormal basis of $L^2(\mathbb{R})$ ([5]). Thus, we need to find $m_0$ that satisfies (3.24). In addition, we are interested in making $\phi$ and $\psi$ reasonably regular. One can show that imposing some regularity constraints implies that $m_0$ should be of the form

$m_0(\xi) = \Big(\frac{1 + e^{-i\xi}}{2}\Big)^N L(\xi)$,

where $N \ge 1$ and $L$ is a trigonometric polynomial.

Proposition 3.5. A trigonometric polynomial $m_0$ of the form

$m_0(\xi) = \Big(\frac{1 + e^{-i\xi}}{2}\Big)^N L(\xi)$

fulfills (3.24) if and only if $\mathcal{L}(\xi) := |L(\xi)|^2$ can be written as

$\mathcal{L}(\xi) = P(\sin^2 \tfrac{\xi}{2})$,

with

$P(y) = P_N(y) + y^N R\big(\tfrac{1}{2} - y\big)$,

where

$P_N(y) = \sum_{k=0}^{N-1} \binom{N-1+k}{k} y^k$

and $R$ is an odd polynomial, chosen such that $P(y) \ge 0$ for $y \in [0, 1]$.

Proof. See [5].

This proposition completely characterizes $|m_0|^2$. With spectral factorization one can extract the square root (for further details we refer to Chapter 6 of [5]). One important class of compactly supported orthonormal wavelet bases is the family of Daubechies wavelets, first introduced in [4], which corresponds to $R \equiv 0$. By varying $N$ we get the different Daubechies wavelets (abbreviated dbN), and the regularity increases with $N$. Except for $N = 1$, which corresponds to the Haar basis (introduced in Section 3.3.3), there is no closed-form representation of $\phi$ and $\psi$. Nonetheless, their graphs can be computed up to arbitrarily high precision by applying the cascade algorithm (see [5]). Figure 3.1 shows the graphs of $\phi$ and $\psi$ for $N = 2$ and $N = 4$.

Remark 3.3. The phase screen shown in Figure 2.1 can be compressed very well with wavelets. If we decompose the image w.r.t. the Haar wavelets and set all wavelet coefficients that have an absolute value smaller than 10 to zero, we get 91.81% zero coefficients and 99.95% retained energy. Thus, the

Figure 3.1: $\phi$ and $\psi$ for Daubechies wavelets ($N = 2$ and $N = 4$).

image can be very well approximated by images that are sparse in the Haar wavelet basis. For the 1D slice of the phase screen in Figure 3.2 we get 80.15% zero coefficients for a retained energy of 99.49% by setting the threshold to 9. We get similar results for other wavelet bases. The retained energy is defined as

$\frac{\|a_{comp}\|_2^2}{\|\phi\|_2^2}$,

where $a_{comp}$ is the coefficient vector of the compressed signal and $\phi$ is the original signal.

3.3 Implementation

For simplicity (see Section 3.3.3 for more details), we consider the issue of phase reconstruction in 1D, i.e. we aim at reconstructing a slice of the 2D phase screen.
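The compression figures of Remark 3.3 can be reproduced for the 1D slice with the MATLAB Wavelet Toolbox; a sketch (ours), assuming phi holds the slice of length $2^8$ and using the threshold 9 quoted above:

    [c, l] = wavedec(phi, 8, 'haar');        % Haar decomposition, 8 levels
    c_thr  = wthresh(c, 'h', 9);             % hard-threshold small coefficients
    zeros_frac = nnz(c_thr == 0)/numel(c)    % fraction of zero coefficients
    retained   = norm(c_thr)^2/norm(c)^2     % retained energy (orthonormal basis)
    phi_comp   = waverec(c_thr, l, 'haar');  % compressed signal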

3.3.1 Choosing the Weights $w_\gamma$

The first question we have to answer is how to choose the weights $w_\gamma$. In Proposition 3.3, one of the necessary conditions is that the sequence $w = (w_\gamma)_{\gamma \in \Gamma}$ has to be uniformly bounded from below away from zero. This is the only condition that has to be guaranteed for the weights. For the implementation we take two interesting choices of the $w_\gamma$. The first one is $w_\gamma = 1$ for all $\gamma \in \Gamma$. In this case we get

$\Phi_{w,p}(x) = \|Tx - y\|^2 + \|\mathbf{x}\|_p^p$,

where $\|\cdot\|_p$ denotes the $\ell^p$-norm and $\mathbf{x} = (\langle x, \varphi_\gamma\rangle)_{\gamma \in \Gamma}$. The second choice is based on the fact that wavelets do not only constitute orthonormal bases of $L^2(\mathbb{R})$ but also bases for a variety of other Banach spaces of functions, such as Hölder spaces, Sobolev spaces and, more generally, Besov spaces. Roughly speaking, the Besov space $B_{p,q}^s(\mathbb{R}^d)$ consists of functions that have $s$ derivatives in $L^p$; the parameter $q$ provides some additional fine-tuning in the definition of these spaces. The norm $\|x\|_{B_{p,q}^s}$ is related to the modulus of continuity $\omega$ of $x$, which is defined as a function $\omega : [0, \infty] \to [0, \infty]$ such that $|x(s) - x(t)| \le \omega(|s - t|)$ for all $s$ and $t$ in the domain of $x$. We refer to [11] for further details and only want to point out that the Besov norm is equivalent to a norm that can be computed from the wavelet coefficients. More precisely, we assume that the scaling function $\phi$ and the mother wavelet $\psi$ fulfill the smoothness property of being in $C^L(\mathbb{R})$, with $L > s$, and set $\sigma := s + d\big(\frac{1}{2} - \frac{1}{p}\big)$. Since we only consider $d = 1$, we get $\sigma = s + \frac{1}{2} - \frac{1}{p}$. We define the norm $\|\cdot\|_{s;p,q}$ according to

$\|x\|_{s;p,q} = \bigg( \sum_{j=0}^\infty \Big( 2^{j\sigma p} \sum_{\gamma \in \Gamma,\ |\gamma| = j} |\langle x, \psi_\gamma\rangle|^p \Big)^{q/p} \bigg)^{1/q}$,    (3.25)

where $|\gamma|$ denotes the scale of the wavelet $\psi_\gamma$. This norm is then equivalent to the Besov norm, i.e. there exist $A > 0$ and $B > 0$ such that

$A \|x\|_{s;p,q} \le \|x\|_{B_{p,q}^s} \le B \|x\|_{s;p,q}$.

The condition $\sigma \ge 0$ ensures that $B_{p,q}^s(\mathbb{R})$ is a subspace of $L^2(\mathbb{R})$. We will restrict ourselves to choosing $q$ equal to $p$, since then (3.25) reduces to

$\|x\|_{s;p,p} = \bigg( \sum_{j=0}^\infty 2^{j\sigma p} \sum_{\gamma \in \Gamma,\ |\gamma| = j} |\langle x, \psi_\gamma\rangle|^p \bigg)^{1/p}$.
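With $q = p$, the weighted penalty in (3.1) reproduces the $p$-th power of (3.25) if we take $w_\gamma = 2^{|\gamma|\sigma p}$. A small sketch (ours) of assembling these weights for dyadic wavelets on $[0, 1]$; the values of s and p are assumptions for illustration, and the single coarse-scale scaling coefficient also receives a weight bounded below:

    s = 1; p = 1;
    sigma = s + 1/2 - 1/p;           % d = 1
    jmax  = 7;                       % finest scale, matching the 2^8 discretization
    w = 1;                           % weight for the scaling coefficient
    for j = 0:jmax
        w = [w; 2^(j*sigma*p)*ones(2^j, 1)];   % 2^j wavelets at scale j
    end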

Both choices of weights should yield sparse solutions for $p \approx 1$ and will be compared in Chapter 4.

3.3.2 The Shrinkage Function $S_{w_\gamma,p}$

In order to speed up the computation, we implement a vector-valued version $S_{w,p} = (S_{w_\gamma,p})_{\gamma \in \Gamma}$ of the shrinkage function. For $p = 1$ we do this by using the MATLAB built-in function wthresh. As mentioned in Section 3.1, for $p > 1$ we cannot write down $S_{w_\gamma,p}$ explicitly, but instead have to find the solution $x_\gamma$ of the non-linear equation

$x_\gamma + k\, \mathrm{sign}(x_\gamma)\, |x_\gamma|^{p-1} = c$,    (3.26)

where $k = \frac{p \alpha w_\gamma}{2C}$ and $c = a_\gamma + \frac{1}{C}\big(T^*(y - Ta)\big)_\gamma$. At first glance, using Newton's method for solving (3.26) seems to be a good idea. Unfortunately, this method fails in some cases: for instance, for $p = 1.2$, $w_\gamma = 2^{3(3/2 - 1/p)p}$, $c = 0.2$ and a certain value of $C$, the iterates of the Newton method oscillate between two values $x_\gamma^{2k}$ and $x_\gamma^{2k+1}$ without converging. However, if the factor $k$ is small enough, the method works. We can achieve this by either starting with a small regularization parameter $\alpha$ or with a large value for $C$. The latter has the drawback of slower convergence and should thus be avoided if possible; on the other hand, in the case of data noise, $\alpha$ cannot be chosen arbitrarily small. To ensure that the implemented algorithm always works, we add the method of bisection, which is used in case the Newton method fails. This algorithm is much slower than the Newton method, but it converges for any continuous strictly monotonic function $f$ with $\lim_{x \to -\infty} f(x) = -\infty$ and $\lim_{x \to \infty} f(x) = \infty$, [14].

Algorithm 2 Bisection for solving $f(x) = x + k\,\mathrm{sign}(x)|x|^{p-1} = c$
  $a := \min(c, 0)$, $b := \max(c, 0)$, $m := a$
  while $|a - b| > \epsilon$ and $|f(m) - c| > \epsilon$ do
    $m := a + \frac{b - a}{2}$
    if $f(m) > c$ then
      $b := m$
    else
      $a := m$
    end if
  end while
  $x := m$
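A scalar MATLAB sketch (ours, not the thesis's vectorized implementation) of evaluating $S_{w,p}$ for $p > 1$, combining Newton's method with the bisection fallback; k and c are as in (3.26):

    function x = shrink_p(c, k, p, tol)
    % Solve x + k*sign(x)*|x|^(p-1) = c for 1 < p <= 2.
        f  = @(x) x + k*sign(x).*abs(x).^(p-1) - c;
        df = @(x) 1 + k*(p-1)*abs(x).^(p-2);
        x = c;                           % Newton iteration, started at c
        for it = 1:50
            x = x - f(x)/df(x);
            if abs(f(x)) < tol, return; end
        end
        a = min(c,0); b = max(c,0);      % fallback: the solution lies between 0 and c
        m = (a+b)/2;
        while abs(b-a) > tol             % bisection (f is strictly increasing)
            if f(m) > 0, b = m; else, a = m; end
            m = (a+b)/2;
        end
        x = m;
    end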

Proposition 3.6. Let a < b ∈ R be chosen such that the unique solution x* of (3.26) fulfills a ≤ x* ≤ b, and let the iterates of the method of bisection be denoted by x_k. Then the method converges with the rate

    |x_k − x*| ≤ 2^{−k} |b − a|.

Proof. By induction. For k = 0 we know that

    |x_0 − x*| ≤ |b − a|.

Now suppose that

    |x_k − x*| ≤ 2^{−k} |b − a|,

which means that in the k-th step the interval to which the solution is restricted has length 2^{−k}|b − a|. In the (k+1)-th step we reduce this interval by taking only half of it, i.e.

    |x_{k+1} − x*| ≤ (1/2) · 2^{−k}|b − a| = 2^{−(k+1)} |b − a|.

Again, we implement both the Newton and the bisection algorithm such that the computation is done for vectors.

3.3.3 Building the Poke Matrix

In order to test the algorithm, we need to determine the Poke matrix according to the chosen wavelet basis. In addition, we have to compute the slope measurements for the given wavefront. In 2D, the entries of the Poke matrix P = [P^x, P^y] are given by

    P^x_{i,l} = ∫_{Ω_i} ∂ψ_{j,k}(x, y)/∂x d(x, y) = ∫_{y_{i2}}^{y_{i2+1}} ( ψ_{j,k}(x_{i1+1}, y) − ψ_{j,k}(x_{i1}, y) ) dy,

    P^y_{i,l} = ∫_{Ω_i} ∂ψ_{j,k}(x, y)/∂y d(x, y) = ∫_{x_{i1}}^{x_{i1+1}} ( ψ_{j,k}(x, y_{i2+1}) − ψ_{j,k}(x, y_{i2}) ) dx,

where l corresponds to the linear indexing of (j, k); the second equality in each line follows from the fundamental theorem of calculus. In order to compute these integrals we can apply a quadrature rule.
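A worked consequence of Proposition 3.6 (added here for illustration, not part of the original argument): the rate immediately tells us how many bisection steps are needed for a prescribed tolerance ε.

\[
  2^{-k}\,|b-a| \le \varepsilon
  \quad\Longleftrightarrow\quad
  k \ge \log_2 \frac{|b-a|}{\varepsilon},
\]

so for |b − a| = 1 and ε = 10^{−6}, already k = ⌈log₂ 10⁶⌉ = 20 iterations suffice.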

However, we will need the evaluation of the wavelet functions ψ_{j,k} at some given points. This is not an easy task, since we cannot use the MATLAB wavelet toolbox to evaluate the wavelets and would thus need to build the wavelet family manually. Hence, for simplicity, we only want to reconstruct a 1D slice φ_atm : [0, 1] → R of the phase screen (shown in Fig. 3.2). We assume that φ_atm ∈ L² and that it has zero mean. Furthermore, the resolution in which φ_atm is actually given is 2^8, and we choose the scale j to range from 0 to 7.

Figure 3.2: 1D slice of original phase screen.

In 1D, computing the slopes reduces to

    s_i = ∫_{Ω_i} φ′_atm(x) dx = φ_atm(x_{i+1}) − φ_atm(x_i),

and the Poke matrix P is determined by

    P_{i,l} = ψ_{j,k}(x_{i+1}) − ψ_{j,k}(x_i),

where l corresponds to the linear indexing of (j, k). Here, we still need the evaluation of the wavelet functions, which can be implemented in MATLAB by decomposing signals (i.e. computing their coefficients) that have only one non-zero value, equal to 1; a sketch of this trick is given below.
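The following MATLAB sketch illustrates the impulse trick just described for the Haar case; the variable names and the use of wavedec's linear coefficient ordering (rather than an explicit (j, k) indexing) are assumptions made for this illustration. Decomposing the unit impulse e_i yields the coefficients ⟨e_i, ψ_l⟩, i.e. the values of all (discrete) basis functions at grid point i, from which the Poke matrix follows by first-order differences.

    n   = 2^8;                        % resolution of phi_atm
    lev = 8;                          % decomposition levels (scales 0..7)
    W   = zeros(n, n);                % W(i,l) = value of basis function l at grid point i
    for i = 1:n
        e = zeros(n, 1);  e(i) = 1;           % unit impulse at grid point i
        c = wavedec(e, lev, 'haar');          % coefficients <e_i, psi_l>
        W(i, :) = c(:)';
    end
    P = W(2:end, :) - W(1:end-1, :);  % P(i,l) = psi_l(x_{i+1}) - psi_l(x_i)

Note that wavedec orders the coefficients as [cA_lev, cD_lev, …, cD_1], so the column index l here is that linear ordering, not (j, k) directly.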

In order to test the introduced algorithm, we use it to solve

    Pa = s

for the wavelet coefficient vector a of the reconstructed phase screen φ_rec, which we finally compute according to

    φ_rec = Σ_k a_{0,k} φ_{0,k} + Σ_{j=0}^{7} Σ_k a_{j,k} ψ_{j,k}.

We start with the simplest possible wavelets: Haar wavelets. In this case, a_0 = 2, b_0 = 1 and the mother wavelet is defined as

    ψ(x) =  1   if 0 ≤ x < 1/2,
           −1   if 1/2 ≤ x < 1,
            0   otherwise.

In order to justify using these wavelets, we need to ensure that they constitute an orthonormal basis of L²(R), i.e. that

1. the ψ_{j,k} are orthonormal and
2. any function in L²(R) can be approximated, up to any desired precision, by a finite linear combination of the ψ_{j,k}.

Orthonormality is easy to show. Two Haar wavelets of the same scale j but different shifts do not overlap, so it holds that ⟨ψ_{j,k}, ψ_{j,k′}⟩ = δ_{k,k′}. If they are of different scales j < j′, then the support of ψ_{j′,k′} lies in a region where ψ_{j,k} is constant. Therefore, the scalar product ⟨ψ_{j,k}, ψ_{j′,k′}⟩ is proportional to the integral of ψ_{j′,k′} itself and is thus zero. We skip the proof of the second statement, which can be found in [5].

Remark 3.4. The Haar wavelet basis can also be constructed with the multiresolution analysis for

    φ(x) = 1   if 0 ≤ x < 1,
           0   otherwise.

Since the basis functions have compact support, the Poke matrix is sparse; Figure 3.3 shows the sparsity pattern of P. Figures 3.5 and 3.6 illustrate the behaviour of the algorithm for p = 1 and Haar wavelets for different noise levels. The reconstruction error approaches zero as the noise level tends to zero. Here, the implementation of the Haar basis is done independently of the toolbox (see the sketch below), which would be very complex for other Daubechies wavelets.
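A manual Haar implementation is indeed short. The following sketch (function name and grid convention are illustrative) evaluates ψ_{j,k}(x) = 2^{j/2} ψ(2^j x − k) pointwise, which is all that is needed to fill the Poke matrix entries P_{i,l} = ψ_{j,k}(x_{i+1}) − ψ_{j,k}(x_i).

    function y = haar_psi(x, j, k)
        % Evaluate the Haar wavelet psi_{j,k} at the points in x.
        t = 2^j * x - k;                    % rescale to mother-wavelet coordinates
        y = 2^(j/2) * ((t >= 0 & t < 0.5) - (t >= 0.5 & t < 1));
    end

For example, P(i, l) = haar_psi(x(i+1), j, k) − haar_psi(x(i), j, k), with l the linear index of (j, k).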

Figure 3.3: Sparsity pattern of P for the Haar basis.

Figure 3.4: Sparsity pattern of P for the Daubechies wavelets.

For the latter we use the MATLAB wavelet toolbox to evaluate the basis functions at different values in order to build the Poke matrix and to compute φ_rec from its wavelet coefficients.
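A hypothetical one-liner for this reconstruction step, assuming a is the solved coefficient vector in wavedec ordering, l the matching bookkeeping vector, and 'db2' a stand-in for whichever Daubechies wavelet is used:

    phi_rec = waverec(a, l, 'db2');   % inverse wavelet transform of the coefficients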
