Global convergence of damped semismooth Newton methods for ℓ_1 Tikhonov regularization


Esther Hans and Thorsten Raasch
Institut für Mathematik, Johannes Gutenberg-Universität Mainz, Mainz, Germany
hanse@uni-mainz.de, raasch@uni-mainz.de

Abstract. We are concerned with Tikhonov regularization of linear ill-posed problems with ℓ_1 coefficient penalties. In [Inverse Problems 24 (2008) 035007], Griesse and Lorenz proposed a semismooth Newton method for the efficient minimization of the corresponding Tikhonov functionals. While the convergence of semismooth Newton methods is locally superlinear in general, their application to ℓ_1 Tikhonov regularization is particularly attractive because here, one obtains the exact Tikhonov minimizer after a finite number of iterations, given a sufficiently good initial guess. In this work, we discuss the efficient globalization of (Bouligand-)semismooth Newton methods for ℓ_1 Tikhonov regularization by means of damping strategies and suitable descent with respect to an associated merit functional. Numerical examples are provided which show that our method compares well with existing iterative, globally convergent approaches.

1. Introduction

We are concerned with the efficient numerical solution of the minimization problem

  min_{u ∈ ℓ_2} (1/2)‖Ku − f‖_Y^2 + ∑_{k ∈ 𝒩} w_k |u_k|,   (1)

where K: ℓ_2 → Y is a linear, bounded and injective operator mapping the sequence space ℓ_2 = ℓ_2(𝒩), 𝒩 ∈ {ℕ, {1, …, n}}, into a separable Hilbert space Y, f ∈ Y is a given datum and w = (w_k)_{k ∈ 𝒩} is a weight sequence with w_k ≥ w_0 > 0 for all k ∈ 𝒩. By the injectivity of K and the coercivity of the penalty term ∑_{k ∈ 𝒩} w_k |u_k| with respect to ℓ_2, (1) has a unique minimizer û ∈ ℓ_2 which, moreover, has only a finite number of nonzero entries [16]. Such minimization problems typically arise from sparsity-constrained Tikhonov regularization of the discretization of linear ill-posed operator equations Au = f. Here A: X → Y is a bounded linear mapping from a separable Hilbert space X into Y, and f ∈ Y are given measurement data. For the stable numerical solution of such operator equations in the presence of measurement errors, regularization strategies are

essential [13]. A common approach is to regularize by minimizing a Tikhonov functional,

  min_{u ∈ X} (1/2)‖Au − f‖_Y^2 + R(u),

where the penalty term R: X → ℝ models certain a priori regularity assumptions on the unknown solution. Recently, it has become clear that sparse expansibility of the quantity of interest with respect to a given Riesz basis Ψ = {ψ_k}_{k ∈ 𝒩} of X may serve to stabilize the recovery problem [10]. Denoting by T: ℓ_2 → X the synthesis operator Tu = ∑_{k ∈ 𝒩} u_k ψ_k associated with Ψ, the sparsity of ũ ∈ X with respect to Ψ can be modeled by ℓ_1 coefficient penalties R(ũ) = ∑_{k ∈ 𝒩} w_k |u_k|, ũ = Tu, which readily induces the minimization problem (1) by setting K := AT.

For the numerical solution of (1), several classes of algorithms have been discussed in the literature, see also [22] for an extensive survey and numerical comparisons. Direct homotopy-type solvers like the LARS-LASSO algorithm [12, 24] follow the piecewise linear, continuous trajectory of minimizers û = û(α), w_k = αŵ_k for all k ∈ 𝒩, as α decreases to 0. They compute the exact Tikhonov minimizer within finitely many computational operations. However, their performance deteriorates as the number of active coefficients #supp û of the Tikhonov minimizer increases, which drives the need for efficient iterative solvers for large-scale problems.

Iterative algorithms for the computation of Tikhonov minimizers are typically inspired by (sub)gradient descent strategies for the convex Tikhonov functional. The iterative soft thresholding algorithm (ISTA) [10] and its interpretations as proximal forward-backward operator splitting [8] and as a generalized conditional gradient method [5] are very popular in practical applications and have been studied intensively in the literature. It is well known that the performance of gradient descent strategies hinges on a clever step size selection along the iteration, as iterative soft shrinkage with constant step size can perform arbitrarily slowly in the presence of small eigenvalues of K*K. Among the most efficient gradient descent strategies with variable step sizes, we mention the FISTA algorithm of Beck and Teboulle [2], see also [22]. By the convexity of the Tikhonov functional, the convergence of gradient descent-type algorithms is usually global. However, it is still linear [4], and the iteration does not yield the exact minimizer after finitely many iterations.

As an alternative approach to gradient descent, one may use that the minimization problem (1) is equivalent to the shrinkage equation

  F(u) := u − S_{γw}(u + γK*(f − Ku)) = 0,  γ > 0,   (2)

where γ can be chosen arbitrarily, and S_β(v) := (sgn(v_k)(|v_k| − β_k)_+)_{k ∈ 𝒩} denotes componentwise soft thresholding of v with respect to a positive weight sequence β = (β_k)_{k ∈ 𝒩}.
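For orientation, the iterative soft thresholding step just mentioned can be written down in a few lines. The following is a minimal NumPy sketch, not taken from the paper; the function names and the fixed step size are illustrative assumptions.

```python
import numpy as np

def soft_threshold(v, beta):
    # Componentwise soft thresholding S_beta(v)_k = sgn(v_k) * max(|v_k| - beta_k, 0).
    return np.sign(v) * np.maximum(np.abs(v) - beta, 0.0)

def ista(K, f, w, gamma, u0, n_iter=500):
    # Iterative soft thresholding u <- S_{gamma*w}(u + gamma*K^T (f - K u)).
    # For convergence, the step size gamma should not exceed 1/||K||^2.
    u = u0.copy()
    for _ in range(n_iter):
        u = soft_threshold(u + gamma * (K.T @ (f - K @ u)), gamma * w)
    return u
```

A fixed point of this map is exactly a solution of the shrinkage equation (2), which is the starting point of the Newton-type methods discussed in the following.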

It was shown in [16] that the continuous, piecewise linear nonlinearity F = (F_k)_{k ∈ 𝒩} in (2) is Newton differentiable at each u ∈ ℓ_2, i.e. there exists a family of mappings G: ℓ_2 → L(ℓ_2, ℓ_2) such that

  lim_{‖h‖_{ℓ_2} → 0} ‖F(u + h) − F(u) − G(u + h)h‖_{ℓ_2} / ‖h‖_{ℓ_2} = 0,   (3)

cf. [7]. The function G is called a generalized derivative of F. Therefore, (2) can be treated by semismooth Newton methods [7, 16]

  u^(j+1) = u^(j) + d^(j),  G(u^(j)) d^(j) = −F(u^(j)),  j = 0, 1, …   (4)

The convergence of semismooth Newton methods is locally superlinear in general [7]. However, in the case of the particular piecewise linear left-hand side F from (2), one may even obtain the exact solution after a finite number of iterations, as long as the initial guess is sufficiently close to it, see also Section 2.

The global convergence properties of the semismooth Newton method, applied to Tikhonov functionals with sparsity constraints, have not yet been considered in the literature. In this work, we discuss a damping strategy which, at least in a finite-dimensional setting, yields a globally convergent variant of the semismooth Newton method from [16]. Our key idea to prove the global convergence is to work with Bouligand derivatives instead of Newton derivatives in (4), as proposed in [25]. Recall from [14, 30, 31] that a locally Lipschitz-continuous mapping F: ℓ_2 → ℓ_2 is (Bouligand-) differentiable at u ∈ ℓ_2 if all directional derivatives F′(u, d) := lim_{h ↓ 0} (F(u + hd) − F(u))/h exist and fulfill the approximation property ‖F(u + d) − F(u) − F′(u, d)‖ = o(‖d‖), ‖d‖ → 0. For a B-differentiable nonlinearity F, the associated B-semismooth Newton method reads as

  u^(j+1) = u^(j) + d^(j),  F′(u^(j), d^(j)) = −F(u^(j)),  j = 0, 1, …   (5)

In view of (3), the generalized derivative G fulfills ‖G(u + d)d − F′(u, d)‖ = o(‖d‖) as ‖d‖ → 0, which is the reason why the B-semismooth Newton method (5) usually has the very same convergence properties as the semismooth Newton method (4), see also [25]. The main issue in the concrete realization of (5) is to prove that the nonlinear system

  F′(u, d) = −F(u)   (6)

has a unique solution d ∈ ℓ_2 for each u ∈ ℓ_2. We will accomplish this by reducing (6) to a finite-dimensional linear complementarity problem [9]. What is more, we will show that there exists a particular generalized derivative G of F such that even G(u)d = F′(u, d) for all u, d ∈ ℓ_2.

As to the envisaged globalization strategy, we will focus our analysis on damping schemes. Here, the (semismooth) Newton direction d^(j) is multiplied with a suitable damping parameter, and global convergence is enforced by monitoring the descent properties of the iterates u^(j) with respect to a suitable merit functional Θ: ℓ_2 → ℝ. In our paper, we will choose

  Θ_p(u) := (1/p)‖F(u)‖_{ℓ_p}^p = (1/p) ∑_{k ∈ 𝒩} |F_k(u)|^p,  u ∈ ℓ_2,   (7)

where we shall restrict the discussion to the special cases p ∈ {1, 2}.
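As a companion to (2) and (7), a direct NumPy transcription of the residual F and the merit functional Θ_p might look as follows; this is only an illustrative sketch with assumed function names, not code from the paper.

```python
import numpy as np

def soft_threshold(v, beta):
    # S_beta(v)_k = sgn(v_k) * max(|v_k| - beta_k, 0), cf. (2)
    return np.sign(v) * np.maximum(np.abs(v) - beta, 0.0)

def shrinkage_residual(u, K, f, w, gamma):
    # F(u) = u - S_{gamma*w}(u + gamma*K^T (f - K u)); F(u) = 0 iff u solves (1)
    return u - soft_threshold(u + gamma * (K.T @ (f - K @ u)), gamma * w)

def merit(u, K, f, w, gamma, p=2):
    # Theta_p(u) = (1/p) * sum_k |F_k(u)|^p for p in {1, 2}, cf. (7)
    F = shrinkage_residual(u, K, f, w, gamma)
    return np.sum(np.abs(F) ** p) / p
```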

On the one hand, p = 2 is a widely accepted choice in the globalization of Newton methods for smooth equations [11]. On the other hand, the case p = 1 is tempting in view of the piecewise linear nonlinearity F at hand. Our work is inspired by similar investigations for the numerical solution of optimal control problems [21] and of nonlinear complementarity problems [17] by semismooth Newton methods, and by the recent work [23] where the semismooth Newton iteration of [16] is globalized by interleaving it with iterative soft shrinkage steps.

The paper is organized as follows. In Section 2 we review the semismooth Newton method from [16] and give counterexamples to demonstrate that the method is not globally convergent in general. A local semismooth Newton method based on B-derivatives is treated in Section 3. In Section 4, we propose a damped version of the semismooth Newton method and prove its global convergence properties. Finally, Section 5 provides numerical experiments, ranging from inverse integration and deblurring of images to an inverse heat equation.

2. The semismooth Newton method and its basic properties

In this section, we are going to review the semismooth Newton method, applied to the numerical solution of the shrinkage equation (2).

2.1. The semismooth Newton method

One of the key observations of [16] was that the nonlinearity F of the shrinkage equation is Newton differentiable on the full sequence space ℓ_2, cf. (3). We cite the following theorem from [16].

Theorem 2.1. The mapping F: ℓ_2 → ℓ_2 from (2) is Newton differentiable in each u ∈ ℓ_2 with (nonunique) generalized derivative

  G(u) = ( γ(K*K)_{A,A}  γ(K*K)_{A,I} ; 0  I_I ) ∈ L(ℓ_2)   (8)

which is a block matrix with respect to the active set

  A = A(u) := { k ∈ 𝒩 : |(u + γK*(f − Ku))_k| > γw_k }   (9)

and the inactive set I = I(u) := 𝒩 \ A(u).

Note that by w_k ≥ w_0 > 0 and K ∈ L(ℓ_2, Y), the active sets A(u) are finite for each u ∈ ℓ_2, even in the case 𝒩 = ℕ. Therefore, G(u) is boundedly invertible for each u ∈ ℓ_2.
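The block structure (8) translates into a simple recipe for one undamped Newton step: set the update to −u_I on the inactive set and solve a small linear system on the finite active set. The following is a hedged NumPy sketch under the assumption of dense, finite-dimensional data; the function names are illustrative.

```python
import numpy as np

def active_set(u, K, f, w, gamma):
    # A(u) = { k : |(u + gamma*K^T(f - K u))_k| > gamma*w_k }, cf. (9)
    return np.abs(u + gamma * (K.T @ (f - K @ u))) > gamma * w

def semismooth_newton_step(u, K, f, w, gamma):
    # One undamped step of (4): solve G(u) d = -F(u) with G(u) from (8).
    v = u + gamma * (K.T @ (f - K @ u))
    F = u - np.sign(v) * np.maximum(np.abs(v) - gamma * w, 0.0)   # residual F(u)
    A = active_set(u, K, f, w, gamma)
    I = ~A
    M = K.T @ K
    d = np.zeros_like(u)
    d[I] = -u[I]                                    # inactive rows: d_I = -F(u)_I = -u_I
    rhs = -F[A] - gamma * (M[np.ix_(A, I)] @ d[I])  # active rows of G(u) d = -F(u)
    d[A] = np.linalg.solve(gamma * M[np.ix_(A, A)], rhs)
    return u + d
```

If the current iterate is already close enough to the minimizer, a single such step returns û, which is the finite termination property mentioned in the introduction.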

What is more, G(u)^{-1} is uniformly bounded in the operator norm in a sufficiently small neighborhood of the Tikhonov minimizer û, see the following lemma from [16].

Lemma 2.2. There exist ρ, C > 0 such that ‖G(u)^{-1}‖_{L(ℓ_2)} ≤ C for all u ∈ ℓ_2 with ‖u − û‖_{ℓ_2} ≤ ρ.

The iteration (4) is well-defined by the bounded invertibility of G from (8) on ℓ_2. Moreover, by using the local uniform boundedness of G(u)^{-1} and by applying the generic results from [7], it was proved in [16] that (4) is locally superlinearly convergent. Moreover, as pointed out in loc. cit., the proof of Theorem 2.1 reveals that the numerator of (3) for F from (2), i.e., ‖F(u + h) − F(u) − G(u + h)h‖, already vanishes for sufficiently small h ∈ ℓ_2, u ∈ ℓ_2 being fixed. Therefore, if the current iterate u^(j) is sufficiently close to the exact minimizer û, it holds that

  F(u^(j)) − F(û) − G(u^(j))(u^(j) − û) = F(u^(j)) − G(u^(j))(u^(j) − û) = 0,

so that the next step of the semismooth Newton iteration (4) jumps into û,

  u^(j+1) = u^(j) − G(u^(j))^{-1} F(u^(j)) = û.

2.2. Cycling effect

One of the well-known pitfalls of the classical Newton method is the cycling effect. Here, after a certain number of iterations, the iterates start to oscillate between two or more distinct points. We will show now that the semismooth Newton iteration (4) may run into the same situation, at least if the parameter γ is not chosen large enough.

2.2.1. One-dimensional case. Assume that 𝒩 = {1}, Y = ℝ, f ∈ ℝ, K ∈ ℝ \ {0} and w > 0. The unique minimizer of the Tikhonov functional

  J(u) = (1/2)(Ku − f)^2 + w|u|,  u ∈ ℝ,   (10)

is given by

  û = S_{w/K^2}(f/K) = { f/K − w/K^2,  if f/K > w/K^2;  0,  if |f/K| ≤ w/K^2;  f/K + w/K^2,  if f/K < −w/K^2 }.   (11)

Let us analyze the convergence behavior of the semismooth Newton iteration (4), (8) in detail. Using the notation r(u) := K(f − Ku), the nonlinearity F from the shrinkage equation (2) has the explicit form

  F(u) = { γ(K^2 u − Kf + w),  if u + γr(u) > γw;  u,  if |u + γr(u)| ≤ γw;  γ(K^2 u − Kf − w),  if u + γr(u) < −γw }.   (12)

The shape of the graph of F depends on the size of the parameter γ > 0. In the outer region |u + γr(u)| = |(1 − γK^2)u + γKf| > γw, the slope of F is given by γK^2. For small parameters 0 < γ < 1/K^2, the graph of F is hence changing from convex to concave

as u is growing, and vice versa for large parameters γ > 1/K^2. We refer to Figure 1 for the six possible cases concerning the location of the zero û of F.

[Figure 1. Qualitative behavior of F and the location of û in the one-dimensional case. Panels: (a) γ < 1/K^2, û = f/K + w/K^2 < 0; (b) γ < 1/K^2, û = 0; (c) γ < 1/K^2, û = f/K − w/K^2 > 0; (d) γ > 1/K^2, û = f/K + w/K^2 < 0; (e) γ > 1/K^2, û = 0; (f) γ > 1/K^2, û = f/K − w/K^2 > 0.]

The generalized derivative (8) of F at u ∈ ℝ is

  G(u) = { γK^2,  if |u + γr(u)| > γw;  1,  if |u + γr(u)| ≤ γw },   (13)

so that one semismooth Newton step maps the current iterate u^(j) ∈ ℝ to

  u^(j+1) = u^(j) − F(u^(j))/G(u^(j)) = { f/K − w/K^2,  if u^(j) + γr(u^(j)) > γw;  0,  if |u^(j) + γr(u^(j))| ≤ γw;  f/K + w/K^2,  if u^(j) + γr(u^(j)) < −γw }.   (14)

By taking all possible configurations into account, the complete behavior of the semismooth Newton iteration (4), (8) in the one-dimensional case can be summarized as in Table 1, compare also Figure 1. If û ≠ 0, the algorithm obviously terminates after at most three iterations with the exact minimizer û, for any γ > 0. In case that û = 0, however, the behavior of the semismooth Newton iteration strongly depends on the size of γ.

On the one hand, if û = 0 and γ > 1/K^2, the iteration reaches 0 = û within two steps regardless of the choice of u^(j), because one of the cases in the second row of Table 1 never occurs. As an example, if u^(j) + γr(u^(j)) > γw and hence u^(j+1) = f/K − w/K^2, it holds

that u^(j+1) + γr(u^(j+1)) = f/K − w/K^2 + γw. From û = 0, i.e. f/K ≤ w/K^2, it follows that u^(j+1) + γr(u^(j+1)) ≤ γw. Finally, γ > 1/K^2 yields u^(j+1) + γr(u^(j+1)) ≥ −2w/K^2 + γw > −γw and hence u^(j+2) = 0 = û. The same argument works in the case u^(j) + γr(u^(j)) < −γw. Summarizing, we can conclude that the semismooth Newton iteration (4), (8) converges in at most three steps if γ is chosen larger than 1/K^2.

On the other hand, if û = 0, i.e., |f/K| ≤ w/K^2, and 0 < γ < 1/K^2 is sufficiently small, the semismooth Newton iteration might start to oscillate between f/K ± w/K^2. We obtain a cycle, e.g., if |Kf| < w, 0 < γ < (w − |Kf|)/(2wK^2) and u^(j) = f/K + w/K^2 ≠ 0 = û. Here u^(j) + γr(u^(j)) = f/K + w/K^2 − γw ≥ w/K^2 − |f/K| − γw > γw, and hence u^(j+1) = f/K − w/K^2. In the next step, u^(j+1) + γr(u^(j+1)) = f/K − w/K^2 + γw ≤ |f/K| − w/K^2 + γw < −γw, and hence u^(j+2) = f/K + w/K^2 = u^(j).

Table 1. Behavior of the semismooth Newton iteration in the one-dimensional case, depending on the location of û (rows) and of the current iterate u^(j) (columns).

  û = f/K + w/K^2 < 0:
    u^(j) + γr(u^(j)) > γw:    u^(j+1) = f/K − w/K^2; u^(j+2) = û if f/K − w/K^2 < −2γw, otherwise u^(j+2) = 0 and u^(j+3) = û
    |u^(j) + γr(u^(j))| ≤ γw:  u^(j+1) = 0; u^(j+2) = û
    u^(j) + γr(u^(j)) < −γw:   u^(j+1) = û
  û = 0:
    u^(j) + γr(u^(j)) > γw:    u^(j+1) = f/K − w/K^2; u^(j+2) = û if f/K − w/K^2 ≥ −2γw, otherwise u^(j+2) = f/K + w/K^2
    |u^(j) + γr(u^(j))| ≤ γw:  u^(j+1) = 0 = û
    u^(j) + γr(u^(j)) < −γw:   u^(j+1) = f/K + w/K^2; u^(j+2) = û if f/K + w/K^2 ≤ 2γw, otherwise u^(j+2) = f/K − w/K^2
  û = f/K − w/K^2 > 0:
    u^(j) + γr(u^(j)) > γw:    u^(j+1) = û
    |u^(j) + γr(u^(j))| ≤ γw:  u^(j+1) = 0; u^(j+2) = û
    u^(j) + γr(u^(j)) < −γw:   u^(j+1) = f/K + w/K^2; u^(j+2) = û if f/K + w/K^2 > 2γw, otherwise u^(j+2) = 0 and u^(j+3) = û

2.2.2. Finite-dimensional case. The cycling effect can be observed in higher dimensions as well. In the two-dimensional case, one can choose a diagonal matrix K, data f, weights w, a sufficiently small parameter γ and a starting vector u^(0) such that the iterates satisfy u^(2) = u^(0), i.e. a cycle arises; choosing a larger value of γ instead, the zero of F is found within two steps. In the four-dimensional case, an analogous choice of a nondiagonal matrix K, data f, weights w, parameter γ and starting vector u^(0) again leads to iterates with u^(2) = u^(0). These examples show that the unmodified semismooth Newton method (4), (8) is not globally convergent in general and may even lead to cycles, motivating the analysis of suitable globalization strategies.
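The cycling behavior is easy to check numerically. The following small driver is a sketch that is not part of the original paper: it repeatedly applies an undamped step map (for instance the semismooth_newton_step function sketched earlier) and reports whether an iterate reappears; the data K, f, w, γ and the starting vector are whatever the reader wants to test.

```python
import numpy as np

def detect_cycle(step, u0, max_iter=50, tol=1e-12):
    # Repeatedly apply u <- step(u) and report the first revisited iterate, if any.
    history = [np.asarray(u0, dtype=float)]
    u = history[0]
    for j in range(1, max_iter + 1):
        u = step(u)
        for i, v in enumerate(history):
            if np.linalg.norm(u - v) <= tol:
                return ("cycle", i, j)    # iterate j equals the earlier iterate i
        history.append(u)
    return ("no cycle detected", None, max_iter)

# Example use (hypothetical data K, f, w, gamma, u0):
# result = detect_cycle(lambda u: semismooth_newton_step(u, K, f, w, gamma), u0)
```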

3. The local B-semismooth Newton method

Instead of (4), we are going to study the B-semismooth Newton method (5), which is based on the (Bouligand) derivative F′. The B-Newton directions d^(j) will be chosen by solving the generalized Newton equation (6), cf. Subsection 3.2.

3.1. Directional differentiability of the nonlinearity F

Let us first analyze the directional differentiability of the nonlinearity F = (F_k)_{k ∈ 𝒩}, i.e., the existence and concrete shape of F′(u, d) = lim_{h ↓ 0} (F(u + hd) − F(u))/h, where u, d ∈ ℓ_2. To this end, we shall use the following straightforward componentwise representation of F:

  F_k(u) = { γ(K*(Ku − f))_k + γw_k,  k ∈ A_+(u);  γ(K*(Ku − f))_k − γw_k,  k ∈ A_−(u);  u_k,  k ∈ I(u), }   (15)

where we have split the active set into A(u) = A_+(u) ∪ A_−(u) with

  A_+(u) := { k ∈ 𝒩 : (u + γK*(f − Ku))_k > γw_k },   (16)
  A_−(u) := { k ∈ 𝒩 : (u + γK*(f − Ku))_k < −γw_k }.   (17)

Let us also split the inactive set into I(u) = I_+(u) ∪ I_−(u) ∪ I_0(u), where

  I_+(u) := { k ∈ 𝒩 : (u + γK*(f − Ku))_k = γw_k },   (18)
  I_−(u) := { k ∈ 𝒩 : (u + γK*(f − Ku))_k = −γw_k },   (19)
  I_0(u) := { k ∈ 𝒩 : |(u + γK*(f − Ku))_k| < γw_k }.   (20)

In the following we drop the argument u of the active and inactive sets if there is no risk of confusion. With this notation at hand, we shall now prove the directional differentiability of F. While the directional differentiability of a single component F_k and the corresponding formula for the directional derivative F_k′(u, d) = lim_{h ↓ 0} (F_k(u + hd) − F_k(u))/h are almost obvious, the directional differentiability of the full operator F is nontrivial, at least in the case 𝒩 = ℕ. For this reason, we will prove the stronger property that the residual F(u + hd) − F(u) − hF′(u, d) vanishes for sufficiently small h.
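In code, this splitting of indices and the resulting directional derivative (made precise in Lemma 3.1 below) can be sketched as follows. This is again an illustrative NumPy fragment with assumed names; the exact-equality tests for the degenerate sets I_± would normally be replaced by a small tolerance in floating-point arithmetic.

```python
import numpy as np

def index_sets(u, K, f, w, gamma, tol=0.0):
    # Splitting of {0,...,n-1} into A+, A-, I+, I-, I0, cf. (16)-(20).
    v = u + gamma * (K.T @ (f - K @ u))
    A_plus  = v >  gamma * w + tol
    A_minus = v < -(gamma * w + tol)
    I_plus  = np.abs(v - gamma * w) <= tol
    I_minus = np.abs(v + gamma * w) <= tol
    I_zero  = ~(A_plus | A_minus | I_plus | I_minus)
    return A_plus, A_minus, I_plus, I_minus, I_zero

def directional_derivative(u, d, K, f, w, gamma):
    # F'(u, d) componentwise, following the formula (21) of Lemma 3.1 below.
    A_plus, A_minus, I_plus, I_minus, I_zero = index_sets(u, K, f, w, gamma)
    Md = K.T @ (K @ d)
    out = np.empty_like(u)
    out[A_plus | A_minus] = gamma * Md[A_plus | A_minus]
    out[I_zero]  = d[I_zero]
    out[I_plus]  = np.minimum(d[I_plus],  gamma * Md[I_plus])
    out[I_minus] = np.maximum(d[I_minus], gamma * Md[I_minus])
    return out
```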

Lemma 3.1. Let u, d ∈ ℓ_2 be arbitrary. There exists h_0 = h_0(u, d) > 0, so that

  (F_k(u + hd) − F_k(u))/h = { γ(K*Kd)_k,  k ∈ A(u);  d_k,  k ∈ I_0(u);  min{d_k, γ(K*Kd)_k},  k ∈ I_+(u);  max{d_k, γ(K*Kd)_k},  k ∈ I_−(u), }  0 < h ≤ h_0.   (21)

In particular, F = (F_k)_{k ∈ 𝒩} and its components F_k are directionally differentiable at u in the direction d, with directional derivative F′(u, d) = (F_k′(u, d))_{k ∈ 𝒩} ∈ ℓ_2 given componentwise by the right-hand side of (21).

Proof. Let u, d ∈ ℓ_2, k ∈ 𝒩 and h > 0 be arbitrary. If k ∈ I_+(u), (15) and (18) tell us that F_k(u) = u_k = γ(K*(Ku − f))_k + γw_k. If, additionally, ((I − γK*K)d)_k > 0, then k ∈ A_+(u + hd), so that an application of (15) yields F_k(u + hd) − F_k(u) = hγ(K*Kd)_k, showing (21). The same argument works for k ∈ I_−(u) and ((I − γK*K)d)_k < 0. If k ∈ I_+(u) and ((I − γK*K)d)_k ≤ 0, we choose h_0 > 0 so small that h_0 ‖(I − γK*K)d‖_{ℓ_2} ≤ γw_0. It follows that for all 0 < h ≤ h_0,

  (u + hd + γK*(f − K(u + hd)))_k = γw_k + h((I − γK*K)d)_k ∈ (−γw_k, γw_k],

and hence k ∈ I(u + hd). We conclude that F_k(u + hd) − F_k(u) = hd_k for all 0 < h ≤ h_0, showing (21). The same argument works for k ∈ I_−(u) and ((I − γK*K)d)_k ≥ 0.

Moreover, due to the fact that A(u) is a finite set, we can find h_0 > 0 such that A(u) ⊆ A(u + hd) for all 0 < h ≤ h_0, e.g., by requiring that

  h_0 ≤ min_{l ∈ A(u)} inf{ h > 0 : |(u + hd + γK*(f − K(u + hd)))_l| ≤ γw_l }.

For 0 < h ≤ h_0 and k ∈ A(u) ∩ A(u + hd), (15) yields F_k(u + hd) − F_k(u) = hγ(K*Kd)_k and hence (21).

Finally, in order to show (21) for k ∈ I_0(u), we decompose I_0(u) = I_0^1(u) ∪ I_0^2(u), setting

  I_0^1(u) := { k ∈ I_0(u) : |(u + γK*(f − Ku))_k| > γw_0/2 },
  I_0^2(u) := { k ∈ I_0(u) : |(u + γK*(f − Ku))_k| ≤ γw_0/2 }.

Note that I_0^1(u) is a finite set, due to u + γK*(f − Ku) ∈ ℓ_2. We may therefore choose h_0 > 0 even so small that I_0^1(u) ⊆ I(u + hd) for all 0 < h ≤ h_0, e.g., by additionally requiring that

  h_0 ≤ min_{l ∈ I_0^1(u)} inf{ h > 0 : |(u + hd + γK*(f − K(u + hd)))_l| ≥ γw_l }.

Additionally, we choose h_0 > 0 so small that h_0 ‖(I − γK*K)d‖_{ℓ_2} ≤ γw_0/2. We conclude that for all 0 < h ≤ h_0 and k ∈ I_0^2(u),

  |(u + hd + γK*(f − K(u + hd)))_k| ≤ |(u + γK*(f − Ku))_k| + h‖(I − γK*K)d‖_{ℓ_2} ≤ γw_0 ≤ γw_k,

so that also I_0^2(u) ⊆ I(u + hd) for all 0 < h ≤ h_0. Then (15) implies that F_k(u + hd) − F_k(u) = hd_k for all k ∈ I_0(u) and 0 < h ≤ h_0. Note that h_0 = h_0(u, d) is independent of k.

As already mentioned, our globalization strategy in Section 4 will choose damping parameters t_j in such a way that sufficient descent of the merit functional Θ_p from (7) is guaranteed. To this end, we compute the directional derivatives of Θ_p.

Lemma 3.2. Let u, d ∈ ℓ_2 be arbitrary. Then for p ∈ {1, 2}, Θ_p is directionally differentiable at u in the direction d with directional derivatives

  Θ_p′(u, d) = { ∑_{F_k(u)>0} F_k′(u, d) + ∑_{F_k(u)=0} |F_k′(u, d)| − ∑_{F_k(u)<0} F_k′(u, d),  p = 1;  ∑_{k ∈ 𝒩} F_k′(u, d) F_k(u),  p = 2. }   (22)

Proof. (22) immediately follows from the chain rule for directionally differentiable mappings, see [29].

3.2. The feasibility of the local B-semismooth Newton iteration

In the sequel, for a given u ∈ ℓ_2, we will discuss the existence and uniqueness of a solution d ∈ ℓ_2 to the nonlinear problem (6), F′(u, d) = −F(u). Note that in view of Lemma 3.2, a solution d to (6) has the particularly interesting properties that

  Θ_1′(u, d) = −∑_{k ∈ 𝒩} |F_k(u)| = −Θ_1(u),  Θ_2′(u, d) = ⟨F′(u, d), F(u)⟩ = −2Θ_2(u).   (23)

It will turn out that a solution d to (6) coincides with the semismooth Newton direction from [16] almost everywhere, see Lemma 3.4. The following auxiliary results ensure the solvability of (6).

Lemma 3.3. Let u ∈ ℓ_2, γ > 0, M := K*K and

  N := γ ( M_{I+,I+} − M_{I+,A} M_{A,A}^{-1} M_{A,I+}    M_{I+,A} M_{A,A}^{-1} M_{A,I−} − M_{I+,I−}
           M_{I−,A} M_{A,A}^{-1} M_{A,I+} − M_{I−,I+}    M_{I−,I−} − M_{I−,A} M_{A,A}^{-1} M_{A,I−} ).   (24)

Then the finite-dimensional matrix N is symmetric and positive definite.

Proof. Without loss of generality, let γ = 1. The symmetry of N readily follows from that of M. Concerning the positive definiteness of N, note that each real 3 × 3 block matrix with symmetric off-diagonal blocks and symmetric invertible diagonal block F can be decomposed as

  ( A  B  C ;  B*  D  E ;  C*  E*  F ) = X ( A − CF^{-1}C*  B − CF^{-1}E*  0 ;  B* − EF^{-1}C*  D − EF^{-1}E*  0 ;  0  0  F ) X*,   (25)

with

  X = ( I  0  CF^{-1} ;  0  I  EF^{-1} ;  0  0  I ).

Setting A := M_{I+,I+}, B := M_{I+,I−}, C := M_{I+,A}, D := M_{I−,I−}, E := M_{I−,A} and F := M_{A,A}, (25) tells us that N coincides, up to conjugation with diag(I, −I), with the upper left 2 × 2 diagonal block of X^{-1} M_{I+∪I−∪A, I+∪I−∪A} X^{-*} and is hence positive definite.

The solvability of (6) hinges on the solvability of a particular linear complementarity problem.

Lemma 3.4. Let u ∈ ℓ_2. Then with M := K*K, d ∈ ℓ_2 solves (6) if and only if

  γ(Md)_A = −F(u)_A,   (26)
  d_{I_0} = −u_{I_0},   (27)

and

  x := ( d_{I+} + u_{I+} ; −d_{I−} − u_{I−} ),  y := ( u_{I+} + γ(Md)_{I+} ; −u_{I−} − γ(Md)_{I−} )   (28)

solve the linear complementarity problem

  x ≥ 0,  y ≥ 0,  y = Nx + z,  ⟨x, y⟩ = 0,   (29)

where the data N = N(u) and z = z(u) are given by (24) and

  z = ( γ(M_{I+,A} M_{A,A}^{-1} M_{A,I_0} − M_{I+,I_0}) u_{I_0} − M_{I+,A} M_{A,A}^{-1} F(u)_A + u_{I+} ; γ(M_{I−,I_0} − M_{I−,A} M_{A,A}^{-1} M_{A,I_0}) u_{I_0} + M_{I−,A} M_{A,A}^{-1} F(u)_A − u_{I−} ) − N ( u_{I+} ; −u_{I−} ).   (30)

Proof. By componentwise application of (15) and (21) to the indices from A, I_0, I_+ and I_−, we observe first that (6) is equivalent to (26), (27) and the conditions

  ( d_k + u_k > 0 ∧ γ(Md)_k + u_k = 0 )  ∨  ( d_k + u_k = 0 ∧ γ(Md)_k + u_k ≥ 0 ),  k ∈ I_+,
  ( d_k + u_k < 0 ∧ γ(Md)_k + u_k = 0 )  ∨  ( d_k + u_k = 0 ∧ γ(Md)_k + u_k ≤ 0 ),  k ∈ I_−,

respectively. The last two conditions are actually equivalent to

  d_{I+} + u_{I+} ≥ 0,  u_{I+} + γ(Md)_{I+} ≥ 0,  ⟨d_{I+} + u_{I+}, u_{I+} + γ(Md)_{I+}⟩ = 0   (31)

and

  d_{I−} + u_{I−} ≤ 0,  u_{I−} + γ(Md)_{I−} ≤ 0,  ⟨d_{I−} + u_{I−}, u_{I−} + γ(Md)_{I−}⟩ = 0.   (32)

Now assume that d ∈ ℓ_2 solves (6) and hence (26), (27), (31) and (32) hold. By defining x, y as in (28), (31) and (32) tell us that x ≥ 0, y ≥ 0 and ⟨x, y⟩ = 0. Moreover, (26) and (27) imply that

  d_A = (γ M_{A,A})^{-1} ( γ M_{A,I_0} u_{I_0} − F(u)_A − γ M_{A,I+} d_{I+} − γ M_{A,I−} d_{I−} ).   (33)

Inserting (33) into the definition of y from (28), we obtain

  y = ( γ(Md)_{I+} + u_{I+} ; −γ(Md)_{I−} − u_{I−} )
    = ( γ M_{I+,A} d_A + γ M_{I+,I+} d_{I+} + γ M_{I+,I−} d_{I−} − γ M_{I+,I_0} u_{I_0} + u_{I+} ; −γ M_{I−,A} d_A − γ M_{I−,I+} d_{I+} − γ M_{I−,I−} d_{I−} + γ M_{I−,I_0} u_{I_0} − u_{I−} )
    = N ( d_{I+} ; −d_{I−} ) + ( γ(M_{I+,A} M_{A,A}^{-1} M_{A,I_0} − M_{I+,I_0}) u_{I_0} − M_{I+,A} M_{A,A}^{-1} F(u)_A + u_{I+} ; γ(M_{I−,I_0} − M_{I−,A} M_{A,A}^{-1} M_{A,I_0}) u_{I_0} + M_{I−,A} M_{A,A}^{-1} F(u)_A − u_{I−} )
    = Nx + z,

with N from (24) and z from (30), showing (29). Conversely, assume that x, y solve (29) and define d ∈ ℓ_2 blockwise via

  ( d_{I+} ; d_{I−} ) := ( x_{I+} − u_{I+} ; −x_{I−} − u_{I−} )

and the system (26), (27) for the remaining entries from A ∪ I_0. It remains to show that d solves (6). As we have already seen above, (26) and (27) are equivalent to (6) holding in the entries corresponding to A ∪ I_0. Moreover, as above, (26) and (27) imply (33), which yields

  y = Nx + z = ( γ(Md)_{I+} + u_{I+} ; −γ(Md)_{I−} − u_{I−} ),

tracing back the previous computation. By consequence, (28) holds. (29) implies (31) and (32), and thus (6) holds in the remaining entries corresponding to I_+ ∪ I_−.

We are now in the position to prove the following theorem on the solvability of (6).

Theorem 3.5. Let (x, y) be the unique solution of (29) for an iterate u^(j) and let I = I_0 ∪ I_+ ∪ I_−. Then the local B-semismooth Newton iteration (5) is well-defined,

with

  d^(j)_{I_0} = −u^(j)_{I_0},  d^(j)_{I+} = x_{I+} − u^(j)_{I+},  d^(j)_{I−} = −x_{I−} − u^(j)_{I−},
  d^(j)_A = (γ M_{A,A})^{-1} ( −γ M_{A,I} d^(j)_I − F(u^(j))_A ).   (34)

Proof. The finite-dimensional linear complementarity problem (29) is uniquely solvable if N is a P-matrix, i.e. if all of its principal minors are positive [9, Definition 3.3.1, Theorem 3.3.7]. According to Lemma 3.3, N is symmetric and positive definite for every iterate u^(j) ∈ ℓ_2. Hence, N is a P-matrix and (5), (34) are well-defined.

Concerning the convergence properties of the local B-semismooth Newton method, we can rely on the following result from [7, Corollary 3.3].

Corollary 3.6. Let u* be the unique solution to F(u) = 0. Then, the iterates of the local B-semismooth Newton iteration (5) converge to u* superlinearly in a neighborhood of u*.

4. The global B-semismooth Newton iteration

In this section, we will analyze the following damped B-semismooth Newton method for the computation of ℓ_1 Tikhonov minimizers,

  u^(j+1) = u^(j) + t_j d^(j),  F′(u^(j), d^(j)) = −F(u^(j)),  j = 0, 1, …   (35)

and its global convergence properties in a finite-dimensional setting 𝒩 = {1, …, n}. Inspired by the works [17, 21, 25, 27], we shall choose the damping parameters t_j by an Armijo-type rule. The overall algorithm can be found in Algorithm 1. Let us first show that the damping parameters t_j are well-defined.

Algorithm 1 Damped B-semismooth Newton method
  Choose a starting value u^(0) and parameters p ∈ {1, 2}, β ∈ (0, 1), σ ∈ (0, 1/p) and a tolerance tol
  while Θ_p(u^(j)) ≥ tol do
    Compute the Newton update d^(j) according to (34)
    t_j = 1
    while Θ_p(u^(j) + t_j d^(j)) > (1 − pσ t_j) Θ_p(u^(j)) do
      t_j = t_j β
    end while
    u^(j+1) = u^(j) + t_j d^(j)
    j = j + 1
  end while
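Algorithm 1 is straightforward to prototype. The sketch below is an illustrative NumPy transcription, not the authors' code: it computes the B-Newton direction only in the nondegenerate case I_+ = I_− = ∅ (which was the only case encountered in the numerical experiments reported in Section 5; degenerate indices would additionally require solving the small linear complementarity problem (29)), and it then applies the Armijo-type damping rule. The parameter defaults are illustrative assumptions.

```python
import numpy as np

def residual(u, K, f, w, gamma):
    v = u + gamma * (K.T @ (f - K @ u))
    return u - np.sign(v) * np.maximum(np.abs(v) - gamma * w, 0.0)     # F(u), cf. (2)

def b_newton_direction(u, K, f, w, gamma):
    # B-Newton direction (34) in the nondegenerate case I_+ = I_- = empty:
    # d_I = -u_I on the inactive set, linear solve on the (finite) active set.
    v = u + gamma * (K.T @ (f - K @ u))
    A = np.abs(v) > gamma * w
    I = ~A
    M = K.T @ K
    F = residual(u, K, f, w, gamma)
    d = np.zeros_like(u)
    d[I] = -u[I]
    rhs = -F[A] - gamma * (M[np.ix_(A, I)] @ d[I])
    d[A] = np.linalg.solve(gamma * M[np.ix_(A, A)], rhs)
    return d

def damped_b_newton(K, f, w, gamma, u0, p=2, beta=0.5, sigma=0.01, tol=1e-6, max_iter=100):
    # Damped B-semismooth Newton method (Algorithm 1) with the Armijo-type rule (36).
    u = u0.copy()
    for _ in range(max_iter):
        theta = np.sum(np.abs(residual(u, K, f, w, gamma)) ** p) / p    # Theta_p(u)
        if theta < tol:
            break
        d = b_newton_direction(u, K, f, w, gamma)
        t = 1.0
        while np.sum(np.abs(residual(u + t * d, K, f, w, gamma)) ** p) / p > (1 - p * sigma * t) * theta:
            t *= beta
        u = u + t * d
    return u
```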

Proposition 4.1. Let p ∈ {1, 2}, β ∈ (0, 1), σ ∈ (0, 1/p). Let u^(j) with Θ_p(u^(j)) > 0 be an iterate in Algorithm 1 and let d^(j) = d(u^(j)) be chosen according to (34). Then there exists a finite index l ∈ ℕ with

  Θ_p(u^(j) + β^l d^(j)) ≤ (1 − pσβ^l) Θ_p(u^(j)).   (36)

Proof. According to (23), it holds Θ_p′(u^(j), d^(j)) = −pΘ_p(u^(j)) < 0. Assuming that there exists no such index l ∈ ℕ, it holds for all l ∈ ℕ

  (Θ_p(u^(j) + β^l d^(j)) − Θ_p(u^(j)))/β^l > −pσ Θ_p(u^(j)).

For l → ∞, it follows that Θ_p′(u^(j), d^(j)) ≥ −pσ Θ_p(u^(j)), which is a contradiction to Θ_p′(u^(j), d^(j)) = −pΘ_p(u^(j)) < −pσ Θ_p(u^(j)).

Remark 4.2. For the particular B-Newton direction from (34), inequality (36) is equivalent to the ordinary Armijo rule, because it holds

  (1 − pσ t_j) Θ_p(u^(j)) = Θ_p(u^(j)) − pσ t_j Θ_p(u^(j)) = Θ_p(u^(j)) + σ t_j Θ_p′(u^(j), d^(j))

with σ ∈ (0, 1/p).

As an important ingredient of our convergence analysis, we will show that ‖F(·)‖_p, p ∈ {1, 2}, is coercive with respect to the Euclidean norm, i.e., lim_{‖u‖_2 → ∞} ‖F(u)‖_p = ∞. For the proof, we need the following estimates for soft shrinkage in finite-dimensional spaces.

Lemma 4.3. Let 0 ≤ w ∈ ℝ^n. Then it holds that

  ‖u − S_w(u)‖_p ≤ n^{1/p} ‖w‖_∞,  for all u ∈ ℝ^n.   (37)

Proof. For each u ∈ ℝ^n, (37) follows from

  ‖u − S_w(u)‖_p^p = ∑_{|u_k| > w_k} w_k^p + ∑_{|u_k| ≤ w_k} |u_k|^p ≤ ∑_{k=1}^n w_k^p ≤ n ‖w‖_∞^p.

The coercivity of ‖F(·)‖_p with respect to the Euclidean norm for each γ > 0 immediately follows from the following lemma, because I − γK*K ∈ ℝ^{n×n} is symmetric and 1 is not an eigenvalue of this matrix, because K is injective.

Lemma 4.4. Let 0 ≤ w ∈ ℝ^n, c ∈ ℝ^n, B ∈ ℝ^{n×n} be symmetric, and assume that 1 is not an eigenvalue of B. Then there exist α > 0 and β ∈ ℝ with

  ‖u − S_w(Bu + c)‖_p ≥ α‖u‖_2 + β,  for all u ∈ ℝ^n.   (38)

Proof. By assumption on the spectral properties of B, there exist B-invariant subspaces U_1, U_2 and constants 0 < ρ < 1 < η with

  ‖Bu‖_2 ≤ ρ‖u‖_2 for all u ∈ U_1,  ‖Bu‖_2 ≥ η‖u‖_2 for all u ∈ U_2,

such that the splitting ℝ^n = U_1 ⊕ U_2 is orthogonal. For an arbitrary u ∈ ℝ^n and its orthogonal projections u_j onto U_j, j ∈ {1, 2}, a double application of Pythagoras' theorem implies that

  ‖(I − B)u‖_2^2 = ‖(I − B)u_1‖_2^2 + ‖(I − B)u_2‖_2^2 ≥ min{(1 − ρ)^2, (η − 1)^2} ‖u‖_2^2.   (39)

By combining (39), (37) and the nonexpansivity of the soft shrinkage operator S_w, we obtain that

  ‖u − S_w(Bu + c)‖_2 ≥ ‖(I − B)u‖_2 − ‖Bu − S_w(Bu)‖_2 − ‖S_w(Bu) − S_w(Bu + c)‖_2 ≥ min{1 − ρ, η − 1}‖u‖_2 − √n ‖w‖_∞ − ‖c‖_2,

proving the claim for p = 2. The claim for p = 1 follows from ‖u − S_w(Bu + c)‖_1 ≥ ‖u − S_w(Bu + c)‖_2.

We proceed by verifying the compactness of the level sets of ‖F(·)‖_p.

Proposition 4.5. Let u^(0) be an arbitrary starting vector of Algorithm 1. Then the level set L(u^(0)) := {u : ‖F(u)‖_p ≤ ‖F(u^(0))‖_p}, p ∈ {1, 2}, is compact.

Proof. This follows immediately from the coercivity (38) of the function ‖F(·)‖_p.

In the following proposition, we will show that the B-semismooth Newton directions from (34) are bounded with respect to the current residual norms.

Proposition 4.6. Let u ∈ ℝ^n, p ∈ {1, 2}, and let d = d(u) be chosen according to (34). There exists a constant C > 0, independent of u, with ‖d‖_p ≤ C ‖F(u)‖_p.

Proof. Let N, z, x, y be as in (24), (29), (30). It suffices to consider p = 2. It holds ‖d‖_2 ≤ ‖d_A‖_2 + ‖d_{I_0}‖_2 + ‖d_{I+}‖_2 + ‖d_{I−}‖_2 and

  ‖d_{I+}‖_2 = ‖x_{I+} − u_{I+}‖_2 ≤ ‖x‖_2 + ‖F(u)‖_2,
  ‖d_{I−}‖_2 = ‖x_{I−} + u_{I−}‖_2 ≤ ‖x‖_2 + ‖F(u)‖_2,
  ‖d_{I_0}‖_2 = ‖F(u)_{I_0}‖_2 ≤ ‖F(u)‖_2.

Consider first ‖x‖_2. Since N is symmetric and positive definite for all u ∈ ℝ^n (cf. Lemma 3.3), N is a P-matrix, cf. the proof of Theorem 3.5. Corresponding to [9, p. 478], we define

  c(N) := min_{‖z‖_∞ = 1} max_{1 ≤ i ≤ n} z_i (Nz)_i.

According to [9, Lemma 7.3.10], it holds for arbitrary z^1, z^2 ∈ ℝ^n

  ‖x^1 − x^2‖_∞ ≤ c(N)^{-1} ‖z^1 − z^2‖_∞,

where x^1 is the unique solution of

  x ≥ 0,  Nx + z^1 ≥ 0,  ⟨x, Nx + z^1⟩ = 0,

and x^2 is the unique solution of

  x ≥ 0,  Nx + z^2 ≥ 0,  ⟨x, Nx + z^2⟩ = 0.

Choosing z^2 = 0, it follows ⟨x^2, Nx^2⟩ = 0. Since N is positive definite, x^2 = 0 and

  ‖x‖_∞ ≤ c(N)^{-1} ‖z‖_∞ ≤ max_{A, I_0, I_+, I_−} c(N)^{-1} ‖z‖_∞.

Note that there exist only 4^n possibilities for the decomposition of {1, …, n} into the sets A, I_0, I_+, I_−. Therefore, there exists a constant c_1 = c_1(n) > 0 with ‖x‖_2 ≤ c_1 ‖z‖_2. Moreover, (30) implies that

  ‖z‖_2 ≤ [ γ‖M_{I+,A} M_{A,A}^{-1} M_{A,I_0} − M_{I+,I_0}‖_2 + ‖M_{I+,A} M_{A,A}^{-1}‖_2 + 1 ] ‖F(u)‖_2
        + [ γ‖M_{I−,I_0} − M_{I−,A} M_{A,A}^{-1} M_{A,I_0}‖_2 + ‖M_{I−,A} M_{A,A}^{-1}‖_2 + 1 ] ‖F(u)‖_2 + ‖N‖_2 ‖F(u)‖_2 ≤ c_2 ‖F(u)‖_2,

where

  c_2 := max_{A, I_0, I_+, I_−} [ γ‖M_{I+,A} M_{A,A}^{-1} M_{A,I_0} − M_{I+,I_0}‖_2 + ‖M_{I+,A} M_{A,A}^{-1}‖_2 + 1 + γ‖M_{I−,I_0} − M_{I−,A} M_{A,A}^{-1} M_{A,I_0}‖_2 + ‖M_{I−,A} M_{A,A}^{-1}‖_2 + 1 ] + ‖N‖_2.

Moreover,

  ‖d_A‖_2 = ‖(γ M_{A,A})^{-1} ( F(u)_A + γ M_{A,I} d_I )‖_2 ≤ γ^{-1} ‖M_{A,A}^{-1}‖_2 ‖F(u)‖_2 + ‖M_{A,A}^{-1} M_{A,I}‖_2 ‖d_I‖_2,

where I = I_+ ∪ I_− ∪ I_0, and

  ‖d_I‖_2 ≤ ‖d_{I_0}‖_2 + ‖d_{I+}‖_2 + ‖d_{I−}‖_2 ≤ 2(‖x‖_2 + ‖F(u)‖_2) + ‖F(u)‖_2 ≤ (2c_1 c_2 + 3) ‖F(u)‖_2.

Therefore, with

  c_3 := max_{A, I_0, I_+, I_−} [ γ^{-1} ‖M_{A,A}^{-1}‖_2 + ‖M_{A,A}^{-1} M_{A,I}‖_2 (2c_1 c_2 + 3) ],

we obtain ‖d_A‖_2 ≤ c_3 ‖F(u)‖_2. Altogether,

  ‖d‖_2 ≤ C ‖F(u)‖_2  with C := c_3 + 2c_1 c_2 + 3,

independent of u.

In our convergence proof, we shall need the continuity of Θ_p′.

Lemma 4.7. Let (u^(j), d^(j)) → (u*, d), j → ∞, and Θ := Θ_p, p ∈ {1, 2}. If Θ from (7) fulfills the condition

  lim_{(u,v) → (u*,u*)} (Θ(u) − Θ(v) − Θ′(u*, u − v)) / ‖u − v‖_p = 0,   (40)

then the directional derivative Θ′ is continuous at (u*, d) as a function of (u, d), i.e. lim_{j→∞} Θ′(u^(j), d^(j)) = Θ′(u*, d).

Proof. Since Θ from (7) is locally Lipschitz continuous and directionally differentiable in the level set L(u^(0)) from Proposition 4.5, it follows for u* ∈ L(u^(0)) that Θ′(u*, ·) is a locally Lipschitz continuous function with the same Lipschitz constant as Θ, cf. [29], i.e. it holds lim_{j→∞} Θ′(u*, d^(j)) = Θ′(u*, d). Because of condition (40) of Θ at u*, the double limit

  lim_{j→∞, t↓0} (Θ(u^(j) + t d^(j)) − Θ(u^(j)))/t

exists and is equal to Θ′(u*, d). Additionally, it holds that

  lim_{t↓0} (Θ(u^(j) + t d^(j)) − Θ(u^(j)))/t = Θ′(u^(j), d^(j))

for every fixed j ∈ ℕ. Therefore, we conclude that

  lim_{j→∞} Θ′(u^(j), d^(j)) = lim_{j→∞} lim_{t↓0} (Θ(u^(j) + t d^(j)) − Θ(u^(j)))/t = lim_{j→∞, t↓0} (Θ(u^(j) + t d^(j)) − Θ(u^(j)))/t = Θ′(u*, d).

We may now prove our main result on the global convergence of Algorithm 1.

Theorem 4.8. Consider the damped B-semismooth Newton method for the finite-dimensional shrinkage equation (2). Let {u^(j)}_j be a sequence of iterates produced by Algorithm 1, {t_j}_j the chosen stepsizes, p ∈ {1, 2} and Θ := Θ_p.
  (i) If lim sup_{j→∞} t_j > 0, then u^(j) → u*, j → ∞, with Θ(u*) = 0.
  (ii) If lim sup_{j→∞} t_j = 0 and if u* is an accumulation point of {u^(j)}_j where condition (40) holds at u*, then u^(j) → u*, j → ∞, with Θ(u*) = 0.

Proof. We proceed as in [17, 21, 25, 27]. The B-Newton equation (6) has a unique solution in each step, according to Theorem 3.5. Moreover, Proposition 4.1 ensures that the Armijo rule outputs well-defined stepsizes t_j. Due to Proposition 4.5, the sequence of iterates {u^(j)}_j is bounded and has an accumulation point u*. If t_j is chosen equal to 1 infinitely many times, then it follows with pσ < 1 that u^(j) → u*, j → ∞, with Θ(u*) = 0. Otherwise, {Θ(u^(j))}_j is still strictly decreasing, due to the Armijo rule, and obviously bounded below by 0. Therefore, the sequence {Θ(u^(j))}_j is convergent and the sequence of differences {Θ(u^(j+1)) − Θ(u^(j))}_j tends to 0. The Armijo rule (36) implies

  t_j Θ(u^(j)) ≤ (Θ(u^(j)) − Θ(u^(j+1)))/(pσ) → 0,  j → ∞.

If the chosen stepsizes are bounded away from 0, i.e. lim sup_{j→∞} t_j > 0, it follows that u^(j) → u*, j → ∞, with Θ(u*) = 0, proving (i).

Suppose now that lim sup_{j→∞} t_j = 0, which implies that lim_{j→∞} t_j = 0. Let {u^(j)}_{j∈J} be a subsequence converging to u* and assume that Θ(u*) > 0. Let t_j = β^{l_j} denote the chosen stepsizes in Algorithm 1. We define τ_j := β^{l_j − 1}. The Armijo rule (36) yields

  (Θ(u^(j) + τ_j d^(j)) − Θ(u^(j)))/τ_j > −pσ Θ(u^(j)),  j = 0, 1, …   (41)

According to Proposition 4.6, the sequence {d^(j)}_j is bounded. Without loss of generality we may suppose that d^(j) → d, j → ∞, j ∈ J. Passing to the limit in (41), assumption (40) of Θ at u* yields with Lemma 4.7

  Θ′(u*, d) ≥ −pσ Θ(u*).

Due to (40), the directional derivative Θ′ as a function of (u, d) is continuous at (u*, d), see Lemma 4.7. Therefore,

  Θ′(u*, d) = lim_{j→∞, j∈J} Θ′(u^(j), d^(j)) = lim_{j→∞, j∈J} (−p Θ(u^(j))) = −p Θ(u*).

Altogether, we obtain −p Θ(u*) ≥ −pσ Θ(u*), which is a contradiction to σ < 1/p and hence finishes the proof.

Remark 4.9. Assumption (40) on Θ is only needed in case (ii) of Theorem 4.8. Pang proved in [26] that if the merit function Θ has no strong F-derivative at the accumulation point u* (the existence of a strong F-derivative is a sufficient condition for (40)), then u* is not a zero of Θ. Therefore, this assumption in Theorem 4.8 is necessary.

Concerning the speed of convergence of our B-semismooth Newton scheme, we cite the following corollary from [17, Theorem 4.3, Corollary 4.4].

Corollary 4.10. Let {u^(j)} be any sequence generated by Algorithm 1 with p = 2. Assume that F(u^(j)) ≠ 0 for all j. Suppose that u* is an accumulation point of {u^(j)}. Then u* is the zero of F if and only if {u^(j)} converges to u* superlinearly and t_j eventually becomes 1. On the other hand, u* is not the zero of F if and only if either the sequence {u^(j)} diverges or lim_{j→∞} t_j = 0.

The following remark clarifies the naming of our B-semismooth Newton method, by interpreting it as a semismooth Newton method in the sense of [7].

Remark 4.11. Since F is locally Lipschitz continuous, there exists a generalized derivative G with

  F′(u, d) = G(u)d   (42)

such that the Newton direction from (4) coincides with the B-Newton direction (34), see for example [28, Proposition 3.4]. Here, G(u) is given by the block matrix

  G(u) = ( γ(K*K)_{B,B}  γ(K*K)_{B,C} ; 0  I_C ),   (43)

where

  B = B(u) := A(u) ∪ {k ∈ I_± : x_k > 0},  C = C(u) := I_0(u) ∪ {k ∈ I_± : x_k = 0} = 𝒩 \ B,

and x is the solution to the linear complementarity problem (29). With this choice, (42) holds. If I_±(u) = ∅, then the generalized derivatives G from (8) and (43) coincide. (43) is indeed a generalized derivative of F because

  lim_{‖h‖→0} ‖F(u + h) − F(u) − G(u + h)h‖_p / ‖h‖_p ≤ lim_{‖h‖→0} ‖F(u + h) − F(u) − F′(u, h)‖_p / ‖h‖_p + lim_{‖h‖→0} ‖F′(u, h) − F′(u + h, h)‖_p / ‖h‖_p.

The first limit is equal to 0 because of the B-differentiability of F. The second limit is 0 since F is semismooth and locally Lipschitz continuous, see [28, Theorem 2.3].

5. Numerical results

In this section, we discuss numerical results on the application of the damped B-semismooth Newton method from Algorithm 1 to ℓ_1 Tikhonov regularization of different problems. We shall consider the inverse integration problem [16], the inverse heat equation and deblurring of images [18–20]. In particular, we will analyze the influence of the parameter γ > 0 on the performance of Algorithm 1. For our computations, we use MATLAB R2013a and the package regtools from [20]. In the Armijo rule (36), we choose the parameter values σ = 0.01 and β = 0.5. We will work with constant weights w_k = w_0 for all k ∈ 𝒩, and w_0 is adapted to the measurement errors a posteriori by the discrepancy principle, see also [1, 3]. If not otherwise specified, p = 2 is used to define the merit function Θ_p. In fact, our algorithm yields similar results independent of the choice p ∈ {1, 2}. In the following examples, the index sets I_+, I_− were empty in each iteration. If all stepsizes are chosen equal to 1 and if I_+ ∪ I_− = ∅ in each step, then Algorithm 1 coincides with the semismooth Newton iteration (4), (8) from [16]. Because of the sparsity of the solution, the starting vector was chosen as the zero vector in all following computations, but our algorithm performs well also for arbitrary starting vectors. In all examples, the stopping criterion was a residual norm smaller than a prescribed tolerance.

5.1. Inverse integration

Here, analogous to [16], we consider the inverse integration problem. Our aim is to find u(x) from measurements f(x) with (Ku)(x) = f(x) on [0, 1]. The operator K: L_2([0, 1]) → L_2([0, 1]) is defined as

  (Ku)(x) = ∫_0^x u(t) dt.

K is discretized by the scaled lower triangular matrix

  K = (1/N) ( 1 ; 1 1 ; ⋮ ⋱ ; 1 ⋯ 1 ) ∈ ℝ^{N×N},

and the data f = (f(x_k))_{k=1,…,N} are given on the grid {x_k = k/N : k = 1, …, N}.
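For reference, the discretized integration operator just described can be assembled in a couple of lines. The following NumPy sketch is illustrative and not the authors' MATLAB code; the commented test data are hypothetical.

```python
import numpy as np

def inverse_integration_operator(N):
    # Discretization of (Ku)(x) = int_0^x u(t) dt on the grid x_k = k/N:
    # (1/N) times the lower triangular matrix of ones.
    return np.tril(np.ones((N, N))) / N

# Example setup (hypothetical sparse test signal and one possible 5% noise model):
# N = 500
# K = inverse_integration_operator(N)
# u_true = np.zeros(N); u_true[[50, 200, 350]] = [1.0, -0.5, 2.0]
# f = K @ u_true
# f_noisy = f + 0.05 * np.linalg.norm(f) / np.sqrt(N) * np.random.randn(N)
```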

Figure 2 illustrates the reconstruction of u = (u(x_k))_{k=1,…,N} from 5% noisy data f^δ with the damped B-semismooth Newton method for p = 2. Here, the space dimension was N = 500, γ = 10^4, and the constant weights w_k = w_0 were chosen by the discrepancy principle. The residual norms in the original semismooth Newton method (4) from [16] might not decrease monotonically, see the corresponding table in loc. cit. However, the damped Newton steps in Algorithm 1 effect a strictly monotonic decrease of the residual norm. Close to the zero of the merit functional, the stepsizes were chosen equal to 1. This fits well the locally superlinear convergence of the B-semismooth Newton method, see Corollary 3.6. One can often observe a remarkable decrease of the residual norm in the last step, as mentioned for the semismooth Newton method in [16] and in Section 2. If p = 1 is chosen in this example and γ = 10^5, a similar reconstruction is achieved, see Table 2.

[Figure 2. Inverse integration problem with N = 500, γ = 10^4, constant weights w_k and p = 2. From left to right and top to bottom: noisy data containing 5% of noise, reconstruction, residual norms ‖F(u^(j))‖, chosen stepsizes t_j.]

[Table 2. Performance of Algorithm 1, depending on the choice of γ, for space dimension N = 500, constant weights w_k and 5% noisy data f^δ, for the example of inverse integration. Columns: γ, # iterations, J(u^(m)), ‖F(u^(m))‖, #{j : t_j = 1}, ‖F(u^(0))‖.]

The influence of the parameter γ on the performance of Algorithm 1 is illustrated in Table 2. The space dimension N = 500 was fixed and the number m of iterations,

the functional value at convergence J(u^(m)) = (1/2)‖Ku^(m) − f‖^2 + ∑_{k=1}^N w_k |u_k^(m)|, the residual norm ‖F(u^(m))‖, the number of steps with stepsize equal to 1 as well as the residual norm at the initial guess are listed depending on the choice of γ. For small values of γ, many iterations need to be computed. Table 2 shows that the choice of γ is crucial for the performance of Algorithm 1. In contrast to the semismooth Newton method from [16], see also Section 2, the damped B-semismooth Newton method is guaranteed to converge for an arbitrary choice of γ > 0. However, the computational work does not only depend on the number of iterations, but also on the space dimension N, the number of active indices of each iterate, the chosen stepsizes and the presence of degenerate indices k ∈ I_+(u^(j)) ∪ I_−(u^(j)) during the iteration. Degenerate indices did not appear in our numerical experiments.

[Table 3. Performance of Algorithm 1, depending on the problem size N, for the example of inverse integration, where γ = 10^5 and the weights w_k are constant. The noisy data f^δ contained 5% of noise. Columns: N, # iterations, ‖F(u^(m))‖, ‖F(u^(0))‖.]

[Table 4. Influence of the noise level on the performance of Algorithm 1 for the example of inverse integration. Here, N = 500 and γ = 10^5. Columns: noise level, w_k, # iterations, ‖F(u^(m))‖, ‖F(u^(0))‖.]

Table 3 demonstrates the influence of the problem size N on the performance of Algorithm 1. We chose γ = 10^5, constant weights w_k for all k ∈ {1, …, N} and a noise level of 5%. The number of iterations only increases very slowly with the problem size. Moreover, the number m of iterations is almost independent of the noise level for the example of inverse integration, as is demonstrated in Table 4. Here, the problem

size was chosen as N = 500, γ = 10^5, and the regularization parameters were chosen again by the discrepancy principle.

5.2. Deblurring of images

The next example treats the deblurring of digital images which are degraded by atmospheric turbulence blur, see [18, 19]. The mathematical model is a Fredholm integral equation with Gaussian point-spread kernel

  h(x, y) = (1/(2πρ^2)) e^{−(x^2 + y^2)/(2ρ^2)}.

The problem is discretized by a symmetric doubly Toeplitz matrix

  K = (1/(2πρ^2)) T ⊗ T ∈ ℝ^{N×N},

where T ∈ ℝ^{n×n}, N = n^2, is a symmetric banded Toeplitz matrix with bandwidth 5 whose entries are T_{ij} = e^{−(i−j)^2/(2ρ^2)} for |i − j| ≤ 2 and T_{ij} = 0 otherwise, see [19].
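A compact way to assemble this blurring operator is via a Kronecker product of the banded Toeplitz factor with itself. The sketch below follows the construction described above; it is an illustrative NumPy fragment (the authors used MATLAB and regtools), and the half-bandwidth of 2 mirrors the bandwidth-5 matrix T displayed in the text.

```python
import numpy as np

def blur_operator(n, rho=0.7, half_band=2):
    # Symmetric banded Toeplitz factor T with T_ij = exp(-(i-j)^2 / (2*rho^2)) for |i-j| <= half_band.
    T = np.zeros((n, n))
    for offset in range(half_band + 1):
        val = np.exp(-offset**2 / (2.0 * rho**2))
        T += val * np.eye(n, k=offset)
        if offset > 0:
            T += val * np.eye(n, k=-offset)
    # Doubly Toeplitz blurring matrix K = (1/(2*pi*rho^2)) * (T kron T), acting on an
    # n-by-n image stacked into a vector of length N = n^2.
    return np.kron(T, T) / (2.0 * np.pi * rho**2)
```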

[Figure 3. Reconstruction of a blurred test image. Left: blurred image with added noise. Right: reconstruction.]

Figure 3 shows the reconstruction of a test image from [20] from noisy data f^δ. We chose ρ = 0.7, a moderate value of γ and constant weights w_k. The algorithm terminated after 5 steps with a residual norm below the prescribed tolerance; the residual norm of the starting vector was considerably larger. All stepsizes were chosen equal to one and the index sets I_± did not appear throughout the iteration. Therefore, the algorithm coincided with the semismooth Newton method from [16] for this very choice of γ. For larger parameter values γ, however, the output of our Algorithm 1 deviates from that of the semismooth Newton method from [16], because the Armijo rule will select stepsizes t_j < 1 at least for early iterates.

5.3. Inverse heat equation

The third example treats the inverse heat equation [6, 19]. Here, we consider a Volterra integral equation of the first kind on the interval [0, 1] with kernel

  K(s, t) = ((s − t)^{−3/2} / (2κ√π)) e^{−1/(4κ^2 (s − t))}.

For κ = 1, the inverse heat equation is severely ill-posed. The equation is less ill-posed for larger κ. The problem is discretized as in [20]. Figure 4 shows the results of Algorithm 1 for the inverse heat equation problem with noisy data. First, we chose κ = 5; with γ = 10^4 and constant weights w_k, the reconstruction was found after 4 steps. Second, we considered κ = 1. The regularization parameters w_k were again chosen by the discrepancy principle. With γ = 10^4, the residual norm dropped below the tolerance after 6 steps. In these two examples, the B-semismooth Newton iteration was consistent with the semismooth Newton iteration from [16].

[Figure 4. Solution of the inverse heat equation with noisy data. From left to right: exact solution, reconstruction for κ = 5, reconstruction for κ = 1.]

6. Discussion

In [16], the locally superlinearly convergent semismooth Newton method for Tikhonov functionals with sparsity constraints was presented in the infinite-dimensional setting. We have discussed a globalized variant of this very method, by introducing damping parameters which ensure the proper descent of the semismooth Newton iterates with respect to a suitable merit functional. The consideration of a Bouligand direction was crucial for our proof of global convergence. Interestingly, the Newton direction

of [16] always coincided with the Bouligand direction in our numerical experiments. More precisely, our method can be interpreted as a semismooth Newton method in the sense of [7], with a generalized derivative that coincides with the generalized derivative from [16] almost everywhere. Several properties of the proposed B-semismooth Newton method are advantageous: our scheme combines the convergence speed of the semismooth Newton method from [16] with the global convergence property of other state-of-the-art, but still linearly convergent, methods. Additionally, our method yields the exact minimizer of the Tikhonov functional within finitely many iterations. Further research will focus on the analysis of the infinite-dimensional setting, which may shed some light on the robustness with respect to the spatial dimension and on the option to accelerate the convergence even more, e.g., by multigrid strategies. Moreover, the analysis of matrix-free inexact B-semismooth Newton methods will be of interest in the context of ℓ_1 Tikhonov regularization of large-scale discrete ill-posed problems.

References

[1] Anzengruber S W and Ramlau R 2010 Morozov's discrepancy principle for Tikhonov-type functionals with nonlinear operators Inverse Problems 26 025001
[2] Beck A and Teboulle M 2009 A fast iterative shrinkage-thresholding algorithm for linear inverse problems SIAM J. Imaging Sci. 2(1) 183–202
[3] Bonesky T 2009 Morozov's discrepancy principle and Tikhonov-type functionals Inverse Problems 25 015015
[4] Bredies K and Lorenz D A 2008 Linear convergence of iterative soft-thresholding J. Fourier Anal. Appl. 14(5-6)
[5] Bredies K, Lorenz D A and Maass P 2009 A generalized conditional gradient method and its connection to an iterative shrinkage method Comput. Optim. Appl.
[6] Carasso A 1982 Determining surface temperatures from interior observations SIAM J. Appl. Math. 42(3)
[7] Chen X, Nashed Z and Qi L 2000 Smoothing methods and semismooth methods for nondifferentiable operator equations SIAM J. Numer. Anal. 38(4) 1200–1216
[8] Combettes P L and Wajs V R 2005 Signal recovery by proximal forward-backward splitting Multiscale Model. Simul. 4(4)
[9] Cottle R W, Pang J-S and Stone R E 2009 The Linear Complementarity Problem (Philadelphia: SIAM)
[10] Daubechies I, Defrise M and De Mol C 2004 An iterative thresholding algorithm for linear inverse problems with a sparsity constraint Commun. Pure Appl. Math. 57 1413–1457
[11] Deuflhard P 2004 Newton Methods for Nonlinear Problems. Affine Invariance and Adaptive Algorithms (Springer Series in Computational Mathematics vol 35) (Berlin: Springer)
[12] Efron B, Hastie T, Johnstone I and Tibshirani R 2004 Least angle regression Ann. Statist. 32(2) 407–499
[13] Engl H W, Hanke M and Neubauer A 1996 Regularization of Inverse Problems (Mathematics and its Applications vol 375) (Dordrecht: Kluwer Academic Publishers)
[14] Facchinei F and Pang J-S 2003 Finite-Dimensional Variational Inequalities and Complementarity Problems, Volume I (New York: Springer)
[15] Facchinei F and Pang J-S 2003 Finite-Dimensional Variational Inequalities and Complementarity Problems, Volume II (New York: Springer)

[16] Griesse R and Lorenz D A 2008 A semismooth Newton method for Tikhonov functionals with sparsity constraints Inverse Probl. 24(3) 035007
[17] Han S-P, Pang J-S and Rangaraj N 1992 Globally convergent Newton methods for nonsmooth equations Math. Oper. Res. 17(3)
[18] Hanke M and Hansen P C 1993 Regularization methods for large-scale problems Surv. Math. Ind.
[19] Hansen P C 2008 Regularization Tools. A Matlab package for analysis and solution of discrete ill-posed problems (manual)
[20] Hansen P C 2008 Regularization Tools version 4.1
[21] Ito K and Kunisch K 2009 On a semi-smooth Newton method and its globalization Math. Program., Ser. A
[22] Loris I 2009 On the performance of algorithms for the minimization of ℓ_1-penalized functionals Inverse Probl. 25(3) 035008
[23] Milzarek A and Ulbrich M 2014 A semismooth Newton method with multidimensional filter globalization for ℓ_1-optimization SIAM J. Optim. 24(1)
[24] Osborne M R, Presnell B and Turlach B A 2000 A new approach to variable selection in least squares problems IMA J. Numer. Anal. 20 389–403
[25] Pang J-S 1990 Newton's method for B-differentiable equations Math. Oper. Res. 15(2) 311–341
[26] Pang J-S 1991 A B-differentiable equation-based, globally and locally quadratically convergent algorithm for nonlinear programs, complementarity and variational inequality problems Math. Prog.
[27] Qi L 1993 Convergence analysis of some algorithms for solving nonsmooth equations Math. Oper. Res. 18 227–244
[28] Qi L and Sun J 1993 A nonsmooth version of Newton's method Math. Program. 58 353–367
[29] Scholtes S 2012 Introduction to Piecewise Differentiable Equations (New York, Heidelberg, Dordrecht, London: Springer)
[30] Shapiro A 1990 On concepts of directional differentiability J. Optim. Theory Appl. 66(3)
[31] Yu C and Liu X 2013 Four kinds of differentiable maps Int. J. Pure Appl. Math. 83(3)


Master 2 MathBigData. 3 novembre CMAP - Ecole Polytechnique Master 2 MathBigData S. Gaïffas 1 3 novembre 2014 1 CMAP - Ecole Polytechnique 1 Supervised learning recap Introduction Loss functions, linearity 2 Penalization Introduction Ridge Sparsity Lasso 3 Some

More information

PDEs in Image Processing, Tutorials

PDEs in Image Processing, Tutorials PDEs in Image Processing, Tutorials Markus Grasmair Vienna, Winter Term 2010 2011 Direct Methods Let X be a topological space and R: X R {+ } some functional. following definitions: The mapping R is lower

More information

Discreteness of Transmission Eigenvalues via Upper Triangular Compact Operators

Discreteness of Transmission Eigenvalues via Upper Triangular Compact Operators Discreteness of Transmission Eigenvalues via Upper Triangular Compact Operators John Sylvester Department of Mathematics University of Washington Seattle, Washington 98195 U.S.A. June 3, 2011 This research

More information

Newton-type Methods for Solving the Nonsmooth Equations with Finitely Many Maximum Functions

Newton-type Methods for Solving the Nonsmooth Equations with Finitely Many Maximum Functions 260 Journal of Advances in Applied Mathematics, Vol. 1, No. 4, October 2016 https://dx.doi.org/10.22606/jaam.2016.14006 Newton-type Methods for Solving the Nonsmooth Equations with Finitely Many Maximum

More information

A Regularized Directional Derivative-Based Newton Method for Inverse Singular Value Problems

A Regularized Directional Derivative-Based Newton Method for Inverse Singular Value Problems A Regularized Directional Derivative-Based Newton Method for Inverse Singular Value Problems Wei Ma Zheng-Jian Bai September 18, 2012 Abstract In this paper, we give a regularized directional derivative-based

More information

This article was published in an Elsevier journal. The attached copy is furnished to the author for non-commercial research and education use, including for instruction at the author s institution, sharing

More information

A NEW ITERATIVE METHOD FOR THE SPLIT COMMON FIXED POINT PROBLEM IN HILBERT SPACES. Fenghui Wang

A NEW ITERATIVE METHOD FOR THE SPLIT COMMON FIXED POINT PROBLEM IN HILBERT SPACES. Fenghui Wang A NEW ITERATIVE METHOD FOR THE SPLIT COMMON FIXED POINT PROBLEM IN HILBERT SPACES Fenghui Wang Department of Mathematics, Luoyang Normal University, Luoyang 470, P.R. China E-mail: wfenghui@63.com ABSTRACT.

More information

SPECTRAL PROPERTIES OF THE LAPLACIAN ON BOUNDED DOMAINS

SPECTRAL PROPERTIES OF THE LAPLACIAN ON BOUNDED DOMAINS SPECTRAL PROPERTIES OF THE LAPLACIAN ON BOUNDED DOMAINS TSOGTGEREL GANTUMUR Abstract. After establishing discrete spectra for a large class of elliptic operators, we present some fundamental spectral properties

More information

A model function method in total least squares

A model function method in total least squares www.oeaw.ac.at A model function method in total least squares S. Lu, S. Pereverzyev, U. Tautenhahn RICAM-Report 2008-18 www.ricam.oeaw.ac.at A MODEL FUNCTION METHOD IN TOTAL LEAST SQUARES SHUAI LU, SERGEI

More information

A Multilevel Proximal Algorithm for Large Scale Composite Convex Optimization

A Multilevel Proximal Algorithm for Large Scale Composite Convex Optimization A Multilevel Proximal Algorithm for Large Scale Composite Convex Optimization Panos Parpas Department of Computing Imperial College London www.doc.ic.ac.uk/ pp500 p.parpas@imperial.ac.uk jointly with D.V.

More information

Research Note. A New Infeasible Interior-Point Algorithm with Full Nesterov-Todd Step for Semi-Definite Optimization

Research Note. A New Infeasible Interior-Point Algorithm with Full Nesterov-Todd Step for Semi-Definite Optimization Iranian Journal of Operations Research Vol. 4, No. 1, 2013, pp. 88-107 Research Note A New Infeasible Interior-Point Algorithm with Full Nesterov-Todd Step for Semi-Definite Optimization B. Kheirfam We

More information

Necessary conditions for convergence rates of regularizations of optimal control problems

Necessary conditions for convergence rates of regularizations of optimal control problems Necessary conditions for convergence rates of regularizations of optimal control problems Daniel Wachsmuth and Gerd Wachsmuth Johann Radon Institute for Computational and Applied Mathematics RICAM), Austrian

More information

A Double Regularization Approach for Inverse Problems with Noisy Data and Inexact Operator

A Double Regularization Approach for Inverse Problems with Noisy Data and Inexact Operator A Double Regularization Approach for Inverse Problems with Noisy Data and Inexact Operator Ismael Rodrigo Bleyer Prof. Dr. Ronny Ramlau Johannes Kepler Universität - Linz Florianópolis - September, 2011.

More information

ECE G: Special Topics in Signal Processing: Sparsity, Structure, and Inference

ECE G: Special Topics in Signal Processing: Sparsity, Structure, and Inference ECE 18-898G: Special Topics in Signal Processing: Sparsity, Structure, and Inference Sparse Recovery using L1 minimization - algorithms Yuejie Chi Department of Electrical and Computer Engineering Spring

More information

Numerical Methods for Large-Scale Nonlinear Systems

Numerical Methods for Large-Scale Nonlinear Systems Numerical Methods for Large-Scale Nonlinear Systems Handouts by Ronald H.W. Hoppe following the monograph P. Deuflhard Newton Methods for Nonlinear Problems Springer, Berlin-Heidelberg-New York, 2004 Num.

More information

Krylov subspace iterative methods for nonsymmetric discrete ill-posed problems in image restoration

Krylov subspace iterative methods for nonsymmetric discrete ill-posed problems in image restoration Krylov subspace iterative methods for nonsymmetric discrete ill-posed problems in image restoration D. Calvetti a, B. Lewis b and L. Reichel c a Department of Mathematics, Case Western Reserve University,

More information

A Recursive Trust-Region Method for Non-Convex Constrained Minimization

A Recursive Trust-Region Method for Non-Convex Constrained Minimization A Recursive Trust-Region Method for Non-Convex Constrained Minimization Christian Groß 1 and Rolf Krause 1 Institute for Numerical Simulation, University of Bonn. {gross,krause}@ins.uni-bonn.de 1 Introduction

More information

A range condition for polyconvex variational regularization

A range condition for polyconvex variational regularization www.oeaw.ac.at A range condition for polyconvex variational regularization C. Kirisits, O. Scherzer RICAM-Report 2018-04 www.ricam.oeaw.ac.at A range condition for polyconvex variational regularization

More information

An Iteratively Regularized Projection Method for Nonlinear Ill-posed Problems

An Iteratively Regularized Projection Method for Nonlinear Ill-posed Problems Int. J. Contemp. Math. Sciences, Vol. 5, 2010, no. 52, 2547-2565 An Iteratively Regularized Projection Method for Nonlinear Ill-posed Problems Santhosh George Department of Mathematical and Computational

More information

Tikhonov Regularization of Large Symmetric Problems

Tikhonov Regularization of Large Symmetric Problems NUMERICAL LINEAR ALGEBRA WITH APPLICATIONS Numer. Linear Algebra Appl. 2000; 00:1 11 [Version: 2000/03/22 v1.0] Tihonov Regularization of Large Symmetric Problems D. Calvetti 1, L. Reichel 2 and A. Shuibi

More information

Stochastic Optimization with Inequality Constraints Using Simultaneous Perturbations and Penalty Functions

Stochastic Optimization with Inequality Constraints Using Simultaneous Perturbations and Penalty Functions International Journal of Control Vol. 00, No. 00, January 2007, 1 10 Stochastic Optimization with Inequality Constraints Using Simultaneous Perturbations and Penalty Functions I-JENG WANG and JAMES C.

More information

Chapter 1 Foundations of Elliptic Boundary Value Problems 1.1 Euler equations of variational problems

Chapter 1 Foundations of Elliptic Boundary Value Problems 1.1 Euler equations of variational problems Chapter 1 Foundations of Elliptic Boundary Value Problems 1.1 Euler equations of variational problems Elliptic boundary value problems often occur as the Euler equations of variational problems the latter

More information

Iterative regularization of nonlinear ill-posed problems in Banach space

Iterative regularization of nonlinear ill-posed problems in Banach space Iterative regularization of nonlinear ill-posed problems in Banach space Barbara Kaltenbacher, University of Klagenfurt joint work with Bernd Hofmann, Technical University of Chemnitz, Frank Schöpfer and

More information

A convergence result for an Outer Approximation Scheme

A convergence result for an Outer Approximation Scheme A convergence result for an Outer Approximation Scheme R. S. Burachik Engenharia de Sistemas e Computação, COPPE-UFRJ, CP 68511, Rio de Janeiro, RJ, CEP 21941-972, Brazil regi@cos.ufrj.br J. O. Lopes Departamento

More information

A G Ramm, Implicit Function Theorem via the DSM, Nonlinear Analysis: Theory, Methods and Appl., 72, N3-4, (2010),

A G Ramm, Implicit Function Theorem via the DSM, Nonlinear Analysis: Theory, Methods and Appl., 72, N3-4, (2010), A G Ramm, Implicit Function Theorem via the DSM, Nonlinear Analysis: Theory, Methods and Appl., 72, N3-4, (21), 1916-1921. 1 Implicit Function Theorem via the DSM A G Ramm Department of Mathematics Kansas

More information

ON A CLASS OF NONSMOOTH COMPOSITE FUNCTIONS

ON A CLASS OF NONSMOOTH COMPOSITE FUNCTIONS MATHEMATICS OF OPERATIONS RESEARCH Vol. 28, No. 4, November 2003, pp. 677 692 Printed in U.S.A. ON A CLASS OF NONSMOOTH COMPOSITE FUNCTIONS ALEXANDER SHAPIRO We discuss in this paper a class of nonsmooth

More information

Newton Method with Adaptive Step-Size for Under-Determined Systems of Equations

Newton Method with Adaptive Step-Size for Under-Determined Systems of Equations Newton Method with Adaptive Step-Size for Under-Determined Systems of Equations Boris T. Polyak Andrey A. Tremba V.A. Trapeznikov Institute of Control Sciences RAS, Moscow, Russia Profsoyuznaya, 65, 117997

More information

Douglas-Rachford splitting for nonconvex feasibility problems

Douglas-Rachford splitting for nonconvex feasibility problems Douglas-Rachford splitting for nonconvex feasibility problems Guoyin Li Ting Kei Pong Jan 3, 015 Abstract We adapt the Douglas-Rachford DR) splitting method to solve nonconvex feasibility problems by studying

More information

Convergence rates of convex variational regularization

Convergence rates of convex variational regularization INSTITUTE OF PHYSICS PUBLISHING Inverse Problems 20 (2004) 1411 1421 INVERSE PROBLEMS PII: S0266-5611(04)77358-X Convergence rates of convex variational regularization Martin Burger 1 and Stanley Osher

More information

EE 546, Univ of Washington, Spring Proximal mapping. introduction. review of conjugate functions. proximal mapping. Proximal mapping 6 1

EE 546, Univ of Washington, Spring Proximal mapping. introduction. review of conjugate functions. proximal mapping. Proximal mapping 6 1 EE 546, Univ of Washington, Spring 2012 6. Proximal mapping introduction review of conjugate functions proximal mapping Proximal mapping 6 1 Proximal mapping the proximal mapping (prox-operator) of a convex

More information

Decomposition of Riesz frames and wavelets into a finite union of linearly independent sets

Decomposition of Riesz frames and wavelets into a finite union of linearly independent sets Decomposition of Riesz frames and wavelets into a finite union of linearly independent sets Ole Christensen, Alexander M. Lindner Abstract We characterize Riesz frames and prove that every Riesz frame

More information

Proximal-like contraction methods for monotone variational inequalities in a unified framework

Proximal-like contraction methods for monotone variational inequalities in a unified framework Proximal-like contraction methods for monotone variational inequalities in a unified framework Bingsheng He 1 Li-Zhi Liao 2 Xiang Wang Department of Mathematics, Nanjing University, Nanjing, 210093, China

More information

MS&E 318 (CME 338) Large-Scale Numerical Optimization

MS&E 318 (CME 338) Large-Scale Numerical Optimization Stanford University, Management Science & Engineering (and ICME) MS&E 318 (CME 338) Large-Scale Numerical Optimization 1 Origins Instructor: Michael Saunders Spring 2015 Notes 9: Augmented Lagrangian Methods

More information

Contents: 1. Minimization. 2. The theorem of Lions-Stampacchia for variational inequalities. 3. Γ -Convergence. 4. Duality mapping.

Contents: 1. Minimization. 2. The theorem of Lions-Stampacchia for variational inequalities. 3. Γ -Convergence. 4. Duality mapping. Minimization Contents: 1. Minimization. 2. The theorem of Lions-Stampacchia for variational inequalities. 3. Γ -Convergence. 4. Duality mapping. 1 Minimization A Topological Result. Let S be a topological

More information

Proximal Gradient Descent and Acceleration. Ryan Tibshirani Convex Optimization /36-725

Proximal Gradient Descent and Acceleration. Ryan Tibshirani Convex Optimization /36-725 Proximal Gradient Descent and Acceleration Ryan Tibshirani Convex Optimization 10-725/36-725 Last time: subgradient method Consider the problem min f(x) with f convex, and dom(f) = R n. Subgradient method:

More information

Frank-Wolfe Method. Ryan Tibshirani Convex Optimization

Frank-Wolfe Method. Ryan Tibshirani Convex Optimization Frank-Wolfe Method Ryan Tibshirani Convex Optimization 10-725 Last time: ADMM For the problem min x,z f(x) + g(z) subject to Ax + Bz = c we form augmented Lagrangian (scaled form): L ρ (x, z, w) = f(x)

More information

Proximal Newton Method. Ryan Tibshirani Convex Optimization /36-725

Proximal Newton Method. Ryan Tibshirani Convex Optimization /36-725 Proximal Newton Method Ryan Tibshirani Convex Optimization 10-725/36-725 1 Last time: primal-dual interior-point method Given the problem min x subject to f(x) h i (x) 0, i = 1,... m Ax = b where f, h

More information

Inverse problems Total Variation Regularization Mark van Kraaij Casa seminar 23 May 2007 Technische Universiteit Eindh ove n University of Technology

Inverse problems Total Variation Regularization Mark van Kraaij Casa seminar 23 May 2007 Technische Universiteit Eindh ove n University of Technology Inverse problems Total Variation Regularization Mark van Kraaij Casa seminar 23 May 27 Introduction Fredholm first kind integral equation of convolution type in one space dimension: g(x) = 1 k(x x )f(x

More information

A Family of Preconditioned Iteratively Regularized Methods For Nonlinear Minimization

A Family of Preconditioned Iteratively Regularized Methods For Nonlinear Minimization A Family of Preconditioned Iteratively Regularized Methods For Nonlinear Minimization Alexandra Smirnova Rosemary A Renaut March 27, 2008 Abstract The preconditioned iteratively regularized Gauss-Newton

More information

Convergence rates for Morozov s Discrepancy Principle using Variational Inequalities

Convergence rates for Morozov s Discrepancy Principle using Variational Inequalities Convergence rates for Morozov s Discrepancy Principle using Variational Inequalities Stephan W Anzengruber Ronny Ramlau Abstract We derive convergence rates for Tikhonov-type regularization with conve

More information

Some Properties of the Augmented Lagrangian in Cone Constrained Optimization

Some Properties of the Augmented Lagrangian in Cone Constrained Optimization MATHEMATICS OF OPERATIONS RESEARCH Vol. 29, No. 3, August 2004, pp. 479 491 issn 0364-765X eissn 1526-5471 04 2903 0479 informs doi 10.1287/moor.1040.0103 2004 INFORMS Some Properties of the Augmented

More information

Levenberg-Marquardt method in Banach spaces with general convex regularization terms

Levenberg-Marquardt method in Banach spaces with general convex regularization terms Levenberg-Marquardt method in Banach spaces with general convex regularization terms Qinian Jin Hongqi Yang Abstract We propose a Levenberg-Marquardt method with general uniformly convex regularization

More information

The speed of Shor s R-algorithm

The speed of Shor s R-algorithm IMA Journal of Numerical Analysis 2008) 28, 711 720 doi:10.1093/imanum/drn008 Advance Access publication on September 12, 2008 The speed of Shor s R-algorithm J. V. BURKE Department of Mathematics, University

More information

I P IANO : I NERTIAL P ROXIMAL A LGORITHM FOR N ON -C ONVEX O PTIMIZATION

I P IANO : I NERTIAL P ROXIMAL A LGORITHM FOR N ON -C ONVEX O PTIMIZATION I P IANO : I NERTIAL P ROXIMAL A LGORITHM FOR N ON -C ONVEX O PTIMIZATION Peter Ochs University of Freiburg Germany 17.01.2017 joint work with: Thomas Brox and Thomas Pock c 2017 Peter Ochs ipiano c 1

More information

Convex Optimization and l 1 -minimization

Convex Optimization and l 1 -minimization Convex Optimization and l 1 -minimization Sangwoon Yun Computational Sciences Korea Institute for Advanced Study December 11, 2009 2009 NIMS Thematic Winter School Outline I. Convex Optimization II. l

More information

EE 367 / CS 448I Computational Imaging and Display Notes: Image Deconvolution (lecture 6)

EE 367 / CS 448I Computational Imaging and Display Notes: Image Deconvolution (lecture 6) EE 367 / CS 448I Computational Imaging and Display Notes: Image Deconvolution (lecture 6) Gordon Wetzstein gordon.wetzstein@stanford.edu This document serves as a supplement to the material discussed in

More information

A Continuation Method for the Solution of Monotone Variational Inequality Problems

A Continuation Method for the Solution of Monotone Variational Inequality Problems A Continuation Method for the Solution of Monotone Variational Inequality Problems Christian Kanzow Institute of Applied Mathematics University of Hamburg Bundesstrasse 55 D 20146 Hamburg Germany e-mail:

More information

Unconstrained optimization

Unconstrained optimization Chapter 4 Unconstrained optimization An unconstrained optimization problem takes the form min x Rnf(x) (4.1) for a target functional (also called objective function) f : R n R. In this chapter and throughout

More information

On Penalty and Gap Function Methods for Bilevel Equilibrium Problems

On Penalty and Gap Function Methods for Bilevel Equilibrium Problems On Penalty and Gap Function Methods for Bilevel Equilibrium Problems Bui Van Dinh 1 and Le Dung Muu 2 1 Faculty of Information Technology, Le Quy Don Technical University, Hanoi, Vietnam 2 Institute of

More information

20 J.-S. CHEN, C.-H. KO AND X.-R. WU. : R 2 R is given by. Recently, the generalized Fischer-Burmeister function ϕ p : R2 R, which includes

20 J.-S. CHEN, C.-H. KO AND X.-R. WU. : R 2 R is given by. Recently, the generalized Fischer-Burmeister function ϕ p : R2 R, which includes 016 0 J.-S. CHEN, C.-H. KO AND X.-R. WU whereas the natural residual function ϕ : R R is given by ϕ (a, b) = a (a b) + = min{a, b}. Recently, the generalized Fischer-Burmeister function ϕ p : R R, which

More information

Optimization and Optimal Control in Banach Spaces

Optimization and Optimal Control in Banach Spaces Optimization and Optimal Control in Banach Spaces Bernhard Schmitzer October 19, 2017 1 Convex non-smooth optimization with proximal operators Remark 1.1 (Motivation). Convex optimization: easier to solve,

More information

Lecture 25: November 27

Lecture 25: November 27 10-725: Optimization Fall 2012 Lecture 25: November 27 Lecturer: Ryan Tibshirani Scribes: Matt Wytock, Supreeth Achar Note: LaTeX template courtesy of UC Berkeley EECS dept. Disclaimer: These notes have

More information

FETI domain decomposition method to solution of contact problems with large displacements

FETI domain decomposition method to solution of contact problems with large displacements FETI domain decomposition method to solution of contact problems with large displacements Vít Vondrák 1, Zdeněk Dostál 1, Jiří Dobiáš 2, and Svatopluk Pták 2 1 Dept. of Appl. Math., Technical University

More information

UNIQUENESS OF POSITIVE SOLUTION TO SOME COUPLED COOPERATIVE VARIATIONAL ELLIPTIC SYSTEMS

UNIQUENESS OF POSITIVE SOLUTION TO SOME COUPLED COOPERATIVE VARIATIONAL ELLIPTIC SYSTEMS TRANSACTIONS OF THE AMERICAN MATHEMATICAL SOCIETY Volume 00, Number 0, Pages 000 000 S 0002-9947(XX)0000-0 UNIQUENESS OF POSITIVE SOLUTION TO SOME COUPLED COOPERATIVE VARIATIONAL ELLIPTIC SYSTEMS YULIAN

More information

Quasi-Newton Methods

Quasi-Newton Methods Quasi-Newton Methods Werner C. Rheinboldt These are excerpts of material relating to the boos [OR00 and [Rhe98 and of write-ups prepared for courses held at the University of Pittsburgh. Some further references

More information

Nonlinear Optimization for Optimal Control

Nonlinear Optimization for Optimal Control Nonlinear Optimization for Optimal Control Pieter Abbeel UC Berkeley EECS Many slides and figures adapted from Stephen Boyd [optional] Boyd and Vandenberghe, Convex Optimization, Chapters 9 11 [optional]

More information

Local strong convexity and local Lipschitz continuity of the gradient of convex functions

Local strong convexity and local Lipschitz continuity of the gradient of convex functions Local strong convexity and local Lipschitz continuity of the gradient of convex functions R. Goebel and R.T. Rockafellar May 23, 2007 Abstract. Given a pair of convex conjugate functions f and f, we investigate

More information

Extensions of the CQ Algorithm for the Split Feasibility and Split Equality Problems

Extensions of the CQ Algorithm for the Split Feasibility and Split Equality Problems Extensions of the CQ Algorithm for the Split Feasibility Split Equality Problems Charles L. Byrne Abdellatif Moudafi September 2, 2013 Abstract The convex feasibility problem (CFP) is to find a member

More information

Regularization methods for large-scale, ill-posed, linear, discrete, inverse problems

Regularization methods for large-scale, ill-posed, linear, discrete, inverse problems Regularization methods for large-scale, ill-posed, linear, discrete, inverse problems Silvia Gazzola Dipartimento di Matematica - Università di Padova January 10, 2012 Seminario ex-studenti 2 Silvia Gazzola

More information

On Riesz-Fischer sequences and lower frame bounds

On Riesz-Fischer sequences and lower frame bounds On Riesz-Fischer sequences and lower frame bounds P. Casazza, O. Christensen, S. Li, A. Lindner Abstract We investigate the consequences of the lower frame condition and the lower Riesz basis condition

More information

On the Convergence and O(1/N) Complexity of a Class of Nonlinear Proximal Point Algorithms for Monotonic Variational Inequalities

On the Convergence and O(1/N) Complexity of a Class of Nonlinear Proximal Point Algorithms for Monotonic Variational Inequalities STATISTICS,OPTIMIZATION AND INFORMATION COMPUTING Stat., Optim. Inf. Comput., Vol. 2, June 204, pp 05 3. Published online in International Academic Press (www.iapress.org) On the Convergence and O(/N)

More information

The nonsmooth Newton method on Riemannian manifolds

The nonsmooth Newton method on Riemannian manifolds The nonsmooth Newton method on Riemannian manifolds C. Lageman, U. Helmke, J.H. Manton 1 Introduction Solving nonlinear equations in Euclidean space is a frequently occurring problem in optimization and

More information

Lasso: Algorithms and Extensions

Lasso: Algorithms and Extensions ELE 538B: Sparsity, Structure and Inference Lasso: Algorithms and Extensions Yuxin Chen Princeton University, Spring 2017 Outline Proximal operators Proximal gradient methods for lasso and its extensions

More information

A Distributed Newton Method for Network Utility Maximization, II: Convergence

A Distributed Newton Method for Network Utility Maximization, II: Convergence A Distributed Newton Method for Network Utility Maximization, II: Convergence Ermin Wei, Asuman Ozdaglar, and Ali Jadbabaie October 31, 2012 Abstract The existing distributed algorithms for Network Utility

More information

Rapid, Robust, and Reliable Blind Deconvolution via Nonconvex Optimization

Rapid, Robust, and Reliable Blind Deconvolution via Nonconvex Optimization Rapid, Robust, and Reliable Blind Deconvolution via Nonconvex Optimization Shuyang Ling Department of Mathematics, UC Davis Oct.18th, 2016 Shuyang Ling (UC Davis) 16w5136, Oaxaca, Mexico Oct.18th, 2016

More information

Infeasibility Detection and an Inexact Active-Set Method for Large-Scale Nonlinear Optimization

Infeasibility Detection and an Inexact Active-Set Method for Large-Scale Nonlinear Optimization Infeasibility Detection and an Inexact Active-Set Method for Large-Scale Nonlinear Optimization Frank E. Curtis, Lehigh University involving joint work with James V. Burke, University of Washington Daniel

More information

On the Iteration Complexity of Some Projection Methods for Monotone Linear Variational Inequalities

On the Iteration Complexity of Some Projection Methods for Monotone Linear Variational Inequalities On the Iteration Complexity of Some Projection Methods for Monotone Linear Variational Inequalities Caihua Chen Xiaoling Fu Bingsheng He Xiaoming Yuan January 13, 2015 Abstract. Projection type methods

More information

ON ILL-POSEDNESS OF NONPARAMETRIC INSTRUMENTAL VARIABLE REGRESSION WITH CONVEXITY CONSTRAINTS

ON ILL-POSEDNESS OF NONPARAMETRIC INSTRUMENTAL VARIABLE REGRESSION WITH CONVEXITY CONSTRAINTS ON ILL-POSEDNESS OF NONPARAMETRIC INSTRUMENTAL VARIABLE REGRESSION WITH CONVEXITY CONSTRAINTS Olivier Scaillet a * This draft: July 2016. Abstract This note shows that adding monotonicity or convexity

More information

3 Compact Operators, Generalized Inverse, Best- Approximate Solution

3 Compact Operators, Generalized Inverse, Best- Approximate Solution 3 Compact Operators, Generalized Inverse, Best- Approximate Solution As we have already heard in the lecture a mathematical problem is well - posed in the sense of Hadamard if the following properties

More information