arxiv: v4 [math.oc] 29 Jan 2019

Size: px

Start display at page:

Download "arxiv: v4 [math.oc] 29 Jan 2019"

Sylvia Williams
5 years ago
Views:

1 Solution Paths of Variational Regularization Methods for Inverse Problems Leon Bungert and Martin Burger arxiv: v4 [math.oc] 29 Jan 2019 Institut für Numerische und Angewandte Mathematik, Westfälische Wilhelms-Universität (WWU) Münster. Einsteinstr. 62, D Münster, Germany. January 30, 2019 Abstract We consider a family of variational regularization functionals for a generic inverse problem, where the data fidelity and regularization term are given by powers of a ilbert norm and an absolutely one-homogeneous functional, respectively. We investigate the small and large time behavior of the associated solution paths and, in particular, prove finite extinction time for a large class of functionals. Depending on the powers, we also show that the solution paths are of bounded variation or even Lipschitz continuous. In addition, it will turn out that the models are almost mutually equivalent in terms of the minimizers they admit. Finally, we apply our results to define and compare two different non-linear spectral representations of data and show that only one of it is able to decompose a linear combination of non-linear eigenfunctions into the individual eigenfunctions. For that purpose, we will also briefly address piecewise affine solution paths. Keywords: inverse problems, variational methods, solution paths, regularity, finite extinction time, nonlinear spectral theory, nonlinear eigenfunctions, nonlinear spectral decomposition AMS Subject Classification: 49N45, 47J10, 47A52. PACS: Zz, Xx. 1 Introduction A standard approach for approximating solutions of an ill-posed inverse problem Au = f (IP) with possibly noise-corrupted data f consists in variational regularization. To this end, one typically aims at solving the optimization problem min u D(Au, f) + tr(u) (P) where the data fidelity term D enforces Au to be close to f and the regularization functional R incorporates prior knowledge about the solution (e.g. sparsity, smoothness, etc.) into the leon.bungert@wwu.de martin.burger@wwu.de 1

2 model. The real number t > 0 is typically referred to as regularization parameter and balances data fidelity and regularization. One of the most famous examples for (P) within the field of mathematical imaging is the ROF denosing model [37] 1 min u BV(Ω) 2 u f 2 L 2 (Ω) + t TV(u). (ROF) ere, t should be chosen dependent on the noise level of f to obtain a satisfyingly denoised image. In contrast, the parameter t can also be interpreted as an artificial time that steers the solution of (P) from being under-regularized to over-regularized as time increases, or speaking in the ROF context, that successively and edge-preservingly smoothes f until a constant state is reached. In this manuscript we will refer to the maps t {u t : u t solves (P)} and t {Au t : u t solves (P)} as solution path and forward solution path, respectively. Recently, this and similar evolutions, which can be viewed as a non-linear scale space representation of the input f, have been used to define non-linear spectral multiscale decompositions, e.g. [11, 12, 23, 24, 26, 25]. Typically, these decompositions involve computing derivatives with respect to the parameter t of the (forward) solution path wherefore it is interesting to study its regularity. Furthermore, not only in the ROF model but also in general, a very popular choice for the data fidelity in (P) is the squared norm of some ilbert space whereas the regularization functional is often assumed to be absolutely one-homogeneous. owever, apart from some computational and theoretical advantages there is often no substantial justification for preferring such models over others. In particular, one could consider arbitrary powers of a ilbert space norm and of an absolutely one-homogeneous functional J instead which leads to the weighted problem 1 min u α Au f α + t β J(u)β (wp) with weights α, β 1. Note that the multiplicative scalings 1/α and 1/β do not restrict generality since they can be absorbed into t. Indeed there are only few contributions in literature that consider general powers of norms (cf. [27, 7] for a ilbert norm with α = 1 and [39] for error analysis for a Banach norm with fixed α 1) or a different scaling of an absolutely one-homogeneous regularization functional [20]. While such modifications seem only minor at first glance and the resulting models will be equivalent for parameters t in a certain interval, we will see that outside this interval the qualitative behavior of the models differs significantly. In a nutshell, the models disintegrate into four classes, depending on whether α or β are larger or equal than 1. If both parameters equal 1, due to the homogeneity of J, the corresponding problem (wp) becomes contrast invariant, meaning that if u solves (wp) with some f then cu solves the problem where f is replaced by cf and c > 0. Our precise setting in this paper is as follows: Let (X, X ) be the dual space of an separable predual Banach space Y and let (,, ) be a ilbert space with norm :=,. We consider a bounded linear forward operator A : X mapping between these spaces and denote by N (A) and ran(a) its null-space and range. Let furthermore J : X R + {+ } be an absolutely one-homogeneous, weak lower semi-continuous, and proper convex functional, whose null-space and effective domain we denote by N (J) := {u X : J(u) = 0} and dom(j) := {u X : J(u) < }, respectively. For parameters α, β 1, t 0, and given data f we define functionals E α,β t (u; f) := 1 α Au f α + t β J(u)β, u X, (1.1) 2

3 which we aim to minimize. If f ran(a), meaning that there exists u X with Au = f, we assume that u / N (J). This is the only interesting scenario since otherwise u is a minimizer of E α,β t ( ; f) for any t 0. The remainder of this work is organized as follows: We will perform a thorough analysis of the variational problem at hand in an infinite dimensional setting in Section 2. A special emphasis will lie on the small and large time behavior and of the so called solution path and uniqueness of the forward solution path. In Section 3 we briefly demonstrate the equivalence of some classes of the models under consideration. Using these results, Section 4 will deal with regularity of the forward solution path depending on the weights α and β. In Section 5 we will indicate how our results can be used to define non-linear spectral representations. We undertake numerical experiments that illustrate our theoretical findings in Section 6 and conclude with some open questions. Basic notation and relevant notions from convex analysis, as well as fundamental properties of generalized orthogonal complements and projections with respect to the forward operator A are collected in the appendix. 2 Analysis of the variational problem In this section we will provide a basic analysis of the variational problem of minimizing (1.1). We start with fixed t and then proceed towards the behaviour of the solution path for small respectively large t, which can allow for exact penalization respectively finite time extinction. 2.1 Basic properties of the variational problem In the following, we make three assumptions, related to the forward operator A and its interplay with the regularization functional J which we make use of throughout this manuscript: Assumption 1. u A := Au is a norm on N (J) which is equivalent to the restriction of X to N (J). Note that for Assumption 1 to hold it is sufficient to have N (J) N (A) = {0} and dim N (J) < together with an appropriate definition of X which is satisfied in most cases. The second assumption is a generalized Poincaré inequality which assures a weaker form of coercivity of J. To this end we define the map { P A X, : f P A (2.1) (f) := argmin u N (J) Au f, whose well-definedness and important properties are proved in Section B of the appendix. We call this map the A-orthogonal projection onto the null-space of J. Assumption 2. There is C > 0 such that u P A (Au) X CJ(u), u X. Apart from guaranteeing coercivity, this assumption will be utilized to study the small and large time behavior of the solution path. Assumption 3. The operator A is weak -to-weak continuous, that is if (u k ) X is a sequence which weakly converges to some u X, then (Au k ) weakly converges to Au in. 3

4 This assumption is guaranteed if A = B with some bounded linear operator B : Y. owever, in some cases it is not obvious how to ensure this condition. In the following remark we demonstrate how an appropriate choice of the space X can accomplish this. Remark 2.1. In most cases the space X is solely determined by the regularization functional, but in some very mildly ill-posed cases the data fidelity needs to be taken into account as well in order to satisfy the assumptions. The canonical case is indeed TV in multiple dimensions. We define X := BV L 2 with norm X := BV + L 2, choose = L 2, and let A be the continuous embedding operator. A predual of X is given by Y := Z+L 2 where Z = BV. Since weak convergence in X implies in particular weak L 2 -convergence, the embedding X is weak -to-weak continuous. More general, it can be checked that the dual of a sum of Banach spaces equals the intersection of the duals. Now we provide some basic results concerning the minimization problem for the energy functional E α,β t ( ; f). We start with an existence result: Theorem 2.2 (Existence of minimizers). Let Assumptions 1-3 hold. For each f, t > 0, and α, β 1 there exists a minimizer u t of E α,β t ( ; f). Proof. We first note that E α,β t in [0, ) and there is a minimizing sequence. ( ; f) 0 and E α,β t (0; f) <. ence, the infimum exists For coercivity of E α,β t ( ; f) we let (u k ) X with u k X as k. Splitting u k = v k + w k with v k := u k P A (Au k ) and w k := P A (Au k ) N (J), we first assume that v k X. Assumption 2 together with the non-negativity of the fidelity term and (A.8c) then implies that E α,β t obtain u k X C + w k X and hence w k X. We calculate (u k ; f) as desired since t > 0. If, however, v k X < C for C > 0 we Au k f = A(v k + w k ) f Aw k f Av k w k A f A C. Since w k X and w k N (J) for all k N, by means of Assumption 1 we infer that the right hand side diverges which shows coercivity of functional E α,β t ( ; f). Consequently, any minimizing sequence will be bounded in X and, thus, possess a weak convergent subsequence (u kl ) with limit u and Au kl weakly converges to Au in by Assumption 3. The weak lower semi-continuity of E α,β t ( ; f), which follows from weak lower semi-continuity of and weak lower semi-continuity of J, shows that u is a minimizer. Now we turn to optimality conditions for minimizers. In some of the following statements we will utilize the range condition u dom(j) : Au = f (RC) and for convenience we define B 1 := {q : q 1}. Theorem 2.3 (Optimality conditions). Let t > 0 and α, β 1, u t be a minimizer of E α,β t ( ; f). We distinguish between two cases: If u t = u for some u which satisfies (RC), then α = 1 holds necessarily and there is q B1 such that A q p t := tj(u ) β 1 J(u ). (2.2) 4

5 If u t is such that Au t f, it holds p t := where we use the convention 0 0 = 1 if β = 1 and J(u t ) = 0. A (f Au t ) t Au t f 2 α J(u t) β 1 J(u t), (2.3) Proof. Standard results of subgradient calculus [39] allow us to calculate the subdifferential of the energy functional (1.1). Note in particular that u 1 α Au f α is continuous, thus the subgradients of E α,β t ( ; f) are given by the sum of subgradients of 1 α A f α and t β J( )β. By the chain rule for subdifferentials, see [5] for instance, the subdifferential of E α,β t ( ; f) in u dom(j) reads and for any q it holds E α,β t (u; f) = Au f α 1 ( Au f ) + tj(u)β 1 J(u) (2.4) { B q = 1, q = 0, q q, q 0. ence, the optimality condition for u and α > 1 reads 0 E α,β t (u ; f) = tj(u ) β 1 J(u ) which contradicts t > 0 since J(u ) 0, by assumption. Therefore, u cannot be a minimizer for α > 1. Similarly, any minimizer u t for β > 1 satisfies u t / N (J) since otherwise f = Au t held true due to (2.4). This would contradict our non-triviality assumption on the data. Equations (2.2) and (2.3) follow from rewriting the condition 0 E α,β t (u t ; f). Remark 2.4. Due to convexity of E α,β t ( ; f), conditions (2.2) and (2.3) are also sufficient for optimality. Now let us turn to the question of uniqueness. The simplest case is the one of A being injective: Theorem 2.5 (Uniqueness of minimizers). Let A be injective and α > 1. Then minimizers of E α,β t ( ; f) are unique. Proof. This follows from noting that E α,β t ( ; f) is strictly convex under these assumptions. By the following Theorem we infer that in any case the residuals of minimizers are unique for α > 1 or β > 1, even if A is not injective. Theorem 2.6 (Uniqueness of residuals). Let Φ, Ψ : [0, ) [0, ) be increasing and convex, J : X R { } be convex and proper, and u, v X be two minimizers of E t ( ) := D( ) + tr( ) where D( ) := Φ( A f ), R( ) := Ψ(J( )), and t > 0. If Φ or Ψ is strictly convex, then Au f = Av f and J(u) = J(v). Proof. The proof uses standard arguments from convex optimization and will be omitted. 5

6 Remark 2.7. With a little abuse of notation we introduce the following maps R : (0, ) [0, ), t R(t) := Au t f, (2.5) J : (0, ) [0, ), t J(t) := J(u t ), (2.6) where u t is a minimizer of E α,β t ( ; f). Note that we suppress the dependency of R on α and β for concise notation. By Theorem 2.6 the maps R and J are well-defined for α > 1 or β > 1. If α, β = 1, we will use the same expressions for minimizers of E 1,1 t ( ; f) although their values will depend on the individual minimizer, in general. A fairly well-known property (cf. [15], for instance) is that the residual map t R(t) is monotonously increasing whereas the regularizer map t J(t) decreases monotonously. Lemma 2.8. Let 0 < s < t and u s, u t for α, β 1, respectively. Then it holds where the inequalities are strict for α > 1 and A injective. Proof. Since u s and u t are minimizers, we can write denote minimizers of Es α,β ( ; f) and E α,β t ( ; f) R(s) R(t), (2.7) J(s) J(t), (2.8) Es α,β (u s ; f) Es α,β (u t ; f) (2.9) E α,β t (u t ; f) E α,β t (u s ; f). (2.10) Adding these inequalities, subtracting 1 α (R(s)α + R(t) α ) on both sides, and multiplying with β yields sj(s) β + tj(t) β sj(t) β + tj(s) β or equivalently, since s < t: ence, using (2.9) we obtain which implies J(t) J(s). (2.11) 1 α R(s)α + s β J(t)β Es α,β (u s ; f) Es α,β (u t ; f) = 1 α R(t)α + s β J(t)β R(s) R(t). If α > 1 and A is injective, inequalities (2.9), (2.10), and (2.11) become strict due to uniqueness of minimizers. This concludes the proof. 2.2 Behaviour for small time Obviously, for t = 0 any u fulfilling (RC) is a minimizer of E α,β 0 ( ; f). In this section we consider the special case α = 1 where such u can be a solution for small t > 0, as well, a phenomenon that is called exact penalization. We shall assume that (RC) holds and impose the following source condition: J(u ) p J(u ) q : p = A q. (SC) 6

7 Needless to say, since J(u ) 0, any such q fulfilling (SC) is also different from zero. Furthermore, any u fulfilling range and source condition is a J-minimizing solution of Au = f, according to [14], i.e., J(u ) J(u) for all u X with Au = f. In particular, the (positive) value J(u ) does not depend on the choice of u fulfilling as (RC) and (SC) and will be denoted by J min, in the sequel. It is obvious from (2.2) that (SC) is necessary for u being a minimizer for t > 0. Indeed, the source condition is also sufficient. To show this, we start with the following lemmas. Lemma 2.9. Let conditions (RC) and (SC) hold true. Then s given by { } s := inf u X : (RC), (SC) hold inf q : q, A q J(u ) (2.12) fulfills 0 < s <. Proof. Let us now assume that there is a sequence (u k ) X fulfilling conditions (RC) and (SC) and a corresponding sequence of source elements (q k ) with A q k J(u k ) for all k such that lim k q k = 0. In this case we calculate 0 < J min = J(u k ) = A q k, u k = q k, f f q k 0, k which is a contradiction. Finally, assumptions (RC) and (SC) imply that the admissible sets in (2.12) are non-empty and hence s <. Lemma Under the conditions of Lemma 2.9 the infimum is attained, i.e., there is û dom(j) fulfilling Aû = f and ˆq with A ˆq J(û) such that ˆq = s. Proof. Let (u k ) X fulfilling (RC) and (q k) such that A q k J(u k ), for every k N, be a minimizing sequence. By Assumption 2 we infer u k PA (Au k ) X CJ(u k ) = CJ min <, k N. ) ence, (u k PA (Au k ) is bounded in X and admits a subsequence weakly, which we denote with the same index, converging to some h X. As P(Au k ) = PA (f) holds for all k N, we obtain that (u k ) converges to û := h + PA (f). Using again that Au k = f, this implies that f = Aû. Furthermore, by the lower semi-continuity of J, we infer that û dom(j). ence, we have shown that the limit of (u k ) fulfills (RC). Similarly, being a minimizing sequence, (q k ) is bounded in and a subsequence weakly converges to some ˆq. It holds (after another round of subsequence refinement) A ˆq, û = ˆq, f = lim k q k, f = lim k A q k, u k = lim k J(u k ) J(û), using the lower semi-continuity of J. On the other hand, one clearly has J(u k ) = J min J(û), for all k N since û satisfies (RC). This shows A ˆq, û = J(û). 7

8 Furthermore, from A q k A ˆq, u = q k ˆq, Au, u X, and the weak convergence of (q k ) to ˆq we infer that (A q k ) weakly converges to A ˆq in X. Since the sequence (A q k ) lies in K = J(0) which is weakly closed (cf. [22]), also A ˆq K holds. Summing up, we have shown that A ˆq J(û), as desired. As a consequence of Lemmas 2.9 and 2.10 we obtain Theorem Under the conditions of Lemma 2.9 there is a minimizer u t of E 1,β t ( ; f) fulfilling Au t = f if and only if t t, where and s is given by (2.12). t := J 1 β min (2.13) s Proof. Let t t and choose û and ˆq as in the proof of Lemma Defining p := A ˆq, q := tj β 1 min ˆq we find that A q + tj β 1 min p = 0 and q β 1 = tjmin ˆq t J β 1 min s = 1. Consequently, by the optimality conditions (cf. Theorem 2.3 and Remark 2.4) it follows that û is a minimizer of E 1,β t ( ; f). On the other hand, let u, fulfilling (RC) and (SC), be a minimizer of E 1,β t ( ; f). By (2.2) from Theorem 2.3 there are q B1 and 0 p J(u ) such that A q + tj β 1 min p = 0 or equivalently ( ) p = A q. ence, it holds by definition of s that q s tj β 1 min tj β 1 min = t 1 s J β 1 min = t. Note that in the second part of the proof the source condition follows directly from the optimality condition and does not have to be imposed. Next we show that for t < t the forward solution path (and hence the residual) is uniquely determined: Theorem Let (RC) and (SC) hold. Every minimizer u t of E 1,β t ( ; f) for 0 < t < t fulfills Au t = f. Proof. Suppose u t is a minimizer for 0 < t < t and Au t f. Then Au t f + t β J(u t) β t β J(u ) β, where u fulfills range and source condition. ence, multiplication with t /t > 1 yields Au t f + t β J(u t) β < t β J(u ) β which contradicts that u is a minimizer of E 1,β t ( ; f). In order to maintain a concise notation, for the rest of this manuscript we will define t := 0 if α > 1 or if conditions (RC) and (SC) fail to hold. 8

9 2.3 Behaviour for large time In the following, we investigate the behavior for t sufficiently large where we expect that u t behaves the like a solution of inf Au f, (2.14) u N (J) which is the A-orthogonal projection of f onto N (J), introduced in (2.1). We refer to Section B of the appendix for further details. Closely related is the notion of the A-orthogonal complement of a subset U X (cf. Definition B.2). We will use this definition with U = N (J), in the following. Remark It is important to remark that for U X being a subspace, the two subspaces (AU) and AU,A of do not coincide, in general. In particular, being an orthogonal complement of a subspace, the former is always closed whereas the latter is not. owever, it is obvious that AU,A (AU) holds true. Remark Note that for X = and A = id it holds that P A = P, i.e., the minimizer of (2.14) coincides with the orthogonal projection on N (J) which fulfills f P(f), P(f) = 0. But even in our more general setting one can obtain properties for P A which resemble the classical ones for orthogonal projections in ilbert spaces. These are subsumed in Proposition B.4 and will be needed to obtain finite extinction time of minimizers of E α,β t ( ; f) with β = 1, meaning that there is T > 0 such that all minimizers for t > T coincide with P A (f). owever, first we will prove a weaker statement, namely that minimizers of E α,β t ( ; f) converge to P A (f) as t tends to infinity. Theorem Let (t k ) (0, ) be a sequence tending to infinity and u tk be a minimizer of E α,β t k ( ; f). Then (u tk ) weakly converges to u := P A (f) in X as k. Proof. Since E α,β t k (u tk ; f) E α,β t k (u ; f), we obtain 1 β J(u t k ) β Au f α, α t k (2.15) Au tk f Au f, (2.16) and, in particular, J(u tk ) 0 as k. Furthermore, for k large enough it holds E α,β 1 (u tk ; f) E α,β t k (u tk ; f) and since the functional E α,β 1 ( ; f) is coercive, (u tk ) is bounded in X. This implies the existence of a weakly convergent subsequence (denoted with the same indices) with limit u. Again, by Assumption 3, this implies that Au tk weakly converges to Au in. Due to weak closedness, u is an element of N (J). Consequently, we can calculate, using weak lower semi-continuity of the norm in and (2.16): Au f lim inf k Au t k f Au f. Since u is the unique minimizer of (2.14), this implies that u = u. The same argument holds true for all cluster points of (u tk ) which shows convergence of the whole sequence. In order to obtain a finite extinction time, one has to demand the Poincaré-type inequality of Assumption 2 and β = 1. We define Et α ( ; f) := E α,1 t ( ; f). 9

10 Theorem Let β = 1. Under Assumption 2 it holds that S(f) := sup u N (J),A J(u)=1 f, Au < and for t t, given by S(f) t := f AP A (f) 2 α (2.17) if f AP A (f) and t = 0 else, it holds that u t = P A (f) is a minimizer of Et α ( ; f). Moreover, for t > t this is the unique minimizer. Conversely, if P A (f) is a minimizer of Et α ( ; f), then t t. Proof. Using Assumption 2 and P A (u) = 0 for u N (J),A we calculate Now let S(f) f sup u N (J),A J(u)=1 Au C f A sup J(u) = C f A <. J(u)=1 p t := A (f AP A (f)) t P A (f) f 2 α. Then for any u N (J) we have p t, u = 0 = J(u) which holds in particular for u = P A (f). For arbitrary u X with J(u) 0 we have p t, u = = 1 t f AP A (f) 2 α f AP A 1 (f), Au = t f AP A (f) 2 α f, Au AP A (Au) J(u P A (u)) t f AP A (f) 2 α f, A u PA (u) J(u P A J(u PA (u)) (u)) t f AP A (f) 2 α S(f) J(u P A (u)) = J(u), using t t and (A.8c) as well as self-adjointness of P A (cf. Prop. B.4). Thus, p t J(P A (f)) and the optimality condition (2.3) is satisfied for u t = P A (f). Assume that there exists another minimizer u for t > t. Then 1 α Au f α + tj(u) 1 α APA (f) f α which implies 1 α Au f α + t J(u) < 1 α APA (f) f α and contradicts the minimization property of P A (f) for Et α ( ; f). Let us now assume that P A (f) is a minimizer. In this case, the optimality condition implies that p t := A (f AP A (f)) t f AP A (f) 2 α J(P A (f)). ence, using p t, u J(u) for all u X, we can estimate t = sup u N (J),A,J(u)=1 f, Au f AP A (f) 2 α = t sup p t, u t u N (J),A J(u)=1 which yields the assertion. = sup u N (J),A,J(u)=1 f AP A (f), Au f AP A (f) 2 α 10

11 Example If X = = R n equipped with the Euclidean inner product, A = id, and J is an arbitrary norm on, one obtains P A = P = 0 and, thus, Assumption 2 always holds true due to the equivalence of norms on finite dimensional vector spaces. If = X = L 2 (Ω), A = id, J is the total variation extended with infinity on L 2 (Ω)\BV(Ω), and Ω R n, Assumption 2 is just the Poincaré inequality for BV-functions. ere P A (u) = 1 Ω Ω u dx is the mean value of u over Ω. Summing up the results of the last two sections, the critical time t > 0 can exist only if α = 1 whereas t < requires β = 1. In more generality, one can easily extend these results to models of the type Φ( Au f ) + tψ(j(u)) with convex and differentiable functions Φ and Ψ. In this case, the critical times can appear only if Φ (0) or Ψ (0), respectively, are positive. 2.4 Uniqueness of the forward solution path Let us now prove that for each time t > 0 the forward solution path t Au t is uniquely determined if α > 1 or β > 1. Not surprisingly, this follows from the uniqueness of the residuals. Theorem 2.18 (Uniqueness of the forward solution path I). Let α > 1 or β > 1. Then the set {Au t : u t argmin E α,β t ( ; f)} is a singleton for t > 0. Proof. Let us first consider the case t > 0. Then necessarily α = 1 and β > 1 holds and by Theorem 2.6 we infer that every minimizer of E t ( ; f) for 0 < t t has the same residual. Since there is a minimizer with zero residual this has to holds for all minimizers, as well, and this implies that the forward solution path for 0 < t t coincides with the set {f}. Let us now turn to the case t > t. We use the optimality condition (2.3) for two minimizers u 0, u 1 with Au 0, Au 1 f to obtain 0 = A Au i f Au i f 2 α + tj(u i ) β 1 p i, where p i J(u i ) and i = 0, 1. Subtracting these equalities yields 0 = A Au 1 f Au 1 f 2 α A Au 0 f Au 0 f 2 α + t(j(u 1 ) β 1 p 1 J(u 0 ) β 1 p 0 ). By Theorem (2.6) we know that both the residuals and the values of the regularizer are unique and, hence, we can use the maps R and J from (2.5) and (2.6) to write 0 = A Au 1 f R(t) 2 α A Au 0 f R(t) 2 α + tj(t)β 1 (p 1 p 0 ). Multiplying with R(t) 2 α, taking a duality product with u 1 u 0 and using the non-negativity of the symmetric Bregman distance, we infer Au 1 f (Au 0 f), Au 1 Au 0 0. which is equivalent to Au 1 Au and shows Au 0 = Au 1. 11

12 It remains to study what happens for α = β = 1. Since in this case both the data fidelity and the regularizing term of the energy functional (1.1) are not strictly convex, one cannot expect uniqueness of the forward solution path for parameters t [t, t ]. owever, for values of t where non-uniqueness occurs, we are able to confine the set of possible forward solutions to a one-parameter family. Theorem 2.19 (Uniqueness of the forward solution path II). Let t t. Then it holds {Au : u argmin E 1,1 t ( ; f)} {f + c(aû f) : c 0}, (2.18) where û is an arbitrary minimizer of E 1,1 t ( ; f) fulfilling Aû f. Proof. The only non-trivial case is Au f since otherwise c = 0 can be chosen in (2.18). As before, we obtain by subtracting the optimality conditions (2.3) of u and û that 0 = A Au f A Aû f + t(p ˆp), (2.19) Au f Aû f where p and ˆp denote the corresponding subgradients. We shortcut w := Au f and ŵ := Aû f, multiply with u û, and use the non-negativity of the symmetric Bregman distance to obtain 0 w w ŵ, w ŵ = w ŵ + ŵ w, ŵ w + ŵ 0, (2.20) w ŵ where the second inequality follows from Cauchy-Schwarz. This immediately implies w, ŵ = w ŵ which is only possible if w = cŵ with c 0. ence, we obtain Au f = c(aû f) (2.21) which is equivalent to Au = f + c(aû f). This closes the proof. Remark Note that in case c 1, which corresponds to non-uniqueness of the forward solution, equation (2.21) can be rewritten as ( ) u cû A = f, (2.22) 1 c which means that in case of non-uniqueness one can construct an element from the two minimizers which fulfills the range condition (RC). This is a counter-intuitive behavior since one would not expect the two regularized solutions to carry sufficiently much information to allow for the exact reconstruction of the datum f. Indeed, if f / rana which can be interpreted as noisy data equation (2.22) is a contradiction and, hence, the forward solution path is unique in this case. Despite the considerations of the previous remark, on cannot expect uniqueness of the forward solution path, in general. This will be illustrated in the following example. ( ) 2 1 Example Let X = = R 2, A =, f = ( 2, 3 ) T and J(u) = u 1 = 1 1 u 1 + u 2. Then the forward solution path is not unique in t = 1/ 13 and in t = 1. This 12

13 can be seen as follows: It is well-known that the subdifferential of the 1-norm is given by the multivalued signum function, i.e, for u R n it holds 1, u i > 0, ( u 1 ) i = [ 1, 1], u i = 0, i = 1,..., n. 1, u i < 0, In addition, since A is invertible, the vector u := ( 1, 4 ) T is the unique vector to fulfill Au = f. ence, u 1 = p = ( 1, 1 ) T and q = (A ) 1 p = A 1 p = ( 2, 3 ) T is the unique source element. This implies that t = 1/ q 2 = 1/ 13 > 0. It can be easily checked using the optimality condition (2.3) that all members of the family ( ) ( ) [ 1 5 u λ := λ, λ 0, 1 ], are minimizers for t = t and, similarly, that all members of ( ) ( ) 1 1 u λ := λ, λ [1, 2], 4 2 are minimizers for t = 1. ence, due to the invertibility of A, also the corresponding forward solution paths are not unique. The strategy to find such non-unique solutions is using the ansatz Au λ = f λq (cf. (2.21)), where A q is a subgradient of u λ for λ in a suitable interval. Furthermore, since A is invertible, we can use the change of variables v = Au to obtain min Au f u R t u 1 = min v f + t A 1 v v R 2 1. ence, we can have non-uniqueness even if the forward operator is trivial, i.e., equals the identity. An important consequence of the uniqueness of the forward solution is the continuity of the residual map t R(t). Corollary 2.22 (Continuity of the residuals). Let α > 1 or β > 1. Then the map t R(t) is continuous for all t > 0. Proof. The continuity follows from a straightforward generalization of the proof of Claim 3 in [17], using that A is weak -to-weak continuous and the ilbert norm is weak lower semicontinuous. From the uniqueness of the forward solution path and the residuals we immediately obtain Corollary Under the conditions of Theorem 2.18 it holds: 1. For every t > t the subgradient p t from the optimality conditions (2.2) and (2.3) is uniquely determined. 2. If A is injective, then uniqueness of the forward solution path implies uniqueness of the solution path t u t. 13

14 3 Relation of the problems In this section, we will deal with the mutual relation of minimizers of E α,β t ( ; f) for different values of α and β. The structure of the subgradient (2.3) suggests that as long as Au t f, J(u t ) 0, one can switch back and forth between minimizers corresponding to different choices of the exponents α, β by adapting the regularization parameter t. Foreshadowing, one has one-to-one correspondences of all minimizers within the critical parameter range (t, t ) where t and t can attain the values 0 or, respectively. For instance, minimizers of E 1,2 t ( ; f) for t (t, ) correspond exactly to those of Eτ 2,1 ( ; f) for τ (0, τ ). Exemplary, we will prove this equivalence for minimizers of E α,1 t ( ; f) with α 1 and Eτ 2,1 ( ; f), the latter being the standard variational problem with squared norm and one-homogeneous regularization. Since both models possess finite extinction time due to β = 1, we will obtain full equivalence for t (t, ) and τ (0, ). Note that in the following, the expression u t will correspond to minimizers of E α,1 t ( ; f) whereas v τ will only be used for minimizers of ( ; f). In particular, R(t) = Au t f and R(τ) = Av τ f denote the respective E 2,1 τ residuals and are not to be confused. Furthermore, we remind of the fact that the residual Au t f is not uniquely determined if α = 1. By the optimality condition (2.3) we obtain the following two lemmas. Lemma 3.1. Let t > t and u t be a minimizer of E α,1 t ( ; f). Then u t is a minimzer of Eτ 2,1 ( ; f) with τ := tr(t) 2 α. Lemma 3.2. Let τ > 0 and v τ be the minimizer of Eτ 2,1 ( ; f). Then v τ is also a minimizer of E α,1 t ( ; f) with t := T (τ) := τr(τ) α 2. Theorem 3.3. The map T : (0, ) (0, ), τ T (τ) := τr(τ) α 2 is well-defined, nondecreasing, and surjective. If α > 1, it is even a bijection with continuous inverse S(t) := tr(t) 2 α. Proof. Since by Theorem 2.6 the residuals of minimizers with strictly convex data term are unique, map T is well-defined. By Corollary 2.22 T is continuous. Let us first consider the case α > 1. Then similarly S is well-defined and continuous. Furthermore, it follows from the uniqueness of the residuals that S and T are mutual inverses. Finally, T is non-decreasing which can be seen as follows. For α 2 this is obvious as T is the product of non-decreasing functions (cf. Lemma 2.8). For α (1, 2) the same holds true for S. Since they are inverses, shows that both T and S are increasing for arbitrary α > 1. Let us now address the case α = 1. As we have seen, the residuals R(t) are not unique in general and therefore, the map S(t) := tr(t) is not well-defined. owever, by Lemmas 3.1 and 3.2 we infer that T is still surjective. Furthermore, being the pointwise limit of the increasing functions τr(τ) α 2 for α 1 shows that T (τ) = τ/r(τ) is non-decreasing. Remark 3.4. Since S and T are continuous on the positive reals for α > 1, non-decreasing, and bounded from below, they can be continuously extended to [0, ). Theorem 3.5. Non uniqueness of the forward solution of E 1,1 t ( ; f) in some t > 0 is in oneto-one correspondence to an affine forward solution path of the form f τq of Eτ 2,1 ( ; f) for τ [τ 0, τ 1 ] where 0 τ 0 < τ 1. 14

15 Proof. Let us first assume that f τq is the forward solution path of E 2,1 τ ( ; f) for τ [τ 0, τ 1 ] and 0 τ 0 < τ 1. Then q 0 holds since f cannot be a minimizer for any positive value of τ. ence the time reparametrization T reduces to T (τ) = τ R(τ) = τ = 1, τ q q which means by Lemma 3.2 that f τq is also the forward solution of E 1,1 t ( ; f) for t := 1 q. Since τ runs in a proper interval this implies the non-uniqueness of the forward solution in t. Conversely, let us assume that the forward solution of E 1,1 t ( ; f) is not unique. Then there exist u 0, u 1 argmin E 1,1 t ( ; f) such that Au 0 Au 1. Due to convexity also the convex combinations u λ := (1 λ)u 0 + λu 1 for λ [0, 1] are minimizers of E 1,1 t ( ; f) and it holds Au λ f = (1 λ)(au 0 f) + λ(au 1 f). (3.1) We distinguish two cases. If Au 0 = f and Au 1 f (this corresponds to t = t ) we have If, however, Au 0, Au 1 f we can use (2.21) to write Au λ f = λ(au 1 f). (3.2) Au 1 f = c(au 0 f) for some c (0, ) \ {1} which, together with (3.1), implies Au λ f = (1 + λ(c 1))(Au 0 f). (3.3) In any case, we can define numbers τ 0 := t Au 0 f 0 and τ 1 := t Au 1 f. In the first case we have 0 = τ 0 < τ 1 and in the second case after possibly exchanging the roles of u 0 and u 1 we can assume c > 1 such that 0 < τ 0 < τ 1 holds. Next, we observe that, due to Lemma 3.1, u λ is a minimizer of Eτ 2,1 ( ; f) with τ := t Au λ f. By using (3.2) or (3.3), respectively, we infer that in both cases it holds τ = (1 λ)τ 0 + λτ 1 [τ 0, τ 1 ] which is equivalent to λ = (τ τ 0 )/(τ 1 τ 0 ) [0, 1]. Plugging this expression for λ into (3.2) or (3.3), respectively, the forward solution of Eτ 2,1 ( ; f) in τ [τ 0, τ 1 ] is given by ( ) f Au1 f τ in both cases, which follows after some algebra and allows us to conclude. Corollary 3.6. The forward solution of E 1,1 t ( ; f) is uniquely determined for almost every t > 0. Proof. According to Theorem 3.5, non-uniqueness implies a forward solution path of the form f τq of the problem Eτ 2,1 ( ; f) for τ [τ 0, τ 1 ] which implies that T (τ) = t is constant for τ [τ 0, τ 1 ]. Since the set {t R : {τ : T (τ) = t} > 0} has Lebesgue measure zero, we can conclude. The relation of the problems for different values of α and β can also be interpreted in terms of Bayesian models for inverse problems (cf. [41]). Under appropriate conditions, E α,β t ( ; f) can be interpreted as the Onsager-Machlup functional of a posterior distribution and its 15 τ 1

16 minimizer is the maximum a-posteriori probability (MAP) estimate (cf. [28, 1]). In the finitedimensional case the posterior density is often simply modeled as p(u f) exp( ce α,β t (u; f)). In practice, α is determined from the noise modelling, while one usually chooses β = 1 based on the standard formulation of the variational problem. Essentially, the posterior distribution is extrapolated from the collection of MAP estimates, in practice. owever, the equivalence of the minimization problems for different β shows that there is a variety of posterior distributions leading to the same MAP estimates for any f. The behaviour of the posterior however can differ strongly, in particular in degenerate cases such as BV (cf. [18, 33, 13]). 4 Regularity of the forward solution path In this section, we investigate regularity of the forward solution path t {Au t : u t argmin E α,β t ( ; f)} which we have shown to be a single-valued map for α > 1 or β > 1 in the previous section. As already mentioned, when using the minimization of (1.1) for obtaining non-linear spectral decompositions of the data f, one typically computes derivatives of the (forward) solution path with respect to t. While these solution paths can be shown to be sufficiently regular under some finite dimensional assumptions (cf. the discussion in Section 5.2.2), a general study of their regularity in a Banach or ilbert space setting is still pending. Our results are a first contribution in this direction and the topic will remain subject of future research. 4.1 Lipschitz continuity of the forward solution path, α > 1 or β > 1 If the range condition is fulfilled and the datum f meets a source condition, one can obtain an upper bounds for the growth of the residual map t R(t). This can also be interpreted as ölder continuity of the forward solution path close to zero in the case α > 1. First, we state a preparatory lemma. Lemma 4.1. Let α, β 1, and u t be a minimizer of E α,β t ( ; f) for t > t. If (RC) and (SC) hold, then q t, defined as q t := f Au t tr(t) 2 α, (4.1) J(t) β 1 fulfills } q t min {s, R(t)α 1 tj(t) β 1. (4.2) Proof. By the optimality conditions (2.3) we infer that p t := A q t J(u t ). Furthermore, letting ˆq and û be such that p 0 := A ˆq J(û) and ˆq = s, we calculate 1 q t ˆq, q t = tr(t) 2 α J(t) β 1 q 1 t ˆq, Au t f = tr(t) 2 α J(t) β 1 p t p 0, u t u 0, which is equivalent to q t 2 ˆq, q t. With Cauchy-Schwarz this implies q t ˆq = s. The other upper bound is trivial. 16

17 Now we are ready to prove the growth bounds of the residuals. Note that the growth in zero can be estimated more sharply when demanding the source condition (SC). Lemma 4.2. Let (RC) hold true. It holds for all t > t Under condition (SC) and if α > 1 it even holds R(t) t 1 α J β α min. (4.3) R(t) t 1 1 α 1 s α 1 Proof. From the optimality condition (2.3) for u t we obtain 0 = R(t) α 2 A (Au t f) + tj(t) β 1 p t, β 1 α 1 Jmin. (4.4) where p t J(u t ). Reordering yields A A(u t u ) = tr(t) 2 α J(t) β 1 p t and by taking the duality product with u t u we obtain R(t) 2 = Au t f 2 =tr(t)2 α J(t) β 1 p t, u u t tr(t) 2 α J(t) β 1 (J min J(u t )) tr(t) 2 α J β min, where we used that J is decreasing (cf. Lemma 2.8) and J(u t ) 0. Given (SC), we define q t as in (4.1) and use Lemma 4.1 to write p t, u u t = A q t, u u t q t, f Au t q t R(t) s R(t) which can be used to obtain the second inequality if α > 1. Lemma 4.3. Let α, β 1. For t < s < t the estimate Au t Au s tj(t)β 1 R(t) 2 α sj(s) β 1 R(s) 2 α tj(t) β 1 R(t) α 1 (4.5) holds, where u t and u s are minimizers of E α,β t ( ; f) and Es α,β ( ; f), respectively. Proof. Defining t := tj(t) β 1 R(t) 2 α and s analogously, we obtain from the optimality conditions for p t and p s given by (2.3): 1 s A A(u t u s ) + p t p s = t s s t A (Au t f). Taking a duality product with u t u s and using non-negativity of the symmetric Bregman distance yields 1 s Au t Au s 2 t s s t Au t f, Au t Au s. Applying the Cauchy-Schwarz inequality to the right hand side and simple reordering, leads to Au t Au s t s R(t) = t Plugging in the definitions of t and s concludes the proof. t s tj(t) β 1 R(t)α 1. 17

18 Up to now our results included the case α = β = 1. Since, however, in this case the forward solution path is not even uniquely defined, one cannot expect continuity properties. Thus, the following statements will always require that one of the weights α and β is larger than one. In particular, the maps R(t) and J(t) are well-defined in that case. Corollary 4.4 (Continuity of the forward solution path). If α > 1 or β > 1, estimates (4.5) together with the continuity of the residuals and the regularizers (cf. Corollary 2.22) shows that the forward solution path t Au t is continuous for all t > t. Continuity in t = t follows from the continuity of the residuals. For α 2 one even obtains Lipschitz estimates of the forward solution path. Once more, the estimates close to zero can be improved by assuming the source condition. Lemma 4.5. Let α 2 and β 1. For 0 < s < t the estimate Au t Au s t s R(t) t s f t t (4.6) holds, where u t and u s are minimizers of E α,β t ( ; f) and Es α,β ( ; f), respectively. Under conditions (RC) or (SC) this estimate can be improved to or Au t Au s C R t s respectively, with constants C R := J β α min and C S := s Proof. We start from (4.5): t α 1 α (4.7) t s Au t Au s C S, (4.8) 1 α 1 t α 2 α 1 β 1 α 1 Jmin. Au t Au s tj(t)β 1 R(t) 2 α sj(s) β 1 R(s) 2 α tj(t) β 1 R(t) α 1 ( ) β 1 ( ) 2 α t s J(s) R(s) J(t) R(t) = R(t). t If we now use that α 2 and that for s < t it holds J(t) J(s) and R(s) R(t), we obtain Au t Au s t s R(t) t s f t t. ere we employed the a-priori estimate R(t) f which follows from E α,β t (0; f). Under (RC) or (SC) one uses Lemma 4.2 to further estimate R(t). E α,β s Using this lemma, we can deduce two regularity statements. (u t ; f) Lemma 4.6. Let α 2 and β 1. Then the maps t R(t) and t J(t) are Lipschitz continuous on (δ, ) for all δ > 0. 18

19 Proof. The first assertion is an immediate consequence of the reverse triangle inequality: R(t) R(s) = Au t f Au s f Au t Au s. Since by estimates (4.6), (4.7), or (4.8) the forward solution path is Lipschitz, the same holds for R. For the second claim, let 0 < s < t and let u s and u t denote corresponding minimizers. Thus, it holds β sα R(s)α + J(s) β β sα R(t)α + J(t) β from which we deduce J(s) β J(t) β = J(s) β J(t) β β sα (R(t)α R(s) α ). Since R( ) is locally Lipschitz on (δ, ) for all δ > 0, the same holds for R( ) α and for t J(t) β. Applying the β-th root, preserves local Lipschitz continuity away from zero and hence we can conclude. Theorem 4.7 (Lipschitz continuity of the forward solution path I). Let α 2, β 1. The forward solution path t Au t is Lipschitz continuous on (δ, ) for all δ > 0. ence, (Au t ) exists almost everywhere in (δ, ) and it holds If (RC) or (SC) is fulfilled, it holds (Au t ) R(t) t (Au t ) C R t 1 α α or f. (4.9) t (Au t ) C S t 2 α α 1, (4.10) respectively, for almost every t (0, ). Furthermore, if α = 2 and assuming conditions (RC) and (SC), the Lipschitz continuity becomes global on [0, ). Proof. Lipschitz continuity of t Au t is a direct consequence of estimate (4.6). Since, being a ilbert space, has the Radon-Nikodym property (cf. [40], for instance), we can deduce from a generalization of Rademacher s theorem [4, 30] that (Au t ) exists almost everywhere on (0, ). Estimates (4.9) and (4.10) are direct consequences of (4.6), (4.7) and (4.8). Global Lipschitz continuity on the whole real line for α = 2 follows from (4.8). In order to proceed to the case 1 α < 2, where α = 1 requires β > 1, we use the relation between the different formulations established above. For simplicity, we will only consider the case 1 < α < 2 and β = 1. Defining τ T (τ) = τr(τ) α 2 as in Theorem 3.3, one observes that, due to Lemma 4.6, function T is Lipschitz continuous on (δ, ) for all δ > 0. ence, its derivative T exists almost everywhere in (δ, ) and it holds T (τ) = d dτ τr(τ)α 2 = τ(α 2)R(τ) α 4 Av τ f, (Av τ ) + R(τ) α 2. (4.11) ere, we used that also R (τ) exists almost everywhere according to Lemma 4.6, and can be computed with the chain rule: R (τ) = Av τ f, (Av τ ) /R(τ). (4.12) 19

20 Thus, T is positive if and only if τ(2 α)r(τ) 2 Av τ f, (Av τ ) < 1. For 1 < α < 2, this inequality is true due to Cauchy-Schwarz and estimate (4.9) which can be used to bound (Av τ ). ence, in that case also S, the inverse of T, is a Lipschitz function. Consequently, we obtain Lipschitz continuity for minimizers u t of E α,1 t ( ; f) with 1 < α < 2 since Au t = Av S(t) is a composition of Lipschitz functions. By setting T (τ) = τr(τ) α 2 J(τ) 1 β this argument can easily be repeated for β > 1, which makes the calculations more cumbersome but leads to the same results. In this case, also α = 1 and β > 1 yields the desired Lipschitz continuity. ence, the assumption α 2 in Lemma 4.6 and Theorem 4.7 can be relaxed to α > 1 or β > 1 without loosing Lipschitz continuity or differentiability of the forward solution path. owever, estimates (4.9) and (4.10) need to be adapted. To keep the presentation short, we only formulate the estimates for β = 1. Theorem 4.8 (Lipschitz continuity of the forward solution path II). Let 1 α < 2 and β 1 such that α > 1 or β > 1. The forward solution path t Au t is Lipschitz continuous on (δ, ) for all δ > 0. Furthermore, (Au t ) exists almost everywhere in (δ, ) and it holds for almost all t (0, ), 1 < α < 2, and β = 1 If (RC) or (SC) is fulfilled, it holds (Aut ) 1 R(t) 1 α 1 t α 1 (Aut ) 1 α 1 C Rt 1 α α respectively, for almost every t (0, ). or f. (4.13) t (Aut ) 1 α 1 C St 2 α α 1, (4.14) Proof. For simplicity, we only consider the case β = 1. It remains to prove the bound (4.13). To this end, we let u t denote a minimizer of E α,1 t ( ; f). Then it holds according to the previous results that u t = v τ with τ = S(t) and with the chain rule together with (4.9) we obtain Now from S(t) = tr(t) 2 α we find that (Aut ) (Avτ ) S (t) R(τ) S (t). τ S (t) = S (t) = t(2 α)r(t) α Au t f, (Au t ) + R(t) 2 α = R(t) α [ t(2 α) Au t f, (Au t ) + R(t) 2]. Consequently, if we use R(τ) = Av τ f = Au t f = R(t), the definition of S, and τ = S(t) we infer (Au t ) R(t) tr(t) 2 α S (t) = 1 [ t(2 α) Aut f, (Au t ) + R(t) 2] tr(t) (2 α) (Au t ) + R(t). t Reordering yields the first inequality in (4.13), from where on we proceed as before. 20

21 4.2 Bounded variation of the forward solution path, α = β = 1 Using the equivalence of the problems together with the Lipschitz regularity of minimizers of the quadratic problem one can at least show that the forward solution path t Au t for α = β = 1, which is well-defined almost everywhere according to Corollary 3.6, has bounded variation. Proposition 4.9. The solution path t Au t where u t is the minimizer of E 1,1 t ( ; f) is of bounded variation on (δ, ) for all δ > 0 and on [0, ) if t > 0 holds. Furthermore, the jump part of the measure (Au t ) is supported in [t, t ]. Proof. First, we notice that Au t is well-defined for almost every t > 0 according to Remark 3.6. Let us first assume that t > 0, i.e, conditions (RC) and (SC) hold. We already know that Au t has zero variation on (0, t ) and (t, ). ence, it is enough to assert finite variation on the interval (t, t ). To this end, let t = t 1 < t 2 < < t n 1 < t n = t be a finite partition of the interval (t, t ). By Theorem 3.3, we can choose numbers 0 τ 1 < < τ n such that Au tk = Av τk for all k = 1,..., n. ere, τ n = τ is given by the finite extinction time of minimizers of Eτ 2,1 ( ; f). Furthermore, using (4.8) with α = 2 we compute n 1 n 1 Autk+1 Au tk = Avτk+1 Av τk k=1 k=1 n 1 C S (τ k+1 τ k ) = C S (τ τ 1 ) C S τ <. k=1 Forming the supremum over all partions of (t, t ) shows that Au t has bounded variation. If t = 0 one uses (4.6) to deduce the weaker result. Consequently, the finite Radon measure (Au t ) can be decomposed into an absolutely continuous part, a jump part, and a Cantor part (see [3] for precise definitions), where the jump part is supported in [t, t ] since Au t is constant outside this interval. Once more, we obtain statements concerning the subgradient and the solution path. Corollary Under the conditions of Theorem 4.8 or Proposition 4.9, respectively, it holds: 1. The map t p t, where p t is given by the optimality conditions (2.2) and (2.3), has the same regularity as the forward solution path. 2. If A is bounded from below, meaning that there is c > 0 such that c u X Au, u X, then the solution path t u t has the same regularity as the forward solution path. 5 Towards spectral decompositions 5.1 Solution path of generalized singular vectors A canonical approach to define a spectral representation φ t of some data f with respect to the functional E α,β t ( ; f) consists in examining the solution path that corresponds to singular vectors (cf. [8, 9]) of J, i.e., f = Au where λa Au J(u ) for some λ > 0. For such data, one would like to have a peak in the spectral representation, that is, φ t = fδ 1/λ (t), where 1/λ can be interpreted as a generalized frequency. 21

22 Proposition 5.1. Let λ > 0 and u such that f = Au and λa Au J(u ), i.e., u is a singular vector with singular value λ. Letting 1 denote the indicator function (cf. (A.7)), a minimizer u t of E α,1 t ( ; f) is given by { 1(0,(λ f ) u t = 1 )(t) u, α = 1, (1 (tλ) 1 2 α α 1 f α 1 ) +u, α > 1, and a minimizer for E 2,2 t ( ; f) is given by u t = 1/(1 + tλ 2 )u. The extinction times of these solution are given by (λ f 2 α ) 1 for α 1 and for α, β = 2, respectively. Proof. In the case α = 1 one can easily check that t = t = 1/(λ f ) if f is an eigenfunction. The other minimizers can be obtained by inserting the ansatz u t = c(t)u into the optimality condition (2.3). ut u 0 0 1/λ t α = 1 α = 1.1 α = 1.5 α = 2 α = 3 α = 10 Figure 1: Solution paths of normalized singular vectors for different values of α and β = 1 Figure 5.1 shows the corresponding solution paths for a singular vector u with singular value λ such that f = Au has unit norm and β = 1. In this case, all paths extinct in 1/λ. ence, in order to obtain φ t = fδ 1/λ (t), formally suitable spectral representations for β = 1 are φ t = (Au t ) if α = 1 and φ t = t(au t ) if α = 2. If A is bounded from below, one can even choose φ t = u t or φ t = tu t, respectively. For other α s an integer derivative does not produce a delta peak, in general. owever, if α = 3 2, for instance, we find that φ t = t2 2 (Au t) does the trick. Note that by these definitions and due to the finite extinction time the reconstruction formula f = holds which can be used for spectral filtering by defining f F := 0 0 φ t dt + AP A (f) (5.1) F (t)φ t dt + F ( )AP A (f), (5.2) where F is a sufficiently well-behaved filter function (cf. [24], for instance). Remark 5.2. Note that while φ t = (Au t ) is a well-defined finite Radon measure according to Proposition 4.9, whereas this is a-priori unclear for φ t = t(au t ). owever, due to the finite extinction time, this spectral representation can be defined in a distributional sense, via φ t (ψ) := 0 (Au t ), (tψ(t)) dt, (5.3) 22

23 where ψ : R is a Fréchet-differentiable test function with ψ(t) = 0 for all t in a neighborhood of 0. Owing to Theorem 4.7, the second condition is not even necessary if one of the conditions (RC) or (SC) holds since in that case (Au t ) is integrable in zero. Proposition 5.1 also shows that, although all problems for α > 1 are equivalent, they significantly differ in terms of the spectral representations which can be obtained from their solution paths. Furthermore, since the minimizer for β = 2 smoothly depends on t, no singular spectral representation can be achieved by computing time derivatives which is why we will restrict ourselves to the case β = 1 for the rest of the manuscript. Another interesting consequence of Proposition 5.1 is that some of the models E α,1 t ( ; f) are scale invariant on eigenfunctions. To see this, we choose J = TV as the total variation of functions on R n, X = BV(R n ) L 2 (R n ), = L 2 (R n ), and A the continuous embedding operator. It is well-known that eigenfunctions of TV are given by indiactor functions of so called calibrable sets Ω R n with eigenvalue P (Ω)/ Ω where P denotes the perimeter and is the n-dimensional Lebesgue measure (cf. [6, 2]). If f = 1 Ω for calibrable Ω, we find that the extinction time of minimizers E α,1 t ( ; f) is given by t ext (Ω) = Ω α 2 /P (Ω) for α 1. If one rescales Ω r = rω with some r > 0, then Ω r is still calibrable and the extinction time changes to t ext (Ω r ) = r n(α 2)+2 2 t ext (Ω). ence, we observe that for any dimension n 2 there is α := 2 2/n [1, 2) such that t ext (Ω r ) = t ext (Ω) which makes the model scale invariant. Note that in dimension n = 2, which is most relevant for imaging applications, the model E 1,1 ( ; f) becomes both contrast and scale invariant. 5.2 Spectral representations for α = 1 and α = 2 From now on our setting will be a Gelfand-triple X X such that operator A becomes a continuous embedding operator and will thus be omitted in our notation. Due to the observations in the previous section, we will only study the functionals F τ ( ; f) := Eτ 2,1 ( ; f) and E t ( ; f) := E 1,1 t ( ; f) and fix our notation in such that way that the corresponding minimizers are denoted by v τ and u t, respectively. We consider the spectral representations given by ϕ τ := τv τ, which is to be understood in the distributional sense, and φ t := u t, the latter being a finite Radon measure according to Proposition 4.9. From Proposition 5.1 we see that t can lie in the support of the jump part of φ t, but a-priori it is unclear whether the solution path t u t is really discontinuous at t. Recalling Theorem 3.5, this is equivalent to an affine behavior of v τ for small τ > 0. To assure that t > 0 in the first place, we make use of the range and source conditions (RC) and (SC), the latter boiling down to demanding that each subgradient p J(f) X is in fact an element of the smaller space. ence, we will only use the ilbert norm := in this section and omit the subscript. 23

24 5.2.1 Upper bound of the jump in t In case there is a jump, we can give an upper bound of the jump height. By transforming to the standard problem and using inequality (4.3) with α = 2, we obtain for almost every t > t R(t) = u t f = v S(t) f S(t)J(f) = tr(t)j(f) R(t) 2 tr(t)j(f) R(t) tj(f). ence, by letting t tend to t, we find lim sup u t f t J(f). (5.4) t t Note that this estimate is sharp since equality holds for eigenfunctions (cf. Proposition 5.1), where u t = 0 for all t > t and hence f = lim sup t t u t f t J(f) = 1 λ f λ f 2 = f Lower bound of the jump in t Let us now turn to the question under which conditions there actually is a jump at t. We have seen in Section 3 that this is guaranteed if the time parametrisation T, mapping minimizers of F τ ( ; f) to those of E t ( ; f), constantly attains the value t on an interval (0, ˆτ) where ˆτ > 0. In this case, the monotonicity of the residuals implies u t f vˆτ f for all t > t and therefore lim inf t t u t f vˆτ f > 0. (5.5) On one hand, since (4.4) with α = 2 implies that R(τ) τs, we obtain T (τ) = τ R(τ) 1 s = t, τ > 0. (5.6) On the other hand, since T being constant on an interval (0, ˆτ) implies that u T (τ) is not unique for τ (0, ˆτ), the results of Theorem 3.5 show that T (τ) = t Affine solution paths of the quadratic problem ence, in order to understand the behavior of u t at t = t it suffices to study the respective behavior of v τ for small times τ. By Moreau s identity (cf. [35] for a finite dimensional version) and letting K := J(0), we find that the minimizer v τ of F τ ( ; f) is given by ( ) f v τ = f τproj K. (5.7) τ ere we used that J = χ K (cf. (A.5), (A.6)) and let proj K( ) denote the projection on the closed and convex set K with respect to the ilbert norm which is well-defined as K 0. 24

25 Remark 5.3. While Moreau s identity is often formulated in ilbert spaces or finite dimensions, the identity p J(u) u J (p), which holds for lower semi-continuous and convex J defined on a Banach space X (cf. [32, Ch. 5]), makes it easy to show that it is applicable also in our slightly more general setting. The beauty of the representation (5.7) lies in the fact that it allows us to study the solution path v τ by investigating the geometric properties of the set K and the projection onto it. Using (5.7), the residual is given by R(τ) = τ proj K (f/τ) and therefore T (τ) = proj K (f/τ) 1. Taking Theorem 3.5 into account we can say that there is a jump if and only if there exists ˆτ > 0 such that one of the following holds proj K (f/τ) argmin{ p : p J(f)}, 0 < τ ˆτ, (5.8a) τ v τ := f τproj K (f/τ) is affine for 0 < τ ˆτ, t T (τ) is constant on (0, ˆτ]. (5.8b) (5.8c) Note that (5.8) is always fulfilled if K R n is polyhedral 1 since in this case the solution v τ is piecewise affine with v τ = f τp for τ [0, ˆτ] and p J(f), as it was shown in [12] or less general for LASSO / l 1 problems in [21, 36, 42, 10]. owever, the condition of a polyhedral K is neither necessary nor can it be completely waived, as the following examples show. Example 5.4. Let a 1, a 2 > 0 with a 1 a 2, let M = diag(a 1, a 2 ), and J(u) = u, Mu. Then, K is an ellipse with semi-axes a 1 and a 2 and, therefore, not polyhedral. ere, J(f) = {(a 1 f 1, a 2 f 2 )/J(f)} for f (0, 0). If f is no eigenvector, i.e., f is not parallel to a semi-axis, the projection of f/τ onto K does not equal J(f) for any τ > 0, as it can be easily seen from the corresponding Karush-Kuhn-Tucker conditions. ence, conditions (5.8) are violated and there is no jump. Example 5.5. Let now J : R 2 R be given by { u J(u) = 1, if sgn(u 1 ) = sgn(u 2 ) u 2, else. Then K coincides with the unit square in the first and the third quadrant, and with the unit circle in the remaining quadrants of R 2 (Fig. 2). It is easy to see that all vectors in the second and fourth quadrant are eigenfunctions and hence (5.8) trivially holds. The solution path of vectors in the first and third quadrant is also piecewise affine since the problem coincides with standard l 1 -shrinkage (see the references before) there. Note that K is not polyhedral either. 1 u K 1 u 1 Figure 2: Non-polyhedral K Example 5.6. Let Ω R n be open and bounded, X = = L 2 (Ω) and J = 1. Then K = {u L (Ω) : u 1} and if f L 2 (Ω) fulfills f(x) c > 0 for almost every x Ω, then for 0 < τ ˆτ = c it holds proj K (f/τ)(x) = 1, for almost every x Ω, hence, the jump exists. Obviously, here K is also not polyhedral since the unit ball in L (Ω) is not generated by the convex combinations of a finite number of functions. 1 Polyhedral in the context means being the convex hull of a finite set of vectors. 25

26 Example 5.7. Let I R be an interval, X = BV(I), = L 2 (I), and J = TV. If f is piecewise constant, then according to [19] the solution v τ is piecewise affine with v τ = f τp for τ [0, ˆτ] and p J(f). In [31] the authors prove similar results in two dimensions, using anisotropic total variation as regularization and assuming the data to be piecewise constant on rectangles. Now we formulate a theorem which deals with a important question concerning non-linear spectral decompositions, namely with the decomposition of a linear combination of generalized eigenvectors. Two conditions that suffice for a perfect decomposition into eigenvectors are the (SUB0) condition and orthogonality of the eigenvectors, introduced in [38]. ere the authors showed that the inverse scale space flow is able to decompose the data perfectly into the eigenvectors. We prove a similar statement for the variational problem F τ ( ; f), in particular, the solution path v τ will shrink each eigenfunction linearly until disappearance and will, thus, be piecewise affine in τ. Theorem 5.8 (Linear combination of eigenvectors I). Let f being the linear combination of orthogonal eigenvectors, i.e., f = n i=1 γ iu i where γ i 0, λ i u i J(u i ) with λ i > 0, and u i, u j = 0 for all i, j = 1,..., n, i j. Furthermore, we define p k := n i=k sgn(γ i)λ i u i and assume that p k K, k = 1,..., n. (SUB0) Additionally, we assume an ordering such that γ i /λ i < γ i+1 /λ i+1 holds for all i = 1,..., n. Then the minimizer v τ of F τ ( ; f) is given by n v τ = sgn(γ i )( γ i τλ i )u i, τ k 1 < τ τ k, (5.9) i=k where τ 0 := 0, τ k := γ k /λ k, and k = 1,..., n. Proof. We need to check that p τ := (f v τ )/τ J(v τ ) for τ > 0. According to [38, Prop. 3.4] the functional J behaves linearly on non-negative linear combinations of the u i s, i.e., for every k = 1,..., n it holds ( n ) n J c i u i = c i J(u i ), c i 0, i = k,..., n. i=k i=k Using the definitions of f and v τ we find that for τ k 1 < τ τ k and k = 1,..., n it holds p τ = 1 k 1 n γ i u i + sgn(γ i )λ i u i. τ i=1 ence, using orthogonality of the u i s and that γ i τλ i 0 for i k, we compute p τ, v τ = 1 k 1 n n n γ i u i, sgn(γ i )( γ i τλ i )u i + sgn(γ i )λ i u i, sgn(γ i )( γ i τλ i )u i τ i=1 i=k i=k i=k }{{} = =0 n ( γ i τλ i ) λ i u i, u i = i=k i=k n ( γ i τλ i )J(sgn(γ i )u i ) i=k ( n ) = J sgn(γ i )( γ i τλ i )u i = J(v τ ). i=k 26

27 ere we also used that J is even and that u i is also an eigenvector if u i is one. It remains to show that p τ K. It holds for arbitrary v X p τ, v = 1 k 1 γ i u i, v + p k, v. τ i=1 If k = 1, the left summand vanishes and we can conclude using p k K. Otherwise, analogously to [38, Thm. 3.14], we argue as follows: If the first inner product is 0, we are done since p k K. Otherwise, we can estimate τ τ k 1 and hence p τ, v 1 τ k 1 = 1 τ k 1 = 1 τ k 1 k 1 γ i u i, v + p k, v i=1 k 2 γ i u i, v + sgn(γ k 1 )λ k 1 u k 1, v + p k, v i=1 k 2 γ i u i, v + p k 1, v. i=1 Depending on the sign of the left inner product, we can iterate this process using τ k 1 τ 1 and (SUB0) until we end up with p τ, v p 1, v J(v) since p 1 K. Consequently, p τ K and we can conclude. Remark 5.9. Note that it is straightforward to extend this result to data which is composed of generalized singular vectors, i.e., f = n i=1 γ iau i. To this end, one has to demand A- orthogonality Au i, Au j = 0 for i j and define p k := n i=k sgn(γ i)λ i A Au i, instead. Remark It is no significant restriction in Theorem 5.8 to assume that all γ i /λ i are different for i = 1,..., n. If this were not the case, the corresponding eigenfunctions would simply shrink away simultaneously. owever, in order to avoid unnecessarily complicated formulae, we refrained from considering this case. Remark 5.11 (Action of proximal operators). Theorem 5.8 can be interpreted in such a way that if the data f can be written as a linear combination of orthogonal { eigenfunctions} and (SUB0) is fulfilled, then the proximal operator prox τj (f) := argmin 1 v 2 v f 2 + τj(v) performs shrinkage on the eigendirections. This is in particular true for the 1 -norm where the standard basis of R n constitutes a set of orthogonal eigenvectors fulfilling (SUB0). Example Let us illustrate the preceding remark for the proximal operator of the - norm in two dimensions. Let { } 1 v τ := prox τ (f) := argmin 2 v f 2 + τ v : v R 2 and K := {v R 2 : v 1 1} be the unit ball of the 1-norm. We observe that u 1 = 1 ( ) 1, u = 1 ( )

28 constitute a basis of eigenvectors of with eigenvalue 1. In particular, any f R 2 can be written as ( ) f1 f = = (f 1 + f 2 )u 1 + (f 2 f 1 )u 2 =: γ 1 u 1 + γ 2 u 2. f 2 Note that the (SUB0) condition is met since u 1, u 2, u 1 + u 2 K and the u i s are orthogonal. If f is an eigenfunction of, the analytic expression for prox τ (f) becomes trivial and, thus, we assume that γ 1, γ 2 0 and γ 1 γ 2. This guarantees that f is no eigenfunction. Furthermore, we reorder such that 0 < γ 1 < γ 2 holds. ence, we find by (5.9) that v τ is given by { sgn(γ 1 )( γ 1 τ)u 1 + sgn(γ 2 )( γ 2 τ)u 2, 0 τ τ 1 := γ 1, v τ = sgn(γ 2 )( γ 2 τ)u 2, τ 1 < τ τ 2 := γ 2. Corollary 5.13 (Linear combination of eigenvectors II). Under the conditions of Theorem 5.8 the minimizer u t of E t ( ; f) is u t = v S(t) where S is given by S(t) = t k 1 i=1 γ2 i u i 2, t k 1 < t t k, k = 1,..., n. (5.10) 1 t 2 p k 2 ere, t k := T (τ k ) = τ k /R(τ k ) for k = 0,..., n and the S(t) := 0 if k = 1. Proof. From the definition of v τ in (5.9) we easily see, using the orthogonality of the u i s, that τ T (τ) = k 1, τ k 1 < τ τ k, k = 1,..., n. i=1 γ2 i u i 2 + τ 2 p k 2 Inverting this on the intervals (τ k 1, τ k ) for k 2 yields the expression for S. Furthermore, it holds that t k = T (τ k ) < 1/ p k for k 2 which makes S well-defined and continuous. Noting that S is the inverse of T (τ) for τ > τ 1 and applying Lemmas 3.1 and 3.2 shows that u t = v S(t). Now we investigate the spectral representations φ t and ϕ τ under the conditions of Theorem 5.8. By means of Corollary 5.13, we find φ t = u t = d dt n sgn(γ i )( γ i S(t)λ i )u i i=k for t k 1 < t < t k. From (5.10) it is obvious that S is continuously differentiable on the intervals (t k 1, t k ) and discontinuous only in t 1. ence, the measure φ t is singular only in t := t 1 = T (τ 1 ) = 1/ p 1 and, since S is continuously differentiable on (t k 1, t k ), represented by a bounded function, elsewhere. The jump of u t in t is given by f vˆτ, where ˆτ := τ 1, and hence n φ t = f vˆτ = γ 1 u 1 + ˆτsgn(γ i )λ i u i. (5.11) This can be considered bad news since, on one hand, the spectral representation φ of the contrast-invariant problem E t ( ; f) is not able to isolate an individual eigenfunction although i=2 28

29 it has a delta peak at t. On the other hand, the time point t where the peak occurs is independent of the specific eigenfunction that vanishes. Thus, it cannot be brought into correspondence with the eigenvalue λ 1 or the factor γ 1. In contrast, the spectral representation ϕ is given by n ϕ τ = γ k u k δ τk (τ), τ > 0 (5.12) k=1 which is a perfect decomposition of the data f into its components. Given a general datum f, we are now interested in a condition which assures that the solution path v τ is affine in τ on a small interval. Theorem 5.14 (Affine solution path). Let ˆp J(f) with ˆp = s. If ˆτ given by ˆτ := 2 inf q K q ˆp J(f) q, f ˆp 2 q 2 (5.13) is positive, then v τ := f τ ˆp is the minimizer of F τ ( ; f) for τ [0, ˆτ]. Proof. We only need to check that for 0 < τ ˆτ it holds that proj K (f/τ) = ˆp. To this end, we compute for arbitrary q K ˆp f/τ 2 q f/τ 2 = ˆp 2 q 2 2 ˆp q, f τ = ˆp 2 q 2 2 τ (J(f) q, f ). (5.14) ence, if q K is such that q ˆp this is expression is non-positive, taking into account that J(f) q, f 0. Furthermore, for τ ˆτ and q ˆp we obtain that (5.14) is 0, as well. Together, this implies that ˆp f/τ 2 q f/τ 2 for all 0 < τ ˆτ and for all q K which implies the desired result. The following proposition provides at least a necessary condition for ˆτ being positive. Proposition Let ˆp J(f) with ˆp = s. If ˆp is not an eigenfunction and {p : p, f = J(f)} is the only supporting hyperplane of K through ˆp, then ˆτ = 0. Proof. If ˆp is not an eigenfunction, then we know that there is a positive angle between ˆp and f, i.e., there exists a direction ϕ orthogonal to f and δ > 0 with ˆp, ϕ δ ϕ. Since the supporting hyperplane is unique, there exists a sequence of directions ϕ n with ϕ n 0 becoming orthogonal to f in the limit such that q n = ˆp + ϕ n K and ˆp, ϕ n δ 2 ϕ n, Thus, since ϕ n < δ for n large enough, lim sup n J(f) q n, f ˆp 2 q n 2 = lim sup n ϕ n, f ϕ n ϕ n, f ϕ n 0. ϕ n 2 ˆp, ϕ n ϕ n 2 lim n ϕ n, f ϕ n 1 δ ϕ n = 0. ence, we can deduce that for sets K with smooth boundary in particular, having unique supporting hyperplanes we will in general not observe a (piecewise) affine behavior of the solution path. This can also be derived from [29] which states that in case of X being a ilbert space the degree of differentiability of the projection map proj K ( ) is given by d 1 if the K has a C d -boundary. 29

30 6 Numerical results In the following, we will present numerical experiments that serve to illustrate the different theoretical results of this work. The first experiment will use artificially generated data whereas the second one is computed on a real image. To be able to compute a spectral representation, we will restrict ourselves to the functionals E t ( ; f) and F τ ( ; f) whose minimization we achieve using the Primal-Dual-Algorithm of Chambolle and Pock [16]. 6.1 Sparse deconvolution ere, we consider 1D sparse deconvolution of a signal f R n which is obtained by convolving a peak signal u R n with a gaussian kernel of finite length (cf. left in Figure 3). In this setting, X = = R n, A corresponds to a convolution operator, and J is given by the 1-norm. The data u = 0.1u u u 3 0.4u u 5 is a linear combination of A-orthogonal singular vectors u i all of which have the same singular value λ and satisfy (SUB0). The u i s simply consist of a single peak of height 1. Note that in this case A-orthogonality simply means that the supports of the convolved peaks do not intersect. ence, we know from Theorem 5.8 and the subsequent remarks that the solution path v τ successively shrinks the singular vectors until their contributions disappear. In particular, there are four critical time points τ i, i = 1,..., 4 corresponding to the four different peak heights where all peaks of this very height vanish. This is illustrated on the right hand side of Figure 3, where the red, pink, and blue markers indicate the height of the corresponding peak at times τ 1, τ 2, and τ 3, respectively. The fourth critical time τ 4 coincides with the extinction time, meaning that the solution is identical to zero. Figure 3: Ground truth u and forward data f (left), solution at cricital times (right) The residuals and regularizers of the solution paths, which are shown in Figure 4, clearly reflect this behavior by having kinks at the critical times. Note that R(t) and J(t) indeed jump at t where the forward solution is not unique. Furthermore, R(τ) and J(τ) are piecewise linear in τ, as expected. Also the spectra, which we define as the 1-norm of the spectral representations φ t and ϕ τ, match our analytic results (5.11) and (5.12) since they posses numerical δ-peak at t or at the four critical times, respectively. In particular, we see that φ t does not have any atoms for t t. Note that the numeric height of the spectral peaks is not informative since the measure at these points is given by an multiple of an Dirac measure which has infinite height. 30

31 Figure 4: Residual R and Regularizer J of u t (left) and v τ (right) Figure 5: L 1 -norm of the spectral representations φ t (left) and ϕ τ (right) 6.2 TV denoising Next, we turn to the denoising model (ROF) and the variant with non-squared L 2 -norm, respectively. The data f is given by a noise-free version of the popular Barbara image and is shown in Figure 6. We refrain from adding noise since otherwise TV(f) = would hold which implies that the range and source conditions (RC) and (SC) cannot be satisfied. Figure 7 shows the residuals and regularizers of u t and v τ, respectively. We can observe that there is a positive t and that there are no kinks, meaning there is no visible piecewise behavior of the solution paths. The magnitudes of the spectral representations are given in Figure 8. Note that both spectra behave very regular and do not show any numerical delta peaks. owever, the spectrum of ϕ τ contains much more information, being encoded in two elevations that are marked in red (dotted) and blue (dashed). Figure 9 shows the corresponding spectral components ϕ τ integrated with respect to τ over the red and green area, respectively (cf. (5.2)). This procedure can be viewed as bandpass filtering with respect to the non-linear frequency decomposition ϕ τ and allows to extract and manipulate patterns and textures from the original image. In our example, these images correspond to differently oriented stripe patterns on the table cloth and Barbara s clothing. The spectrum of φ t, however, cannot be used for this task since the only two significant parts 31

32 of the spectrum marked in the same fashion correspond to very fine and fine structures (cf. Figure 10) but do not separate different textures. We have the suspicion that this behavior is explained by the closing remarks of Section 5.1 according to which the TV-model with α = 1 is scale-invariant on eigenfunctions in 2D. Indeed further numerical experiments indicate that for 1D TV-denoising the model is capable of capturing different scales. Another popular filtering procedure is high-pass and low-pass filtering which corresponds to keeping only the frequency components beyond or until a threshold frequency. Figures 11 and 12 show the corresponding filtered images using the spectra of ϕ τ and φ t, respectively. ere, both methods succeed equally well in separating texture and objects. Regarding, high and low-pass filtering, it can be considered a slight advantage of the spectral representation generated by the scale and contrast-invariant model that the magnitude of the spectrum decreases more rapidly and that textures seem to be concentrated more compactly in the spectrum. This can make automatic filtering easier and more robust. Conclusion We have analyzed a family of variational regularization functionals with different powers of the data fidelity and regularization terms, among which the model with quadratic fidelity and absolutely one-homogeneous regularization stands out as the standard choice. Apart from trivial solutions which are achieved for very small, respectively, large values of the regularization parameter all models generate the same set of minimizers. Therefore, simply aiming at finding a regular approximate solution to the inverse problem (IP), no specific weighting can be preferred over others. owever, if one is interested in the whole solution path and derivatives thereof with respect to the regularization parameter, the choice of the specific weighting becomes relevant. In particular, we have argued why it is necessary to choose the standard weighting in order to obtain non-linear spectral decompositions. Furthermore, the failure of the contrast-invariant methods to decompose a linear combination of eigenfunctions shows that enforcing consistency on a single eigenfunction is not enough to define a meaningful spectral representation of arbitrary data. Some open questions We conclude this work by pointing out some interesting open questions that are subject to future research. 1. It is an interesting question whether and how our results connect with generalized Cheeger sets (cf. [34]). It is already well-known that a convex set is calibrable if and only if it is a Cheeger set in itself. Furthermore, we have seen that the extinction time of a calibrable set Ω under TV with data term 1 α u f α L 2 is given by Ω α 2 /P (Ω) which is precisely the inverse Cheeger constant if Ω is a generalized Cheeger set, i.e. a solution to P (E) inf E Ω E m with m := α/2, where usually 1 1/n < m < 1 is assumed which corresponds to 2 2/n < α < 2. 32

33 2. Furthermore, a relevant open point is to find sufficient conditions for ˆτ > 0, meaning that v τ is affine on an interval (0, ˆτ). Judging from the presented examples, we suspect that the necessary condition from Proposition 5.15 could also be sufficient but a proof is still pending. 3. Related to the former point is the well-definedness of ϕ τ as a Radon measure for general data. Certainly, a piecewise affine behavior of the solution path guarantees this but this does not occur, in general. owever, we have the hope that formula (5.7) can be used to deduce the regularity of v τ from the regularity of the boundary of the convex set K. Figure 6: Data image f Figure 7: Residual R and Regularizer J of u t (top) and v τ (bottom) Figure 8: Spectra of φ t (left) and ϕ τ (right) with threshold frequencies t T and τ T 33

34 Figure 9: Band-pass filter via integration of ϕ τ over the marked areas of the spectrum: red (left), blue (right) Figure 10: Band-pass filter via integration of φ t over the marked areas of the spectrum: red (left), blue (right) 34

35 Figure 11: Left: igh-pass filter, right: low-pass filter with threshold frequency τ T Figure 12: Left: igh-pass filter, right: low-pass filter with threshold frequency t T 35

SPECTRAL PROPERTIES OF THE LAPLACIAN ON BOUNDED DOMAINS

SPECTRAL PROPERTIES OF THE LAPLACIAN ON BOUNDED DOMAINS TSOGTGEREL GANTUMUR Abstract. After establishing discrete spectra for a large class of elliptic operators, we present some fundamental spectral properties