
Convergence rates in $\ell^1$-regularization when the basis is not smooth enough

Jens Flemming (Department of Mathematics, Technische Universität Chemnitz, 09107 Chemnitz, Germany), Markus Hegland (Centre for Mathematics and its Applications, The Australian National University, Canberra ACT 0200, Australia)

November 29, 2013

Abstract. Sparsity promoting regularization is an important technique for signal reconstruction and several other ill-posed problems. Theoretical investigation is typically based on the assumption that the unknown solution has a sparse representation with respect to a fixed basis. We drop this sparsity assumption and provide error estimates for non-sparse solutions. After discussing a result in this direction published earlier by one of the authors and coauthors, we prove a similar error estimate under weaker assumptions. Two examples illustrate that this set of weaker assumptions indeed covers additional situations which appear in applications.

MSC2010 subject classification: 65J20, 47A52, 49N45

Keywords: linear ill-posed problems, Tikhonov-type regularization, $\ell^1$-regularization, non-smooth basis, sparsity constraints, convergence rates, variational inequalities

1 Introduction

Variational approaches

$$\frac{1}{p}\,\|Ax - y^\delta\|_Y^p + \alpha\,\|x\|_{\ell^1(\mathbb{N})} \to \min_{x \in \ell^1(\mathbb{N})}, \qquad 1 \le p < \infty, \quad \alpha > 0, \tag{1}$$

have become a standard tool for solving ill-posed operator equations,

$$Ax = y, \tag{2}$$

for a bounded linear operator $A : X := \ell^1(\mathbb{N}) \to Y$ mapping absolutely summable infinite sequences $x = (x_1, x_2, \ldots)$ of real numbers $x_k$, $k \in \mathbb{N}$, into a Banach space $Y$, if the solutions are known to be sparse or if the sparsity constraints are narrowly missed. This means that either only a finite number of nonzero components $x_k$ occurs or that the remaining nonzero components are negligibly small for large $k$. We assume that the exact right-hand side $y$ is in the range $R(A) := \{Ax : x \in \ell^1(\mathbb{N})\}$ of $A$, which is a nonclosed subset of $Y$ due to the ill-posedness of equation (2), and that $y$ is not directly accessible. Instead one only has some measured noisy version $y^\delta \in Y$ at hand with a deterministic noise model $\|y - y^\delta\|_Y \le \delta$ using the given noise level $\delta \ge 0$. Moreover, we assume that $x^\dagger \in \ell^1(\mathbb{N})$ denotes a solution of (2). In particular, let us suppose weak convergence

$$Ae^{(k)} \rightharpoonup 0 \quad \text{in } Y \text{ as } k \to \infty, \tag{3}$$

where $e^{(k)} = (0, 0, \ldots, 0, 1, 0, \ldots)$ denotes the infinite unit sequence with $1$ at the $k$-th position and $0$ else.

For successful application of $\ell^1$-regularization, existence of minimizers $x_\alpha^\delta$ of (1) and their stability with respect to perturbations in the data $y^\delta$ have to be ensured. Further, by choosing the regularization parameter $\alpha > 0$ in dependence on the noise level $\delta$ and the given data $y^\delta$, one has to guarantee that corresponding minimizers converge to a solution $x^\dagger$ of (2) if the noise level goes to zero. Such existence, stability, and convergence results can be found in the literature. Also the verification of convergence rates has been addressed, but mostly in the case of sparse solutions (cf. [3, 5, 7, 9, 11, 12, 13, 17, 18, 19, 20, 21]). For non-sparse solutions a first convergence rate result can be found in [6].
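As a point of orientation only (not part of the analysis in this paper), the minimization problem (1) with $p = 2$ and a finite-dimensional discretization of $A$ can be solved numerically by iterative soft-thresholding. Everything in the following sketch (function names, step size rule, iteration count) is our own illustrative choice:

```python
import numpy as np

def soft_threshold(z, tau):
    # componentwise proximal map of tau * ||.||_1
    return np.sign(z) * np.maximum(np.abs(z) - tau, 0.0)

def ista(A, y_delta, alpha, n_iter=5000):
    # minimizes (1/2) * ||A x - y_delta||^2 + alpha * ||x||_1, the case p = 2 of (1)
    t = 1.0 / np.linalg.norm(A, 2) ** 2   # step size 1 / ||A||^2
    x = np.zeros(A.shape[1])
    for _ in range(n_iter):
        x = soft_threshold(x - t * A.T @ (A @ x - y_delta), alpha * t)
    return x
```

For $A = I$ the minimizer of the Tikhonov functional is exactly the componentwise soft-thresholding of the data, which the iteration reproduces after a single step; this gives a quick consistency check of the sketch.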
The aim of the present article is to formulate convergence rates results under assumptions which are weaker than those in [6] and to obtain in this context further insights into the structure of $\ell^1$-regularization. We should also mention that under condition (3) in [6] the weak*-to-weak continuity of $A$ was shown by employing the space $c_0$ of infinite sequences tending to zero, which is a predual space of $\ell^1(\mathbb{N})$, i.e., $(c_0)^* = \ell^1(\mathbb{N})$. Furthermore, we have for the range $R(A^*)$ of the adjoint operator $A^* : Y^* \to \ell^\infty(\mathbb{N})$ that $R(A^*) \subset c_0$ (cf. [6, Proposition 2.4 and Lemma 2.7]). These facts, which are essentially based on (3), ensure existence of regularized

solutions $x_\alpha^\delta$ for all $\alpha > 0$ and norm convergence $\|x_\alpha^\delta - x^\dagger\|_{\ell^1(\mathbb{N})} \to 0$ as $\delta \to 0$ if the regularization parameter $\alpha = \alpha(\delta, y^\delta)$ is chosen in an appropriate manner, for example according to the sequential discrepancy principle (cf. [1, 15]).

A sufficient condition for (3) is the existence of an extension of $A$ to $\ell^2(\mathbb{N})$ such that $A : \ell^2(\mathbb{N}) \to Y$ is a bounded linear operator. Then, taking into account the continuity of the embedding from $\ell^1(\mathbb{N})$ into $\ell^2(\mathbb{N})$, condition (3) directly follows from the facts that $\{e^{(k)}\}_{k \in \mathbb{N}}$ is an orthogonal basis in $\ell^2(\mathbb{N})$ with $e^{(k)} \rightharpoonup 0$ in $\ell^2(\mathbb{N})$ as $k \to \infty$ and that every bounded linear operator is weak-to-weak continuous.

2 Convergence rates for smooth bases and a counterexample

As an important ingredient and crucial condition for proving convergence rates, the authors of [6] assumed that the following assumption holds true.

Assumption 2.1. For all $k \in \mathbb{N}$ there exist $f^{(k)} \in Y^*$ such that

$$e^{(k)} = A^* f^{(k)}, \qquad k = 1, 2, \ldots. \tag{4}$$

Remark 2.2. The countable set of range conditions (4) concerning the unit elements $e^{(k)}$, which form a Schauder basis in all Banach spaces $\ell^q(\mathbb{N})$, $1 \le q < \infty$, with their usual norms as well as in $c_0$ with the supremum norm, can by using duality pairings $\langle \cdot, \cdot \rangle_{\ell^\infty(\mathbb{N}) \times \ell^1(\mathbb{N})}$ be equivalently rewritten as

$$|\langle e^{(k)}, x \rangle_{\ell^\infty(\mathbb{N}) \times \ell^1(\mathbb{N})}| \le C_k \|Ax\|_Y, \qquad k = 1, 2, \ldots, \tag{5}$$

where, for fixed $k \in \mathbb{N}$, (5) must hold for some constant $C_k > 0$ and all $x \in \ell^1(\mathbb{N})$ (cf. [21, Lemma 8.21]). Since we have $x_k = \langle e^{(k)}, x \rangle_{\ell^\infty(\mathbb{N}) \times \ell^1(\mathbb{N})}$ for all $k \in \mathbb{N}$, Assumption 2.1 implies that $A : \ell^1(\mathbb{N}) \to Y$ is an injective operator. Moreover, it can be easily verified that the following Assumption 2.3 is equivalent to Assumption 2.1.

Assumption 2.3. For all $k \in \mathbb{N}$ there exist $f^{(k)} \in Y^*$ such that, for all $j \in \mathbb{N}$,

$$\langle f^{(k)}, Ae^{(j)} \rangle_{Y^* \times Y} = \begin{cases} 1 & \text{if } k = j, \\ 0 & \text{if } k \ne j. \end{cases} \tag{6}$$

The next proposition shows that the requirement (6) cannot hold if one of the elements $Ae^{(j)}$ equals the sum of a convergent series $\sum_{k \in \mathbb{N},\, k \ne j} \lambda_k Ae^{(k)}$.

Proposition 2.4. From an equation

$$\sum_{k \in \mathbb{N}} \lambda_k Ae^{(k)} = 0, \qquad \text{where } \lambda_k \in \mathbb{R}, \ (\lambda_1, \lambda_2, \ldots) \ne 0, \tag{7}$$

it follows that condition (6) is violated.

Proof. Without loss of generality let $Ae^{(1)} = \sum_{j=2}^\infty \mu_j Ae^{(j)}$ and let there exist $f^{(1)} \in Y^*$ such that (6) holds. Then it follows that

$$1 = \langle f^{(1)}, Ae^{(1)} \rangle_{Y^* \times Y} = \sum_{j=2}^\infty \mu_j \langle f^{(1)}, Ae^{(j)} \rangle_{Y^* \times Y} = \sum_{j=2}^\infty 0 = 0,$$

which yields a contradiction and proves the proposition.

Remark 2.5. As always when range conditions occur in the context of ill-posed problems, the requirement (4) characterizes a specific kind of smoothness. In our case, (4) refers to the smoothness of the basis elements $e^{(k)}$. Precisely, since $R(A)$ is not a closed subset of $Y$, as a conclusion of the Closed Range Theorem (cf., e.g., [23]) we have that the range $R(A^*)$ is also not a closed subset of $\ell^\infty(\mathbb{N})$ or $c_0$, and only a sufficiently smooth basis $\{e^{(k)}\}_{k \in \mathbb{N}}$ can satisfy Assumption 2.1 and hence Assumption 2.3. If the $\ell^1$-regularization of equation (2) with infinite sequences $x = (x_1, x_2, \ldots)$ is associated to elements $Lx := \sum_{k=1}^\infty x_k u^{(k)} \in \tilde{X}$ with some synthesis operator $L : \ell^1(\mathbb{N}) \to \tilde{X}$ and some Schauder basis $\{u^{(k)}\}_{k \in \mathbb{N}}$ in a Banach space $\tilde{X}$ (see, e.g., [6, Section 2] and [9]), Assumption 2.1 refers to the smoothness of the basis elements $u^{(k)} \in \tilde{X}$. The paper [2] illustrates this matter by means of various linear inverse problems with practical relevance in the context of Gelfand triples.

On the other hand, Example 2.6 in [6] indicates that operators $A$ with diagonal structure in general satisfy Assumption 2.1. However, the following example will show that already for a bidiagonal structure this is not always the case. We give an example of an injective operator $A$ where the basis is not smooth enough to satisfy Assumption 2.3, because (7) is fulfilled and hence by Proposition 2.4 condition (6) is violated.

Example 2.6 (bidiagonal operator). For this example we consider the bounded linear operator $A : \ell^2(\mathbb{N}) \to Y := \ell^2(\mathbb{N})$,

$$[Ax]_k := \frac{x_k - x_{k+1}}{k}, \qquad k = 1, 2, \ldots, \tag{8}$$

with a bidiagonal structure. This operator is evidently injective, and moreover a Hilbert-Schmidt operator due to

$$Ae^{(1)} = e^{(1)}; \qquad Ae^{(k)} = \frac{e^{(k)}}{k} - \frac{e^{(k-1)}}{k-1}, \quad k = 2, 3, \ldots;$$

$$\|A\|_{HS} := \left( \sum_{k=1}^\infty \|Ae^{(k)}\|_Y^2 \right)^{1/2} \le \left( 2 \sum_{k=1}^\infty \frac{1}{k^2} \right)^{1/2} < \infty,$$

and therefore a compact operator. Its restriction to $X := \ell^1(\mathbb{N})$ in the sense of equation (2) is also injective, bounded and even compact, because the embedding operator from $\ell^1(\mathbb{N})$ into $\ell^2(\mathbb{N})$ is injective and bounded. One immediately sees that with

$$\sum_{k \in \mathbb{N}} Ae^{(k)} = 0 \tag{9}$$

an equation (7) with $\lambda_k = 1$, $k \in \mathbb{N}$, is fulfilled. The adjoint operator $A^* : \ell^2(\mathbb{N}) \to \ell^2(\mathbb{N})$ has the explicit representation

$$[A^* \eta]_1 = \eta_1 \quad \text{and} \quad [A^* \eta]_k = \frac{\eta_k}{k} - \frac{\eta_{k-1}}{k-1}, \quad k \ge 2, \tag{10}$$

and the condition (4) cannot hold, which also follows from the general conclusion of Proposition 2.4. To satisfy, for example, the range condition $e^{(1)} = A^* \eta$ we find successively from (10) $\eta_1 = 1$, $\eta_2 = 2, \ldots, \eta_k = k, \ldots$, which violates the requirement $\eta \in \ell^2(\mathbb{N})$.

3 Convergence rates if the basis is not smooth enough

If the basis is not smooth enough, as for example when Proposition 2.4 applies, then for proving convergence rates in $\ell^1$-regularization a weaker assumption has to be established that replaces the stronger Assumption 2.1. We will do this in the following.
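The two concrete computations in Example 2.6 above, the telescoping identity (9) and the divergent candidate source element $\eta_k = k$, can be checked numerically on a finite section of the operator. The truncation level $N$ below is our own illustrative choice; truncation is also the reason for the residual $1/N$ in the first check:

```python
import numpy as np

# Finite N x N section of the bidiagonal operator (8): [Ax]_k = (x_k - x_{k+1}) / k.
N = 200
k = np.arange(1, N + 1)
A = np.zeros((N, N))
A[np.arange(N), np.arange(N)] = 1.0 / k                   # diagonal entries   1/k
A[np.arange(N - 1), np.arange(1, N)] = -1.0 / k[:N - 1]   # superdiagonal     -1/k

# (9): the columns A e^(k) telescope; summing the first N of them leaves e^(N)/N.
residual = A @ np.ones(N)

# Range condition e^(1) = A* eta: solving the triangular system reproduces
# eta_k = k, whose l2-norm diverges as N grows, in line with Example 2.6.
eta = np.linalg.solve(A.T, np.eye(N)[:, 0])
```

The residual vector has norm $1/N$, confirming that the partial sums of $Ae^{(k)}$ tend to zero, while the computed $\eta$ grows linearly and hence leaves every $\ell^2$-ball.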

Definition 3.1 (collection of source sets). For prescribed $c \in [0, 1)$ we say that a sequence $\{S^{(n)}(c)\}_{n \in \mathbb{N}}$ of subsets $S^{(n)}(c) \subset (Y^*)^n$ is a collection of source sets to $c$ if, for arbitrary $n \in \mathbb{N}$, $S^{(n)}(c)$ contains all elements $(f^{(n,1)}, \ldots, f^{(n,n)}) \in (Y^*)^n$ satisfying the following conditions:

(i) For each $k \in \{1, \ldots, n\}$ we have $[A^* f^{(n,k)}]_l = 0$ for $l \in \{1, \ldots, n\} \setminus \{k\}$ and $[A^* f^{(n,k)}]_k = 1$.

(ii) $\sum_{k=1}^n |[A^* f^{(n,k)}]_l| \le c$ for all $l > n$.

The properties of the $A^* f^{(n,k)}$ in items (i) and (ii) of the definition are visualized in Figure 1.

Figure 1 (schematic, image omitted): Structure of the vectors $A^* f^{(n,1)}, \ldots, A^* f^{(n,n)}$. Their first $n$ components form the $n \times n$ identity matrix; beyond position $n$ the entries are arbitrary, but the sums of the absolute values of these entries in each column have to be bounded by $c$.

One easily verifies that each single source set $S^{(n)}(c)$ is convex but not necessarily closed. It is also not clear whether the source sets are nonempty. Thus, we will claim in the sequel the following assumption.

Assumption 3.2. For some $c \in [0, 1)$ there exists a collection of source sets $\{S^{(n)}(c)\}_{n \in \mathbb{N}}$ which contains only nonempty sets $S^{(n)}(c)$.

Assumption 3.2 (with $c = 0$) follows from Assumption 2.1 by observing that $(f^{(1)}, \ldots, f^{(n)}) \in S^{(n)}(0)$. The construction in Definition 3.1 may look a bit technical, but the elements $f^{(n,k)}$ define a type of approximate inverse, as we will now show.

Remark 3.3. Let the linear operator $F^{(n)} : Y \to \ell^1(\mathbb{N})$ be defined by

$$[F^{(n)} y]_k := \begin{cases} \langle f^{(n,k)}, y \rangle_{Y^* \times Y}, & k = 1, \ldots, n, \\ 0, & k = n+1, \ldots. \end{cases} \tag{11}$$

The composition

$$Q^{(n)} := F^{(n)} A \tag{12}$$

is then a bounded linear map from $\ell^1(\mathbb{N})$ into $\ell^1(\mathbb{N})$. The range of $Q^{(n)}$ is the set of sequences $x$ with $x_k = 0$ for $k > n$. Furthermore, one has $(Q^{(n)})^2 = Q^{(n)}$ and $\|Q^{(n)} - P^{(n)}\|_{\ell^1(\mathbb{N}) \to \ell^1(\mathbb{N})} \le c < 1$, where $P^{(n)}$ is the canonical projection of $\ell^1(\mathbb{N})$ onto the set of sequences $x$ with $x_k = 0$ for $k > n$. From this it follows that $Q^{(n)}$ has the infinite matrix representation (cf. Figure 1)

$$\begin{pmatrix} I_n & R_n \\ 0 & 0 \end{pmatrix},$$

where $I_n$ is the $n$-dimensional identity and where $\|R_n\|_{\ell^1(\mathbb{N}) \to \ell^1(\mathbb{N})} \le c$.

As already noted in Remark 2.2, Assumption 2.1 implies the injectivity of the operator $A$. The subsequent proposition shows that injectivity is also necessary for fulfilling Assumption 3.2.

Proposition 3.4. If Assumption 3.2 is satisfied, then $A$ is injective.

Proof. Assume $Ax = 0$ for some $x \in \ell^1(\mathbb{N})$. By Assumption 3.2 there exists some $c \in [0, 1)$ such that for each $n \in \mathbb{N}$ there is an element $(f^{(n,1)}, \ldots, f^{(n,n)}) \in S^{(n)}(c)$, where $\{S^{(n)}(c)\}_{n \in \mathbb{N}}$ denotes the collection of source sets corresponding to $c$. For fixed $k \in \mathbb{N}$ and all $n \ge k$ we have

$$|x_k| = \left| \langle A^* f^{(n,k)}, x \rangle_{\ell^\infty(\mathbb{N}) \times \ell^1(\mathbb{N})} - \sum_{l=n+1}^\infty [A^* f^{(n,k)}]_l \, x_l \right| \le |\langle f^{(n,k)}, Ax \rangle_{Y^* \times Y}| + \left( \sup_{l > n} |[A^* f^{(n,k)}]_l| \right) \sum_{l=n+1}^\infty |x_l| \le c \sum_{l=n+1}^\infty |x_l|.$$

The last expression goes to zero as $n$ tends to infinity. Thus, $x_k = 0$ for arbitrary $k \in \mathbb{N}$ and therefore $x = 0$. Note that we did not need the bound $c < 1$ to prove the proposition.

Remark 3.5. Assumption 3.2 can be seen as an approximate variant of Assumption 2.1. If $\{S^{(n)}(c)\}_{n \in \mathbb{N}}$ is a collection of nonempty source sets and if $(f^{(n,1)}, \ldots, f^{(n,n)}) \in S^{(n)}(c)$, then $A^* f^{(n,k)}$ converges weakly* in $\ell^\infty(\mathbb{N})$ to $e^{(k)}$ for each $k$ as $n \to \infty$.

For deducing convergence rates from Assumption 3.2 we use variational inequalities, which represent an up-to-date tool for deriving convergence rates in Banach space regularization (cf., e.g., [4, 8, 10, 14, 16, 22]) even if no explicit source conditions or approximate source conditions are available. Here our focus is on convergence rates of the form

$$\|x_{\alpha(\delta, y^\delta)}^\delta - x^\dagger\|_{\ell^1(\mathbb{N})} = O(\varphi(\delta)) \quad \text{as } \delta \to 0 \tag{13}$$

for concave rate functions $\varphi$.

Condition (VIE). There is a constant $\beta \in (0, 1]$ and a non-decreasing, concave, and continuous function $\varphi : [0, \infty) \to [0, \infty)$ with $\varphi(0) = 0$ such that a variational inequality

$$\beta \|x - x^\dagger\|_{\ell^1(\mathbb{N})} \le \|x\|_{\ell^1(\mathbb{N})} - \|x^\dagger\|_{\ell^1(\mathbb{N})} + \varphi(\|Ax - Ax^\dagger\|_Y) \tag{14}$$

holds for all $x \in \ell^1(\mathbb{N})$.

Theorem 3.6. Under Assumption 3.2 Condition (VIE) is satisfied. More precisely, we have (VIE) for $\beta = \frac{1-c}{1+c}$ with $c$ from Assumption 3.2 and for the concave function

$$\varphi(t) = 2 \inf_{n \in \mathbb{N}} \left( \sum_{k=n+1}^\infty |x_k^\dagger| + \frac{t}{1+c} \inf_{(f^{(n,1)}, \ldots, f^{(n,n)}) \in S^{(n)}(c)} \sum_{k=1}^n \|f^{(n,k)}\|_{Y^*} \right) \tag{15}$$

for $t \ge 0$, where $S^{(n)}(c)$ is defined in Definition 3.1. As a consequence we have the corresponding convergence rate (13) for that rate function $\varphi$ when the regularization parameter is chosen according to the discrepancy principle.

Proof. For $n \in \mathbb{N}$ define projections $P_n : \ell^1(\mathbb{N}) \to \ell^1(\mathbb{N})$ by $[P_n x]_k := x_k$ if $k \le n$ and $[P_n x]_k := 0$ else. Further, set $Q_n := I - P_n$. Then

$$\beta \|x - x^\dagger\|_{\ell^1(\mathbb{N})} - \|x\|_{\ell^1(\mathbb{N})} + \|x^\dagger\|_{\ell^1(\mathbb{N})} = \beta \|P_n(x - x^\dagger)\|_{\ell^1(\mathbb{N})} + \beta \|Q_n(x - x^\dagger)\|_{\ell^1(\mathbb{N})} - \|P_n x\|_{\ell^1(\mathbb{N})} - \|Q_n x\|_{\ell^1(\mathbb{N})} + \|P_n x^\dagger\|_{\ell^1(\mathbb{N})} + \|Q_n x^\dagger\|_{\ell^1(\mathbb{N})}.$$

The triangle inequality yields $\|Q_n(x - x^\dagger)\|_{\ell^1(\mathbb{N})} \le \|Q_n x\|_{\ell^1(\mathbb{N})} + \|Q_n x^\dagger\|_{\ell^1(\mathbb{N})}$ and $\|P_n x^\dagger\|_{\ell^1(\mathbb{N})} \le \|P_n(x - x^\dagger)\|_{\ell^1(\mathbb{N})} + \|P_n x\|_{\ell^1(\mathbb{N})}$ and therefore

$$\beta \|x - x^\dagger\|_{\ell^1(\mathbb{N})} - \|x\|_{\ell^1(\mathbb{N})} + \|x^\dagger\|_{\ell^1(\mathbb{N})} \le (1+\beta) \|P_n(x - x^\dagger)\|_{\ell^1(\mathbb{N})} - (1-\beta) \|Q_n x\|_{\ell^1(\mathbb{N})} + (1+\beta) \|Q_n x^\dagger\|_{\ell^1(\mathbb{N})} = 2 \|Q_n x^\dagger\|_{\ell^1(\mathbb{N})} + (1+\beta) \|P_n(x - x^\dagger)\|_{\ell^1(\mathbb{N})} - (1-\beta) \bigl( \|Q_n x\|_{\ell^1(\mathbb{N})} + \|Q_n x^\dagger\|_{\ell^1(\mathbb{N})} \bigr).$$

Now choose $\beta = \frac{1-c}{1+c}$, which is equivalent to $c = \frac{1-\beta}{1+\beta}$, and let $(f^{(n,1)}, \ldots, f^{(n,n)}) \in S^{(n)}(c)$ be arbitrary. Then

$$\|P_n(x - x^\dagger)\|_{\ell^1(\mathbb{N})} = \sum_{k=1}^n \left| \langle A^* f^{(n,k)}, x - x^\dagger \rangle_{\ell^\infty(\mathbb{N}) \times \ell^1(\mathbb{N})} - \sum_{l=n+1}^\infty [A^* f^{(n,k)}]_l (x_l - x_l^\dagger) \right| \le \|Ax - Ax^\dagger\|_Y \sum_{k=1}^n \|f^{(n,k)}\|_{Y^*} + \sum_{k=1}^n \sum_{l=n+1}^\infty |[A^* f^{(n,k)}]_l| \, |x_l - x_l^\dagger|$$

and

$$\sum_{k=1}^n \sum_{l=n+1}^\infty |[A^* f^{(n,k)}]_l| \, |x_l - x_l^\dagger| = \sum_{l=n+1}^\infty \left( \sum_{k=1}^n |[A^* f^{(n,k)}]_l| \right) |x_l - x_l^\dagger| \le c \, \|Q_n(x - x^\dagger)\|_{\ell^1(\mathbb{N})} \le \frac{1-\beta}{1+\beta} \bigl( \|Q_n x\|_{\ell^1(\mathbb{N})} + \|Q_n x^\dagger\|_{\ell^1(\mathbb{N})} \bigr).$$

Combining the estimates yields

$$\beta \|x - x^\dagger\|_{\ell^1(\mathbb{N})} - \|x\|_{\ell^1(\mathbb{N})} + \|x^\dagger\|_{\ell^1(\mathbb{N})} \le 2 \|Q_n x^\dagger\|_{\ell^1(\mathbb{N})} + (1+\beta) \|Ax - Ax^\dagger\|_Y \sum_{k=1}^n \|f^{(n,k)}\|_{Y^*} = 2 \sum_{k=n+1}^\infty |x_k^\dagger| + \frac{2 \|Ax - Ax^\dagger\|_Y}{1+c} \sum_{k=1}^n \|f^{(n,k)}\|_{Y^*}$$

for arbitrary $n \in \mathbb{N}$ and arbitrary $(f^{(n,1)}, \ldots, f^{(n,n)}) \in S^{(n)}(c)$. The convergence rate result then immediately follows from Theorem 2 in [15].
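The block structure of $Q^{(n)}$ described in Remark 3.3 can be illustrated with a small synthetic example. The matrix $R$ below is random and purely hypothetical; we only enforce the column-sum bound $c$ from Definition 3.1(ii):

```python
import numpy as np

# Q^(n) = [[I_n, R_n], [0, 0]] with ||R_n||_{l1 -> l1} <= c is idempotent and
# differs from the canonical projection P^(n) by at most c in the l1 operator norm.
rng = np.random.default_rng(0)
n, tail, c = 4, 6, 0.9

R = rng.uniform(-1.0, 1.0, size=(n, tail))
R *= c / np.abs(R).sum(axis=0).max()      # rescale: maximal column l1-sum equals c

Q = np.block([[np.eye(n), R],
              [np.zeros((tail, n)), np.zeros((tail, tail))]])
P = np.block([[np.eye(n), np.zeros((n, tail))],
              [np.zeros((tail, n)), np.zeros((tail, tail))]])

def op_norm_l1(M):
    # the l1 -> l1 operator norm of a matrix is the maximal column sum of absolute values
    return np.abs(M).sum(axis=0).max()
```

One can check directly that $Q^2 = Q$ and that $\|Q - P\|_{\ell^1 \to \ell^1}$ equals the enforced column bound $c$, mirroring the two properties stated in Remark 3.3.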

Now we are going to show that Theorem 3.6 applies to the operator $A$ from Example 2.6, which does not meet Assumption 2.1. In this context, we see that even if a variational inequality (14) holds for all $\beta \in (0, 1)$, with possibly different functions $\varphi$, it does not automatically hold for $\beta = 1$.

Proposition 3.7. Let the operator $A$ be defined by Example 2.6 according to formula (8). If there is a function $\varphi : [0, \infty) \to [0, \infty)$ with $\lim_{t \to 0} \varphi(t) = 0$ such that Condition (VIE) with (14) holds for $\beta = 1$, then we have $x^\dagger = 0$.

Proof. Assume, to the contrary, that $x^\dagger \ne 0$. Without loss of generality at least one component of $x^\dagger$ is positive; if $x^\dagger$ has only nonpositive components, use plus instead of minus in the definition below. For $n \in \mathbb{N}$ define $x^{(n)} \in \ell^1(\mathbb{N})$ by

$$x_k^{(n)} := \begin{cases} x_k^\dagger - \|x^\dagger\|_{\ell^\infty(\mathbb{N})}, & \text{if } k \le n, \\ x_k^\dagger, & \text{if } k > n, \end{cases} \qquad k \in \mathbb{N}.$$

Then

$$\|x^{(n)} - x^\dagger\|_{\ell^1(\mathbb{N})} - \|x^{(n)}\|_{\ell^1(\mathbb{N})} + \|x^\dagger\|_{\ell^1(\mathbb{N})} = n \|x^\dagger\|_{\ell^\infty(\mathbb{N})} - \sum_{k=1}^n \bigl( \|x^\dagger\|_{\ell^\infty(\mathbb{N})} - x_k^\dagger \bigr) - \sum_{k=n+1}^\infty |x_k^\dagger| + \|x^\dagger\|_{\ell^1(\mathbb{N})} = \sum_{k=1}^n x_k^\dagger + \sum_{k=1}^n |x_k^\dagger| = 2 \sum_{k \le n,\, x_k^\dagger > 0} x_k^\dagger$$

and

$$\|Ax^{(n)} - Ax^\dagger\|_Y = \frac{1}{n} \|x^\dagger\|_{\ell^\infty(\mathbb{N})}.$$

Thus, the variational inequality (14) with $\beta = 1$ implies

$$2 \sum_{k \le n,\, x_k^\dagger > 0} x_k^\dagger \le \varphi\!\left( \frac{1}{n} \|x^\dagger\|_{\ell^\infty(\mathbb{N})} \right) \quad \text{for all } n \in \mathbb{N},$$

which is a contradiction, since for sufficiently large $n$ the left-hand side is bounded away from zero while the right-hand side approaches zero. Hence $x^\dagger = 0$.

The next proposition together with Theorem 3.6 shows that for each $\beta \in (0, 1)$ a variational inequality is valid and hence a corresponding convergence rate (13) can be established for $A$ from Example 2.6, where the rate function $\varphi$ arises from properties of $A$ in combination with the decay rate of $x_k^\dagger \to 0$ as $k \to \infty$.

Proposition 3.8. For the operator $A$ from Example 2.6, Assumption 3.2 holds for all $c \in (0, 1)$.

Proof. Let $c \in (0, 1)$ and set $a := \lfloor \frac{1}{c} \rfloor$ and $b := 1 - ca$. Then $a \in \mathbb{N}$, $b \in [0, c)$, and $ca + b = 1$. For $n \in \mathbb{N}$ and $k \in \{1, \ldots, n\}$ define $e^{(n,k)} \in \ell^\infty(\mathbb{N})$ by

$$e^{(n,k)}_{ln+p} := \begin{cases} 1, & \text{if } l = 0, \ p = k, \\ -c, & \text{if } l \in \{1, \ldots, a\}, \ p = k, \\ -b, & \text{if } l = a+1, \ p = k, \\ 0, & \text{else}, \end{cases}$$

for $l \in \mathbb{N}_0$ and $p \in \{1, \ldots, n\}$. Then $\sum_{l=1}^N e^{(n,k)}_l = 0$ for all $N > (a+2)n$. Thus, the sequence $f^{(n,k)}$ defined by

$$f^{(n,k)}_l := l \sum_{m=1}^l e^{(n,k)}_m, \qquad l \in \mathbb{N},$$

belongs to $Y = \ell^2(\mathbb{N})$ and we have $e^{(n,k)} = A^* f^{(n,k)}$. Item (i) in Definition 3.1 is obviously satisfied and item (ii) can be easily deduced from the fact that for fixed $n \in \mathbb{N}$ the elements $e^{(n,1)}, \ldots, e^{(n,n)}$ have mutually disjoint supports. Since for each $n \in \mathbb{N}$ we found $(f^{(n,1)}, \ldots, f^{(n,n)}) \in S^{(n)}(c)$, the source sets $S^{(n)}(c)$ are nonempty.

4 Another example: integration operator and Haar wavelets

In addition to Example 2.6 we now provide another, less artificial, example of a situation where Assumption 3.2 is satisfied but Assumption 2.1 is violated.
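Before turning to that second example, the construction from the proof of Proposition 3.8 can be checked numerically on a finite section. The signs $1, -c, -b$ are as reconstructed above, chosen so that each $e^{(n,k)}$ sums to $1 - ca - b = 0$; truncation level and parameter values are our own illustrative choices:

```python
import numpy as np

def source_element(n, k, c, N):
    # e^(n,k) from Proposition 3.8: entry 1 at position k, entries -c at the a
    # positions l*n + k (l = 1..a), entry -b at (a+1)*n + k, zeros elsewhere.
    a = int(np.floor(1.0 / c))
    b = 1.0 - c * a
    e = np.zeros(N)
    for l in range(a + 2):
        pos = l * n + k - 1                    # 0-based index of component l*n + k
        e[pos] = 1.0 if l == 0 else (-c if l <= a else -b)
    # f_l = l * sum_{m <= l} e_m, so that A* f = e for the adjoint (10)
    f = np.arange(1, N + 1) * np.cumsum(e)
    return e, f

n, k, c, N = 3, 2, 0.4, 60
e, f = source_element(n, k, c, N)

# adjoint (10): [A* f]_1 = f_1 and [A* f]_l = f_l / l - f_{l-1} / (l-1)
l = np.arange(1, N + 1)
g = f / l
Astar_f = g - np.concatenate(([0.0], g[:-1]))
```

Here $A^* f^{(n,k)} = e^{(n,k)}$ holds exactly, and $f^{(n,k)}$ has finite support (the partial sums of $e^{(n,k)}$ vanish beyond $(a+1)n + k$), so it indeed lies in $\ell^2(\mathbb{N})$.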

For $\tilde{X} := L^2(0,1)$ and $Y := L^2(0,1)$ define $\tilde{A} : \tilde{X} \to Y$ by

$$(\tilde{A} \tilde{x})(s) := \int_0^s \tilde{x}(t) \, \mathrm{d}t, \qquad s \in (0, 1). \tag{16}$$

Then $\tilde{X}^* = L^2(0,1)$ and $Y^* = L^2(0,1)$, too, and $\tilde{A}^* : Y^* \to \tilde{X}^*$ is given by

$$(\tilde{A}^* y)(t) = \int_t^1 y(s) \, \mathrm{d}s, \qquad t \in (0, 1). \tag{17}$$

Suppose we know that the unknown solution of $\tilde{A} \tilde{x} = y$ is sparse or at least nearly sparse with respect to the Haar basis, and denote the synthesis operator of the Haar system by $L : \ell^2(\mathbb{N}) \to L^2(0,1)$. Then for given noisy data $y^\delta$ we would like to find approximate solutions $\tilde{x}_\alpha^\delta := L x_\alpha^\delta$, where $x_\alpha^\delta \in \ell^1(\mathbb{N})$ is the minimizer of (1) with $A := \tilde{A} L$. To obtain convergence rates for that method we have to verify Assumption 2.1 or Assumption 3.2. But first let us recall the definition of the Haar basis.

The first element of the Haar system is given by $u^{(1)}(s) := 1$ for $s \in (0, 1)$. All other elements are scaled and translated versions of the function

$$\psi(s) := \begin{cases} 1, & s \in (0, \tfrac{1}{2}), \\ -1, & s \in (\tfrac{1}{2}, 1). \end{cases}$$

More precisely,

$$u^{(1+2^l+k)}(s) := \psi_{l,k}(s) := 2^{l/2} \psi(2^l s - k), \qquad s \in (0, 1),$$

for $l = 0, 1, 2, \ldots$ and $k = 0, 1, \ldots, 2^l - 1$.

The following proposition shows that Assumption 2.1 is not satisfied in this setting, that is, the basis $\{e^{(k)}\}_{k \in \mathbb{N}}$ in $\ell^1(\mathbb{N})$, and thus the Haar basis in $L^2(0,1)$, is not smooth enough with respect to $A$ to obtain convergence rates via Assumption 2.1.

Proposition 4.1. The element $e^{(1)}$ does not belong to $R(A^*)$ and, thus, Assumption 2.1 does not hold.

Proof. Assume that there is some $f^{(1)} \in Y^*$ such that $e^{(1)} = A^* f^{(1)}$. Since $A^* = L^* \tilde{A}^*$ and the Haar system is an orthonormal basis of $L^2(0,1)$, this forces $(\tilde{A}^* f^{(1)})(s) = 1$ for almost all $s$. But elements of $R(\tilde{A}^*)$ always belong to the Sobolev space $H^1(0,1)$ and are zero at $s = 1$. Thus, $1 = (\tilde{A}^* f^{(1)})(1)$ cannot be true and $e^{(1)} = A^* f^{(1)}$ is not possible.
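The computation behind Propositions 4.1 and 4.2 for $n = 1$ can be made concrete: with $f^{(1,1)} = 2$ one has $(\tilde{A}^* f^{(1,1)})(t) = 2(1-t)$, and its Haar coefficients can be evaluated in closed form. The following sketch uses exact piecewise integration; the value $2^{-\frac{3}{2}l-1}$, independent of $k$, matches the statement in the proof of Proposition 4.2 below:

```python
def haar_coeff_of_adjoint(l, k):
    # exact value of <psi_{l,k}, 2(1 - t)> with psi_{l,k}(t) = 2^(l/2) psi(2^l t - k):
    # integrate the linear function 2(1 - t) over the two halves of the support.
    h = 2.0 ** (-l)
    a, mid, b = k * h, (k + 0.5) * h, (k + 1) * h
    F = lambda lo, hi: 2.0 * (hi - lo) - (hi * hi - lo * lo)   # integral of 2(1-t)
    return 2.0 ** (l / 2) * (F(a, mid) - F(mid, b))
```

In the same way one checks the first component: the integral of $2(1-t)$ over $(0,1)$ equals $1$, i.e., $[A^* f^{(1,1)}]_1 = 1$, as required by Definition 3.1(i).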

The second proposition states that Assumption 3.2 is satisfied and thus convergence rates can be obtained for our example.

Proposition 4.2. There is a nonempty collection of source sets $S^{(n)}(c)$ with $c = (4 - \sqrt{8})^{-1} < 1$. For $n = 1$ the element $f^{(1,1)} := 2$ belongs to $S^{(n)}(c)$. For $n = 2^m$ with $m \in \mathbb{N}$ the vector $(f^{(2^m,1)}, \ldots, f^{(2^m,2^m)})$ given by

$$f^{(2^m,1+q)} := -2^{\frac{m}{2}+2} \sum_{p=0}^{2^m-1} c^{(m)}_{1+q,1+p} \, \psi_{m,p}, \qquad q = 0, \ldots, 2^m - 1, \tag{18}$$

belongs to $S^{(n)}(c)$. Here, $c^{(m)}_{1,1+p} = 1$ and

$$c^{(m)}_{1+2^r+s,1+p} := \begin{cases} 2^{\frac{r}{2}}, & 2^{m-r} s \le p \le 2^{m-r}(s+\tfrac{1}{2}) - 1, \\ -2^{\frac{r}{2}}, & 2^{m-r}(s+\tfrac{1}{2}) \le p \le 2^{m-r}(s+1) - 1, \\ 0, & \text{else}, \end{cases}$$

for $r = 0, \ldots, m-1$ and $s = 0, \ldots, 2^r - 1$ and $p = 0, \ldots, 2^m - 1$. For all other values of $n$ an element contained in $S^{(n)}(c)$ can be obtained by truncating a vector from $S^{(2^m)}(c)$ with $2^m \ge n$ after $n$ components. The sum in (15) can be estimated for $n \le 2^m$ by

$$\sum_{k=1}^n \|f^{(n,k)}\|_{Y^*} \le 2 + 2^{2m+2} - 2^{m+2}, \tag{19}$$

where equality holds if $n = 2^m$.

Proof. The proposition can be proven by elementary but lengthy calculations. We only provide the elements $A^* f^{(n,k)}$ and the estimate for $c$. For $n = 1$ we have $[A^* f^{(1,1)}]_1 = 1$ and

$$|[A^* f^{(1,1)}]_{1+2^l+k}| = 2^{-\frac{3}{2}l - 1},$$

where $l \in \mathbb{N}_0$ and $k = 0, \ldots, 2^l - 1$. For $n = 2^m$ with $m \in \mathbb{N}$ the element $A^* f^{(2^m,1+q)}$ has zeros in the first $2^{m+1}$ components except for position $1+q$, where a one appears. The absolute values of the remaining components are given by

$$|[A^* f^{(2^m,1)}]_{1+2^l+k}| = 2^{m - \frac{3}{2}l}$$

and

$$|[A^* f^{(2^m,1+2^r+s)}]_{1+2^l+k}| = \begin{cases} 2^{\frac{r}{2} + m - \frac{3}{2}l}, & 2^{-(l-r)}(k+1) - 1 \le s \le 2^{-(l-r)} k, \\ 0, & \text{else}, \end{cases}$$

where $l > m$, $k = 0, 1, \ldots, 2^l - 1$, $r = 0, 1, \ldots, m-1$, $s = 0, 1, \ldots, 2^r - 1$.

Now we come to the estimate of the constant $c$. First, note that for fixed $m, r, l, k$ there is at most one $s$ such that $[A^* f^{(2^m,1+2^r+s)}]_{1+2^l+k} \ne 0$. Then for fixed $m, l, k$ with $l > m$ we have

$$\sum_{\kappa=1}^{2^m} |[A^* f^{(2^m,\kappa)}]_{1+2^l+k}| \le |[A^* f^{(2^m,1)}]_{1+2^l+k}| + \sum_{r=0}^{m-1} \sum_{s=0}^{2^r-1} |[A^* f^{(2^m,1+2^r+s)}]_{1+2^l+k}| \le 2^{m - \frac{3}{2}l} + \sum_{r=0}^{m-1} 2^{\frac{r}{2} + m - \frac{3}{2}l} = 2^{m - \frac{3}{2}l} \left( 1 + \sum_{r=0}^{m-1} 2^{\frac{r}{2}} \right) = 2^{m - \frac{3}{2}l} \left( 1 + \frac{(\sqrt{2})^m - 1}{\sqrt{2} - 1} \right).$$

Using $l \ge m + 1$ we further estimate

$$\sum_{\kappa=1}^{2^m} |[A^* f^{(2^m,\kappa)}]_{1+2^l+k}| \le 2^{-\frac{m}{2} - \frac{3}{2}} \left( 1 + \frac{(\sqrt{2})^m - 1}{\sqrt{2} - 1} \right) = 2^{-\frac{3}{2}} \left( 2^{-\frac{m}{2}} + \frac{1 - 2^{-\frac{m}{2}}}{\sqrt{2} - 1} \right) \le \frac{2^{-\frac{3}{2}}}{\sqrt{2} - 1} = \frac{1}{4 - \sqrt{8}}.$$

5 Conclusions

We have shown that the source condition $e^{(k)} \in R(A^*)$ for all $k$, as published in [6] for obtaining convergence rates in $\ell^1$-regularization, is rather strong. A

weaker assumption yields a comparable rate result and covers a wider range of settings. Especially nonsmooth bases, e.g. the Haar basis, can be handled even if the forward operator is smoothing and the basis elements do not belong to the range of the adjoint. A major drawback of our new approach (and of the previous one in [6]) is that the assumptions automatically imply injectivity of the operator. Sufficient conditions for convergence rates in $\ell^1$-regularization if the operator is not injective and if the solution is not sparse are not known up to now.

Acknowledgments

The authors thank Bernd Hofmann for many valuable comments on a draft of this article and for fruitful discussions on the subject. J. Flemming was supported by the German Science Foundation (DFG) under grant FL 832/1-1. M. Hegland was partially supported by the Technische Universität München Institute for Advanced Study, funded by the German Excellence Initiative. Work on this article was partially conducted during a stay of M. Hegland at TU Chemnitz, supported by the German Science Foundation (DFG) under grant HO 1454/8-1.

References

[1] S. W. Anzengruber, B. Hofmann, and P. Mathé. Regularization properties of the discrepancy principle for Tikhonov regularization in Banach spaces. Applicable Analysis, published electronically 03 Sept. 2013, http://dx.doi.org/10.1080/00036811.2013.833326.

[2] S. W. Anzengruber, B. Hofmann, and R. Ramlau. On the interplay of basis smoothness and specific range conditions occurring in sparsity regularization. Inverse Problems, 29:125002, 21pp., 2013.

[3] S. W. Anzengruber and R. Ramlau. Convergence rates for Morozov's discrepancy principle using variational inequalities. Inverse Problems, 27:105007, 18pp., 2011.

[4] R. I. Boț and B. Hofmann. An extension of the variational inequality approach for obtaining convergence rates in regularization of nonlinear ill-posed problems. Journal of Integral Equations and Applications, 22(3):369–392, 2010.

[5] K. Bredies and D. A. Lorenz. Regularization with non-convex separable constraints. Inverse Problems, 25(8):085011, 14pp., 2009.

[6] M. Burger, J. Flemming, and B. Hofmann. Convergence rates in $\ell^1$-regularization if the sparsity assumption fails. Inverse Problems, 29(2):025013, 16pp., 2013.

[7] I. Daubechies, M. Defrise, and C. De Mol. An iterative thresholding algorithm for linear inverse problems with a sparsity constraint. Communications on Pure and Applied Mathematics, 57(11):1413–1457, 2004.

[8] J. Flemming. Generalized Tikhonov Regularization and Modern Convergence Rate Theory in Banach Spaces. Shaker Verlag, Aachen, 2012.

[9] M. Grasmair. Well-posedness and convergence rates for sparse regularization with sublinear $\ell^q$ penalty term. Inverse Probl. Imaging, 3(3):383–387, 2009.

[10] M. Grasmair. Generalized Bregman distances and convergence rates for non-convex regularization methods. Inverse Problems, 26(11):115014, 16pp., 2010.

[11] M. Grasmair. Linear convergence rates for Tikhonov regularization with positively homogeneous functionals. Inverse Problems, 27(7):075014, 16pp., 2011.

[12] M. Grasmair, M. Haltmeier, and O. Scherzer. Sparse regularization with $\ell^q$ penalty term. Inverse Problems, 24(5):055020, 13pp., 2008.

[13] M. Grasmair, M. Haltmeier, and O. Scherzer. Necessary and sufficient conditions for linear convergence of $\ell^1$-regularization. Comm. Pure Appl. Math., 64(2):161–182, 2011.

[14] B. Hofmann, B. Kaltenbacher, C. Pöschl, and O. Scherzer. A convergence rates result for Tikhonov regularization in Banach spaces with non-smooth operators. Inverse Problems, 23(3):987–1010, 2007.

[15] B. Hofmann and P. Mathé. Parameter choice in Banach space regularization under variational inequalities. Inverse Problems, 28(10):104006, 17pp., 2012.

[16] B. Hofmann and M. Yamamoto. On the interplay of source conditions and variational inequalities for nonlinear ill-posed problems. Appl. Anal., 89(11):1705–1727, 2010.

[17] D. A. Lorenz. Convergence rates and source conditions for Tikhonov regularization with sparsity constraints. J. Inverse Ill-Posed Probl., 16(5):463–478, 2008.

[18] R. Ramlau. Regularization properties of Tikhonov regularization with sparsity constraints. Electron. Trans. Numer. Anal., 30:54–74, 2008.

[19] R. Ramlau and E. Resmerita. Convergence rates for regularization with sparsity constraints. Electron. Trans. Numer. Anal., 37:87–104, 2010.

[20] R. Ramlau and G. Teschke. Sparse recovery in inverse problems. In Theoretical Foundations and Numerical Methods for Sparse Recovery, volume 9 of Radon Ser. Comput. Appl. Math., pages 201–262. Walter de Gruyter, Berlin, 2010.

[21] O. Scherzer, M. Grasmair, H. Grossauer, M. Haltmeier, and F. Lenzen. Variational Methods in Imaging, volume 167 of Applied Mathematical Sciences. Springer, New York, 2009.

[22] T. Schuster, B. Kaltenbacher, B. Hofmann, and K. S. Kazimierski. Regularization Methods in Banach Spaces, volume 10 of Radon Ser. Comput. Appl. Math. Walter de Gruyter, Berlin/Boston, 2012.

[23] K. Yosida. Functional Analysis, volume 123 of Fundamental Principles of Mathematical Sciences. Springer-Verlag, Berlin, 6th edition, 1980.