A Generalized Uncertainty Principle and Sparse Representation in Pairs of R^N Bases


A Generalized Uncertainty Principle and Sparse Representation in Pairs of R^N Bases

Michael Elad (Department of Computer Science (SCCM), Stanford University, Stanford CA, USA)
Arie Feuer (Department of Electrical Engineering, Technion - Israel Institute of Technology, Technion City, Haifa 32000, Israel)
Alfred M. Bruckstein (Department of Computer Science, Technion - Israel Institute of Technology, Technion City, Haifa 32000, Israel)

November 6th, 2001

Abstract

An elementary proof of a basic uncertainty principle concerning pairs of representations of R^N vectors in different orthonormal bases is provided. The result, slightly stronger than stated before, has a direct impact on the uniqueness property of the sparse representation of such vectors using pairs of orthonormal bases as overcomplete dictionaries. The main contribution in this paper is the improvement of an important result due to Donoho and Huo concerning the replacement of the l_0 optimization problem by a linear programming minimization when searching for the unique sparse representation.

1 Introduction

Given a real vector S in R^N, it has a unique representation in every basis of this space. Indeed, if φ_1, φ_2, ..., φ_N are N orthogonal vectors of unit length, i.e. <φ_i, φ_j> = δ_ij, we have that

S = [φ_1 φ_2 ... φ_N] [α_1 α_2 ... α_N]^T = Σ_{i=1}^N α_i φ_i,

with coefficients α_i given by <φ_i, S>, since

[α_1 α_2 ... α_N]^T = [φ_1 φ_2 ... φ_N]^T S.
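As a quick numerical aside (not part of the original argument), the following sketch, assuming numpy, builds an arbitrary orthonormal basis from a QR factorization, computes the coefficients α_i = <φ_i, S>, and checks that Σ_i α_i φ_i reproduces S:

```python
import numpy as np

rng = np.random.default_rng(0)
N = 8

# An arbitrary orthonormal basis Phi of R^N (columns phi_1..phi_N),
# obtained here from a QR factorization of a random matrix.
Phi, _ = np.linalg.qr(rng.standard_normal((N, N)))

S = rng.standard_normal(N)     # an arbitrary signal
alpha = Phi.T @ S              # coefficients alpha_i = <phi_i, S>
S_rec = Phi @ alpha            # reconstruction sum_i alpha_i * phi_i

print(np.allclose(S, S_rec))   # True: the representation is exact and unique
```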

Suppose now that we have two different bases for R^N: Φ = {φ_1, φ_2, ..., φ_N} and Ψ = {ψ_1, ψ_2, ..., ψ_N}. Then every vector S has representations both in terms of {φ_i} and in terms of {ψ_i}. Let us write then

S = [φ_1 φ_2 ... φ_N] [α_1 α_2 ... α_N]^T = [ψ_1 ψ_2 ... ψ_N] [β_1 β_2 ... β_N]^T.

A question posed by Donoho and his co-workers is the following: is there some benefit in representing S in a joint, overcomplete set of vectors, say {φ_1, φ_2, ..., φ_N, ψ_1, ψ_2, ..., ψ_N} (that can be called a dictionary concatenating the Φ and Ψ bases)? Indeed, we can consider a signal S = γ_k φ_k + γ_l ψ_l that has a very sparse representation in terms of the joint set of vectors forming the Φ and the Ψ bases, but, in general, S will have highly non-sparse representations in either of these bases alone. Sparse representations can have advantages in terms of compression of signals and/or in terms of understanding the underlying processes that generated them. The problem that arises, however, is that in terms of dictionaries of overcomplete sets of vectors (as obtained by concatenating the basis vectors of Φ and Ψ) every signal has multiple representations. Of those multiple representations, choosing one based on sparsity is a difficult optimization problem. Indeed, suppose we have

S = [φ_1 φ_2 ... φ_N ψ_1 ψ_2 ... ψ_N] γ = [Φ Ψ] γ = Σ_{i=1}^N γ^φ_i φ_i + Σ_{i=1}^N γ^ψ_i ψ_i,

where γ = [γ^φ_1 ... γ^φ_N γ^ψ_1 ... γ^ψ_N]^T. Choosing γ involves solving an underdetermined set of equations with N equations and 2N unknowns, and hence must be done subject to additional requirements on the solution. The additional requirement for sparsity would be to minimize the support of γ, i.e., to minimize the number of places where γ is nonzero. Hence we need to solve the problem

(P_0): Minimize ||γ||_0 subject to S = [Φ Ψ]γ,

where ||γ||_0 is the size of the support of γ.
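For a concrete feel of the setting, the sketch below (an illustration only, assuming numpy; the spike basis and an orthonormal DCT-II basis serve as an example pair of bases) builds the N x 2N dictionary [Φ Ψ] and a signal that is 2-sparse in the joint dictionary yet dense in each basis alone:

```python
import numpy as np

N = 16
Phi = np.eye(N)                                    # spike (standard) basis

# Orthonormal DCT-II matrix, used here only as an example of a second basis Psi.
k = np.arange(N)
Psi = np.sqrt(2.0 / N) * np.cos(np.pi * (k[None, :] + 0.5) * k[:, None] / N)
Psi[0, :] /= np.sqrt(2.0)
Psi = Psi.T                                        # columns psi_1..psi_N

D = np.hstack([Phi, Psi])                          # the dictionary [Phi Psi], N x 2N

S = Phi[:, 3] + Psi[:, 5]                          # 2-sparse in the joint dictionary
print(np.count_nonzero(np.round(Phi.T @ S, 12)))   # dense in the Phi basis alone
print(np.count_nonzero(np.round(Psi.T @ S, 12)))   # dense in the Psi basis alone
```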

The main result of Donoho and Huo is that in case S has a very sparse representation, i.e. when there exists γ so that S = [Φ Ψ]γ and ||γ||_0 < Function{[Φ Ψ]}, then this sparse representation is the unique solution not only of (P_0) as defined above, but also of

(P_1): Minimize ||γ||_1 subject to S = [Φ Ψ]γ,

where ||γ||_1 = Σ_j |γ_j| is the l_1-norm of γ. This is an important result stating that very sparse representations can be found by solving not a combinatorial search problem, as implied by (P_0), but a much simpler linear programming problem, as implied by minimizing the l_1-norm of γ. The bound defining sparsity as provided by Donoho and Huo is

Function{[Φ Ψ]} = (1/2)(1 + 1/M), where M = Sup{ |<φ_i, ψ_j>| : ∀ i, j }.

The results of Donoho and Huo are based on exploiting an uncertainty principle showing that a signal cannot have representations that are simultaneously sparse in two orthonormal bases. This then leads to a result showing that sufficiently sparse representations in dictionaries concatenating the two bases must be unique, establishing the uniqueness of the solution of problem (P_0) as defined above. Subsequently, for very sparse representations (considerably sparser than the ones ensuring uniqueness!) it is shown that solving (P_1) leads also to the unique solution of (P_0). In the general case discussed here, Donoho and Huo showed in [2] that uniqueness of the solution of the (P_0)-problem is ensured for ||γ||_0 < (1/2)(1 + 1/M), and that in this case (P_1) provides this unique solution as well.

Here we follow the path of the work of Donoho and Huo, and improve their bounds. First, we prove an improved uncertainty principle leading to better bounds yielding uniqueness of the (P_0) solution. The result is that uniqueness of the (P_0)-solution is achieved for ||γ||_0 < 1/M. Our main contribution in this paper is an improvement of the result concerning the replacement of the (P_0) minimization problem with the (P_1) convex minimization, while achieving the same solution. We show that the solutions of the two problems ((P_0) and (P_1)) coincide for ||γ||_0 < 1/M, i.e. we show that uniqueness of the sparse representation also implies equivalence between the (P_0) and the (P_1) problems. This way we enlarge the class of signals for which we can apply a simple, linear-programming based search for the optimal sparse representation.
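The scalar M is the only property of the pair (Φ, Ψ) that enters these bounds. As a small illustration (assuming numpy; the random orthonormal bases and the helper name mutual_coherence are ours), M is simply the largest absolute entry of Φ^T Ψ, and the Donoho-Huo threshold is then one line:

```python
import numpy as np

def mutual_coherence(Phi, Psi):
    """M = Sup |<phi_i, psi_j>| = the largest absolute entry of Phi^T Psi."""
    return np.max(np.abs(Phi.T @ Psi))

rng = np.random.default_rng(1)
N = 16
Phi, _ = np.linalg.qr(rng.standard_normal((N, N)))   # two arbitrary orthonormal bases
Psi, _ = np.linalg.qr(rng.standard_normal((N, N)))

M = mutual_coherence(Phi, Psi)
bound_dh = 0.5 * (1.0 + 1.0 / M)     # Donoho-Huo sparsity bound (1/2)(1 + 1/M)
print(M, bound_dh)
```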

2 The Basic Uncertainty Principle

We shall first prove a basic uncertainty principle concerning pairs of representations of a given vector S (or signal) in two given orthonormal bases Φ and Ψ. Suppose a signal S has the representations

S = [φ_1 φ_2 ... φ_N] α = Φα = [ψ_1 ψ_2 ... ψ_N] β = Ψβ,

and, w.l.o.g., we assume that S^T S = 1, i.e. we have normalized the l_2 energy of the signal to 1. We have α_i = <S, φ_i> and β_j = <S, ψ_j>. Now let us write

1 = S^T S = α^T Φ^T Ψ β,

hence, writing out the entries φ_i^T ψ_j of the matrix Φ^T Ψ,

1 = Σ_{i=1}^N Σ_{j=1}^N α_i <φ_i, ψ_j> β_j.   (1)

Writing

M = Sup{ |<φ_i, ψ_j>| : ∀ (i, j) } = Sup{ |φ_i^T ψ_j| : ∀ (i, j) },

all the entries in the matrix Φ^T Ψ are smaller than or equal to M in absolute value. We further have the (Parseval) energy preservation property

1 = S^T S = Σ_i α_i^2 = Σ_i β_i^2.

Now assume that

||α||_0 = A and ||β||_0 = B.

Then Equation (1) becomes

1 = Σ_{i'=1}^A Σ_{j'=1}^B α_{i'} <φ_{i'}, ψ_{j'}> β_{j'},

where i' runs over the nonzero support of α and j' runs over the nonzero support of β. Now we can write that

1 = | Σ_{i'=1}^A Σ_{j'=1}^B α_{i'} <φ_{i'}, ψ_{j'}> β_{j'} | ≤ Σ_{i'=1}^A Σ_{j'=1}^B |α_{i'}| |<φ_{i'}, ψ_{j'}>| |β_{j'}|

≤ M Σ_{i'=1}^A Σ_{j'=1}^B |α_{i'}| |β_{j'}|   (2)

= M ( Σ_{i'=1}^A |α_{i'}| ) ( Σ_{j'=1}^B |β_{j'}| ).

Next, in order to bound the above expression from above, let us solve

Maximize ( Σ_{i=1}^A α_i ) ( Σ_{j=1}^B β_j ) subject to α_i, β_j > 0, Σ_{i=1}^A α_i^2 = 1, Σ_{j=1}^B β_j^2 = 1.

Since this problem is separable, we maximize the two factors independently:

Maximize Σ_{i=1}^A α_i subject to α_i > 0, Σ_{i=1}^A α_i^2 = 1,

Maximize Σ_{j=1}^B β_j subject to β_j > 0, Σ_{j=1}^B β_j^2 = 1.

To solve these problems let us consider the following Lagrangian (note that we do not explicitly enforce the positivity constraint, but verify its correctness at the end of the process):

L(α) = Σ_{i=1}^A α_i + λ ( 1 - Σ_{i=1}^A α_i^2 ),

∂L/∂α_i = 1 - 2λ α_i = 0  ⟹  α_i = 1/(2λ).

Next, using the unit-norm constraint we get

Σ_{i=1}^A α_i^2 = Σ_{i=1}^A (1/(2λ))^2 = A/(4λ^2) = 1  ⟹  λ = √A / 2  ⟹  α_i = 1/√A.

Therefore the maximal value of Σ_{i=1}^A α_i is

Σ_{i=1}^A α_i = A / √A = √A,

and the maximal value of Σ_{i'=1}^A |α_{i'}| Σ_{j'=1}^B |β_{j'}| is √A · √B = √(AB). Returning to our derivation of the uncertainty relations in Equation (2) we now have:

1 ≤ M Σ_{i'=1}^A |α_{i'}| Σ_{j'=1}^B |β_{j'}| ≤ M √(AB).

We have obtained the following result: If we have two representations of a signal S in the bases Φ = [φ_1 φ_2 ... φ_N] and Ψ = [ψ_1 ψ_2 ... ψ_N], and the coefficient vectors α and β have supports of sizes ||α||_0 = A and ||β||_0 = B, then

1 ≤ M √(AB), where M = Sup{ |<φ_i, ψ_j>| : ∀ (i, j) }.

Using the well known inequality between the geometric and arithmetic means, (A + B)/2 ≥ √(AB), we also have that

A + B ≥ 2 √(AB) ≥ 2/M.

Donoho and Huo obtained, by emulating arguments for the sinusoidal and spike bases, a weaker result stating that A + B ≥ 1 + 1/M (see eq. (7.1) of [2]). Clearly 2/M ≥ 1 + 1/M, since M ≤ 1. In [2] it is said that their "... general bound (i.e. (7.1)) can be a factor of two away from sharpness" w.r.t. those (earlier particular, i.e. sinusoidal and spike bases) cases, and that "its generality can be an advantage in some cases." We see here that an elementary derivation provides tight bounds that are not a factor of two away from sharpness. Indeed, as M → 1/√N (as happens for spikes and complex sinusoids), 1 + 1/M goes to 1 + √N while 2/M goes to 2√N, which is the bound claimed for this particular case in [2] (see e.g. (2.1)). In fact, the uncertainty result that we have obtained is even stronger, since we got √(AB) ≥ 1/M, i.e. AB ≥ 1/M^2. The use of the arithmetic mean instead of the geometric one loses tightness and is correct only for A = B. Also, this uncertainty result is of the form of the classical multiplicative uncertainty results (see e.g. [3]), σ_t^2 σ_ω^2 ≥ 1/4.

The value of M is crucial in the above arguments. For any pair of orthonormal bases Φ and Ψ of R^N we have that 1/√N ≤ M ≤ 1. To see this, one simply notices that Φ^T Ψ is an orthonormal matrix, having the sum of squares of its entries equal to N. All entries cannot therefore be less than 1/√N, since then we would have that the sum of all squared entries is less than N.
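The extreme value M = 1/√N is attained, for instance, by the spike basis paired with a normalized Hadamard basis when N is a power of two. The sketch below (an illustration assuming numpy; the particular test signal is an arbitrary choice) checks this value of M and verifies both forms of the uncertainty bound on a signal that is sparse in the two bases simultaneously:

```python
import numpy as np

N = 16                                     # a power of two
Phi = np.eye(N)                            # spike basis

# Normalized Hadamard matrix, an orthonormal basis built by Kronecker products.
H = np.array([[1.0]])
while H.shape[0] < N:
    H = np.kron(H, np.array([[1.0, 1.0], [1.0, -1.0]]))
Psi = H / np.sqrt(N)

M = np.max(np.abs(Phi.T @ Psi))
print(np.isclose(M, 1.0 / np.sqrt(N)))     # True: the extreme case M = 1/sqrt(N)

S = np.zeros(N)
S[[0, 4, 8, 12]] = 0.5                     # unit-energy signal, A = 4 spikes
A = np.count_nonzero(np.round(Phi.T @ S, 12))
B = np.count_nonzero(np.round(Psi.T @ S, 12))
print(A, B, A * B >= 1.0 / M**2, A + B >= 2.0 / M)   # both uncertainty bounds hold
```

For this particular signal A·B = 1/M^2 exactly, which illustrates that the multiplicative form of the bound can be tight for the spike/Hadamard pair.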

3 Uniqueness of Sparse Representations

A direct consequence of the uncertainty relation derived above is the following fact: if we have a sparse representation in terms of a dictionary that is the concatenation of two orthonormal bases Φ and Ψ, it is unique. How sparse the representation should be to achieve such uniqueness depends crucially on the bound provided by the uncertainty principle. The connection follows easily from the following line of argumentation (taken from [2]). Suppose γ_1 and γ_2 are the coefficient vectors of two different representations of the same signal S in the dictionary [Φ Ψ], i.e.

S = [Φ Ψ]γ_1 = [Φ Ψ]γ_2.

Then clearly

[Φ Ψ][γ_1 - γ_2] = 0,

or

Φ[γ_1 - γ_2]^φ + Ψ[γ_1 - γ_2]^ψ = 0  ⟹  Φγ^φ = Ψγ^ψ = Λ.

Hence in this case we have two vectors γ^φ and γ^ψ (defined as the upper N values in γ_1 - γ_2 and the lower N values in γ_2 - γ_1) that are different from 0 and represent the same vector Λ in two orthogonal bases. Now the basic uncertainty principle states that if ||γ^φ||_0 = A and ||γ^ψ||_0 = B then we must have

A + B ≥ 2√(AB) ≥ 2/M.

Suppose that the original representations were both sparse, i.e. that

||γ_1||_0 < F and ||γ_2||_0 < F.

Then we must necessarily have

||γ_1 - γ_2||_0 ≤ ||γ_1||_0 + ||γ_2||_0 < 2F.

On the other hand we have

||γ_1 - γ_2||_0 = ||γ^φ||_0 + ||γ^ψ||_0 = A + B.

Hence sparsity of both γ_1 and γ_2 with bound F implies that A + B < 2F. But by the uncertainty principle we have A + B ≥ 2/M. In conclusion, if F were 1/M or smaller, assuming two different sparse representations would contradict the uncertainty principle. Hence we have the following uniqueness theorem: If a signal S has a sparse representation in the dictionary [Φ Ψ] so that S = [Φ Ψ]γ and ||γ||_0 < 1/M, then this sparse representation is necessarily unique (i.e., there cannot be two different γ's obeying ||γ||_0 < 1/M that represent the same signal).
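The mechanism behind this argument is easy to see numerically: adding any null-space vector of [Φ Ψ] to a representation produces another representation of the same signal, and the support of that null-space vector can never be smaller than 2/M. A minimal sketch, assuming numpy and reusing the spike/Hadamard pair as an example:

```python
import numpy as np

N = 16
Phi = np.eye(N)
H = np.array([[1.0]])
while H.shape[0] < N:
    H = np.kron(H, np.array([[1.0, 1.0], [1.0, -1.0]]))
Psi = H / np.sqrt(N)
D = np.hstack([Phi, Psi])
M = np.max(np.abs(Phi.T @ Psi))                       # = 1/4 here

# A null-space vector of [Phi Psi]: Phi * psi_5 + Psi * (-e_5) = 0.
x = np.concatenate([Psi[:, 5], -np.eye(N)[:, 5]])

gamma1 = np.zeros(2 * N)
gamma1[2] = 1.0                                       # one (sparse) representation
gamma2 = gamma1 + x                                   # another representation of the same S

print(np.allclose(D @ gamma1, D @ gamma2))            # True: both represent the same signal
print(np.count_nonzero(gamma1 - gamma2) >= 2.0 / M)   # True: supports differ by at least 2/M
```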

The bound 1/M is a better bound than the one implied by the uncertainty principle stated in [2], which would yield (1/2)(1 + 1/M). This means that the uniqueness result will be true for much less sparse representations than those required by the bound provided in Ref. [2].

4 Finding Sparse Representations via l_1 Optimization

The next question that naturally arises is: if a signal S has a sparse representation in a dictionary [Φ Ψ], how should we find it? Solving the l_0 optimization problem defined as

(P_0): Minimize ||γ||_0 = Σ_{k=1}^{2N} |γ_k|^0 subject to S = [Φ Ψ]γ

involves an unfeasible search problem. However, it was discovered experimentally that solving the l_1 optimization problem

(P_1): Minimize ||γ||_1 = Σ_{k=1}^{2N} |γ_k| subject to S = [Φ Ψ]γ

often provides the sparse representation. Donoho and Huo proved in [2] that the strong sparsity condition ||γ||_0 < (1/2)(1 + 1/M) is a condition ensuring that the solution of the problem (P_1) yields the sparse solution of (P_0) too. This is a wonderful result, since (P_1) is essentially a linear programming problem! To show that (P_1) also provides the (P_0) solution one has to prove that if ||γ||_0 < F and [Φ Ψ]γ = S then, if there exists some other representation [Φ Ψ]γ̃ = S, we must have ||γ̃||_1 ≥ ||γ||_1. In words, we need to show that the (unique) sparse γ is also shortest in the l_1-metric. As was said above, Donoho and Huo have shown that for F = (1/2)(1 + 1/M) this requirement is met. We first describe below the derivation of this result; then, in the next section, we show how this bound can be improved.
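On the computational side, (P_1) reduces to a standard linear program through the usual split γ = u - v with u, v ≥ 0, so that ||γ||_1 equals Σ_k (u_k + v_k) at the optimum. A minimal sketch, assuming numpy and scipy.optimize.linprog; the spike/Hadamard dictionary and the 2-sparse test signal are illustrative choices, and the helper name solve_p1 is ours:

```python
import numpy as np
from scipy.optimize import linprog

def solve_p1(D, S):
    """(P1): minimize ||g||_1 subject to D g = S, posed as an LP via g = u - v, u, v >= 0."""
    n_atoms = D.shape[1]
    c = np.ones(2 * n_atoms)                 # objective sum(u) + sum(v)
    A_eq = np.hstack([D, -D])                # D u - D v = S
    res = linprog(c, A_eq=A_eq, b_eq=S, bounds=(0, None), method="highs")
    u, v = res.x[:n_atoms], res.x[n_atoms:]
    return u - v

N = 16
Phi = np.eye(N)
H = np.array([[1.0]])
while H.shape[0] < N:
    H = np.kron(H, np.array([[1.0, 1.0], [1.0, -1.0]]))
Psi = H / np.sqrt(N)
D = np.hstack([Phi, Psi])

gamma_true = np.zeros(2 * N)
gamma_true[3], gamma_true[N + 5] = 1.0, -2.0                 # ||gamma||_0 = 2
S = D @ gamma_true
print(np.allclose(solve_p1(D, S), gamma_true, atol=1e-6))    # True: (P1) recovers it
```

Here ||γ||_0 = 2 is below the Donoho-Huo threshold (1/2)(1 + 1/M) = 2.5 for this pair of bases, so the recovery by (P_1) is guaranteed by the result quoted above.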

Following [2] we have that if

[Φ Ψ]γ = S = [Φ Ψ]γ̃,

then [Φ Ψ][γ̃ - γ] = 0. Therefore the difference vector x = γ̃ - γ satisfies

Φ x^φ = Ψ(-x^ψ),   (3)

where we define x^φ as the first N entries in x, and x^ψ as the last N entries in x. We need to show that for every nonzero vector x that obeys Equation (3) we shall have that

Σ_{k=1}^{2N} |γ_k + x_k| - Σ_{k=1}^{2N} |γ_k| ≥ 0.

Hence we need to show that

Σ_{off support of γ} |x_k| + Σ_{on support of γ} ( |γ_k + x_k| - |γ_k| ) ≥ 0.

In order to simplify notations in the following analysis, we define

Σ_all |x_k| = Σ_{k=1}^{2N} |x_k|,
Σ_off |x_k| = Σ_{off support of γ} |x_k|,
Σ_on |x_k| = Σ_{on support of γ} |x_k|.

Since (due to |v + m| ≥ |v| - |m|) we have |γ_k + x_k| - |γ_k| ≥ -|x_k|, we can write that

Σ_off |x_k| + Σ_on ( |γ_k + x_k| - |γ_k| ) ≥ Σ_off |x_k| - Σ_on |x_k|.

Plugging this into the above inequality, it is enough to require

Σ_off |x_k| - Σ_on |x_k| ≥ 0.

We can simplify this inequality by the following step

Σ_off |x_k| + Σ_on |x_k| - 2 Σ_on |x_k| ≥ 0,

which finally gives

(1/2) Σ_all |x_k| - Σ_on |x_k| ≥ 0, or equivalently, Σ_on |x_k| / Σ_all |x_k| ≤ 1/2.   (4)

Therefore, if we show that for some value F of the support size of γ, all x obeying (3) also satisfy the inequality in (4), then the assertion ||γ̃||_1 > ||γ||_1 follows, and the (P_1)-problem can replace the (P_0)-problem while searching for the sparse γ.

To prove Equation (4) above, consider the problem:

Minimize Σ_{i=1}^N |x^φ_i| + Σ_{i=1}^N |x^ψ_i| subject to { |x^{φ/ψ}_k| = V and Φ x^φ = Ψ(-x^ψ) }.

If the minimum attained over all conditions |x^{φ/ψ}_k| = V is given by some f(V) = V·F(M), then we will be able to say that for all such x, and for every index,

|x_k| / Σ_all |x_j| ≤ V / (V·F(M)) = 1/F(M).

But we have assumed that ||γ||_0 < F, hence we shall have

Σ_on |x_i| / Σ_all |x_i| ≤ F / F(M).

If we have F ≤ (1/2)·F(M), then the required condition for the optimality of γ for the l_1 problem will be met. We shall next prove that F(M) ≥ 1 + 1/M, and this will lead to the result that if F ≤ (1/2)(1 + 1/M) the ratio Σ_on |x_i| / Σ_all |x_i| is at most 1/2, as needed.

Since we have Φ x^φ = -Ψ x^ψ, we can write that x^φ = -Φ^T Ψ x^ψ. Suppose w.l.o.g. that the condition we have is that |x^φ_k| = V. Then we have

V ≤ max_j |x^φ_j| = |x^φ_v| for some index v, and |x^φ_v| = | [Φ^T Ψ]_{row v} x^ψ | ≤ M ||x^ψ||_1.

Hence

||x^ψ||_1 ≥ V / M.

We also obviously have

||x^φ||_1 ≥ V,

hence

||x^ψ||_1 + ||x^φ||_1 ≥ V (1 + 1/M).

Hence we may take

F(M) = 1 + 1/M.

This proves that if

||γ||_0 < (1/2)(1 + 1/M),

the very sparse representation will be provided by the solution of (P_1) too. So far we have seen that the sparse representation is unique if ||γ||_0 < 1/M, and that if

||γ||_0 < (1/2)(1 + 1/M)

the unique sparse representation is provided by solving a linear programming problem, (P_1). Hence if

(1/2)(1 + 1/M) ≤ ||γ||_0 < 1/M

we have uniqueness of the (P_0) solution, but it is not sure that (P_1) will provide this solution. So, obviously there remains a gap, and the natural question is what happens for vectors of sparsity in this gap. A wealth of simulations we have done has shown that for signals with a (unique) sparse representation in this gap the equivalence between the (P_0) and the (P_1) solutions remains true. Motivated by these empirical results, we succeeded to prove that there is indeed no gap and the same bound holds for both the uniqueness and the (P_0)-(P_1) equivalence.

5 Improving the Bound for l_1 Optimization

In the previous section we closely followed Donoho and Huo's work and reproduced their results. Next we take a different route in order to obtain an improved bound necessary for equivalence between (P_0) and (P_1). Equations (3) and (4) can be re-interpreted as an optimization problem of the following form:

Minimize (1/2) Σ_all |x_k| - Σ_on |x_k| subject to Φ x^φ = Ψ(-x^ψ).

This problem should be solved for various values of F (F = ||γ||_0) and all profiles of non-zero entries in γ. As F increases, the minimum of the above expression decreases and tends to a negative value. The maximal F that yields a minimum that is still above zero will be the bound on the sparsity of γ ensuring equivalence between (P_0) and (P_1). However, working with the above minimization problem is complicated for several reasons:

1. We have not explicitly set conditions on x to avoid the trivial solution x = 0,
2. The problem involves both the entries of the vector x and their absolute values,
3. The orthonormal matrices Ψ and Φ appear explicitly, and we would like them to influence the result only through the parameter M they define,
4. The problem is clearly sensitive not only to the number of non-zero elements in the support of γ, but also to their position.

In order to solve the first problem, we introduce an additional constraint that prevents the trivial solution. Such a constraint could be posed on the l_2 or the l_1 norm of the unknown vector x, i.e., x^T x = 1 or Σ_all |x_k| = 1. The newly added constraint will clearly not change the sign of the result, i.e., if for some F the minimization result becomes negative, then it would have been so without the constraint as well. Thus, the new constraint does not interfere with our goals in this optimization problem. As to the other problems, we solve them via the definition of an alternative minimization problem. In this new problem we minimize the same function, but pose a weaker set of constraints, as follows:

Minimize (1/2) Σ_all |x_k| - Σ_on |x_k| subject to:
1. |x^φ| ≤ M 1_{N×N} |x^ψ|
2. |x^ψ| ≤ M 1_{N×N} |x^φ|
3. Σ_all |x_k| = 1

where 1_{N×N} is an N by N matrix containing ones in all its entries, and the absolute values of the vectors are taken entry-wise. The first and second constraints simply use the fact that any given entry in one of the vectors (x^ψ or x^φ) cannot be greater than M multiplying the sum of the absolute entries in the other vector. Clearly, every feasible solution of the original constraint set is also a feasible solution of the new constraint set, but not vice-versa. Thus, if the minimum of the function is still positive using the new constraint set, it implies that it is surely positive using the original constraint set. Looking closely at the newly defined optimization problem, we can further simplify it by noticing first that the sum over all entries that appears in the function is known to be 1 due to the third constraint. Also, only the absolute values of the unknown vector x play a role here. Exploiting these two properties we can rewrite the new problem and get

Minimize [ 1/2 - 1_{γ_1}^T X_1 - 1_{γ_2}^T X_2 ] subject to:
1. X_1 ≤ M 1_{N×N} X_2
2. X_2 ≤ M 1_{N×N} X_1
3. 1_N^T (X_1 + X_2) = 1   (5)
4. X_1 ≥ 0, X_2 ≥ 0

where we define X_1 and X_2 as the absolute values of the original vectors x^ψ and x^φ. This is the reason we added the fourth constraint regarding positivity of the unknowns. The notations

1_{γ_1} and 1_{γ_2} stand for vectors of length N, [1_{γ_1} 1_{γ_2}] being the 2N vector with ones where γ is non-zero. 1_N is simply an N vector containing all ones. If we assume that there are K_1 non-zeros in 1_{γ_1} and K_2 non-zeros in 1_{γ_2}, then K_1 + K_2 = ||γ||_0. In the new formulation, we can assume w.l.o.g. that the K_1 and K_2 non-zeros are located at the beginning of the two vectors X_1 and X_2, due to the symmetrical form of the constraints.

The problem we have obtained is a classical Linear Programming (LP) problem, and as such, has a unique local minimum point which is also the unique global minimum point. Let us bring it to its canonical form

(P): Minimize C^T Z subject to A Z ≥ B, Z ≥ 0

(P denotes the fact that this is a primal LP form). The matrices (A, Z, B, C) involved are defined as follows:

Z^T = [ X_1^T  X_2^T ],
C^T = -[ 1_{γ_1}^T  1_{γ_2}^T ],
B^T = [ 0·1_N^T  0·1_N^T  1  -1 ],
A = [ -I_N        M·1_{N×N} ;
      M·1_{N×N}  -I_N       ;
      1_N^T       1_N^T      ;
     -1_N^T      -1_N^T      ].

Notice that the equality constraint in the original problem is replaced by two inequalities, in order to fit the required canonical form. In the problem defined in Equation (5) we wanted conditions so that the minimum will not be negative. After removing the 1/2 from the function, the new requirement becomes C^T Z ≥ -0.5. It is still difficult to give an analytic form to the solution of the LP problem we obtained. Instead, we shall exploit the dual LP problem of the form

(D): Maximize B^T U subject to A^T U ≤ C, U ≥ 0

(D here denotes that this is a dual LP form), with the same matrices as in the primal problem (see e.g. [4]). The approach we are going to take is as follows: We know that the primal and the dual problems attain the same optimal value ([4]), i.e.,

Minimize {C^T Z} = Maximize {B^T U}.

We require this optimal value to be higher than or equal to -0.5. In the dual problem, if we find a feasible solution U such that B^T U ≥ -0.5, we guarantee that the maximal value is also above -0.5, and thus we fulfill the original requirement on the primal problem.
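The duality facts used here (equal optimal values, and every dual-feasible value bounding the primal optimum from below) are standard and can be sanity-checked on a small generic instance. A minimal sketch assuming scipy.optimize.linprog; the random matrices below are placeholders in the canonical forms (P) and (D), not the specific A, B, C defined above:

```python
import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(2)
m, n = 4, 6
A = rng.standard_normal((m, n))
C = rng.uniform(1.0, 2.0, n)                 # positive costs keep the primal bounded
Z0 = rng.uniform(0.0, 1.0, n)
B = A @ Z0 - rng.uniform(0.0, 1.0, m)        # chosen so that Z0 is primal-feasible

# Primal (P): min C^T Z  s.t.  A Z >= B, Z >= 0   (linprog expects <=, hence the negation)
primal = linprog(C, A_ub=-A, b_ub=-B, bounds=(0, None), method="highs")
# Dual (D):   max B^T U  s.t.  A^T U <= C, U >= 0  (maximize by minimizing -B^T U)
dual = linprog(-B, A_ub=A.T, b_ub=C, bounds=(0, None), method="highs")

print(np.isclose(primal.fun, -dual.fun))     # True: equal optimal values (strong duality)
```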

Let us consider the following parameterized form for a feasible solution for U:

U^T = [ 1_{γ_1}^T  α·1_{γ_2}^T  β  γ ].

Using previous notations, there are K_1 non-zeros in 1_{γ_1} and K_2 non-zeros in 1_{γ_2}. We assume w.l.o.g. that K_1 ≤ K_2 (the problem is perfectly symmetric with respect to these two sections). We also assume that 0 ≤ α ≤ 1. This way, three parameters (α, β, γ) govern the entire solution U, and we need to find requirements on them in order to guarantee that the proposed solution is indeed feasible. Substituting the proposed solution form into the constraint inequalities of the dual problem we get

-1_{γ_1} + α M 1_{N×N} 1_{γ_2} + (β - γ) 1_N ≤ -1_{γ_1},
M 1_{N×N} 1_{γ_1} - α 1_{γ_2} + (β - γ) 1_N ≤ -1_{γ_2},

and rearranging these inequalities we get

α M 1_{N×N} 1_{γ_2} + (β - γ) 1_N ≤ 0,
(1 - α) 1_{γ_2} + M 1_{N×N} 1_{γ_1} + (β - γ) 1_N ≤ 0.

The multiplication 1_{N×N} 1_{γ_2} is simply K_2 1_N, and similarly, 1_{N×N} 1_{γ_1} = K_1 1_N. Thus, the first set of N inequalities is actually a simple scalar requirement

α M K_2 + (β - γ) ≤ 0.

If 0 ≤ α ≤ 1 as assumed before, the second set of inequalities can be replaced by the following scalar requirement

(1 - α) + M K_1 + (β - γ) ≤ 0.

(For the N - K_2 places where 1_{γ_2} has zeros, we get the inequality M K_1 + (β - γ) ≤ 0, which is weaker than the above one.) Solving the two inequalities as equalities, we get

α M K_2 = (1 - α) + M K_1  ⟹  α = (1 + M K_1) / (1 + M K_2).

We see that, indeed, our assumption 0 ≤ α ≤ 1 is correct, since we assumed K_1 ≤ K_2 (note that M > 0). So far we have found an expression for the first parameter α. As to the other two, substituting α in the above equations we get

(β - γ) = -α M K_2 = -M K_2 (1 + M K_1) / (1 + M K_2) < 0.

Thus, we can choose β = 0 and γ will be the above expression multiplied by -1. This way we have satisfied all the inequality constraints, and obtained a solution U which is also non-negative in all its entries.
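The algebra behind this construction is easy to verify numerically: with α = (1 + MK_1)/(1 + MK_2), β = 0 and γ = αMK_2, both scalar constraints hold with equality and 0 ≤ α ≤ 1 whenever K_1 ≤ K_2. A minimal check, assuming numpy; the sample values of K_1, K_2 and M are arbitrary and the helper name is ours:

```python
import numpy as np

def dual_solution_params(K1, K2, M):
    """alpha and (beta - gamma) of the proposed dual-feasible solution (assumes K1 <= K2, M > 0)."""
    alpha = (1.0 + M * K1) / (1.0 + M * K2)
    beta_minus_gamma = -alpha * M * K2          # i.e. beta = 0, gamma = alpha * M * K2 >= 0
    return alpha, beta_minus_gamma

K1, K2, M = 2, 3, 0.25                          # arbitrary sample values with K1 <= K2
alpha, bmg = dual_solution_params(K1, K2, M)
print(0.0 <= alpha <= 1.0)                              # True
print(np.isclose(alpha * M * K2 + bmg, 0.0))            # first scalar constraint, tight
print(np.isclose((1.0 - alpha) + M * K1 + bmg, 0.0))    # second scalar constraint, tight
```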

Now that we have established that the proposed solution is feasible, let us look at the value of the function. The function is simply B^T U = (β - γ). So we should require

B^T U = (β - γ) = -M K_2 (1 + M K_1) / (1 + M K_2) ≥ -1/2.

Thus

2 M^2 K_1 K_2 + M K_2 - 1 ≤ 0.   (6)

We have therefore obtained a requirement on K_1 and K_2 which is posed in terms of the parameter M. Thus, given a vector S ∈ R^N, the two orthonormal bases Φ and Ψ, and their induced cross-product factor M, we solve the (P_1)-problem and obtain a candidate representation γ. From this representation we can determine K_1 and K_2, the numbers of non-zeros in γ's two N-sections. Plugging them into the inequality (6) we immediately know whether this is also the (P_0)-solution, and hence also the sparsest representation.

We now proceed and produce a simpler sparsity condition on the representation γ ensuring that (P_1) produces the unique and sparse (P_0) solution. Assuming that we know K_2, the requirement on K_1 is

2 M^2 K_1 K_2 + M K_2 - 1 ≤ 0  ⟹  K_1 ≤ (1 - M K_2) / (2 M^2 K_2).

Adding K_2 to both sides of the above inequality we get

K_1 + K_2 ≤ (1 - M K_2) / (2 M^2 K_2) + K_2 = (1/M) · (1 - M K_2 + 2 (M K_2)^2) / (2 M K_2).

Let us look at the term which multiplies the 1/M in the above bound. This term is a function of M K_2, which is known to be non-negative. The minimum of this term is given by

f(x) = (1 - x + 2x^2) / (2x),

df(x)/dx = [ 2x(4x - 1) - 2(1 - x + 2x^2) ] / (2x)^2 = (4x^2 - 2) / (4x^2) = 0  ⟹  4x^2 = 2  ⟹  x_{1,2} = ±√0.5.

The negative solution is irrelevant, and the positive one is indeed the minimum (as easily verified with the second derivative), where we get that

f(√0.5) = 2√0.5 - 0.5 = √2 - 0.5 ≈ 0.9142.

Thus, our final result is the requirement

||γ||_0 = K_1 + K_2 ≤ (√2 - 0.5) / M.

To summarize, we got that if ||γ||_0 ≤ (√2 - 0.5)/M then the dual LP necessarily gives a value above -0.5, which in turn assures that the minimum of the primal problem is above -0.5 as well. Thus, it is guaranteed that the l_1 norm of the solution with this number of non-zeros is the smallest possible, so we have proved that solving (P_1) yields the solution of (P_0) too. The result obtained is better than the 0.5·(1 + 1/M) bound asserted by Donoho and Huo.
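Both sparsity thresholds, as well as the per-solution test (6), are immediate to evaluate. The short sketch below (assuming numpy; the helper names are ours) reproduces the figures quoted in the example that follows for M = 1/√N:

```python
import numpy as np

def dh_bound(M):
    """Donoho-Huo sparsity threshold 0.5 * (1 + 1/M)."""
    return 0.5 * (1.0 + 1.0 / M)

def new_bound(M):
    """The improved threshold (sqrt(2) - 0.5) / M derived above."""
    return (np.sqrt(2.0) - 0.5) / M

def condition_6(K1, K2, M):
    """Per-solution test of inequality (6): 2 M^2 K1 K2 + M K2 - 1 <= 0 (with K1 <= K2)."""
    return 2.0 * M**2 * K1 * K2 + M * K2 - 1.0 <= 0.0

for N in (16, 64):
    M = 1.0 / np.sqrt(N)
    print(N, dh_bound(M), round(new_bound(M), 2))   # 16: 2.5 vs ~3.66,  64: 4.5 vs ~7.31
print(condition_6(1, 2, 0.25))                       # True: such a gamma passes the test
```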

As an example, for M = 1/√N, we get that for N = 16 the old (DH) requirement is to have fewer than 2.5 non-zeros, while we (EB) require 3.65 and below. For N = 64 the old requirement (DH) is 4.5 and below, while the new bound requires 7.3 or fewer non-zeros. As N goes to infinity the ratio between the two results becomes

[ (√2 - 0.5)/M ] / [ 0.5 (1 + 1/M) ] = [ (√2 - 0.5) √N ] / [ 0.5 (1 + √N) ]  →  2√2 - 1 ≈ 1.83.

One question that remains at the moment unanswered is whether the (√2 - 0.5) coefficient we got reflects the true behavior of the P_0 problem versus the P_1 one, or whether it emerges due to the still loose bounding approximation in our proof. As we shall see in the next section, an alternative proof does succeed in closing the gap entirely. It is interesting to note that if we assume that K_1 = 0, the requirement made in Equation (6) leads to the requirement K_2 = ||γ||_0 ≤ 1/M. Similarly, if K_2 = K_1 then Equation (6) leads to the requirement 2K_1 = ||γ||_0 ≤ 1/M. Note that these two cases are the extremes in the range of K_1 (0 ≤ K_1 ≤ K_2). Thus, for these two cases the 1/M bound is valid and there is no gap with respect to the uniqueness bound.

6 The Equivalence Between the (P_0) and the (P_1) Optimizations

In the previous section we closely followed Donoho and Huo's work and reproduced their results. Next, we will take a different route and will show that the condition derived for the uniqueness of the sparse representation is, in fact, also the condition for the desired equivalence of the (P_0) and (P_1) optimization problems. Using the notation of the previous sections, equations (3) and (4) can be reinterpreted as demanding that

Σ_on |x_k| ≤ 1/2   (7)

for all x such that

[Φ Ψ] x = 0 and ||x||_1 = 1.   (8)

Note that in the above equation we have added a constraint on the l_1 norm of the vector x. However, once the above is proven, the requirement posed by equations (3) and (4) follows immediately. All vectors x in the null space of [Φ Ψ] can be characterized by

x = [ I ; -Ψ^T Φ ] v   (9)

for v ∈ R^N (i.e., x^φ = v and x^ψ = -Ψ^T Φ v, so that Φ x^φ + Ψ x^ψ = Φv - ΨΨ^T Φ v = 0). Thus, equations (7) and (8) can be rewritten as

Σ_on | ( [ I ; -Ψ^T Φ ] v )_k | ≤ 1/2   (10)

for all v ∈ R^N such that

|| [ I ; -Ψ^T Φ ] v ||_1 = 1.   (11)

Considering the set

C = { v ∈ R^N : || [ I ; -Ψ^T Φ ] v ||_1 ≤ 1 },   (12)

we readily note that this set is a polyhedron in R^N and that the vectors satisfying (11) constitute its boundary. Since Σ_on | ( [ I ; -Ψ^T Φ ] v )_k | is piecewise linear in v, it will attain its maximal value on C at one of its vertices. So, let us next characterize these vertices.

Claim 1. The polyhedron C has exactly 4N vertices, which are given by

± e_i / ( 1 + Σ_{j=1}^N |ψ_j^T φ_i| ),  i = 1, 2, ..., N,   (13)

and

± Φ^T ψ_i / ( 1 + Σ_{j=1}^N |ψ_i^T φ_j| ),  i = 1, 2, ..., N,   (14)

where e_i denotes the i-th column of the N × N identity matrix.

Proof. Since C results from restricting the polyhedron { x ∈ R^{2N} : ||x||_1 ≤ 1 } in R^{2N} to the null space of [Φ Ψ], it has the same number of vertices, namely, 4N. Next we readily note that

|| [ I ; -Ψ^T Φ ] ( ± e_i / (1 + Σ_{j=1}^N |ψ_j^T φ_i|) ) ||_1 = 1   (15)

and

|| [ I ; -Ψ^T Φ ] ( ± Φ^T ψ_i / (1 + Σ_{j=1}^N |ψ_i^T φ_j|) ) ||_1 = 1,   (16)

so all these vectors are on the boundary of C. Let us now consider the hyperplanes

H^1_i = { v : v = ± e_i / (1 + Σ_{j=1}^N |ψ_j^T φ_i|) + Σ_{k ≠ i} β_k e_k }   (17)

and

H^2_i = { v : v = ± Φ^T ψ_i / (1 + Σ_{j=1}^N |ψ_i^T φ_j|) + Σ_{k ≠ i} β_k Φ^T ψ_k }.   (18)

Then, we can observe that || [ I ; -Ψ^T Φ ] v ||_1 > 1 for every v ∈ H^1_i ∪ H^2_i where any of the β_k is different from zero. Hence, we can conclude that the vectors in (13) and (14) are indeed vertices of C.

Substituting the vertices in (9) we can calculate, for the first type of vertex,

Σ_on | ( [ I ; -Ψ^T Φ ] v )_k | = Σ_on | ( [ e_i ; -Ψ^T Φ e_i ] )_k | / ( 1 + Σ_{j=1}^N |ψ_j^T φ_i| )   (19)

and for the second type

Σ_on | ( [ I ; -Ψ^T Φ ] v )_k | = Σ_on | ( [ Φ^T ψ_i ; -e_i ] )_k | / ( 1 + Σ_{j=1}^N |ψ_i^T φ_j| ).   (20)

Since we assume that M < 1 (otherwise we have the trivial case), we conclude that for the first type

Σ_on | ( [ I ; -Ψ^T Φ ] v )_k | ≤ ( 1 + (F - 1) M ) / ( 1 + Σ_{j=1}^N |ψ_j^T φ_i| ),   (21)

where F = ||γ||_0. Similarly, for the second type

Σ_on | ( [ I ; -Ψ^T Φ ] v )_k | ≤ ( 1 + (F - 1) M ) / ( 1 + Σ_{j=1}^N |ψ_i^T φ_j| ).   (22)

Now we need the following result:

Claim 2. Let {α_j}_{j=1}^N be such that

Σ_{j=1}^N α_j^2 = 1   (23)

and

0 ≤ α_j ≤ M < 1.   (24)

Then

Σ_{j=1}^N α_j ≥ K M + √(1 - K M^2),   (25)

where

K = ⌊1/M^2⌋   (26)

(⌊a⌋ denotes the largest integer smaller than the real number a).

Proof. We prove the above by contradiction. Let us assume there exist {α̃_j}_{j=1}^N satisfying (24) and (23) for which

Σ_{j=1}^N α̃_j = C < K M + √(1 - K M^2).   (27)

We note that all {α_j}_{j=1}^N satisfying Σ_{j=1}^N α_j = C and (24) form the intersection between a hyperplane and a hypercube, namely, they constitute a polyhedron. Writing

C = K_1 M + β_1,   (28)

where K_1 = ⌊C/M⌋, we can readily convince ourselves that all the vertices of this polyhedron have K_1 entries equal to M, one entry equal to β_1, and the remaining entries equal to zero. Hence, if {α_j}_{j=1}^N are the entries of any one of these vertices we have

Σ_{j=1}^N α_j^2 = K_1 M^2 + β_1^2.   (29)

Then, (27) implies that either

K_1 = K and β_1 < √(1 - K M^2),   (30)

in which case

Σ_{j=1}^N α_j^2 = K_1 M^2 + β_1^2 < K M^2 + 1 - K M^2 = 1,   (31)

or

K_1 < K,   (32)

in which case, since by their definitions K_1 ≤ K - 1 and 0 ≤ β_1 < M, we get

Σ_{j=1}^N α_j^2 = K_1 M^2 + β_1^2 < (K_1 + 1) M^2 ≤ K M^2 ≤ 1.   (33)

Namely, for every vertex of the polyhedron we get Σ_{j=1}^N α_j^2 < 1, which means that the same has to be true for the whole polyhedron, and we get a contradiction to (23), and the proof is completed.
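Claim 2 can also be probed numerically by sampling feasible points of (23)-(24) and comparing their sum to the bound (25). A minimal check, assuming numpy (the rejection-sampling scheme and helper name are ours; this is only a sanity check, not a proof):

```python
import numpy as np

def claim2_bound(M):
    """Lower bound K*M + sqrt(1 - K*M^2) with K = floor(1/M^2), as in Claim 2."""
    K = int(np.floor(1.0 / M**2))
    return K * M + np.sqrt(max(0.0, 1.0 - K * M**2))

rng = np.random.default_rng(3)
N, M = 16, 0.5
ok = True
for _ in range(1000):
    a = rng.uniform(0.0, M, N)
    a /= np.linalg.norm(a)              # force the sum of squares to be 1 ...
    if a.max() > M:                     # ... and reject if the box constraint breaks
        continue
    ok &= a.sum() >= claim2_bound(M) - 1e-12
print(ok)                               # True: no counterexample found
```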

Observing that |ψ_j^T φ_i| ≤ M and

Σ_{i=1}^N (ψ_j^T φ_i)^2 = Σ_{j=1}^N (ψ_j^T φ_i)^2 = 1,

Claims 1 and 2 can be used on either (21) or (22) to conclude that

Σ_on | ( [ I ; -Ψ^T Φ ] v )_k | ≤ ( 1 + (F - 1) M ) / ( 1 + K M + √(1 - K M^2) ).   (34)

Then, if

||γ||_0 = F < 1/M,   (35)

we consider two possibilities. If 1/2 ≤ M < 1, then (35) leads to F = 1, and by (34), noting that K M + √(1 - K M^2) ≥ 1/M,

Σ_on | ( [ I ; -Ψ^T Φ ] v )_k | ≤ 1 / (1 + 1/M) < 1/2.   (36)

If 0 < M < 1/2, then (34) and (35), together with the same observation, lead to

Σ_on | ( [ I ; -Ψ^T Φ ] v )_k | < (2 - M) / (1 + 1/M) ≤ 1/2.

To conclude, we have proved the following result:

Theorem 3. Given that ||γ||_0 = F < 1/M, the (P_0) and (P_1) optimization problems are equivalent.

7 Conclusions

Given a vector S ∈ R^N, and given an orthonormal N by N matrix Φ representing a basis, we can uniquely represent S using this basis and get α = Φ^T S. Assuming that we have a second orthonormal basis Ψ and a second representation β = Ψ^T S, the first question we have addressed in this paper is whether there is a lower bound on the number of non-zeros in these two representations. It was found that

||α||_0 + ||β||_0 ≥ 2/M and ||α||_0 · ||β||_0 ≥ 1/M^2,

where M is a scalar value denoting the maximal absolute value of cross-inner-products between vectors in Φ and Ψ. The second question answered in this paper refers to the uniqueness of sparse representations with overcomplete dictionaries. Assuming that the signal S is to be represented using a dictionary that contains Φ and Ψ (2N vectors), it is clear that there are numerous possible representations. It was established that for any two feasible representations, denoted by γ_1, γ_2 ∈ R^{2N}, we have that

||γ_1 - γ_2||_0 ≥ 2/M.   (37)

Thus, there cannot be two different representations for the same vector having each fewer than 1/M non-zero entries, i.e., for any given representation γ, uniqueness is ensured by ||γ||_0 < 1/M.

The main contribution of the paper concentrated on the way to find the sparse representation over an overcomplete dictionary as described above. Finding the optimal representation by minimizing the l_0-norm is a highly complicated non-convex combinatorial optimization search. An alternative approach based on l_1-norm minimization was proposed by Donoho and Huo and proved to lead to the same result for sufficiently sparse signals. Their result was that if the signal has a sparse representation with no more than 0.5·(1 + 1/M) non-zero entries, minimization of the l_1-norm can replace the original l_0-norm minimization while ensuring the same result. Obviously, such a result is very valuable since l_1-norm minimization leads to a linear programming problem, which is a convex problem, while the l_0-norm minimization is a hard non-convex problem. In this paper we have improved this sparsity bound and found that if S has a sparse representation γ such that ||γ||_0 < 1/M, then the l_1-norm minimization solution coincides with the minimization of the l_0-norm. In fact, this result implies that whenever the solution found is sparsely-unique, it can be derived by solving the (P_1) rather than the (P_0) optimization problem. Technically speaking, the obtained results suggest that, given the two orthonormal bases, one should solve (P_1) (a Linear Programming problem), and if the solution is sparse enough (fewer than 1/M non-zeros), it is also guaranteed to be the (P_0) solution.

8 Acknowledgements

This work emerged as a follow-up to David Donoho's extremely interesting talk (based on Ref. [2]) at the Technion (Israel Institute of Technology) in December.

References

[1] D.L. Donoho & P.B. Stark (June 1989) Uncertainty Principles and Signal Recovery, SIAM Journal on Applied Mathematics, Vol. 49/3, pp. 906-931.

[2] D.L. Donoho & X. Huo (June 1999) Uncertainty Principles and Ideal Atomic Decomposition, www manuscript.

[3] S. Mallat (1998) A Wavelet Tour of Signal Processing, Academic Press, Second Edition.

[4] D. Bertsekas (1995) Non-Linear Programming, Athena Publishing.


More information

"SYMMETRIC" PRIMAL-DUAL PAIR

SYMMETRIC PRIMAL-DUAL PAIR "SYMMETRIC" PRIMAL-DUAL PAIR PRIMAL Minimize cx DUAL Maximize y T b st Ax b st A T y c T x y Here c 1 n, x n 1, b m 1, A m n, y m 1, WITH THE PRIMAL IN STANDARD FORM... Minimize cx Maximize y T b st Ax

More information

The Integers. Math 3040: Spring Contents 1. The Basic Construction 1 2. Adding integers 4 3. Ordering integers Multiplying integers 12

The Integers. Math 3040: Spring Contents 1. The Basic Construction 1 2. Adding integers 4 3. Ordering integers Multiplying integers 12 Math 3040: Spring 2011 The Integers Contents 1. The Basic Construction 1 2. Adding integers 4 3. Ordering integers 11 4. Multiplying integers 12 Before we begin the mathematics of this section, it is worth

More information

3. Linear Programming and Polyhedral Combinatorics

3. Linear Programming and Polyhedral Combinatorics Massachusetts Institute of Technology 18.433: Combinatorial Optimization Michel X. Goemans February 28th, 2013 3. Linear Programming and Polyhedral Combinatorics Summary of what was seen in the introductory

More information

The Simplex Algorithm

The Simplex Algorithm 8.433 Combinatorial Optimization The Simplex Algorithm October 6, 8 Lecturer: Santosh Vempala We proved the following: Lemma (Farkas). Let A R m n, b R m. Exactly one of the following conditions is true:.

More information

How to Take the Dual of a Linear Program

How to Take the Dual of a Linear Program How to Take the Dual of a Linear Program Sébastien Lahaie January 12, 2015 This is a revised version of notes I wrote several years ago on taking the dual of a linear program (LP), with some bug and typo

More information

5. Duality. Lagrangian

5. Duality. Lagrangian 5. Duality Convex Optimization Boyd & Vandenberghe Lagrange dual problem weak and strong duality geometric interpretation optimality conditions perturbation and sensitivity analysis examples generalized

More information

The Sparsity Gap. Joel A. Tropp. Computing & Mathematical Sciences California Institute of Technology

The Sparsity Gap. Joel A. Tropp. Computing & Mathematical Sciences California Institute of Technology The Sparsity Gap Joel A. Tropp Computing & Mathematical Sciences California Institute of Technology jtropp@acm.caltech.edu Research supported in part by ONR 1 Introduction The Sparsity Gap (Casazza Birthday

More information

(Linear Programming program, Linear, Theorem on Alternative, Linear Programming duality)

(Linear Programming program, Linear, Theorem on Alternative, Linear Programming duality) Lecture 2 Theory of Linear Programming (Linear Programming program, Linear, Theorem on Alternative, Linear Programming duality) 2.1 Linear Programming: basic notions A Linear Programming (LP) program is

More information

Estimates for probabilities of independent events and infinite series

Estimates for probabilities of independent events and infinite series Estimates for probabilities of independent events and infinite series Jürgen Grahl and Shahar evo September 9, 06 arxiv:609.0894v [math.pr] 8 Sep 06 Abstract This paper deals with finite or infinite sequences

More information

Set, functions and Euclidean space. Seungjin Han

Set, functions and Euclidean space. Seungjin Han Set, functions and Euclidean space Seungjin Han September, 2018 1 Some Basics LOGIC A is necessary for B : If B holds, then A holds. B A A B is the contraposition of B A. A is sufficient for B: If A holds,

More information

MTH Linear Algebra. Study Guide. Dr. Tony Yee Department of Mathematics and Information Technology The Hong Kong Institute of Education

MTH Linear Algebra. Study Guide. Dr. Tony Yee Department of Mathematics and Information Technology The Hong Kong Institute of Education MTH 3 Linear Algebra Study Guide Dr. Tony Yee Department of Mathematics and Information Technology The Hong Kong Institute of Education June 3, ii Contents Table of Contents iii Matrix Algebra. Real Life

More information

Solutions to Exercises

Solutions to Exercises 1/13 Solutions to Exercises The exercises referred to as WS 1.1(a), and so forth, are from the course book: Williamson and Shmoys, The Design of Approximation Algorithms, Cambridge University Press, 2011,

More information

Signal Recovery, Uncertainty Relations, and Minkowski Dimension

Signal Recovery, Uncertainty Relations, and Minkowski Dimension Signal Recovery, Uncertainty Relations, and Minkowski Dimension Helmut Bőlcskei ETH Zurich December 2013 Joint work with C. Aubel, P. Kuppinger, G. Pope, E. Riegler, D. Stotz, and C. Studer Aim of this

More information

Cowles Foundation for Research in Economics at Yale University

Cowles Foundation for Research in Economics at Yale University Cowles Foundation for Research in Economics at Yale University Cowles Foundation Discussion Paper No. 1904 Afriat from MaxMin John D. Geanakoplos August 2013 An author index to the working papers in the

More information

Cambridge University Press The Mathematics of Signal Processing Steven B. Damelin and Willard Miller Excerpt More information

Cambridge University Press The Mathematics of Signal Processing Steven B. Damelin and Willard Miller Excerpt More information Introduction Consider a linear system y = Φx where Φ can be taken as an m n matrix acting on Euclidean space or more generally, a linear operator on a Hilbert space. We call the vector x a signal or input,

More information

MIT Algebraic techniques and semidefinite optimization February 14, Lecture 3

MIT Algebraic techniques and semidefinite optimization February 14, Lecture 3 MI 6.97 Algebraic techniques and semidefinite optimization February 4, 6 Lecture 3 Lecturer: Pablo A. Parrilo Scribe: Pablo A. Parrilo In this lecture, we will discuss one of the most important applications

More information

Another max flow application: baseball

Another max flow application: baseball CS124 Lecture 16 Spring 2018 Another max flow application: baseball Suppose there are n baseball teams, and team 1 is our favorite. It is the middle of baseball season, and some games have been played

More information

Eigenvalues, random walks and Ramanujan graphs

Eigenvalues, random walks and Ramanujan graphs Eigenvalues, random walks and Ramanujan graphs David Ellis 1 The Expander Mixing lemma We have seen that a bounded-degree graph is a good edge-expander if and only if if has large spectral gap If G = (V,

More information

Linear Codes, Target Function Classes, and Network Computing Capacity

Linear Codes, Target Function Classes, and Network Computing Capacity Linear Codes, Target Function Classes, and Network Computing Capacity Rathinakumar Appuswamy, Massimo Franceschetti, Nikhil Karamchandani, and Kenneth Zeger IEEE Transactions on Information Theory Submitted:

More information

Real Analysis Prof. S.H. Kulkarni Department of Mathematics Indian Institute of Technology, Madras. Lecture - 13 Conditional Convergence

Real Analysis Prof. S.H. Kulkarni Department of Mathematics Indian Institute of Technology, Madras. Lecture - 13 Conditional Convergence Real Analysis Prof. S.H. Kulkarni Department of Mathematics Indian Institute of Technology, Madras Lecture - 13 Conditional Convergence Now, there are a few things that are remaining in the discussion

More information

minimize x x2 2 x 1x 2 x 1 subject to x 1 +2x 2 u 1 x 1 4x 2 u 2, 5x 1 +76x 2 1,

minimize x x2 2 x 1x 2 x 1 subject to x 1 +2x 2 u 1 x 1 4x 2 u 2, 5x 1 +76x 2 1, 4 Duality 4.1 Numerical perturbation analysis example. Consider the quadratic program with variables x 1, x 2, and parameters u 1, u 2. minimize x 2 1 +2x2 2 x 1x 2 x 1 subject to x 1 +2x 2 u 1 x 1 4x

More information