A Probabilistic and RIPless Theory of Compressed Sensing


A Probabilistic and RIPless Theory of Compressed Sensing

Emmanuel J. Candès and Yaniv Plan

Departments of Mathematics and of Statistics, Stanford University, Stanford, CA
Applied and Computational Mathematics, Caltech, Pasadena, CA 91125

November 2010; Revised June 2011

Abstract

This paper introduces a simple and very general theory of compressive sensing. In this theory, the sensing mechanism simply selects sensing vectors independently at random from a probability distribution F; it includes all standard models (e.g. Gaussian, frequency measurements) discussed in the literature, but also provides a framework for new measurement strategies as well. We prove that if the probability distribution F obeys a simple incoherence property and an isotropy property, one can faithfully recover approximately sparse signals from a minimal number of noisy measurements. The novelty is that our recovery results do not require the restricted isometry property (RIP) to hold near the sparsity level in question, nor a random model for the signal. As an example, the paper shows that a signal with s nonzero entries can be faithfully recovered from about s log n Fourier coefficients that are contaminated with noise.

Keywords. Compressed sensing, l1 minimization, the LASSO, the Dantzig selector, (weak) restricted isometries, random matrices, sparse regression, operator Bernstein inequalities, Gross' golfing scheme.

1 Introduction

Dedicated to the memory of Jerrold E. Marsden.

This paper develops a novel, simple and general theory of compressive sensing [2, 5, 20], a rapidly growing field of research that has developed protocols for acquiring certain types of signals with far fewer data bits than what is classically accepted.

1.1 A RIPless theory?

The early paper [2] triggered a massive amount of research by showing that it is possible to sample signals at a rate proportional to their information content rather than their bandwidth. For instance, in a discrete setting, this theory asserts that a digital signal x in R^n (which can be viewed as Nyquist samples of a continuous-time signal over a time window of interest) can be recovered from a small random sample of its Fourier coefficients provided that x is sufficiently sparse. Formally, suppose that our signal x has at most s nonzero amplitudes at completely unknown locations (such a signal is called s-sparse) and that we are given the value of its discrete Fourier transform (DFT)
at m frequencies selected uniformly at random (we think of m as being much smaller than n). Then [2] showed that one can recover x by solving an optimization problem which simply finds, among all candidate signals, that with the minimum l1 norm; the number of samples we need must be on the order of s log n. In other words, if we think of s as a measure of the information content, we can sample nonadaptively nearly at the information rate without information loss. By swapping time and frequency, this also says that signals occupying a very large bandwidth but with a sparse spectrum can be sampled (at random time locations) at a rate far below the Shannon-Nyquist rate.

Despite considerable progress in the field, some important questions are still open. We discuss two that have both a theoretical and practical appeal. Is it possible to faithfully recover a nearly sparse signal x in R^n, one which is well approximated by its s largest entries, from about s log n of its Fourier coefficients? Is it still possible when these coefficients are further corrupted by noise?

These issues are paramount since in real-world applications, signals are never exactly sparse, and measurements are never perfect either. Now the traditional way of addressing these types of problems in the field is by means of the restricted isometry property (RIP) [4]. The trouble here is that it is unknown whether or not this property holds when the sample size m is on the order of s log n. In fact, answering this one way or the other is generally regarded as extremely difficult, and so the restricted isometry machinery does not directly apply in this setting.

This paper proves that the two questions formulated above have positive answers. In fact, we introduce recovery results which are, up to a logarithmic factor, as good as those one would get if the restricted isometry property were known to be true. To fix ideas, suppose we observe m noisy discrete Fourier coefficients about an s-sparse signal x,

    ỹ_k = Σ_{t=0}^{n-1} e^{i 2π ω_k t} x[t] + σ z_k,   k = 1, ..., m.   (1.1)

Here, the frequencies ω_k are chosen uniformly at random in {0, 1/n, 2/n, ..., (n-1)/n} and z_k is white noise with unit variance. Then if the number of samples m is on the order of s log n, it is possible to get an estimate x̂ obeying

    ‖x̂ − x‖²_{ℓ2} ≲ polylog(n) (s/m) σ²   (1.2)

by solving a convex l1-minimization program. (Note that when the noise vanishes, the recovery is exact.) Up to the logarithmic factor, which may sometimes be on the order of log n and at most a small power of this quantity, this is optimal. Now if the RIP held, one would get a squared error bounded by O(log n) (s/m) σ² [6, 5] and, therefore, the RIPless theory developed in this paper roughly enjoys the same performance guarantees.

1.2 A general theory

The estimate we have just seen is not isolated, and the real purpose of this paper is to develop a theory of compressive sensing which is both as simple and as general as possible. At the heart of compressive sensing is the idea that randomness can be used as an effective sensing mechanism. We note that random measurements are not only crucial in the derivation of many theoretical results, but also generally seem to give better empirical results as well. Therefore,
we propose a mechanism whereby sensing vectors are independently sampled from a population F. Mathematically, we observe

    ỹ_k = ⟨a_k, x⟩ + σ z_k,   k = 1, ..., m,   (1.3)

where x in R^n, {z_k} is a noise sequence, and the sensing vectors a_k are i.i.d. from F (note that a_k is a column vector). For example, if F is the family of complex sinusoids, this is the Fourier sampling model introduced earlier. All we require from F is an isotropy property and an incoherence property.

Isotropy property: We say that F obeys the isotropy property if

    E aa* = I,   a ~ F.   (1.4)

If F has mean zero (we do not require this), then E aa* is the covariance matrix of F. In other words, the isotropy condition states that the components of a ~ F have unit variance and are uncorrelated. This assumption may be weakened a little, as we shall see later.

Incoherence property: We may take the coherence parameter µ(F) to be the smallest number such that with a = (a[1], ..., a[n]) ~ F,

    max_{1≤t≤n} |a[t]|² ≤ µ(F)   (1.5)

holds either deterministically or stochastically in the sense discussed below. The smaller µ(F), i.e. the more incoherent the sensing vectors, the fewer samples we need for accurate recovery. When a simple deterministic bound is not available, one can take the smallest scalar µ obeying

    E[ n^{-1} ‖a‖²_{ℓ2} 1_{E^c} ] ≤ (20 n^{3/2})^{-1}   and   P(E^c) ≤ (mn)^{-1},   (1.6)

where E is the event {max_{1≤t≤n} |a[t]|² ≤ µ}. Suppose for instance that the components of a are i.i.d. N(0, 1). Then a simple calculation we shall not detail shows that

    E[ n^{-1} ‖a‖²_{ℓ2} 1_{E^c} ] ≤ 2n P(Z > √µ) + 2√µ φ(√µ),   P(E^c) ≤ 2n P(Z ≥ √µ),   (1.7)

where Z is standard normal and φ is its density function. The inequality P(Z > t) ≤ φ(t)/t shows that one can take µ(F) ≤ 6 log n as long as n ≥ 16 and m ≤ n. More generally, if the components of a are i.i.d. samples from a sub-Gaussian distribution, µ(F) is at most a constant times log n. If they are i.i.d. from a sub-exponential distribution, µ(F) is at most a constant times log² n. In what follows, however, it might be convenient for the reader to assume that the deterministic bound (1.5) holds.

It follows from the isotropy property that E|a[t]|² = 1, and thus µ(F) ≥ 1. This lower bound is achievable by several distributions and one such example is obtained by sampling a row from the DFT matrix as before, so that a[t] = e^{i 2π k t/n}, where k is chosen uniformly at random in {0, 1, ..., n−1}. Then another simple calculation shows that E aa* = I and µ(F) = 1 since |a[t]|² = 1 for all t. At the other extreme, suppose the measurement
process reveals one entry of x selected uniformly at random, so that a = √n e_i where i is uniform in {1, ..., n}; the normalization ensures that E aa* = I. This is a lousy acquisition protocol because one would need to sample on the order of n log n times to recover even a 1-sparse vector (the logarithmic term comes from the coupon collector effect). Not surprisingly, this distribution is in fact highly coherent as µ(F) = n.

With the assumptions set, we now give a representative result of this paper: suppose x is an arbitrary but fixed s-sparse vector and that one collects information about this signal by means of the random sensing mechanism (1.3), where z is white noise. Then if the number of samples is on the order of µ(F) s log n, one can invoke l1 minimization to get an estimator x̂ obeying

    ‖x̂ − x‖²_{ℓ2} ≲ polylog(n) (s/m) σ².

This bound is sharp. It is not possible to substantially reduce the number of measurements and get a similar bound, no matter how intractable the recovery method might be. Further, with this many measurements, the upper bound is optimal up to logarithmic factors. Finally, we will see that when the signal is not exactly sparse, we just need to add an approximation error to the upper bound. To summarize, this paper proves that one can faithfully recover approximately s-sparse signals from about s log n random incoherent measurements for which µ(F) = O(1).

1.3 Examples of incoherent measurements

We have seen through examples that sensing vectors with low coherence are global or spread out. Incoherence alone, however, is not a sufficient condition: if F were a constant distribution (sampling from F would always return the same vector), one would not learn anything new about the signal by taking more samples, regardless of the level of incoherence. However, as we will see, the incoherence and isotropy properties together guarantee that sparse vectors lie away from the nullspace of the sensing matrix whose rows are the a_k*'s.

The role of the isotropy condition is to keep the measurement matrix from being rank deficient when sufficiently many measurements are taken (and similarly for subsets of columns of A). Specifically, one would hope to be able to recover any signal from an arbitrarily large number of measurements. However, if E aa* were rank deficient, there would be signals x in R^n that would not be recoverable from an arbitrary number of samples; just take x ≠ 0 in the nullspace of E aa*. The nonnegative random variable x* aa* x has vanishing expectation, which implies ⟨a, x⟩ = 0 almost surely. (Put differently, all of the measurements would be zero almost surely.) In contrast, the isotropy condition implies that (1/m) Σ_{k=1}^m a_k a_k* → I almost surely as m → ∞ and, therefore, with enough measurements, the sensing matrix is well conditioned and has a left-inverse.

We now provide examples of incoherent and isotropic measurements.

• Sensing vectors with independent components: Suppose the components of a ~ F are independently distributed with mean zero and unit variance. Then F is isotropic. In addition, if the distribution of each component is light-tailed, then the measurements are clearly incoherent. (One could require near isotropy, i.e. E aa* ≈ I. If the approximation were tight enough, our theoretical results would still follow with minimal changes to the proof.)
A special case concerns the case where a ~ N(0, I), also known in the field as the Gaussian measurement ensemble, which is perhaps the most commonly studied. Here, one can take µ(F) = 6 log n as seen before. Another special case is the binary measurement ensemble, where the entries of a are symmetric Bernoulli variables taking on the values ±1. A shifted version of this distribution is the sensing mechanism underlying the single pixel camera [23].

• Subsampled orthogonal transforms: Suppose we have an orthogonal matrix obeying U* U = n I. Then consider the sampling mechanism picking rows of U uniformly and independently at random. In the case where U is the DFT, this is the random frequency model introduced earlier. Clearly, this distribution is isotropic and µ(F) = max_{ij} |U_{ij}|². In the case where U is a Hadamard matrix, or a complex Fourier matrix, µ(F) = 1.

• Random convolutions: Consider the circular convolution model y = Gx in which G is the circulant matrix

    G = [ g[0]    g[1]    g[2]   ...   g[n-1]
          g[n-1]  g[0]    g[1]   ...   g[n-2]
           ...                          ...
          g[1]    g[2]    ...   g[n-1]  g[0]  ].

Because a convolution is diagonal in the Fourier domain (we just multiply the Fourier components of x with those of g), G is an isometry if the Fourier components of g = (g[0], ..., g[n−1]) have the same magnitude. In this case, sampling a convolution product at randomly selected time locations is an isotropic and incoherent process provided g is spread out (µ(F) = max_t |g(t)|²). This example extends to higher dimensions, e.g. to spatial 3D convolutions. In fact, certain random convolutions may be used universally to efficiently sample a signal that is sparsely represented in any orthonormal basis while still giving strong theoretical guarantees for signal recovery [43], an appealing feature due to the fast application of a convolution matrix. For a quick comparison, the theory in [43] demonstrates that about s log n measurements are sufficient for noiseless recovery and about s log^5 n are sufficient for stable recovery. The theory in this paper may also be applied to the random convolution described in [43], demonstrating that about s log^2 n measurements are sufficient for stable recovery.

• Subsampled tight or continuous frames: We can generalize the example above by subsampling a tight frame or even a continuous frame. An important example might be the Fourier transform with a continuous frequency spectrum. Here, a(t) = e^{i 2π ω t}, where ω is chosen uniformly at random in [0, 1] (instead of being on an equispaced lattice as before). This distribution is isotropic and obeys µ(F) = 1. A situation where this arises is in magnetic resonance imaging (MRI), as frequency samples rarely fall on an equispaced Nyquist grid. By swapping time and frequency, this is equivalent to sampling a nearly sparse trigonometric polynomial at randomly selected time points in the unit interval [38]. More generally, as described in [4, Chapter 4], this model applies to random sampling of signals that have a sparse expansion in a bounded orthonormal system of functions.
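The following small numerical sketch (ours, not from the paper) illustrates the two properties for two of the ensembles above: rows of the DFT, for which µ(F) = 1, and Gaussian rows, whose coherence is only controlled stochastically at about the level of 6 log n. The sampler names and the Monte Carlo sizes are arbitrary illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)
n, m = 64, 20000   # ambient dimension and number of sampled sensing vectors

def dft_row(n, rng):
    # One row of the unnormalized DFT: a[t] = exp(i 2 pi k t / n), so |a[t]|^2 = 1 and mu(F) = 1.
    k = rng.integers(n)
    return np.exp(2j * np.pi * k * np.arange(n) / n)

def gaussian_row(n, rng):
    # i.i.d. N(0,1) components: isotropic, coherence on the order of log n (stochastic bound).
    return rng.standard_normal(n).astype(complex)

for name, sampler in [("subsampled DFT", dft_row), ("Gaussian", gaussian_row)]:
    A = np.array([sampler(n, rng) for _ in range(m)])
    second_moment = A.conj().T @ A / m          # empirical E[a a*]; isotropy says this should be ~ I
    iso_err = np.abs(second_moment - np.eye(n)).max()
    mu_hat = (np.abs(A) ** 2).max()             # empirical coherence: max_t |a[t]|^2 over the sample
    print(f"{name:15s}  max|E aa* - I| ~ {iso_err:.3f}   mu_hat ~ {mu_hat:.1f}   6 log n = {6*np.log(n):.1f}")
```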

These examples could of course be multiplied, and we hope we have made clear that our framework is general and encompasses many of the measurement models commonly discussed in compressive sensing, and perhaps many more. Our setup simply consists in selecting each measurement vector independently from the others. For completeness, however, we describe other measurement ensembles discussed in the compressive sensing literature that do not assume this model. Whether the theory in this paper can be extended to these models is open to further research. A reversal of the random convolution example described above is convolution with a random vector at fixed locations (rather than convolution with a fixed vector at random locations). In fact, such a measurement mechanism has been studied in detail [40, 28, 4, 42], with strong theoretical results. Similarly to the theory in this paper, near-optimal theoretical guarantees on sparse-signal recovery are non-uniform [40, 4], whereas uniform recovery through RIP machinery has been quite suboptimal. Similar results are available for random Gabor matrices [34].

1.4 Matrix notation

Before continuing, we pause to introduce some useful matrix notation. Divide both sides of (1.3) by √m, and rewrite our statistical model as

    y = Ax + σ_m z;   (1.8)

the kth entry of y is ỹ_k divided by √m, the kth row of A is a_k* divided by √m, and σ_m is σ divided by √m. This normalization implies that the columns of A are approximately unit-normed, and is most used in the compressive sensing literature.

1.5 Incoherent sampling theorem

For pedagogical reasons, we introduce our results by first presenting a recovery result from noiseless data. The recovered signal is obtained by the standard l1-minimization program

    min_{x̄ ∈ R^n} ‖x̄‖_{ℓ1}   subject to   A x̄ = y.   (1.9)

(Recall that the rows of A are normalized independent samples from F.)

Theorem 1.1 (Noiseless incoherent sampling). Let x be a fixed but otherwise arbitrary s-sparse vector in R^n and pick any scalar β > 0. Then with probability at least 1 − 5/n − e^{−β}, x is the unique minimizer to (1.9) with y = Ax provided that

    m ≥ C_β µ(F) s log n.

More precisely, C_β may be chosen as C_0(1 + β) for some positive numerical constant C_0.

Among other things, this theorem states that one can perfectly recover an arbitrary sparse signal from about s log n convolution samples, or a signal that happens to be sparse in the wavelet domain from about s log n randomly selected noiselet coefficients. It extends an earlier result [1], which assumed a subsampled orthogonal model, and strengthens it since that reference could only prove the claim for randomly signed vectors x. Here, x is arbitrary, and we do not make any distributional assumption about its support or its sign pattern.
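As an illustration of the noiseless program (1.9), here is a minimal sketch (ours, not from the paper) that recovers an s-sparse vector from on the order of s log n Gaussian measurements by recasting the l1 minimization as a linear program; the oversampling factor 4 and the use of scipy's LP solver are illustrative assumptions.

```python
import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(1)
n, s = 200, 5
m = int(4 * s * np.log(n))             # on the order of s log n measurements

# s-sparse signal with arbitrary support and signs
x = np.zeros(n)
x[rng.choice(n, size=s, replace=False)] = rng.standard_normal(s)

# Gaussian measurement ensemble, rows a_k ~ N(0, I), normalized by sqrt(m)
A = rng.standard_normal((m, n)) / np.sqrt(m)
y = A @ x                               # noiseless data

# Basis pursuit  min ||x||_1  s.t.  A x = y,  as an LP with x = u - v and u, v >= 0
c = np.ones(2 * n)
A_eq = np.hstack([A, -A])
res = linprog(c, A_eq=A_eq, b_eq=y, bounds=(0, None), method="highs")
x_hat = res.x[:n] - res.x[n:]
print("recovery error:", np.linalg.norm(x_hat - x))
```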

This theorem is also about a fundamental information theoretic limit: the number of samples for perfect recovery has to be on the order of µ(F) s log n, and cannot possibly be much below this number. More precisely, suppose we are given a distribution F with coherence parameter µ(F). Then there exist s-sparse vectors that cannot be recovered with probability at least 1 − 1/n, say, from fewer than a constant times µ(F) s log n samples. When µ(F) = 1, this has been already established since [2] proves that some s-sparse signals cannot be recovered from fewer than a constant times s log n random DFT samples. Our general claim follows from a modification of the argument in [2]. Assume, without loss of generality, that µ(F) is an integer and consider the isotropic process that samples rows from an n x n block diagonal matrix, each block being a DFT of a smaller size; that is, of size n/l where µ(F) = l. Then if m ≤ c_0 µ(F) s log n, one can construct s-sparse signals just as in [2] for which Ax = 0 with probability at least 1/n. We omit the details. The important aspect, here, is the role played by the coherence parameter µ(F). In general, the minimal number of samples must be on the order of the coherence times the sparsity level s times a logarithmic factor. Put differently, the coherence completely determines the minimal sampling rate.

1.6 Main results

We assume for simplicity that we are undersampling, so that m ≤ n. Our general result deals with 1) arbitrary signals which are not necessarily sparse (images are never exactly sparse even in a transformed domain) and 2) noise. To recover x from the data y and the model (1.8), we consider the unconstrained LASSO, which solves the l1-regularized least-squares problem

    min_{x̄ ∈ R^n} (1/2) ‖A x̄ − y‖²_{ℓ2} + λ σ_m ‖x̄‖_{ℓ1}.   (1.10)

We assume that z is Gaussian, z ~ N(0, I). However, the theorem below remains valid as long as ‖A* z‖_{ℓ∞} ≤ λ_n for some λ_n ≥ 0, and thus many other noise models would work as well. In what follows, x_s is the best s-sparse approximation of x or, equivalently, the vector consisting of the s largest entries of x in magnitude.

Theorem 1.2. Let x be an arbitrary fixed vector in R^n and pick any scalar β > 0. Then with probability at least 1 − 6/n − 6e^{−β}, the solution x̂ to (1.10) with λ = 10√(log n) obeys

    ‖x̂ − x‖_{ℓ2} ≤ min_{1≤s≤s̄} C(1 + α) [ ‖x − x_s‖_{ℓ1}/√s + σ √(s log n / m) ],   (1.11)

provided that m ≥ C_β µ(F) s̄ log n. If one measures the error in the l1 norm, then

    ‖x̂ − x‖_{ℓ1} ≤ min_{1≤s≤s̄} C(1 + α) [ ‖x − x_s‖_{ℓ1} + s σ √(log n / m) ].   (1.12)

Above, C is a numerical constant, C_β can be chosen as before, and α = √( (1+β) s µ log n log m log² s / m ), which is never greater than log^{3/2} n in this setup.

Above, s̄ may be interpreted as an upper bound on allowable sparsity levels s that still lead to stable reconstruction (according to our proofs). We may take s̄ = m / (C_β µ(F) log n).
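To make the estimator (1.10) concrete, the sketch below (ours, not the paper's) solves the unconstrained LASSO by iterative soft-thresholding, with λ = 10√(log n) and σ_m = σ/√m as in the theorem; the step size, iteration count and problem sizes are arbitrary illustrative choices.

```python
import numpy as np

def soft_threshold(v, t):
    # Proximal operator of t * ||.||_1: componentwise shrinkage toward zero.
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def lasso_ista(A, y, lam, n_iter=2000):
    """Minimize 0.5*||A x - y||_2^2 + lam*||x||_1 by ISTA (proximal gradient)."""
    L = np.linalg.norm(A, 2) ** 2          # Lipschitz constant of the gradient
    x = np.zeros(A.shape[1])
    for _ in range(n_iter):
        grad = A.T @ (A @ x - y)
        x = soft_threshold(x - grad / L, lam / L)
    return x

# Noisy compressive measurements with the paper's normalization y = A x + (sigma/sqrt(m)) z
rng = np.random.default_rng(2)
n, s, sigma = 400, 8, 0.05
m = int(6 * s * np.log(n))
x = np.zeros(n); x[rng.choice(n, s, replace=False)] = rng.standard_normal(s)
A = rng.standard_normal((m, n)) / np.sqrt(m)
y = A @ x + (sigma / np.sqrt(m)) * rng.standard_normal(m)

lam = 10 * np.sqrt(np.log(n)) * sigma / np.sqrt(m)   # lambda * sigma_m with lambda = 10 sqrt(log n)
x_hat = lasso_ista(A, y, lam)
print("relative l2 error:", np.linalg.norm(x_hat - x) / np.linalg.norm(x))
```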

These robust error bounds do not require either (1) a random model on the signal or (2) the RIP, nor one of a few closely related strong conditions such as the RIP-1 [25], the restricted eigenvalue assumption [5] or the compatibility condition [49], all conditions that would imply uniform recovery. In contrast, our arguments work through a new machinery which guarantees fixed sparse signal recovery with high probability. It is currently unknown whether our conditions are strong enough to imply uniform recovery. Further, the error bound is within at most a log^{3/2} n factor of what has been established using the RIP, since a variation on the arguments in [6] would give an error bound proportional to the quantity inside the square brackets in (1.11). As a consequence, the error bound is within a polylogarithmic factor of what is achievable with the help of an oracle that would reveal the locations of the significant coordinates of the unknown signal [6]. In other words, it cannot be substantially improved.

Because much of the compressive sensing literature works with restricted isometry conditions (we shall discuss exceptions such as [22, 4] in Section 1.7), we pause here to discuss these conditions and to compare them to our own. We say that an m x n matrix A obeys the RIP with parameters s and δ if

    (1 − δ) ‖v‖²_{ℓ2} ≤ ‖Av‖²_{ℓ2} ≤ (1 + δ) ‖v‖²_{ℓ2}   (1.13)

for all s-sparse vectors v. In other words, all the submatrices of A with at most s columns are well conditioned. When the RIP holds with parameters 2s and δ < 0.414 [8] or even δ ≤ 0.465 [24], it is known that the error bound (1.11) holds (without the factor (1 + α)). This δ is sometimes referred to as the restricted isometry constant.

Bounds on the restricted isometry constant have been established in [5] and in [46] for partial DFT matrices and, by extension, for partial subsampled orthogonal transforms. For instance, [46] proves that if A is a properly normalized partial DFT matrix, or a unitary matrix with bounded entries, then the RIP with δ = 1/4 holds with high probability if m ≥ C s log^4 n (C is some positive constant). Rauhut [4, Chapter 4, Chapter 8] extended these results to measurement matrices consisting of sampled bounded orthogonal systems. While such measurements are a special case of those considered in this paper, the theory in [4, Chapter 8] extends to the measurement model considered here. Small adjustments need to be made in the case of stochastic incoherence, but the end result is that the RIP holds with high probability when m ≥ C µ(F) s log(sµ(F)) log² s log n log m, i.e., about sµ log^4 n measurements are sufficient.

Thus, our result bridges the gap between the region where the RIP is known to hold and the region in which one has the minimum number of measurements needed to prove perfect recovery of exactly sparse signals from noisy data, which is on the order of µ(F) s log n. The gap is bridged smoothly in the sense that our error bounds match those available with the RIP as soon as the RIP is known to hold, and grow slightly larger when there are fewer measurements.

The careful reader will no doubt remark that for very specific models such as the Gaussian measurement ensemble, it is known that on the order of s log(n/s) samples are sufficient for stable recovery, while our result asserts that on the order of s log² n are sufficient (and s log n for the binary measurement ensemble). This slight loss is a small price to pay for a very simple general theory, which accommodates a wide array of sensing strategies. Having said this, the reader will also verify that specializing our proofs below gives an optimal result for the Gaussian ensemble; i.e. establishes a near-optimal error bound from about s log n observations (the log(n/s) factor, however, may still require a different approach).

Finally, another frequently discussed algorithm for sparse regression is the Dantzig selector [6]. Here, the estimator is given by the solution to the linear program

    min_{x̄ ∈ R^n} ‖x̄‖_{ℓ1}   subject to   ‖A*(A x̄ − y)‖_{ℓ∞} ≤ λ σ_m.   (1.14)

We show that the Dantzig selector obeys nearly the same error bound.

Theorem 1.3. The Dantzig selector, with λ = 10√(log n) and everything else the same as in Theorem 1.2, obeys with the same probabilities as before

    ‖x̂ − x‖_{ℓ2} ≤ min_{1≤s≤s̄} C(1 + α²) [ ‖x − x_s‖_{ℓ1}/√s + σ √(s log n / m) ],   (1.15)
    ‖x̂ − x‖_{ℓ1} ≤ min_{1≤s≤s̄} C(1 + α²) [ ‖x − x_s‖_{ℓ1} + s σ √(log n / m) ].   (1.16)

The only difference is α² instead of α in the right-hand sides.

1.7 Our contribution

The main contribution of this paper is to provide a simple framework which applies to all the standard compressive sensing models and some new ones as well. The results in this paper also reduce the minimal number of measurements theoretically required in some standard sensing models such as Fourier measurements or, more generally, sensing matrices obtained by sampling a few rows from an orthogonal matrix.

We establish that the restricted isometry property is not necessarily needed to accurately recover fixed nearly sparse vectors from noisy compressive samples. This may be of interest because in many cases RIP-based results have needed strong requirements on the relationship between the number of measurements required and the sparsity level. Thus our work is a significant departure from the majority of the literature, which establishes good noisy recovery properties via the RIP machinery. This literature is, of course, extremely large and we cannot cite all contributions, but a partial list would include [5, 2, 46, 3, 6, 5, 9, 50, 5, 3, 37, 29, 7, 2].

The reason why one can get strong error bounds, which are within a polylogarithmic factor of what is available with the aid of an oracle, without the RIP is that our results do not imply universality. That is, we are not claiming that if A is randomly sampled and then fixed once for all, then the error bounds from Section 1.6 hold for all signals x. What we are saying is that if we are given an arbitrary x, and then collect data by applying our random scheme, then the recovery of this x will be accurate. If one wishes to establish universal results holding for all x simultaneously, then we would need the RIP or a property very close to it; as a consequence, we cannot possibly be in this setup and guarantee universality since we are not willing to assume that the RIP holds. To be sure, suppose we had available an oracle informing us about the support T of x. Then we would need the pseudoinverse of the submatrix with columns in T to be bounded. In other words, the minimum singular value of this submatrix would have to be away from zero. For a universal result, this would need to be true for all subsets of cardinality s; that is, the minimum singular value of all submatrices with s columns would have to be away from zero. This essentially is the restricted isometry property.
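Since the Dantzig selector (1.14) is itself a linear program, it can be handed to a generic LP solver. The following sketch (ours, not from the paper) does so with the same splitting x = u − v used for basis pursuit; the problem sizes and oversampling factor are illustrative assumptions.

```python
import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(3)
n, s, sigma = 200, 5, 0.05
m = int(6 * s * np.log(n))

x = np.zeros(n); x[rng.choice(n, s, replace=False)] = rng.standard_normal(s)
A = rng.standard_normal((m, n)) / np.sqrt(m)
sigma_m = sigma / np.sqrt(m)
y = A @ x + sigma_m * rng.standard_normal(m)

lam = 10 * np.sqrt(np.log(n))
M, b = A.T @ A, A.T @ y

# Dantzig selector:  min ||x||_1  s.t.  ||A^T (A x - y)||_inf <= lam * sigma_m,
# written as an LP in (u, v) with x = u - v and u, v >= 0.
c = np.ones(2 * n)
A_ub = np.vstack([np.hstack([M, -M]), np.hstack([-M, M])])
b_ub = np.concatenate([lam * sigma_m + b, lam * sigma_m - b])
res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=(0, None), method="highs")
x_hat = res.x[:n] - res.x[n:]
print("relative l2 error:", np.linalg.norm(x_hat - x) / np.linalg.norm(x))
```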

To the best of our knowledge, only a few other papers have addressed non-universal stability (the literature grows so rapidly that an inadvertent omission is entirely possible). In an earlier work [9], the authors also considered weak conditions that allow stable recovery; in this case the authors assumed that the signal was sampled according to a random model, but in return the measurement matrix A could be deterministic. In the asymptotic case, stable signal recovery has been demonstrated for the Gaussian measurement ensemble in a regime in which the RIP does not necessarily hold [22, 4]. In fact, the authors of [22, 4] are able to give exact limits on the error rather than error bounds. Aside from these papers and the work in progress [10], it seems that the literature regarding stable recovery with conditions weak enough that they do not imply universality is extremely sparse.

Lastly, we would like to point to two important papers in the literature which together give a strong result for stable RIPless recovery with subsampled discrete Fourier transforms [27, 52]. The authors of [27] give a useful equivalence between the l1 and l2 norm of any vector which is the sum of a subset of the columns of a discrete Fourier transform. These results have important implications for Gelfand widths, and may also be used to characterize the null space of the DFT. This theory may be combined with the null-space conditions of [52] to imply stable recovery of sparse signals by the (constrained) LASSO; the required number of measurements by this theory is O(1) s log m log^5[(n/m) log m]. In the case when m is proportional to n, this leads to the necessity of only about s log m (log log m)^5 measurements, which nearly matches the requirement in this paper of about s log m measurements. However, the error bounds are quite different from those presented in our paper. See [35, Section 2.2.3] for more details.

Finally, and to be complete, we would like to mention that earlier works have considered the recovery of perfectly sparse signals from subsampled orthogonal transforms [1], and of sparse trigonometric polynomials from random time samples [38].

1.8 Organization of the paper

The paper is organized as follows. In Section 2, we introduce several fundamental estimates which our arguments rely upon, but which also could be useful tools for other results in the field. In Section 3, we prove the noiseless recovery result, namely, Theorem 1.1. In Section 4, we prove our main results, Theorems 1.2 and 1.3. All these sections assume for simplicity of exposition that the coherence bound (1.5) holds deterministically. We extend the proof to distributions obeying the coherence property in the stochastic sense (1.6) in the Appendix. This Appendix also contains another important technical piece, namely, a difficult proof of an intermediate result (the weak RIP property). Finally, we conclude the main text with some final comments in Section 5.

1.9 Notation

We provide a brief summary of the notations used throughout the paper. For an m x n matrix A and a subset T ⊂ {1, ..., n}, A_T denotes the m x |T| matrix with column indices in T. Also, A_{i} is the i-th column of A. Likewise, for a vector v in R^n, v_T is the restriction of v to indices in T. Thus, if v is supported on T, Av = A_T v_T. In particular, a_{k,T} is the vector a_k restricted to T. The operator norm of a matrix A is denoted ‖A‖; also, ‖A‖_{1,2} is the maximum column norm of A. The identity matrix, in any dimension, is denoted I. Further, e_i always refers to the i-th standard basis element, e.g., e_1 = (1, 0, ..., 0). For a scalar t, sgn(t) is the sign of t if t ≠ 0 and is zero otherwise.
For a vector x, sgn(x) applies the sign function componentwise. We shall also use µ as a shorthand for µ(F) whenever convenient. Throughout, C is a constant whose value may change from instance to instance.

2 Fundamental Estimates

Our proofs rely on several estimates, and we provide an interpretation of each whenever possible. The first estimates E1–E4 are used to prove the noiseless recovery result; when combined with the weak RIP, they imply stability and robustness. Throughout this section, δ is a parameter left to be fixed in later sections; it is always less than or equal to one.

2.1 Local isometry

Let T of cardinality s be the support of x in Theorem 1.1, or the support of the best s-sparse approximation of x in Theorem 1.2. We shall need that with high probability,

    ‖A_T* A_T − I‖ ≤ δ,   (2.1)

with δ ≤ 1/2 in the proof of Theorem 1.1 and δ ≤ 1/4 in that of Theorem 1.2. Put differently, the singular values of A_T must lie away from zero. This condition essentially prevents A_T from being singular as, otherwise, there would be no hope of recovering our sparse signal x. Indeed, letting h be any vector supported on T and in the null space of A, we would have Ax = A(x + h) and thus, recovery would be impossible even if one knew the support of x. The condition (2.1) is much weaker than the restricted isometry property because it does not need to hold uniformly over all sparse subsets, only on the support set.

Lemma 2.1 (E1: local isometry). Let T be a fixed set of cardinality s. Then for δ > 0,

    P( ‖A_T* A_T − I‖ ≥ δ ) ≤ 2s exp( − (m/(µ(F)s)) · δ² / (2(1 + δ/3)) ).   (2.2)

In particular, if m ≥ (56/3) µ(F) s log n, then

    P( ‖A_T* A_T − I‖ ≥ 1/2 ) ≤ 2/n.

Note that ‖A_T* A_T − I‖ ≤ δ implies that ‖(A_T* A_T)^{-1}‖ ≤ 1/(1 − δ), a fact that we will use several times.

In compressive sensing, the classical way of proving such estimates is via Rudelson's selection theorem [44]. Here, we use a more modern technique based on the matrix Bernstein inequality of Ahlswede and Winter [1], developed for this setting by Gross [26], and tightened in [48] by Tropp and in [32] by Oliveira. We present the version in [48].

Theorem 2.2 (Matrix Bernstein inequality). Let {X_k} ⊂ R^{d×d} be a finite sequence of independent random self-adjoint matrices. Suppose that E X_k = 0 and ‖X_k‖ ≤ B almost surely, and put

    σ² := ‖ Σ_k E X_k² ‖.

Then for all t ≥ 0,

    P( ‖ Σ_k X_k ‖ ≥ t ) ≤ 2d exp( − (t²/2) / (σ² + Bt/3) ).   (2.3)
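A quick Monte Carlo experiment (ours, for illustration only) shows the behavior that Lemma 2.1 quantifies: for a fixed support T and random-frequency sensing vectors (µ(F) = 1), the deviation ‖A_T* A_T − I‖ becomes small once m exceeds a modest multiple of s log n. The trial counts and multiples are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(4)
n, s, trials = 256, 8, 200
T = rng.choice(n, size=s, replace=False)     # fixed support of cardinality s

def local_isometry_deviation(m):
    # Rows drawn from the random-frequency (DFT) model, scaled by 1/sqrt(m); report ||A_T* A_T - I||.
    devs = []
    for _ in range(trials):
        k = rng.integers(n, size=m)
        A_T = np.exp(2j * np.pi * np.outer(k, T) / n) / np.sqrt(m)
        G = A_T.conj().T @ A_T
        devs.append(np.linalg.norm(G - np.eye(s), 2))
    return np.mean(devs)

for m in [int(c * s * np.log(n)) for c in (2, 8, 32)]:
    print(f"m = {m:5d}   average ||A_T* A_T - I|| ~ {local_isometry_deviation(m):.3f}")
```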

Proof [Estimate E1]. Decompose A_T* A_T − I as

    A_T* A_T − I = (1/m) Σ_{k=1}^m ( a_{k,T} a_{k,T}* − I ) = Σ_{k=1}^m X_k,   X_k = (1/m)( a_{k,T} a_{k,T}* − I ).

The isotropy condition implies E X_k = 0, and since ‖a_T‖²_{ℓ2} ≤ µ(F) s, we have

    ‖X_k‖ = (1/m) max( ‖a_{k,T}‖²_{ℓ2} − 1, 1 ) ≤ µ(F) s / m.

Lastly,

    0 ≼ E X_k² = (1/m²)[ E(a_{k,T} a_{k,T}*)² − I ] ≼ (1/m²) E(a_{k,T} a_{k,T}*)² = (1/m²) E ‖a_{k,T}‖² a_{k,T} a_{k,T}*.

However, E ‖a_{k,T}‖² a_{k,T} a_{k,T}* ≼ µ(F) s E a_{k,T} a_{k,T}* = µ(F) s I and, therefore,

    ‖ Σ_k E X_k² ‖ ≤ µ(F) s / m,

so that σ² is bounded above by µ(F) s / m. Plugging t = δ into (2.3) gives the lemma.

Instead of having A act as a near isometry on all vectors supported on T, we could ask that it preserves the norm of an arbitrary fixed vector (with high probability), i.e. ‖Av‖ ≈ ‖v‖ for a fixed v supported on T. Not surprisingly, this can be proved with generally (slightly) weaker requirements.

Lemma 2.3 (E2: low-distortion). Let w be a fixed vector supported on a set T of cardinality at most s. Then for each t ≤ 1/2,

    P( ‖(A_T* A_T − I) w_T‖ ≥ t ‖w‖ ) ≤ exp( − (1/4)( t √(m/(µ(F)s)) − 1 )² ).

The estimate follows from the vector Bernstein inequality, proved by Gross [26]. We use a slightly weaker version, which we find slightly more convenient.

Theorem 2.4 (Vector Bernstein inequality). Let {v_k} ⊂ R^d be a finite sequence of independent random vectors. Suppose that E v_k = 0 and ‖v_k‖ ≤ B almost surely, and put σ² ≥ Σ_k E ‖v_k‖²_{ℓ2}. Then for all 0 ≤ t ≤ σ²/B,

    P( ‖ Σ_k v_k ‖_{ℓ2} ≥ t ) ≤ exp( − (t/σ − 1)²/4 ) ≤ exp( − t²/(8σ²) + 1/4 ).   (2.4)

Note that the bound does not depend on the dimension d.

Proof [Estimate E2]. We have

    (A_T* A_T − I) w_T = (1/m) Σ_{i=1}^m ( a_{i,T} ⟨a_{i,T}, w_T⟩ − w_T ) = Σ_k v_k,

where v_k = (1/m)( a_{i,T} ⟨a_{i,T}, w_T⟩ − w_T ). By isotropy, E v_k = 0, and by the Cauchy-Schwarz inequality together with the triangle inequality,

    ‖v_k‖ ≤ (1/m)( ‖a_{i,T}‖² + 1 ) ‖w_T‖ ≤ (1/m)( sµ + 1 ) ‖w_T‖ =: B.

The second inequality follows from the coherence condition. For the variance σ²,

    Σ_k E ‖v_k‖²_{ℓ2} = (1/m)[ E ‖a_{i,T}‖²_{ℓ2} ⟨a_{i,T}, w_T⟩² − ‖w_T‖²_{ℓ2} ] ≤ (1/m) E ‖a_{i,T}‖²_{ℓ2} ⟨a_{i,T}, w_T⟩² ≤ (sµ/m) E ⟨a_{i,T}, w_T⟩² = (sµ/m) ‖w_T‖²_{ℓ2}.

The last equality follows by the isotropy condition. Thus σ² is bounded by (sµ/m) ‖w_T‖²_{ℓ2}. The proof now follows by plugging B and σ² into the vector Bernstein inequality.

2.2 Off-support incoherence

Lemma 2.5 (E3: off-support incoherence). Let v be supported on T with |T| = s. Then for each t > 0,

    P( ‖A_{T^c}* A v‖_{ℓ∞} ≥ t ‖v‖ ) ≤ 2n exp( − (m/(2µ(F))) · t² / (1 + (1/3)√s t) ).   (2.5)

This lemma says that if v = x, then max_{i ∈ T^c} |⟨A_{i}, Ax⟩| cannot be too large, so that the off-support columns do not correlate too well with Ax. The proof of E3 is an application of Bernstein's inequality (the matrix Bernstein inequality with d = 1) together with the union bound. (To extend this lemma to the complex case, one can split A_{T^c}* A v into real and complex parts, treating each separately with Bernstein's inequality, similarly to [7, Lemma 8.2].)

Proof. We have ‖A_{T^c}* A v‖_{ℓ∞} = max_{i ∈ T^c} |⟨e_i, A* A v⟩|. Assume without loss of generality that ‖v‖ = 1, fix i ∈ T^c and write

    ⟨e_i, A* A v⟩ = Σ_k w_k,   w_k = (1/m) ⟨e_i, a_k⟩ ⟨a_k, v⟩.

Since i ∈ T^c, E w_k = 0 by the isotropy property. Next, the Cauchy-Schwarz inequality gives |w_k| = (1/m)|⟨e_i, a_k⟩| |⟨a_{k,T}, v⟩| ≤ (1/m)|⟨e_i, a_k⟩| ‖a_{k,T}‖. Since |⟨e_i, a_k⟩| ≤ √(µ(F)) and ‖a_{k,T}‖ ≤ √(µ(F)s), we have |w_k| ≤ µ(F)√s / m. Lastly, for the total variance, we have

    Σ_k E w_k² ≤ (1/m) µ(F) E ⟨a_{k,T}, v⟩² = µ(F)/m,

where the equality follows from the isotropy property. Hence, σ² ≤ µ(F)/m, and Bernstein's inequality gives

    P( |⟨e_i, A* A v⟩| ≥ t ) ≤ 2 exp( − (m/(2µ(F))) · t² / (1 + (1/3)√s t) ).

Combine this with the union bound over all i ∈ T^c to give the desired result.

We also require the following related bound: max_{i ∈ T^c} ‖A_T* A_{i}‖ ≤ δ. In other words, none of the column vectors of A outside of the support of x should be well approximated by any vector sharing the support of x.

Lemma 2.6 (E4: uniform off-support incoherence). Let T be a fixed set of cardinality s. For any 0 ≤ t ≤ √s,

    P( max_{i ∈ T^c} ‖A_T* A_{i}‖ ≥ t ) ≤ n exp( − m t²/(8µ(F)s) + 1/4 ).

In particular, if m ≥ 8µ(F) s (2 log n + 1/4), then

    P( max_{i ∈ T^c} ‖A_T* A_{i}‖ ≥ 1 ) ≤ 1/n.

Proof. We apply the vector Bernstein inequality (Theorem 2.4). Fix i ∈ T^c and write

    A_T* A_{i} = (1/m) Σ_{k=1}^m a_{k,T} ⟨a_k, e_i⟩ = Σ_{k=1}^m v_k.

As before, E v_k = (1/m) E a_{k,T} ⟨a_k, e_i⟩ = 0 since i ∈ T^c. Also, ‖v_k‖ = (1/m) ‖a_{k,T}‖ |⟨a_k, e_i⟩| ≤ µ(F) √s / m. Lastly, we calculate the sum of expected squared norms,

    Σ_{k=1}^m E ‖v_k‖²_{ℓ2} = m E ‖v_1‖²_{ℓ2} = (1/m) E[ ‖a_{1,T}‖²_{ℓ2} ⟨e_i, a_1⟩² ] ≤ (µ(F)s/m) E ⟨e_i, a_1⟩² = µ(F)s/m.

As before, the last equality follows from the isotropy property. Bernstein's inequality together with the union bound give the lemma.

2.3 Weak RIP

In the nonsparse and noisy setting, we shall make use of a variation on the restricted isometry property to control the size of the reconstruction error. This variation is as follows:

Theorem 2.7 (E5: weak RIP). Let T be a fixed set of cardinality s, fix δ > 0, let r be a natural number, and suppose that

    m ≥ C_δ (1 + β) µ(F) max( s log(sµ), r log n log² r log(rµ log n) ).

Then with probability at least 1 − 5e^{−β},

    (1 − δ) ‖v‖²_{ℓ2} ≤ ‖Av‖²_{ℓ2} ≤ (1 + δ) ‖v‖²_{ℓ2}   (2.6)

for all v supported on T ∪ R, where R is any set of cardinality |R| ≤ r. Here C_δ is a fixed numerical constant which only depends upon δ.

This theorem is proved in the Appendix using Talagrand's generic chaining, and combines the framework and results of Rudelson and Vershynin in [46] and [44]. (We use the generic chaining, which provides a way of invoking the majorizing measure theorem without creating a measure. A simpler approach would be to go through Dudley's inequality, but according to our proof the majorizing measures theorem provides a tighter result by a logarithmic factor.) In the proof of Theorem 1.2, we take δ = 1/4.

The condition says that the column space of A_T should not be too close to that spanned by another small disjoint set R of columns. To see why a condition of this nature is necessary for any recovery algorithm, suppose that x has fixed support T and that there is a single column A_{i} which is a linear combination of columns in T, i.e., A_{T ∪ {i}} is singular. Let h ≠ 0 be supported on T ∪ {i} and in the null space of A. Then Ax = A(x + th) for any scalar t. Clearly, there are some values of t such that x + th is at least as sparse as x, and thus one should not expect to be able to recover x by any method. In general, if there were a vector v as above obeying ‖Av‖ ≪ ‖v‖, then one would have A_T v_T ≈ −A_R v_R. Thus, if the signal x were the restriction of v to T, it would be very difficult to distinguish it from the restriction of −v to R under the presence of noise.

The weak RIP is a combination of the RIP and the local conditioning estimate E1. When r = 0, this is E1, whereas this is the restricted isometry property when s = 0. The point is that we do
not need the RIP to hold for sparsity levels on the order of m/[µ(F) log n]. Instead we need the following property: consider an arbitrary submatrix formed by concatenating columns in T with r other columns from A selected in any way you like; then we would like this submatrix to be well conditioned. Because T is fixed, one can prove good conditioning when s is significantly larger than the maximum sparsity level considered in the standard RIP.

2.4 Implications

The careful reader may ask why we bothered to state estimates E1–E4 since they are all implied by the weak RIP! Our motivation is three-fold: (1) some of these estimates, e.g. E2, hold with better constants and weaker requirements than those implied by the weak RIP machinery; (2) the weak RIP requires an in-depth proof whereas the other estimates are simple applications of well-known theorems, and we believe that these theorems and the estimates should be independently useful tools to other researchers in the field; (3) the noiseless theorem does not require the weak RIP.

3 Noiseless and Sparse Recovery

For simplicity, we state our model and the proofs for the real case, but it is straightforward to generalize our results to complex signals, noise, and measurements, while only changing the constants involved in the proofs. The inner products involved in several of the bounds below would be replaced by their absolute value or their real value. In particular, Lemmas 4.4 and 4.3 would have a slightly different presentation.

This section proves the noiseless recovery theorem, namely, Theorem 1.1. Our proof essentially adapts the arguments of David Gross [26] from the low-rank matrix recovery problem.

3.1 Dual certificates

The standard method for establishing exact recovery is to exhibit a dual certificate; that is to say, a vector v obeying the two properties below.

Lemma 3.1 (Exact duality). Set T = supp(x) with x feasible for (1.9), and assume A_T has full column rank. Suppose there exists v in R^n in the row space of A obeying

    v_T = sgn(x_T)   and   ‖v_{T^c}‖_{ℓ∞} < 1.   (3.1)

Then x is the unique l1 minimizer to (1.9).

The proof is now standard, see [5]. Roughly, the existence of a dual vector implies that there is a subgradient of the l1 norm at x that is perpendicular to the feasible set. This geometric property shows that x is the solution. Following Gross, we slightly modify this definition so as to make use of an inexact dual vector.

Lemma 3.2 (Inexact duality). Set T = supp(x) where x is feasible, and assume that

    ‖(A_T* A_T)^{-1}‖ ≤ 2   and   max_{i ∈ T^c} ‖A_T* A_{i}‖ ≤ 1.   (3.2)

Suppose there exists v in R^n in the row space of A obeying

    ‖v_T − sgn(x_T)‖ ≤ 1/4   and   ‖v_{T^c}‖_{ℓ∞} ≤ 1/4.   (3.3)

Then x is the unique l1 minimizer to (1.9).

Proof. Let x̂ = x + h be a solution to (1.9) and note that Ah = 0 since both x and x̂ are feasible. To prove the claim, it suffices to show that h = 0. We begin by observing that

    ‖x̂‖_{ℓ1} = ‖x_T + h_T‖_{ℓ1} + ‖h_{T^c}‖_{ℓ1} ≥ ‖x_T‖_{ℓ1} + ⟨sgn(x_T), h_T⟩ + ‖h_{T^c}‖_{ℓ1}.

Letting v = A* w be our (inexact) dual vector, we have

    ⟨sgn(x_T), h_T⟩ = ⟨sgn(x_T) − v_T, h_T⟩ + ⟨v_T, h_T⟩ = ⟨sgn(x_T) − v_T, h_T⟩ − ⟨v_{T^c}, h_{T^c}⟩,

where we used ⟨v_T, h_T⟩ = ⟨v, h⟩ − ⟨v_{T^c}, h_{T^c}⟩ = −⟨v_{T^c}, h_{T^c}⟩ since ⟨v, h⟩ = ⟨w, Ah⟩ = 0. The Cauchy-Schwarz inequality together with the properties of v yield

    |⟨sgn(x_T), h_T⟩| ≤ (1/4)( ‖h_T‖ + ‖h_{T^c}‖_{ℓ1} )

and, therefore,

    ‖x̂‖_{ℓ1} ≥ ‖x‖_{ℓ1} − (1/4) ‖h_T‖ + (3/4) ‖h_{T^c}‖_{ℓ1}.

We now bound ‖h_T‖. First, it follows from (3.2) that ‖h_T‖ ≤ 2 ‖A_T* A_{T^c} h_{T^c}‖, since

    h_T = (A_T* A_T)^{-1} A_T* A_T h_T = −(A_T* A_T)^{-1} A_T* A_{T^c} h_{T^c}.

Second,

    ‖A_T* A_{T^c} h_{T^c}‖ ≤ Σ_{i ∈ T^c} ‖A_T* A_{i}‖ |h_i| ≤ max_{i ∈ T^c} ‖A_T* A_{i}‖ ‖h_{T^c}‖_{ℓ1} ≤ ‖h_{T^c}‖_{ℓ1}.

In conclusion, ‖h_T‖ ≤ 2 ‖h_{T^c}‖_{ℓ1} and thus,

    ‖x̂‖_{ℓ1} ≥ ‖x‖_{ℓ1} + (1/4) ‖h_{T^c}‖_{ℓ1}.

This implies h_{T^c} = 0, which in turn implies h_T = 0 since we must have A_T h_T = Ah = 0 (and A_T has full rank).

Lemma 3.3 (Existence of a dual certificate). Under the hypotheses of Theorem 1.1, one can find v in R^n obeying the conditions of Lemma 3.2 with probability at least 1 − e^{−β} − 1/n.

This lemma, which is proved next, implies Theorem 1.1. The reason is that we just need to verify the conditions (3.2). However, by Lemmas 2.1 and 2.6, they jointly hold with probability at least 1 − 3/n provided that m ≥ µ s (19 log n + 2) (recall that µ is a shorthand for µ(F)).

3.2 Proof of Lemma 3.3

The proof uses the clever golfing scheme introduced in [26]. Partition A into row blocks so that, from now on, A_1 consists of the first m_1 rows of the matrix A, A_2 of the next m_2 rows, and so on. The l matrices {A_i}_{i=1}^{l} are independently distributed, and we have m_1 + m_2 + ··· + m_l = m. As before, A_{i,T} is the restriction of A_i to the columns in T.

The golfing scheme then starts with v_0 = 0, inductively defines

    v_i = (m/m_i) A_i* A_{i,T} ( sgn(x_T) − v_{i-1,T} ) + v_{i-1}
for i = 1, ..., l, and sets v = v_l. Clearly v is in the row space of A. To simplify notation, let q_i = sgn(x_T) − v_{i,T}, and observe the two identities

    q_i = ( I − (m/m_i) A_{i,T}* A_{i,T} ) q_{i-1} = Π_{j=1}^{i} ( I − (m/m_j) A_{j,T}* A_{j,T} ) sgn(x_T)   (3.4)

and

    v = Σ_{i=1}^{l} (m/m_i) A_i* A_{i,T} q_{i-1},   (3.5)

which shall be used frequently. From (3.4) and the fact that I − (m/m_i) A_{i,T}* A_{i,T} should be a contraction (local isometry E1), we see that the norm of q_i decreases geometrically fast (the terminology comes from this fact, since each iteration brings us closer to the target just as each golf shot would bring us closer to the hole), so that v_T should be close to sgn(x_T). Hopefully, the process keeps the size of v_{T^c} under control as well.

To control the size of v_{T^c} and that of sgn(x_T) − v_T, we claim that the following inequalities hold for each i with high probability: first,

    ‖q_i‖ ≤ c_i ‖q_{i-1}‖   (3.6)

and, second,

    ‖ (m/m_i) A_{i,T^c}* A_{i,T} q_{i-1} ‖_{ℓ∞} ≤ t_i ‖q_{i-1}‖   (3.7)

(the values of the parameters t_i and c_i will be specified later). Let p_1(i) (resp. p_2(i)) be the probability that the bound (3.6) (resp. (3.7)) does not hold. Lemma 2.3 (Estimate E2) gives

    p_1(i) ≤ exp( − (1/4)( c_i √(m_i/(sµ)) − 1 )² ).   (3.8)

Thus, if

    m_i ≥ ( 2 + 8(β + log α) ) c_i^{-2} s µ,   (3.9)

then p_1(i) ≤ (1/α) e^{−β}. Next, Lemma 2.5 (Estimate E3) gives

    p_2(i) ≤ 2n exp( − (m_i t_i²) / ( 2µ (1 + (1/3)√s t_i) ) ).   (3.10)

Thus, if

    m_i ≥ ( 2/(t_i² s) + 2/(3 t_i √s) ) ( β + log(2α) + log n ) s µ,   (3.11)

then p_2(i) ≤ (1/α) e^{−β}.

It is now time to set the number of blocks l, the block sizes m_i and the values of the parameters c_i and t_i. These are as follows:

• l = ⌈(log_2 s)/2⌉ + 2;
• c_1 = c_2 = 1/[2√(log n)] and c_i = 1/2 for 3 ≤ i ≤ l;
• t_1 = t_2 = 1/[8√s] and t_i = √(log n)/[8√s] for 3 ≤ i ≤ l;
• m_1, m_2 ≥ 35(1 + log 4 + β) s µ c_i^{-2} and m_i ≥ 35(1 + log 6 + β) s µ c_i^{-2} for 3 ≤ i ≤ l.

Note that under the assumptions of the lemma, m ≥ Σ_i m_i (extra rows need not be used in the construction of the dual certificate). To see why v is a valid certificate, suppose first that for each i, (3.6) and (3.7) hold. Then (3.4) gives

    ‖sgn(x_T) − v_T‖ = ‖q_l‖ ≤ Π_{i=1}^{l} c_i ‖sgn(x_T)‖ ≤ √s (1/(4 log n)) (1/2)^{l-2} ≤ 1/4,

as desired. Further, (3.5) yields

    ‖v_{T^c}‖_{ℓ∞} ≤ Σ_{i=1}^{l} (m/m_i) ‖A_{i,T^c}* A_{i,T} q_{i-1}‖_{ℓ∞} ≤ Σ_{i=1}^{l} t_i ‖q_{i-1}‖ ≤ √s Σ_{i=1}^{l} t_i Π_{j=1}^{i-1} c_j.

(For i = 1, we take Π_{j=1}^{0} c_j = 1.) Now with our choice of parameters, the right-hand side is bounded above by

    (1/8) ( 1 + 1/(2√(log n)) + (√(log n)/(4 log n))(1 + 1/2 + ···) ) < 1/4,

which is the desired conclusion.

Now we must show that the bounds (3.6), (3.7) hold with probability at least 1 − e^{−β} − 1/n. It follows from (3.9) and (3.11) that p_1(i), p_2(i) ≤ (1/4) e^{−β} for i = 1, 2 and p_1(i), p_2(i) ≤ (1/6) e^{−β} ≤ 1/6 for i ≥ 3. Thus, p_1(1) + p_1(2) + p_2(1) + p_2(2) ≤ e^{−β} and p_1(i) + p_2(i) ≤ 1/3 for i ≥ 3. Now the union bound would never show that (3.6) and (3.7) hold with probability at least 1 − 1/n for all i ≥ 3 because of the weak bound on p_1(i) + p_2(i). However, using a clever idea in [26], it is not necessary for each subset of rows A_i to succeed and give the desired bounds. Instead, one can sample a few extra batches of rows, and throw out those that fail our requirements. We only need l − 2 working batches after the first 2. In particular, pick l̄ + 2 > l batches of rows, so that we require

    m_1, m_2 ≥ 140(1 + log 4 + β) µ s log n   and   m_3, ..., m_{l̄+2} ≥ 140(1 + log 6 + β) s µ

(note that we have made no attempt to optimize constants). Now as in [26], let N be the number of batches after the first 2 obeying (3.6) and (3.7); this N is larger (probabilistically) than a binomial(l̄, 2/3) random variable. Then by Hoeffding's inequality [3, Theorem 2.3a],

    P( N < l − 2 ) ≤ exp( − 2 ( (2/3) l̄ − l + 2 )² / l̄ ),

and thus if we pick l̄ = 3 log n + 1, we have

    P( N < l − 2 ) ≤ 1/n.

In summary, from p_1(1) + p_2(1) + p_1(2) + p_2(2) ≤ e^{−β} and the calculation above, the dual certificate v obeys the required properties with probability at least 1 − 1/n − e^{−β}, provided that m ≥ C(1 + β) µ s log n.
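The sketch below (ours, not from the paper) mimics the golfing construction numerically for the Gaussian ensemble: it iterates the update (3.5), tracks q_i = sgn(x_T) − v_{i,T}, and checks the two conditions of Lemma 3.2. Equal block sizes and the constant 60 are simplifications; the paper's choice of m_i, c_i and t_i is more careful.

```python
import numpy as np

rng = np.random.default_rng(5)
n, s = 400, 6
x_sign = np.sign(rng.standard_normal(s))      # sgn(x_T) for an arbitrary sign pattern
T = rng.choice(n, size=s, replace=False)
Tc = np.setdiff1d(np.arange(n), T)

l = int(np.ceil(np.log2(s) / 2)) + 2          # number of golfing blocks
m_i = int(60 * s * np.log(n))                 # equal block sizes (illustrative, not the paper's)
m = l * m_i
A = rng.standard_normal((m, n)) / np.sqrt(m)  # rows a_k ~ N(0, I) scaled by 1/sqrt(m)

v = np.zeros(n)
q = x_sign.copy()                             # q_0 = sgn(x_T) - v_{0,T} with v_0 = 0
for i in range(l):
    Ai = A[i * m_i:(i + 1) * m_i, :]
    # golfing update: v_i = (m/m_i) A_i^T A_{i,T} q_{i-1} + v_{i-1}
    v = v + (m / m_i) * Ai.T @ (Ai[:, T] @ q)
    q = x_sign - v[T]                         # q_i shrinks geometrically with i

print("||v_T - sgn(x_T)|| =", np.linalg.norm(v[T] - x_sign), "(Lemma 3.2 asks <= 1/4)")
print("||v_{T^c}||_inf    =", np.abs(v[Tc]).max(), "(Lemma 3.2 asks <= 1/4)")
```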

4 General Signal Recovery from Noisy Data

We prove the general recovery theorems from Section 1.6 under the assumption of Gaussian white noise, but would like to emphasize that the same result would hold for other noise distributions. Specifically, suppose we have the noisy model

    y = Ax + z,   where   ‖A* z‖_{ℓ∞} ≤ λ_n   (4.1)

holds with high probability. Then the conclusions of Theorem 1.3 remain valid. In detail, the Dantzig selector with constraint ‖A*(y − A x̄)‖_{ℓ∞} ≤ 4λ_n obeys

    ‖x̂ − x‖ ≤ C (1 + α²) [ ‖x − x_s‖_{ℓ1}/√s + λ_n √s ]   (4.2)

with high probability. Hence, (1.15) is a special case corresponding to λ_n = 2.5 σ_m √(log n) = 2.5 σ √(log n / m). Likewise, the bound on the l1 loss (1.16) holds with λ_n in place of σ √(log n / m) as well. A similar generality applies to the LASSO, although in this case we need a second noise correlation bound, namely, ‖A_{T^c}*(I − P) z‖_{ℓ∞} ≤ λ_n, where P = A_T (A_T* A_T)^{-1} A_T* is the projection onto the range of A_T.

Now when z ~ N(0, I) and A is a fixed matrix, we have

    ‖A* z‖_{ℓ∞} ≤ 2 ‖A‖_{1,2} √(log n)   (4.3)

with probability at least 1 − 1/(2n); recall that ‖A‖_{1,2} is the maximum column norm of A. Indeed, the ith component of A* z is distributed as N(0, ‖A_{i}‖²_{ℓ2}) and, therefore, the union bound gives

    P( ‖A* z‖_{ℓ∞} > 2 ‖A‖_{1,2} √(log n) ) ≤ n P( |N(0, 1)| > 2 √(log n) ).

The conclusion follows for n ≥ 2 from the well-known tail bound P(|N(0,1)| > t) ≤ 2φ(t)/t, where φ is the density of the standard normal distribution. The same steps demonstrate that

    ‖A*(I − P) z‖_{ℓ∞} ≤ 2 ‖(I − P)A‖_{1,2} √(log n) ≤ 2 ‖A‖_{1,2} √(log n)   (4.4)

with probability at least 1 − 1/(2n).

4.1 Proof of Theorem 1.2

We begin with a few simplifying assumptions. First, we assume in the proof that σ_m = 1 since the general result follows from a simple rescaling. Second, because we are interested in situations where m is much smaller than n, we assume for simplicity of presentation that m ≤ n, although our results extend with only a change to the numerical constants involved if n = O(m). In truth, they extend without any assumption on the relation between m and n, but the general presentation becomes a bit more complicated.

Fix s obeying s ≤ s̄, and let T = supp(x_s). We prove the error bounds of Theorem 1.2 with s fixed, and the final result follows by considering the s which minimizes either the l2 (1.11) or l1 (1.12) error bound. This is proper since the minimizing s has a deterministic value. With T as above, we assume in the rest of the proof that
(i) all of the requirements for noiseless recovery in Lemma 3.2 are met (and in particular, the inexact dual vector is constructed for x_T),

(ii) and that the inexact dual vector v of Section 3 is successfully constructed.

All of this occurs with probability at least 1 − 4/n − e^{−β}. Further, we assume that

(iii) the weak RIP holds with δ = 1/4, T as above, and r = m / [ C(1+β) µ log n log m log² s ].

This occurs with probability at least 1 − 5e^{−β}, and implies the RIP at sparsity level r and restricted isometry constant δ = 1/4. Lastly, we assume

(iv) the noise correlation bound

    ‖A* z‖_{ℓ∞} ≤ 2.5 √(log n).   (4.5)

Assuming the weak RIP above (Estimate E5), which implies ‖A‖_{1,2} ≤ 5/4, the conditional probability that this occurs is at least 1 − 1/(2n) because of (4.3).

Because the weak RIP implies the local isometry condition E1 with δ = 1/4, all of these conditions together hold with probability at least 1 − 4/n − 6e^{−β}. All of the steps in the proof are now deterministic consequences of (i)–(iv); from now on, we will assume they hold.

With h = x̂ − x, our goal is to bound both the l2 and l1 norms of h. We will do this with a pair of lemmas. The first is frequently used (recall that λ is set to 10√(log n)).

Lemma 4.1 (Tube constraint). The error h obeys

    ‖A* A h‖_{ℓ∞} ≤ (5/4) λ.

Proof. As shown in [9], writing that the zero vector is a subgradient of the LASSO functional (1/2)‖y − A x̄‖²_{ℓ2} + λ ‖x̄‖_{ℓ1} at x̄ = x̂ gives

    ‖A*(y − A x̂)‖_{ℓ∞} ≤ λ.

Then it follows from the triangle inequality that

    ‖A* A h‖_{ℓ∞} ≤ ‖A*(y − A x̂)‖_{ℓ∞} + ‖A* z‖_{ℓ∞} ≤ λ + ‖A* z‖_{ℓ∞},

where z is our noise term. The claim is a consequence of (4.5).

Lemma 4.2. The error h obeys

    ‖h_{T^c}‖_{ℓ1} ≤ C_0 ( sλ + ‖x_{T^c}‖_{ℓ1} )   (4.6)

for some numerical constant C_0.

Before proving this lemma, we show that it gives Theorem 1.2. Some of the steps are taken from the proof of Theorem 1.1 in [6].

Proof [Theorem 1.2]. Set r as in (iii) above. We begin by partitioning T^c: let T_1 be the indices of the r largest entries of h_{T^c}, T_2 be those of the next r largest, and so on. We first bound ‖h_{T ∪ T_1}‖ and set T̄ = T ∪ T_1 for short. The weak RIP assumption (iii) gives

    (3/4) ‖h_{T̄}‖²_{ℓ2} ≤ ‖A_{T̄} h_{T̄}‖²_{ℓ2} = ⟨A_{T̄} h_{T̄}, Ah⟩ − ⟨A_{T̄} h_{T̄}, A_{T̄^c} h_{T̄^c}⟩.   (4.7)
From Lemma 4.1, we have

    |⟨A_{T̄} h_{T̄}, Ah⟩| = |⟨h_{T̄}, A_{T̄}* A h⟩| ≤ ‖h_{T̄}‖_{ℓ1} ‖A_{T̄}* A h‖_{ℓ∞} ≤ (5/4) λ ‖h_{T̄}‖_{ℓ1}.

Since T̄ has cardinality at most 2s, the Cauchy-Schwarz inequality gives

    |⟨A_{T̄} h_{T̄}, Ah⟩| ≤ (5/4) λ √(2s) ‖h_{T̄}‖.   (4.8)

Next, we bound |⟨A_{T̄} h_{T̄}, A_{T̄^c} h_{T̄^c}⟩| ≤ |⟨A_T h_T, A_{T̄^c} h_{T̄^c}⟩| + |⟨A_{T_1} h_{T_1}, A_{T̄^c} h_{T̄^c}⟩|. We have

    |⟨A_T h_T, A_{T̄^c} h_{T̄^c}⟩| ≤ Σ_{j≥2} |⟨A_T h_T, A_{T_j} h_{T_j}⟩|.   (4.9)

As shown in [4, Lemma 2.1], the parallelogram identity together with the weak RIP imply that |⟨A_T h_T, A_{T_j} h_{T_j}⟩| ≤ (1/4) ‖h_T‖ ‖h_{T_j}‖ and, therefore,

    |⟨A_T h_T, A_{T̄^c} h_{T̄^c}⟩| ≤ (1/4) ‖h_T‖ Σ_{j≥2} ‖h_{T_j}‖.   (4.10)

To bound the summation, we use the now standard result [6, (3.10)] which gives

    Σ_{j≥2} ‖h_{T_j}‖ ≤ r^{-1/2} ‖h_{T^c}‖_{ℓ1},   (4.11)

and thus |⟨A_T h_T, A_{T̄^c} h_{T̄^c}⟩| ≤ (1/4) r^{-1/2} ‖h_T‖ ‖h_{T^c}‖_{ℓ1}. The same analysis yields |⟨A_{T_1} h_{T_1}, A_{T̄^c} h_{T̄^c}⟩| ≤ (1/4) r^{-1/2} ‖h_{T_1}‖ ‖h_{T^c}‖_{ℓ1} and thus,

    |⟨A_{T̄} h_{T̄}, A_{T̄^c} h_{T̄^c}⟩| ≤ (√2/4) r^{-1/2} ‖h_{T̄}‖ ‖h_{T^c}‖_{ℓ1}.

Plugging these estimates into (4.7) gives

    ‖h_{T̄}‖ ≤ (5/3) √(2s) λ + r^{-1/2} ‖h_{T^c}‖_{ℓ1}.   (4.12)

The conclusion is now one step away. Obviously,

    ‖h‖ ≤ ‖h_{T̄}‖ + Σ_{j≥2} ‖h_{T_j}‖ ≤ ‖h_{T̄}‖ + r^{-1/2} ‖h_{T^c}‖_{ℓ1} ≤ (5/3) √(2s) λ + 2 r^{-1/2} ‖h_{T^c}‖_{ℓ1},

where the second inequality follows from (4.11) and the last from (4.12). Lemma 4.2 completes the proof for the l2 error. For the l1 error, note that by the Cauchy-Schwarz inequality

    ‖h‖_{ℓ1} = ‖h_T‖_{ℓ1} + ‖h_{T^c}‖_{ℓ1} ≤ √s ‖h_T‖ + ‖h_{T^c}‖_{ℓ1} ≤ √s ‖h_{T̄}‖ + ‖h_{T^c}‖_{ℓ1}.

Combine this with (4.12) and Lemma 4.2. (Note that if we had not set σ_m = 1 at the beginning of the proof, we would have λ = 10 σ_m √(log n), leading to the general error bound.)


Weighted- 1 minimization with multiple weighting sets Weighted- 1 iniization with ultiple weighting sets Hassan Mansour a,b and Özgür Yılaza a Matheatics Departent, University of British Colubia, Vancouver - BC, Canada; b Coputer Science Departent, University

More information

Polygonal Designs: Existence and Construction

Polygonal Designs: Existence and Construction Polygonal Designs: Existence and Construction John Hegean Departent of Matheatics, Stanford University, Stanford, CA 9405 Jeff Langford Departent of Matheatics, Drake University, Des Moines, IA 5011 G

More information

Support recovery in compressed sensing: An estimation theoretic approach

Support recovery in compressed sensing: An estimation theoretic approach Support recovery in copressed sensing: An estiation theoretic approach Ain Karbasi, Ali Horati, Soheil Mohajer, Martin Vetterli School of Coputer and Counication Sciences École Polytechnique Fédérale de

More information

Chaotic Coupled Map Lattices

Chaotic Coupled Map Lattices Chaotic Coupled Map Lattices Author: Dustin Keys Advisors: Dr. Robert Indik, Dr. Kevin Lin 1 Introduction When a syste of chaotic aps is coupled in a way that allows the to share inforation about each

More information

In this chapter, we consider several graph-theoretic and probabilistic models

In this chapter, we consider several graph-theoretic and probabilistic models THREE ONE GRAPH-THEORETIC AND STATISTICAL MODELS 3.1 INTRODUCTION In this chapter, we consider several graph-theoretic and probabilistic odels for a social network, which we do under different assuptions

More information

Physics 215 Winter The Density Matrix

Physics 215 Winter The Density Matrix Physics 215 Winter 2018 The Density Matrix The quantu space of states is a Hilbert space H. Any state vector ψ H is a pure state. Since any linear cobination of eleents of H are also an eleent of H, it

More information

3.8 Three Types of Convergence

3.8 Three Types of Convergence 3.8 Three Types of Convergence 3.8 Three Types of Convergence 93 Suppose that we are given a sequence functions {f k } k N on a set X and another function f on X. What does it ean for f k to converge to

More information

An RIP-based approach to Σ quantization for compressed sensing

An RIP-based approach to Σ quantization for compressed sensing An RIP-based approach to Σ quantization for copressed sensing Joe-Mei Feng and Felix Kraher October, 203 Abstract In this paper, we provide new approach to estiating the error of reconstruction fro Σ quantized

More information

Optimal Jamming Over Additive Noise: Vector Source-Channel Case

Optimal Jamming Over Additive Noise: Vector Source-Channel Case Fifty-first Annual Allerton Conference Allerton House, UIUC, Illinois, USA October 2-3, 2013 Optial Jaing Over Additive Noise: Vector Source-Channel Case Erah Akyol and Kenneth Rose Abstract This paper

More information

Hybrid System Identification: An SDP Approach

Hybrid System Identification: An SDP Approach 49th IEEE Conference on Decision and Control Deceber 15-17, 2010 Hilton Atlanta Hotel, Atlanta, GA, USA Hybrid Syste Identification: An SDP Approach C Feng, C M Lagoa, N Ozay and M Sznaier Abstract The

More information

Hamming Compressed Sensing

Hamming Compressed Sensing Haing Copressed Sensing Tianyi Zhou, and Dacheng Tao, Meber, IEEE Abstract arxiv:.73v2 [cs.it] Oct 2 Copressed sensing CS and -bit CS cannot directly recover quantized signals and require tie consuing

More information

e-companion ONLY AVAILABLE IN ELECTRONIC FORM

e-companion ONLY AVAILABLE IN ELECTRONIC FORM OPERATIONS RESEARCH doi 10.1287/opre.1070.0427ec pp. ec1 ec5 e-copanion ONLY AVAILABLE IN ELECTRONIC FORM infors 07 INFORMS Electronic Copanion A Learning Approach for Interactive Marketing to a Custoer

More information

1 Generalization bounds based on Rademacher complexity

1 Generalization bounds based on Rademacher complexity COS 5: Theoretical Machine Learning Lecturer: Rob Schapire Lecture #0 Scribe: Suqi Liu March 07, 08 Last tie we started proving this very general result about how quickly the epirical average converges

More information

COS 424: Interacting with Data. Written Exercises

COS 424: Interacting with Data. Written Exercises COS 424: Interacting with Data Hoework #4 Spring 2007 Regression Due: Wednesday, April 18 Written Exercises See the course website for iportant inforation about collaboration and late policies, as well

More information

Lecture 9 November 23, 2015

Lecture 9 November 23, 2015 CSC244: Discrepancy Theory in Coputer Science Fall 25 Aleksandar Nikolov Lecture 9 Noveber 23, 25 Scribe: Nick Spooner Properties of γ 2 Recall that γ 2 (A) is defined for A R n as follows: γ 2 (A) = in{r(u)

More information

Model Fitting. CURM Background Material, Fall 2014 Dr. Doreen De Leon

Model Fitting. CURM Background Material, Fall 2014 Dr. Doreen De Leon Model Fitting CURM Background Material, Fall 014 Dr. Doreen De Leon 1 Introduction Given a set of data points, we often want to fit a selected odel or type to the data (e.g., we suspect an exponential

More information

arxiv: v1 [cs.ds] 17 Mar 2016

arxiv: v1 [cs.ds] 17 Mar 2016 Tight Bounds for Single-Pass Streaing Coplexity of the Set Cover Proble Sepehr Assadi Sanjeev Khanna Yang Li Abstract arxiv:1603.05715v1 [cs.ds] 17 Mar 2016 We resolve the space coplexity of single-pass

More information

Course Notes for EE227C (Spring 2018): Convex Optimization and Approximation

Course Notes for EE227C (Spring 2018): Convex Optimization and Approximation Course Notes for EE227C (Spring 2018): Convex Optiization and Approxiation Instructor: Moritz Hardt Eail: hardt+ee227c@berkeley.edu Graduate Instructor: Max Sichowitz Eail: sichow+ee227c@berkeley.edu October

More information

The Weierstrass Approximation Theorem

The Weierstrass Approximation Theorem 36 The Weierstrass Approxiation Theore Recall that the fundaental idea underlying the construction of the real nubers is approxiation by the sipler rational nubers. Firstly, nubers are often deterined

More information

Computational and Statistical Learning Theory

Computational and Statistical Learning Theory Coputational and Statistical Learning Theory Proble sets 5 and 6 Due: Noveber th Please send your solutions to learning-subissions@ttic.edu Notations/Definitions Recall the definition of saple based Radeacher

More information

Probability Distributions

Probability Distributions Probability Distributions In Chapter, we ephasized the central role played by probability theory in the solution of pattern recognition probles. We turn now to an exploration of soe particular exaples

More information

A Simple Homotopy Algorithm for Compressive Sensing

A Simple Homotopy Algorithm for Compressive Sensing A Siple Hootopy Algorith for Copressive Sensing Lijun Zhang Tianbao Yang Rong Jin Zhi-Hua Zhou National Key Laboratory for Novel Software Technology, Nanjing University, Nanjing, China Departent of Coputer

More information

Kernel Methods and Support Vector Machines

Kernel Methods and Support Vector Machines Intelligent Systes: Reasoning and Recognition Jaes L. Crowley ENSIAG 2 / osig 1 Second Seester 2012/2013 Lesson 20 2 ay 2013 Kernel ethods and Support Vector achines Contents Kernel Functions...2 Quadratic

More information

Stochastic Subgradient Methods

Stochastic Subgradient Methods Stochastic Subgradient Methods Lingjie Weng Yutian Chen Bren School of Inforation and Coputer Science University of California, Irvine {wengl, yutianc}@ics.uci.edu Abstract Stochastic subgradient ethods

More information

Fourier Series Summary (From Salivahanan et al, 2002)

Fourier Series Summary (From Salivahanan et al, 2002) Fourier Series Suary (Fro Salivahanan et al, ) A periodic continuous signal f(t), - < t

More information

Multi-Dimensional Hegselmann-Krause Dynamics

Multi-Dimensional Hegselmann-Krause Dynamics Multi-Diensional Hegselann-Krause Dynaics A. Nedić Industrial and Enterprise Systes Engineering Dept. University of Illinois Urbana, IL 680 angelia@illinois.edu B. Touri Coordinated Science Laboratory

More information

On the Use of A Priori Information for Sparse Signal Approximations

On the Use of A Priori Information for Sparse Signal Approximations ITS TECHNICAL REPORT NO. 3/4 On the Use of A Priori Inforation for Sparse Signal Approxiations Oscar Divorra Escoda, Lorenzo Granai and Pierre Vandergheynst Signal Processing Institute ITS) Ecole Polytechnique

More information

ASSUME a source over an alphabet size m, from which a sequence of n independent samples are drawn. The classical

ASSUME a source over an alphabet size m, from which a sequence of n independent samples are drawn. The classical IEEE TRANSACTIONS ON INFORMATION THEORY Large Alphabet Source Coding using Independent Coponent Analysis Aichai Painsky, Meber, IEEE, Saharon Rosset and Meir Feder, Fellow, IEEE arxiv:67.7v [cs.it] Jul

More information

Interactive Markov Models of Evolutionary Algorithms

Interactive Markov Models of Evolutionary Algorithms Cleveland State University EngagedScholarship@CSU Electrical Engineering & Coputer Science Faculty Publications Electrical Engineering & Coputer Science Departent 2015 Interactive Markov Models of Evolutionary

More information

Extension of CSRSM for the Parametric Study of the Face Stability of Pressurized Tunnels

Extension of CSRSM for the Parametric Study of the Face Stability of Pressurized Tunnels Extension of CSRSM for the Paraetric Study of the Face Stability of Pressurized Tunnels Guilhe Mollon 1, Daniel Dias 2, and Abdul-Haid Soubra 3, M.ASCE 1 LGCIE, INSA Lyon, Université de Lyon, Doaine scientifique

More information

The Hilbert Schmidt version of the commutator theorem for zero trace matrices

The Hilbert Schmidt version of the commutator theorem for zero trace matrices The Hilbert Schidt version of the coutator theore for zero trace atrices Oer Angel Gideon Schechtan March 205 Abstract Let A be a coplex atrix with zero trace. Then there are atrices B and C such that

More information

Reed-Muller codes for random erasures and errors

Reed-Muller codes for random erasures and errors Reed-Muller codes for rando erasures and errors Eanuel Abbe Air Shpilka Avi Wigderson Abstract This paper studies the paraeters for which Reed-Muller (RM) codes over GF (2) can correct rando erasures and

More information

Compressed Sensing and Sparse Recovery

Compressed Sensing and Sparse Recovery ELE 538B: Sparsity, Structure and Inference Compressed Sensing and Sparse Recovery Yuxin Chen Princeton University, Spring 217 Outline Restricted isometry property (RIP) A RIPless theory Compressed sensing

More information

Recovery of Sparsely Corrupted Signals

Recovery of Sparsely Corrupted Signals TO APPEAR IN IEEE TRANSACTIONS ON INFORMATION TEORY 1 Recovery of Sparsely Corrupted Signals Christoph Studer, Meber, IEEE, Patrick Kuppinger, Student Meber, IEEE, Graee Pope, Student Meber, IEEE, and

More information

A Note on Scheduling Tall/Small Multiprocessor Tasks with Unit Processing Time to Minimize Maximum Tardiness

A Note on Scheduling Tall/Small Multiprocessor Tasks with Unit Processing Time to Minimize Maximum Tardiness A Note on Scheduling Tall/Sall Multiprocessor Tasks with Unit Processing Tie to Miniize Maxiu Tardiness Philippe Baptiste and Baruch Schieber IBM T.J. Watson Research Center P.O. Box 218, Yorktown Heights,

More information

A Low-Complexity Congestion Control and Scheduling Algorithm for Multihop Wireless Networks with Order-Optimal Per-Flow Delay

A Low-Complexity Congestion Control and Scheduling Algorithm for Multihop Wireless Networks with Order-Optimal Per-Flow Delay A Low-Coplexity Congestion Control and Scheduling Algorith for Multihop Wireless Networks with Order-Optial Per-Flow Delay Po-Kai Huang, Xiaojun Lin, and Chih-Chun Wang School of Electrical and Coputer

More information

E0 370 Statistical Learning Theory Lecture 6 (Aug 30, 2011) Margin Analysis

E0 370 Statistical Learning Theory Lecture 6 (Aug 30, 2011) Margin Analysis E0 370 tatistical Learning Theory Lecture 6 (Aug 30, 20) Margin Analysis Lecturer: hivani Agarwal cribe: Narasihan R Introduction In the last few lectures we have seen how to obtain high confidence bounds

More information

Robust Spectral Compressed Sensing via Structured Matrix Completion Yuxin Chen, Student Member, IEEE, and Yuejie Chi, Member, IEEE

Robust Spectral Compressed Sensing via Structured Matrix Completion Yuxin Chen, Student Member, IEEE, and Yuejie Chi, Member, IEEE 6576 IEEE TRANSACTIONS ON INORMATION THEORY, VOL 60, NO 0, OCTOBER 04 Robust Spectral Copressed Sensing via Structured Matrix Copletion Yuxin Chen, Student Meber, IEEE, and Yuejie Chi, Meber, IEEE Abstract

More information

A1. Find all ordered pairs (a, b) of positive integers for which 1 a + 1 b = 3

A1. Find all ordered pairs (a, b) of positive integers for which 1 a + 1 b = 3 A. Find all ordered pairs a, b) of positive integers for which a + b = 3 08. Answer. The six ordered pairs are 009, 08), 08, 009), 009 337, 674) = 35043, 674), 009 346, 673) = 3584, 673), 674, 009 337)

More information

arxiv: v5 [cs.it] 16 Mar 2012

arxiv: v5 [cs.it] 16 Mar 2012 ONE-BIT COMPRESSED SENSING BY LINEAR PROGRAMMING YANIV PLAN AND ROMAN VERSHYNIN arxiv:09.499v5 [cs.it] 6 Mar 0 Abstract. We give the first coputationally tractable and alost optial solution to the proble

More information

CS Lecture 13. More Maximum Likelihood

CS Lecture 13. More Maximum Likelihood CS 6347 Lecture 13 More Maxiu Likelihood Recap Last tie: Introduction to axiu likelihood estiation MLE for Bayesian networks Optial CPTs correspond to epirical counts Today: MLE for CRFs 2 Maxiu Likelihood

More information

Lecture 21. Interior Point Methods Setup and Algorithm

Lecture 21. Interior Point Methods Setup and Algorithm Lecture 21 Interior Point Methods In 1984, Kararkar introduced a new weakly polynoial tie algorith for solving LPs [Kar84a], [Kar84b]. His algorith was theoretically faster than the ellipsoid ethod and

More information

Learnability and Stability in the General Learning Setting

Learnability and Stability in the General Learning Setting Learnability and Stability in the General Learning Setting Shai Shalev-Shwartz TTI-Chicago shai@tti-c.org Ohad Shair The Hebrew University ohadsh@cs.huji.ac.il Nathan Srebro TTI-Chicago nati@uchicago.edu

More information

Distributed Subgradient Methods for Multi-agent Optimization

Distributed Subgradient Methods for Multi-agent Optimization 1 Distributed Subgradient Methods for Multi-agent Optiization Angelia Nedić and Asuan Ozdaglar October 29, 2007 Abstract We study a distributed coputation odel for optiizing a su of convex objective functions

More information

Ch 12: Variations on Backpropagation

Ch 12: Variations on Backpropagation Ch 2: Variations on Backpropagation The basic backpropagation algorith is too slow for ost practical applications. It ay take days or weeks of coputer tie. We deonstrate why the backpropagation algorith

More information

A Better Algorithm For an Ancient Scheduling Problem. David R. Karger Steven J. Phillips Eric Torng. Department of Computer Science

A Better Algorithm For an Ancient Scheduling Problem. David R. Karger Steven J. Phillips Eric Torng. Department of Computer Science A Better Algorith For an Ancient Scheduling Proble David R. Karger Steven J. Phillips Eric Torng Departent of Coputer Science Stanford University Stanford, CA 9435-4 Abstract One of the oldest and siplest

More information

Computational and Statistical Learning Theory

Computational and Statistical Learning Theory Coputational and Statistical Learning Theory TTIC 31120 Prof. Nati Srebro Lecture 2: PAC Learning and VC Theory I Fro Adversarial Online to Statistical Three reasons to ove fro worst-case deterinistic

More information

Least Squares Fitting of Data

Least Squares Fitting of Data Least Squares Fitting of Data David Eberly, Geoetric Tools, Redond WA 98052 https://www.geoetrictools.co/ This work is licensed under the Creative Coons Attribution 4.0 International License. To view a

More information

Using a De-Convolution Window for Operating Modal Analysis

Using a De-Convolution Window for Operating Modal Analysis Using a De-Convolution Window for Operating Modal Analysis Brian Schwarz Vibrant Technology, Inc. Scotts Valley, CA Mark Richardson Vibrant Technology, Inc. Scotts Valley, CA Abstract Operating Modal Analysis

More information

The Simplex Method is Strongly Polynomial for the Markov Decision Problem with a Fixed Discount Rate

The Simplex Method is Strongly Polynomial for the Markov Decision Problem with a Fixed Discount Rate The Siplex Method is Strongly Polynoial for the Markov Decision Proble with a Fixed Discount Rate Yinyu Ye April 20, 2010 Abstract In this note we prove that the classic siplex ethod with the ost-negativereduced-cost

More information

Testing Properties of Collections of Distributions

Testing Properties of Collections of Distributions Testing Properties of Collections of Distributions Reut Levi Dana Ron Ronitt Rubinfeld April 9, 0 Abstract We propose a fraework for studying property testing of collections of distributions, where the

More information

Tight Complexity Bounds for Optimizing Composite Objectives

Tight Complexity Bounds for Optimizing Composite Objectives Tight Coplexity Bounds for Optiizing Coposite Objectives Blake Woodworth Toyota Technological Institute at Chicago Chicago, IL, 60637 blake@ttic.edu Nathan Srebro Toyota Technological Institute at Chicago

More information

On the theoretical analysis of cross validation in compressive sensing

On the theoretical analysis of cross validation in compressive sensing MITSUBISHI ELECTRIC RESEARCH LABORATORIES http://www.erl.co On the theoretical analysis of cross validation in copressive sensing Zhang, J.; Chen, L.; Boufounos, P.T.; Gu, Y. TR2014-025 May 2014 Abstract

More information

List Scheduling and LPT Oliver Braun (09/05/2017)

List Scheduling and LPT Oliver Braun (09/05/2017) List Scheduling and LPT Oliver Braun (09/05/207) We investigate the classical scheduling proble P ax where a set of n independent jobs has to be processed on 2 parallel and identical processors (achines)

More information

arxiv: v1 [math.nt] 14 Sep 2014

arxiv: v1 [math.nt] 14 Sep 2014 ROTATION REMAINDERS P. JAMESON GRABER, WASHINGTON AND LEE UNIVERSITY 08 arxiv:1409.411v1 [ath.nt] 14 Sep 014 Abstract. We study properties of an array of nubers, called the triangle, in which each row

More information

Necessity of low effective dimension

Necessity of low effective dimension Necessity of low effective diension Art B. Owen Stanford University October 2002, Orig: July 2002 Abstract Practitioners have long noticed that quasi-monte Carlo ethods work very well on functions that

More information

Solutions of some selected problems of Homework 4

Solutions of some selected problems of Homework 4 Solutions of soe selected probles of Hoework 4 Sangchul Lee May 7, 2018 Proble 1 Let there be light A professor has two light bulbs in his garage. When both are burned out, they are replaced, and the next

More information

A Note on the Applied Use of MDL Approximations

A Note on the Applied Use of MDL Approximations A Note on the Applied Use of MDL Approxiations Daniel J. Navarro Departent of Psychology Ohio State University Abstract An applied proble is discussed in which two nested psychological odels of retention

More information

Compressive Sensing Over Networks

Compressive Sensing Over Networks Forty-Eighth Annual Allerton Conference Allerton House, UIUC, Illinois, USA Septeber 29 - October, 200 Copressive Sensing Over Networks Soheil Feizi MIT Eail: sfeizi@it.edu Muriel Médard MIT Eail: edard@it.edu

More information

Non-Parametric Non-Line-of-Sight Identification 1

Non-Parametric Non-Line-of-Sight Identification 1 Non-Paraetric Non-Line-of-Sight Identification Sinan Gezici, Hisashi Kobayashi and H. Vincent Poor Departent of Electrical Engineering School of Engineering and Applied Science Princeton University, Princeton,

More information

This model assumes that the probability of a gap has size i is proportional to 1/i. i.e., i log m e. j=1. E[gap size] = i P r(i) = N f t.

This model assumes that the probability of a gap has size i is proportional to 1/i. i.e., i log m e. j=1. E[gap size] = i P r(i) = N f t. CS 493: Algoriths for Massive Data Sets Feb 2, 2002 Local Models, Bloo Filter Scribe: Qin Lv Local Models In global odels, every inverted file entry is copressed with the sae odel. This work wells when

More information

OPTIMIZATION in multi-agent networks has attracted

OPTIMIZATION in multi-agent networks has attracted Distributed constrained optiization and consensus in uncertain networks via proxial iniization Kostas Margellos, Alessandro Falsone, Sione Garatti and Maria Prandini arxiv:603.039v3 [ath.oc] 3 May 07 Abstract

More information

Design of Spatially Coupled LDPC Codes over GF(q) for Windowed Decoding

Design of Spatially Coupled LDPC Codes over GF(q) for Windowed Decoding IEEE TRANSACTIONS ON INFORMATION THEORY (SUBMITTED PAPER) 1 Design of Spatially Coupled LDPC Codes over GF(q) for Windowed Decoding Lai Wei, Student Meber, IEEE, David G. M. Mitchell, Meber, IEEE, Thoas

More information

Computable Shell Decomposition Bounds

Computable Shell Decomposition Bounds Coputable Shell Decoposition Bounds John Langford TTI-Chicago jcl@cs.cu.edu David McAllester TTI-Chicago dac@autoreason.co Editor: Leslie Pack Kaelbling and David Cohn Abstract Haussler, Kearns, Seung

More information

Fixed-to-Variable Length Distribution Matching

Fixed-to-Variable Length Distribution Matching Fixed-to-Variable Length Distribution Matching Rana Ali Ajad and Georg Böcherer Institute for Counications Engineering Technische Universität München, Gerany Eail: raa2463@gail.co,georg.boecherer@tu.de

More information

Constrained Consensus and Optimization in Multi-Agent Networks arxiv: v2 [math.oc] 17 Dec 2008

Constrained Consensus and Optimization in Multi-Agent Networks arxiv: v2 [math.oc] 17 Dec 2008 LIDS Report 2779 1 Constrained Consensus and Optiization in Multi-Agent Networks arxiv:0802.3922v2 [ath.oc] 17 Dec 2008 Angelia Nedić, Asuan Ozdaglar, and Pablo A. Parrilo February 15, 2013 Abstract We

More information

Convex Programming for Scheduling Unrelated Parallel Machines

Convex Programming for Scheduling Unrelated Parallel Machines Convex Prograing for Scheduling Unrelated Parallel Machines Yossi Azar Air Epstein Abstract We consider the classical proble of scheduling parallel unrelated achines. Each job is to be processed by exactly

More information

Uniform Approximation and Bernstein Polynomials with Coefficients in the Unit Interval

Uniform Approximation and Bernstein Polynomials with Coefficients in the Unit Interval Unifor Approxiation and Bernstein Polynoials with Coefficients in the Unit Interval Weiang Qian and Marc D. Riedel Electrical and Coputer Engineering, University of Minnesota 200 Union St. S.E. Minneapolis,

More information

Data Dependent Convergence For Consensus Stochastic Optimization

Data Dependent Convergence For Consensus Stochastic Optimization Data Dependent Convergence For Consensus Stochastic Optiization Avleen S. Bijral, Anand D. Sarwate, Nathan Srebro Septeber 8, 08 arxiv:603.04379v ath.oc] Sep 06 Abstract We study a distributed consensus-based

More information

Ph 20.3 Numerical Solution of Ordinary Differential Equations

Ph 20.3 Numerical Solution of Ordinary Differential Equations Ph 20.3 Nuerical Solution of Ordinary Differential Equations Due: Week 5 -v20170314- This Assignent So far, your assignents have tried to failiarize you with the hardware and software in the Physics Coputing

More information

Proc. of the IEEE/OES Seventh Working Conference on Current Measurement Technology UNCERTAINTIES IN SEASONDE CURRENT VELOCITIES

Proc. of the IEEE/OES Seventh Working Conference on Current Measurement Technology UNCERTAINTIES IN SEASONDE CURRENT VELOCITIES Proc. of the IEEE/OES Seventh Working Conference on Current Measureent Technology UNCERTAINTIES IN SEASONDE CURRENT VELOCITIES Belinda Lipa Codar Ocean Sensors 15 La Sandra Way, Portola Valley, CA 98 blipa@pogo.co

More information

CSE525: Randomized Algorithms and Probabilistic Analysis May 16, Lecture 13

CSE525: Randomized Algorithms and Probabilistic Analysis May 16, Lecture 13 CSE55: Randoied Algoriths and obabilistic Analysis May 6, Lecture Lecturer: Anna Karlin Scribe: Noah Siegel, Jonathan Shi Rando walks and Markov chains This lecture discusses Markov chains, which capture

More information

Upper bound on false alarm rate for landmine detection and classification using syntactic pattern recognition

Upper bound on false alarm rate for landmine detection and classification using syntactic pattern recognition Upper bound on false alar rate for landine detection and classification using syntactic pattern recognition Ahed O. Nasif, Brian L. Mark, Kenneth J. Hintz, and Nathalia Peixoto Dept. of Electrical and

More information

Principal Components Analysis

Principal Components Analysis Principal Coponents Analysis Cheng Li, Bingyu Wang Noveber 3, 204 What s PCA Principal coponent analysis (PCA) is a statistical procedure that uses an orthogonal transforation to convert a set of observations

More information