2886 IEEE TRANSACTIONS ON INFORMATION THEORY, VOL. 61, NO. 5, MAY 2015


Simultaneously Structured Models With Application to Sparse and Low-Rank Matrices

Samet Oymak, Student Member, IEEE, Amin Jalali, Student Member, IEEE, Maryam Fazel, Yonina C. Eldar, Fellow, IEEE, and Babak Hassibi, Member, IEEE

Abstract: Recovering structured models (e.g., sparse or group-sparse vectors, low-rank matrices) given a few linear observations has been well studied recently. In various applications in signal processing and machine learning, the model of interest is structured in several ways, for example, a matrix that is simultaneously sparse and low rank. Often norms that promote the individual structures are known, and allow for recovery using an order-wise optimal number of measurements (e.g., the ℓ₁ norm for sparsity, the nuclear norm for matrix rank). Hence, it is reasonable to minimize a combination of such norms. We show that, surprisingly, using multiobjective optimization with these norms can do no better, order-wise, than exploiting only one of the structures, thus revealing a fundamental limitation in sample complexity. This result suggests that to fully exploit the multiple structures, we need an entirely new convex relaxation. Further, specializing our results to the case of sparse and low-rank matrices, we show that a nonconvex formulation recovers the model from very few measurements (on the order of the degrees of freedom), whereas the convex problem combining the ℓ₁ and nuclear norms requires many more measurements, illustrating a gap between the performance of the convex and nonconvex recovery problems. Our framework applies to arbitrary structure-inducing norms as well as to a wide range of measurement ensembles. This allows us to give sample complexity bounds for problems such as sparse phase retrieval and low-rank tensor completion.

Index Terms: Compressed sensing, convex relaxation, regularization, sample complexity.

Manuscript received February 11, 2013; revised July 1, 2014; accepted October 9, 2014. Date of publication February 9, 2015; date of current version April 17, 2015. A. Jalali and M. Fazel were supported by the National Science Foundation under Grant ECCS and Grant CCF. Y. C. Eldar was supported in part by the Israel Science Foundation under Grant 170/10, in part by SRC, and in part by the Intel Collaborative Research Institute for Computational Intelligence. B. Hassibi was supported in part by the National Science Foundation under Grant CNS-09348 and Grants CCF, in part by the Office of Naval Research through the MURI under Grant N, in part by Qualcomm Inc., San Diego, CA, USA, and in part by King Abdulaziz University, Jeddah, Saudi Arabia.

S. Oymak is with the Department of Electrical Engineering and Computer Sciences, University of California at Berkeley, Berkeley, CA 94720 USA (e-mail: oymak@eecs.berkeley.edu). A. Jalali and M. Fazel are with the Department of Electrical Engineering, University of Washington, Seattle, WA USA (e-mail: amjalali@uw.edu; mfazel@uw.edu). Y. C. Eldar is with the Department of Electrical Engineering, Technion - Israel Institute of Technology, Haifa 32000, Israel (e-mail: yonina@ee.technion.ac.il). B. Hassibi is with the Department of Electrical Engineering, California Institute of Technology, Pasadena, CA 91125 USA (e-mail: hassibi@systems.caltech.edu). Communicated by V. Saligrama, Associate Editor for Signal Processing.

I. INTRODUCTION

RECOVERY of a structured model (signal) given a small number of linear observations has been the focus of many studies recently. Examples include recovering sparse or group-sparse vectors (which gave rise to the area of compressed sensing) [1]-[4], low-rank matrices [5], [6], and the sum of sparse and low-rank matrices [7], [8], among others.
More generally, the recovery of a signal that can be expressed as the sum of a few atoms out of an appropriate atomic set has been studied in [9]. Canonical questions in this area include: How many linear measurements are enough to recover the model by any means? How many measurements are enough for a tractable approach? In the statistics literature, these questions are often posed in terms of error rates for estimators minimizing the sum of a quadratic loss function and a regularizer that reflects the desired structure [10].

There are many applications where the model of interest is known to have several structures at the same time (Section I-B). We then seek a signal that lies in the intersection of several sets defining the individual structures (in a sense that we will later make precise). The most common convex regularizer used to promote all structures together is a linear combination of well-known regularizers for each structure. However, there is currently no general analysis and understanding of how well such regularization performs in terms of the number of observations required for successful recovery of the desired model. This paper addresses this ubiquitous yet unexplored problem, i.e., the recovery of simultaneously structured models.

An example of a simultaneously structured model is a matrix that is simultaneously sparse and low rank. One would like to come up with algorithms that exploit both types of structures to minimize the number of measurements required for recovery. An n×n matrix with rank r ≪ n can be described by O(rn) parameters, and can be recovered using O(rn) generic measurements via nuclear norm minimization [5], [11]. On the other hand, a block-sparse matrix with a k×k nonzero block where k ≪ n can be described by k² parameters and can be recovered given O(k² log(n/k)) generic measurements using ℓ₁ minimization. However, a matrix that is both rank r and block-sparse can be described by O(rk) parameters. The question is whether we can exploit this joint structure to efficiently recover such a matrix with O(rk) measurements. In this paper we give a negative answer to this question in the following sense: if we use multi-objective optimization with the ℓ₁ and nuclear norms (used for sparse signals and low-rank matrices, respectively), then the number of measurements required is lower bounded by O(min{k², rn}). In other words, we need at least this number of observations for the desired signal to be recoverable by a combination of the ℓ₁ norm and the nuclear norm. This means we can do no better than an algorithm that exploits only one of the two structures.
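The combined convex program discussed here is easy to write down with an off-the-shelf solver. The following is a minimal sketch, not taken from the paper; the sizes, the weight lam, and the planted model are illustrative assumptions. It minimizes a weighted sum of the entrywise ℓ₁ norm and the nuclear norm subject to random Gaussian measurements:

```python
# Sketch (assumed sizes and weight) of the combined l1 + nuclear norm program.
import numpy as np
import cvxpy as cp

rng = np.random.default_rng(0)
n, k, m = 20, 4, 150
a = np.zeros(n); a[:k] = 1.0
X0 = np.outer(a, a)                     # simultaneously sparse and rank one
As = rng.standard_normal((m, n, n))     # Gaussian measurement matrices A_i
b = np.einsum('mij,ij->m', As, X0)      # b_i = <A_i, X0>

lam = 1.0                               # trade-off between the two norms
X = cp.Variable((n, n))
objective = cp.Minimize(cp.sum(cp.abs(X)) + lam * cp.normNuc(X))
constraints = [cp.sum(cp.multiply(As[i], X)) == b[i] for i in range(m)]
cp.Problem(objective, constraints).solve()
print("recovery error:", np.linalg.norm(X.value - X0) / np.linalg.norm(X0))
```

Varying lam trades off the two penalties; the results below state that, order-wise, no choice of lam can beat the better of the two individual penalties.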

TABLE I: SUMMARY OF RESULTS IN RECOVERY OF STRUCTURED SIGNALS. THIS PAPER SHOWS A GAP BETWEEN THE PERFORMANCE OF CONVEX AND NONCONVEX RECOVERY PROGRAMS FOR SIMULTANEOUSLY STRUCTURED MATRICES (LAST ROW).

We introduce a framework to express general simultaneously structured models, and as our main result, we prove that the same phenomenon happens for a general set of structures. We analyze a wide range of measurement ensembles, including the subsampled standard basis (i.e., matrix completion), Gaussian and subgaussian measurements, and quadratic measurements.

Table I summarizes known results on recovery of some common structured models, along with a result of this paper specialized to the problem of low-rank and sparse matrix recovery. The first column gives the number of parameters needed to describe the model (often referred to as its degrees of freedom), while the second and third columns show how many generic measurements are needed for successful recovery. In nonconvex recovery, we assume we are able to find the global minimum of a nonconvex problem. This is clearly intractable in general, and not a practical recovery method; we consider it as a benchmark for theoretical comparison with the (tractable) convex relaxation in order to determine how powerful the relaxation is.

The first and second rows are the results on k-sparse vectors in Rⁿ and rank-r matrices in R^{n×n}, respectively [11], [12]. The third row considers the recovery of low-rank plus sparse matrices. Consider a matrix X ∈ R^{n×n} that can be decomposed as X = X_L + X_S, where X_L is a rank-r matrix and X_S is a matrix with only k nonzero entries. The degrees of freedom of X are O(rn + k). Minimizing the combination of the ℓ₁ norm and the nuclear norm, i.e., f(X) = min_Y ‖Y‖⋆ + λ‖X − Y‖₁, subject to random Gaussian measurements on X, gives a convex approach for recovering X. It has been shown that, under reasonable incoherence assumptions, X can be recovered from O((rn + k) log² n) measurements, which is suboptimal only by a logarithmic factor [13].

Finally, the last row of Table I shows one of the results in this paper. Let X ∈ R^{n×n} be a rank-r matrix whose entries are zero outside a k₁×k₂ submatrix. The degrees of freedom of X are O((k₁ + k₂)r). We consider both convex and nonconvex programs for the recovery of this type of matrix. The nonconvex method involves minimizing the number of nonzero rows, the number of nonzero columns, and the rank of the matrix jointly, as discussed in Section III-B. As shown later, O((k₁ + k₂)r log n) measurements suffice for this program to successfully recover the original matrix. The convex method minimizes any convex combination of the individual structure-inducing norms, namely the nuclear norm and the ℓ₁,₂ norm of the matrix, which encourage low-rank and column/row-sparse solutions, respectively. We show that with high probability this program cannot recover the original matrix with fewer than Ω(rn) measurements. In summary, while the nonconvex method is only slightly suboptimal, the convex method performs poorly as the number of measurements scales with n rather than k₁ + k₂.
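The decomposition penalty f(X) = min_Y ‖Y‖⋆ + λ‖X − Y‖₁ from the third row of Table I can also be sketched directly. The following is a minimal illustration under assumed sizes and weight; it is not the authors' experiment, just the convex program the row refers to:

```python
# Sketch of low-rank-plus-sparse recovery from Gaussian measurements.
import numpy as np
import cvxpy as cp

rng = np.random.default_rng(1)
n, k, m, lam = 15, 10, 120, 0.2
L0 = np.outer(rng.standard_normal(n), rng.standard_normal(n))   # rank-1 part
S0 = np.zeros((n, n)); S0.flat[rng.choice(n * n, k, replace=False)] = 5.0
X0 = L0 + S0

As = rng.standard_normal((m, n, n))
b = np.einsum('mij,ij->m', As, X0)

Y = cp.Variable((n, n))                 # low-rank component
Z = cp.Variable((n, n))                 # sparse component
constraints = [cp.sum(cp.multiply(As[i], Y + Z)) == b[i] for i in range(m)]
cp.Problem(cp.Minimize(cp.normNuc(Y) + lam * cp.sum(cp.abs(Z))),
           constraints).solve()
print("error:", np.linalg.norm(Y.value + Z.value - X0) / np.linalg.norm(X0))
```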
A. Contributions

This paper presents an analysis for the recovery of models with more than one structure, by combining penalties corresponding to each structure. The setting considered is very broad: any number of structures can be combined, any (convex) combination of the corresponding norms can be analyzed, and the results hold for a variety of popular measurement ensembles. The proposed framework includes special cases that are of interest in their own right, e.g., sparse and low-rank matrix recovery, and low-rank tensor completion [14], [15]. More specifically, our contributions can be summarized as follows.

1) Poor Performance of Convex Relaxations: We consider a model with several structures and associated structure-inducing norms. For recovery, we consider a multi-objective optimization problem to minimize the individual norms simultaneously. Given the convexity of the problem, we know that minimizing a weighted sum of the norms and varying the weights traces out all points of the Pareto-optimal front (Section II). We obtain a lower bound on the number of measurements for any convex function combining the individual norms. A sketch of our main result is as follows. Given a model x₀ with τ simultaneous structures, the number of measurements required for recovery with high probability using any linear combination of the individual norms satisfies the lower bound

m ≥ c · min_{i=1,…,τ} m_i,

where m_i is an intrinsic lower bound on the required number of measurements when minimizing the ith norm only. The term c depends on the measurement ensemble. For the norms of interest, m_i is proportional to the degrees of freedom of the ith structure. With min_i m_i as the bottleneck, this result indicates that the combination of norms performs no better than using only one of the norms, even though the target model has small degrees of freedom.

2) Different Measurement Ensembles: Our characterization of recovery failure is easy to interpret and deterministic in nature. We show that it can be used to obtain probabilistic failure results for various random measurement ensembles. In particular, our results hold for measurement matrices with i.i.d. subgaussian rows, quadratic measurements, and matrix completion type measurements.

3) Understanding the Effect of Weighting: We characterize the sample complexity of the multi-objective function as a function of the weights associated with the individual norms. Our upper and lower bounds reveal that the sample complexity of the multi-objective function is related to a certain convex combination of the sample complexities associated with the individual norms. We provide formulas for this combination as a function of the weights.

4) Incorporating General Cone Constraints: In addition, we can incorporate side information on x₀, expressed as convex cone constraints. This additional information helps in recovery; however, quantifying how much the cone constraints help is not trivial. Our analysis explicitly determines the role of the cone constraint: geometric properties of the cone, such as its Gaussian width, determine the constant factors in the bound on the number of measurements.

5) Illustrating a Gap for the Recovery of Sparse and Low-Rank Matrices: As a special case, we consider the recovery of simultaneously sparse and low-rank matrices and prove that there is a significant gap between the performance of convex and nonconvex recovery programs. This gap is surprising when one considers similar results on low-dimensional model recovery, as shown in Table I.

B. Applications

We survey several applications where simultaneous structures arise, as well as existing results specific to these applications.

1) Sparse Signal Recovery From Quadratic Measurements: Sparsity has long been exploited in signal processing, applied mathematics, statistics, and computer science for tasks such as compression, denoising, model selection, image processing, and more. Despite the great interest in exploiting sparsity in various applications, most of the work to date has focused on recovering sparse or low-rank data from linear measurements. Recently, the basic sparse recovery problem has been generalized to the case in which the measurements are given by nonlinear transforms of the unknown input [16]. A special case of this more general setting is quadratic compressed sensing [17], in which the goal is to recover a sparse vector x from quadratic measurements b_i = xᵀA_i x. This problem can be linearized by lifting, where we wish to recover a low-rank and sparse matrix X = xxᵀ subject to measurements b_i = ⟨A_i, X⟩.

Sparse recovery problems from quadratic measurements arise in a variety of problems in optics. One example is sub-wavelength optical imaging [17], [18], in which the goal is to recover a sparse image from its far-field measurements, where due to the laws of physics the relationship between the (clean) measurement and the unknown image is quadratic. In [17] the quadratic relationship is a result of using partially incoherent light. The quadratic behavior of the measurements in [18] arises from coherent diffractive imaging, in which the image is recovered from its intensity pattern. Under an appropriate experimental setup, this problem amounts to reconstruction of a sparse signal from the magnitude of its Fourier transform.
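The lifting step above is a one-line identity that is easy to verify numerically. The following check uses assumed sizes and a hypothetical sparse test vector:

```python
# Numeric check that lifting linearizes a quadratic measurement:
# b = x^T A x equals <A, X> with X = x x^T.
import numpy as np

rng = np.random.default_rng(2)
n = 8
x = np.zeros(n); x[[1, 4]] = [1.0, -2.0]    # a sparse test vector
A = rng.standard_normal((n, n))

b_quadratic = x @ A @ x                      # nonlinear in x
X = np.outer(x, x)                           # lifted variable: rank one and sparse
b_linear = np.sum(A * X)                     # <A, X>, linear in X

print(np.isclose(b_quadratic, b_linear))     # True
```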
A related and notable problem involving sparse and low-rank matrices is Sparse Principal Component Analysis (SPCA), mentioned in Section IX.

2) Sparse Phase Retrieval: Quadratic measurements appear in phase retrieval problems, in which a signal is to be recovered from the magnitude of its measurements b_i = |a_iᵀx|, where each measurement is a linear transform of the input x ∈ Rⁿ and the a_i's are arbitrary, possibly complex-valued measurement vectors. An important case is when a_iᵀx is the Fourier transform and b_i² is the power spectral density. Phase retrieval is of great interest in many applications such as optical imaging [19], [20], crystallography [21], and more [22]-[24]. The problem becomes linear when x is lifted and we consider the recovery of X = xxᵀ, where each measurement takes the form b_i² = ⟨a_i a_iᵀ, X⟩.

In [17], an algorithm was developed to treat phase retrieval problems with sparse x, based on a semidefinite relaxation, and low-rank matrix recovery combined with a row-sparsity constraint on the resulting matrix. More recent works also propose the use of semidefinite relaxation together with sparsity constraints for phase retrieval [25]-[28]. An alternative algorithm was recently designed in [29] based on a greedy search. In [27], the authors consider sparse signal recovery based on combinatorial and probabilistic approaches and provide uniqueness results under certain conditions. Stable uniqueness in phase retrieval problems is studied in [30]. The results of [31] and [32] apply to general (non-sparse) signals, where in some cases masked versions of the signal are required.

3) Fused Lasso: Suppose the signal of interest x₀ is sparse and its entries vary slowly, i.e., the signal can be approximated by a piecewise constant function. To encourage sparsity, one can use the ℓ₁ norm. To promote the piecewise constant structure, the discrete total variation can be used, defined as

‖x‖_TV = Σ_{i=1}^{n−1} |x_{i+1} − x_i|,

where ‖·‖_TV is basically the ℓ₁ norm of the gradient, which is approximately sparse. The resulting optimization problem that estimates x₀ from its samples Ax₀ is known as the fused lasso [33], and is given as

min_x ‖x‖₁ + λ‖x‖_TV subject to Ax = Ax₀. (1)

To the best of our knowledge, the sample complexity of the fused lasso has not been analyzed from a compressed sensing point of view. However, there is a series of recent works [34], [35] on total variation minimization, which may lead to an analysis of (1).
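Program (1) is straightforward to prototype. Below is a minimal sketch with assumed sizes, data, and weight lam; it simply instantiates (1) for a signal that is both sparse and piecewise constant:

```python
# Sketch of the fused lasso program (1): min ||x||_1 + lam * ||x||_TV, Ax = b.
import numpy as np
import cvxpy as cp

rng = np.random.default_rng(3)
n, m, lam = 100, 40, 1.0
x0 = np.zeros(n); x0[20:30] = 1.0            # sparse and piecewise constant
A = rng.standard_normal((m, n))
b = A @ x0

x = cp.Variable(n)
tv = cp.sum(cp.abs(cp.diff(x)))              # discrete total variation
cp.Problem(cp.Minimize(cp.norm1(x) + lam * tv), [A @ x == b]).solve()
print("relative error:", np.linalg.norm(x.value - x0) / np.linalg.norm(x0))
```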

We remark that TV regularization is also used together with the nuclear norm to encourage a low-rank and smooth (i.e., slowly varying entries) solution. This regularization finds applications in imaging and physics [36], [37].

4) Low-Rank Tensors: Tensors with small Tucker rank can be seen as a generalization of low-rank matrices [38]. In this setup, the signal of interest is a tensor X₀ ∈ R^{n₁×⋯×n_τ}, where X₀ is low-rank along its unfoldings, which are obtained by reshaping X₀ as a matrix of size n_i × (n̄/n_i), with n̄ = Π_{i=1}^τ n_i. Denoting the ith unfolding by U_i(X₀), a standard approach to estimate X₀ from y = A(X₀) is minimizing the weighted nuclear norms of the unfoldings,

min_X Σ_{i=1}^τ λ_i ‖U_i(X)‖⋆ subject to y = A(X). (2)

Low-rank tensors have applications in machine learning, physics, computational finance, and high-dimensional PDEs [15]. Problem (2) has been investigated in several papers [14], [39]. Closest to us, [40] recently showed that the convex relaxation (2) performs poorly compared to information-theoretically optimal bounds for Gaussian measurements. Our results extend those to the more applicable tensor completion setup, where we observe a subset of the entries of the tensor.

Other applications of simultaneously structured signals include collaborative hierarchical sparse modeling [41], where sparsity is considered within the nonzero blocks of a block-sparse vector, and the recovery of hyperspectral images, where we aim to recover a simultaneously block-sparse and low-rank matrix from compressed observations [42].

C. Outline of the Paper

The paper is structured as follows. Background and definitions are given in Section II. An overview of the main results is provided in Section III. Section IV discusses some measurement ensembles for which our results apply. Section V derives upper bounds for the convex relaxations assuming a Gaussian measurement ensemble. Proofs of the general results are presented in Section VI. The proofs for the special case of simultaneously sparse and low-rank matrices are given in Section VII, where we compare corollaries of the general results with the results on nonconvex recovery approaches, and illustrate a gap. Numerical simulations in Section VIII empirically support the results on sparse and low-rank matrices. Future directions of research and a discussion of the results are presented in Section IX.

II. PROBLEM SETUP

We begin by reviewing some basic definitions. Our results will be on structure-inducing norms; examples include the ℓ₁ norm, the ℓ₁,₂ norm, and the nuclear norm.

A. Definitions and Signal Model

The nuclear norm of a matrix is denoted by ‖·‖⋆ and is the sum of the singular values of the matrix. The ℓ₁,₂ norm is the sum of the ℓ₂ norms of the columns of a matrix.

Fig. 1. Depiction of the correlation between a vector x and a set S. s* achieves the largest angle with x, hence s* has the minimum correlation with x.
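As a quick illustration of the two matrix norms just defined, the following computes both for a small assumed example matrix:

```python
# Nuclear norm (sum of singular values) and l_{1,2} norm (sum of column l2 norms).
import numpy as np

X = np.array([[1.0, 2.0, 0.0],
              [0.0, 1.0, 0.0],
              [3.0, 0.0, 0.0]])

nuclear = np.linalg.svd(X, compute_uv=False).sum()
l12 = np.linalg.norm(X, axis=0).sum()        # column l2 norms, then sum
print(nuclear, l12)
```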
Minimizing the ℓ₁ norm encourages sparse solutions, while the ℓ₁,₂ norm and the nuclear norm encourage column-sparse and low-rank solutions, respectively [5], [6], [43]-[45]; see Section VI-D for a more detailed discussion of these norms and their subdifferentials. The Euclidean norm is denoted by ‖·‖₂, i.e., the ℓ₂ norm for vectors and the Frobenius norm ‖·‖_F for matrices. Overlines denote normalization, i.e., for a vector x and a matrix X, x̄ = x/‖x‖₂ and X̄ = X/‖X‖_F. The minimum and maximum singular values of a matrix A are denoted by σ_min(A) and σ_max(A). The sets of n×n positive semidefinite (PSD) and symmetric matrices are denoted by S₊ⁿ and Sⁿ, respectively. cone(S) denotes the conic hull of a given set S.

A(·): Rⁿ → Rᵐ is a linear measurement operator if A(x) is equivalent to the matrix multiplication Ax, where A ∈ R^{m×n}. If x is a matrix, A(x) will be a matrix multiplication with a suitably vectorized x. In some of our results, we consider Gaussian measurements, in which case A has independent N(0, 1) entries.

For a vector x ∈ Rⁿ, ‖x‖ denotes a general norm and ‖x‖* = sup_{‖z‖≤1} ⟨x, z⟩ is the corresponding dual norm. A subgradient of the norm at x is a vector g for which ‖z‖ ≥ ‖x‖ + ⟨g, z − x⟩ holds for any z. The set of all subgradients is called the subdifferential and is denoted by ∂‖x‖. The Lipschitz constant of the norm is defined as

L = sup_{z₁ ≠ z₂ ∈ Rⁿ} (‖z₁‖ − ‖z₂‖) / ‖z₁ − z₂‖₂.

Definition 1 (Correlation): Given a nonzero vector x and a set S, ρ(x, S) is defined as

ρ(x, S) := inf_{0 ≠ s ∈ S} |xᵀs| / (‖x‖₂‖s‖₂).

ρ(x, S) corresponds to the minimum absolute-value correlation between the vector x and the elements of S. Let x̄ = x/‖x‖₂. The correlation between x and the associated subdifferential has a simple form:

ρ(x, ∂‖x‖) = inf_{g ∈ ∂‖x‖} (x̄ᵀg)/‖g‖₂ = ‖x‖ / (‖x‖₂ sup_{g ∈ ∂‖x‖} ‖g‖₂).

Here, we used the fact that, for norms, subgradients g ∈ ∂‖x‖ satisfy xᵀg = ‖x‖ [46]. The denominator of the right-hand side is the local Lipschitz constant of ‖·‖ at x and is upper bounded by L. Consequently, ρ(x, ∂‖x‖) ≥ ‖x̄‖/L. We will denote ‖x̄‖/L by κ. Recently, this quantity has been studied by Mu et al. to analyze simultaneously structured signals in a similar spirit to us for Gaussian measurements [40].¹

¹The work [40] was submitted after our initial manuscript, in which we projected the subdifferential onto a carefully chosen subspace to obtain bounds on the sample complexity (see Proposition 6). Inspired by [40], projection onto x̄₀ and the use of κ led to the simplification of the notation and improvement of the results in the current manuscript, in particular, Section IV.
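The quantities ρ and κ are concrete enough to compute directly. The following sketch, under assumed sizes, does this for the ℓ₁ norm; for a k-sparse vector with ±1 nonzero entries the bound ρ(x, ∂‖x‖₁) ≥ κ is tight and κ² = k/n:

```python
# kappa and rho(x, subdifferential) for the l1 norm on a k-sparse +/-1 vector.
import numpy as np

n, k = 100, 10
x = np.zeros(n); x[:k] = 1.0
x_bar = x / np.linalg.norm(x)

L = np.sqrt(n)                               # Lipschitz constant of l1 in R^n
kappa = np.linalg.norm(x_bar, 1) / L
print(kappa**2, k / n)                       # both equal 0.1

# Subgradients are g = sign(x) on the support with |g_i| <= 1 elsewhere;
# ||g||_2 is maximized by +/-1 off-support entries, giving ||g||_2 = sqrt(n).
g_max = np.ones(n)
rho = np.linalg.norm(x, 1) / (np.linalg.norm(x) * np.linalg.norm(g_max))
print(rho, kappa)                            # the lower bound is tight here
```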

Fig. 2. For a scaled norm ball passing through x₀, κ = ‖p‖₂/‖x₀‖₂, where p is any of the closest points on the scaled norm ball to the origin.

Similar calculations as above give an alternative interpretation for κ, which is illustrated in Figure 2. κ is a measure of alignment between the vector x and the subdifferential. For the norms of interest, it is associated with the model complexity. For instance, for a k-sparse vector x, ‖x̄‖₁ lies between 1 and √k, depending on how spiky the nonzero entries are. Also, the Lipschitz constant for the ℓ₁ norm in Rⁿ is L = √n. Hence, when the nonzero entries are ±1, we find κ² = k/n. Similarly, given a d×d rank-r matrix X, ‖X̄‖⋆ lies between 1 and √r. If the singular values are spread (i.e., of comparable magnitude), we find κ² = r/d. In these cases, κ² is proportional to the model complexity normalized by the ambient dimension.

Simultaneously Structured Models: We consider a signal x₀ which has several low-dimensional structures S₁, S₂, …, S_τ (e.g., sparsity, group sparsity, low rank). Suppose each structure i corresponds to a norm denoted by ‖·‖_(i) which promotes that structure (e.g., ℓ₁, ℓ₁,₂, nuclear norm). We refer to such an x₀ as a simultaneously structured model.

B. Convex Recovery Program

We investigate the recovery of the simultaneously structured x₀ from its linear measurements A(x₀). To recover x₀, we would like to simultaneously minimize the norms ‖·‖_(i), i = 1, …, τ, which leads to a multi-objective (vector-valued) optimization problem. For all feasible points x satisfying A(x) = A(x₀) and the side information x ∈ C, consider the set of achievable norms {‖x‖_(i)}_{i=1}^τ, denoted as points in R^τ. The minimal points of this set with respect to the positive orthant R₊^τ form the Pareto-optimal front, as illustrated in Figure 3. Since the problem is convex, one can alternatively consider the set

{v ∈ R^τ : ∃ x ∈ Rⁿ s.t. x ∈ C, A(x) = A(x₀), v_i ≥ ‖x‖_(i) for i = 1, …, τ},

which is convex and has the same Pareto-optimal points as the original set (see [47, Ch. 4]).

Fig. 3. Consider a point x₀, represented in the figure by the dot. We need at least m measurements for x₀ to be recoverable, since for any m′ < m this point is not on the Pareto-optimal front.

Definition 2 (Recoverability): We call x₀ recoverable if it is a Pareto-optimal point; i.e., there does not exist a feasible x′ ≠ x₀ satisfying A(x′) = A(x₀) and x′ ∈ C, with ‖x′‖_(i) ≤ ‖x₀‖_(i) for i = 1, …, τ.

The vector-valued convex recovery program can be turned into a scalar optimization problem as

minimize_{x ∈ C} f(x) = h(‖x‖_(1), …, ‖x‖_(τ)) subject to A(x) = A(x₀), (3)

where h : R₊^τ → R₊ is convex and non-decreasing in each argument (i.e., non-decreasing, and strictly increasing in at least one coordinate).

For convex problems with strong duality, it is known that we can recover all of the Pareto-optimal points by optimizing weighted sums f(x) = Σ_{i=1}^τ λ_i‖x‖_(i), with positive weights λ_i, among all possible functions f(x) = h(‖x‖_(1), …, ‖x‖_(τ)). For each x₀ on the Pareto front, the coefficients of such a recovering function are given by the hyperplane supporting the Pareto front at x₀ [47, Ch. 4]. In Figure 3, consider the smallest m that makes x₀ recoverable.
Then one can choose a function h and recover x₀ by (3) using the m measurements. If the number of measurements is any smaller, then no function can recover x₀. Our goal is to provide lower bounds on m.

In [9], Chandrasekaran et al. propose a general theory for constructing a suitable penalty, called an atomic norm, given a single set of atoms that describes the structure of the target object. In the case of simultaneous structures, this construction requires defining new atoms, and then ensuring that the resulting atomic norm can be minimized in a computationally tractable way, which is nontrivial and often intractable. We briefly discuss such constructions as a future research direction in Section IX.
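The weighted-sum scalarization of (3) can be explored numerically: sweeping the weight in f(x) = ‖x‖₁ + λ‖x‖⋆ and solving (3) for each λ yields Pareto-optimal pairs (‖x‖₁, ‖x‖⋆) over the feasible set. A minimal sketch, with assumed sizes and an assumed planted model:

```python
# Tracing points of the Pareto-optimal front by sweeping the weight lam.
import numpy as np
import cvxpy as cp

rng = np.random.default_rng(4)
n, m = 10, 30
a = np.zeros(n); a[:3] = 1.0
X0 = np.outer(a, a)
As = rng.standard_normal((m, n, n))
b = np.einsum('mij,ij->m', As, X0)

X = cp.Variable((n, n))
constraints = [cp.sum(cp.multiply(As[i], X)) == b[i] for i in range(m)]
for lam in [0.1, 1.0, 10.0]:
    cp.Problem(cp.Minimize(cp.sum(cp.abs(X)) + lam * cp.normNuc(X)),
               constraints).solve()
    l1 = np.abs(X.value).sum()
    nuc = np.linalg.svd(X.value, compute_uv=False).sum()
    print(f"lam={lam}: (l1, nuclear) = ({l1:.3f}, {nuc:.3f})")
```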

III. MAIN RESULTS

In this section, we state our main theorems that aim to characterize the number of measurements needed to recover a simultaneously structured signal by convex or nonconvex programs. We first present our general results, followed by results for simultaneously sparse and low-rank matrices as a specific but important instance of the general case. The proofs are given in Sections VI and VII. All of our statements will implicitly assume x₀ ≠ 0. This ensures that x₀ is not a trivial minimizer and that 0 is not in the subdifferentials.

A. General Simultaneously Structured Signals

Consider the recovery of a signal x₀ that is simultaneously structured with S₁, S₂, …, S_τ, as described in Section II-A. We provide a lower bound on the required number of measurements, using the geometric properties of the individual norms.

Theorem 1 (Deterministic Failure): Suppose C = Rⁿ and

ρ(x₀, ∂f(x₀)) := inf_{g ∈ ∂f(x₀)} ḡᵀx̄₀ > ‖Ax̄₀‖₂ / σ_min(Aᵀ). (4)

Then, x₀ is not a minimizer of (3).

Theorem 1 is deterministic in nature. However, it can easily be specialized to specific random measurement ensembles. The left-hand side of (4) depends only on the vector x₀ and the subdifferential ∂f(x₀); hence it is independent of the measurement matrix A. For simultaneously structured models, we will argue that the left-hand side cannot be made too small, as the subgradients are aligned with the signal. On the other hand, the right-hand side depends only on A and x₀ and is independent of the subdifferential. In linear inverse problems, A is often assumed to be random. For a large class of random matrices, we will argue that the right-hand side is approximately √(m/n), which yields a lower bound on the number of required measurements. Typical measurement ensembles include the following.

Sampling entries: In low-rank matrix and tensor completion problems, we observe the entries of x₀ uniformly at random. In this case, the rows of A are chosen from the standard basis in Rⁿ. We remark that, instead of the standard basis, one can consider other orthonormal bases such as the Fourier basis.

Matrices with i.i.d. rows: A has independent and identically distributed rows satisfying certain moment conditions. This is a widely used setup in compressed sensing, as each measurement we make is associated with the corresponding row of A [48].

Quadratic measurements: These arise in the phase retrieval problem, as discussed in Section I-B.

In Section IV, we find upper bounds on the right-hand side of (4) for these ensembles. As discussed in Section IV, we can modify the rows of A to get better bounds as long as this does not affect its null space. For instance, one can discard identical rows to improve conditioning. However, as m increases and A has more linearly independent rows, σ_min(Aᵀ) will naturally decrease, and (4) will no longer hold after a certain point. In particular, (4) cannot hold beyond m ≥ n, as σ_min(Aᵀ) = 0. This is indeed natural, as the system becomes overdetermined.

The following proposition lower bounds the left-hand side of (4) in an interpretable manner. In particular, the correlation ρ(x₀, ∂f(x₀)) can be lower bounded by the smallest individual correlation.

Proposition 1: Let L_i be the Lipschitz constant of the ith norm and κ_i = ‖x̄₀‖_(i)/L_i for 1 ≤ i ≤ τ. Set κ_min = min{κ_i : i = 1, …, τ}. Then:

All functions f(·) in (3) satisfy ρ(x₀, ∂f(x₀)) ≥ κ_min.

Suppose f(·) is a weighted linear combination f(x) = Σ_{i=1}^τ λ_i‖x‖_(i) for nonnegative {λ_i}_{i=1}^τ. Let λ̄_i = λ_iL_i / (Σ_{j=1}^τ λ_jL_j) for 1 ≤ i ≤ τ. Then, ρ(x₀, ∂f(x₀)) ≥ Σ_{i=1}^τ λ̄_iκ_i.

Proof: From Lemma 3, any subgradient of f(·) can be written as g = Σ_{i=1}^τ w_ig_i for some nonnegative w_i's, where g_i ∈ ∂‖x₀‖_(i). On the other hand, from [46], ⟨x̄₀, g_i⟩ = ‖x̄₀‖_(i). Combining these results,

gᵀx̄₀ = Σ_{i=1}^τ w_i‖x̄₀‖_(i).

From the triangle inequality, ‖g‖₂ ≤ Σ_{i=1}^τ w_iL_i. Therefore,

gᵀx̄₀ / ‖g‖₂ ≥ (Σ_{i=1}^τ w_i‖x̄₀‖_(i)) / (Σ_{i=1}^τ w_iL_i) ≥ min_{1≤i≤τ} ‖x̄₀‖_(i)/L_i = κ_min. (5)
To prove the second part, we use the fact that for weighted sums of norms, w_i = λ_i, and the subgradients have the form g = Σ_{i=1}^τ λ_ig_i [47]. Then, substituting λ̄_i in the left-hand side of (5) gives the result. ∎

Before stating the next result, let us give a relevant definition regarding the average distance between a set and a random vector.

Definition 3 (Gaussian Distance): Let M be a closed convex set in Rⁿ and let h ∈ Rⁿ be a vector with independent standard normal entries. Then, the Gaussian distance of M is defined as

D(M) = E[ inf_{v ∈ M} ‖h − v‖₂ ].

When M is a cone, we have 0 ≤ D(M) ≤ √n. Similar definitions have been used extensively in the literature, such as the Gaussian width [9], the statistical dimension [49], and the mean width [50]. For notational simplicity, let the normalized distance be D̄(M) = D(M)/√n.

We will now state our result for Gaussian measurements, which can additionally include cone constraints in the lower bound. Other ensembles are considered in Section IV.

Theorem 2 (Gaussian Lower Bound): Suppose A has independent N(0, 1) entries. Whenever m ≤ m_low, x₀ will not be a minimizer of any of the recovery programs in (3) with probability at least 1 − 10 exp(−(1/16) min{m_low, (1 − D̄(C))²n}), where

m_low = (1 − D̄(C))² n κ_min² / 100.

Remark: When C = Rⁿ, D̄(C) = 0; hence the lower bound simplifies to m_low = nκ_min²/100.² Note that D̄(C) depends only on C and can be viewed as a constant. For instance, for the positive semidefinite cone, we show that D̄(S₊ⁿ) is upper bounded by an absolute constant strictly less than 1. Observe that for a smaller cone C, it is reasonable to expect a smaller lower bound on the required number of measurements; indeed, as C gets smaller, D̄(C) increases.

²The lower bound κ_min² is directly comparable to [40, Th. 5]. Indeed, our lower bounds on the sample complexity will have the form O(κ_min² n).
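The claim that the right-hand side of (4) behaves like √(m/n) for Gaussian matrices is easy to check by simulation. A small Monte Carlo sketch, with assumed sizes chosen so that m ≪ n:

```python
# For Gaussian A in R^{m x n}, ||A x_bar||_2 / sigma_min(A^T) is close to
# sqrt(m/n) when m << n, which is how Theorem 1 yields sample-complexity bounds.
import numpy as np

rng = np.random.default_rng(5)
n, m, trials = 2500, 25, 20
x_bar = np.zeros(n); x_bar[0] = 1.0      # any unit vector, by rotation invariance

ratios = []
for _ in range(trials):
    A = rng.standard_normal((m, n))
    sigma_min = np.linalg.svd(A.T, compute_uv=False).min()
    ratios.append(np.linalg.norm(A @ x_bar) / sigma_min)

print(np.mean(ratios), np.sqrt(m / n))   # close for m << n
```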

TABLE II: SUMMARY OF THE PARAMETERS DISCUSSED IN THIS SECTION. THE LAST THREE LINES ARE FOR AN S&L (k, k, r) MATRIX, WHERE n = d². IN THE FOURTH COLUMN, THE CORRESPONDING ENTRY FOR S&L IS κ_min² = min{κ_ℓ₁², κ_⋆²}.

As discussed before, there are various options for the scalarizing function in (3), one choice being the weighted sum of norms. In fact, for a recoverable point x₀ there always exists a weighted sum of norms which recovers it. This function is also often the choice in applications, where the space of positive weights is searched for a good combination. Thus, we can state the following corollary as a general result.

Corollary 1 (Weighted Lower Bound): Suppose A has i.i.d. N(0, 1) entries and f(x) = Σ_{i=1}^τ λ_i‖x‖_(i) for nonnegative weights {λ_i}_{i=1}^τ. Whenever m ≤ m_low, x₀ will not be a minimizer of the recovery program (3) with probability at least 1 − 10 exp(−(1/16) min{m_low, (1 − D̄(C))²n}), where

m_low = n(1 − D̄(C))² (Σ_{i=1}^τ λ̄_iκ_i)² / 100, and λ̄_i = λ_iL_i / (Σ_{j=1}^τ λ_jL_j).

Observe that Theorem 2 is stronger than stating that a particular function h(‖x‖_(1), …, ‖x‖_(τ)) will not work. Instead, our result states that, with high probability, none of the programs in the class (3) can return x₀ as optimal unless the number of measurements is sufficiently large.

To understand the result better, note that the required number of measurements is proportional to κ_min²n, which is often proportional to the sample complexity of the best individual norm. As we argued in Section II-A, κ_i²n corresponds to how structured the signal is. For sparse signals it is equal to the sparsity, and for a rank-r matrix, it is equal to the degrees of freedom of the set of rank-r matrices. Consequently, Theorem 2 suggests that even if the signal satisfies multiple structures, the required number of measurements is effectively determined by a single dominant structure. Intuitively, the degrees of freedom of a simultaneously structured signal can be much lower, which is provable for simultaneously sparse and low-rank (S&L) matrices. Hence, there is a considerable gap between the expected number of measurements based on model complexity and the number of measurements needed for recovery via (3), namely κ_min²n.

B. Simultaneously Sparse and Low-Rank Matrices

We now focus on a special case, namely simultaneously sparse and low-rank (S&L) matrices. We consider matrices whose nonzero entries are contained in a small submatrix, where the submatrix itself is low rank. Here, the norms of interest are ‖·‖₁,₂, ‖·‖₁ and ‖·‖⋆, and the cone of interest is the PSD cone. We also consider nonconvex approaches and contrast the results with convex approaches. For the nonconvex problem, we replace the norms ‖·‖₁, ‖·‖₁,₂, ‖·‖⋆ with the functions ‖·‖₀, ‖·‖₀,₂, rank(·), which give the number of nonzero entries, the number of nonzero columns, and the rank of a matrix, respectively, and use the same cone constraint as in the convex method. We show that the convex methods perform poorly, as predicted by the general result in Theorem 2, while the nonconvex methods require an optimal number of measurements (up to a logarithmic factor). Proofs are given in Section VII.

Definition 4: We say X₀ ∈ R^{d₁×d₂} is an S&L matrix with (k₁, k₂, r) if the smallest submatrix that contains the nonzero entries of X₀ has size k₁×k₂ and rank(X₀) = r. When X₀ is symmetric, let d = d₁ = d₂ and k = k₁ = k₂. We consider the following cases:

(a) General: X₀ ∈ R^{d₁×d₂} is S&L with (k₁, k₂, r).
(b) PSD model: X₀ ∈ R^{d×d} is PSD and S&L with (k, k, r).

We are interested in S&L matrices with k₁ ≪ d₁ and k₂ ≪ d₂, so that the matrix is sparse, and r ≪ min{k₁, k₂}, so that the submatrix containing the nonzero entries is low rank.
Recall from Section II-B that our goal is to recover X₀ from linear observations A(X₀) via convex or nonconvex optimization programs. The measurements can be equivalently written as A vec(X₀), where A ∈ R^{m×d₁d₂} and vec(X₀) ∈ R^{d₁d₂} denotes the vector obtained by stacking the columns of X₀.

Based on the results in Section III-A, we obtain lower bounds on the number of measurements for convex recovery. We additionally show that significantly fewer measurements are sufficient for nonconvex programs to uniquely recover X₀, thus proving a performance gap between convex and nonconvex approaches. The following theorem summarizes the results.

Theorem 3 (Performance of S&L Matrix Recovery): Suppose A(·) is an i.i.d. Gaussian map and consider recovering X₀ ∈ R^{d₁×d₂} via

minimize_{X ∈ C} f(X) subject to A(X) = A(X₀). (6)

For the cases given in Definition 4, the following convex and nonconvex recovery results hold for some positive constants c₁, c₂.

(a) General model:
(a1) Let f(X) = ‖X‖₁,₂ + λ₁‖Xᵀ‖₁,₂ + λ₂‖X‖⋆, where λ₁, λ₂ ≥ 0 and C = R^{d₁×d₂}. Then, (6) will fail to recover X₀ with probability 1 − exp(−c₁m₀) whenever m ≤ c₂m₀, where m₀ = min{d₁k₂, d₂k₁, (d₁ + d₂)r}.
(a2) Let f(X) = (1/k₂)‖X‖₀,₂ + (1/k₁)‖Xᵀ‖₀,₂ + (1/r) rank(X) and C = R^{d₁×d₂}. Then, (6) will uniquely recover X₀ with probability 1 − exp(−c₁m) whenever m ≥ c₂ max{(k₁ + k₂)r, k₁ log(d₁/k₁), k₂ log(d₂/k₂)}.

TABLE III: SUMMARY OF RECOVERY RESULTS FOR THE MODELS IN DEFINITION 4, ASSUMING d₁ = d₂ = d AND k₁ = k₂ = k. FOR THE PSD WITH ℓ₁ CASE, WE ASSUME ‖X̄₀‖₁ ≈ k AND ‖X̄₀‖⋆ ≈ √r FOR THE SAKE OF SIMPLICITY. NONCONVEX APPROACHES ARE OPTIMAL UP TO A LOGARITHMIC FACTOR, WHILE CONVEX APPROACHES PERFORM POORLY.

(b) PSD with ℓ₁,₂:
(b1) Let f(X) = ‖X‖₁,₂ + λ‖X‖⋆, where λ ≥ 0 and C = S₊^d. Then, (6) will fail to recover X₀ with probability 1 − exp(−c₁dr) whenever m ≤ c₂dr.
(b2) Let f(X) = (1/k)‖X‖₀,₂ + (1/r) rank(X) and C = S^d. Then, (6) will uniquely recover X₀ with probability 1 − exp(−c₁m) whenever m ≥ c₂ max{rk, k log(d/k)}.

(c) PSD with ℓ₁:
(c1) Let f(X) = ‖X‖₁ + λ‖X‖⋆ and C = S₊^d. Then, (6) will fail to recover X₀ with probability 1 − exp(−c₁m₀) for all possible λ ≥ 0 whenever m ≤ c₂m₀, where m₀ = min{‖X̄₀‖₁², d‖X̄₀‖⋆²}.
(c2) Suppose rank(X₀) = 1. Let f(X) = (1/k²)‖X‖₀ + rank(X) and C = S^d. Then, (6) will uniquely recover X₀ with probability 1 − exp(−c₁m) whenever m ≥ c₂k log(d/k).

Remark on PSD with ℓ₁: In the special case X₀ = aaᵀ for a k-sparse vector a, we have m₀ = min{‖ā‖₁⁴, d}. When the nonzero entries of a are ±1, we have m₀ = min{k², d}.

The nonconvex programs require almost the same number of measurements as the degrees of freedom (the number of parameters) of the underlying model. For instance, it is known that the degrees of freedom of a rank-r matrix of size k₁×k₂ is simply r(k₁ + k₂ − r), which is O((k₁ + k₂)r). Hence, the nonconvex results are optimal up to a logarithmic factor. On the other hand, our results on the convex programs, which follow from Theorem 2, show that the required number of measurements is significantly larger. Table III provides a quick comparison of the results on S&L matrices.

For the S&L (k, k, r) model, from standard results one can easily deduce that [3], [5], [43]:

ℓ₁ penalty only: requires at least k² measurements;
ℓ₁,₂ penalty only: requires at least kd measurements;
nuclear norm penalty only: requires at least rd measurements.

These follow from the model complexity of sparse, column-sparse, and low-rank matrices. Theorem 2 shows that a combination of norms requires at least as many measurements as the best individual norm. For instance, a combination of the ℓ₁ and nuclear norm penalizations yields the lower bound O(min{k², rd}) for S&L matrices whose singular values and nonzero entries are spread. This is indeed what we would expect from the interpretation that κ²n is often proportional to the sample complexity of the corresponding norm, so that the lower bound κ_min²n is proportional to that of the best individual norm.

As we saw in Section III-A, adding a cone constraint to the recovery program does not help in reducing the lower bound by more than a constant factor. In particular, we discuss the positive semidefiniteness assumption that is beneficial in the sparse phase retrieval problem, and show that the number of measurements remains high even when we include this extra information. On the other hand, the nonconvex recovery programs perform well even without the PSD constraint.
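To make the quantities in part (c1) concrete, the following sketch, with assumed sizes, builds X₀ = aaᵀ for a k-sparse ±1 vector a and evaluates m₀ = min{‖X̄₀‖₁², d‖X̄₀‖⋆²}:

```python
# Evaluating the failure threshold m0 of Theorem 3(c1) for X0 = a a^T.
import numpy as np

d, k = 1000, 20
a = np.zeros(d); a[:k] = 1.0
X0 = np.outer(a, a)                              # S&L: k x k block, rank one, PSD

X0_bar = X0 / np.linalg.norm(X0)                 # Frobenius normalization
l1 = np.abs(X0_bar).sum()                        # ||X0_bar||_1 = k here
nuc = np.linalg.svd(X0_bar, compute_uv=False).sum()   # = 1 (rank one)

m0 = min(l1**2, d * nuc**2)
print(l1**2, d * nuc**2, m0)                     # k^2 = 400, d = 1000, m0 = 400
```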
We remark that we could have stated Theorem 3 for the more general measurements given in Section IV, without the cone constraint. For instance, the following result holds for a weighted linear combination of the individual norms and for the subgaussian ensemble.

Corollary 2: Suppose X₀ ∈ R^{d×d} obeys the general model with k₁ = k₂ = k, and A is a linear subgaussian map as described in Proposition 2. Choose f(X) = λ_ℓ₁‖X‖₁ + λ_⋆‖X‖⋆, where λ_ℓ₁ = β/d, λ_⋆ = (1 − β)/√d, and 0 ≤ β ≤ 1. Then, whenever m ≤ min{m_low, c₁n}, where

m_low = (β‖X̄₀‖₁ + (1 − β)√d‖X̄₀‖⋆)²,

(6) fails with probability 1 − 4 exp(−c₂m_low). Here c₁, c₂ > 0 are constants as described in Proposition 2.

Remark: Choosing X₀ = aaᵀ, where the nonzero entries of a are ±1, yields m_low = (βk + (1 − β)√d)². An explicit construction of an S&L matrix with maximal ‖X̄‖₁, ‖X̄‖⋆ is provided in Section VII-C.

This corollary compares well with the upper bound obtained in Corollary 5 of Section V. In particular, both the bounds and the penalty parameters match up to logarithmic factors. Hence, together, they sandwich the sample complexity of the combined cost f(X).

IV. MEASUREMENT ENSEMBLES

This section makes use of standard results on subgaussian random variables and random matrix theory to obtain probabilistic statements. We explain how one can analyze the right-hand side of (4) for matrices with subgaussian rows, for a subsampled standard basis (as in matrix completion), and for quadratic measurements arising in phase retrieval.

A. Sub-Gaussian Measurements

We first consider measurement maps with subgaussian entries. The following definitions are borrowed from [51].

Definition 5 (Sub-Gaussian Random Variable): A random variable x is subgaussian if there exists a constant K > 0 such that for all p ≥ 1, (E|x|^p)^{1/p} ≤ K√p. The smallest such K is called the subgaussian norm of x and is denoted by ‖x‖_Ψ₂. A sub-exponential random variable y is one for which there exists a constant K′ such that P(|y| > t) ≤ exp(1 − t/K′). The variable x is subgaussian if and only if x² is sub-exponential.

Definition 6 (Isotropic Sub-Gaussian Vector): A random vector x ∈ Rⁿ is subgaussian if the one-dimensional marginals xᵀv are subgaussian random variables for all v ∈ Rⁿ. The subgaussian norm of x is defined as

‖x‖_Ψ₂ = sup_{‖v‖₂=1} ‖xᵀv‖_Ψ₂.

The vector x is also isotropic if its covariance is equal to the identity, i.e., E[xxᵀ] = Iₙ.

Proposition 2 (Sub-Gaussian Measurements): Suppose A has i.i.d. rows in either of the following forms: (i) each row is a copy of a zero-mean isotropic subgaussian vector a ∈ Rⁿ, where ‖a‖₂ = √n almost surely; (ii) the rows consist of i.i.d. zero-mean, unit-variance subgaussian entries. Then, there exist constants c₁, c₂ depending only on the subgaussian norm of the rows such that, whenever m ≤ c₁n, with probability 1 − 4 exp(−c₂m), we have

‖Ax̄₀‖₂ / σ_min(Aᵀ) ≤ 2√(m/n).

Proof: Using [51, Th. 5.58], there exist constants c, C depending only on the subgaussian norm of a such that for any t ≥ 0, with probability 1 − 2 exp(−ct²),

σ_min(Aᵀ) ≥ √n − C√m − t.

Choosing t = C√m and m ≤ n/(100C²) ensures that σ_min(Aᵀ) ≥ 4√n/5. Next, we estimate ‖Ax̄₀‖₂², which is a sum of i.i.d. sub-exponential random variables identically distributed as |aᵀx̄₀|². Note that E[|aᵀx̄₀|²] = 1. Hence, [51, Proposition 5.16] gives

P(‖Ax̄₀‖₂² ≥ m + t) ≤ exp(−c′ min{t²/m, t}).

Choosing t = 7m/5, we find that P(‖Ax̄₀‖₂² ≥ 12m/5) ≤ exp(−c′m). Combining the two, we obtain

P( ‖Ax̄₀‖₂ / σ_min(Aᵀ) ≤ 2√(m/n) ) ≥ 1 − 4 exp(−c₂m).

The second statement can be proved in exactly the same manner by using [51, Th. 5.39] instead of Theorem 5.58. ∎

Remark: While Proposition 2 assumes that a has a fixed ℓ₂ norm, this can be ensured by properly normalizing the rows of A (assuming they remain subgaussian). For instance, if the ℓ₂ norms of the rows are larger than c√n for a positive constant c, normalization will not affect subgaussianity. Note that scaling the rows of a matrix does not change its null space.

B. Randomly Sampling Entries

We now consider the scenario where each row of A is chosen from the standard basis uniformly at random. Note that when m is comparable to n, there is a nonnegligible probability that A will have duplicate rows. Theorem 1 does not take this situation into account, which would make σ_min(Aᵀ) = 0. In this case, one can discard the copies, as they don't affect the recoverability of x₀. This gets rid of the ill-conditioning, since the new matrix is well-conditioned with exactly the same null space as the original. This corresponds to a sampling-without-replacement scheme in which we ensure each row is different.

Similar to achievability results in matrix completion [6], the following failure result requires the true signal to be incoherent with the standard basis, where incoherence is characterized by ‖x̄₀‖_∞, which lies between 1/√n and 1.

Proposition 3 (Sampling Entries): Let {e_i}_{i=1}^n be the standard basis in Rⁿ, and suppose each row of A is chosen from {e_i}_{i=1}^n uniformly at random. Let Â be the matrix obtained by removing the duplicate rows of A. Then, with probability 1 − exp(−m/(4n‖x̄₀‖_∞²)), we have

‖Âx̄₀‖₂ / σ_min(Âᵀ) ≤ √(2m/n).

Proof: Let Â be the matrix obtained by discarding the rows of A that occur multiple times, keeping one copy of each. Clearly Null(Â) = Null(A); hence the two matrices are equivalent for the purpose of recovering x₀. Furthermore, σ_min(Âᵀ) = 1. Therefore, we are interested in upper bounding ‖Âx̄₀‖₂. Clearly ‖Âx̄₀‖₂ ≤ ‖Ax̄₀‖₂; hence we will bound ‖Ax̄₀‖₂ probabilistically. Let a be the first row of A. |aᵀx̄₀|² is a random variable with mean 1/n, upper bounded by ‖x̄₀‖_∞². Applying the Chernoff bound yields

P( ‖Ax̄₀‖₂² ≥ (m/n)(1 + δ) ) ≤ exp(−δ²m / (2(1 + δ)n‖x̄₀‖_∞²)).
Setting δ = 1, we find that, with probability 1 − exp(−m/(4n‖x̄₀‖_∞²)), we have

‖Âx̄₀‖₂ / σ_min(Âᵀ) ≤ ‖Ax̄₀‖₂ / σ_min(Âᵀ) ≤ √(2m/n),

completing the proof. ∎

A significant application of this result is the low-rank tensor completion problem, where we randomly observe some entries of a low-rank tensor and try to reconstruct it. A promising approach for this problem uses a weighted linear combination of the nuclear norms of the unfoldings of the tensor to induce the low-rank tensor structure, as described in (2) [14], [15]. Related work [40] shows the poor performance of (2) for the special case of Gaussian measurements. The combination of Theorem 1 and Proposition 3 immediately extends the results of [40] to the more applicable tensor completion setup (under proper incoherence conditions that bound ‖x̄₀‖_∞).

Remark: In Propositions 2 and 3, we can make the upper bound on the ratio ‖Ax̄₀‖₂/σ_min(Aᵀ) arbitrarily close to √(m/n) by changing the proof parameters. Combined with Proposition 1, this suggests that failure happens when m < nκ_min².
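For the tensor completion application just mentioned, the unfoldings U_i used in (2) are simple reshapes. A sketch under assumed shapes, building a low-Tucker-rank tensor from a hypothetical small core and checking that every unfolding has rank r:

```python
# Mode-i unfoldings of a tensor, as used in program (2).
import numpy as np

def unfolding(X, i):
    """Mode-i unfolding: an n_i x (product of the other dimensions) matrix."""
    return np.moveaxis(X, i, 0).reshape(X.shape[i], -1)

rng = np.random.default_rng(6)
n1, n2, n3, r = 6, 7, 8, 2
core = rng.standard_normal((r, r, r))                 # small Tucker core
U = [rng.standard_normal((n, r)) for n in (n1, n2, n3)]
X = np.einsum('abc,ia,jb,kc->ijk', core, *U)          # low Tucker rank tensor

for i in range(3):
    M = unfolding(X, i)
    print(f"unfolding {i}: shape {M.shape}, rank {np.linalg.matrix_rank(M)}")
```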

C. Quadratic Measurements

As mentioned for the phase retrieval problem, quadratic measurements |vᵀa|² of a vector a ∈ R^d can be linearized by the change of variable a → X₀ = aaᵀ and using V = vvᵀ. The following proposition can be used to obtain a lower bound for such ensembles when combined with Theorem 1.

Proposition 4: Suppose we observe quadratic measurements A(X₀) ∈ R^m of a matrix X₀ = aaᵀ ∈ R^{d×d}. Here, assume that the ith entry of A(X₀) is equal to |v_iᵀa|², where {v_i}_{i=1}^m are independent vectors, either with N(0, 1) entries or uniformly distributed over the sphere of radius √d. Then, there exist absolute constants c₁, c₂ > 0 such that whenever m ≤ c₁d/log d, with probability 1 − 2e^{−2 log d},

‖A(X̄₀)‖₂ / σ_min(Aᵀ) ≤ c₂√m (log d)/d.

Proof: Let V_i = v_iv_iᵀ. Without loss of generality, assume the v_i's are uniformly distributed over the sphere of radius √d. To lower bound σ_min(Aᵀ), we estimate the coherence of its columns, defined by

μ(Aᵀ) = max_{i≠j} ⟨V_i, V_j⟩ / (‖V_i‖_F‖V_j‖_F) = max_{i≠j} (v_iᵀv_j)² / d².

[51, Sec. 5.2.5] states that the subgaussian norm of v_i is bounded by an absolute constant. Hence, conditioned on v_j, (v_iᵀv_j)²/d is a sub-exponential random variable with mean 1. Using Definition 5, there exists a constant c > 0 such that

P( (v_iᵀv_j)²/d > c log d ) ≤ e^{−4 log d}.

Union bounding over all i, j pairs ensures that with probability 1 − e^{−2 log d} we have μ(Aᵀ) ≤ c(log d)/d. Next, we use the standard result that for a matrix with columns of equal length ℓ, σ_min(Aᵀ)² ≥ ℓ²(1 − (m − 1)μ); the reader is referred to [52, Proposition 1]. Here the columns of Aᵀ have length ‖V_i‖_F = d. Hence, m ≤ c₁d/log d gives σ_min(Aᵀ) ≥ d/√2.

It remains to upper bound ‖A(X̄₀)‖₂. The ith entry of A(X̄₀) is equal to |v_iᵀā|², hence it is sub-exponential. Consequently, there exists a constant c′ such that each entry is upper bounded by c′ log d with probability 1 − e^{−3 log d}. Union bounding, and using m ≤ d, we find that ‖A(X̄₀)‖₂ ≤ c′√m log d with probability 1 − e^{−2 log d}. Combining with the estimate of σ_min(Aᵀ) completes the proof. ∎

Comparison to Existing Literature: Proposition 4 is useful for estimating the performance of the sparse phase retrieval problem, in which a is a k-sparse vector and we minimize a combination of the ℓ₁ norm and the nuclear norm to recover X₀. Combined with Theorem 1, Proposition 4 gives that whenever m ≤ c₁d/log d and c₂√m log d ≤ min{‖X̄₀‖₁, √d‖X̄₀‖⋆}, recovery fails with high probability. Since ‖X̄₀‖⋆ = 1 and ‖X̄₀‖₁ = ‖ā‖₁², the failure condition reduces to

m ≤ (c/log² d) min{‖ā‖₁⁴, d}.

When ā is a k-sparse vector with ±1 entries, in a similar flavor to Theorem 3, the right-hand side takes the form (c/log² d) min{k², d}.

We emphasize that the lower bound provided in [26] is directly comparable to our results. The authors in [26] consider the same problem and give two results: first, if m ≥ O(‖ā‖₁²k log d), then minimizing ‖X‖₁ + λ tr(X) for a suitable value of λ over the set of PSD matrices will exactly recover X₀ with high probability. Second, their Theorem 1.3 provides a necessary condition (lower bound) on the number of measurements, under which the recovery program fails to recover X₀ with high probability. In particular, their failure condition is m ≤ min{m₀, d/(40 log d)}, where m₀ = max(‖ā‖₁² − k/2, 0)²/(500 log² d). Observe that both results require m ≤ O(d/log d). Focusing on the sparsity requirements, when the nonzero entries are sufficiently diffuse (i.e., ‖ā‖₁⁴ ≈ k²), both results yield O(k²/log² d) as a lower bound. On the other hand, if ‖ā‖₁² ≤ k/2, their lower bound becomes trivial, while our lower bound still requires O(‖ā‖₁⁴/log² d) measurements. ‖ā‖₁² ≤ k/2 can happen as soon as the nonzero entries are spiky, i.e., some of the entries are much larger than the rest. In this sense, our bounds are tighter. On the other hand, their lower bound includes the PSD constraint, unlike ours.
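The coherence argument in the proof of Proposition 4 can be checked by simulation. A small sketch, with assumed sizes, estimating μ(Aᵀ) for vectors scaled to the sphere of radius √d:

```python
# Coherence mu = max_{i != j} (v_i^T v_j)^2 / d^2 for random sphere vectors;
# it behaves like (log d)/d, as used in the proof of Proposition 4.
import numpy as np

rng = np.random.default_rng(7)
d, m = 500, 50
V = rng.standard_normal((m, d))
V *= np.sqrt(d) / np.linalg.norm(V, axis=1, keepdims=True)   # radius sqrt(d)

G = (V @ V.T) ** 2 / d**2                    # pairwise (v_i^T v_j)^2 / d^2
np.fill_diagonal(G, 0.0)
print(G.max(), np.log(d) / d)                # same order of magnitude
```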
D. Asymptotic Regime

While we discussed two cases in the nonasymptotic setup, we believe significantly more general results can be stated asymptotically (m, n → ∞). For instance, under a finite fourth-moment constraint, thanks to the Bai-Yin law [53], the smallest singular value of a matrix with i.i.d. unit-variance entries asymptotically concentrates around √n − √m. Similarly, ‖Ax̄₀‖₂² is a sum of independent variables; hence, thanks to the law of large numbers, we will have ‖Ax̄₀‖₂²/m → 1. Together, these yield

‖Ax̄₀‖₂ / σ_min(Aᵀ) → √m / (√n − √m).

V. UPPER BOUNDS

We now state an upper bound on the simultaneous optimization for the Gaussian measurement ensemble. Our upper bound will be in terms of distances to the dilated subdifferentials. To accomplish this, we make use of the recent theory on the sample complexity of linear inverse problems. It has been recently shown that (3) exhibits a phase transition from failure with high probability to success with high probability when the number of Gaussian measurements is around the quantity m_PT = D(cone(∂f(x₀)))² [9], [49]. This phenomenon was first observed by Donoho and Tanner, who calculated the phase transitions for ℓ₁ minimization and showed that D(cone(∂‖x₀‖₁))² ≈ 2k log(en/(2k)) for a k-sparse vector in Rⁿ [54]. All of these works focus on signals with a single structure and do not study properties of a penalty that is a combination of norms. The next theorem relates the phase transition point of the joint optimization (3) to the individual subdifferentials.

Theorem 4: Suppose A has i.i.d. N(0, 1) entries and let f(x) = Σ_{i=1}^τ λ_i‖x‖_(i). For positive scalars {α_i}_{i=1}^τ, let λ̄_i = λ_iα_i^{−1} / (Σ_{j=1}^τ λ_jα_j^{−1}), and define

m_up({α_i}_{i=1}^τ) := ( Σ_i λ̄_i D(α_i ∂‖x₀‖_(i)) )².
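The Gaussian distance to a dilated subdifferential, as it appears in Theorem 4, is straightforward to estimate by Monte Carlo for the ℓ₁ norm: at x₀, the subdifferential fixes g_i = sign(x₀,i) on the support and allows |g_i| ≤ 1 elsewhere, so the distance from h to its dilation by α has a coordinate-wise closed form. A sketch under assumed sizes and a hypothetical dilation α:

```python
# Monte Carlo estimate of D(alpha * subdifferential of ||x0||_1).
import numpy as np

rng = np.random.default_rng(8)
n, k, alpha, trials = 200, 10, 3.0, 2000
x0 = np.zeros(n); x0[:k] = 1.0
support = x0 != 0

def dist_to_dilated_subdiff(h):
    # On the support, the dilated set fixes g_i = alpha * sign(x0_i);
    # off the support, g_i ranges over [-alpha, alpha].
    on = h[support] - alpha * np.sign(x0[support])
    off = np.maximum(np.abs(h[~support]) - alpha, 0.0)
    return np.sqrt(np.sum(on**2) + np.sum(off**2))

D = np.mean([dist_to_dilated_subdiff(rng.standard_normal(n))
             for _ in range(trials)])
print("estimated D:", D, " D^2 for this alpha:", D**2)
```

Minimizing the estimate over α gives the kind of per-norm distance term that enters m_up above.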


More information

Some Examples. Uniform motion. Poisson processes on the real line

Some Examples. Uniform motion. Poisson processes on the real line Some Examples Our immeiate goal is to see some examples of Lévy processes, an/or infinitely-ivisible laws on. Uniform motion Choose an fix a nonranom an efine X := for all (1) Then, {X } is a [nonranom]

More information

Leaving Randomness to Nature: d-dimensional Product Codes through the lens of Generalized-LDPC codes

Leaving Randomness to Nature: d-dimensional Product Codes through the lens of Generalized-LDPC codes Leaving Ranomness to Nature: -Dimensional Prouct Coes through the lens of Generalize-LDPC coes Tavor Baharav, Kannan Ramchanran Dept. of Electrical Engineering an Computer Sciences, U.C. Berkeley {tavorb,

More information

Optimization of Geometries by Energy Minimization

Optimization of Geometries by Energy Minimization Optimization of Geometries by Energy Minimization by Tracy P. Hamilton Department of Chemistry University of Alabama at Birmingham Birmingham, AL 3594-140 hamilton@uab.eu Copyright Tracy P. Hamilton, 1997.

More information

Table of Common Derivatives By David Abraham

Table of Common Derivatives By David Abraham Prouct an Quotient Rules: Table of Common Derivatives By Davi Abraham [ f ( g( ] = [ f ( ] g( + f ( [ g( ] f ( = g( [ f ( ] g( g( f ( [ g( ] Trigonometric Functions: sin( = cos( cos( = sin( tan( = sec

More information

Separation of Variables

Separation of Variables Physics 342 Lecture 1 Separation of Variables Lecture 1 Physics 342 Quantum Mechanics I Monay, January 25th, 2010 There are three basic mathematical tools we nee, an then we can begin working on the physical

More information

Agmon Kolmogorov Inequalities on l 2 (Z d )

Agmon Kolmogorov Inequalities on l 2 (Z d ) Journal of Mathematics Research; Vol. 6, No. ; 04 ISSN 96-9795 E-ISSN 96-9809 Publishe by Canaian Center of Science an Eucation Agmon Kolmogorov Inequalities on l (Z ) Arman Sahovic Mathematics Department,

More information

arxiv: v1 [math.mg] 10 Apr 2018

arxiv: v1 [math.mg] 10 Apr 2018 ON THE VOLUME BOUND IN THE DVORETZKY ROGERS LEMMA FERENC FODOR, MÁRTON NASZÓDI, AND TAMÁS ZARNÓCZ arxiv:1804.03444v1 [math.mg] 10 Apr 2018 Abstract. The classical Dvoretzky Rogers lemma provies a eterministic

More information

Introduction to the Vlasov-Poisson system

Introduction to the Vlasov-Poisson system Introuction to the Vlasov-Poisson system Simone Calogero 1 The Vlasov equation Consier a particle with mass m > 0. Let x(t) R 3 enote the position of the particle at time t R an v(t) = ẋ(t) = x(t)/t its

More information

arxiv: v1 [cs.lg] 22 Mar 2014

arxiv: v1 [cs.lg] 22 Mar 2014 CUR lgorithm with Incomplete Matrix Observation Rong Jin an Shenghuo Zhu Dept. of Computer Science an Engineering, Michigan State University, rongjin@msu.eu NEC Laboratories merica, Inc., zsh@nec-labs.com

More information

Analytic Scaling Formulas for Crossed Laser Acceleration in Vacuum

Analytic Scaling Formulas for Crossed Laser Acceleration in Vacuum October 6, 4 ARDB Note Analytic Scaling Formulas for Crosse Laser Acceleration in Vacuum Robert J. Noble Stanfor Linear Accelerator Center, Stanfor University 575 San Hill Roa, Menlo Park, California 945

More information

Logarithmic spurious regressions

Logarithmic spurious regressions Logarithmic spurious regressions Robert M. e Jong Michigan State University February 5, 22 Abstract Spurious regressions, i.e. regressions in which an integrate process is regresse on another integrate

More information

LECTURE NOTES ON DVORETZKY S THEOREM

LECTURE NOTES ON DVORETZKY S THEOREM LECTURE NOTES ON DVORETZKY S THEOREM STEVEN HEILMAN Abstract. We present the first half of the paper [S]. In particular, the results below, unless otherwise state, shoul be attribute to G. Schechtman.

More information

Ramsey numbers of some bipartite graphs versus complete graphs

Ramsey numbers of some bipartite graphs versus complete graphs Ramsey numbers of some bipartite graphs versus complete graphs Tao Jiang, Michael Salerno Miami University, Oxfor, OH 45056, USA Abstract. The Ramsey number r(h, K n ) is the smallest positive integer

More information

Discrete Mathematics

Discrete Mathematics Discrete Mathematics 309 (009) 86 869 Contents lists available at ScienceDirect Discrete Mathematics journal homepage: wwwelseviercom/locate/isc Profile vectors in the lattice of subspaces Dániel Gerbner

More information

Diophantine Approximations: Examining the Farey Process and its Method on Producing Best Approximations

Diophantine Approximations: Examining the Farey Process and its Method on Producing Best Approximations Diophantine Approximations: Examining the Farey Process an its Metho on Proucing Best Approximations Kelly Bowen Introuction When a person hears the phrase irrational number, one oes not think of anything

More information

arxiv: v2 [cs.ds] 11 May 2016

arxiv: v2 [cs.ds] 11 May 2016 Optimizing Star-Convex Functions Jasper C.H. Lee Paul Valiant arxiv:5.04466v2 [cs.ds] May 206 Department of Computer Science Brown University {jasperchlee,paul_valiant}@brown.eu May 3, 206 Abstract We

More information

1 dx. where is a large constant, i.e., 1, (7.6) and Px is of the order of unity. Indeed, if px is given by (7.5), the inequality (7.

1 dx. where is a large constant, i.e., 1, (7.6) and Px is of the order of unity. Indeed, if px is given by (7.5), the inequality (7. Lectures Nine an Ten The WKB Approximation The WKB metho is a powerful tool to obtain solutions for many physical problems It is generally applicable to problems of wave propagation in which the frequency

More information

Counting Lattice Points in Polytopes: The Ehrhart Theory

Counting Lattice Points in Polytopes: The Ehrhart Theory 3 Counting Lattice Points in Polytopes: The Ehrhart Theory Ubi materia, ibi geometria. Johannes Kepler (1571 1630) Given the profusion of examples that gave rise to the polynomial behavior of the integer-point

More information

Math Notes on differentials, the Chain Rule, gradients, directional derivative, and normal vectors

Math Notes on differentials, the Chain Rule, gradients, directional derivative, and normal vectors Math 18.02 Notes on ifferentials, the Chain Rule, graients, irectional erivative, an normal vectors Tangent plane an linear approximation We efine the partial erivatives of f( xy, ) as follows: f f( x+

More information

FLUCTUATIONS IN THE NUMBER OF POINTS ON SMOOTH PLANE CURVES OVER FINITE FIELDS. 1. Introduction

FLUCTUATIONS IN THE NUMBER OF POINTS ON SMOOTH PLANE CURVES OVER FINITE FIELDS. 1. Introduction FLUCTUATIONS IN THE NUMBER OF POINTS ON SMOOTH PLANE CURVES OVER FINITE FIELDS ALINA BUCUR, CHANTAL DAVID, BROOKE FEIGON, MATILDE LALÍN 1 Introuction In this note, we stuy the fluctuations in the number

More information

REAL ANALYSIS I HOMEWORK 5

REAL ANALYSIS I HOMEWORK 5 REAL ANALYSIS I HOMEWORK 5 CİHAN BAHRAN The questions are from Stein an Shakarchi s text, Chapter 3. 1. Suppose ϕ is an integrable function on R with R ϕ(x)x = 1. Let K δ(x) = δ ϕ(x/δ), δ > 0. (a) Prove

More information

Concentration of Measure Inequalities for Compressive Toeplitz Matrices with Applications to Detection and System Identification

Concentration of Measure Inequalities for Compressive Toeplitz Matrices with Applications to Detection and System Identification Concentration of Measure Inequalities for Compressive Toeplitz Matrices with Applications to Detection an System Ientification Borhan M Sananaji, Tyrone L Vincent, an Michael B Wakin Abstract In this paper,

More information

Computing Exact Confidence Coefficients of Simultaneous Confidence Intervals for Multinomial Proportions and their Functions

Computing Exact Confidence Coefficients of Simultaneous Confidence Intervals for Multinomial Proportions and their Functions Working Paper 2013:5 Department of Statistics Computing Exact Confience Coefficients of Simultaneous Confience Intervals for Multinomial Proportions an their Functions Shaobo Jin Working Paper 2013:5

More information

Capacity Analysis of MIMO Systems with Unknown Channel State Information

Capacity Analysis of MIMO Systems with Unknown Channel State Information Capacity Analysis of MIMO Systems with Unknown Channel State Information Jun Zheng an Bhaskar D. Rao Dept. of Electrical an Computer Engineering University of California at San Diego e-mail: juzheng@ucs.eu,

More information

Schrödinger s equation.

Schrödinger s equation. Physics 342 Lecture 5 Schröinger s Equation Lecture 5 Physics 342 Quantum Mechanics I Wenesay, February 3r, 2010 Toay we iscuss Schröinger s equation an show that it supports the basic interpretation of

More information

PDE Notes, Lecture #11

PDE Notes, Lecture #11 PDE Notes, Lecture # from Professor Jalal Shatah s Lectures Febuary 9th, 2009 Sobolev Spaces Recall that for u L loc we can efine the weak erivative Du by Du, φ := udφ φ C0 If v L loc such that Du, φ =

More information

Convergence of Random Walks

Convergence of Random Walks Chapter 16 Convergence of Ranom Walks This lecture examines the convergence of ranom walks to the Wiener process. This is very important both physically an statistically, an illustrates the utility of

More information

Approximate Constraint Satisfaction Requires Large LP Relaxations

Approximate Constraint Satisfaction Requires Large LP Relaxations Approximate Constraint Satisfaction Requires Large LP Relaxations oah Fleming April 19, 2018 Linear programming is a very powerful tool for attacking optimization problems. Techniques such as the ellipsoi

More information

The Principle of Least Action

The Principle of Least Action Chapter 7. The Principle of Least Action 7.1 Force Methos vs. Energy Methos We have so far stuie two istinct ways of analyzing physics problems: force methos, basically consisting of the application of

More information

The Exact Form and General Integrating Factors

The Exact Form and General Integrating Factors 7 The Exact Form an General Integrating Factors In the previous chapters, we ve seen how separable an linear ifferential equations can be solve using methos for converting them to forms that can be easily

More information

Slide10 Haykin Chapter 14: Neurodynamics (3rd Ed. Chapter 13)

Slide10 Haykin Chapter 14: Neurodynamics (3rd Ed. Chapter 13) Slie10 Haykin Chapter 14: Neuroynamics (3r E. Chapter 13) CPSC 636-600 Instructor: Yoonsuck Choe Spring 2012 Neural Networks with Temporal Behavior Inclusion of feeback gives temporal characteristics to

More information

arxiv: v4 [cs.ds] 7 Mar 2014

arxiv: v4 [cs.ds] 7 Mar 2014 Analysis of Agglomerative Clustering Marcel R. Ackermann Johannes Blömer Daniel Kuntze Christian Sohler arxiv:101.697v [cs.ds] 7 Mar 01 Abstract The iameter k-clustering problem is the problem of partitioning

More information

SINGULAR PERTURBATION AND STATIONARY SOLUTIONS OF PARABOLIC EQUATIONS IN GAUSS-SOBOLEV SPACES

SINGULAR PERTURBATION AND STATIONARY SOLUTIONS OF PARABOLIC EQUATIONS IN GAUSS-SOBOLEV SPACES Communications on Stochastic Analysis Vol. 2, No. 2 (28) 289-36 Serials Publications www.serialspublications.com SINGULAR PERTURBATION AND STATIONARY SOLUTIONS OF PARABOLIC EQUATIONS IN GAUSS-SOBOLEV SPACES

More information

6 General properties of an autonomous system of two first order ODE

6 General properties of an autonomous system of two first order ODE 6 General properties of an autonomous system of two first orer ODE Here we embark on stuying the autonomous system of two first orer ifferential equations of the form ẋ 1 = f 1 (, x 2 ), ẋ 2 = f 2 (, x

More information

Equilibrium in Queues Under Unknown Service Times and Service Value

Equilibrium in Queues Under Unknown Service Times and Service Value University of Pennsylvania ScholarlyCommons Finance Papers Wharton Faculty Research 1-2014 Equilibrium in Queues Uner Unknown Service Times an Service Value Laurens Debo Senthil K. Veeraraghavan University

More information

Quantum Mechanics in Three Dimensions

Quantum Mechanics in Three Dimensions Physics 342 Lecture 20 Quantum Mechanics in Three Dimensions Lecture 20 Physics 342 Quantum Mechanics I Monay, March 24th, 2008 We begin our spherical solutions with the simplest possible case zero potential.

More information

Linear First-Order Equations

Linear First-Order Equations 5 Linear First-Orer Equations Linear first-orer ifferential equations make up another important class of ifferential equations that commonly arise in applications an are relatively easy to solve (in theory)

More information

A note on asymptotic formulae for one-dimensional network flow problems Carlos F. Daganzo and Karen R. Smilowitz

A note on asymptotic formulae for one-dimensional network flow problems Carlos F. Daganzo and Karen R. Smilowitz A note on asymptotic formulae for one-imensional network flow problems Carlos F. Daganzo an Karen R. Smilowitz (to appear in Annals of Operations Research) Abstract This note evelops asymptotic formulae

More information

IEEE TRANSACTIONS ON INFORMATION THEORY, VOL. 57, NO. 11, NOVEMBER On the Performance of Sparse Recovery

IEEE TRANSACTIONS ON INFORMATION THEORY, VOL. 57, NO. 11, NOVEMBER On the Performance of Sparse Recovery IEEE TRANSACTIONS ON INFORMATION THEORY, VOL. 57, NO. 11, NOVEMBER 2011 7255 On the Performance of Sparse Recovery Via `p-minimization (0 p 1) Meng Wang, Student Member, IEEE, Weiyu Xu, and Ao Tang, Senior

More information

ELECTRON DIFFRACTION

ELECTRON DIFFRACTION ELECTRON DIFFRACTION Electrons : wave or quanta? Measurement of wavelength an momentum of electrons. Introuction Electrons isplay both wave an particle properties. What is the relationship between the

More information

The Press-Schechter mass function

The Press-Schechter mass function The Press-Schechter mass function To state the obvious: It is important to relate our theories to what we can observe. We have looke at linear perturbation theory, an we have consiere a simple moel for

More information

Lecture 6 : Dimensionality Reduction

Lecture 6 : Dimensionality Reduction CPS290: Algorithmic Founations of Data Science February 3, 207 Lecture 6 : Dimensionality Reuction Lecturer: Kamesh Munagala Scribe: Kamesh Munagala In this lecture, we will consier the roblem of maing

More information

A PAC-Bayesian Approach to Spectrally-Normalized Margin Bounds for Neural Networks

A PAC-Bayesian Approach to Spectrally-Normalized Margin Bounds for Neural Networks A PAC-Bayesian Approach to Spectrally-Normalize Margin Bouns for Neural Networks Behnam Neyshabur, Srinah Bhojanapalli, Davi McAllester, Nathan Srebro Toyota Technological Institute at Chicago {bneyshabur,

More information

On conditional moments of high-dimensional random vectors given lower-dimensional projections

On conditional moments of high-dimensional random vectors given lower-dimensional projections Submitte to the Bernoulli arxiv:1405.2183v2 [math.st] 6 Sep 2016 On conitional moments of high-imensional ranom vectors given lower-imensional projections LUKAS STEINBERGER an HANNES LEEB Department of

More information

Vectors in two dimensions

Vectors in two dimensions Vectors in two imensions Until now, we have been working in one imension only The main reason for this is to become familiar with the main physical ieas like Newton s secon law, without the aitional complication

More information

Math 342 Partial Differential Equations «Viktor Grigoryan

Math 342 Partial Differential Equations «Viktor Grigoryan Math 342 Partial Differential Equations «Viktor Grigoryan 6 Wave equation: solution In this lecture we will solve the wave equation on the entire real line x R. This correspons to a string of infinite

More information

A Course in Machine Learning

A Course in Machine Learning A Course in Machine Learning Hal Daumé III 12 EFFICIENT LEARNING So far, our focus has been on moels of learning an basic algorithms for those moels. We have not place much emphasis on how to learn quickly.

More information

Chapter 4. Electrostatics of Macroscopic Media

Chapter 4. Electrostatics of Macroscopic Media Chapter 4. Electrostatics of Macroscopic Meia 4.1 Multipole Expansion Approximate potentials at large istances 3 x' x' (x') x x' x x Fig 4.1 We consier the potential in the far-fiel region (see Fig. 4.1

More information

Lectures - Week 10 Introduction to Ordinary Differential Equations (ODES) First Order Linear ODEs

Lectures - Week 10 Introduction to Ordinary Differential Equations (ODES) First Order Linear ODEs Lectures - Week 10 Introuction to Orinary Differential Equations (ODES) First Orer Linear ODEs When stuying ODEs we are consiering functions of one inepenent variable, e.g., f(x), where x is the inepenent

More information

Sparse Reconstruction of Systems of Ordinary Differential Equations

Sparse Reconstruction of Systems of Ordinary Differential Equations Sparse Reconstruction of Systems of Orinary Differential Equations Manuel Mai a, Mark D. Shattuck b,c, Corey S. O Hern c,a,,e, a Department of Physics, Yale University, New Haven, Connecticut 06520, USA

More information

PHYS 414 Problem Set 2: Turtles all the way down

PHYS 414 Problem Set 2: Turtles all the way down PHYS 414 Problem Set 2: Turtles all the way own This problem set explores the common structure of ynamical theories in statistical physics as you pass from one length an time scale to another. Brownian

More information

Calculus of Variations

Calculus of Variations 16.323 Lecture 5 Calculus of Variations Calculus of Variations Most books cover this material well, but Kirk Chapter 4 oes a particularly nice job. x(t) x* x*+ αδx (1) x*- αδx (1) αδx (1) αδx (1) t f t

More information

Chapter 6: Energy-Momentum Tensors

Chapter 6: Energy-Momentum Tensors 49 Chapter 6: Energy-Momentum Tensors This chapter outlines the general theory of energy an momentum conservation in terms of energy-momentum tensors, then applies these ieas to the case of Bohm's moel.

More information

Tractability results for weighted Banach spaces of smooth functions

Tractability results for weighted Banach spaces of smooth functions Tractability results for weighte Banach spaces of smooth functions Markus Weimar Mathematisches Institut, Universität Jena Ernst-Abbe-Platz 2, 07740 Jena, Germany email: markus.weimar@uni-jena.e March

More information

Chaos, Solitons and Fractals Nonlinear Science, and Nonequilibrium and Complex Phenomena

Chaos, Solitons and Fractals Nonlinear Science, and Nonequilibrium and Complex Phenomena Chaos, Solitons an Fractals (7 64 73 Contents lists available at ScienceDirect Chaos, Solitons an Fractals onlinear Science, an onequilibrium an Complex Phenomena journal homepage: www.elsevier.com/locate/chaos

More information

Conservation Laws. Chapter Conservation of Energy

Conservation Laws. Chapter Conservation of Energy 20 Chapter 3 Conservation Laws In orer to check the physical consistency of the above set of equations governing Maxwell-Lorentz electroynamics [(2.10) an (2.12) or (1.65) an (1.68)], we examine the action

More information

NOTES ON EULER-BOOLE SUMMATION (1) f (l 1) (n) f (l 1) (m) + ( 1)k 1 k! B k (y) f (k) (y) dy,

NOTES ON EULER-BOOLE SUMMATION (1) f (l 1) (n) f (l 1) (m) + ( 1)k 1 k! B k (y) f (k) (y) dy, NOTES ON EULER-BOOLE SUMMATION JONATHAN M BORWEIN, NEIL J CALKIN, AND DANTE MANNA Abstract We stuy a connection between Euler-MacLaurin Summation an Boole Summation suggeste in an AMM note from 196, which

More information

Iterated Point-Line Configurations Grow Doubly-Exponentially

Iterated Point-Line Configurations Grow Doubly-Exponentially Iterate Point-Line Configurations Grow Doubly-Exponentially Joshua Cooper an Mark Walters July 9, 008 Abstract Begin with a set of four points in the real plane in general position. A to this collection

More information

Perfect Matchings in Õ(n1.5 ) Time in Regular Bipartite Graphs

Perfect Matchings in Õ(n1.5 ) Time in Regular Bipartite Graphs Perfect Matchings in Õ(n1.5 ) Time in Regular Bipartite Graphs Ashish Goel Michael Kapralov Sanjeev Khanna Abstract We consier the well-stuie problem of fining a perfect matching in -regular bipartite

More information

Function Spaces. 1 Hilbert Spaces

Function Spaces. 1 Hilbert Spaces Function Spaces A function space is a set of functions F that has some structure. Often a nonparametric regression function or classifier is chosen to lie in some function space, where the assume structure

More information

LeChatelier Dynamics

LeChatelier Dynamics LeChatelier Dynamics Robert Gilmore Physics Department, Drexel University, Philaelphia, Pennsylvania 1914, USA (Date: June 12, 28, Levine Birthay Party: To be submitte.) Dynamics of the relaxation of a

More information

Closed and Open Loop Optimal Control of Buffer and Energy of a Wireless Device

Closed and Open Loop Optimal Control of Buffer and Energy of a Wireless Device Close an Open Loop Optimal Control of Buffer an Energy of a Wireless Device V. S. Borkar School of Technology an Computer Science TIFR, umbai, Inia. borkar@tifr.res.in A. A. Kherani B. J. Prabhu INRIA

More information

05 The Continuum Limit and the Wave Equation

05 The Continuum Limit and the Wave Equation Utah State University DigitalCommons@USU Founations of Wave Phenomena Physics, Department of 1-1-2004 05 The Continuum Limit an the Wave Equation Charles G. Torre Department of Physics, Utah State University,

More information

Multi-View Clustering via Canonical Correlation Analysis

Multi-View Clustering via Canonical Correlation Analysis Keywors: multi-view learning, clustering, canonical correlation analysis Abstract Clustering ata in high-imensions is believe to be a har problem in general. A number of efficient clustering algorithms

More information

inflow outflow Part I. Regular tasks for MAE598/494 Task 1

inflow outflow Part I. Regular tasks for MAE598/494 Task 1 MAE 494/598, Fall 2016 Project #1 (Regular tasks = 20 points) Har copy of report is ue at the start of class on the ue ate. The rules on collaboration will be release separately. Please always follow the

More information

Analyzing Tensor Power Method Dynamics in Overcomplete Regime

Analyzing Tensor Power Method Dynamics in Overcomplete Regime Journal of Machine Learning Research 18 (2017) 1-40 Submitte 9/15; Revise 11/16; Publishe 4/17 Analyzing Tensor Power Metho Dynamics in Overcomplete Regime Animashree Ananumar Department of Electrical

More information

3 The variational formulation of elliptic PDEs

3 The variational formulation of elliptic PDEs Chapter 3 The variational formulation of elliptic PDEs We now begin the theoretical stuy of elliptic partial ifferential equations an bounary value problems. We will focus on one approach, which is calle

More information

Influence of weight initialization on multilayer perceptron performance

Influence of weight initialization on multilayer perceptron performance Influence of weight initialization on multilayer perceptron performance M. Karouia (1,2) T. Denœux (1) R. Lengellé (1) (1) Université e Compiègne U.R.A. CNRS 817 Heuiasyc BP 649 - F-66 Compiègne ceex -

More information

The total derivative. Chapter Lagrangian and Eulerian approaches

The total derivative. Chapter Lagrangian and Eulerian approaches Chapter 5 The total erivative 51 Lagrangian an Eulerian approaches The representation of a flui through scalar or vector fiels means that each physical quantity uner consieration is escribe as a function

More information

Entanglement is not very useful for estimating multiple phases

Entanglement is not very useful for estimating multiple phases PHYSICAL REVIEW A 70, 032310 (2004) Entanglement is not very useful for estimating multiple phases Manuel A. Ballester* Department of Mathematics, University of Utrecht, Box 80010, 3508 TA Utrecht, The

More information

arxiv: v2 [cond-mat.stat-mech] 11 Nov 2016

arxiv: v2 [cond-mat.stat-mech] 11 Nov 2016 Noname manuscript No. (will be inserte by the eitor) Scaling properties of the number of ranom sequential asorption iterations neee to generate saturate ranom packing arxiv:607.06668v2 [con-mat.stat-mech]

More information

Convergence rates of moment-sum-of-squares hierarchies for optimal control problems

Convergence rates of moment-sum-of-squares hierarchies for optimal control problems Convergence rates of moment-sum-of-squares hierarchies for optimal control problems Milan Kora 1, Diier Henrion 2,3,4, Colin N. Jones 1 Draft of September 8, 2016 Abstract We stuy the convergence rate

More information

Hyperbolic Moment Equations Using Quadrature-Based Projection Methods

Hyperbolic Moment Equations Using Quadrature-Based Projection Methods Hyperbolic Moment Equations Using Quarature-Base Projection Methos J. Koellermeier an M. Torrilhon Department of Mathematics, RWTH Aachen University, Aachen, Germany Abstract. Kinetic equations like the

More information

Math 1B, lecture 8: Integration by parts

Math 1B, lecture 8: Integration by parts Math B, lecture 8: Integration by parts Nathan Pflueger 23 September 2 Introuction Integration by parts, similarly to integration by substitution, reverses a well-known technique of ifferentiation an explores

More information