Sample Complexity of Bayesian Optimal Dictionary Learning

Ayaka Sakata and Yoshiyuki Kabashima
Dept. of Computational Intelligence & Systems Science, Tokyo Institute of Technology, Yokohama, Japan

Abstract—We consider a learning problem of identifying a dictionary matrix D ∈ R^{M×N} from a sample set of M-dimensional vectors Y = N^{−1/2} D X ∈ R^{M×P}, where X ∈ R^{N×P} is a sparse matrix in which the density of non-zero entries is 0 < ρ < 1. In particular, we focus on the minimum sample size P_c (sample complexity) necessary for the optimal learning scheme to perfectly identify D when D and X are independently generated from certain distributions. By using the replica method of statistical mechanics, we show that P_c ∼ O(N) holds as long as α = M/N > ρ is satisfied in the limit N → ∞. Our analysis also implies that the posterior distribution given Y is condensed only at the correct dictionary D when the compression rate α is greater than a certain critical value α_M(ρ). This suggests that belief propagation may allow us to learn D with a low computational complexity using O(N) samples.

I. INTRODUCTION

The concept of sparse representations has recently attracted considerable attention from various fields in which the number of measurements is limited. Many real-world signals such as natural images are represented sparsely in Fourier/wavelet domains; in other words, many components vanish or are negligibly small in amplitude when the signals are represented by Fourier/wavelet bases. This empirical property is exploited in the signal recovery paradigm of compressed sensing (CS), thereby enabling the recovery of sparse signals from far fewer measurements than those required by the Nyquist-Shannon sampling theorem [1], [2], [3], [4].

In signal processing techniques that exploit sparsity, signals are generally assumed to be described as linear combinations of a few dictionary atoms. Therefore, the effectiveness of this approach is highly dependent on the choice of dictionary, by which the objective signals appear sparse. A method for choosing an appropriate dictionary for sparse representation is dictionary learning (DL), whereby the dictionary is constructed through a learning process from an available set of P training samples [5], [6], [7], [8]. Ambiguity of the dictionary is fatal in signal/data analysis after learning. Therefore, an important issue is the estimation of the sample complexity, i.e., the sample size P_c necessary for correct identification of the dictionary.

In a seminal work, Aharon et al. showed that when the training set Y ∈ R^{M×P} is generated by a dictionary D ∈ R^{M×N} and a sparse matrix X ∈ R^{N×P} (the planted solution) as Y = DX, one can perfectly learn these if P > P_c = (k+1)·(N choose k) and k is sufficiently small, where k is the number of non-zero elements in each column of X [9]. Unfortunately, this bound becomes exponentially large in N for k ∼ O(N), which motivates us to improve the estimation. A recent study has shown that almost all dictionaries under the uniform measure are learnable with P_c ∼ O(M) samples provided k ln M grows sufficiently slowly with the system size [10]. However, the fact that the number of unknown variables, NM + NP, and the number of known variables, MP, are balanced with each other at P ∼ O(N) when M ∼ O(N) implies the possibility of DL with O(N) training samples. To answer this question, in this study we evaluate the sample complexity of the optimal learning scheme defined for a given probabilistic model of dictionary learning.
In a previous study, the authors assessed the sample complexity of a naive learning scheme: min_{D,X} ||Y − DX||_F^2 subject to ||X||_0 ≤ NPρ (0 < ρ < 1), where ||A||_F indicates the Frobenius norm of A, ||X||_0 is the number of nonzero elements in X, and D is enforced to be normalized appropriately. They used the replica method of statistical mechanics and found that P_c ∼ O(N) holds when α = M/N is greater than a certain critical value α_naive(ρ) > ρ [11]. However, the smallest possible P_c that can be obtained for α < α_naive(ρ) has not been clarified thus far. In this study, we show that P_c ∼ O(N) holds in the entire region α > ρ for the optimal learning scheme.

II. PROBLEM SETUP

Let us suppose the following scenario of dictionary learning. Planted solutions, an M × N dictionary matrix D ∈ R^{M×N} and an N × P sparse matrix X ∈ R^{N×P}, are independently generated from prior distributions P(D) and P_ρ(X), respectively, where P(D) is the uniform distribution over an appropriate support and

P_ρ(X) = ∏_{i,l} P_ρ(x_{il}) = ∏_{i,l} { (1−ρ) δ(x_{il}) + ρ f(x_{il}) }.   (1)

The rate of non-zero elements in X is given by ρ ∈ [0,1], and the distribution function f(x) does not have a finite probability mass at the origin. The set of training samples Y ∈ R^{M×P}, whose column vectors correspond to the training samples, is assumed to be given by the planted solutions as

Y = N^{−1/2} D X,   (2)

where the factor N^{−1/2} is introduced for convenience in taking the large-system limit. A learner is required to infer D and X from Y.
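The generative model (1)-(2) is straightforward to simulate. The following is a minimal NumPy sketch, not taken from the paper: the function name generate_instance and its parameters are hypothetical, the i.i.d. Gaussian draw for D with columns rescaled to ||D_i||² = M is only a convenient stand-in for the uniform prior P(D) on the constrained support introduced below (the sign and ordering constraints are omitted), and f is taken Gaussian with variance σ², the choice made later in Sec. IV.

```python
import numpy as np

def generate_instance(N, alpha, gamma, rho, sigma=1.0, seed=0):
    """Draw a planted instance (D, X, Y) of the model (1)-(2).

    D: M x N with each column rescaled to ||D_i||^2 = M (stand-in for P(D));
    X: N x P Bernoulli-Gaussian with density rho and variance sigma^2;
    Y = D X / sqrt(N), as in eq. (2).
    """
    rng = np.random.default_rng(seed)
    M, P = int(alpha * N), int(gamma * N)
    D = rng.standard_normal((M, N))
    D *= np.sqrt(M) / np.linalg.norm(D, axis=0)          # enforce ||D_i||^2 = M
    support = rng.random((N, P)) < rho                    # non-zero locations
    X = support * rng.normal(0.0, sigma, size=(N, P))     # f = Gaussian(0, sigma^2)
    Y = D @ X / np.sqrt(N)
    return D, X, Y

# Example draw with alpha = M/N = 0.5 and gamma = P/N = 2.0
D, X, Y = generate_instance(N=200, alpha=0.5, gamma=2.0, rho=0.1)
```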

We impose the normalization constraint Σ_{µ=1}^M D_{µi}² = ||D_i||² = M for each column i = 1, ..., N to avoid the ambiguity of the product DX = (DA)(A^{−1}X) for diagonal matrices A with positive diagonal entries. In addition, we introduce two other constraints: i) the values of Σ_{µ=1}^M D_{µi} are set to be positive, and ii) the columns of D are lined up in descending order of the absolute value of Σ_{µ=1}^M D_{µi}, so that the ambiguities of simultaneous permutation and/or multiplication by the same signs of columns in D and rows in X are removed. (One could choose different constraints as long as the trivial ambiguities of the column order and the signs are resolved.) In the following, the uniform prior P(D) is assumed to be defined on the support that satisfies all of these constraints. Our aim is to evaluate the minimum value of the sample size P required for perfectly identifying D and X.

III. BAYESIAN OPTIMAL LEARNING

For a mathematical formulation of our problem, let us denote the estimates of D and X yielded by an arbitrary learning scheme as D̂(Y) and X̂(Y). We evaluate the efficiency of the scheme using the mean squared errors (per element)

MSE_D(D̂(·)) = (1/(MN)) Σ_{Y,D,X} P_ρ(D,X,Y) ||D − D̂(Y)||²,   (3)
MSE_X(X̂(·)) = (1/(NP)) Σ_{Y,D,X} P_ρ(D,X,Y) ||X − X̂(Y)||²,   (4)

where A·B = Σ_{i,j} A_{ij}B_{ij} represents the inner product between two matrices A and B of the same dimensions. We impose the normalization constraint Σ_{µ=1}^M (D̂(Y))_{µi}² = M for each column index i = 1, ..., N in order to avoid the ambiguity of the product D̂(Y)X̂(Y) = (D̂(Y)A)(A^{−1}X̂(Y)) for an arbitrary invertible diagonal matrix A. The joint distribution of D, X, and Y is given by

P_ρ(D,X,Y) = δ(Y − N^{−1/2}DX) P(D) P_ρ(X).   (5)

Perfect identification of D and X is characterized by MSE_D = MSE_X = 0. The following theorem offers a useful basis for answering our question.

Theorem 1. For an arbitrary learning scheme, (3) and (4) are bounded from below as

MSE_D(D̂(·)) ≥ 2( 1 − (1/N) Σ_Y P_ρ(Y) Σ_{i=1}^N ||(⟨D⟩_ρ)_i|| / √M ),   (6)
MSE_X(X̂(·)) ≥ (1/(NP)) Σ_Y P_ρ(Y) ( ⟨X·X⟩_ρ − ⟨X⟩_ρ·⟨X⟩_ρ ),   (7)

where P_ρ(Y) = Σ_{D,X} P_ρ(D,X,Y), and ⟨·⟩_ρ denotes the average over D and X according to the posterior distribution under a given Y, P_ρ(D,X|Y) = P_ρ(D,X,Y)/P_ρ(Y). The equalities hold when the estimates satisfy

(D̂_opt(Y))_i = √M (⟨D⟩_ρ)_i / ||(⟨D⟩_ρ)_i||,   X̂_opt(Y) = ⟨X⟩_ρ,   (8)

where (A)_i denotes the i-th column vector of matrix A. We refer to (8) as the Bayesian optimal learning scheme [12].

Proof: By applying the Cauchy-Schwarz inequality and the minimization of a quadratic function to MSE_D and MSE_X, respectively, one obtains (6)-(8) after inserting the expression Σ_{D,X} x P_ρ(D,X,Y) = P_ρ(Y) Σ_{D,X} x P_ρ(D,X|Y) = P_ρ(Y) ⟨x⟩_ρ, for x = D and X, into (3) and (4).

This theorem guarantees that when the setup of dictionary learning is characterized by P(D) and P_ρ(X), the estimates of (8) offer the best possible learning performance in the sense that (3) and (4) are minimized. As perfect identification of D and X is characterized by MSE_D = MSE_X = 0, our purpose is fulfilled by analyzing the performance of the Bayesian optimal learning scheme (8).

Fig. 1. γ-dependence of (a) MSE_D and (b) MSE_X at α = 0.5 and a fixed ρ. Plots for θ = ρ, θ = 0.8ρ, and θ = 0.5ρ show the optimality of the correct parameter choice θ = ρ. Broken curves represent locally unstable branches, which are thermodynamically irrelevant.
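As a concrete reading of (8) and (3)-(4), the sketch below shows how one would form the Bayes-optimal estimates and measure the per-element errors for one planted pair (D, X). It is illustrative only: the function names bayes_optimal_estimates and per_element_errors are hypothetical, and the use of posterior samples is an assumption (the paper analyzes the estimator analytically and prescribes no sampler); the averages over Y, D, X in (3)-(4) correspond to repeating this over many planted instances.

```python
import numpy as np

def bayes_optimal_estimates(D_samples, X_samples):
    """Form the estimates of eq. (8) from posterior samples.

    D_samples: (S, M, N) array, X_samples: (S, N, P) array, both assumed to be
    drawn from P_rho(D, X | Y) by some sampler (unspecified here).
    """
    D_mean = D_samples.mean(axis=0)                      # <D>_rho
    X_hat = X_samples.mean(axis=0)                       # X_opt = <X>_rho
    M = D_mean.shape[0]
    # Rescale each posterior-mean column to norm sqrt(M), as required by (8).
    D_hat = D_mean * np.sqrt(M) / np.linalg.norm(D_mean, axis=0)
    return D_hat, X_hat

def per_element_errors(D_hat, X_hat, D, X):
    """Per-element squared errors entering (3) and (4) for one planted (D, X)."""
    return np.sum((D_hat - D) ** 2) / D.size, np.sum((X_hat - X) ** 2) / X.size
```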

IV. ANALYSIS

For simplicity of calculation, let us set f(x_{il}) to the Gaussian distribution with mean 0 and variance σ²; σ² is set to unity in all numerical calculations later on. For generality, we consider cases in which the sparsity assumed by the learner, denoted θ, can differ from the actual value ρ. When θ ≠ ρ, the estimates are given by (D̂(Y))_i = √M (⟨D⟩_θ)_i/||(⟨D⟩_θ)_i|| and X̂(Y) = ⟨X⟩_θ instead of (8).

To evaluate MSE_D and MSE_X, we need to evaluate the macroscopic quantities

q_D = (1/(MN)) [⟨D⟩_θ·⟨D⟩_θ]_Y,   m_D = (1/(MN)) [⟨D⟩_θ·⟨D⟩_ρ]_Y,   (9)
Q_X = (1/(NP)) [⟨X·X⟩_θ]_Y,   (10)
q_X = (1/(NP)) [⟨X⟩_θ·⟨X⟩_θ]_Y,   m_X = (1/(NP)) [⟨X⟩_θ·⟨X⟩_ρ]_Y,   (11)

where [·]_Y = Σ_Y P_ρ(Y)(·). Note that (9)-(11) yield MSE_D ≈ 2(1 − m_D/√q_D) and MSE_X = ρσ² + q_X − 2m_X. (Naive computation actually requires a column-wise overlap C_{D,i} = M^{−1/2}[ (⟨D⟩_ρ)_i·(⟨D⟩_θ)_i / ||(⟨D⟩_θ)_i|| ]_Y for each column index i = 1, ..., N; however, the law of large numbers and statistical uniformity allow the simplification C_{D,i} ≈ m_D/√q_D as N and M = αN tend to infinity.)

Unfortunately, evaluating these quantities is intrinsically difficult because it generally requires averaging, with respect to Y, quantities such as

( Σ_{D¹,X¹,D²,X²} P_θ(Y,D¹,X¹) P_θ(Y,D²,X²) (D¹·D²) ) / ( Σ_{D¹,X¹,D²,X²} P_θ(Y,D¹,X¹) P_θ(Y,D²,X²) )   ( = ⟨D⟩_θ·⟨D⟩_θ ),   (12)

which include summations over exponentially many terms in the denominator. One promising approach for avoiding this difficulty is to multiply P_θ^n(Y) = ( Σ_{D,X} P_θ(Y,D,X) )^n (n = 1, 2, 3, ...) inside the operation [·]_Y so as to cancel the denominator of (12), which makes the evaluation of the modified average

q_D(n) = [ P_θ^n(Y) ⟨D⟩_θ·⟨D⟩_θ ]_Y / ( MN [P_θ^n(Y)]_Y )   (13)

feasible via a saddle-point assessment of [P_θ^n(Y)]_Y for N, M, P → ∞, keeping α = M/N and γ = P/N of O(1). Furthermore, the resulting expression is likely to hold for n ∈ R as well. Therefore, we evaluate q_D using the formula q_D = lim_{n→0} q_D(n) with that expression, and similarly for m_D, Q_X, q_X, and m_X. This procedure is often termed the replica method [13], [14].

Under the replica symmetric ansatz, which assumes that the dominant saddle point in the evaluation is invariant under any permutation of the replica indices a = 1, 2, ..., n, the assessment is reduced to evaluating the extremum of the free entropy (density) function

φ = γ[ (Q̂_X Q_X + q̂_X q_X)/2 − m̂_X m_X + ⟨ln Ξ₀⟩ ]
  + α[ (Q̂_D + q̂_D q_D)/2 − m̂_D m_D − (1/2) ln(1 + Q̂_D + q̂_D) + (q̂_D + m̂_D²)/(2(1 + Q̂_D + q̂_D)) ]
  − (αγ/2)[ (q_D q_X − 2 m_D m_X + ρσ²)/(Q_X − q_D q_X) + ln(Q_X − q_D q_X) ],   (14)

where

σ̂² = 1 + (Q̂_X + q̂_X)σ²,   Ξ₀ = (1−θ) + (θ/σ̂) exp( σ²(√q̂_X z + m̂_X x)² / (2σ̂²) ),   (15)

and ⟨·⟩ denotes the average over x and z, which are distributed according to P_ρ(x) and a Gaussian distribution with mean zero and unit variance, respectively. The extremized value of φ, denoted φ*, is related to the average log-likelihood (density) of Y as N^{−2} Σ_Y P_ρ(Y) ln P_θ(Y) = lim_{n→0} (∂/∂n){ N^{−2} ln [P_θ^n(Y)]_Y } = φ* + const. (When multiple extrema exist, the maximum value among them should be chosen as long as no consistency condition is violated.)

In Fig. 1, (a) MSE_D and (b) MSE_X for θ = ρ are plotted versus γ together with those for θ = 0.8ρ and θ = 0.5ρ. At θ = ρ, MSE_D and MSE_X of the thermodynamically relevant branches take their minimum values in the entire γ region, while a branch of solutions characterized by MSE_D = MSE_X = 0 is shared by the three parameter sets. This supports the optimality of the correct parameter choice θ = ρ, and therefore we hereafter focus our analysis on this case to estimate the minimum value of γ for perfect learning, MSE_D = MSE_X = 0.

At θ = ρ, the relationships m_D = q_D, m_X = q_X, and Q_X = ρσ² hold from (9)-(11), and the extremum problem is reduced to

q_D = q̂_D / (1 + q̂_D),   q_X = ⟨ (Ξ₁/Ξ₀)² ⟩,   (16)

where Ξ₁ denotes the derivative of Ξ₀ with respect to the field h = √q̂_X z + m̂_X x (so that Ξ₁/Ξ₀ is the posterior mean of the effective scalar channel), and q̂_D and q̂_X are given by

q̂_X = α q_D / (ρσ² − q_D q_X),   q̂_D = γ q_X / (ρσ² − q_D q_X).   (17)

The other variables are provided as Q̂_D = 0, Q̂_X = 0, m̂_X = q̂_X, and m̂_D = q̂_D.
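The reduced equations (16)-(17) can be iterated numerically. The sketch below is an assumption-laden illustration rather than code from the paper: the names bg_overlap and solve_saddle are hypothetical, the Bernoulli-Gaussian average in (16) is evaluated by Monte Carlo instead of the closed-form ⟨·⟩ of (15), and the τ → 0⁺ limit of the noiseless constraint is replaced by a small fixed regularizer tau.

```python
import numpy as np

def bg_overlap(s, rho, sigma, n_mc=200_000, seed=1):
    """q_X = E[<x>^2] for the scalar Bernoulli-Gaussian channel at SNR s (Monte Carlo)."""
    rng = np.random.default_rng(seed)
    v = 1.0 / max(s, 1e-12)                               # effective noise variance
    x0 = (rng.random(n_mc) < rho) * rng.normal(0.0, sigma, n_mc)
    y = x0 + rng.normal(0.0, np.sqrt(v), n_mc)
    g0 = np.exp(-y**2 / (2 * v)) / np.sqrt(2 * np.pi * v)
    g1 = np.exp(-y**2 / (2 * (v + sigma**2))) / np.sqrt(2 * np.pi * (v + sigma**2))
    post_mean = rho * g1 * (sigma**2 / (sigma**2 + v)) * y / ((1 - rho) * g0 + rho * g1)
    return float(np.mean(post_mean**2))

def solve_saddle(alpha, gamma, rho, sigma=1.0, tau=1e-8, iters=200, damp=0.5,
                 qD=0.99, qX=None):
    """Damped fixed-point iteration for (16)-(17) at theta = rho."""
    qX = 0.99 * rho * sigma**2 if qX is None else qX
    for _ in range(iters):
        denom = max(rho * sigma**2 - qD * qX, 0.0) + tau
        hat_qX, hat_qD = alpha * qD / denom, gamma * qX / denom
        qD = damp * hat_qD / (1.0 + hat_qD) + (1 - damp) * qD
        qX = damp * bg_overlap(hat_qX, rho, sigma) + (1 - damp) * qX
    mse_D = 2.0 * (1.0 - np.sqrt(qD))      # uses m_D = q_D at theta = rho
    mse_X = rho * sigma**2 - qX            # uses m_X = q_X at theta = rho
    return qD, qX, mse_D, mse_X
```

Starting from q_D ≈ 1 (as above) tracks the informed branch, while starting from q_D ≈ q_X ≈ 0 tracks the uninformed branch; scanning γ from both initializations reproduces, up to Monte-Carlo noise, the branch structure discussed in Sec. V.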
V. RESULTS

A. Actual solutions

Fig. 2 plots q_D and q_X versus γ for α = 0.5 and the same ρ as in Fig. 1. As shown in the figure, the solutions of q_D and q_X given by (16) are classified into three types: q_D = 1, q_X = ρσ²; q_D = q_X = 0; and 0 < q_D < 1, 0 < q_X < ρσ². The first one yields MSE_D = MSE_X = 0, indicating the correct identification of D and X, and hence we name it the success solution. The second one is referred to as the failure solution because it yields MSE_D = 2 and MSE_X = ρσ², which indicates complete failure of the learning of D and X. The third one yields finite MSE_D and MSE_X, 0 < MSE_D < 2 and 0 < MSE_X < ρσ², and we term it the middle solution.

1) Success solution: When the expression

δ(Y − N^{−1/2}DX) = lim_{τ→0⁺} (2πτ)^{−MP/2} exp( −||Y − N^{−1/2}DX||² / (2τ) )   (18)

is used, the success solution of q_D and q_X behaves as (ρσ² − q_X)/τ = χ_X and (1 − q_D)/τ = χ_D, while q̂_X and q̂_D scale as q̂_X = θ̂_X/τ and q̂_D = θ̂_D/τ. Substituting these into the equations for q_D and q_X gives

χ_X = ργ/g,   χ_D = α/(ρσ²g),   θ̂_X = ρ/χ_X,   θ̂_D = 1/χ_D,   (19)

where g = (α−ρ)γ − α. χ_X and χ_D must be positive by definition, and hence the success solution exists for

γ > α/(α−ρ) ≡ γ_S   (20)

only when α > ρ.
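Equations (19)-(20) are explicit enough to evaluate directly. The small helper below is illustrative only (the name success_branch and the dictionary of outputs are choices made here, not notation from the paper); it returns the leading-order success-branch coefficients and the threshold γ_S for a given (α, γ, ρ).

```python
def success_branch(alpha, gamma, rho, sigma=1.0):
    """Leading-order success-solution quantities of eqs. (19)-(20)."""
    if alpha <= rho:
        raise ValueError("the success solution requires alpha > rho")
    gamma_S = alpha / (alpha - rho)          # eq. (20)
    g = (alpha - rho) * gamma - alpha        # g > 0 iff gamma > gamma_S
    if g <= 0:
        raise ValueError("the success solution requires gamma > gamma_S")
    chi_X = rho * gamma / g
    chi_D = alpha / (rho * sigma**2 * g)
    return {"gamma_S": gamma_S, "g": g, "chi_X": chi_X, "chi_D": chi_D,
            "theta_hat_X": rho / chi_X, "theta_hat_D": 1.0 / chi_D}

# e.g. success_branch(alpha=0.5, gamma=2.0, rho=0.1) -> gamma_S = 1.25, g = 0.3
```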

Fig. 2. γ-dependence of q_D (left axis) and q_X (right axis) for α = 0.5 and the same ρ as in Fig. 1; the vertical lines indicate γ_S, γ_F, and γ_M.

2) Failure solution: The failure solution q_D = q_X = 0 appears for γ < γ_F as a locally stable solution. When q_D and q_X are sufficiently small, they are expressed as

q_X = αρσ² q_D + O(q²),   q_D = (γ/(ρσ²)) q_X + O(q²),   (21)

where O(q²) denotes terms of second and higher order in q_D and q_X. Substituting the first relation into the second gives q_D = αγ q_D + O(q²), and hence these expressions indicate that when

γ > 1/α ≡ γ_F,   (22)

the local stability of q_D = q_X = 0 is lost. As shown in Fig. 2, the failure solution vanishes at γ_F = 2.0 for α = 0.5.

3) Middle solution: We define γ_M as the value of γ above which the middle solution with 0 < q_D < 1 and 0 < q_X < ρσ² disappears; it is indicated by a vertical line in Fig. 2. The value of γ_M depends on (α, ρ), as shown in Fig. 3, which indicates that γ_M diverges as ρ approaches a critical value ρ_M for α = 0.5. The relation between ρ_M and α, denoted as ρ_M(α) (or α_M(ρ)), generally accords with the critical condition at which belief propagation (BP)-based signal recovery using the correct prior starts to involve multiple fixed points in the signal reconstruction problem of compressed sensing [16], in which the correct dictionary D is provided in advance.

Fig. 3. γ_M versus ρ for α = 0.5.

Fig. 4. γ-dependence of φ for α = 0.5 and the same ρ as in Fig. 1. φ_S diverges positively for γ > γ_S = α/(α−ρ).

BP is also a potential algorithm for practically achieving the learning performance predicted by the current analysis, because it is known for many other systems that the macroscopic behavior theoretically analyzed by the replica method can be confirmed experimentally for single instances by BP [16], [17], [18]. The fact that only the success solution exists for γ > γ_M implies that one may be able to perfectly identify the correct dictionary D with a computational cost of polynomial order in N by utilizing BP, without being trapped by other locally stable solutions, for α > α_M(ρ).
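The two closed-form thresholds above, γ_S in (20) and γ_F in (22), suffice to classify a given (α, ρ, γ); γ_M has no closed form here and must be located numerically, e.g. by scanning γ with a saddle-point solver such as the one sketched in Sec. IV. The helper below is an illustrative summary, not part of the original analysis, and its name classify_gamma is hypothetical.

```python
def classify_gamma(alpha, gamma, rho):
    """Summarize where gamma sits relative to gamma_S (eq. (20)) and gamma_F (eq. (22))."""
    if alpha <= rho:
        return "alpha <= rho: perfect identification is impossible"
    gamma_S = alpha / (alpha - rho)
    gamma_F = 1.0 / alpha
    notes = [
        "success solution exists" if gamma > gamma_S else "no success solution",
        "failure solution locally unstable" if gamma > gamma_F else "failure solution locally stable",
    ]
    return f"gamma_S = {gamma_S:.3g}, gamma_F = {gamma_F:.3g}; " + "; ".join(notes)

# e.g. classify_gamma(0.5, 2.5, 0.1)
# -> "gamma_S = 1.25, gamma_F = 2; success solution exists; failure solution locally unstable"
```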

B. Free entropy density

There are three extrema of the free entropy (density), φ_S, φ_F, and φ_M, corresponding to the success, failure, and middle solutions, respectively. Among them, the thermodynamically dominant solution, which provides the correct evaluations of q_D and q_X, is the one for which the value of the free entropy is the largest. Fig. 4 plots φ_S, φ_F, and φ_M versus γ for α = 0.5 and the same ρ, where γ_S = α/(α−ρ) and γ_F = 2.0. In particular, the functional forms of φ_S and φ_F are given by

φ_S = lim_{τ→0⁺} (1/2)[ g{ ln( g/(ρσ²τ) ) + 1 } − αγ ln(αγ) + α ln α ] + γρ(ln γ − ln σ²) − γ H(ρ),   (23)
φ_F = −(1/2){ αγ(1 + ln(ρσ²)) + α },   (24)

where the limit τ → 0⁺ originates from the expression (18) and H(ρ) = −(1−ρ)ln(1−ρ) − ρ ln ρ. Equation (23) shows that φ_S diverges positively for g = (α−ρ)γ − α > 0, which guarantees that the success solution is always thermodynamically dominant for γ > γ_S = α/(α−ρ), as φ of the other solutions stays finite. This leads to the conclusion that the sample complexity of the Bayesian optimal learning is P_c = γ_S N, which is guaranteed to be of O(N) as long as α > ρ. This is the main consequence of the present study.

Fig. 5. Phase diagram on the α-ρ plane. The dashed curve in the area of (I) is the result of [11].

Fig. 5 plots the phase diagram in the α-ρ plane. The union of regions (I) and (II) represents the condition under which the sample complexity P_c is O(N), while the full curve forming the upper boundary of (II) denotes α_M(ρ), above which BP is expected to work as an efficient learning algorithm. Dictionary learning is impossible in region (III). The critical condition α_naive(ρ), above which the naive learning scheme of [11] can perfectly identify the planted solution from O(N) samples, is drawn as the dashed curve for comparison. The considerable difference between α_naive(ρ) and ρ (or even α_M(ρ)) indicates the significance of using adequate knowledge of the probabilistic model in dictionary learning.

VI. SUMMARY

In summary, we assessed the minimum sample size required for perfectly identifying a planted solution in dictionary learning (DL). For this assessment, we derived the optimal learning scheme defined for a given probabilistic model of DL following the framework of Bayesian inference. Unfortunately, actually evaluating the performance of the Bayesian optimal learning scheme involves an intrinsic technical difficulty. To resolve this difficulty, we resorted to the replica method of statistical mechanics, and we showed that the sample complexity can be reduced to O(N) as long as the compression rate α is greater than the density ρ of non-zero elements of the sparse matrix. This indicates that the performance of the naive learning scheme examined in a previous study [11] can be improved significantly by utilizing knowledge of an adequate probabilistic model in DL. It was also shown that when α is greater than a certain critical value α_M(ρ), the macroscopic state corresponding to perfect identification of the planted solution becomes the unique candidate for the thermodynamically dominant state. This suggests that one may be able to learn the planted solution with a computational complexity of polynomial order in N by utilizing belief propagation for α > α_M(ρ).

Note added: After completing this study, the authors became aware that [19] presents results similar to those of this paper, where an algorithm for dictionary learning/calibration is independently developed on the basis of belief propagation.

ACKNOWLEDGMENT

This work was partially supported by a Grant-in-Aid for JSPS Fellows (AS), KAKENHI grants (YK), and the JSPS Core-to-Core Program "Non-equilibrium dynamics of soft matter and information."

REFERENCES

[1] J.-L. Starck, F. Murtagh, and J. M. Fadili, Sparse Image and Signal Processing: Wavelets, Curvelets, Morphological Diversity, Cambridge Univ. Press, New York, 2010.
[2] H. Nyquist, "Certain topics in telegraph transmission theory," Trans. AIEE, vol. 47, 1928.
[3] D. L. Donoho, "Compressed sensing," IEEE Trans. Inform. Theory, vol. 52, no. 4, 2006.
[4] E. J. Candès and T. Tao, "Decoding by linear programming," IEEE Trans. Inform. Theory, vol. 51, no. 12, 2005.
[5] B. A. Olshausen and D. J. Field, "Sparse coding with an overcomplete basis set: a strategy employed by V1?," Vision Res., vol. 37, no. 23, 1997.
[6] R. Rubinstein, A. M. Bruckstein, and M. Elad, "Dictionaries for sparse representation modeling," Proc. IEEE, vol. 98, no. 6, 2010.
[7] M. Elad, Sparse and Redundant Representations: From Theory to Applications in Signal and Image Processing, Springer-Verlag, New York, 2010.
[8] S. Gleichman and Y. C. Eldar, "Blind compressed sensing," IEEE Trans. Inform. Theory, vol. 57, 2011.
[9] M. Aharon, M. Elad, and A. M. Bruckstein, "On the uniqueness of overcomplete dictionaries, and a practical way to retrieve them," Linear Algebra and its Applications, vol. 416, 2006.
[10] D. Vainsencher, S. Mannor, and A. M. Bruckstein, "The sample complexity of dictionary learning," Journal of Machine Learning Research, 2011.
[11] A. Sakata and Y. Kabashima, "Statistical mechanics of dictionary learning," arXiv preprint.
[12] Y. Iba, "The Nishimori line and Bayesian statistics," J. Phys. A: Math. Gen., vol. 32, 1999.
[13] M. Mézard, G. Parisi, and M. A. Virasoro, Spin Glass Theory and Beyond, World Scientific, 1987.
[14] H. Nishimori, Statistical Physics of Spin Glasses and Information Processing: An Introduction, Oxford Univ. Press, 2001.
[15] M. Mézard and A. Montanari, Information, Physics, and Computation, Oxford Univ. Press, Oxford, UK, 2009.

[16] F. Krzakala, M. Mézard, F. Sausset, Y. F. Sun, and L. Zdeborová, "Statistical-physics-based reconstruction in compressed sensing," Phys. Rev. X, vol. 2, 021005, 2012.
[17] D. J. Thouless, P. W. Anderson, and R. G. Palmer, "Solution of 'Solvable model of a spin glass'," Phil. Mag., vol. 35, no. 3, 1977.
[18] Y. Kabashima, "A CDMA multiuser detection algorithm on the basis of belief propagation," J. Phys. A, vol. 36, no. 43, 2003.
[19] F. Krzakala, M. Mézard, and L. Zdeborová, "Phase diagram and approximate message passing for blind calibration and dictionary learning." Preprint received directly from the authors via private communication.
