Sample Complexity of Bayesian Optimal Dictionary Learning

Ayaka Sakata and Yoshiyuki Kabashima
Dept. of Computational Intelligence & Systems Science, Tokyo Institute of Technology, Yokohama 226-8502, Japan

Abstract: We consider the learning problem of identifying a dictionary matrix D ∈ R^{M×N} from a sample set of M-dimensional vectors Y = N^{-1/2} D X ∈ R^{M×P}, where X ∈ R^{N×P} is a sparse matrix whose density of non-zero entries is 0 < ρ < 1. In particular, we focus on the minimum sample size P_c (sample complexity) necessary for the optimal learning scheme to perfectly identify D when D and X are independently generated from certain distributions. Using the replica method of statistical mechanics, we show that P_c = O(N) holds as long as α = M/N > ρ is satisfied in the limit N → ∞. Our analysis also implies that the posterior distribution given Y is condensed only at the correct dictionary D when the compression rate α is greater than a certain critical value α_M(ρ). This suggests that belief propagation may allow us to learn D with a low computational complexity using O(N) samples.

I. INTRODUCTION

The concept of sparse representations has recently attracted considerable attention from various fields in which the number of measurements is limited. Many real-world signals, such as natural images, are represented sparsely in Fourier/wavelet domains; in other words, many components vanish or are negligibly small in amplitude when the signals are represented by Fourier/wavelet bases. This empirical property is exploited in the signal recovery paradigm of compressed sensing (CS), thereby enabling the recovery of sparse signals from far fewer measurements than the number prescribed by the Nyquist-Shannon sampling theorem [1], [2], [3], [4].

In signal processing techniques that exploit sparsity, signals are generally assumed to be described as linear combinations of a few dictionary atoms. The effectiveness of this approach therefore depends heavily on the choice of the dictionary under which the objective signals appear sparse. A method for choosing an appropriate dictionary for sparse representation is dictionary learning (DL), whereby the dictionary is constructed through a learning process from an available set of P training samples [5], [6], [7], [8].

Ambiguity of the dictionary is fatal in signal/data analysis after learning. An important issue is therefore the estimation of the sample complexity, i.e., the sample size P_c necessary for correct identification of the dictionary. In a seminal work, Aharon et al. showed that when the training set Y ∈ R^{M×P} is generated by a dictionary D ∈ R^{M×N} and a sparse matrix X ∈ R^{N×P} (planted solution) as Y = DX, one can perfectly learn both if P > P_c = (k+1) C(N,k) and k is sufficiently small, where k is the number of non-zero elements in each column of X and C(N,k) is the binomial coefficient [9]. Unfortunately, this bound becomes exponentially large in N for k = O(N), which motivates us to improve the estimate. A recent study has shown that almost all dictionaries under the uniform measure are learnable with P_c = O(NM) samples when k ln M = O(√N) [10]. However, the fact that the number of unknown variables, NM + NP, and the number of known variables, MP, balance with each other at P = O(N) when M = O(N) implies the possibility of DL with O(N) training samples.

To answer this question, in this study we evaluate the sample complexity of the optimal learning scheme defined for a given probabilistic model of dictionary learning. In a previous study, the authors assessed the sample complexity of a naive learning scheme, min_{D,X} ||Y − N^{-1/2} D X||^2 subject to ||X||_0 ≤ NPρ (0 < ρ < 1), where ||A|| indicates the Frobenius norm of A, ||X||_0 is the number of non-zero elements in X, and D is enforced to be normalized appropriately. Using the replica method of statistical mechanics, they found that P_c = O(N) holds when α = M/N is greater than a certain critical value α_naive(ρ) > ρ [11]. However, the smallest possible P_c attainable for α < α_naive(ρ) has not been clarified thus far. In this study, we show that P_c = O(N) holds in the entire region α > ρ for the optimal learning scheme.
II. PROBLEM SETUP

Let us suppose the following scenario of dictionary learning. The planted solutions, an M×N dictionary matrix D ∈ R^{M×N} and an N×P sparse matrix X ∈ R^{N×P}, are independently generated from the prior distributions P(D) and P_ρ(X), respectively, where P(D) is the uniform distribution over an appropriate support and

  P_ρ(X) = ∏_{i,l} P_ρ(X_il) = ∏_{i,l} [ (1 − ρ) δ(X_il) + ρ f(X_il) ].   (1)

The rate of non-zero elements in X is given by ρ ∈ [0,1], and the distribution function f(x) does not have a finite probability mass at the origin. The set of training samples Y ∈ R^{M×P}, whose column vectors correspond to the training samples, is assumed to be given by the planted solutions as

  Y = N^{-1/2} D X,   (2)

where the factor N^{-1/2} is introduced for convenience in taking the large-system limit. A learner is required to infer D and X from Y. We impose the normalization constraint Σ_{µ=1}^{M} D_µi^2 = ||D_i||^2 = M for each column i = 1, ..., N to avoid the ambiguity of the product DX = (DA)(A^{-1}X) for diagonal matrices A with positive diagonal entries. In addition, we introduce two other constraints: i) the values of Σ_{µ=1}^{M} D_µi are set to be positive, and ii) the columns of D are arranged in descending order of the absolute value of Σ_{µ=1}^{M} D_µi, so that the ambiguities of simultaneous permutation and/or sign flips of columns in D and rows in X are removed.¹ In the following, the uniform prior P(D) is assumed to be defined on the support that satisfies all of these constraints. Our aim is to evaluate the minimum value of the sample size P required for perfectly identifying D and X.

¹ One could choose different constraints as long as the trivial ambiguities of the column order and the signs are resolved.
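To make the generative process of (1) and (2) concrete, the following minimal NumPy sketch (an illustration, not code from the original study; the Gaussian choice f = N(0, σ²) anticipates Section IV, and the column-ordering constraint ii) is omitted for brevity) draws a planted pair (D, X) and forms the training set Y = DX/√N.

```python
import numpy as np

def generate_instance(N, M, P, rho, sigma=1.0, seed=0):
    """Draw a planted D (M x N), a sparse X (N x P), and training samples Y = D X / sqrt(N)."""
    rng = np.random.default_rng(seed)

    # Dictionary: random columns rescaled to norm sqrt(M) with positive column sums,
    # a simple stand-in for the uniform prior over the constrained support.
    D = rng.standard_normal((M, N))
    D *= np.sqrt(M) / np.linalg.norm(D, axis=0, keepdims=True)
    D *= np.sign(D.sum(axis=0, keepdims=True))

    # Bernoulli-Gaussian X: each entry is non-zero with probability rho, drawn from N(0, sigma^2).
    X = (rng.random((N, P)) < rho) * rng.normal(0.0, sigma, size=(N, P))

    Y = D @ X / np.sqrt(N)
    return D, X, Y

if __name__ == "__main__":
    D, X, Y = generate_instance(N=200, M=100, P=300, rho=0.1)
    print(Y.shape, (X != 0).mean())   # (100, 300) and an empirical density close to rho
```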
III. BAYESIAN OPTIMAL LEARNING

For a mathematical formulation of our problem, let us denote the estimates of D and X yielded by an arbitrary learning scheme as D̂(Y) and X̂(Y). We evaluate the efficiency of the scheme using the mean squared errors (per element)

  MSE_D(D̂(·)) = Σ_{Y,D,X} P_ρ(D,X,Y) ||D − D̂(Y)||² / (NM),   (3)
  MSE_X(X̂(·)) = Σ_{Y,D,X} P_ρ(D,X,Y) ||X − X̂(Y)||² / (NP),   (4)

where A·B = Σ_{i,j} A_ij B_ij represents the inner product between two matrices A and B of the same dimensions and ||A||² = A·A. We impose the normalization constraint Σ_{µ=1}^{M} (D̂(Y))_µi² = M for each column index i = 1, ..., N in order to avoid the ambiguity of the product D̂(Y) X̂(Y) = (D̂(Y)A)(A^{-1} X̂(Y)) for an arbitrary invertible diagonal matrix A. The joint distribution of D, X, and Y is given by

  P_ρ(D, X, Y) = δ(Y − N^{-1/2} D X) P(D) P_ρ(X).   (5)

Perfect identification of D and X is characterized by MSE_D = MSE_X = 0. The following theorem offers a useful basis for answering our question.

Theorem 1. For an arbitrary learning scheme, (3) and (4) are bounded from below as

  MSE_D(D̂(·)) ≥ 2 ( 1 − (1/(N√M)) Σ_Y P_ρ(Y) Σ_{i=1}^{N} ||(⟨D⟩_ρ)_i|| ),   (6)
  MSE_X(X̂(·)) ≥ (1/(NP)) Σ_Y P_ρ(Y) ( ⟨X·X⟩_ρ − ⟨X⟩_ρ·⟨X⟩_ρ ),   (7)

where P_ρ(Y) = Σ_{D,X} P_ρ(D,X,Y) and ⟨·⟩_ρ denotes the average over D and X according to the posterior distribution of D and X under a given Y, P_ρ(D,X|Y) = P_ρ(D,X,Y)/P_ρ(Y). The equalities hold when the estimates satisfy

  (D̂_opt(Y))_i = √M (⟨D⟩_ρ)_i / ||(⟨D⟩_ρ)_i||,   X̂_opt(Y) = ⟨X⟩_ρ,   (8)

where (A)_i denotes the i-th column vector of matrix A. We refer to (8) as the Bayesian optimal learning scheme [12].

Proof: By applying the Cauchy-Schwarz inequality and the minimization of a quadratic function to MSE_D and MSE_X, respectively, one obtains (6)–(8) after inserting the expression Σ_{D,X} x P_ρ(D,X,Y) = P_ρ(Y) Σ_{D,X} x P_ρ(D,X|Y) = P_ρ(Y) ⟨x⟩_ρ, for x = D and X, into (3) and (4).

This theorem guarantees that when the setup of dictionary learning is characterized by P(D) and P_ρ(X), the estimates of (8) offer the best possible learning performance in the sense that (3) and (4) are minimized. As perfect identification of D and X is characterized by MSE_D = MSE_X = 0, our purpose is fulfilled by analyzing the performance of the Bayesian optimal learning scheme of (8).
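Equation (8) prescribes posterior means, with the dictionary estimate rescaled column-wise to norm √M. The sketch below illustrates how these estimates would be assembled from posterior samples of (D, X) given Y; how such samples are obtained (e.g., by some Monte Carlo scheme) is left unspecified and is an assumption of the illustration rather than part of the analysis.

```python
import numpy as np

def bayes_optimal_estimates(D_samples, X_samples):
    """Form the estimates of Eq. (8) from posterior samples of (D, X) given Y.

    D_samples: shape (S, M, N); X_samples: shape (S, N, P); S = number of posterior samples.
    """
    M = D_samples.shape[1]

    D_mean = D_samples.mean(axis=0)      # approximates <D>_rho
    X_hat = X_samples.mean(axis=0)       # approximates hat(X)_opt = <X>_rho

    # Rescale each column of <D>_rho to norm sqrt(M), as required by Eq. (8).
    col_norms = np.linalg.norm(D_mean, axis=0, keepdims=True)
    D_hat = np.sqrt(M) * D_mean / col_norms
    return D_hat, X_hat
```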
IV. ANALYSIS

For simplicity of calculation, let us set f(X_il) as the Gaussian distribution with mean 0 and variance σ²; σ² is set to unity in all numerical calculations reported below. For generality, we consider cases in which the sparsity assumed by the learner, denoted by θ, can differ from the actual value ρ. When θ ≠ ρ, the estimates are given by (D̂(Y))_i = √M (⟨D⟩_θ)_i / ||(⟨D⟩_θ)_i|| and X̂(Y) = ⟨X⟩_θ instead of (8). To evaluate MSE_D and MSE_X, we need to evaluate the macroscopic quantities

  q_D = (NM)^{-1} [ ⟨D⟩_θ·⟨D⟩_θ ]_Y,   m_D = (NM)^{-1} [ ⟨D⟩_θ·⟨D⟩_ρ ]_Y,   (9)
  Q_X = (NP)^{-1} [ ⟨X·X⟩_θ ]_Y,   (10)
  q_X = (NP)^{-1} [ ⟨X⟩_θ·⟨X⟩_θ ]_Y,   m_X = (NP)^{-1} [ ⟨X⟩_θ·⟨X⟩_ρ ]_Y,   (11)

where [·]_Y = Σ_Y P_ρ(Y)(·). Note that (9)–(11) yield MSE_D = 2(1 − m_D/√q_D) and MSE_X = ρσ² + q_X − 2 m_X.²

² Naive computation requires us to assess a column-wise overlap C_{D,i} = M^{-1/2} [ (⟨D⟩_ρ)_i·(⟨D⟩_θ)_i / √((⟨D⟩_θ)_i·(⟨D⟩_θ)_i) ]_Y for each column index i = 1, ..., N. However, the law of large numbers and statistical uniformity allow the simplification C_{D,i} → m_D/√q_D as N = M/α tends to infinity.

Unfortunately, evaluating these quantities is intrinsically difficult because it generally requires averaging, with respect to Y, the quantity

  Σ_{D¹,X¹,D²,X²} P_θ(Y,D¹,X¹) P_θ(Y,D²,X²) (D¹·D²) / Σ_{D¹,X¹,D²,X²} P_θ(Y,D¹,X¹) P_θ(Y,D²,X²)   ( = ⟨D⟩_θ·⟨D⟩_θ ),   (12)

which includes summations over exponentially many terms in the denominator. One promising approach for avoiding this difficulty is to multiply P_θ^n(Y) = ( Σ_{D,X} P_θ(Y,D,X) )^n (n = 1, 2, 3, ...) inside the operation [·]_Y so as to cancel the denominator of (12), which makes the evaluation of the modified average

  q_D(n) = [ P_θ^n(Y) ⟨D⟩_θ·⟨D⟩_θ ]_Y / ( NM [ P_θ^n(Y) ]_Y )   (13)

feasible via a saddle-point assessment of [P_θ^n(Y)]_Y for N, M, P → ∞, keeping α = M/N and γ = P/N of O(1). Furthermore, the resulting expression is likely to hold for n ∈ R as well. Therefore, we evaluate q_D using the formula q_D = lim_{n→0} q_D(n) with this expression, and similarly for m_D, Q_X, q_X, and m_X. This procedure is often termed the replica method [13], [14].

Under the replica symmetric ansatz, which assumes that the dominant saddle point in the evaluation is invariant under any permutation of the replica indices a = 1, 2, ..., n, the assessment reduces to evaluating the extremum of a free entropy (density) function φ of the order parameters (Q_X, q_X, m_X, q_D, m_D) and their conjugates (Q̂_X, q̂_X, m̂_X, Q̂_D, q̂_D, m̂_D) (Eq. (14)). The X-dependent part of φ enters through an effective single-variable partition function Ξ_X with σ̂² = 1 + (Q̂_X + q̂_X)σ² (Eq. (15)), where ⟨·⟩ denotes the average over x and z, which are distributed according to P_ρ(x) and a Gaussian distribution with zero mean and unit variance, respectively. The extremized value of φ, φ*, is related to the average log-likelihood (density) of Y as N^{-1} Σ_Y P_ρ(Y) ln P_θ(Y) = lim_{n→0} (∂/∂n) { N^{-1} ln [P_θ^n(Y)]_Y } = φ* + constant.³

³ When multiple extrema exist, the maximum value among them should be chosen as long as no consistency condition is violated.

[Fig. 1: γ-dependence of (a) MSE_D and (b) MSE_X at α = 0.5, ρ = 0.1, for θ = ρ, 0.8ρ, and 0.5ρ. The plots show the optimality of the correct parameter choice θ = ρ; broken curves represent locally unstable branches, which are thermodynamically irrelevant.]

In Fig. 1, (a) MSE_D and (b) MSE_X for θ = ρ = 0.1 are plotted versus γ, together with those for θ = 0.8ρ and θ = 0.5ρ. At θ = ρ, the MSE_D and MSE_X of the thermodynamically relevant branches take the minimum values over the entire γ region, while the branch of solutions characterized by MSE_D = MSE_X = 0 is shared by the three parameter sets. This supports the optimality of the correct parameter choice θ = ρ, and we therefore hereafter focus our analysis on this case to estimate the minimum value of γ for perfect learning, MSE_D = MSE_X = 0.

At θ = ρ, the relationships m_D = q_D, m_X = q_X, and Q_X = ρσ² hold from (9)–(11), and the extremum problem reduces to a pair of coupled equations for q_D and q_X (Eq. (16)), in which the conjugate variables q̂_X and q̂_D are expressed in terms of q_D and q_X (Eq. (17)); the remaining variables are given by Q̂_D = Q̂_X = 0, m̂_X = q̂_X, and m̂_D = q̂_D.
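The averages ⟨·⟩ over x ∼ P_ρ(x) and z ∼ N(0, 1) entering the replica-symmetric expressions such as (15) and (16) reduce to low-dimensional integrals that can be evaluated numerically. The sketch below is a generic utility for such Bernoulli-Gaussian averages based on Gauss-Hermite quadrature; it illustrates the kind of numerical evaluation involved and is not code from the original study.

```python
import numpy as np
from numpy.polynomial.hermite_e import hermegauss

def bernoulli_gauss_average(h, rho, sigma=1.0, n_nodes=40):
    """Evaluate E[h(x, z)] for x ~ (1 - rho) delta(x) + rho N(0, sigma^2) and z ~ N(0, 1)."""
    z_nodes, z_w = hermegauss(n_nodes)
    z_w = z_w / np.sqrt(2.0 * np.pi)      # normalize weights to the N(0, 1) measure

    x_nodes, x_w = hermegauss(n_nodes)
    x_nodes = sigma * x_nodes             # rescale nodes to N(0, sigma^2)
    x_w = x_w / np.sqrt(2.0 * np.pi)

    # x = 0 with probability (1 - rho); Gaussian "slab" with probability rho.
    zero_part = (1.0 - rho) * np.sum(z_w * h(0.0, z_nodes))
    Z, X = np.meshgrid(z_nodes, x_nodes, indexing="ij")
    slab_part = rho * np.sum(np.outer(z_w, x_w) * h(X, Z))
    return zero_part + slab_part

if __name__ == "__main__":
    # Sanity checks: E[x^2] = rho * sigma^2 and E[z^2] = 1.
    print(bernoulli_gauss_average(lambda x, z: x**2, rho=0.1))   # ~0.1
    print(bernoulli_gauss_average(lambda x, z: z**2, rho=0.1))   # ~1.0
```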
V. RESULTS

A. Actual solutions

[Fig. 2: γ-dependence of q_D (left axis) and q_X (right axis) for α = 0.5 and ρ = 0.1; vertical lines mark γ_S, γ_F, and γ_M.]

Fig. 2 plots q_D and q_X versus γ for α = 0.5 and ρ = 0.1. As shown in the figure, the solutions of q_D and q_X given by (16) are classified into three types: q_D = 1, q_X = ρσ²; q_D = q_X = 0; and 0 < q_D < 1, 0 < q_X < ρσ². The first yields MSE_D = MSE_X = 0, indicating correct identification of D and X, and hence we name it the success solution. The second is referred to as the failure solution because it yields MSE_D = 2 and MSE_X = ρσ², which indicates complete failure of the learning of D and X. The third yields finite MSE_D and MSE_X, 0 < MSE_D < 2 and 0 < MSE_X < ρσ², and we term it the middle solution.

1) Success solution: When the expression

  δ( Y − N^{-1/2} D X ) = lim_{τ→0+} (2πτ)^{-MP/2} exp( − ||Y − N^{-1/2} D X||² / (2τ) )   (18)

is used, the success solution of q_D and q_X behaves as (ρσ² − q_X)/τ = χ_X and (1 − q_D)/τ = χ_D, while q̂_X and q̂_D scale as q̂_X = θ̂_X/τ and q̂_D = θ̂_D/τ. Substituting these into the equations of q_D and q_X gives

  χ_X = ργ/g,   χ_D = αρσ²/g,   θ̂_X = ρ/χ_X,   θ̂_D = 1/χ_D,   (19)

where g = (α − ρ)γ − α. Since χ_X and χ_D must be positive by definition, the success solution exists for

  γ > α/(α − ρ) ≡ γ_S   (20)

only when α > ρ.
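To make the threshold (20) concrete, the following short calculation (an illustrative sketch) evaluates γ_S = α/(α − ρ) and the corresponding sample complexity P_c = γ_S N.

```python
def gamma_success(alpha, rho):
    """Minimum sample ratio gamma_S = alpha / (alpha - rho) of Eq. (20); requires alpha > rho."""
    if alpha <= rho:
        raise ValueError("the success solution requires alpha > rho")
    return alpha / (alpha - rho)

def sample_complexity(N, alpha, rho):
    """Sample complexity P_c = gamma_S * N of the Bayesian optimal learning scheme."""
    return gamma_success(alpha, rho) * N

if __name__ == "__main__":
    print(gamma_success(0.5, 0.1))            # 1.25, the value for alpha = 0.5, rho = 0.1
    print(sample_complexity(1000, 0.5, 0.1))  # 1250.0, i.e. P_c = O(N)
```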
2) Failure solution: The failure solution q_D = q_X = 0 appears for γ < γ_F as a locally stable solution. When q_D and q_X are sufficiently small, they are expressed as

  q_X = αρσ² q_D + O(q²),   q_D = γ q_X / (ρσ²) + O(q²),   (21)

where O(q²) denotes terms of second and higher order with respect to q_D and q_X. These expressions indicate that when

  γ > 1/α ≡ γ_F,   (22)

the local stability of q_D = q_X = 0 is lost. As shown in Fig. 2, the failure solution vanishes at γ_F = 2.0 for α = 0.5.

3) Middle solution: We define γ_M as the value of γ above which the middle solution with 0 < q_D < 1 and 0 < q_X < ρσ² disappears; it is marked by a vertical line in Fig. 2 and is determined numerically for the parameter choice (α, ρ) = (0.5, 0.1). The value of γ_M depends on (α, ρ), as shown in Fig. 3, which indicates that γ_M diverges at a certain density ρ_M for α = 0.5. The relation between ρ_M and α, denoted by ρ_M(α) (or α_M(ρ)), generally accords with the critical condition at which belief propagation (BP)-based signal recovery using the correct prior starts to involve multiple fixed points in the signal reconstruction problem of compressed sensing [16], in which the correct dictionary D is provided in advance.

[Fig. 3: γ_M versus ρ for α = 0.5.]

[Fig. 4: γ-dependence of φ_S, φ_F, and φ_M for α = 0.5 and ρ = 0.1; φ_S diverges positively for γ > γ_S = α/(α − ρ) = 1.25.]

BP is also a potential algorithm for practically achieving the learning performance predicted by the current analysis, because the macroscopic behavior theoretically analyzed by the replica method is known to be confirmed experimentally on single instances by BP for many other systems [16], [17], [18]. The fact that only the success solution exists for γ > γ_M implies that one may be able to perfectly identify the correct dictionary D with a computational cost of polynomial order in N by utilizing BP, without being trapped by other locally stable solutions, for α > α_M(ρ).
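As a small numerical illustration of the instability threshold (22), the following sketch (assuming the linearized relations (21) with σ² = 1) iterates the linearized recursion and shows a perturbation of the failure solution shrinking for γ < 1/α and growing for γ > 1/α.

```python
def failure_locally_stable(alpha, gamma, rho=0.1, sigma2=1.0, steps=200, eps=1e-6):
    """Iterate the linearized relations of Eq. (21):
    q_X <- alpha * rho * sigma2 * q_D,  q_D <- gamma * q_X / (rho * sigma2).
    A small perturbation contracts (failure solution locally stable) iff alpha * gamma < 1."""
    q_D = eps
    for _ in range(steps):
        q_X = alpha * rho * sigma2 * q_D
        q_D = gamma * q_X / (rho * sigma2)
    return q_D < eps

if __name__ == "__main__":
    alpha = 0.5                                       # gamma_F = 1 / alpha = 2.0
    print(failure_locally_stable(alpha, gamma=1.5))   # True: below gamma_F
    print(failure_locally_stable(alpha, gamma=2.5))   # False: above gamma_F, stability is lost
```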
B. Free entropy density

There are three extrema of the free entropy (density), φ_S, φ_F, and φ_M, corresponding to the success, failure, and middle solutions, respectively. Among them, the thermodynamically dominant solution that provides the correct evaluations of q_D and q_X is the one for which the value of the free entropy is the largest. Fig. 4 plots φ_S, φ_F, and φ_M versus γ for α = 0.5 and ρ = 0.1, where γ_S = 1.25 and γ_F = 2.0. The functional forms of φ_S and φ_F can be written in closed form (Eqs. (23) and (24), involving the binary entropy H(ρ) = −(1 − ρ)log(1 − ρ) − ρ log ρ); the essential feature is that φ_S contains a term proportional to g ln(1/τ), where τ → 0+ originates from the expression (18) and g = (α − ρ)γ − α, whereas φ_F remains finite. Hence φ_S diverges positively for g = (α − ρ)γ − α > 0, which guarantees that the success solution is always thermodynamically dominant for γ > γ_S = α/(α − ρ), as the φ of the other solutions stays finite. This leads to the conclusion that the sample complexity of the Bayesian optimal learning is P_c = γ_S N = αN/(α − ρ), which is guaranteed to be O(N) as long as α > ρ. This is the main consequence of the present study.

[Fig. 5: Phase diagram in the α–ρ plane with regions (I), (II), and (III); the dashed curve in region (I) is the result of [11].]

Fig. 5 plots the phase diagram in the α–ρ plane. The union of regions (I) and (II) represents the condition under which the sample complexity P_c is O(N), while the full curve forming the upper boundary of (II) denotes α_M(ρ), above which BP is expected to work as an efficient learning algorithm. Dictionary learning is impossible in region (III). The critical condition α_naive(ρ), above which the naive learning scheme of [11] can perfectly identify the planted solution from O(N) samples, is drawn as the dashed curve for comparison. The considerable difference between α_naive(ρ) and ρ (or even α_M(ρ)) indicates the significance of using adequate knowledge of the probabilistic model in dictionary learning.

VI. SUMMARY

In summary, we assessed the minimum sample size required for perfectly identifying a planted solution in dictionary learning (DL). For this assessment, we derived the optimal learning scheme defined for a given probabilistic model of DL following the framework of Bayesian inference. Unfortunately, actually evaluating the performance of the Bayesian optimal learning scheme involves an intrinsic technical difficulty. To resolve this difficulty, we resorted to the replica method of statistical mechanics and showed that the sample complexity can be reduced to O(N) as long as the compression rate α is greater than the density ρ of non-zero elements of the sparse matrix. This indicates that the performance of the naive learning scheme examined in a previous study [11] can be improved significantly by utilizing adequate knowledge of the probabilistic model in DL. It was also shown that when α is greater than a certain critical value α_M(ρ), the macroscopic state corresponding to perfect identification of the planted solution becomes the unique candidate for the thermodynamically dominant state. This suggests that one may be able to learn the planted solution with a computational complexity of polynomial order in N by utilizing belief propagation for α > α_M(ρ).

Note added: After completing this study, the authors became aware that [19] presents results similar to those presented in this paper, where an algorithm for dictionary learning/calibration is independently developed on the basis of belief propagation.

ACKNOWLEDGMENT

This work was partially supported by a Grant-in-Aid for JSPS Fellows (AS), KAKENHI grants (YK), and the JSPS Core-to-Core Program "Non-equilibrium dynamics of soft matter and information."

REFERENCES

[1] J.-L. Starck, F. Murtagh, and J. M. Fadili, Sparse Image and Signal Processing: Wavelets, Curvelets, Morphological Diversity, Cambridge Univ. Press, New York, 2010.
[2] H. Nyquist, "Certain topics in telegraph transmission theory," Trans. AIEE, vol. 47, pp. 617-644, 1928.
[3] D. L. Donoho, "Compressed sensing," IEEE Trans. Inform. Theory, vol. 52, no. 4, pp. 1289-1306, 2006.
[4] E. J. Candès and T. Tao, "Decoding by linear programming," IEEE Trans. Inform. Theory, vol. 51, no. 12, pp. 4203-4215, 2005.
[5] B. A. Olshausen and D. J. Field, "Sparse coding with an overcomplete basis set: A strategy employed by V1?," Vision Res., vol. 37, no. 23, pp. 3311-3325, 1997.
[6] R. Rubinstein, A. M. Bruckstein, and M. Elad, "Dictionaries for sparse representation modeling," Proc. IEEE, vol. 98, no. 6, 2010.
[7] M. Elad, Sparse and Redundant Representations: From Theory to Applications in Signal and Image Processing, Springer, New York, 2010.
[8] S. Gleichman and Y. C. Eldar, "Blind compressed sensing," IEEE Trans. Inform. Theory, vol. 57, no. 10, pp. 6958-6975, 2011.
[9] M. Aharon, M. Elad, and A. M. Bruckstein, "On the uniqueness of overcomplete dictionaries, and a practical way to retrieve them," Linear Algebra Appl., vol. 416, no. 1, pp. 48-67, 2006.
[10] D. Vainsencher, S. Mannor, and A. M. Bruckstein, "The sample complexity of dictionary learning," J. Mach. Learn. Res., vol. 12, 2011.
[11] A. Sakata and Y. Kabashima, "Statistical mechanics of dictionary learning," arXiv preprint.
[12] Y. Iba, "The Nishimori line and Bayesian statistics," J. Phys. A: Math. Gen., vol. 32, no. 21, 1999.
[13] M. Mézard, G. Parisi, and M. A. Virasoro, Spin Glass Theory and Beyond, World Scientific, Singapore, 1987.
[14] H. Nishimori, Statistical Physics of Spin Glasses and Information Processing: An Introduction, Oxford Univ. Press, Oxford, 2001.
[15] M. Mézard and A. Montanari, Information, Physics, and Computation, Oxford Univ. Press, Oxford, UK, 2009.
[16] F. Krzakala, M. Mézard, F. Sausset, Y. F. Sun, and L. Zdeborová, "Statistical-physics-based reconstruction in compressed sensing," Phys. Rev. X, vol. 2, 021005, 2012.
[17] D. J. Thouless, P. W. Anderson, and R. G. Palmer, "Solution of 'Solvable model of a spin glass'," Phil. Mag., vol. 35, no. 3, pp. 593-601, 1977.
[18] Y. Kabashima, "A CDMA multiuser detection algorithm on the basis of belief propagation," J. Phys. A: Math. Gen., vol. 36, no. 43, 2003.
[19] F. Krzakala, M. Mézard, and L. Zdeborová, "Phase diagram and approximate message passing for blind calibration and dictionary learning," preprint received directly from the authors via private communication.