On convergence of Approximate Message Passing


1 On convergence of Approximate Message Passing
Francesco Caltagirone (1), Florent Krzakala (2) and Lenka Zdeborová (1)
(1) Institut de Physique Théorique, CEA Saclay; (2) LPS, École Normale Supérieure, Paris

2 Compressed Sensing
y = F x + ξ
The signal x is an N-component vector; only K < N components are non-zero. The measurement y is an M < N component vector. We define ρ = K/N and α = M/N. F is an M×N random matrix with i.i.d. elements; ξ is white noise with variance ⟨ξ²⟩ = Δ.
GIVEN y and F, RECONSTRUCT x.

3 Standard techniques
Minimization of the ℓ0 norm under the linear constraint: min_x ||x||_0 with F x = y, where ||x||_0 = number of non-zero elements. A non-convex norm, exponentially hard to minimize. (Candès, Tao, Donoho)
Minimization of the ℓ1 norm: ||x||_1 = \sum_{i=1}^N |x_i|. A convex norm, easy to minimize, and the ℓ1 norm well approximates the ℓ0 norm.
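To make the ℓ1 approach concrete, here is a minimal sketch of the standard linear-programming formulation of min ||x||_1 subject to Fx = y (splitting |x_i| <= t_i), using SciPy; the function name and solver choice are illustrative, not part of the original slides.

```python
import numpy as np
from scipy.optimize import linprog

def l1_reconstruct(F, y):
    """min sum_i |x_i|  s.t.  F x = y, written as an LP in (x, t) with |x_i| <= t_i."""
    M, N = F.shape
    c = np.concatenate([np.zeros(N), np.ones(N)])    # objective: sum of the slack variables t
    A_eq = np.hstack([F, np.zeros((M, N))])          # equality constraint F x = y
    I = np.eye(N)
    A_ub = np.vstack([np.hstack([I, -I]),            #  x_i - t_i <= 0
                      np.hstack([-I, -I])])          # -x_i - t_i <= 0
    res = linprog(c, A_ub=A_ub, b_ub=np.zeros(2 * N), A_eq=A_eq, b_eq=y,
                  bounds=[(None, None)] * N + [(0, None)] * N, method="highs")
    return res.x[:N]                                 # the reconstructed signal
```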

4 The Donoho-Tanner line
y = F x + ξ. For α = 1 the matrix is square and can be inverted; for α < 1 the system is under-determined. Here ρ_0 = K/N and α = M/N.
Consider the noiseless case with a measurement matrix whose i.i.d. elements are Gaussian with zero mean and variance of order 1/N.
[Figure: phase diagram in the (ρ_0, α) plane showing the Donoho-Tanner line for ℓ1 minimization, the Bayesian AMP line α_EM-BP(ρ_0), s-BP results at N = 10^3 and 10^4, and the information-theoretic limit α = ρ_0.]

5 Setting and motivation
Bayesian setting. GOAL: reconstruct the signal, given the measurement vector, the measurement matrix, and prior knowledge of the (sparse) distribution of the signal elements.
Approximate Message Passing, Donoho, Maleki, Montanari (2009): a powerful algorithm, but with convergence issues.

6 Setting and motivation
y = F x + ξ, with matrix elements F_{μi} = γ/N + N(0,1)/√N and prior P(x) = (1-ρ) δ(x) + ρ N(0,1).
This is the simplest case in which Approximate Message Passing (AMP) has convergence problems: if the mean γ is sufficiently large, AMP displays violent divergences. Divergences of this kind are observed in many other cases and are the main obstacle to a wide use of AMP. In this simple case there are workarounds that ensure convergence, such as a mean-removal procedure, BUT the case is interesting because we want to understand the origin of the non-convergence, which, we argue, is of the same nature in more complicated settings.
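As a concrete illustration of this setting, the sketch below generates a toy instance with a Gauss-Bernoulli signal and a measurement matrix with mean γ/N and variance 1/N; the parameter values (and the name `gamma` for the mean parameter) are illustrative choices, not taken from the slides.

```python
import numpy as np

# Illustrative parameters: signal size, measurement ratio, sparsity,
# noise variance and matrix mean (all assumed for this toy example).
N, alpha, rho, Delta, gamma = 2000, 0.6, 0.2, 1e-8, 0.5
M = int(alpha * N)

rng = np.random.default_rng(0)
x = rng.standard_normal(N) * (rng.random(N) < rho)        # Gauss-Bernoulli signal
F = gamma / N + rng.standard_normal((M, N)) / np.sqrt(N)  # i.i.d. entries, mean gamma/N, variance 1/N
y = F @ x + np.sqrt(Delta) * rng.standard_normal(M)       # noisy linear measurements
```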

7 Bayesian Inference with Belief Propagation
Bayes formula: P(x|F, y) = P(x|F) P(y|F, x) / P(y|F).
Conditional probability of the measurement vector:
P(y|F, x) = \prod_{μ=1}^M \frac{1}{\sqrt{2πΔ_μ}} e^{-\frac{1}{2Δ_μ}(y_μ - \sum_{i=1}^N F_{μi} x_i)^2}
Posterior:
P(x|F, y) = \frac{1}{Z(y, F)} \prod_{i=1}^N [(1-ρ) δ(x_i) + ρ φ(x_i)] \prod_{μ=1}^M \frac{1}{\sqrt{2πΔ_μ}} e^{-\frac{1}{2Δ_μ}(y_μ - \sum_{i=1}^N F_{μi} x_i)^2}
MMSE estimator: x_i^* = \int dx_i x_i ν_i(x_i), where ν_i(x_i) = \int P(x|F, y) \prod_{j≠i} dx_j is the marginal; the error is E = \sum_{i=1}^N (x_i^* - s_i)^2 / N.
Computing the marginals exactly takes exponential time: unfeasible.

8 Bayes optimal setting
If we know exactly the prior distribution of the signal elements and of the noise, we are in the so-called BAYES OPTIMAL setting. In the following we assume this is the case. When it is not, the prior can be learned efficiently by adding a step to the algorithm presented below (not discussed in this talk).

9 Belief Propagation (cavity method)
Two kinds of nodes: factors (matrix rows) and variables (signal elements); messages m_{i→μ}(x_i) and m_{μ→i}(x_i) travel along the edges of the factor graph defined by F. We can introduce a third kind of node for the prior distribution P(x) on the signal elements, acting as a local field.
Belief propagation works for locally tree-like graphs or for densely and weakly connected graphs. Messages represent an approximation to the marginal distribution of a variable and are updated according to a sequential or parallel schedule until convergence (fixed point).

10 Belief Propagation, r-BP and AMP
BP: O(N²) continuous messages m_{i→μ}(x_i), m_{μ→i}(x_i).
Projection of the messages onto means and variances (r-BP): O(N²) numbers.
Dense-matrix, TAP-like simplification: AMP, O(N²) operations per iteration. (Donoho, Maleki, Montanari 2009; Krzakala et al. 2012)
For the last step one assumes a parallel update; in this case fast matrix multiplication algorithms can be applied, reducing the complexity to N log N.

11 AMP Algorithm
V_μ^{t+1} = \sum_i F_{μi}^2 v_i^t    (1)
ω_μ^{t+1} = \sum_i F_{μi} a_i^t - \frac{y_μ - ω_μ^t}{Δ + V_μ^t} \sum_i F_{μi}^2 v_i^t    (2)
(Σ_i^{t+1})^2 = [ \sum_μ \frac{F_{μi}^2}{Δ + V_μ^{t+1}} ]^{-1}    (3)
R_i^{t+1} = a_i^t + \frac{\sum_μ F_{μi} (y_μ - ω_μ^{t+1}) / (Δ + V_μ^{t+1})}{\sum_μ F_{μi}^2 / (Δ + V_μ^{t+1})}    (4)
a_i^{t+1} = f_1( (Σ_i^{t+1})^2, R_i^{t+1} )    (5)
v_i^{t+1} = f_2( (Σ_i^{t+1})^2, R_i^{t+1} )    (6)
The performance of the algorithm is evaluated through E^t = \frac{1}{N} \sum_{i=1}^N (s_i - a_i^t)^2 and V^t = \frac{1}{N} \sum_{i=1}^N v_i^t.
Here f_k(Σ², R) is the k-th connected cumulant with respect to the measure Q(x) = \frac{1}{Z(Σ², R)} P(x) \frac{e^{-(x-R)^2/(2Σ²)}}{\sqrt{2π} Σ}; a_i and v_i are the AMP estimates of the mean and variance of the i-th signal component.

12 AMP Algorithm (continued)
The same AMP equations (1)-(6) as above, now with F_{μi} = γ/N + N(0,1)/√N: the AMP algorithm does NOT depend explicitly on the value of the mean of the matrix elements.
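A minimal NumPy sketch of the parallel-update AMP iteration (1)-(6) with the Gauss-Bernoulli prior above, assuming a homogeneous noise variance Δ_μ = Δ; f_1 and f_2 are the posterior mean and variance under the measure Q(x). Function names and the initialization are our choices, not part of the original slides.

```python
import numpy as np

def gauss_bernoulli_denoiser(Sigma2, R, rho):
    """f_1, f_2: mean and variance of x under Q(x) ~ [(1-rho) delta(x) + rho N(x;0,1)] exp(-(x-R)^2/(2 Sigma2))."""
    m = R / (1.0 + Sigma2)                    # mean of the non-zero (Gaussian) component of the posterior
    s2 = Sigma2 / (1.0 + Sigma2)              # variance of that component
    w0 = (1 - rho) * np.exp(-R**2 / (2 * Sigma2)) / np.sqrt(2 * np.pi * Sigma2)
    w1 = rho * np.exp(-R**2 / (2 * (1 + Sigma2))) / np.sqrt(2 * np.pi * (1 + Sigma2))
    p1 = w1 / (w0 + w1)                       # posterior probability that the component is non-zero
    f1 = p1 * m
    f2 = p1 * (s2 + m**2) - f1**2
    return f1, f2

def amp(F, y, rho, Delta, n_iter=200):
    """Plain parallel-update AMP, following equations (1)-(6)."""
    M, N = F.shape
    F2 = F**2
    a, v = np.zeros(N), rho * np.ones(N)      # initial estimates: prior mean and variance
    omega, V = y.copy(), F2 @ v               # so that the Onsager term vanishes at t = 0
    for t in range(n_iter):
        V_new = F2 @ v                                            # (1)
        omega = F @ a - (y - omega) / (Delta + V) * V_new         # (2)
        V = V_new
        Sigma2 = 1.0 / (F2.T @ (1.0 / (Delta + V)))               # (3)
        R = a + Sigma2 * (F.T @ ((y - omega) / (Delta + V)))      # (4)
        a, v = gauss_bernoulli_denoiser(Sigma2, R, rho)           # (5), (6)
    return a, v
```

On an instance generated as in the earlier sketch, tracking E^t = mean((x - a)^2) across iterations makes the (non-)convergence discussed below directly visible.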

13 Convergence
We study convergence in the Bayes optimal case, at a given (sufficiently high) measurement ratio, with very small or zero noise.

14 State Evolution (infinite N)
Bayati, Montanari '11 (rigorous, zero-mean case); Krzakala et al. '12 (replicas, zero-mean case); Caltagirone, Krzakala, Zdeborová '14 (replicas, non-zero-mean case).
State evolution is the asymptotic analysis of the average performance of the inference algorithm when the size of the signal goes to infinity. It gives a good indication of what happens in a practical situation if the size of the signal is sufficiently large. It can be obtained rigorously in simple cases and non-rigorously, with the replica method, in more involved cases.

15 State Evolution (infinite N), continued
Order parameters: E^t = \frac{1}{N}\sum_{i=1}^N (s_i - a_i^t)^2,  V^t = \frac{1}{N}\sum_i v_i^t,  D^t = \frac{1}{N}\sum_j (s_j - a_j^t).
V^{t+1} = \int ds P(s) \int Dz  f_2( \frac{Δ + V^t}{α},  s + z A(E^t, D^t) + γ² D^t )
E^{t+1} = \int ds P(s) \int Dz  [ s - f_1( \frac{Δ + V^t}{α},  s + z A(E^t, D^t) + γ² D^t ) ]^2
D^{t+1} = \int ds P(s) \int Dz  [ s - f_1( \frac{Δ + V^t}{α},  s + z A(E^t, D^t) + γ² D^t ) ]
with A(E^t, D^t) = \sqrt{ \frac{Δ + E^t + γ² (D^t)^2}{α} }.
If the mean γ is zero, the state evolution does not depend on D.
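For the zero-mean case, where D^t = 0 and V^t = E^t on the Nishimori line, the state evolution reduces to a single scalar recursion that can be evaluated by Monte Carlo. A minimal sketch, reusing the gauss_bernoulli_denoiser from the AMP sketch above (our naming, not from the slides):

```python
import numpy as np

def state_evolution_zero_mean(alpha, rho, Delta, n_iter=50, n_samples=200_000, seed=0):
    """Bayes-optimal state evolution with a zero-mean matrix: track E^t only,
    with Sigma^2 = (Delta + E^t)/alpha and effective field R = s + z*sqrt(Sigma^2)."""
    rng = np.random.default_rng(seed)
    s = rng.standard_normal(n_samples) * (rng.random(n_samples) < rho)  # s drawn from the prior P(s)
    z = rng.standard_normal(n_samples)                                  # Gaussian field
    E = rho                                                             # MSE of the initial estimate a = 0
    mse = [E]
    for t in range(n_iter):
        Sigma2 = (Delta + E) / alpha
        f1, _ = gauss_bernoulli_denoiser(Sigma2, s + np.sqrt(Sigma2) * z, rho)
        E = np.mean((s - f1)**2)
        mse.append(E)
    return mse
```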

16 The Nishimori Condition
In the Bayes optimal setting: E^t = V^t, D^t = 0  implies  E^{t+1} = V^{t+1}, D^{t+1} = 0.
Therefore, analytically, if the evolution starts (exactly) on the Nishimori line (NL) it stays on it until convergence. BUT what is the effect of small perturbations with respect to the NL? Very small fluctuations due to numerical precision in the DE; fluctuations due to finite size in the AMP algorithm.

17 Zero-mean case
[Figures: evolution of E, V, D for a Gaussian signal with Gaussian inference (ρ = 0.2, no spinodal) and for a binary signal with Gaussian inference (ρ = 0.15, no spinodal): convergence on the NL (Bayati, Montanari).]
The non-zero mean adds a third dimension to the phase space!

18 Stability Analysis (I)
Define K = V - E, the deviation from the Nishimori line, so that the phase space is (V, K, D). The state evolution can be written as
K^{t+1} = f_K(V^t, K^t, D^t),   D^{t+1} = f_D(V^t, K^t, D^t).
The NL is a fixed line: (K = 0, D = 0).

19 Stability Analysis (II)
We linearize the equations around the NL with small perturbations δK^t, δD^t:
(δK^{t+1}, δD^{t+1})^T = M (δK^t, δD^t)^T,
with M_{11} = ∂_K f_K(V^t, 0, 0), M_{12} = ∂_D f_K(V^t, 0, 0), M_{21} = ∂_K f_D(V^t, 0, 0), M_{22} = ∂_D f_D(V^t, 0, 0).

20 Stability Analysis (II), continued
When the signal is Gauss-Bernoulli with zero mean, the off-diagonal terms vanish and the stability matrix is diagonal:
M = diag( ∂_K f_K(V^t, 0, 0),  ∂_D f_D(V^t, 0, 0) ).
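The stability matrix M can also be estimated numerically, without the analytic derivatives, by finite differences of the state-evolution map around the Nishimori line. In this sketch `se_map(V, K, D)` is a placeholder for a user-supplied function returning (K^{t+1}, D^{t+1}) = (f_K, f_D)(V, K, D); the name is ours.

```python
import numpy as np

def nl_stability_matrix(se_map, V, eps=1e-6):
    """Finite-difference Jacobian of (K, D) -> (f_K(V, K, D), f_D(V, K, D)) at the
    Nishimori line K = D = 0, and its eigenvalues."""
    M = np.zeros((2, 2))
    for j, (dK, dD) in enumerate([(eps, 0.0), (0.0, eps)]):
        Kp, Dp = se_map(V,  dK,  dD)     # forward perturbation along direction j
        Km, Dm = se_map(V, -dK, -dD)     # backward perturbation
        M[0, j] = (Kp - Km) / (2 * eps)
        M[1, j] = (Dp - Dm) / (2 * eps)
    return M, np.linalg.eigvals(M)
```

Evaluating the eigenvalues along the Nishimori trajectory V^t reproduces the stability diagnosis described on the next slide.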

21 Stability Analysis
Along the NL the diagonal elements can be computed explicitly. The eigenvalue associated with the perturbation D is
∂_D f_D(V^t) = - \frac{γ² α}{Δ + V^t} \int ds P(s) \int Dz f_2( A², s + zA ) = - \frac{γ² α V^{t+1}}{Δ + V^t},
while ∂_K f_K(V^t) is given by a similar Gaussian average involving the higher cumulants f_4, f_2² and (f_1 - s) f_3 evaluated at (A², s + zA).
[Figure: the eigenvalue as a function of log_10 V along the NL, for several values of the matrix mean, γ = 0.1, 0.3, ..., 2.5, 2.9.]
γ < γ_c^(1): the eigenvalue is always less than 1 in modulus.
γ_c^(1) < γ < γ_c^(2): the eigenvalue becomes larger than 1 in modulus in a limited region.
γ > γ_c^(2): the eigenvalue is larger than 1 in modulus all the way down to the fixed point.

22 Density Evolution and AMP
For zero measurement noise, both critical values do NOT depend on the undersampling rate α. For weak noise, only the second critical value acquires a very weak dependence on both α and Δ.
[Figure: convergence of the AMP algorithm for different signal sizes N = 1000, 4000, 16000, with γ_c^(1) and γ_c^(2) marked; inset: convergence rate R of the AMP algorithm.]
The transition becomes sharper and sharper as N → ∞; it is expected to move towards the second critical value and to behave similarly to the density evolution.

23 SwAMP algorithm, a possible solution
Manoel, Krzakala, Tramel, Zdeborová (2014): with a random sequential update of the components, the convergence problems disappear.
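To illustrate the idea of the random sequential schedule (a simplified sketch of the update order only, not the exact SwAMP equations of Manoel et al.), one sweep can be written as follows, with the factor-side quantities ω, V refreshed after every single-component update:

```python
import numpy as np

def sequential_sweep(F, y, a, v, omega, V, Delta, rho, rng):
    """One sweep over the components in random order; omega and V are kept in sync
    with the current estimates a, v after each single-component update (simplified
    sketch of the sequential schedule, not the full SwAMP bookkeeping)."""
    M, N = F.shape
    F2 = F**2
    for i in rng.permutation(N):
        Sigma2_i = 1.0 / np.sum(F2[:, i] / (Delta + V))
        R_i = a[i] + Sigma2_i * np.sum(F[:, i] * (y - omega) / (Delta + V))
        a_new, v_new = gauss_bernoulli_denoiser(Sigma2_i, R_i, rho)
        omega += F[:, i] * (a_new - a[i])    # propagate the change of component i
        V += F2[:, i] * (v_new - v[i])
        a[i], v[i] = a_new, v_new
    return a, v, omega, V
```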

24 SwAMP algorithm, a possible solution (continued)
LOW RANK! A very effective solution that works well in many interesting cases. It is not a universal solution, and it loses the property of involving only matrix multiplications.
Caltagirone, Manoel, Krzakala, Tramel, Zdeborová, arXiv. Rangan, Schniter, Fletcher. Kabashima.

25 Conclusions and Perspectives
We found that the origin of the convergence problems is an instability of the Nishimori line. We provided a possible solution with the SwAMP algorithm.
Perspectives: relate this kind of instability in the density evolution to the shape of the replica potential; perform the same kind of analysis for the case of dictionary learning.
THANK YOU!
