Message passing and approximate message passing


1 Message passing and approximate message passing Arian Maleki Columbia University 1 / 47

2 What is the problem? Given a pdf $\mu(x_1, x_2, \ldots, x_n)$ we are interested in:
- finding $\arg\max_{x_1, \ldots, x_n} \mu(x_1, x_2, \ldots, x_n)$
- calculating the marginal pdf $\mu_i(x_i)$
- calculating the joint distribution $\mu_S(x_S)$ for $S \subset \{1, 2, \ldots, n\}$
These problems are important because many inference problems are in these forms, with applications in image processing, communication, machine learning, and signal processing. This talk: marginalizing a distribution; most of the tools apply to the other problems as well. 2 / 47

3 This talk: marginalization. Problem: given a pdf $\mu(x_1, x_2, \ldots, x_n)$, find $\mu_i(x_i)$. Simplification: all $x_j$'s are binary random variables. Is it difficult? Yes: no polynomial-time algorithm is known in general, and the complexity of variable elimination is $2^{n-1}$. Solution: there is no good solution for a generic $\mu$, so consider structured $\mu$. 3 / 47

4 Structured $\mu$. Problem: given a pdf $\mu(x_1, x_2, \ldots, x_n)$, find $\mu_i(x_i)$. Simplification: all $x_j$'s are binary random variables.
Example 1: $x_1, x_2, \ldots, x_n$ are independent. Easy marginalization, but not interesting.
Example 2: independent subsets, $\mu(x_1, x_2, \ldots, x_n) = \mu_1(x_1, \ldots, x_k)\, \mu_2(x_{k+1}, \ldots, x_n)$. More interesting and less easy; the complexity is still exponential in the size of the subsets, and this does not cover many interesting cases. 4 / 47

5 Structured $\mu$, cont'd. Problem: given a pdf $\mu(x_1, x_2, \ldots, x_n)$, find $\mu_i(x_i)$. Simplification: all $x_j$'s are binary random variables.
Example 3: a more interesting structure, $\mu(x_1, x_2, \ldots, x_n) = \mu_1(x_{S_1})\, \mu_2(x_{S_2}) \cdots \mu_l(x_{S_l})$. Unlike independence, the subsets $S_i$ and $S_j$ may overlap. This covers many interesting practical problems. Suppose $|S_i| \ll n$ for every $i$: is marginalization easier than for a generic $\mu$? Not clear! We introduce the factor graph to address this question. 5 / 47

6 Factor graph. Slight generalization:
$\mu(x_1, x_2, \ldots, x_n) = \frac{1}{Z} \prod_{a \in F} \psi_a(x_{S_a})$
- $\psi_a : \{0, 1\}^{|S_a|} \to \mathbb{R}_+$, not necessarily a pdf; the $\psi_a(x_{S_a})$ are called factors
- $Z$: normalizing constant, called the "partition function"
Factor graph:
- variable nodes: $n$ nodes corresponding to $x_1, x_2, \ldots, x_n$
- function nodes: $|F|$ nodes corresponding to $\psi_a(x_{S_a}), \psi_b(x_{S_b}), \ldots$
- an edge between function node $a$ and variable node $i$ iff $x_i \in S_a$
[Figure: bipartite graph with variable nodes $1, 2, \ldots, n$ and factor nodes $a, b, \ldots$] 6 / 47

7 Factor graph, cont'd.
$\mu(x_1, x_2, \ldots, x_n) = \frac{1}{Z} \prod_{a \in F} \psi_a(x_{S_a})$
Factor graph: variable nodes ($n$ nodes corresponding to $x_1, x_2, \ldots, x_n$), function nodes ($|F|$ nodes corresponding to $\psi_a(x_{S_a}), \psi_b(x_{S_b}), \ldots$), and an edge between function node $a$ and variable node $i$ iff $x_i \in S_a$.
Some notation:
- $\partial a$: neighbors of function node $a$, $\partial a = \{i : x_i \in S_a\}$
- $\partial i$: neighbors of variable node $i$, $\partial i = \{b \in F : i \in S_b\}$ 7 / 47

8 Factor graph example.
$\mu(x_1, x_2, x_3) = \frac{1}{Z} \psi_a(x_1, x_2)\, \psi_b(x_2, x_3)$
With the notation above: $\partial a = \{1, 2\}$, $\partial b = \{2, 3\}$, $\partial 1 = \{a\}$, $\partial 2 = \{a, b\}$.
[Figure: factor graph with variable nodes 1, 2, 3 and factor nodes $a$, $b$] 8 / 47
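
To make the example concrete, here is a minimal Python sketch (not from the slides; the factor tables psi_a and psi_b are made-up numbers) that builds this two-factor model on binary variables, computes the partition function Z by brute force, and marginalizes out x_1 and x_3 to obtain mu_2(x_2).

```python
import numpy as np

# Made-up factor tables on binary variables, indexed as psi[x_first, x_second].
psi_a = np.array([[2.0, 1.0],
                  [1.0, 3.0]])   # psi_a(x1, x2)
psi_b = np.array([[1.0, 2.0],
                  [4.0, 1.0]])   # psi_b(x2, x3)

# Unnormalized joint p[x1, x2, x3] = psi_a(x1, x2) * psi_b(x2, x3).
p = psi_a[:, :, None] * psi_b[None, :, :]

Z = p.sum()                       # partition function
mu_2 = p.sum(axis=(0, 2)) / Z     # marginal of x2: sum over x1 and x3

print("Z =", Z)
print("mu_2(x2) =", mu_2)
```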

9 Why factor graphs? Factor graphs simplify the understanding of structure:
- independence: the graph has disconnected pieces
- tree graphs: easy marginalization
- loops (especially short loops): generally make the problem more complicated
[Figure: bipartite factor graph with variable nodes $1, 2, \ldots, n$ and factor nodes $a, b$] 9 / 47

10 Marginalization on tree-structured graphs. Tree: a graph with no loops. Marginalization by variable elimination is efficient on such graphs (distributions): linear in $n$ and exponential in the size of the factors. Proof: on the board. 10 / 47

11 Summary of variable elimination on trees. Notation:
$\mu(x_1, x_2, \ldots, x_n) = \prod_{a \in F} \psi_a(x_{\partial a})$
- $\mu_{j \to a}$: belief from variable node $j$ to factor node $a$
- $\hat{\mu}_{a \to j}$: belief from factor node $a$ to variable node $j$
We also have
$\hat{\mu}_{a \to j}(x_j) = \sum_{x_{\partial a \setminus j}} \psi_a(x_{\partial a}) \prod_{l \in \partial a \setminus j} \mu_{l \to a}(x_l)$
$\mu_{j \to a}(x_j) = \prod_{b \in \partial j \setminus a} \hat{\mu}_{b \to j}(x_j)$ 11 / 47

12 Belief propagation algorithm on trees. Based on variable elimination on trees, consider the iterative algorithm
$\nu^{t+1}_{j \to a}(x_j) = \prod_{b \in \partial j \setminus a} \hat{\nu}^{t}_{b \to j}(x_j)$
$\hat{\nu}^{t}_{a \to j}(x_j) = \sum_{x_{\partial a \setminus j}} \psi_a(x_{\partial a}) \prod_{l \in \partial a \setminus j} \nu^{t}_{l \to a}(x_l)$
$\nu^t_{j \to a}$ and $\hat{\nu}^t_{a \to j}$ are the beliefs of the nodes at time $t$; each node propagates its belief to the other nodes. Hope: converge to a good fixed point,
$\nu^t_{j \to a}(x_j) \to \mu_{j \to a}(x_j)$ and $\hat{\nu}^t_{a \to j}(x_j) \to \hat{\mu}_{a \to j}(x_j)$ as $t \to \infty$. 12 / 47

13 Belief propagation algorithm on trees. Belief propagation:
$\nu^{t+1}_{j \to a}(x_j) = \prod_{b \in \partial j \setminus a} \hat{\nu}^{t}_{b \to j}(x_j)$
$\hat{\nu}^{t}_{a \to j}(x_j) = \sum_{x_{\partial a \setminus j}} \psi_a(x_{\partial a}) \prod_{l \in \partial a \setminus j} \nu^{t}_{l \to a}(x_l)$
Why BP and not variable elimination? On trees there is no good reason, but on loopy graphs BP can be easily applied. 13 / 47

14 Belief propagation algorithm on trees. Belief propagation:
$\nu^{t+1}_{j \to a}(x_j) = \prod_{b \in \partial j \setminus a} \hat{\nu}^{t}_{b \to j}(x_j)$
$\hat{\nu}^{t}_{a \to j}(x_j) = \sum_{x_{\partial a \setminus j}} \psi_a(x_{\partial a}) \prod_{l \in \partial a \setminus j} \nu^{t}_{l \to a}(x_l)$
Lemma: On a tree graph, BP converges to the marginal distributions in a finite number of iterations, i.e.,
$\nu^t_{j \to a}(x_j) \to \mu_{j \to a}(x_j)$ and $\hat{\nu}^t_{a \to j}(x_j) \to \hat{\mu}_{a \to j}(x_j)$ as $t \to \infty$. 14 / 47

15 Summary of belief propagation on trees. Equivalent to variable elimination; exact on trees; converges in a finite number of iterations, namely 2 times the maximum depth of the tree. 15 / 47
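
As an illustration (a sketch, not from the slides), here is sum-product BP on the three-variable chain of slide 8, using the same made-up factor tables as in the brute-force example above; since the chain is a tree, the BP marginal of x_2 matches the brute-force answer.

```python
import numpy as np

psi_a = np.array([[2.0, 1.0],
                  [1.0, 3.0]])   # psi_a(x1, x2)
psi_b = np.array([[1.0, 2.0],
                  [4.0, 1.0]])   # psi_b(x2, x3)

# Variable-to-factor messages nu_{j->a}, initialized to uniform.
nu_1a = np.ones(2) / 2
nu_2a = np.ones(2) / 2
nu_2b = np.ones(2) / 2
nu_3b = np.ones(2) / 2

for _ in range(3):  # a couple of iterations suffice on this tree
    # Factor-to-variable: nuhat_{a->j}(x_j) = sum_{x_other} psi * nu_{other->a}(x_other)
    nuhat_a1 = psi_a @ nu_2a        # sum over x2
    nuhat_a2 = psi_a.T @ nu_1a      # sum over x1
    nuhat_b2 = psi_b @ nu_3b        # sum over x3
    nuhat_b3 = psi_b.T @ nu_2b      # sum over x2

    # Variable-to-factor: product of the other incoming factor messages.
    # x1 and x3 are leaves, so their outgoing messages stay uniform.
    nu_2a = nuhat_b2 / nuhat_b2.sum()
    nu_2b = nuhat_a2 / nuhat_a2.sum()

# Belief at variable 2: product of all incoming factor-to-variable messages.
belief_2 = nuhat_a2 * nuhat_b2
belief_2 /= belief_2.sum()
print("BP marginal mu_2(x2) =", belief_2)   # agrees with the brute-force marginal
```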

16 Loopy belief propagation. Apply belief propagation to a loopy graph:
$\nu^{t+1}_{j \to a}(x_j) = \prod_{b \in \partial j \setminus a} \hat{\nu}^{t}_{b \to j}(x_j)$
$\hat{\nu}^{t}_{a \to j}(x_j) = \sum_{x_{\partial a \setminus j}} \psi_a(x_{\partial a}) \prod_{l \in \partial a \setminus j} \nu^{t}_{l \to a}(x_l)$
Wait until the algorithm converges, then announce $\prod_{a \in \partial j} \hat{\nu}_{a \to j}(x_j)$ (normalized) as $\mu_j(x_j)$. 16 / 47

17 Challenges for loopy BP. It does not necessarily converge. Example: $\mu(x_1, x_2, x_3) \propto \mathbb{I}(x_1 \neq x_2)\, \mathbb{I}(x_2 \neq x_3)\, \mathbb{I}(x_3 \neq x_1)$
- $t = 1$: start with $x_1 = 0$
- $t = 2$: $x_2 = 1$
- $t = 3$: $x_3 = 0$
- $t = 4$: $x_1 = 1$, and the beliefs keep oscillating around the cycle
Even if it converges, the result is not necessarily the marginal distribution, and there can be more than one fixed point. 17 / 47
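
A tiny sketch (an illustration, not from the slides) of why BP oscillates on this example: each "disagreement" factor flips the belief it passes on, and going once around the odd cycle flips the belief about x_1, so the messages never settle.

```python
import numpy as np

# Hard "disagreement" factor psi(x, x') = 1{x != x'} on binary variables.
psi = np.array([[0.0, 1.0],
                [1.0, 0.0]])

msg = np.array([1.0, 0.0])        # start: certain that x1 = 0
for hop in range(1, 7):           # pass the belief around the cycle x1 -> x2 -> x3 -> x1 -> ...
    msg = psi @ msg               # a disagreement factor flips the belief
    msg /= msg.sum()
    print(f"hop {hop}: belief = {msg}")
# After 3 hops (one full loop) the belief about x1 has flipped to x1 = 1,
# and after 6 hops it is back to x1 = 0: loopy BP oscillates on this graph.
```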

18 Advantages of loopy BP. In many applications it works really well: coding theory, machine vision, compressed sensing, ... In some cases, loopy BP can be analyzed theoretically. 18 / 47

19 Some theoretical results on loopy BP.
- The BP equations have at least one fixed point; the proof uses Brouwer's theorem. However, BP does not necessarily converge to any of the fixed points.
- BP is exact on trees.
- BP is accurate for locally tree-like graphs (Gallager 1963, Luby et al. 2001, Richardson and Urbanke 2001).
- Several methods exist for verifying the correctness of BP: computation trees, the Dobrushin condition, ... 19 / 47

20 Locally tree-like graphs. BP is accurate for locally tree-like graphs. Heuristic: sparse graphs (without many edges) can be locally tree-like. How to check? Look at girth$(G)$, the length of the shortest cycle; if it grows like $\log(n)$, neighborhoods are tree-like. A random $(d_v, d_c)$ graph is locally tree-like with high probability. 20 / 47

21 Dobrushin condition: a sufficient condition for the convergence of BP. Influence of node $j$ on node $i$:
$C_{ij} = \sup \big\{\, \| \mu(x_i \mid x_{V \setminus i}) - \mu(x_i \mid x'_{V \setminus i}) \|_{TV} : x_l = x'_l \ \forall l \neq j \,\big\}$
Intuition: if the influence of the nodes on each other is very small, then oscillations should not occur.
Theorem: If $\gamma = \sup_i \sum_j C_{ij} < 1$, then the BP marginals converge to the true marginals. 21 / 47

22 Summary of BP. Belief propagation:
$\nu^{t+1}_{j \to a}(x_j) = \prod_{b \in \partial j \setminus a} \hat{\nu}^{t}_{b \to j}(x_j)$
$\hat{\nu}^{t}_{a \to j}(x_j) = \sum_{x_{\partial a \setminus j}} \psi_a(x_{\partial a}) \prod_{l \in \partial a \setminus j} \nu^{t}_{l \to a}(x_l)$
Exact on trees; accurate on locally tree-like graphs; no generic proof technique works on all problems; successful in many applications, even when we cannot prove convergence. 22 / 47

23 More challenges. What if $\mu$ is a pdf on $\mathbb{R}^n$? The extension seems straightforward:
$\nu^{t+1}_{j \to a}(x_j) = \prod_{b \in \partial j \setminus a} \hat{\nu}^{t}_{b \to j}(x_j)$
$\hat{\nu}^{t}_{a \to j}(x_j) = \int \psi_a(x_{\partial a}) \prod_{l \in \partial a \setminus j} \nu^{t}_{l \to a}(x_l)\, dx_{\partial a \setminus j}$
But the calculations become complicated: one has to discretize the pdf to a certain accuracy and do lots of numerical multiple integrations. Exception: Gaussian graphical models,
$\mu(x) = \frac{1}{Z} \exp(-x^T J x + c^T x)$,
where all messages remain Gaussian. NOT the focus of this talk. 23 / 47

24 Approximate message passing 24 / 47

25 Model. $y = A x_o + w$
- $x_o$: $k$-sparse vector in $\mathbb{R}^N$
- $A$: $n \times N$ design matrix
- $y$: measurement vector in $\mathbb{R}^n$
- $w$: measurement noise in $\mathbb{R}^n$
[Figure: $y = A x_o + w$ with an $n \times N$ matrix $A$ and a $k$-sparse signal $x_o$] 25 / 47

26 LASSO. Many useful heuristic algorithms exist.
Noiseless case, $\ell_1$-minimization:
minimize $\|x\|_1$ subject to $y = Ax$, where $\|x\|_1 = \sum_i |x_i|$.
Noisy case, LASSO:
minimize $\frac{1}{2}\|y - Ax\|_2^2 + \lambda \|x\|_1$.
Both are convex optimizations. Chen, Donoho, Saunders (96), Tibshirani (96). 26 / 47
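
As a baseline, the LASSO can simply be handed to a generic convex solver. A minimal sketch, assuming cvxpy is installed (the problem sizes, signal, and lambda below are made up for illustration):

```python
import numpy as np
import cvxpy as cp

rng = np.random.default_rng(0)
n, N, lam = 50, 100, 0.1                      # made-up sizes and regularization
A = rng.normal(size=(n, N)) / np.sqrt(n)      # design matrix with roughly unit-norm columns
x0 = np.zeros(N); x0[:5] = 1.0                # a simple 5-sparse signal
y = A @ x0 + 0.01 * rng.normal(size=n)

x = cp.Variable(N)
objective = cp.Minimize(0.5 * cp.sum_squares(y - A @ x) + lam * cp.norm1(x))
cp.Problem(objective).solve()

print("estimated support:", np.flatnonzero(np.abs(x.value) > 1e-3))
```

Generic solvers like this scale poorly with N, which is what motivates the algorithmic discussion on the next slides.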


28 Algorithmic challenges of sparse recovery. Use convex optimization tools to solve the LASSO. Computational complexity:
- interior point methods: generic tools, appropriate for N < 5000
- homotopy methods: use the structure of the LASSO, appropriate for larger N than interior point methods
- first order methods: low computational complexity per iteration, but require many iterations 28 / 47



31 Message passing for LASSO. Define
$\mu(dx) = \frac{1}{Z} e^{-\beta \lambda \|x\|_1 - \frac{\beta}{2} \|y - Ax\|_2^2}\, dx$
where $Z$ is the normalization constant and $\beta > 0$. As $\beta \to \infty$, $\mu$ concentrates around the solution of the LASSO, $\hat{x}_\lambda$. Marginalize $\mu$ to obtain $\hat{x}_\lambda$. 31 / 47



34 Message passing algorithm.
$\mu(dx) = \frac{1}{Z} e^{-\beta \lambda \sum_i |x_i| - \frac{\beta}{2} \sum_a (y_a - (Ax)_a)^2}\, dx$
Belief propagation update rules:
$\nu^{t+1}_{i \to a}(s_i) \propto e^{-\beta \lambda |s_i|} \prod_{b \in \partial i \setminus a} \hat{\nu}^{t}_{b \to i}(s_i)$
$\hat{\nu}^{t}_{a \to i}(s_i) \propto \int e^{-\frac{\beta}{2}(y_a - (As)_a)^2} \prod_{j \in \partial a \setminus i} d\nu^{t}_{j \to a}(s_j)$
Pearl (82). 34 / 47

35 Issues of the message passing algorithm. Challenges:
$\nu^{t+1}_{i \to a}(s_i) \propto e^{-\beta \lambda |s_i|} \prod_{b \in \partial i \setminus a} \hat{\nu}^{t}_{b \to i}(s_i)$
$\hat{\nu}^{t}_{a \to i}(s_i) \propto \int e^{-\frac{\beta}{2}(y_a - (As)_a)^2} \prod_{j \in \partial a \setminus i} d\nu^{t}_{j \to a}(s_j)$
- the messages are distributions over the real line
- calculation of the messages is difficult: numerical integration, and the number of variables in the integral is $N - 1$
- the graph is not "tree like": no good analysis tools 35 / 47

36 Challenges for the message passing algorithm. Challenges: the messages are distributions over the real line, their calculation is difficult, and the graph is not "tree like".
$\nu^{t+1}_{i \to a}(s_i) \propto e^{-\beta \lambda |s_i|} \prod_{b \in \partial i \setminus a} \hat{\nu}^{t}_{b \to i}(s_i)$
$\hat{\nu}^{t}_{a \to i}(s_i) \propto \int e^{-\frac{\beta}{2}(y_a - (As)_a)^2} \prod_{j \in \partial a \setminus i} d\nu^{t}_{j \to a}(s_j)$
Solution: the blessing of dimensionality.
- as $N \to \infty$: $\hat{\nu}^t_{a \to i}(s_i)$ converges to a Gaussian
- as $N \to \infty$: $\nu^t_{i \to a}(s_i)$ converges to $f_\beta(s; x, b) \equiv \frac{1}{z_\beta(x, b)}\, e^{-\beta |s| - \frac{\beta}{2b}(s - x)^2}$ 36 / 47


38 Simplifying messages for large N. Simplification of $\hat{\nu}^t_{a \to i}(s_i)$:
$\hat{\nu}^t_{a \to i}(s_i) \propto \int e^{-\frac{\beta}{2}(y_a - (As)_a)^2} \prod_{j \in \partial a \setminus i} d\nu^{t}_{j \to a}(s_j)$
$= \mathbb{E} \exp\Big(-\frac{\beta}{2}\big(y_a - A_{ai} s_i - \sum_{j \neq i} A_{aj} s_j\big)^2\Big)$
$= \mathbb{E} \exp\Big(-\frac{\beta}{2}(A_{ai} s_i - Z)^2\Big)$, where $Z \equiv y_a - \sum_{j \neq i} A_{aj} s_j$.
As $N \to \infty$, $Z \stackrel{d}{\to} W$ with $W$ Gaussian, and for the relevant test functions $h_{s_i}$,
$\big|\mathbb{E}\, h_{s_i}(Z) - \mathbb{E}\, h_{s_i}(W)\big| \le \frac{C_t}{N^{1/2} (\hat{\tau}^t_{a \to i})^{3/2}}$. 38 / 47

39 Message passing. As $N \to \infty$, $\hat{\nu}^t_{a \to i}$ converges to a Gaussian distribution.
Theorem: Let $x^t_{j \to a}$ and $\tau^t_{j \to a}/\beta$ denote the mean and variance of the distribution $\nu^t_{j \to a}$. Then there exists a constant $C_t$ such that
$\sup_{s_i} \big| \hat{\nu}^t_{a \to i}(s_i) - \hat{\phi}^t_{a \to i}(s_i) \big| \le \frac{C_t}{N (\hat{\tau}^t_{a \to i})^{3}}$,
where
$\hat{\phi}^t_{a \to i}(ds_i) \equiv \sqrt{\frac{\beta A_{ai}^2}{2 \pi \hat{\tau}^t_{a \to i}}}\; e^{-\beta (A_{ai} s_i - z^t_{a \to i})^2 / 2 \hat{\tau}^t_{a \to i}}\, ds_i$,
$z^t_{a \to i} \equiv y_a - \sum_{j \neq i} A_{aj} x^t_{j \to a}$, and $\hat{\tau}^t_{a \to i} \equiv \sum_{j \neq i} A^2_{aj} \tau^t_{j \to a}$.
Donoho, M., Montanari (11). 39 / 47

40 Shape of $\nu^t_{i \to a}$: intuition. Summary: $\hat{\nu}^t_{b \to i}(s_i)$ is approximately Gaussian and
$\nu^{t+1}_{i \to a}(s_i) \propto e^{-\beta \lambda |s_i|} \prod_{b \in \partial i \setminus a} \hat{\nu}^{t}_{b \to i}(s_i)$,
so the shape of $\nu^{t+1}_{i \to a}(s_i)$ is
$\nu^{t+1}_{i \to a}(s_i) \approx \frac{1}{z_\beta(x, b)}\, e^{-\beta |s| - \frac{\beta}{2b}(s - x)^2}$ for appropriate $x$ and $b$. 40 / 47

41 Message passing (cont'd).
Theorem: Suppose that $\hat{\nu}^t_{a \to i} = \hat{\phi}^t_{a \to i}$, with mean $z^t_{a \to i}$ and variance $\hat{\tau}^t_{a \to i} = \hat{\tau}^t$. Then at the next iteration we have
$\big\| \nu^{t+1}_{i \to a} - \phi^{t+1}_{i \to a} \big\| < C/n$,
$\phi^{t+1}_{i \to a}(s_i) \equiv \lambda f_\beta\Big(\lambda s_i;\; \lambda \sum_{b \neq a} A_{bi} z^t_{b \to i},\; \lambda^2 (1 + \hat{\tau}^t)\Big)$,
where
$f_\beta(s; x, b) \equiv \frac{1}{z_\beta(x, b)}\, e^{-\beta |s| - \frac{\beta}{2b}(s - x)^2}$.
Donoho, M., Montanari (11). 41 / 47

42 Next asymptotic: $\beta \to \infty$. Reminder: $x^t_{j \to a}$ is the mean of $\nu^t_{j \to a}$, $z^t_{a \to i}$ is the mean of $\hat{\nu}^t_{a \to i}$, and $z^t_{a \to i} = y_a - \sum_{j \neq i} A_{aj} x^t_{j \to a}$. Can we write $x^t_{j \to a}$ in terms of the $z^t_{a \to i}$? Yes:
$x^{t+1}_{i \to a} = \frac{1}{z_\beta} \int s_i\, e^{-\beta \lambda |s_i| - \beta \big(s_i - \sum_{b \neq a} A_{bi} z^t_{b \to i}\big)^2 / 2}\, ds_i$
This can be done, but the Laplace method simplifies it as $\beta \to \infty$:
$x^{t+1}_{i \to a} = \eta\Big(\sum_{b \neq a} A_{bi} z^t_{b \to i}\Big)$,
where $\eta$ is the soft-thresholding function. 42 / 47

43 Summary.
- $x^t_{i \to a}$: mean of $\nu^t_{i \to a}$
- $z^t_{a \to i}$: mean of $\hat{\nu}^t_{a \to i}$
$z^t_{a \to i} = y_a - \sum_{j \neq i} A_{aj} x^t_{j \to a}$
$x^{t+1}_{i \to a} = \eta\Big(\sum_{b \neq a} A_{bi} z^t_{b \to i}\Big)$ 43 / 47
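
A minimal numpy sketch of these per-edge updates (not from the slides; the synthetic instance, the explicit soft-thresholding level, and the rule for setting it from the residual size are assumptions made for illustration):

```python
import numpy as np

def eta(u, thresh):
    """Soft thresholding: eta(u; thresh) = sign(u) * max(|u| - thresh, 0)."""
    return np.sign(u) * np.maximum(np.abs(u) - thresh, 0.0)

rng = np.random.default_rng(0)
n, N, k = 100, 200, 10                             # made-up problem sizes
A = rng.normal(size=(n, N)) / np.sqrt(n)
x0 = np.zeros(N); x0[rng.choice(N, k, replace=False)] = rng.normal(size=k)
y = A @ x0 + 0.01 * rng.normal(size=n)

# Per-edge messages: x_msg[i, a] plays the role of x^t_{i->a},
# z_msg[a, i] plays the role of z^t_{a->i}.
x_msg = np.zeros((N, n))
z_msg = np.zeros((n, N))

for _ in range(30):
    # z^t_{a->i} = y_a - sum_{j != i} A_{aj} x^t_{j->a}
    sum_a = (A * x_msg.T).sum(axis=1)              # sum_j A_{aj} x_{j->a}, for every a
    z_msg = y[:, None] - (sum_a[:, None] - A * x_msg.T)

    # x^{t+1}_{i->a} = eta( sum_{b != a} A_{bi} z^t_{b->i} ); the threshold level
    # below (twice the typical residual size) is an assumed practical choice.
    thresh = 2.0 * np.sqrt(np.mean(z_msg ** 2))
    sum_i = (A * z_msg).sum(axis=0)                # sum_b A_{bi} z_{b->i}, for every i
    x_msg = eta(sum_i[None, :] - A * z_msg, thresh).T

# Node estimate: combine all incoming messages at each variable node.
x_hat = eta((A * z_msg).sum(axis=0), thresh)
print("relative error:", np.linalg.norm(x_hat - x0) / np.linalg.norm(x0))
```

Note the O(nN) messages per iteration; the next slides collapse them to the O(n + N) quantities of IST and AMP.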

44 Comparison of MP and IST. The MP updates are per edge; IST uses the same form with the sums running over all indices, so the messages no longer depend on which edge they are sent on:
MP: $z^t_{a \to i} = y_a - \sum_{j \neq i} A_{aj} x^t_{j \to a}$, $\quad x^{t+1}_{i \to a} = \eta\big(\sum_{b \neq a} A_{bi} z^t_{b \to i}\big)$
IST: $z^t_{a} = y_a - \sum_{j} A_{aj} x^t_{j}$, $\quad x^{t+1}_{i} = \eta\big(\sum_{b} A_{bi} z^t_{b}\big)$
[Figure: bipartite graphs contrasting per-edge messages (MP) with per-node quantities (IST)] 44 / 47

45 Approximate message passing. We can further simplify the messages:
$x^{t+1} = \eta(x^t + A^T z^t; \lambda_t)$
$z^t = y - A x^t + \frac{\|x^t\|_0}{n}\, z^{t-1}$ 45 / 47
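
A minimal numpy sketch of the AMP iteration (not from the slides; the synthetic instance and the threshold rule lambda_t = 2 ||z^t||_2 / sqrt(n) are assumptions made for illustration, and with the Onsager term removed the same code is plain IST):

```python
import numpy as np

def eta(u, thresh):
    """Soft thresholding: eta(u; thresh) = sign(u) * max(|u| - thresh, 0)."""
    return np.sign(u) * np.maximum(np.abs(u) - thresh, 0.0)

def amp(y, A, n_iter=30, onsager=True, alpha=2.0):
    """AMP for the LASSO; with onsager=False this reduces to plain IST."""
    n, N = A.shape
    x = np.zeros(N)
    z = np.zeros(n)
    for _ in range(n_iter):
        z_new = y - A @ x
        if onsager:
            z_new += (np.count_nonzero(x) / n) * z       # Onsager term (||x^t||_0 / n) z^{t-1}
        z = z_new
        thresh = alpha * np.linalg.norm(z) / np.sqrt(n)  # assumed threshold rule for lambda_t
        x = eta(x + A.T @ z, thresh)
    return x

# Synthetic instance, made up for illustration.
rng = np.random.default_rng(1)
n, N, k = 250, 500, 25
A = rng.normal(size=(n, N)) / np.sqrt(n)
x0 = np.zeros(N); x0[rng.choice(N, k, replace=False)] = rng.normal(size=k)
y = A @ x0 + 0.01 * rng.normal(size=n)

x_amp = amp(y, A, onsager=True)
x_ist = amp(y, A, onsager=False)
print("AMP relative error:", np.linalg.norm(x_amp - x0) / np.linalg.norm(x0))
print("IST relative error:", np.linalg.norm(x_ist - x0) / np.linalg.norm(x0))
```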

46 IST versus AMP.
IST: $x^{t+1} = \eta(x^t + A^T z^t; \lambda_t)$, $\quad z^t = y - A x^t$
AMP: $x^{t+1} = \eta(x^t + A^T z^t; \lambda_t)$, $\quad z^t = y - A x^t + \frac{1}{n} \|x^t\|_0\, z^{t-1}$
[Figure: mean square error versus iteration for IST and AMP, comparing the state evolution prediction with empirical curves] 46 / 47

47 Theoretical guarantees. We can predict the performance of AMP at every iteration in the asymptotic regime. Idea: $x^t + A^T z^t$ can be modeled as the signal plus Gaussian noise, and we keep track of the noise variance. I will discuss this part in the "Risk Seminar". 47 / 47
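
A hedged Monte Carlo sketch of this state-evolution idea: track the variance tau_t^2 of the effective Gaussian noise in x^t + A^T z^t across iterations (the Bernoulli-Gaussian signal model, undersampling ratio delta, noise level sigma, and threshold rule alpha * tau_t below are assumptions for illustration, not taken from the slides).

```python
import numpy as np

def eta(u, thresh):
    return np.sign(u) * np.maximum(np.abs(u) - thresh, 0.0)

# Assumed setup: Bernoulli(eps)-Gaussian signal, delta = n/N, noise std sigma,
# threshold lambda_t = alpha * tau_t.
eps, delta, sigma, alpha = 0.05, 0.5, 0.01, 2.0
rng = np.random.default_rng(0)
M = 200_000                                          # Monte Carlo sample size
X = rng.normal(size=M) * (rng.random(M) < eps)       # draws from the signal prior
Z = rng.normal(size=M)                               # standard Gaussian noise

tau2 = sigma ** 2 + np.mean(X ** 2) / delta          # effective noise variance at t = 0 (x^0 = 0)
for t in range(15):
    tau = np.sqrt(tau2)
    mse = np.mean((eta(X + tau * Z, alpha * tau) - X) ** 2)   # predicted per-coordinate MSE
    print(f"t={t:2d}  tau={tau:.4f}  predicted MSE={mse:.6f}")
    tau2 = sigma ** 2 + mse / delta                  # state-evolution recursion
```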
