Message passing and approximate message passing
Slide 1: Message passing and approximate message passing
Arian Maleki, Columbia University
1 / 47
Slide 2: What is the problem?
Given a pdf µ(x_1, x_2, ..., x_n), we are interested in:
- arg max over x_1, ..., x_n of µ(x_1, ..., x_n)
- calculating a marginal pdf µ_i(x_i)
- calculating a joint distribution µ_S(x_S) for S ⊂ {1, 2, ..., n}
These problems are important because:
- many inference problems take these forms
- many applications in image processing, communication, machine learning, and signal processing
This talk: marginalizing a distribution. Most of the tools carry over to the other problems.
Slide 3: This talk: marginalization
Problem: given a pdf µ(x_1, x_2, ..., x_n), find µ_i(x_i)
Simplification: all x_j's are binary random variables
Is it difficult? Yes: no polynomial-time algorithm is known; the complexity of variable elimination is 2^(n-1)
Solution: there is no good solution for a generic µ, so we consider structured µ
Slide 4: Structured µ
Problem: given a pdf µ(x_1, ..., x_n), find µ_i(x_i); all x_j's are binary random variables
Example 1: x_1, x_2, ..., x_n are independent
- easy marginalization, but not interesting
Example 2: independent subsets, µ(x_1, ..., x_n) = µ_1(x_1, ..., x_k) µ_2(x_{k+1}, ..., x_n)
- more interesting and less easy
- complexity is still exponential in the size of the subsets
- does not cover many interesting cases
Slide 5: Structured µ (cont'd)
Problem: given a pdf µ(x_1, ..., x_n), find µ_i(x_i); all x_j's are binary random variables
Example 3: a more interesting structure:
  µ(x_1, ..., x_n) = µ_1(x_{S_1}) µ_2(x_{S_2}) ... µ_l(x_{S_l})
- unlike independence, the sets may overlap: S_i ∩ S_j ≠ ∅
- covers many interesting practical problems
Suppose |S_i| ≪ n for all i. Is marginalization easier than for a generic µ? Not clear! We introduce the factor graph to address this question.
Slide 6: Factor graph
Slight generalization:
  µ(x_1, ..., x_n) = (1/Z) ∏_{a ∈ F} ψ_a(x_{S_a})
- ψ_a: {0,1}^{|S_a|} → R_+, not necessarily a pdf
- Z: normalizing constant, called the "partition function"
- ψ_a(x_{S_a}): factors
Factor graph:
- variable nodes: n nodes corresponding to x_1, x_2, ..., x_n
- factor nodes: |F| nodes corresponding to ψ_a(x_{S_1}), ψ_b(x_{S_2}), ...
- edge between factor node a and variable node i iff x_i ∈ S_a
[Figure: bipartite graph with variable nodes 1, 2, ..., n and factor nodes a, b, ...]
Slide 7: Factor graph (cont'd)
  µ(x_1, ..., x_n) = (1/Z) ∏_{a ∈ F} ψ_a(x_{S_a})
Factor graph:
- variable nodes: n nodes corresponding to x_1, x_2, ..., x_n
- factor nodes: |F| nodes corresponding to ψ_a(x_{S_1}), ψ_b(x_{S_2}), ...
- edge between factor node a and variable node i iff x_i ∈ S_a
Some notation:
- ∂a: neighbors of factor node a, ∂a = {i : x_i ∈ S_a}
- ∂i: neighbors of variable node i, ∂i = {b ∈ F : i ∈ S_b}
Slide 8: Factor graph example
  µ(x_1, x_2, x_3) = (1/Z) ψ_a(x_1, x_2) ψ_b(x_2, x_3)
Notation in this example:
- neighbors of the factor nodes: ∂a = {1, 2}, ∂b = {2, 3}
- neighbors of the variable nodes: ∂1 = {a}, ∂2 = {a, b}, ∂3 = {b}
[Figure: chain factor graph 1 — a — 2 — b — 3]
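This two-factor example is small enough to marginalize by brute force, which is a useful sanity check for everything that follows. A minimal sketch, where the factor tables for ψ_a and ψ_b are made-up numbers:

```python
import numpy as np

# Made-up factor tables: psi_a[x1, x2] and psi_b[x2, x3], with x_i in {0, 1}
psi_a = np.array([[1.0, 2.0],
                  [3.0, 1.0]])
psi_b = np.array([[2.0, 1.0],
                  [1.0, 4.0]])

# Unnormalized joint: mu(x1, x2, x3) proportional to psi_a(x1, x2) * psi_b(x2, x3)
joint = psi_a[:, :, None] * psi_b[None, :, :]
Z = joint.sum()            # partition function
mu = joint / Z

# Marginal of x2: sum out x1 and x3
mu_2 = mu.sum(axis=(0, 2))
print(Z, mu_2)
```

With these tables, Z = 27 and the marginal of x_2 is [12/27, 15/27]; the same tensor-contraction pattern is what variable elimination organizes efficiently on larger graphs.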
Slide 9: Why factor graphs?
- factor graphs simplify understanding structure
- independence: the graph has disconnected pieces
- tree graphs: easy marginalization
- loops (especially short loops) generally make the problem more complicated
Slide 10: Marginalization on tree-structured graphs
Tree: a graph with no loops
Marginalization by variable elimination is efficient on such graphs (distributions):
- linear in n
- exponential in the size of the factors
Proof: on the board
Slide 11: Summary of variable elimination on trees
Notation:
  µ(x_1, ..., x_n) = (1/Z) ∏_{a ∈ F} ψ_a(x_{∂a})
- µ_{a→j}: belief (message) from factor node a to variable node j
- µ̂_{j→a}: belief (message) from variable node j to factor node a
We have:
  µ_{a→j}(x_j) = Σ_{x_{∂a \ j}} ψ_a(x_{∂a}) ∏_{l ∈ ∂a \ j} µ̂_{l→a}(x_l)
  µ̂_{j→a}(x_j) = ∏_{b ∈ ∂j \ a} µ_{b→j}(x_j)
Slide 12: Belief propagation algorithm on trees
Based on variable elimination on trees, consider the iterative algorithm:
  ν^{t+1}_{a→j}(x_j) = Σ_{x_{∂a \ j}} ψ_a(x_{∂a}) ∏_{l ∈ ∂a \ j} ν̂^t_{l→a}(x_l)
  ν̂^{t+1}_{j→a}(x_j) = ∏_{b ∈ ∂j \ a} ν^t_{b→j}(x_j)
- ν^t_{a→j} and ν̂^t_{j→a} are the beliefs of the nodes at time t
- each node propagates its belief to the other nodes
- hope: converge to a good fixed point,
  ν^t_{b→j}(x_j) → µ_{b→j}(x_j) and ν̂^t_{j→b}(x_j) → µ̂_{j→b}(x_j) as t → ∞
Slide 13: Belief propagation algorithm on trees
Belief propagation:
  ν^{t+1}_{a→j}(x_j) = Σ_{x_{∂a \ j}} ψ_a(x_{∂a}) ∏_{l ∈ ∂a \ j} ν̂^t_{l→a}(x_l)
  ν̂^{t+1}_{j→a}(x_j) = ∏_{b ∈ ∂j \ a} ν^t_{b→j}(x_j)
Why BP and not variable elimination?
- on trees: no good reason
- on loopy graphs: BP can be easily applied
Slide 14: Belief propagation algorithm on trees
Belief propagation:
  ν^{t+1}_{a→j}(x_j) = Σ_{x_{∂a \ j}} ψ_a(x_{∂a}) ∏_{l ∈ ∂a \ j} ν̂^t_{l→a}(x_l)
  ν̂^{t+1}_{j→a}(x_j) = ∏_{b ∈ ∂j \ a} ν^t_{b→j}(x_j)
Lemma: On a tree graph, BP converges to the marginal distributions in a finite number of iterations, i.e.,
  ν^t_{b→j}(x_j) → µ_{b→j}(x_j) and ν̂^t_{j→b}(x_j) → µ̂_{j→b}(x_j) as t → ∞.
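The lemma can be checked numerically on the chain example µ ∝ ψ_a(x_1, x_2) ψ_b(x_2, x_3): one sweep of sum-product messages inward from the leaves already reproduces the exact marginal of x_2. A minimal sketch with made-up factor tables:

```python
import numpy as np

psi_a = np.array([[1.0, 2.0], [3.0, 1.0]])  # psi_a(x1, x2)
psi_b = np.array([[2.0, 1.0], [1.0, 4.0]])  # psi_b(x2, x3)

# Leaf variables 1 and 3 send the all-ones message to their only factor.
m_1_to_a = np.ones(2)
m_3_to_b = np.ones(2)

# Factor-to-variable messages: sum out the factor's other variable.
m_a_to_2 = psi_a.T @ m_1_to_a   # sum over x1
m_b_to_2 = psi_b @ m_3_to_b     # sum over x3

# Belief at x2: product of incoming factor messages, normalized.
belief_2 = m_a_to_2 * m_b_to_2
belief_2 /= belief_2.sum()

# Brute-force marginal for comparison.
joint = psi_a[:, :, None] * psi_b[None, :, :]
mu_2 = joint.sum(axis=(0, 2)) / joint.sum()
print(belief_2, mu_2)           # the two vectors agree
```

On this tree of depth 1 (from x_2), convergence takes a single inward pass, consistent with the "2 times the maximum depth" bound on the next slide.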
Slide 15: Summary of belief propagation on trees
- equivalent to variable elimination
- exact on trees
- converges in a finite number of iterations: 2 times the maximum depth of the tree
Slide 16: Loopy belief propagation
Apply belief propagation to a loopy graph:
  ν^{t+1}_{a→j}(x_j) = Σ_{x_{∂a \ j}} ψ_a(x_{∂a}) ∏_{l ∈ ∂a \ j} ν̂^t_{l→a}(x_l)
  ν̂^{t+1}_{j→a}(x_j) = ∏_{b ∈ ∂j \ a} ν^t_{b→j}(x_j)
Wait until the algorithm converges, then announce ∏_{a ∈ ∂j} ν_{a→j}(x_j) (normalized) as µ_j(x_j).
Slide 17: Challenges for loopy BP
Does not necessarily converge.
Example: µ(x_1, x_2, x_3) ∝ I(x_1 ≠ x_2) I(x_2 ≠ x_3) I(x_3 ≠ x_1)
- t = 1: start with x_1 = 0
- t = 2: x_2 = 1
- t = 3: x_3 = 0
- t = 4: x_1 = 1
The beliefs keep oscillating around the cycle.
Even if it converges, the result is not necessarily the marginal distribution.
BP can have more than one fixed point.
Slide 18: Advantages of loopy BP
In many applications, it works really well:
- coding theory
- machine vision
- compressed sensing
In some cases, loopy BP can be analyzed theoretically.
Slide 19: Some theoretical results on loopy BP
- The BP equations have at least one fixed point (the proof uses Brouwer's fixed-point theorem)
- BP does not necessarily converge to any of the fixed points
- BP is exact on trees
- BP is accurate for locally tree-like graphs (Gallager 1963; Luby et al. 2001; Richardson and Urbanke 2001)
- Several methods exist for verifying the correctness of BP: computation trees, the Dobrushin condition, ...
Slide 20: Locally tree-like graphs
BP is accurate for locally tree-like graphs.
Heuristic: sparse graphs (without many edges) can be locally tree-like.
How to check? Via the girth: girth(G) is the length of the shortest cycle, and the graph is locally tree-like when girth(G) grows like log(n).
A random (d_v, d_c)-regular graph is locally tree-like with high probability.
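The girth check is easy to run in practice: a BFS from every node, recording the shortest cycle any non-tree edge closes, gives the exact girth of an unweighted undirected graph. A minimal sketch (the example graphs are made up):

```python
from collections import deque

def girth(adj):
    """Shortest cycle length in an undirected graph (adjacency-list dict);
    returns inf if the graph is acyclic. BFS from every source: the minimum
    over all sources of the cycles detected is the exact girth."""
    best = float('inf')
    for src in adj:
        dist = {src: 0}
        parent = {src: None}
        q = deque([src])
        while q:
            u = q.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    parent[v] = u
                    q.append(v)
                elif parent[u] != v:
                    # Non-tree edge: closes a cycle of length <= dist[u]+dist[v]+1
                    best = min(best, dist[u] + dist[v] + 1)
    return best

# 5-cycle: girth is 5
c5 = {i: [(i - 1) % 5, (i + 1) % 5] for i in range(5)}
print(girth(c5))  # 5
```

For a graph that is locally tree-like in the sense above, this value should grow with n rather than stay bounded.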
Slide 21: Dobrushin condition
A sufficient condition for the convergence of BP.
Influence of node j on node i:
  C_{ij} = sup_{x, x'} { ||µ(x_i | x_{V \ {i,j}}) − µ(x_i | x'_{V \ {i,j}})||_TV : x_l = x'_l for all l ≠ j }
Intuition: if the influence of the nodes on each other is very small, then oscillations should not occur.
Theorem: If γ = sup_i (Σ_j C_{ij}) < 1, then the BP marginals converge to the true marginals.
Slide 22: Summary of BP
Belief propagation:
  ν^{t+1}_{a→j}(x_j) = Σ_{x_{∂a \ j}} ψ_a(x_{∂a}) ∏_{l ∈ ∂a \ j} ν̂^t_{l→a}(x_l)
  ν̂^{t+1}_{j→a}(x_j) = ∏_{b ∈ ∂j \ a} ν^t_{b→j}(x_j)
- exact on trees
- accurate on locally tree-like graphs
- no generic proof technique that works on all problems
- successful in many applications, even when we cannot prove convergence
Slide 23: More challenges
What if µ is a pdf on R^n? The extension seems straightforward:
  ν^{t+1}_{a→j}(x_j) = ∫ ψ_a(x_{∂a}) ∏_{l ∈ ∂a \ j} ν̂^t_{l→a}(x_l) dx_{∂a \ j}
  ν̂^{t+1}_{j→a}(x_j) = ∏_{b ∈ ∂j \ a} ν^t_{b→j}(x_j)
But the calculations become complicated:
- one must discretize the pdf to a given accuracy
- and perform many numeric multiple integrations
Exception: Gaussian graphical models, µ(x) = exp(−x^T J x + c^T x) / Z; all messages remain Gaussian.
NOT the focus of this talk.
Slide 24: Approximate message passing
Slide 25: Model
  y = A x_o + w
- x_o: a k-sparse vector in R^N
- A: an n × N design matrix
- y: the measurement vector in R^n
- w: the measurement noise in R^n
[Figure: y = A x_o + w, with x_o a k-sparse signal]
Slide 26: LASSO
Many useful heuristic algorithms:
Noiseless: ℓ1-minimization
  minimize ||x||_1 subject to y = Ax, where ||x||_1 = Σ_i |x_i|
Noisy: LASSO
  minimize (1/2) ||y − Ax||_2^2 + λ ||x||_1
Both are convex optimizations. Chen, Donoho, Saunders (96); Tibshirani (96).
Slide 28: Algorithmic challenges of sparse recovery
Use convex optimization tools to solve the LASSO. Computational complexity:
- interior point methods: generic tools; appropriate for N < 5000
- homotopy methods: use the structure of the LASSO; appropriate for N < ...
- first-order methods: low computational complexity per iteration, but require many iterations
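As a concrete instance of a first-order method, iterative soft thresholding takes a gradient step on (1/2)||y − Ax||² and then soft-thresholds; with step size 1/||A||²_2 the LASSO objective is non-increasing. A minimal sketch on made-up data (the problem sizes and λ are arbitrary):

```python
import numpy as np

def soft(x, theta):
    """Soft thresholding: the proximal operator of theta * ||.||_1."""
    return np.sign(x) * np.maximum(np.abs(x) - theta, 0.0)

rng = np.random.default_rng(0)
n, N, k, lam = 50, 100, 5, 0.1
A = rng.standard_normal((n, N)) / np.sqrt(n)
x_true = np.zeros(N); x_true[:k] = rng.standard_normal(k)
y = A @ x_true + 0.01 * rng.standard_normal(n)

L = np.linalg.norm(A, 2) ** 2     # Lipschitz constant of the smooth part
obj = lambda x: 0.5 * np.sum((y - A @ x) ** 2) + lam * np.sum(np.abs(x))

x = np.zeros(N)
objs = [obj(x)]
for _ in range(200):
    x = soft(x + (A.T @ (y - A @ x)) / L, lam / L)  # gradient step + prox
    objs.append(obj(x))
print(objs[0], objs[-1])          # the objective only decreases
```

The many-iterations drawback mentioned above is visible here: each step is one matrix-vector multiply pair, but hundreds of steps may be needed.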
Slide 31: Message passing for LASSO
Define:
  µ(dx) = (1/Z) e^{−βλ||x||_1 − (β/2)||y − Ax||_2^2} dx
- Z: normalization constant; β > 0
- as β → ∞, µ concentrates around the solution of the LASSO, x̂_λ
- marginalize µ to obtain x̂_λ
Slide 34: Message passing algorithm
  µ(dx) = (1/Z) e^{−βλ Σ_i |x_i| − (β/2) Σ_a (y_a − (Ax)_a)^2} dx
Belief propagation update rules:
  ν^{t+1}_{i→a}(s_i) = e^{−βλ|s_i|} ∏_{b ∈ ∂i \ a} ν̂^t_{b→i}(s_i)
  ν̂^t_{a→i}(s_i) = ∫ e^{−(β/2)(y_a − (As)_a)^2} ∏_{j ∈ ∂a \ i} dν^t_{j→a}(s_j)
Pearl (82)
Slide 35: Issues of the message passing algorithm
Challenges:
  ν^{t+1}_{i→a}(s_i) = e^{−βλ|s_i|} ∏_{b ∈ ∂i \ a} ν̂^t_{b→i}(s_i)
  ν̂^t_{a→i}(s_i) = ∫ e^{−(β/2)(y_a − (As)_a)^2} ∏_{j ∈ ∂a \ i} dν^t_{j→a}(s_j)
- messages are distributions over the real line
- calculation of the messages is difficult: numeric integration, and the number of variables in the integral is N − 1
- the graph is not "tree-like": no good analysis tools
Slide 36: Challenges for the message passing algorithm
Challenges (as above): messages are distributions over the real line; calculation of the messages is difficult; the graph is not "tree-like".
Solution: the blessing of dimensionality:
- as N → ∞, ν̂^t_{a→i}(s_i) converges to a Gaussian
- as N → ∞, ν^t_{i→a}(s_i) converges to
  f_β(s; x, b) ≜ (1/z_β(x, b)) e^{−β|s| − (β/2b)(s − x)^2}
Slide 38: Simplifying the messages for large N
Simplification of ν̂^t_{a→i}(s_i):
  ν̂^t_{a→i}(s_i) = ∫ e^{−(β/2)(y_a − (As)_a)^2} ∏_{j ∈ ∂a \ i} dν^t_{j→a}(s_j)
    = E exp(−(β/2)(y_a − A_{ai}s_i − Σ_{j≠i} A_{aj}s_j)^2)
    = E exp(−(β/2)(A_{ai}s_i − Z)^2), where Z ≜ y_a − Σ_{j≠i} A_{aj}s_j
By the central limit theorem, Z converges in distribution to a Gaussian W, and
  |E h_{s_i}(Z) − E h_{s_i}(W)| ≤ C_t N^{−1/2} (τ̂^t_{a→i})^{−3/2}
Slide 39: Message passing
As N → ∞, ν̂^t_{a→i} converges to a Gaussian distribution.
Theorem: Let x^t_{j→a} and τ^t_{j→a}/β denote the mean and variance of the distribution ν^t_{j→a}. Then there exists a constant C_t such that
  sup_{s_i} |ν̂^t_{a→i}(s_i) − φ̂^t_{a→i}(s_i)| ≤ C_t / (N (τ̂^t_{a→i})^3),
where
  φ̂^t_{a→i}(ds_i) ≜ sqrt(β A_{ai}^2 / (2π τ̂^t_{a→i})) e^{−β(A_{ai}s_i − z^t_{a→i})^2 / (2 τ̂^t_{a→i})} ds_i,
  z^t_{a→i} ≜ y_a − Σ_{j≠i} A_{aj} x^t_{j→a},
  τ̂^t_{a→i} ≜ Σ_{j≠i} A_{aj}^2 τ^t_{j→a}.
Donoho, M., Montanari (11)
Slide 40: Shape of ν^t_{i→a}: intuition
Summary: ν̂^t_{b→i}(s_i) is approximately Gaussian, and
  ν^{t+1}_{i→a}(s_i) = e^{−βλ|s_i|} ∏_{b ∈ ∂i \ a} ν̂^t_{b→i}(s_i)
Hence the shape of ν^{t+1}_{i→a}(s_i):
  ν^{t+1}_{i→a}(s_i) ≈ (1/z_β(x, b)) e^{−β|s_i| − (β/2b)(s_i − x)^2}
Slide 41: Message passing (cont'd)
Theorem: Suppose that ν̂^t_{a→i} = φ̂^t_{a→i}, with mean z^t_{a→i} and variance τ̂^t_{a→i} = τ̂^t. Then at the next iteration we have
  |ν^{t+1}_{i→a}(s_i) − φ^{t+1}_{i→a}(s_i)| < C/n,
where
  φ^{t+1}_{i→a}(s_i) ≜ λ f_β(λ s_i; λ Σ_{b ∈ ∂i \ a} A_{bi} z^t_{b→i}, λ^2 (1 + τ̂^t)),
  f_β(s; x, b) ≜ (1/z_β(x, b)) e^{−β|s| − (β/2b)(s − x)^2}.
Donoho, M., Montanari (11)
Slide 42: Next asymptotic: β → ∞
Reminder:
- x^t_{j→a}: mean of ν^t_{j→a}
- z^t_{a→i}: mean of ν̂^t_{a→i}, with z^t_{a→i} = y_a − Σ_{j≠i} A_{aj} x^t_{j→a}
Can we write x^t_{i→a} in terms of the z^t_{b→i}?
  x^t_{i→a} = (1/z_β) ∫ s_i e^{−β|s_i| − β(s_i − Σ_{b ∈ ∂i \ a} A_{bi} z^t_{b→i})^2} ds_i
Can be done, but the Laplace method simplifies it (β → ∞):
  x^t_{i→a} = η(Σ_{b ∈ ∂i \ a} A_{bi} z^t_{b→i}),
where η is the soft-thresholding function.
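The scalar function the Laplace method produces here is soft thresholding, η(x; θ) = sign(x) max(|x| − θ, 0), which is exactly the minimizer of θ|s| + (s − x)²/2. A quick numerical check of that variational characterization (the grid search is only illustrative):

```python
import numpy as np

def eta(x, theta):
    """Soft threshold: argmin over s of theta*|s| + 0.5*(s - x)**2."""
    return np.sign(x) * np.maximum(np.abs(x) - theta, 0.0)

# Compare against brute-force minimization over a fine grid.
theta = 0.7
s = np.linspace(-5, 5, 200001)   # grid spacing 5e-5
for x in (-2.0, -0.3, 0.0, 0.5, 3.1):
    brute = s[np.argmin(theta * np.abs(s) + 0.5 * (s - x) ** 2)]
    print(x, eta(x, theta), brute)  # eta and the grid minimizer agree
```

Inputs with |x| ≤ θ are mapped to exactly 0, which is how the ℓ1 penalty produces sparse iterates.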
Slide 43: Summary
- x^t_{j→a}: mean of ν^t_{j→a}
- z^t_{a→i}: mean of ν̂^t_{a→i}
  z^t_{a→i} = y_a − Σ_{j≠i} A_{aj} x^t_{j→a}
  x^{t+1}_{i→a} = η(Σ_{b ∈ ∂i \ a} A_{bi} z^t_{b→i})
Slide 44: Comparison of MP and IST
MP (one message per edge):
  z^t_{a→i} = y_a − Σ_{j≠i} A_{aj} x^t_{j→a}
  x^{t+1}_{i→a} = η(Σ_{b ≠ a} A_{bi} z^t_{b→i})
IST (one value per node):
  z^t_a = y_a − Σ_j A_{aj} x^t_j
  x^{t+1}_i = η(Σ_b A_{bi} z^t_b)
[Figure: bipartite graphs; MP tracks messages on every edge x_i — y_a, IST only node values]
Slide 45: Approximate message passing
We can further simplify the messages:
  x^{t+1} = η(x^t + A^T z^t; λ_t)
  z^t = y − A x^t + (||x^t||_0 / n) z^{t−1}
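The whole algorithm now fits in a few lines; a minimal sketch on made-up data. Two caveats: the threshold rule λ_t = 2||z^t||/√n used below is a common practical heuristic, not the tuning from these slides, and the term (||x^t||_0 / n) z^{t−1} is the correction that distinguishes AMP from IST:

```python
import numpy as np

def eta(x, theta):
    """Soft thresholding, applied componentwise."""
    return np.sign(x) * np.maximum(np.abs(x) - theta, 0.0)

rng = np.random.default_rng(1)
n, N, k = 250, 500, 25
A = rng.standard_normal((n, N)) / np.sqrt(n)   # roughly unit-norm columns
x0 = np.zeros(N)
x0[rng.choice(N, size=k, replace=False)] = 3.0 * rng.standard_normal(k)
y = A @ x0                                     # noiseless measurements

x = np.zeros(N)
z = y.copy()
for t in range(30):
    # Residual with the correction term (||x||_0 / n) * z_prev
    z = y - A @ x + (np.count_nonzero(x) / n) * z
    theta = 2.0 * np.linalg.norm(z) / np.sqrt(n)  # heuristic threshold rule
    x = eta(x + A.T @ z, theta)

mse = np.mean((x - x0) ** 2)
print(mse)
```

Each iteration costs the same two matrix-vector products as IST; the correction term is what makes x^t + A^T z^t behave like the signal plus Gaussian noise, as discussed on the last slide.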
Slide 46: IST versus AMP
IST:
  x^{t+1} = η(x^t + A^T z^t; λ_t)
  z^t = y − A x^t
AMP:
  x^{t+1} = η(x^t + A^T z^t; λ_t)
  z^t = y − A x^t + (1/n) ||x^t||_0 z^{t−1}
[Figure: mean square error vs. iteration for IST and AMP; the state evolution prediction matches the empirical curves]
Slide 47: Theoretical guarantees
We can predict the performance of AMP at every iteration in the asymptotic regime.
Idea: x^t + A^T z^t can be modeled as signal plus Gaussian noise, and we keep track of the noise variance.
I will discuss this part in the "Risk Seminar".
More informationIntroduction to Graphical Models. Srikumar Ramalingam School of Computing University of Utah
Introduction to Graphical Models Srikumar Ramalingam School of Computing University of Utah Reference Christopher M. Bishop, Pattern Recognition and Machine Learning, Jonathan S. Yedidia, William T. Freeman,
More informationMessage-Passing Algorithms for GMRFs and Non-Linear Optimization
Message-Passing Algorithms for GMRFs and Non-Linear Optimization Jason Johnson Joint Work with Dmitry Malioutov, Venkat Chandrasekaran and Alan Willsky Stochastic Systems Group, MIT NIPS Workshop: Approximate
More informationCPSC 540: Machine Learning
CPSC 540: Machine Learning Mark Schmidt University of British Columbia Winter 2018 Last Time: Learning and Inference in DAGs We discussed learning in DAG models, log p(x W ) = n d log p(x i j x i pa(j),
More informationPAC-learning, VC Dimension and Margin-based Bounds
More details: General: http://www.learning-with-kernels.org/ Example of more complex bounds: http://www.research.ibm.com/people/t/tzhang/papers/jmlr02_cover.ps.gz PAC-learning, VC Dimension and Margin-based
More information10708 Graphical Models: Homework 2
10708 Graphical Models: Homework 2 Due Monday, March 18, beginning of class Feburary 27, 2013 Instructions: There are five questions (one for extra credit) on this assignment. There is a problem involves
More informationReconstruction from Anisotropic Random Measurements
Reconstruction from Anisotropic Random Measurements Mark Rudelson and Shuheng Zhou The University of Michigan, Ann Arbor Coding, Complexity, and Sparsity Workshop, 013 Ann Arbor, Michigan August 7, 013
More informationLecture 18 Generalized Belief Propagation and Free Energy Approximations
Lecture 18, Generalized Belief Propagation and Free Energy Approximations 1 Lecture 18 Generalized Belief Propagation and Free Energy Approximations In this lecture we talked about graphical models and
More informationMessage Passing Algorithms for Compressed Sensing: I. Motivation and Construction
Message Passing Algorithms for Compressed Sensing: I. Motivation and Construction David L. Donoho Department of Statistics Arian Maleki Department of Electrical Engineering Andrea Montanari Department
More informationConditional Independence and Factorization
Conditional Independence and Factorization Seungjin Choi Department of Computer Science and Engineering Pohang University of Science and Technology 77 Cheongam-ro, Nam-gu, Pohang 37673, Korea seungjin@postech.ac.kr
More informationProbabilistic Graphical Models Lecture Notes Fall 2009
Probabilistic Graphical Models Lecture Notes Fall 2009 October 28, 2009 Byoung-Tak Zhang School of omputer Science and Engineering & ognitive Science, Brain Science, and Bioinformatics Seoul National University
More informationEEL 5544 Noise in Linear Systems Lecture 30. X (s) = E [ e sx] f X (x)e sx dx. Moments can be found from the Laplace transform as
L30-1 EEL 5544 Noise in Linear Systems Lecture 30 OTHER TRANSFORMS For a continuous, nonnegative RV X, the Laplace transform of X is X (s) = E [ e sx] = 0 f X (x)e sx dx. For a nonnegative RV, the Laplace
More informationChris Bishop s PRML Ch. 8: Graphical Models
Chris Bishop s PRML Ch. 8: Graphical Models January 24, 2008 Introduction Visualize the structure of a probabilistic model Design and motivate new models Insights into the model s properties, in particular
More informationLow-Density Parity-Check Codes
Department of Computer Sciences Applied Algorithms Lab. July 24, 2011 Outline 1 Introduction 2 Algorithms for LDPC 3 Properties 4 Iterative Learning in Crowds 5 Algorithm 6 Results 7 Conclusion PART I
More informationMachine Learning 4771
Machine Learning 4771 Instructor: Tony Jebara Topic 16 Undirected Graphs Undirected Separation Inferring Marginals & Conditionals Moralization Junction Trees Triangulation Undirected Graphs Separation
More informationGraphical Models Concepts in Compressed Sensing
Graphical Models Concepts in Compressed Sensing Andrea Montanari Abstract This paper surveys recent work in applying ideas from graphical models and message passing algorithms to solve large scale regularized
More informationON THE MINIMUM DISTANCE OF NON-BINARY LDPC CODES. Advisor: Iryna Andriyanova Professor: R.. udiger Urbanke
ON THE MINIMUM DISTANCE OF NON-BINARY LDPC CODES RETHNAKARAN PULIKKOONATTU ABSTRACT. Minimum distance is an important parameter of a linear error correcting code. For improved performance of binary Low
More informationRecent developments on sparse representation
Recent developments on sparse representation Zeng Tieyong Department of Mathematics, Hong Kong Baptist University Email: zeng@hkbu.edu.hk Hong Kong Baptist University Dec. 8, 2008 First Previous Next Last
More informationThe Ising model and Markov chain Monte Carlo
The Ising model and Markov chain Monte Carlo Ramesh Sridharan These notes give a short description of the Ising model for images and an introduction to Metropolis-Hastings and Gibbs Markov Chain Monte
More informationGaussian Graphical Models and Graphical Lasso
ELE 538B: Sparsity, Structure and Inference Gaussian Graphical Models and Graphical Lasso Yuxin Chen Princeton University, Spring 2017 Multivariate Gaussians Consider a random vector x N (0, Σ) with pdf
More informationLinear Response for Approximate Inference
Linear Response for Approximate Inference Max Welling Department of Computer Science University of Toronto Toronto M5S 3G4 Canada welling@cs.utoronto.ca Yee Whye Teh Computer Science Division University
More informationApproximating the Partition Function by Deleting and then Correcting for Model Edges (Extended Abstract)
Approximating the Partition Function by Deleting and then Correcting for Model Edges (Extended Abstract) Arthur Choi and Adnan Darwiche Computer Science Department University of California, Los Angeles
More informationApproximate Message Passing for Bilinear Models
Approximate Message Passing for Bilinear Models Volkan Cevher Laboratory for Informa4on and Inference Systems LIONS / EPFL h,p://lions.epfl.ch & Idiap Research Ins=tute joint work with Mitra Fatemi @Idiap
More informationSparse regression. Optimization-Based Data Analysis. Carlos Fernandez-Granda
Sparse regression Optimization-Based Data Analysis http://www.cims.nyu.edu/~cfgranda/pages/obda_spring16 Carlos Fernandez-Granda 3/28/2016 Regression Least-squares regression Example: Global warming Logistic
More informationLecture 9: PGM Learning
13 Oct 2014 Intro. to Stats. Machine Learning COMP SCI 4401/7401 Table of Contents I Learning parameters in MRFs 1 Learning parameters in MRFs Inference and Learning Given parameters (of potentials) and
More informationCPSC 540: Machine Learning
CPSC 540: Machine Learning Multivariate Gaussians Mark Schmidt University of British Columbia Winter 2019 Last Time: Multivariate Gaussian http://personal.kenyon.edu/hartlaub/mellonproject/bivariate2.html
More information