Belief Propagation on Partially Ordered Sets
Robert J. McEliece, California Institute of Technology
International Symposium on Mathematical Theory of Networks and Systems, University of Notre Dame, August 13, 2002.
With much help from Muhammed Yildirim, Jonathan Harel, Jeremy Thorpe, and Ravi Palanki.
[Title-slide figure: the poset with top row {1,2,3} {1,3,4} {2,3,5} {3,4,5}, middle row {1,3} {2,3} {3,4} {3,5}, and bottom element {3} (+1); it reappears later as Poset Number Three.]
A Big Tip of the Hat to: Jonathan Yedidia, William Freeman, and Yair Weiss, the authors of Generalized Belief Propagation and Free Energy Minimization, which inspired this paper.
Outline of the Talk
- The Problem Statement
- Significance of the Problem
- A Statistical Physics Approach
- Solution Via Poset Belief Propagation
- Experimental Results
- Connection Between the Statistical Physics and the BP Approaches
- Open Problems
- Coffee Break
A General Computational Problem
Variables $\{x_1,\ldots,x_n\}$, with $x_i \in A = \{0,1,\ldots,q-1\}$.
$\mathcal{R} = \{R_1,\ldots,R_M\}$, a collection of subsets of $\{1,2,\ldots,n\}$.
A set of nonnegative local potentials $\{\alpha_R(x_R) : R \in \mathcal{R}\}$.
Define the global (Boltzmann) probability density function
$$B(\mathbf{x}) = \frac{1}{Z}\prod_{R\in\mathcal{R}} \alpha_R(x_R),$$
where
$$Z = \sum_{\mathbf{x}\in A^n}\prod_{R\in\mathcal{R}} \alpha_R(x_R) \quad \text{(partition function)},$$
and $F = -\ln Z$ is the Helmholtz free energy.
A General Computational Problem, Continued
Problem: Compute, exactly or approximately, the Helmholtz free energy $F$ and some or all of the local marginal densities of the Boltzmann density: for $R \in \mathcal{R}$,
$$B_R(x_R) = \sum_{x_{R^c}\in A^{R^c}} B(\mathbf{x}) = \frac{1}{Z}\sum_{x_{R^c}\in A^{R^c}}\prod_{S\in\mathcal{R}} \alpha_S(x_S).$$
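To make the problem concrete, here is a minimal brute-force sketch in Python (the function names and the dict-of-potentials representation are my own, not from the talk). It computes $Z$, hence $F = -\ln Z$, and a marginal $B_R$ by direct enumeration over all $q^n$ configurations; the point of the rest of the talk is to approximate these quantities far more cheaply.

```python
import itertools

def partition_function(n, q, potentials):
    """Z = sum over all x in A^n of prod_R alpha_R(x_R).

    potentials: dict mapping a local domain R (tuple of 1-based variable
    indices) to a function alpha_R of the restricted configuration x_R.
    """
    Z = 0.0
    for x in itertools.product(range(q), repeat=n):
        p = 1.0
        for R, alpha in potentials.items():
            p *= alpha(tuple(x[i - 1] for i in R))
        Z += p
    return Z  # the Helmholtz free energy is F = -log(Z)

def marginal(n, q, potentials, R):
    """B_R(x_R): the Boltzmann density summed over the variables not in R."""
    Z = partition_function(n, q, potentials)
    B_R = {}
    for x in itertools.product(range(q), repeat=n):
        p = 1.0
        for S, alpha in potentials.items():
            p *= alpha(tuple(x[i - 1] for i in S))
        xR = tuple(x[i - 1] for i in R)
        B_R[xR] = B_R.get(xR, 0.0) + p / Z
    return B_R
```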
Applications
Appropriately interpreted, this computational problem includes:
- Turbo/LDPC decoding.
- Probabilistic inference in Bayesian networks.
- Finite Fourier transforms.
- Free energy computations in statistical physics.
(But we won't discuss these applications.)
A Simple Example
Alphabet: $A = \{0,1\}$.
Local domains: $\mathcal{R} = \{\{1,2,3\},\{1,3,4\},\{2,3,5\},\{3,4,5\}\}$.
Local potentials: for $i = 1,2,3,4$,
$$\alpha_i(x,y,z) = \begin{cases} 1/2 & \text{if } x=y=z \\ 0 & \text{otherwise.}\end{cases}$$
Answers: $Z = 1/8$, $F = \ln 8$,
$$B(x_1,\ldots,x_5) = \begin{cases} 1/2 & \text{if } x_1=x_2=\cdots=x_5 \\ 0 & \text{otherwise,}\end{cases}
\qquad
B_i(x,y,z) = \begin{cases} 1/2 & \text{if } x=y=z \\ 0 & \text{otherwise.}\end{cases}$$
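As a sanity check, feeding this example to the brute-force sketch above reproduces the stated answers (this snippet reuses `partition_function` and `marginal` from that sketch):

```python
import math

# The four local potentials are identical: 1/2 when all arguments agree.
alpha = lambda xyz: 0.5 if xyz[0] == xyz[1] == xyz[2] else 0.0
potentials = {(1, 2, 3): alpha, (1, 3, 4): alpha,
              (2, 3, 5): alpha, (3, 4, 5): alpha}

Z = partition_function(5, 2, potentials)    # 0.125 = 1/8
F = -math.log(Z)                            # ln 8
B1 = marginal(5, 2, potentials, (1, 2, 3))  # 1/2 at (0,0,0) and (1,1,1)
```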
A Statistical Physics Approach
$S = \{s_1,\ldots,s_n\}$: $n$ identical particles.
Spin of $s_i$: $x_i \in A = \{0,1,\ldots,q-1\}$.
System configuration: $\mathbf{x} = (x_1,x_2,\ldots,x_n)$.
Energy (Hamiltonian) of configuration $\mathbf{x}$:
$$E(x_1,\ldots,x_n) = -\sum_{R\in\mathcal{R}} \ln \alpha_R(x_R).$$
A Statistical Physics Approach, Continued
$b(\mathbf{x})$: trial probability of configuration $\mathbf{x}$.
Average energy: $U = \sum_{\mathbf{x}\in A^n} b(\mathbf{x})E(\mathbf{x})$.
Entropy: $H = -\sum_{\mathbf{x}\in A^n} b(\mathbf{x})\ln b(\mathbf{x})$.
Variational free energy: $F(b) = U - H$.
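These definitions translate directly into code; a minimal sketch, with the trial density stored as a dict from configurations to probabilities (an assumed representation carried over from the earlier sketch):

```python
import math

def variational_free_energy(b, E):
    """F(b) = U - H for a trial density b (dict: config -> probability),
    where E is a function giving the energy of a configuration."""
    U = sum(p * E(x) for x, p in b.items())                    # average energy
    H = -sum(p * math.log(p) for x, p in b.items() if p > 0)   # entropy
    return U - H
```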
A Famous Theorem from Statistical Mechanics
Theorem. $F(b) \ge F$, with equality if and only if
$$b(\mathbf{x}) = B(\mathbf{x}) = \frac{1}{Z}e^{-E(\mathbf{x})},$$
the Boltzmann-Gibbs, or equilibrium, density.
Proof of the Famous Theorem
A simple calculation shows that $F(b) = D(b\,\|\,B) + F$. Hence $F(b) \ge F$, with equality iff $b = B$.
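Written out, the simple calculation is just a rearrangement of the definitions, using $e^{-E(\mathbf{x})} = Z\,B(\mathbf{x})$:
$$
F(b) = \sum_{\mathbf{x}} b(\mathbf{x})E(\mathbf{x}) + \sum_{\mathbf{x}} b(\mathbf{x})\ln b(\mathbf{x})
= \sum_{\mathbf{x}} b(\mathbf{x})\ln\frac{b(\mathbf{x})}{e^{-E(\mathbf{x})}}
= \sum_{\mathbf{x}} b(\mathbf{x})\ln\frac{b(\mathbf{x})}{Z\,B(\mathbf{x})}
= D(b\,\|\,B) - \ln Z = D(b\,\|\,B) + F,
$$
and $D(b\,\|\,B) \ge 0$ with equality iff $b = B$, by the information inequality.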
An Important Corollary
Corollary.
$$F = \min_{b(\mathbf{x})} F(b), \qquad B(\mathbf{x}) = \arg\min_{b(\mathbf{x})} F(b).$$
This suggests a possible method for computing $F$ (and $B(\mathbf{x})$), but as it stands it's too complex, and anyway it doesn't yield the marginals $B_R(x_R)$...
A Solution Using Belief Propagation on a Partially Ordered Set (But What is a Partially Ordered Set?)
A finite partially ordered set is a finite set $P$ together with a binary relation, denoted $\le$, which satisfies the following three axioms:
1. For all $\rho \in P$, $\rho \le \rho$. (reflexivity)
2. If $\rho \le \sigma$ and $\sigma \le \rho$, then $\rho = \sigma$. (antisymmetry)
3. If $\rho \le \sigma$ and $\sigma \le \tau$, then $\rho \le \tau$. (transitivity)
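These axioms are easy to check mechanically. Here is a small sketch, under the assumption that the relation is given explicitly as a set of ordered pairs $(\rho,\sigma)$ meaning $\rho \le \sigma$:

```python
def is_poset(elements, leq):
    """Check reflexivity, antisymmetry, and transitivity of leq."""
    refl = all((p, p) in leq for p in elements)
    antisym = all(not ((p, q) in leq and (q, p) in leq and p != q)
                  for p in elements for q in elements)
    trans = all((p, r) in leq
                for p in elements for q in elements for r in elements
                if (p, q) in leq and (q, r) in leq)
    return refl and antisym and trans

# Example: the divisibility order on {1, 2, 3, 6} is a poset.
els = {1, 2, 3, 6}
leq = {(a, b) for a in els for b in els if b % a == 0}
assert is_poset(els, leq)
```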
Hasse Diagrams for Three Posets
[Figure: three example Hasse diagrams.]
Overcounting Numbers for Posets
We assign an overcounting number $\phi(\rho)$ to each $\rho \in P$, such that
$$\sum_{\rho:\,\rho \ge \sigma} \phi(\rho) = 1 \quad \text{for all } \sigma \in P. \tag{1}$$
The overcounting numbers $\phi(\rho)$ are integers and are determined uniquely by (1).
Some Overcounting Numbers
[Figure: posets annotated with their overcounting numbers.]
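Condition (1) can be solved top-down via the equivalent recursion $\phi(\sigma) = 1 - \sum_{\rho > \sigma}\phi(\rho)$, visiting each element only after all of its strict superiors. A sketch for the cluster-variational poset from the title slide; the frozenset representation and the inclusion order are my own assumptions:

```python
# The poset, ordered by set inclusion: sigma <= rho iff sigma is a subset of rho.
elements = [frozenset(s) for s in
            [{1, 2, 3}, {1, 3, 4}, {2, 3, 5}, {3, 4, 5},
             {1, 3}, {2, 3}, {3, 4}, {3, 5}, {3}]]

def strict_superiors(sigma):
    return [rho for rho in elements if sigma < rho]  # proper subset

phi = {}
# Fewer superiors first, so every superior is visited before sigma.
for sigma in sorted(elements, key=lambda s: len(strict_superiors(s))):
    phi[sigma] = 1 - sum(phi[rho] for rho in strict_superiors(sigma))

# Result: +1 on the four big clusters, -1 on the four pairwise
# intersections, and +1 on {3}, matching the "+1" shown at {3} in the diagrams.
```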
How to Distribute the Local Potentials in a Given Poset $P$
Step 1: Assign each variable $x_i$ to one or more elements of $P$. If $P_i = \{\rho \in P : x_i \text{ is assigned to } \rho\}$, then we require:
- $P_i$ is connected;
- $P_i$ is closed upward, i.e., if $x_i$ is assigned to $\rho$ it is also assigned to all of $\rho$'s superiors;
- $\phi(P_i) = 1$, i.e., the net number of appearances of $x_i$ is 1.
We denote by $D(\rho)$ (the local domain at $\rho$) the set of variables which are assigned to $\rho$.
Step 2: Assign each local potential $\alpha_R(x_R)$ to one or more elements of $P$. If $P_R = \{\rho \in P : \alpha_R(x_R) \text{ is assigned to } \rho\}$, then we require:
- $R \subseteq D(\rho)$ for all $\rho \in P_R$, i.e., the local domain at $\rho$ supports $\alpha_R(x_R)$;
- $P_R$ is connected;
- $P_R$ is closed upward, i.e., if $\alpha_R$ is known to $\rho$ it is also known to all of $\rho$'s superiors;
- $\phi(P_R) = 1$, i.e., the net number of appearances of $\alpha_R(x_R)$ is 1.
Example: Generalized Belief Propagation (YF&W)
$\mathcal{R} = \{\{1,2\},\{2,3\},\ldots,\{8,9\}\}$. Here is the poset:
[Figure: Hasse diagram for this example.]
The Trick is to Cluster the Variables
[Poset diagram: top row {1,2,4,5} {2,3,5,6} {4,5,7,8} {5,6,8,9}; middle row {2,5} {4,5} {5,6} {5,8}; bottom element {5} (+1).]
The PBP Algorithm: The Messages
Throughout the algorithm, each edge $e = (\rho,\sigma)$ of the Hasse diagram carries a message $m_e$. The message $m_e$ on the edge $e$ is a function on the domain of $\sigma$: $m_e = m_e(x_\sigma)$.
Example: $\rho$ has domain $\{x_1,x_2,x_4,x_6\}$ and $\sigma$ has domain $\{x_2,x_4,x_6\}$, so the message is a table such as:

  (x_2, x_4, x_6)    m_e(x_2, x_4, x_6)
  (0, 0, 0)          1
  (0, 0, 1)          0
  ...                ...
  (1, 1, 1)          0
The PBP Algorithm: Calculating Beliefs
For a given set of messages $\{m_e : e \in E\}$, we define the belief at $\rho$ as the following probability density on the domain at $\rho$:
$$b_\rho(x_\rho) \propto \alpha_\rho(x_\rho)\prod_{e\in E(\rho)} m_e(x_{\mathrm{fin}(e)}),$$
where the normalization is such that $\sum_{x_\rho} b_\rho(x_\rho) = 1$. Here $\alpha_\rho(x_\rho)$ is the local potential at $\rho$, and $E(\rho)$ represents the set of messages which are fused at $\rho$.
The Messages that are Fused at $\rho$
$$b_\rho(x_\rho) \propto \alpha_\rho(x_\rho)\prod_{e\in E(\rho)} m_e(x_{\mathrm{fin}(e)}).$$
The messages in $E(\rho)$ are those that originate outside $\rho$'s field of view but terminate inside it.
[Figure: an edge $e$ with $\mathrm{init}(e)$ outside and $\mathrm{fin}(e)$ inside $\rho$'s field of view.]
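A sketch of the belief computation, with potentials and messages stored as dicts from configurations (tuples ordered by a sorted list of variable indices) to nonnegative reals. This data layout is mine, not prescribed by the talk:

```python
import itertools

def belief(dom_rho, alpha_rho, fused, q=2):
    """b_rho proportional to alpha_rho times the messages fused at rho.

    dom_rho:   sorted tuple of the variable indices in D(rho)
    alpha_rho: dict x_rho -> local potential value
    fused:     list of (dom_fin, message) pairs, one per edge in E(rho),
               where dom_fin is the domain of fin(e) and message is a
               dict x_fin -> message value
    """
    b = {}
    for x in itertools.product(range(q), repeat=len(dom_rho)):
        val = alpha_rho.get(x, 0.0)
        for dom_fin, m in fused:
            # restrict x to the variables of fin(e)
            val *= m[tuple(x[dom_rho.index(i)] for i in dom_fin)]
        b[x] = val
    total = sum(b.values())  # normalize so the belief sums to 1
    return {x: v / total for x, v in b.items()} if total > 0 else b
```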
The PBP Algorithm: Updating the Messages
An edge $e = (\rho,\sigma)$ is said to be consistent with respect to a given set $\{m_e : e \in E\}$ of messages if
$$\sum_{x_\rho\setminus x_\sigma} b_\rho(x_\rho) = b_\sigma(x_\sigma) \quad \text{for all } x_\sigma \in A^{D(\sigma)}.$$
In words, this says that the belief at $\rho$ in $x_\sigma$, obtained by marginalization, agrees with the belief at $\sigma$ in $x_\sigma$.
Example of Edge Consistency
$\rho$ has domain $\{x_1,x_2,x_4,x_6\}$ and $\sigma$ has domain $\{x_2,x_4,x_6\}$. The edge $e = (\rho,\sigma)$ is consistent if
$$\sum_{x_1\in A} b_\rho(x_1,x_2,x_4,x_6) = b_\sigma(x_2,x_4,x_6).$$
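The consistency test is a marginalization followed by a comparison; a sketch in the same assumed dict representation as above:

```python
def edge_is_consistent(dom_rho, b_rho, dom_sigma, b_sigma, tol=1e-9):
    """Check sum over x_rho \\ x_sigma of b_rho against b_sigma."""
    marg = {}
    for x, p in b_rho.items():
        # project each configuration of rho down to sigma's domain
        x_sigma = tuple(x[dom_rho.index(i)] for i in dom_sigma)
        marg[x_sigma] = marg.get(x_sigma, 0.0) + p
    return all(abs(marg.get(xs, 0.0) - p) <= tol
               for xs, p in b_sigma.items())
```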
The PBP Algorithm: The Update Rule
When the message $m_e$ is updated, it is adjusted so that the edge $e$ becomes consistent. Explicitly,
$$m_e(x_\sigma) \leftarrow \frac{\displaystyle\sum_{x_\rho\setminus x_\sigma} \alpha_{\rho\setminus\sigma}(x_\rho)\prod_{g\in E(\rho)\setminus E(\sigma)} m_g(x_{\mathrm{fin}(g)})}{\displaystyle\prod_{f\in E(\sigma)\setminus\{E(\rho)\,\cup\, e\}} m_f(x_{\mathrm{fin}(f)})}.$$
The PBP algorithm proceeds by updating messages according to this rule. The hope is that the messages will converge to a fixed point whose associated beliefs are good approximations to the desired marginals, i.e., $b_\rho(x_\rho) \approx B_\rho(x_\rho)$, where $B(\mathbf{x})$ is the global or Boltzmann density.
How Well Does it Work? A Series of Experiments
Local domains: $\mathcal{R} = \{\{1,2,3\},\{1,3,4\},\{2,3,5\},\{3,4,5\}\}$.
Local potentials: $\alpha_1(x_1,x_2,x_3)$, $\alpha_2(x_1,x_3,x_4)$, $\alpha_3(x_2,x_3,x_5)$, and $\alpha_4(x_3,x_4,x_5)$, selected at random from the 8-simplex.
There are many posets that support this choice of domains; we will investigate only three.
Poset Number One (Junction Graph Construction)
[Hasse diagram: top row {1,2,3} {1,3,4} {2,3,5} {3,4,5}; bottom row {1} {2,3} {3,4} {3,5}.]
Poset Number Two (Factor Graph Construction)
[Hasse diagram: top row {1,2,3} {1,3,4} {3,4,5} {2,3,5}; bottom row {1} {2} {3} {4} {5}.]
Poset Number Three (Cluster Variational Method)
[Hasse diagram: top row {1,2,3} {1,3,4} {2,3,5} {3,4,5}; middle row {1,3} {2,3} {3,4} {3,5}; bottom element {3} (+1).]
Poset Number One: Results
[Plots: PBP on Poset #1, iterative vs. exact marginals, one panel per local kernel (1-4) and one per variable (1-5).]
Poset Number Two: Results
[Plots: PBP on Poset #2, iterative vs. exact marginals, one panel per local kernel (1-4) and one per variable (1-5).]
Poset Number Three: Results
[Plots: PBP on Poset #3, iterative vs. exact marginals for local kernels 1-4 (1000 runs, w = 0.65).]
The Problem is Non-Convergence, and is Easily Repaired
$$y_{n+1} = F(y_n) \quad \text{(too aggressive)}$$
$$y_{n+1} = \sqrt{y_n\,F(y_n)} \quad \text{(slower, but surer)}$$
$$y_{n+1} = y_n^{1-w}\,F(y_n)^w \quad \text{for } 0 < w < 1.$$
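In code, the smoothed update is a geometric mixture of the old message and the raw update, followed by renormalization. A minimal sketch; the default w = 0.65 matches the value used in the experiments on these slides:

```python
import numpy as np

def damped_update(m_old, m_raw, w=0.65):
    """y_{n+1} = y_n^(1-w) * F(y_n)^w, applied entrywise to a message
    stored as a nonnegative numpy array, then renormalized."""
    m_new = m_old ** (1.0 - w) * m_raw ** w
    return m_new / m_new.sum()
```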
Poset Number Three: Tuning the Smoothing Parameter
[Plot: performance of GGBP (PBP) on Poset #3 for 10/w iterations; Kullback-Leibler divergence (summed over experiments) vs. w, the coefficient of the old message in the update rule.]
[Plot: the same for 1000/w iterations, with w on [0.3, 0.6].]
Poset Number Three: Results with Smoothing
[Plots: PBP on Poset #3, iterative vs. exact marginals, one panel per local kernel (1-4) and one per variable (1-5).]
Why Does It Work?
No one really knows, but... the PBP algorithm can be viewed as an algorithm for minimizing a certain energy function. There is a one-to-one correspondence between the fixed points of PBP and the stationary points of this energy. (It appears that the attractive fixed points correspond to the local minima of the free energy.)
More Precisely: The Bethe-Kikuchi Approximation
We know
$$F = \min_{b(\mathbf{x})} F(b(\mathbf{x})), \qquad B(\mathbf{x}) = \arg\min_{b(\mathbf{x})} F(b(\mathbf{x})).$$
We approximate $F(b(\mathbf{x}))$ with something that depends only on the poset $P$ and the marginals $b_\rho(x_\rho)$ of $b(\mathbf{x})$:
$$F_P(b(\mathbf{x})) = \sum_{\rho\in P} \phi(\rho)\,F_\rho(b_\rho(x_\rho)),$$
where $F_\rho(b_\rho(x_\rho))$ is the local free energy at $\rho$, defined as
$$F_\rho(b_\rho) = \sum_{x_\rho} b_\rho(x_\rho)E_\rho(x_\rho) + \sum_{x_\rho} b_\rho(x_\rho)\ln b_\rho(x_\rho).$$
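A sketch of $F_P$ under the same assumed dict representation as the earlier sketches, given the overcounting numbers $\phi$, the local beliefs, and the local energies $E_\rho$:

```python
import math

def local_free_energy(b_rho, E_rho):
    """F_rho = average local energy plus negative local entropy."""
    U = sum(p * E_rho[x] for x, p in b_rho.items())
    minus_H = sum(p * math.log(p) for x, p in b_rho.items() if p > 0)
    return U + minus_H

def bethe_kikuchi_free_energy(phi, beliefs, energies):
    """F_P(b) = sum over rho of phi(rho) * F_rho(b_rho)."""
    return sum(phi[rho] * local_free_energy(beliefs[rho], energies[rho])
               for rho in phi)
```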
An Analogy
[Figure: two curves; black line = $F$, colored line = $F_P$.]
The hope is that
$$\min_{\{b_\rho(x_\rho)\}} F_P \approx \min_{b(\mathbf{x})} F(b) = F.$$
The BK Approximation on the Three Posets
[Plots: variational free energy $F_P$ vs. weighted difference from the convergence point, for Posets #1, #2, and #3, each annotated with its error.]
Some Open Questions
- What is the best poset for a given MPD problem?
- Prove (or find a counterexample) that the stable fixed points of PBP correspond to the local minima of the BK variational free energy.
- Can the PBP algorithm always be made to converge using the smoothing trick?
- What is the relationship between the BK approximate free energy and the exact (Helmholtz) free energy?
- Can other combinatorial optimization methods, e.g. simulated annealing, be used to minimize $F_P$, thereby leading to alternative BP algorithms?