Bounded Inference Non-iteratively; Mini-Bucket Elimination
1 Bounded Inference Non-iteratively; Mini-Bucket Elimination. COMPSCI 276, Spring 2018, Class 6: Rina Dechter. Reading: Class Notes (Chapter 8); Darwiche, Chapter 14.
2 Outline Mini-bucket elimination Weighted Mini-bucket Mini-clustering Re-parameterization, cost-shifting Iterative Belief propagation Iterative-join-graph propagation
3 Types of queries: Max-Inference, Sum-Inference, Mixed-Inference (increasingly harder). NP-hard: exponentially many terms. We will focus on approximation algorithms. Anytime: very fast & very approximate, or slower & more accurate.
4 Queries. Probability of evidence (or partition function): $P(e) = \sum_{X \setminus var(e)} \prod_{i=1}^{n} P(x_i \mid pa_i)$. Posterior marginal (belief): $P(x_i \mid e) = P(x_i, e) / P(e)$. Most Probable Explanation: $x^* = \arg\max_x P(x, e)$.
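A minimal brute-force sketch of these three queries on a toy chain A -> B -> C (the CPT numbers are made up for illustration, with evidence C = 1; they are not from the slides):

```python
import itertools

# Toy binary chain A -> B -> C; hypothetical CPT entries.
P_A = {0: 0.6, 1: 0.4}
P_B_given_A = {(0, 0): 0.7, (1, 0): 0.3, (0, 1): 0.2, (1, 1): 0.8}  # key (b, a)
P_C_given_B = {(0, 0): 0.9, (1, 0): 0.1, (0, 1): 0.4, (1, 1): 0.6}  # key (c, b)

def joint(a, b, c):
    """P(a, b, c) as the product of CPT entries."""
    return P_A[a] * P_B_given_A[(b, a)] * P_C_given_B[(c, b)]

# Probability of evidence: sum the joint over all non-evidence variables.
p_e = sum(joint(a, b, 1) for a, b in itertools.product((0, 1), repeat=2))

# Posterior marginal (belief): P(B=1 | C=1) = P(B=1, C=1) / P(C=1).
belief_b1 = sum(joint(a, 1, 1) for a in (0, 1)) / p_e

# Most Probable Explanation: the full assignment maximizing P(x, e).
mpe_assignment = max(itertools.product((0, 1), repeat=2),
                     key=lambda ab: joint(ab[0], ab[1], 1))
```

The exponential cost of these enumerations over all of X is exactly what bucket elimination, and later mini-bucket approximations, avoid.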
5 Bucket Elimination. Query: $P(a, e=0) = \sum_{c,b,e=0,d} P(a) P(c \mid a) P(b \mid a) P(d \mid b,a) P(e \mid b,c)$. Elimination order: d, e, b, c.
Bucket D: P(d|b,a); message $f_D(a,b) = \sum_d P(d \mid b,a)$
Bucket E: P(e|b,c), e=0; message $f_E(b,c) = P(e=0 \mid b,c)$
Bucket B: P(b|a), f_D(a,b), f_E(b,c); message $f_B(a,c) = \sum_b P(b \mid a) f_D(a,b) f_E(b,c)$
Bucket C: P(c|a), f_B(a,c); message $f_C(a) = \sum_c P(c \mid a) f_B(a,c)$
Bucket A: P(a), f_C(a); result $P(a, e=0) = P(a) f_C(a)$
The buckets, original functions, and messages form a bucket tree over D, E, B, C, A. Time and space: exp(w*).
6 Finding the MPE. Algorithm elim-mpe (Dechter 1996): $MPE = \max_x P(x) = \max_{a,e,d,c,b} P(a) P(c \mid a) P(b \mid a) P(d \mid b,a) P(e \mid b,c)$; the sum operator is replaced by max.
Bucket B: P(b|a), P(d|b,a), P(e|b,c)
Bucket C: P(c|a), h_B(a,d,c,e)
Bucket D: h_C(a,d,e)
Bucket E: e=0, h_D(a,e)
Bucket A: P(a), h_E(a)
$MPE = \max_a P(a) h_E(a)$. Elimination operator: max. W* = 4, the induced width (max clique size).
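A minimal sketch of elim-mpe on a toy chain A -> B -> C (hypothetical CPT numbers, not the slides' network): each bucket maximizes out its variable and leaves a message on the remaining variables; the last bucket yields the MPE value.

```python
# Toy binary chain A -> B -> C with made-up CPTs.
P_A = [0.6, 0.4]
P_B_given_A = [[0.7, 0.3], [0.2, 0.8]]   # P_B_given_A[a][b]
P_C_given_B = [[0.9, 0.1], [0.4, 0.6]]   # P_C_given_B[b][c]

# Bucket C: h_C(b) = max_c P(c|b)
h_C = [max(P_C_given_B[b]) for b in (0, 1)]
# Bucket B: h_B(a) = max_b P(b|a) * h_C(b)
h_B = [max(P_B_given_A[a][b] * h_C[b] for b in (0, 1)) for a in (0, 1)]
# Bucket A: MPE = max_a P(a) * h_B(a)
mpe_value = max(P_A[a] * h_B[a] for a in (0, 1))

# Sanity check against brute-force maximization over all assignments.
brute = max(P_A[a] * P_B_given_A[a][b] * P_C_given_B[b][c]
            for a in (0, 1) for b in (0, 1) for c in (0, 1))
assert abs(mpe_value - brute) < 1e-12
```

On a chain the buckets stay small, so elimination is exact and cheap; on dense graphs the messages grow to exp(w*), which motivates mini-buckets.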
7 Generating the MPE-tuple (assign variables in the reverse order A, E, D, C, B):
1. $a' = \arg\max_a P(a) h_E(a)$
2. $e' = 0$
3. $d' = \arg\max_d h_C(a', d, e')$
4. $c' = \arg\max_c P(c \mid a') h_B(a', d', c, e')$
5. $b' = \arg\max_b P(b \mid a') P(d' \mid b, a') P(e' \mid b, c')$
Return $(a', b', c', d', e')$.
8 Approximate Inference: metrics of evaluation. Absolute error: given ε > 0 and a query p = P(x|e), an estimate r has absolute error ε iff |p − r| < ε. Relative error: the ratio r/p lies in [1−ε, 1+ε]. Dagum and Luby 1993: approximation up to a relative error is NP-hard. Absolute error is also NP-hard if the error is less than 0.5.
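The two error metrics above can be written down directly (the numbers below are arbitrary, purely for illustration):

```python
# Evaluation metrics for an estimate r of a query p = P(x | e).
def has_absolute_error(p, r, eps):
    """True iff |p - r| < eps."""
    return abs(p - r) < eps

def has_relative_error(p, r, eps):
    """True iff the ratio r/p lies in [1 - eps, 1 + eps]."""
    return (1 - eps) <= r / p <= (1 + eps)

p, r = 0.20, 0.22
assert has_absolute_error(p, r, 0.05)
assert has_relative_error(p, r, 0.15)      # r/p = 1.1, within [0.85, 1.15]
assert not has_relative_error(p, r, 0.05)  # 1.1 lies outside [0.95, 1.05]
```

Note that for small p a tiny absolute error can still be a huge relative error, which is why the relative-error hardness result is the stronger statement.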
9 Outline Mini-bucket elimination Weighted Mini-bucket Mini-clustering Re-parameterization, cost-shifting Iterative Belief propagation Iterative-join-graph propagation
10 Mini-Buckets: Local Inference. Computation in a bucket is time and space exponential in the number of variables involved. Therefore, partition the functions in a bucket into mini-buckets over smaller numbers of variables.
11 Decomposition Bounds. Upper & lower bounds via approximate problem decomposition. Example (MAP inference): $\max_X F(X) = \max_X [f_1(X) + f_2(X)] \le \max_X f_1(X) + \max_X f_2(X)$. Relaxation: two copies of X, no longer required to be equal. The bound is tight (equality) if $f_1$ and $f_2$ agree on the maximizing value of x.
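The decomposition bound above can be checked on a tiny example (the function values are made up for illustration):

```python
# Two functions of the same variable X with three states.
f1 = {0: 1.0, 1: 3.0, 2: 2.0}
f2 = {0: 4.0, 1: 1.0, 2: 3.0}

exact = max(f1[x] + f2[x] for x in f1)        # the two copies forced equal
bound = max(f1.values()) + max(f2.values())   # copies relaxed: maximize separately

assert bound >= exact   # relaxing the equality can only raise the maximum
# Here the bound is loose (7 vs 5) because f1 peaks at x=1 while f2 peaks at x=0;
# it would be tight if both peaked at the same x.
```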
12 Mini-Bucket Approximation. Split a bucket into mini-buckets to bound complexity: the single elimination over the whole bucket is replaced by separate eliminations over each mini-bucket, which bounds the exact message. Exponential complexity decrease: a bucket over n variables split into parts over r and n−r variables costs O(exp(r)) + O(exp(n−r)) instead of O(exp(n)).
13 Mini-Bucket Elimination [Dechter & Rish 2003]. (Figure: buckets B, C, D, E, A; bucket B is partitioned into mini-buckets, each eliminated separately.) U = upper bound.
14 Mini-Bucket Elimination [Dechter & Rish 2003]. (Figure: the same bucket scheme shown on the graph, with B duplicated.) U = upper bound. Can interpret the process as duplicating B [Kask et al. 2001, Geffner et al. 2007, Choi et al. 2007, Johnson et al. 2007].
15 Semantics of Mini-Bucket: Splitting a Node. Variables in different mini-buckets are renamed and duplicated (Kask et al., 2001; Geffner et al., 2007; Choi, Chavira, Darwiche, 2007). (Figure: before splitting, network N with exact value U; after splitting, network N' whose exact value Û bounds U.)
16 Relaxed Network Example
17 MBE-mpe(i): Algorithm MBE-mpe. Input: i, the max number of variables allowed in a mini-bucket. Output: [lower bound (P of a suboptimal solution), upper bound]. Example: MBE-mpe(3) versus BE-mpe.
BE-mpe (exact, w* = 4):
Bucket B: f(a,b), f(b,c), f(b,d), f(b,e)
Bucket C: λ_{B→C}(a,c,d,e), f(c,a), f(c,e)
Bucket D: f(a,d), λ_{C→D}(a,d,e)
Bucket E: λ_{D→E}(a,e)
Bucket A: f(a), λ_{E→A}(a)
MBE-mpe(3) (at most 3 variables in a mini-bucket, w = 2):
Bucket B: {f(a,b), f(b,c)}, {f(b,d), f(b,e)}
Bucket C: λ_{B→C}(a,c), f(c,a), f(c,e)
Bucket D: f(a,d), λ_{B→D}(d,e)
Bucket E: λ_{C→E}(a,e), λ_{D→E}(a,e)
Bucket A: f(a), λ_{E→A}(a)
U = upper bound. [Dechter and Rish, 1997]
18 Mini-Bucket Decoding (greedy assignment along the ordering, from the bucket functions and messages):
â = argmin_a f(a) + λ_{E→A}(a)
ê = argmin_e λ_{C→E}(â, e) + λ_{D→E}(â, e)
d̂ = argmin_d f(â, d) + λ_{B→D}(d, ê)
ĉ = argmin_c λ_{B→C}(â, c) + f(c, â) + f(c, ê)
b̂ = argmin_b f(â, b) + f(b, ĉ) + f(b, d̂) + f(b, ê)
The greedy configuration yields L = lower bound; elimination yields U = upper bound. [Dechter and Rish, 2003]
19 (i,m)-partitionings
20 MBE(i,m), MBE(i). Input: belief network (P_1, ..., P_n). Output: upper and lower bounds. Initialize: put functions in buckets along the ordering. Process each bucket from p = n down to 1: create (i,m)-partitions and process each mini-bucket. (For mpe: assign values along the ordering d.) Return: mpe-configuration, upper and lower bounds.
21
22 Partitioning, Refinements
23 Properties of MBE(i). Complexity: O(r exp(i)) time and O(exp(i)) space. Yields a lower bound and an upper bound. Accuracy: determined by the upper/lower (U/L) bound ratio. Possible uses of mini-bucket approximations: as anytime algorithms; as heuristics in search. Other tasks admit similar mini-bucket approximations: belief updating, marginal MAP, MEU, WCSP, Max-CSP. [Dechter and Rish, 1997], [Liu and Ihler, 2011], [Liu and Ihler, 2013]
24 Anytime Approximation
25 MBE for Belief Updating and for Probability of Evidence or Partition Function. The mini-bucket idea is the same: $\sum_X f(x) g(x) \le \big(\sum_X f(x)\big)\big(\max_X g(x)\big)$. So we can apply a sum in each mini-bucket, or better, one sum and the rest max, or min (for a lower bound). MBE-bel-max(i,m) and MBE-bel-min(i,m) generate upper and lower bounds on beliefs, approximating BE-bel. MBE-map(i,m): max mini-buckets are maximized, sum mini-buckets are sum-max; approximates BE-map.
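A numeric check of the summation bounds above (the factor values are illustrative; f and g share the eliminated variable):

```python
# One eliminated variable x with three states; f and g are two mini-buckets.
f = [0.5, 1.5, 1.0]
g = [2.0, 0.5, 1.0]

exact = sum(fx * gx for fx, gx in zip(f, g))   # exact bucket elimination
upper = sum(f) * max(g)                        # MBE-bel-max style bound
lower = sum(f) * min(g)                        # MBE-bel-min style bound

assert lower <= exact <= upper
```

The gap between upper and lower brackets the exact value, which is how MBE-bel-max and MBE-bel-min sandwich the partition function.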
26
27 Methodology for Empirical Evaluation (for mpe). U/L accuracy: the ratios U/mpe or mpe/L (closer to 1 is better). Benchmarks: random networks. Given n, e, v, generate a random DAG; for each x_i and its parents, generate a table from uniform [0,1], or noisy-or. Create k instances; for each, generate random evidence or likely evidence. Measure averages.
28 Upper/Lower bounds on CPCS networks (medical diagnosis, noisy-or model; test case: no evidence). (Plots and table: anytime-mpe U/L error vs. time on cpcs360b and cpcs422b; comparison of elim-mpe with anytime-mpe across time and the parameter i.)
29 Outline Mini-bucket elimination Weighted Mini-bucket Mini-clustering Re-parameterization, cost-shifting Iterative Belief propagation Iterative-join-graph propagation
30 The Power Sum and Hölder Inequality
31 Working Example. Model: Markov network over A, B, C. Task: partition function. (Qiang Liu slides)
32 Mini-Bucket (Basic Principles): upper bound; lower bound. (Qiang Liu slides)
33 Hölder Inequality. Where ... and when ..., the equality is achieved. (Qiang Liu slides) G. H. Hardy, J. E. Littlewood and G. Pólya, Inequalities, Cambridge Univ. Press, London and New York, 1934.
34 Reverse Hölder Inequality. If ..., the direction of the inequality reverses. (Qiang Liu slides) G. H. Hardy, J. E. Littlewood and G. Pólya, Inequalities, Cambridge Univ. Press, London and New York, 1934.
35 Mini-Bucket as Hölder Inequality (mbe-bel-max, mbe-bel-min). (Qiang Liu slides)
36 Weighted Mini-Bucket (for summation). Exact bucket elimination: $\lambda_B(a,c,d,e) = \sum_b f(a,b) f(b,c) f(b,d) f(b,e)$. Mini-buckets with weights: $\le \big({\sum_b}^{w_1} f(a,b) f(b,c)\big)\big({\sum_b}^{w_2} f(b,d) f(b,e)\big) = \lambda_{B\to C}(a,c)\,\lambda_{B\to D}(d,e)$, where ${\sum_x}^{w} f(x) = \big(\sum_x f(x)^{1/w}\big)^{w}$ is the weighted or power sum operator, and ${\sum_x}^{w} f_1(x) f_2(x) \le \big({\sum_x}^{w_1} f_1(x)\big)\big({\sum_x}^{w_2} f_2(x)\big)$ where $w_1 + w_2 = w$ and $w_1 > 0, w_2 > 0$ (a lower bound if $w_1 > 0, w_2 < 0$). The buckets B, C, D, E, A are processed as before. U = upper bound. [Liu and Ihler, 2011]
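A sketch of the power sum operator and the Hölder-style bound it gives (the factor values and weights below are illustrative, not from the slides):

```python
# Weighted power sum: (sum_x f(x)^(1/w))^w.
def pow_sum(f, w):
    return sum(fx ** (1.0 / w) for fx in f) ** w

f1 = [0.5, 1.5, 1.0]   # mini-bucket 1, as a function of the shared variable x
f2 = [2.0, 0.5, 1.0]   # mini-bucket 2
w1, w2 = 0.3, 0.7      # positive weights with w1 + w2 = 1

exact = sum(a * b for a, b in zip(f1, f2))       # exact sum over x
bound = pow_sum(f1, w1) * pow_sum(f2, w2)        # weighted mini-bucket bound
assert bound >= exact

# w = 1 recovers the ordinary sum; as w -> 0+ the power sum approaches max,
# recovering the plain mini-bucket bound sum(f1) * max(f2) at (w1, w2) -> (1, 0).
assert abs(pow_sum(f1, 1.0) - sum(f1)) < 1e-12
```

This is Hölder's inequality with conjugate exponents 1/w1 and 1/w2, which is why the bound holds for any positive split of the weight.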
37
38 Choosing the weights (Cauchy-Schwarz inequality). (Qiang Liu slides)
39 Weighted Mini-Bucket for Marginal MAP
40 MB and WMB for Marginal MAP: $\max_Y \sum_{X \setminus Y} \prod_j P_j$.
Bucket B (sum, mini-buckets): {f(a,b), f(b,c)}, {f(b,d), f(b,e)}; $\lambda_{B\to C}(a,c) = {\sum_b}^{w_1} f(a,b) f(b,c)$, $\lambda_{B\to D}(d,e) = {\sum_b}^{w_2} f(b,d) f(b,e)$, with $w_1 + w_2 = 1$.
Bucket C (sum): λ_{B→C}(a,c), f(a,c), f(c,e)
Bucket D (max): f(a,d), λ_{B→D}(d,e)
Bucket E (max): λ_{C→E}(a,e), λ_{D→E}(a,e); $\lambda_{E\to A}(a) = \max_e \lambda_{C\to E}(a,e)\,\lambda_{D\to E}(a,e)$
Bucket A (max): f(a), λ_{E→A}(a); $U = \max_a f(a)\,\lambda_{E\to A}(a)$
Can optimize over cost-shifting and weights (single-pass MM or iterative message passing). U = upper bound. [Liu and Ihler, 2011; 2013] [Dechter and Rish, 2003]
41 MBE-map: process max buckets with max mini-buckets, and sum buckets with one sum mini-bucket and the rest max mini-buckets.
42 Initial partitioning
43
44 Complexity and Tractability of MBE(i,m)
45 Outline Mini-bucket elimination Weighted Mini-bucket Mini-clustering Re-parameterization, cost-shifting Iterative Belief propagation Iterative-join-graph propagation
46 Join-Tree Clustering (Cluster-Tree Elimination). Clusters: 1 = ABC, 2 = BCDF, 3 = BEF, 4 = EFG; separators BC, BF, EF. Messages:
$h_{(1,2)}(b,c) = \sum_a p(a)\,p(b \mid a)\,p(c \mid a,b)$
$h_{(2,1)}(b,c) = \sum_{d,f} p(d \mid b)\,p(f \mid c,d)\,h_{(3,2)}(b,f)$
$h_{(2,3)}(b,f) = \sum_{c,d} p(d \mid b)\,p(f \mid c,d)\,h_{(1,2)}(b,c)$
$h_{(3,2)}(b,f) = \sum_e p(e \mid b,f)\,h_{(4,3)}(e,f)$
$h_{(3,4)}(e,f) = \sum_b p(e \mid b,f)\,h_{(2,3)}(b,f)$
$h_{(4,3)}(e,f) = p(G = g_e \mid e,f)$
EXACT algorithm. Time and space: exp(cluster size) = exp(treewidth).
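A minimal sketch of cluster-tree message passing on a two-cluster tree (the factor values are made up): clusters {A,B} and {B,C} share separator B; the message into a cluster makes its local belief equal the brute-force marginal.

```python
# Pairwise factors of a tiny Markov network A - B - C (illustrative values).
f_AB = {(0, 0): 1.0, (0, 1): 2.0, (1, 0): 3.0, (1, 1): 1.0}   # key (a, b)
f_BC = {(0, 0): 2.0, (0, 1): 1.0, (1, 0): 1.0, (1, 1): 4.0}   # key (b, c)

# Message from cluster {A,B} to cluster {B,C} over separator B:
# h(1,2)(b) = sum_a f_AB(a, b).
m_12 = {b: sum(f_AB[(a, b)] for a in (0, 1)) for b in (0, 1)}

# Belief of cluster {B,C}: its own factor times the incoming message.
belief2 = {(b, c): m_12[b] * f_BC[(b, c)] for b in (0, 1) for c in (0, 1)}

# Brute-force check: the unnormalized marginal over (B, C).
brute = {(b, c): sum(f_AB[(a, b)] * f_BC[(b, c)] for a in (0, 1))
         for b in (0, 1) for c in (0, 1)}
assert belief2 == brute
```

With larger clusters the same two-sweep scheme stays exact, but the messages grow as exp(cluster size), which is what mini-clustering relaxes next.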
47
48 Mini-Clustering, i-bound = 3. Clusters: 1 = ABC {p(a), p(b|a), p(c|a,b)}; 2 = BCDF, split into mini-clusters {p(d|b), h_(1,2)(b,c)} (scope B,C,D) and {p(f|c,d)} (scope C,D,F); 3 = BEF {p(e|b,f), h^1_(2,3)(b), h^2_(2,3)(f)}; 4 = EFG {p(g|e,f)}. Example messages:
$h^1_{(1,2)}(b,c) = \sum_a p(a)\,p(b \mid a)\,p(c \mid a,b)$
$h^1_{(2,3)}(b) = \sum_{c,d} p(d \mid b)\,h^1_{(1,2)}(b,c)$
$h^2_{(2,3)}(f) = \max_{c,d} p(f \mid c,d)$
APPROXIMATE algorithm. Time and space: exp(i-bound), where the i-bound is the max number of variables in a mini-cluster.
49 Mini-Clustering Example. Clusters ABC, BCDF, BEF, EFG with separators BC, BF, EF. Message sets:
$H_{(1,2)}$: $h^1_{(1,2)}(b,c) = \sum_a p(a)\,p(b \mid a)\,p(c \mid a,b)$
$H_{(2,1)}$: $h^1_{(2,1)}(b) = \sum_{d,f} p(d \mid b)\,h^1_{(3,2)}(b,f)$; $h^2_{(2,1)}(c) = \max_{d,f} p(f \mid c,d)$
$H_{(2,3)}$: $h^1_{(2,3)}(b) = \sum_{c,d} p(d \mid b)\,h^1_{(1,2)}(b,c)$; $h^2_{(2,3)}(f) = \max_{c,d} p(f \mid c,d)$
$H_{(3,2)}$: $h^1_{(3,2)}(b,f) = \sum_e p(e \mid b,f)\,h^1_{(4,3)}(e,f)$
$H_{(3,4)}$: $h^1_{(3,4)}(e,f) = \sum_b p(e \mid b,f)\,h^1_{(2,3)}(b)\,h^2_{(2,3)}(f)$
$H_{(4,3)}$: $h^1_{(4,3)}(e,f) = p(G = g_e \mid e,f)$
50 Cluster Tree Elimination vs. Mini-Clustering. (Figure: on the same tree of clusters ABC, BCDF, BEF, EFG, CTE sends one exact message per edge direction, e.g. h_(2,1)(b,c), while MC sends a set of smaller approximate messages, e.g. H_(2,1) = {h^1_(2,1)(b), h^2_(2,1)(c)}.)
51 Linear Block Codes. (Figure: input bits and parity bits A through H; each is observed as a received bit (a through h, p_1 through p_6) through Gaussian channel noise σ.)
52 Probabilistic decoding of an error-correcting linear block code. State-of-the-art: the approximate algorithm iterative belief propagation (IBP), Pearl's poly-tree algorithm applied to loopy networks.
53 Iterative Belief Propagation. Belief propagation is exact for poly-trees. IBP: applying BP iteratively to cyclic networks. One step: update BEL(U_1) from the messages of its neighbors (parents U_2, U_3 and children X_1, X_2). No guarantees for convergence; works well for many coding networks.
54 MBE-mpe vs. IBP. IBP is better on low-w* codes; approx-mpe is better on randomly generated (high-w*) codes. Bit error rate (BER) as a function of noise (sigma):
55 Grid 15x15, 10 evidence nodes. (Plots: NHD, absolute error, relative error, and time vs. i-bound for MC and IBP; grid 15x15, evid = 10, w* = 22, 10 instances.)
56 Heuristics for partitioning (Dechter and Rish 2003; Rollon and Dechter 2010). Scope-based Partitioning Heuristic (SCP): aims at minimizing the number of mini-buckets in the partition by including in each mini-bucket as many functions as possible as long as the i-bound is satisfied. Content-based heuristics: use a greedy heuristic derived from a distance function to decide which functions go into a single mini-bucket.
57 Greedy Scope-based Partitioning
58 Heuristic for Partitioning. Scope-based Partitioning Heuristic. The scope-based partition heuristic (SCP) aims at minimizing the number of mini-buckets in the partition by including in each mini-bucket as many functions as possible as long as the i-bound is satisfied. First, single-function mini-buckets are ordered by decreasing arity from left to right. Then, each mini-bucket is absorbed into the left-most mini-bucket with which it can be merged. The time complexity of Partition(B, i), where B is the bucket to be partitioned and |B| the number of functions in the bucket, using the SCP heuristic is O(|B| log |B| + |B|^2). The scope-based heuristic is quite fast; its shortcoming is that it does not consider the actual information in the functions.
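A sketch of the SCP procedure described above (the function scopes are the four f(b,·) functions from the earlier MBE-mpe(3) example; variable names are single letters for brevity):

```python
def scp_partition(scopes, i_bound):
    """Greedy scope-based partitioning of one bucket.

    scopes: list of frozensets, one per function in the bucket.
    Returns a list of [combined_scope, member_scopes] mini-buckets.
    """
    # Order single-function mini-buckets by decreasing arity (stable sort).
    order = sorted(scopes, key=len, reverse=True)
    minibuckets = []
    for s in order:
        # Absorb into the left-most mini-bucket that stays within the i-bound.
        for mb in minibuckets:
            if len(mb[0] | s) <= i_bound:
                mb[0] |= s
                mb[1].append(s)
                break
        else:
            minibuckets.append([set(s), [s]])
    return minibuckets

# Bucket B of the running example: f(a,b), f(b,c), f(b,d), f(b,e).
scopes = [frozenset('ab'), frozenset('bc'), frozenset('bd'), frozenset('be')]
parts = scp_partition(scopes, i_bound=3)
```

On this input the greedy pass reproduces the two mini-buckets of MBE-mpe(3): {f(a,b), f(b,c)} with combined scope {a,b,c}, and {f(b,d), f(b,e)} with scope {b,d,e}.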
59 Greedy Partition as a function of a distance function h
60 Outline Mini-bucket elimination Weighted Mini-bucket Mini-clustering Iterative Belief propagation Iterative-join-graph propagation Re-parameterization, cost-shifting
61 Iterative Belief Propagation. Belief propagation is exact for poly-trees. IBP: applying BP iteratively to cyclic networks. One step: update BEL(U_1) from the messages of its neighbors (parents U_2, U_3 and children X_1, X_2). No guarantees for convergence; works well for many coding networks. Let's combine the iterative nature with anytime behavior: IJGP.
62 Iterative Join Graph Propagation Loopy Belief Propagation Cyclic graphs Iterative Converges fast in practice (no guarantees though Very good approximations (e.g., turbo decoding, LDPC codes, SAT survey propagation Mini-Clustering(i Tree decompositions Only two sets of messages (inward, outward Anytime behavior can improve with more time by increasing the i-bound We want to combine: Iterative virtues of Loopy BP Anytime behavior of Mini-Clustering(i
63 IJGP - The basic idea Apply Cluster Tree Elimination to any join-graph We commit to graphs that are I-maps Avoid cycles as long as I-mapness is not violated Result: use minimal arc-labeled join-graphs
64 Minimal Arc-Labeled Decomposition. (a) A fragment of an arc-labeled join-graph (clusters ABCDE, BCE, CDEF with labels BC, CDE, CE); (b) shrinking labels (CDE to DE) to make it a minimal arc-labeled join-graph. Use a DFS algorithm to eliminate cycles relative to each variable.
65 Minimal arc-labeled join-graph
66 Message propagation. Cluster 1 = ABCDE {p(a), p(c), p(b|ac), p(d|abe), p(e|bc)}, cluster 2 = CDEF, cluster 3 = BCE.
Minimal arc-labeled: sep(1,2) = {D,E}, elim(1,2) = {A,B,C}: $h_{(1,2)}(d,e) = \sum_{a,b,c} p(a)\,p(c)\,p(b \mid ac)\,p(d \mid abe)\,p(e \mid bc)\,h_{(3,1)}(b,c)$
Non-minimal arc-labeled: sep(1,2) = {C,D,E}, elim(1,2) = {A,B}: $h_{(1,2)}(c,d,e) = \sum_{a,b} p(a)\,p(c)\,p(b \mid ac)\,p(d \mid abe)\,p(e \mid bc)\,h_{(3,1)}(b,c)$
67 IJGP Example. (Figure: a belief network over A through J, and the corresponding loopy BP graph with clusters ABC, ABDE, BCE, CDEF, FGH, FGI, GHIJ.)
68 Arc-Minimal Join-Graph. Arcs labeled with any single variable should form a TREE. (Figure: the join-graph with arc labels adjusted accordingly.)
69 Collapsing Clusters. (Figure: a sequence of join-graphs in which clusters are merged, e.g. ABC and ABDE collapse into ABCDE, and FGH and FGI into FGHI, ending with clusters ABCDE, CDEF, FGHI, GHIJ.)
70 Join-Graphs. (Figure: a spectrum of join-graphs for the same network, from the loopy BP graph with small clusters to the join-tree with large clusters.) More accuracy toward larger clusters; less complexity toward smaller ones.
71 Bounded decompositions We want arc-labeled decompositions such that: the cluster size (internal width is bounded by i (the accuracy parameter the width of the decomposition as a graph (external width is as small as possible Possible approaches to build decompositions: partition-based algorithms - inspired by the mini-bucket decomposition grouping-based algorithms
72 Constructing Join-Graphs. (Figure: (a) a schematic mini-bucket(i), i = 3, over buckets A through G with functions P(G|F,E), P(E|B,F), P(F|C,D), P(D|B), P(C|A,B), P(B|A), P(A); (b) the corresponding arc-labeled join-graph decomposition.)
73 IJGP properties. IJGP(i) applies BP to a minimal arc-labeled join-graph whose cluster size is bounded by i. On join-trees, IJGP finds exact beliefs. IJGP is a Generalized Belief Propagation algorithm (Yedidia, Freeman, Weiss 2001). Complexity of one iteration: time O(deg · (n+N) · d^(i+1)), space O(N · d^i).
74 Empirical evaluation. Algorithms: Exact, IBP, MC, IJGP. Networks (all variables are binary): random networks, grid networks (MxM), CPCS 54, 360, 422, coding networks. Measures: absolute error, relative error, Kullback-Leibler (KL) distance, bit error rate, time.
75 Coding Networks: Bit Error Rate. (Plots: BER vs. i-bound for IJGP, MC, and IBP on coding networks, N = 400, w* = 43, 30 iterations, at noise levels σ = .22, .32, .51, .65.)
76 CPCS 422: KL Distance. (Plots: KL distance vs. i-bound for IJGP (30 iterations, at convergence), MC, and IBP (10 iterations, at convergence); CPCS 422, w* = 23, 1 instance, with evidence = 0 and evidence = 30.)
77 CPCS 422: KL vs. Iterations. (Plots: KL distance vs. number of iterations for IJGP(3), IJGP(10), and IBP; CPCS 422, w* = 23, 1 instance, with evidence = 0 and evidence = 30.)
78 Coding networks: Time. (Plot: time in seconds vs. i-bound for IJGP (30 iterations), MC, and IBP (30 iterations); coding networks, N = 400, 500 instances, w* = 43.)
79 Outline Mini-bucket elimination Weighted Mini-bucket Mini-clustering Iterative Belief propagation Iterative-join-graph propagation Re-parameterization, cost-shifting
80 Cost-Shifting (Reparameterization). Shift a function λ(B) between f(A,B) and f(B,C): replace them by f(A,B)+λ(B) and f(B,C)−λ(B). (Table: example values of f(a,b), f(b,c), λ(b), and the combined f(a,b,c), showing the total is unchanged.) Modify the individual functions, but keep the sum of functions the same.
81 Tightening the bound. Reparameterization (or cost shifting): decrease the bound without changing the overall function. F(A,B,C) = f_1(A,B) + f_2(B,C) = [f_1(A,B) + λ(B)] + [f_2(B,C) − λ(B)]. The adjusting functions cancel each other; with the right λ the decomposition bound is exact.
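A numeric check of the cost-shifting idea (the function values and the shift λ are made up): adding λ(B) to f1 and subtracting it from f2 leaves F unchanged, so the decomposed min-sum bound stays valid for any shift; optimization then picks λ to make it as tight as possible.

```python
import itertools

# Pairwise cost functions over binary A, B, C (illustrative values).
f1 = {(0, 0): 0.0, (0, 1): 3.0, (1, 0): 2.0, (1, 1): 1.0}   # key (a, b)
f2 = {(0, 0): 4.0, (0, 1): 0.0, (1, 0): 1.0, (1, 1): 2.0}   # key (b, c)

def total(g1, g2):
    """Exact minimum of the summed function F = g1 + g2."""
    return min(g1[(a, b)] + g2[(b, c)]
               for a, b, c in itertools.product((0, 1), repeat=3))

def bound(g1, g2):
    """Decomposition bound: minimize each factor independently."""
    return min(g1.values()) + min(g2.values())

lam = {0: 1.0, 1: -0.5}                        # an arbitrary shift on B
g1 = {k: v + lam[k[1]] for k, v in f1.items()}
g2 = {k: v - lam[k[0]] for k, v in f2.items()}

assert total(g1, g2) == total(f1, f2)          # F itself is unchanged
assert bound(f1, f2) <= total(f1, f2)          # bounds stay valid
assert bound(g1, g2) <= total(g1, g2)
```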
82 Dual Decomposition. $F = \min_x \sum_\alpha f_\alpha(x) \ge \sum_\alpha \min_x f_\alpha(x)$. Bound the solution using decomposed optimization: solve each factor independently to get an optimistic bound. (Figure: pairwise factors f_12(x_1,x_2), f_13(x_1,x_3), f_23(x_2,x_3) split into independent subproblems with duplicated variables.)
83 Dual Decomposition. Reparameterization: add terms $\lambda^j_\alpha(x_j)$ with $\sum_\alpha \lambda^j_\alpha(x_j) = 0$ for each variable $x_j$, so $F = \min_x \sum_\alpha f_\alpha(x) \ge \max_\lambda \sum_\alpha \min_x \big[f_\alpha(x) + \sum_{i \in \alpha} \lambda^i_\alpha(x_i)\big]$. Bound the solution using decomposed optimization; solve independently for an optimistic bound; tighten the bound by reparameterization, enforcing the lost equality constraints via Lagrange multipliers.
84 Dual Decomposition. Many names for the same class of bounds: dual decomposition [Komodakis et al. 2007]; TRW, MPLP [Wainwright et al. 2005; Globerson & Jaakkola 2007]; soft arc consistency [Cooper & Schiex 2004]; max-sum diffusion [Werner 2007].
85 Dual Decomposition. Many ways to optimize the bound: sub-gradient descent [Komodakis et al. 2007; Jojic et al. 2010]; coordinate descent [Werner 2007; Globerson & Jaakkola 2007; Sontag et al. 2009; Ihler et al. 2012]; proximal optimization [Ravikumar et al. 2010]; ADMM [Meshi & Globerson 2011; Martins et al. 2011; Forouzan & Ihler 2013].
86 Optimizing the bound. Can optimize the bound in various ways, e.g. sub-gradient descent on λ: F(A,B,C) = [f_1(A,B) + λ(B)] + [f_2(B,C) − λ(B)].
90 Various Update Schemes. Can use any decomposition updates (message passing, subgradient, augmented Lagrangian, etc.): FGLP updates the original factors; JGLP updates the clique functions of the join graph; MBE-MM (mini-bucket with moment matching) applies cost-shifting within each bucket only.
91 Factor Graph Linear Programming (FGLP): update the original factors. Tighten all factors over x_i simultaneously: compute max-marginals and update.
92 FGLP (Factor Graph Linear Programming)
93
94 Factor Graph Linear Programming
95 Complexity of FGLP
96 Mini-Bucket as Decomposition [Ihler et al. 2012]. The downward pass acts as cost shifting on a join graph; cost shifting can also be done within mini-buckets (join-graph message passing). Moment-matching version: one message exchange within each bucket, during the downward sweep. The optimal bound is defined by the cliques ("regions") and the cost-shifting function scopes ("coordinates"). (Figure: join graph with cliques such as {A,B,C}, {A,C,E}, {B,D,E}, {A,D,E} and separators {A,C}, {B}, {D,E}, {A,E}, {A}; U = upper bound.)
97 MBE-MM: MBE with moment matching. Bucket B is split into mini-buckets {P(B|A), P(D|A,B)} and {P(E|B,C)}; the moment-matching messages m_11, m_12 are exchanged between the mini-buckets before each is eliminated by max_B, producing h_B(A,D) and h_B(C,E). Bucket C: P(C|A), h_B(C,E); Bucket E: E = 0, h_C(A,E); Bucket D: h_B(A,D); Bucket A: P(A), h_E(A), h_D(A). w = 2. MPE* is an upper bound on MPE (U); generating a solution yields a lower bound (L).
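A sketch of moment matching inside one bucket for the max-product case (the two mini-bucket functions and their values are hypothetical): each mini-bucket's max-marginal over the shared variable B is shifted toward the geometric mean, which leaves the true maximum unchanged but can tighten the mini-bucket upper bound.

```python
import math

# Two mini-buckets sharing variable B (illustrative values).
f1 = {(0, 0): 1.0, (0, 1): 4.0, (1, 0): 2.0, (1, 1): 1.0}   # key (a, b)
f2 = {(0, 0): 3.0, (0, 1): 1.0, (1, 0): 1.0, (1, 1): 2.0}   # key (b, c)

def upper_bound(g1, g2):
    """Mini-bucket bound: eliminate B separately in each mini-bucket."""
    return max(g1.values()) * max(g2.values())

exact = max(f1[(a, b)] * f2[(b, c)]
            for a in (0, 1) for b in (0, 1) for c in (0, 1))
before = upper_bound(f1, f2)

# Max-marginals of each mini-bucket on B, and their geometric mean.
mu1 = {b: max(f1[(a, b)] for a in (0, 1)) for b in (0, 1)}
mu2 = {b: max(f2[(b, c)] for c in (0, 1)) for b in (0, 1)}
mu_bar = {b: math.sqrt(mu1[b] * mu2[b]) for b in (0, 1)}

# Cost-shift each mini-bucket so its max-marginal over B becomes mu_bar;
# the product g1 * g2 equals f1 * f2, so the true maximum is unchanged.
g1 = {(a, b): v * mu_bar[b] / mu1[b] for (a, b), v in f1.items()}
g2 = {(b, c): v * mu_bar[b] / mu2[b] for (b, c), v in f2.items()}
after = upper_bound(g1, g2)

assert after <= before            # the bound never gets worse
assert exact <= after + 1e-9      # and remains a valid upper bound
```

In this two-function case the matched bound happens to coincide with exact elimination of B; with more variables per mini-bucket a gap generally remains.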
98 MBE-MM (MBE with Moment-Matching)
99 Anytime Approximation Can tighten the bound in various ways Cost-shifting (improve consistency between cliques Increase i-bound (higher order consistency Simple moment-matching step improves bound significantly
102 Outline Mini-bucket elimination Weighted Mini-bucket Mini-clustering Iterative Belief propagation Iterative-join-graph propagation Re-parameterization, cost-shifting
More informationSampling Algorithms for Probabilistic Graphical models
Sampling Algorithms for Probabilistic Graphical models Vibhav Gogate University of Washington References: Chapter 12 of Probabilistic Graphical models: Principles and Techniques by Daphne Koller and Nir
More informationUNIVERSITY OF CALIFORNIA, IRVINE. Fixing and Extending the Multiplicative Approximation Scheme THESIS
UNIVERSITY OF CALIFORNIA, IRVINE Fixing and Extending the Multiplicative Approximation Scheme THESIS submitted in partial satisfaction of the requirements for the degree of MASTER OF SCIENCE in Information
More informationVariational Inference. Sargur Srihari
Variational Inference Sargur srihari@cedar.buffalo.edu 1 Plan of discussion We first describe inference with PGMs and the intractability of exact inference Then give a taxonomy of inference algorithms
More informationVariable Elimination (VE) Barak Sternberg
Variable Elimination (VE) Barak Sternberg Basic Ideas in VE Example 1: Let G be a Chain Bayesian Graph: X 1 X 2 X n 1 X n How would one compute P X n = k? Using the CPDs: P X 2 = x = x Val X1 P X 1 = x
More informationStat 521A Lecture 18 1
Stat 521A Lecture 18 1 Outline Cts and discrete variables (14.1) Gaussian networks (14.2) Conditional Gaussian networks (14.3) Non-linear Gaussian networks (14.4) Sampling (14.5) 2 Hybrid networks A hybrid
More informationCutset sampling for Bayesian networks
Cutset sampling for Bayesian networks Cutset sampling for Bayesian networks Bozhena Bidyuk Rina Dechter School of Information and Computer Science University Of California Irvine Irvine, CA 92697-3425
More informationUNIVERSITY OF CALIFORNIA, IRVINE. Reasoning and Decisions in Probabilistic Graphical Models A Unified Framework DISSERTATION
UNIVERSITY OF CALIFORNIA, IRVINE Reasoning and Decisions in Probabilistic Graphical Models A Unified Framework DISSERTATION submitted in partial satisfaction of the requirements for the degree of DOCTOR
More informationMachine Learning 4771
Machine Learning 4771 Instructor: Tony Jebara Topic 16 Undirected Graphs Undirected Separation Inferring Marginals & Conditionals Moralization Junction Trees Triangulation Undirected Graphs Separation
More informationMessage-Passing Algorithms for GMRFs and Non-Linear Optimization
Message-Passing Algorithms for GMRFs and Non-Linear Optimization Jason Johnson Joint Work with Dmitry Malioutov, Venkat Chandrasekaran and Alan Willsky Stochastic Systems Group, MIT NIPS Workshop: Approximate
More informationMin-Max Message Passing and Local Consistency in Constraint Networks
Min-Max Message Passing and Local Consistency in Constraint Networks Hong Xu, T. K. Satish Kumar, and Sven Koenig University of Southern California, Los Angeles, CA 90089, USA hongx@usc.edu tkskwork@gmail.com
More informationTHE topic of this paper is the following problem:
1 Revisiting the Linear Programming Relaxation Approach to Gibbs Energy Minimization and Weighted Constraint Satisfaction Tomáš Werner Abstract We present a number of contributions to the LP relaxation
More informationBayesian Networks: Construction, Inference, Learning and Causal Interpretation. Volker Tresp Summer 2014
Bayesian Networks: Construction, Inference, Learning and Causal Interpretation Volker Tresp Summer 2014 1 Introduction So far we were mostly concerned with supervised learning: we predicted one or several
More informationProbabilistic Variational Bounds for Graphical Models
Probabilistic Variational Bounds for Graphical Models Qiang Liu Computer Science Dartmouth College qliu@cs.dartmouth.edu John Fisher III CSAIL MIT fisher@csail.mit.edu Alexander Ihler Computer Science
More informationUNDERSTANDING BELIEF PROPOGATION AND ITS GENERALIZATIONS
UNDERSTANDING BELIEF PROPOGATION AND ITS GENERALIZATIONS JONATHAN YEDIDIA, WILLIAM FREEMAN, YAIR WEISS 2001 MERL TECH REPORT Kristin Branson and Ian Fasel June 11, 2003 1. Inference Inference problems
More informationFactor Graphs and Message Passing Algorithms Part 1: Introduction
Factor Graphs and Message Passing Algorithms Part 1: Introduction Hans-Andrea Loeliger December 2007 1 The Two Basic Problems 1. Marginalization: Compute f k (x k ) f(x 1,..., x n ) x 1,..., x n except
More informationApproximation by Quantization
Approximation by Quantization Vibhav Gogate and Pedro Domingos Computer Science & Engineering University of Washington Seattle, WA 98195, USA {vgogate,pedrod}@cs.washington.edu Abstract Inference in graphical
More informationCSE 254: MAP estimation via agreement on (hyper)trees: Message-passing and linear programming approaches
CSE 254: MAP estimation via agreement on (hyper)trees: Message-passing and linear programming approaches A presentation by Evan Ettinger November 11, 2005 Outline Introduction Motivation and Background
More informationMachine Learning Lecture 14
Many slides adapted from B. Schiele, S. Roth, Z. Gharahmani Machine Learning Lecture 14 Undirected Graphical Models & Inference 23.06.2015 Bastian Leibe RWTH Aachen http://www.vision.rwth-aachen.de/ leibe@vision.rwth-aachen.de
More informationBayesian Networks: Construction, Inference, Learning and Causal Interpretation. Volker Tresp Summer 2016
Bayesian Networks: Construction, Inference, Learning and Causal Interpretation Volker Tresp Summer 2016 1 Introduction So far we were mostly concerned with supervised learning: we predicted one or several
More informationLecture 9: PGM Learning
13 Oct 2014 Intro. to Stats. Machine Learning COMP SCI 4401/7401 Table of Contents I Learning parameters in MRFs 1 Learning parameters in MRFs Inference and Learning Given parameters (of potentials) and
More informationECE521 Tutorial 11. Topic Review. ECE521 Winter Credits to Alireza Makhzani, Alex Schwing, Rich Zemel and TAs for slides. ECE521 Tutorial 11 / 4
ECE52 Tutorial Topic Review ECE52 Winter 206 Credits to Alireza Makhzani, Alex Schwing, Rich Zemel and TAs for slides ECE52 Tutorial ECE52 Winter 206 Credits to Alireza / 4 Outline K-means, PCA 2 Bayesian
More informationBayesian Learning in Undirected Graphical Models
Bayesian Learning in Undirected Graphical Models Zoubin Ghahramani Gatsby Computational Neuroscience Unit University College London, UK http://www.gatsby.ucl.ac.uk/ Work with: Iain Murray and Hyun-Chul
More informationOn finding minimal w-cutset problem
On finding minimal w-cutset problem Bozhena Bidyuk and Rina Dechter Information and Computer Science University Of California Irvine Irvine, CA 92697-3425 Abstract The complexity of a reasoning task over
More informationVariable Elimination: Algorithm
Variable Elimination: Algorithm Sargur srihari@cedar.buffalo.edu 1 Topics 1. Types of Inference Algorithms 2. Variable Elimination: the Basic ideas 3. Variable Elimination Sum-Product VE Algorithm Sum-Product
More informationECE 6504: Advanced Topics in Machine Learning Probabilistic Graphical Models and Large-Scale Learning
ECE 6504: Advanced Topics in Machine Learning Probabilistic Graphical Models and Large-Scale Learning Topics Summary of Class Advanced Topics Dhruv Batra Virginia Tech HW1 Grades Mean: 28.5/38 ~= 74.9%
More informationDecomposition Methods for Large Scale LP Decoding
Decomposition Methods for Large Scale LP Decoding Siddharth Barman Joint work with Xishuo Liu, Stark Draper, and Ben Recht Outline Background and Problem Setup LP Decoding Formulation Optimization Framework
More informationIntroduction to Graphical Models. Srikumar Ramalingam School of Computing University of Utah
Introduction to Graphical Models Srikumar Ramalingam School of Computing University of Utah Reference Christopher M. Bishop, Pattern Recognition and Machine Learning, Jonathan S. Yedidia, William T. Freeman,
More informationSubproblem-Tree Calibration: A Unified Approach to Max-Product Message Passing
Subproblem-Tree Calibration: A Unified Approach to Max-Product Message Passing Huayan Wang huayanw@cs.stanford.edu Computer Science Department, Stanford University, Palo Alto, CA 90 USA Daphne Koller koller@cs.stanford.edu
More informationVariable Elimination: Algorithm
Variable Elimination: Algorithm Sargur srihari@cedar.buffalo.edu 1 Topics 1. Types of Inference Algorithms 2. Variable Elimination: the Basic ideas 3. Variable Elimination Sum-Product VE Algorithm Sum-Product
More informationImplementing Machine Reasoning using Bayesian Network in Big Data Analytics
Implementing Machine Reasoning using Bayesian Network in Big Data Analytics Steve Cheng, Ph.D. Guest Speaker for EECS 6893 Big Data Analytics Columbia University October 26, 2017 Outline Introduction Probability
More informationStructured Variational Inference
Structured Variational Inference Sargur srihari@cedar.buffalo.edu 1 Topics 1. Structured Variational Approximations 1. The Mean Field Approximation 1. The Mean Field Energy 2. Maximizing the energy functional:
More information9 Forward-backward algorithm, sum-product on factor graphs
Massachusetts Institute of Technology Department of Electrical Engineering and Computer Science 6.438 Algorithms For Inference Fall 2014 9 Forward-backward algorithm, sum-product on factor graphs The previous
More information4 : Exact Inference: Variable Elimination
10-708: Probabilistic Graphical Models 10-708, Spring 2014 4 : Exact Inference: Variable Elimination Lecturer: Eric P. ing Scribes: Soumya Batra, Pradeep Dasigi, Manzil Zaheer 1 Probabilistic Inference
More informationCompiling Knowledge into Decomposable Negation Normal Form
Compiling Knowledge into Decomposable Negation Normal Form Adnan Darwiche Cognitive Systems Laboratory Department of Computer Science University of California Los Angeles, CA 90024 darwiche@cs. ucla. edu
More informationWeighted Heuristic Anytime Search
Weighted Heuristic Anytime Search Flerova, Marinescu, and Dechter 1 1 University of South Carolina April 24, 2017 Basic Heuristic Best-first Search Blindly follows the heuristic Weighted A Search For w
More informationExpectation Propagation Algorithm
Expectation Propagation Algorithm 1 Shuang Wang School of Electrical and Computer Engineering University of Oklahoma, Tulsa, OK, 74135 Email: {shuangwang}@ou.edu This note contains three parts. First,
More informationComputational Complexity of Inference
Computational Complexity of Inference Sargur srihari@cedar.buffalo.edu 1 Topics 1. What is Inference? 2. Complexity Classes 3. Exact Inference 1. Variable Elimination Sum-Product Algorithm 2. Factor Graphs
More information13: Variational inference II
10-708: Probabilistic Graphical Models, Spring 2015 13: Variational inference II Lecturer: Eric P. Xing Scribes: Ronghuo Zheng, Zhiting Hu, Yuntian Deng 1 Introduction We started to talk about variational
More informationComputer Science CPSC 322. Lecture 23 Planning Under Uncertainty and Decision Networks
Computer Science CPSC 322 Lecture 23 Planning Under Uncertainty and Decision Networks 1 Announcements Final exam Mon, Dec. 18, 12noon Same general format as midterm Part short questions, part longer problems
More informationInternational Mathematical Olympiad. Preliminary Selection Contest 2004 Hong Kong. Outline of Solutions 3 N
International Mathematical Olympiad Preliminary Selection Contest 004 Hong Kong Outline of Solutions Answers:. 8. 0. N 4. 49894 5. 6. 004! 40 400 7. 6 8. 7 9. 5 0. 0. 007. 8066. π + 4. 5 5. 60 6. 475 7.
More informationCodes on Graphs. Telecommunications Laboratory. Alex Balatsoukas-Stimming. Technical University of Crete. November 27th, 2008
Codes on Graphs Telecommunications Laboratory Alex Balatsoukas-Stimming Technical University of Crete November 27th, 2008 Telecommunications Laboratory (TUC) Codes on Graphs November 27th, 2008 1 / 31
More informationCS Lecture 4. Markov Random Fields
CS 6347 Lecture 4 Markov Random Fields Recap Announcements First homework is available on elearning Reminder: Office hours Tuesday from 10am-11am Last Time Bayesian networks Today Markov random fields
More informationGraphical Models and Kernel Methods
Graphical Models and Kernel Methods Jerry Zhu Department of Computer Sciences University of Wisconsin Madison, USA MLSS June 17, 2014 1 / 123 Outline Graphical Models Probabilistic Inference Directed vs.
More information6.867 Machine learning, lecture 23 (Jaakkola)
Lecture topics: Markov Random Fields Probabilistic inference Markov Random Fields We will briefly go over undirected graphical models or Markov Random Fields (MRFs) as they will be needed in the context
More informationBayes Networks 6.872/HST.950
Bayes Networks 6.872/HST.950 What Probabilistic Models Should We Use? Full joint distribution Completely expressive Hugely data-hungry Exponential computational complexity Naive Bayes (full conditional
More information12 : Variational Inference I
10-708: Probabilistic Graphical Models, Spring 2015 12 : Variational Inference I Lecturer: Eric P. Xing Scribes: Fattaneh Jabbari, Eric Lei, Evan Shapiro 1 Introduction Probabilistic inference is one of
More informationMax$Sum(( Exact(Inference(
Probabilis7c( Graphical( Models( Inference( MAP( Max$Sum(( Exact(Inference( Product Summation a 1 b 1 8 a 1 b 2 1 a 2 b 1 0.5 a 2 b 2 2 a 1 b 1 3 a 1 b 2 0 a 2 b 1-1 a 2 b 2 1 Max-Sum Elimination in Chains
More informationRevisiting the Limits of MAP Inference by MWSS on Perfect Graphs
Revisiting the Limits of MAP Inference by MWSS on Perfect Graphs Adrian Weller University of Cambridge CP 2015 Cork, Ireland Slides and full paper at http://mlg.eng.cam.ac.uk/adrian/ 1 / 21 Motivation:
More informationInformation Geometric view of Belief Propagation
Information Geometric view of Belief Propagation Yunshu Liu 2013-10-17 References: [1]. Shiro Ikeda, Toshiyuki Tanaka and Shun-ichi Amari, Stochastic reasoning, Free energy and Information Geometry, Neural
More informationUNIVERSITY OF CALIFORNIA, IRVINE. Bottom-Up Approaches to Approximate Inference and Learning in Discrete Graphical Models DISSERTATION
UNIVERSITY OF CALIFORNIA, IRVINE Bottom-Up Approaches to Approximate Inference and Learning in Discrete Graphical Models DISSERTATION submitted in partial satisfaction of the requirements for the degree
More informationConvergent Tree-reweighted Message Passing for Energy Minimization
Accepted to IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI), 2006. Convergent Tree-reweighted Message Passing for Energy Minimization Vladimir Kolmogorov University College London,
More informationA Cluster-Cumulant Expansion at the Fixed Points of Belief Propagation
A Cluster-Cumulant Expansion at the Fixed Points of Belief Propagation Max Welling Dept. of Computer Science University of California, Irvine Irvine, CA 92697-3425, USA Andrew E. Gelfand Dept. of Computer
More informationActive Tuples-based Scheme for Bounding Posterior Beliefs
Journal of Artificial Intelligence Research 39 (2010) 335-371 Submitted 10/09; published 09/10 Active Tuples-based Scheme for Bounding Posterior Beliefs Bozhena Bidyuk Google Inc. 19540 Jamboree Rd Irvine,
More informationIntroduction to Graphical Models. Srikumar Ramalingam School of Computing University of Utah
Introduction to Graphical Models Srikumar Ramalingam School of Computing University of Utah Reference Christopher M. Bishop, Pattern Recognition and Machine Learning, Jonathan S. Yedidia, William T. Freeman,
More informationLecture 17: May 29, 2002
EE596 Pat. Recog. II: Introduction to Graphical Models University of Washington Spring 2000 Dept. of Electrical Engineering Lecture 17: May 29, 2002 Lecturer: Jeff ilmes Scribe: Kurt Partridge, Salvador
More informationChris Bishop s PRML Ch. 8: Graphical Models
Chris Bishop s PRML Ch. 8: Graphical Models January 24, 2008 Introduction Visualize the structure of a probabilistic model Design and motivate new models Insights into the model s properties, in particular
More informationInference in Bayes Nets
Inference in Bayes Nets Bruce D Ambrosio Oregon State University and Robert Fung Institute for Decision Systems Research July 23, 1994 Summer Institute on Probablility in AI 1994 Inference 1 1 Introduction
More informationCS6220: DATA MINING TECHNIQUES
CS6220: DATA MINING TECHNIQUES Chapter 8&9: Classification: Part 3 Instructor: Yizhou Sun yzsun@ccs.neu.edu March 12, 2013 Midterm Report Grade Distribution 90-100 10 80-89 16 70-79 8 60-69 4
More informationOn the Choice of Regions for Generalized Belief Propagation
UAI 2004 WELLING 585 On the Choice of Regions for Generalized Belief Propagation Max Welling (welling@ics.uci.edu) School of Information and Computer Science University of California Irvine, CA 92697-3425
More informationStudies in Solution Sampling
Studies in Solution Sampling Vibhav Gogate and Rina Dechter Donald Bren School of Information and Computer Science, University of California, Irvine, CA 92697, USA. {vgogate,dechter}@ics.uci.edu Abstract
More informationApproximate Inference Part 1 of 2
Approximate Inference Part 1 of 2 Tom Minka Microsoft Research, Cambridge, UK Machine Learning Summer School 2009 http://mlg.eng.cam.ac.uk/mlss09/ Bayesian paradigm Consistent use of probability theory
More information