Bounded Inference Non-iteratively; Mini-Bucket Elimination


1 Bounded Inference Non-iteratively; Mini-Bucket Elimination. COMPSCI 276, Spring 2018, Class 6: Rina Dechter. Reading: Class Notes (8), Darwiche chapter 14.

2 Outline: Mini-bucket elimination; Weighted mini-bucket; Mini-clustering; Re-parameterization, cost-shifting; Iterative belief propagation; Iterative join-graph propagation.

3 Types of queries: Max-inference, Sum-inference, and Mixed-inference (harder). All are NP-hard: exponentially many terms. We will focus on approximation algorithms. Anytime behavior: very fast and very approximate at first, then slower and more accurate.

4 Queries. Probability of evidence (or partition function): $P(e) = \sum_{X \setminus \mathrm{var}(e)} \prod_{i=1}^{n} P(x_i \mid pa_i)$; for Markov networks, $Z = \sum_X \prod_i f_i(x_{C_i})$. Posterior marginal (belief): $P(x_i \mid e) = P(x_i, e) / P(e)$. Most Probable Explanation: $x^* = \arg\max_x P(x, e)$.

5 Bucket Elimination. Query: $P(a, e=0)$; elimination order $d, e, b, c$. $P(a, e=0) = \sum_{c,b,e,d} P(a)\,P(c\mid a)\,P(b\mid a)\,P(d\mid a,b)\,P(e\mid b,c)$. Bucket D: $P(d\mid a,b)$ generates $f_D(a,b) = \sum_d P(d\mid a,b)$. Bucket E: $P(e\mid b,c)$ with $e=0$ generates $f_E(b,c) = P(e=0\mid b,c)$. Bucket B: $P(b\mid a), f_D(a,b), f_E(b,c)$ generates $f_B(a,c) = \sum_b P(b\mid a)\,f_D(a,b)\,f_E(b,c)$. Bucket C: $P(c\mid a), f_B(a,c)$ generates $f_C(a) = \sum_c P(c\mid a)\,f_B(a,c)$. Bucket A: $P(a, e=0) = P(a)\,f_C(a)$. The bucket tree over clusters D,A,B; E,B,C; B,A,C; C,A; A carries the original functions and the messages $f_D(a,b), f_E(b,c), f_B(a,c), f_C(a)$. Time and space: exp(w*).
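
To make the scheme above concrete, here is a minimal Python sketch of bucket elimination by summation. The Factor class, the helper names, and the toy tables in the usage example are illustrative assumptions, not the lecture's actual numbers.

```python
# Minimal bucket-elimination sketch (summation); tables below are made up.
import numpy as np
from functools import reduce

class Factor:
    """A table over named discrete variables."""
    def __init__(self, scope, table):
        self.scope, self.table = list(scope), np.asarray(table, dtype=float)

    def __mul__(self, other):
        scope = self.scope + [v for v in other.scope if v not in self.scope]
        def aligned(f):
            # append singleton axes for missing variables, then reorder
            t = f.table.reshape(f.table.shape + (1,) * (len(scope) - len(f.scope)))
            order = f.scope + [v for v in scope if v not in f.scope]
            return np.transpose(t, [order.index(v) for v in scope])
        return Factor(scope, aligned(self) * aligned(other))

    def sum_out(self, var):
        i = self.scope.index(var)
        return Factor(self.scope[:i] + self.scope[i+1:], self.table.sum(axis=i))

def bucket_elimination(factors, order):
    """Place each factor in the bucket of its first variable along `order`,
    then process buckets in order: multiply, sum out, pass the message down."""
    buckets = {v: [] for v in order}
    leftover = []                      # factors mentioning no ordered variable
    for f in factors:
        bucket_var = next((v for v in order if v in f.scope), None)
        (buckets[bucket_var] if bucket_var is not None else leftover).append(f)
    for k, v in enumerate(order):
        if not buckets[v]:
            continue
        msg = reduce(lambda a, b: a * b, buckets[v]).sum_out(v)
        dest = next((u for u in order[k+1:] if u in msg.scope), None)
        (buckets[dest] if dest is not None else leftover).append(msg)
    return reduce(lambda a, b: a * b, leftover)

# Usage on the slide's network A -> {B, C}, {A, B} -> D, {B, C} -> E,
# with E clamped to e = 0 (tables are made up):
Pa  = Factor(['A'], [0.6, 0.4])
Pb  = Factor(['A', 'B'], [[0.7, 0.3], [0.2, 0.8]])      # P(b | a)
Pc  = Factor(['A', 'C'], [[0.5, 0.5], [0.9, 0.1]])      # P(c | a)
Pd  = Factor(['A', 'B', 'D'], np.full((2, 2, 2), 0.5))  # P(d | a, b)
Pe0 = Factor(['B', 'C'], [[0.1, 0.4], [0.3, 0.9]])      # P(e=0 | b, c)
f = bucket_elimination([Pa, Pb, Pc, Pd, Pe0], order=['D', 'B', 'C'])
print(f.scope, f.table)   # f(A) = P(a, e=0)
```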

6 Finding the MPE: algorithm elim-mpe (Dechter 1996). $MPE = \max_x P(x)$; the $\sum$ operator is replaced by $\max$: $MPE = \max_{a,e,d,c,b} P(a)\,P(c\mid a)\,P(b\mid a)\,P(d\mid b,a)\,P(e\mid b,c)$. Bucket B: $P(b\mid a), P(d\mid b,a), P(e\mid b,c)$ generates $h_B(a,d,c,e) = \max_b P(b\mid a)\,P(d\mid b,a)\,P(e\mid b,c)$. Bucket C: $P(c\mid a), h_B(a,d,c,e)$ generates $h_C(a,d,e)$. Bucket D: $h_C(a,d,e)$ generates $h_D(a,e)$. Bucket E: $e=0, h_D(a,e)$ generates $h_E(a)$. Bucket A: $P(a), h_E(a)$. The elimination operator is max; $w^* = 4$ is the induced width (max clique size).

7 Generating the MPE-tuple, by assigning values in reverse order: 1. $a' = \arg\max_a P(a)\,h_E(a)$. 2. $e' = 0$. 3. $d' = \arg\max_d h_C(a', d, e')$. 4. $c' = \arg\max_c P(c\mid a')\,h_B(a', d', c, e')$. 5. $b' = \arg\max_b P(b\mid a')\,P(d'\mid b, a')\,P(e'\mid b, c')$. Return $(a', b', c', d', e')$.

8 Approximate Inference: metrics of evaluation. Absolute error: given $\epsilon > 0$ and a query $p = P(x \mid e)$, an estimate $r$ has absolute error $\epsilon$ iff $|p - r| < \epsilon$. Relative error: the ratio $r/p$ lies in $[1-\epsilon, 1+\epsilon]$. Dagum and Luby 1993: approximation up to a relative error is NP-hard. Absolute error is also NP-hard if the error is less than 0.5.

9 Outline: Mini-bucket elimination; Weighted mini-bucket; Mini-clustering; Re-parameterization, cost-shifting; Iterative belief propagation; Iterative join-graph propagation.

10 Mini-Buckets: Local Inference. Computation in a bucket is time and space exponential in the number of variables involved. Therefore, partition the functions in a bucket into mini-buckets over smaller numbers of variables.

11 Decomposition Bounds: upper and lower bounds via approximate problem decomposition. Example (MAP inference): $\max_X \big( f_1(X) + f_2(X) \big) \le \max_X f_1(X) + \max_X f_2(X)$. Relaxation: two copies of $X$, no longer required to be equal. The bound is tight (equality) if $f_1, f_2$ agree on the maximizing value of $X$.
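
A quick numeric check of this decomposition bound, with made-up tables:

```python
# Decomposition bound for MAP: max_x (f1 + f2) <= max f1 + max f2.
import numpy as np

f1 = np.array([1.0, 4.0, 2.0])   # f1(x) for x in {0, 1, 2} (made up)
f2 = np.array([3.0, 0.0, 2.5])   # f2(x)

exact = (f1 + f2).max()          # single copy of x: 4.5 at x = 2
bound = f1.max() + f2.max()      # two decoupled copies of x: 7.0
print(exact, bound)              # bound >= exact; equal iff argmaxes agree
```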

12 Mini-Bucket Approximation: split a bucket into mini-buckets to bound the complexity. For example, $\sum_X \prod_i h_i \le \big(\sum_X \prod_{h \in mb_1} h\big) \cdot \big(\max_X \prod_{h \in mb_2} h\big)$ yields an upper bound. Exponential complexity decrease: a bucket over $n$ variables split into mini-buckets over $r$ and $n-r$ variables costs $O(\exp r) + O(\exp(n-r))$ instead of $O(\exp n)$.

13 Mini-Bucket Elimination, mini-buckets [Dechter & Rish 2003]: each bucket (B, C, D, E, A) is split into mini-buckets and processed separately; U = upper bound.

14 Mini-Bucket Elimination, mini-buckets [Dechter & Rish 2003]: splitting bucket B produces the upper bound U. The process can be interpreted as duplicating variable B [Kask et al. 2001, Geffner et al. 2007, Choi et al. 2007, Johnson et al. 2007].

15 Semantics of Mini-Bucket: splitting a node. Variables in different mini-buckets are renamed and duplicated (Kask et al. 2001; Geffner et al. 2007; Choi, Chavira, Darwiche 2007). Before splitting: network N with exact value U. After splitting: relaxed network N' whose exact value $\hat U$ satisfies $U \le \hat U$.

16 Relaxed Network Example

17 MBE-mpe(i): algorithm MBE-mpe. Input: $i$, the max number of variables allowed in a mini-bucket. Output: [lower bound (probability of a suboptimal solution), upper bound]. Example, MBE-mpe(3) versus BE-mpe. MBE-mpe(3): bucket B splits into mini-buckets $\{f(a,b), f(b,c)\}$ and $\{f(b,d), f(b,e)\}$ (at most 3 variables each), sending $\lambda_{B\to C}(a,c)$ and $\lambda_{B\to D}(d,e)$; bucket C: $\lambda_{B\to C}(a,c), f(c,a), f(c,e)$ sends $\lambda_{C\to E}(a,e)$; bucket D: $f(a,d), \lambda_{B\to D}(d,e)$ sends $\lambda_{D\to E}(a,e)$; bucket E: $\lambda_{C\to E}(a,e), \lambda_{D\to E}(a,e)$ sends $\lambda_{E\to A}(a)$; bucket A: $f(a), \lambda_{E\to A}(a)$ yields U = upper bound, with width $w = 2$. BE-mpe keeps bucket B whole and sends $\lambda_{B\to C}(a,c,d,e)$, with $w^* = 4$. [Dechter and Rish, 1997]

18 Mini-Bucket Decoding (min-sum version). Assign greedily in reverse order using the functions in each bucket: $\hat a = \arg\min_a f(a) + \lambda_{E\to A}(a)$; $\hat e = \arg\min_e \lambda_{C\to E}(\hat a, e) + \lambda_{D\to E}(\hat a, e)$; $\hat d = \arg\min_d f(\hat a, d) + \lambda_{B\to D}(d, \hat e)$; $\hat c = \arg\min_c \lambda_{B\to C}(\hat a, c) + f(c, \hat a) + f(c, \hat e)$; $\hat b = \arg\min_b f(\hat a, b) + f(b, \hat c) + f(b, \hat d) + f(b, \hat e)$. The cost of the greedy configuration is an upper bound, and the mini-bucket value L is a lower bound. [Dechter and Rish, 2003]
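
A minimal sketch of this greedy decode, assuming each bucket stores its functions (originals plus incoming messages) as (scope, callable) pairs; the representation and the toy costs are assumptions for illustration:

```python
# Greedy mini-bucket decoding (min-sum), a sketch.
def greedy_decode(buckets, order, domains):
    """Assign variables in reverse elimination order, minimizing the sum of
    this bucket's functions given the earlier choices."""
    assignment = {}
    for var in reversed(order):               # e.g. order = ['B','C','D','E','A']
        best_val, best_cost = None, float('inf')
        for val in domains[var]:
            assignment[var] = val
            cost = sum(f(**{v: assignment[v] for v in scope})
                       for scope, f in buckets[var])
            if cost < best_cost:
                best_val, best_cost = val, cost
        assignment[var] = best_val
    return assignment

# Toy usage with two variables and one function per bucket (made-up costs):
buckets = {
    'A': [(('A',), lambda A: [1.0, 0.2][A])],
    'B': [(('A', 'B'), lambda A, B: [[0.0, 2.0], [3.0, 0.5]][A][B])],
}
print(greedy_decode(buckets, order=['B', 'A'],
                    domains={'A': [0, 1], 'B': [0, 1]}))
```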

19 (i,m)-partitionings

20 MBE(i,m), MBE(i). Input: belief network $(P_1, \ldots, P_n)$. Output: upper and lower bounds. Initialize: put functions in buckets along the ordering. Process each bucket from $p = n$ to $1$: create (i,m)-partitions, then process each mini-bucket. (For mpe: assign values along the ordering $d$.) Return the mpe-configuration and the upper and lower bounds.

21

22 Partitioning, Refinements

23 Properties of MBE(i). Complexity: O(r exp(i)) time and O(exp(i)) space. Yields a lower bound and an upper bound; accuracy is determined by the upper/lower (U/L) bound ratio. Possible uses of mini-bucket approximations: as anytime algorithms; as heuristics in search. Other tasks admit similar mini-bucket approximations: belief updating, marginal MAP, MEU, WCSP, Max-CSP. [Dechter and Rish, 1997], [Liu and Ihler, 2011], [Liu and Ihler, 2013]

24 Anytime Approximation

25 MBE for Belief Updating and for Probability of Evidence or Partition Function. The mini-bucket idea is the same: $\sum_X f(x)\,g(x) \le \big(\sum_X f(x)\big) \cdot \big(\max_X g(x)\big)$. So we can apply a sum in each mini-bucket, or better, one sum and the rest max, or min (for a lower bound). MBE-bel-max(i,m) and MBE-bel-min(i,m) generate upper and lower bounds on beliefs, approximating BE-bel. MBE-map(i,m): max mini-buckets are maximized, sum mini-buckets are sum-max; approximates BE-map.
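
A quick numeric check of the sum/max split that MBE-bel uses, on made-up tables:

```python
# Mini-bucket inequality for summation (made-up tables):
#   sum_x f(x) g(x)  <=  (sum_x f(x)) * (max_x g(x))
import numpy as np

f = np.array([0.2, 0.5, 0.3])
g = np.array([1.0, 4.0, 2.0])

exact = (f * g).sum()          # 0.2 + 2.0 + 0.6 = 2.8
upper = f.sum() * g.max()      # 1.0 * 4.0 = 4.0
lower = f.sum() * g.min()      # 1.0 * 1.0 = 1.0
print(lower, exact, upper)     # lower <= exact <= upper
```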

26

27 Methodology for Empirical Evaluation (for mpe). Measure U/L accuracy: the closer U/mpe or mpe/L is to 1, the better. Benchmarks: random networks. Given n, e, v, generate a random DAG; for each $x_i$ and its parents, generate a table from uniform [0,1], or noisy-OR. Create k instances; for each, generate random evidence or likely evidence. Measure averages.

28 Upper/Lower bounds on CPCS networks, medical diagnosis (noisy-OR model); test case: no evidence. The plots show anytime-mpe($\epsilon$) U/L error vs. time on cpcs360b and cpcs422b; a table compares elim-mpe with anytime-mpe($\epsilon = 10^{-1}$) and anytime-mpe($\epsilon = 10^{-4}$) by time (sec) and parameter i on cpcs360 and cpcs422.

29 Outline: Mini-bucket elimination; Weighted mini-bucket; Mini-clustering; Re-parameterization, cost-shifting; Iterative belief propagation; Iterative join-graph propagation.

30 The Power Sum and Hölder Inequality

31 Working Example. Model: Markov network over A, B, C. Task: partition function. (Qiang Liu slides)

32 Mini-Bucket (Basic Principles): upper bound and lower bound. (Qiang Liu slides)

33 Hölder Inequality: for weights $w_1, w_2 > 0$ with $w_1 + w_2 = 1$, $\sum_x f_1(x) f_2(x) \le \big(\sum_x f_1(x)^{1/w_1}\big)^{w_1} \big(\sum_x f_2(x)^{1/w_2}\big)^{w_2}$; equality is achieved when $f_1^{1/w_1}$ and $f_2^{1/w_2}$ are proportional. (Qiang Liu slides) G. H. Hardy, J. E. Littlewood and G. Pólya, Inequalities, Cambridge Univ. Press, London and New York, 1934.

34 Reverse Hölder Inequality: if one weight is negative (e.g., $w_1 > 0$, $w_2 < 0$ with $w_1 + w_2 = 1$), the same expression holds but the direction of the inequality reverses, giving a lower bound. (Qiang Liu slides) G. H. Hardy, J. E. Littlewood and G. Pólya, Inequalities, Cambridge Univ. Press, London and New York, 1934.

35 Mini-Bucket as Hölder Inequality: the max and min mini-bucket bounds (mbe-bel-max and mbe-bel-min) are the limiting cases of the weighted bounds. (Qiang Liu slides)

36 Weighted Mini-Bucket (for summation). Exact bucket elimination: $\lambda_B(a,c,d,e) = \sum_b f(a,b)\,f(b,c)\,f(b,d)\,f(b,e)$. Weighted mini-buckets: $\lambda_B \le \big({\textstyle\sum_b^{w_1}} f(a,b) f(b,c)\big) \cdot \big({\textstyle\sum_b^{w_2}} f(b,d) f(b,e)\big) = \lambda_{B\to C}(a,c)\,\lambda_{B\to D}(d,e)$, where $\sum_x^w f(x) = \big(\sum_x f(x)^{1/w}\big)^w$ is the weighted or power sum operator, and $\sum_x^w f_1(x) f_2(x) \le \big(\sum_x^{w_1} f_1(x)\big)\big(\sum_x^{w_2} f_2(x)\big)$ where $w_1 + w_2 = w$ and $w_1, w_2 > 0$ (a lower bound if $w_1 > 0, w_2 < 0$). Buckets C, D, E, A are processed as before; U = upper bound. [Liu and Ihler, 2011]
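
The power-sum operator and its Hölder bound are easy to check numerically; this sketch uses made-up tables and weights:

```python
# Weighted (power) sum and the Hölder bound from the slide:
#   sum_x f1 f2 <= wsum(f1, w1) * wsum(f2, w2)  when w1 + w2 = 1, w1, w2 > 0.
import numpy as np

def wsum(f, w):
    """Power sum: (sum_x f(x)^(1/w))^w. As w -> 0+ this tends to max_x f(x)."""
    return np.power(np.power(f, 1.0 / w).sum(), w)

f1 = np.array([0.5, 2.0, 1.0])   # made-up tables over one variable x
f2 = np.array([1.5, 0.3, 0.8])

exact = (f1 * f2).sum()
for w1 in (0.2, 0.5, 0.9):
    print(w1, wsum(f1, w1) * wsum(f2, 1.0 - w1))   # each is >= exact
print("exact:", exact)
```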

37

38 Choosing the weights (Cauchy-Schwarz inequality). (Qiang Liu slides)

39 Weighted Mini-Bucket for Marginal MAP

40 MB and WMB for Marginal MAP: $\max_Y \sum_{X \setminus Y} \prod_j P_j$. Sum buckets use weighted sums, e.g. $\lambda_{B\to C}(a,c) = \sum_b^{w_1} f(a,b) f(b,c)$ and $\lambda_{B\to D}(d,e) = \sum_b^{w_2} f(b,d) f(b,e)$ with $w_1 + w_2 = 1$; max buckets use max, e.g. $\lambda_{E\to A}(a) = \max_e \lambda_{C\to E}(a,e)\,\lambda_{D\to E}(a,e)$ and $U = \max_a f(a)\,\lambda_{E\to A}(a)$. One can optimize over cost-shifting and weights (single-pass moment matching or iterative message passing). U = upper bound. [Liu and Ihler, 2011; 2013] [Dechter and Rish, 2003]

41 MBE-map: process max buckets with max mini-buckets, and sum buckets with one sum mini-bucket and max mini-buckets for the rest.

42 Initial partitioning

43

44 Complexity and Tractability of MBE(i,m)

45 Outline: Mini-bucket elimination; Weighted mini-bucket; Mini-clustering; Re-parameterization, cost-shifting; Iterative belief propagation; Iterative join-graph propagation.

46 Join-Tree Clustering (Cluster-Tree Elimination). Clusters: 1 = ABC, 2 = BCDF, 3 = BEF, 4 = EFG, with separators BC, BF, EF. Messages: $h_{(1,2)}(b,c) = \sum_a p(a)\,p(b\mid a)\,p(c\mid a,b)$; $h_{(2,1)}(b,c) = \sum_{d,f} p(d\mid b)\,p(f\mid c,d)\,h_{(3,2)}(b,f)$; $h_{(2,3)}(b,f) = \sum_{c,d} p(d\mid b)\,p(f\mid c,d)\,h_{(1,2)}(b,c)$; $h_{(3,2)}(b,f) = \sum_e p(e\mid b,f)\,h_{(4,3)}(e,f)$; $h_{(3,4)}(e,f) = \sum_b p(e\mid b,f)\,h_{(2,3)}(b,f)$; $h_{(4,3)}(e,f) = p(G = g_e \mid e,f)$. EXACT algorithm; time and space exp(cluster size) = exp(treewidth).

47

48 Mini-Clustering, i-bound = 3. Cluster 1 = {A,B,C}: p(a), p(b|a), p(c|a,b). Cluster 2 = {B,C,D,F} splits into mini-clusters {B,C,D}: p(d|b), $h_{(1,2)}(b,c)$ and {C,D,F}: p(f|c,d). Cluster 3 = {B,E,F}: p(e|b,f), $h^1_{(2,3)}(b)$, $h^2_{(2,3)}(f)$. Cluster 4 = {E,F,G}: p(g|e,f). Messages: $h^1_{(1,2)}(b,c) = \sum_a p(a)\,p(b\mid a)\,p(c\mid a,b)$; $h^1_{(2,3)}(b) = \sum_{c,d} p(d\mid b)\,h_{(1,2)}(b,c)$; $h^2_{(2,3)}(f) = \max_{c,d} p(f\mid c,d)$. APPROXIMATE algorithm; time and space exp(i-bound), the max number of variables in a mini-cluster.

49 Mini-Clustering Example (clusters ABC, BCDF, BEF, EFG; separators BC, BF, EF). $H_{(1,2)} = \{ h^1_{(1,2)}(b,c) = \sum_a p(a)\,p(b\mid a)\,p(c\mid a,b) \}$. $H_{(2,1)} = \{ h^1_{(2,1)}(b) = \sum_{d,f} p(d\mid b)\,h^1_{(3,2)}(b,f),\ h^2_{(2,1)}(c) = \max_{d,f} p(f\mid c,d) \}$. $H_{(2,3)} = \{ h^1_{(2,3)}(b) = \sum_{c,d} p(d\mid b)\,h^1_{(1,2)}(b,c),\ h^2_{(2,3)}(f) = \max_{c,d} p(f\mid c,d) \}$. $H_{(3,2)} = \{ h^1_{(3,2)}(b,f) = \sum_e p(e\mid b,f)\,h^1_{(4,3)}(e,f) \}$. $H_{(3,4)} = \{ h^1_{(3,4)}(e,f) = \sum_b p(e\mid b,f)\,h^1_{(2,3)}(b)\,h^2_{(2,3)}(f) \}$. $H_{(4,3)} = \{ h^1_{(4,3)}(e,f) = p(G = g_e \mid e,f) \}$.
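
A sketch of one mini-clustering step from this example, computing the two messages in $H_{(2,3)}$ with numpy; the tables are random stand-ins:

```python
# Cluster 2 = {B,C,D,F} holds p(d|b), h_{(1,2)}(b,c), p(f|c,d); with i-bound 3
# it splits into {p(d|b), h_{(1,2)}} (summed) and {p(f|c,d)} (maximized).
import numpy as np

rng = np.random.default_rng(0)
p_d_b  = rng.random((2, 2))      # indexed [b, d]
h_12   = rng.random((2, 2))      # indexed [b, c]
p_f_cd = rng.random((2, 2, 2))   # indexed [c, d, f]

# First mini-cluster: sum out C and D -> message on B
h1_23_b = np.einsum('bd,bc->b', p_d_b, h_12)
# Second mini-cluster: max out C and D -> message on F (upper-bounding piece)
h2_23_f = p_f_cd.max(axis=(0, 1))

print(h1_23_b, h2_23_f)
```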

50 Cluster-Tree Elimination vs. Mini-Clustering. CTE sends one exact message per edge, e.g. $h_{(1,2)}(b,c)$, $h_{(2,1)}(b,c)$, $h_{(2,3)}(b,f)$, $h_{(3,2)}(b,f)$, $h_{(3,4)}(e,f)$, $h_{(4,3)}(e,f)$. MC sends sets of smaller messages, e.g. $H_{(2,1)} = \{ h^1_{(2,1)}(b), h^2_{(2,1)}(c) \}$ and $H_{(2,3)} = \{ h^1_{(2,3)}(b), h^2_{(2,3)}(f) \}$.

51 Linear Block Codes: input and parity bits A ... H are transmitted through a Gaussian channel with noise $\sigma$, producing the received values a ... h and $p_1, \ldots, p_6$.

52 Probabilistic decoding of an error-correcting linear block code. State of the art: the approximate algorithm iterative belief propagation (IBP), Pearl's poly-tree algorithm applied to loopy networks.

53 Iterative Belief Propagation. Belief propagation is exact for poly-trees; IBP applies BP iteratively to cyclic networks. One step: update $BEL(u_1)$ by combining the causal ($\pi$) and diagnostic ($\lambda$) messages from the neighbors $U_2, U_3, X_1, X_2$. No guarantees of convergence; works well for many coding networks.
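
For concreteness, here is a generic loopy BP sketch. The slide's example is a Bayesian network; this sketch uses a pairwise Markov network instead (an assumption for brevity), with made-up potentials on a 3-cycle:

```python
# Loopy (iterative) belief propagation on a pairwise Markov network.
import numpy as np

def loopy_bp(psi, edges, n, card=2, iters=30):
    """psi[(i, j)]: pairwise table indexed [x_i, x_j]; m[(i, j)] is the
    message from i to j, a function of x_j."""
    m = {(i, j): np.ones(card) for i, j in edges}
    m.update({(j, i): np.ones(card) for i, j in edges})
    for _ in range(iters):
        for (i, j) in list(m):
            # product of messages into i from neighbors other than j
            incoming = np.ones(card)
            for (k, l) in m:
                if l == i and k != j:
                    incoming = incoming * m[(k, l)]
            table = psi[(i, j)] if (i, j) in psi else psi[(j, i)].T
            new = (table * incoming[:, None]).sum(axis=0)   # sum out x_i
            m[(i, j)] = new / new.sum()
    bel = [np.ones(card) for _ in range(n)]                  # beliefs:
    for (k, l) in m:                                         # product of all
        bel[l] = bel[l] * m[(k, l)]                          # incoming msgs
    return [b / b.sum() for b in bel]

# A 3-cycle (loopy) example with made-up potentials:
rng = np.random.default_rng(1)
edges = [(0, 1), (1, 2), (0, 2)]
psi = {e: rng.random((2, 2)) for e in edges}
print(loopy_bp(psi, edges, n=3))
```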

54 MBE-mpe vs. IBP: IBP is better on low-w* codes; approx-mpe is better on randomly generated (high-w*) codes. Bit error rate (BER) as a function of noise (sigma).

55 Grid 15x15, evid=10, w*=22, 10 instances: NHD, absolute error, relative error, and time (seconds) vs. i-bound, comparing MC and IBP.

56 Heuristics for partitioning (Dechter and Rish, 2003; Rollon and Dechter, 2010). Scope-based Partitioning Heuristic (SCP): aims at minimizing the number of mini-buckets in the partition by including in each mini-bucket as many functions as possible as long as the i-bound is satisfied. Alternatively, use a greedy heuristic derived from a distance function to decide which functions go into a single mini-bucket.

57 Greedy Scope-based Partitioning

58 Heuristic for Partitioning. Scope-based Partitioning Heuristic: the scope-based partition heuristic (SCP) aims at minimizing the number of mini-buckets in the partition by including in each mini-bucket as many functions as possible as long as the i-bound is satisfied. First, single-function mini-buckets are ordered decreasingly by their arity, from left to right. Then each mini-bucket is absorbed into the left-most mini-bucket with which it can be merged. The time complexity of Partition(B, i), where B is the bucket to be partitioned and |B| the number of functions in the bucket, is O(|B| log|B| + |B|^2) using the SCP heuristic. The scope-based heuristic is quite fast; its shortcoming is that it does not consider the actual information in the functions.
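
A minimal sketch of this scope-based partitioning, exercised on the bucket-B scopes from the MBE-mpe(3) example; the function names are illustrative:

```python
# Scope-based partitioning (SCP) sketch: sort functions by arity descending,
# then absorb each into the left-most mini-bucket that stays within i_bound.
def scope_based_partition(functions, i_bound):
    """functions: list of (name, scope_set). Returns mini-buckets, each a
    pair [function_list, union_scope] with at most i_bound variables."""
    ordered = sorted(functions, key=lambda f: len(f[1]), reverse=True)
    minibuckets = []
    for f in ordered:
        for mb in minibuckets:
            merged = mb[1] | f[1]
            if len(merged) <= i_bound:      # left-most feasible merge
                mb[0].append(f)
                mb[1] = merged
                break
        else:
            minibuckets.append([[f], set(f[1])])
    return minibuckets

bucket_B = [('f_ab', {'a', 'b'}), ('f_bc', {'b', 'c'}),
            ('f_bd', {'b', 'd'}), ('f_be', {'b', 'e'})]
for funcs, scope in scope_based_partition(bucket_B, i_bound=3):
    print([name for name, _ in funcs], sorted(scope))
# -> two mini-buckets over 3 variables each: {a,b,c} and {b,d,e}
```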

59 Greedy Partition as a function of a distance function h

60 Outline: Mini-bucket elimination; Weighted mini-bucket; Mini-clustering; Iterative belief propagation; Iterative join-graph propagation; Re-parameterization, cost-shifting.

61 Iterative Belief Propagation. Belief propagation is exact for poly-trees; IBP applies BP iteratively to cyclic networks (one step updates $BEL(u_1)$ from the incoming $\pi$ and $\lambda$ messages). No guarantees of convergence; works well for many coding networks. Let's combine the iterative nature with anytime behavior: IJGP.

62 Iterative Join-Graph Propagation. Loopy Belief Propagation: cyclic graphs, iterative; converges fast in practice (no guarantees though); very good approximations (e.g., turbo decoding, LDPC codes, SAT survey propagation). Mini-Clustering(i): tree decompositions; only two sets of messages (inward, outward); anytime behavior, can improve with more time by increasing the i-bound. We want to combine the iterative virtues of loopy BP with the anytime behavior of Mini-Clustering(i).

63 IJGP, the basic idea: apply Cluster Tree Elimination to any join-graph. We commit to graphs that are I-maps, and avoid cycles as long as I-mapness is not violated. Result: use minimal arc-labeled join-graphs.

64 Minimal Arc-Labeled Decomposition: (a) a fragment of an arc-labeled join-graph (clusters ABCDE, BCE, CDEF with labels BC, CDE, CE); (b) the same fragment with labels shrunk to make it a minimal arc-labeled join-graph. Use a DFS algorithm to eliminate cycles relative to each variable.

65 Minimal arc-labeled join-graph

66 Message propagation on a join-graph with clusters 1 = {A,B,C,D,E} (holding p(a), p(c), p(b|a,c), p(d|a,b,e), p(e|b,c)), 2 = {C,D,E,F}, and 3 = {B,C,E}. Minimal arc-label: sep(1,2) = {D,E}, elim(1,2) = {A,B,C}: $h_{(1,2)}(d,e) = \sum_{a,b,c} p(a)\,p(c)\,p(b\mid a,c)\,p(d\mid a,b,e)\,p(e\mid b,c)\,h_{(3,1)}(b,c)$. Non-minimal arc-label: sep(1,2) = {C,D,E}, elim(1,2) = {A,B}: $h_{(1,2)}(c,d,e) = \sum_{a,b} p(a)\,p(c)\,p(b\mid a,c)\,p(d\mid a,b,e)\,p(e\mid b,c)\,h_{(3,1)}(b,c)$.

67 IJGP Example: a belief network over variables A through J and the corresponding loopy BP join-graph, with clusters ABC, ABDE, BCE, CDEF, FGH, FGI, GHIJ and arcs labeled by shared variables.

68 Arc-Minimal Join-Graph: in an arc-minimal join-graph, the arcs labeled with any single variable should form a TREE (same clusters as before, with some labels removed).

69 Collapsing Clusters: repeatedly merging adjacent clusters (e.g., ABC and ABDE into ABCDE, then FGH and FGI into FGHI) produces a sequence of coarser join-graphs, ending with clusters ABCDE, CDEF, FGHI, GHIJ.

70 Join-Graphs: a spectrum of decompositions of the same network, from small clusters with single-variable labels (less complexity) to larger clusters approaching a join-tree (more accuracy).

71 Bounded decompositions. We want arc-labeled decompositions such that: the cluster size (internal width) is bounded by i (the accuracy parameter), and the width of the decomposition as a graph (external width) is as small as possible. Possible approaches to building decompositions: partition-based algorithms, inspired by the mini-bucket decomposition; grouping-based algorithms.

72 Constructing Join-Graphs: (a) a schematic mini-bucket(i), i = 3, with buckets G: (GFE); E: (EBF) (EF); F: (FCD) (BF); D: (DB) (CD); C: (CAB) (CB); B: (BA) (AB) (B); A: (A); (b) the corresponding arc-labeled join-graph decomposition with clusters GFE [P(G|F,E)], EBF [P(E|B,F)], FCD [P(F|C,D)], CDB [P(D|B)], CAB [P(C|A,B)], BA [P(B|A)], and A [P(A)].

73 IJGP properties. IJGP(i) applies BP to a minimal arc-labeled join-graph whose cluster size is bounded by i. On join-trees, IJGP finds exact beliefs. IJGP is a Generalized Belief Propagation algorithm (Yedidia, Freeman, Weiss 2001). Complexity of one iteration: time $O(\mathrm{deg} \cdot (n+N) \cdot d^{i+1})$, space $O(N \cdot d^i)$, where deg is the maximum degree, n the number of variables, N the number of clusters, and d the maximum domain size.

74 Empirical evaluation. Algorithms: Exact, IBP, MC, IJGP. Networks (all variables are binary): random networks, grid networks (MxM), CPCS 54, 360, 422, coding networks. Measures: absolute error, relative error, Kullback-Leibler (KL) distance, bit error rate, time.

75 Coding Networks, Bit Error Rate: BER vs. i-bound for IJGP, MC, and IBP on coding networks (N = 400, w* = 43, 30 iterations) at noise levels σ = 0.22, 0.32, 0.51, and 0.65 (500-1000 instances each).

76 CPCS 422, KL Distance: KL distance vs. i-bound for IJGP (30 iterations, at convergence), MC, and IBP (10 iterations, at convergence); one instance, w* = 23, with evidence = 0 and evidence = 30.

77 CPCS 422, KL vs. Iterations: KL distance vs. number of iterations for IJGP(3), IJGP(10), and IBP; one instance, w* = 23, with evidence = 0 and evidence = 30.

78 Coding networks, Time: time (seconds) vs. i-bound for IJGP (30 iterations), MC, and IBP (30 iterations) on coding networks (N = 400, 500 instances, w* = 43).

79 Outline: Mini-bucket elimination; Weighted mini-bucket; Mini-clustering; Iterative belief propagation; Iterative join-graph propagation; Re-parameterization, cost-shifting.

80 Cost-Shifting (Reparameterization). Given f(A,B) and f(B,C), move a function λ(B) between them: f(A,B) + λ(B) and f(B,C) - λ(B). The combined function f(A,B,C) = f(A,B) + f(B,C) is unchanged: modify the individual functions, but keep the sum of functions the same.

81 Tightening the bound. Reparameterization (or cost shifting): decrease the bound without changing the overall function. $f_1(A,B) + f_2(B,C) = F(A,B,C) = \big(f_1(A,B) + \lambda(B)\big) + \big(f_2(B,C) - \lambda(B)\big)$. The adjusting functions cancel each other, and for a good choice of λ the decomposition bound becomes exact.
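
A small numeric illustration (made-up tables) that shifting λ(B) leaves the total function unchanged while tightening the decomposition bound; matching the min-marginals over B is one standard choice of λ, shown here as a sketch:

```python
# Reparameterization demo: bound(lam) = min f1+lam + min f2-lam <= min F.
import numpy as np

f1 = np.array([[3.0, 0.0], [1.0, 2.0]])   # f1[a, b] (made up)
f2 = np.array([[0.0, 4.0], [2.0, 1.0]])   # f2[b, c]

def bound(lam):
    return (f1 + lam[None, :]).min() + (f2 - lam[:, None]).min()

F = f1[:, :, None] + f2[None, :, :]        # total F(a, b, c), unchanged by lam
print("exact min:", F.min())
print("bound, lam=0:", bound(np.zeros(2)))
# choose lam to equalize the min-marginals of the two pieces over B:
m1 = f1.min(axis=0)                        # min over a, per value of b
m2 = f2.min(axis=1)                        # min over c, per value of b
lam = (m2 - m1) / 2.0
print("bound after shift:", bound(lam))    # tightened (here: equal to exact)
```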

82 Dual Decomposition: $f_{12}(x_1,x_2)$, $f_{13}(x_1,x_3)$, $f_{23}(x_2,x_3)$ over a triangle are split into independent pieces with duplicated variables. $F = \min_x \sum_\alpha f_\alpha(x_\alpha) \ge \sum_\alpha \min_{x_\alpha} f_\alpha(x_\alpha)$. Bound the solution using decomposed optimization: solving each piece independently gives an optimistic bound.

83 Dual Decomposition with reparameterization: add functions $\lambda^i_\alpha(x_i)$ to the pieces, subject to $\sum_\alpha \lambda^j_\alpha(x_j) = 0$ for every $j$, so the overall function is unchanged. $F = \min_x \sum_\alpha f_\alpha(x_\alpha) \ge \max_\lambda \sum_\alpha \min_{x_\alpha} \big( f_\alpha(x_\alpha) + \sum_{i \in \alpha} \lambda^i_\alpha(x_i) \big)$. Bound the solution using decomposed optimization; solve independently for an optimistic bound; tighten the bound by reparameterization, enforcing the lost equality constraints via Lagrange multipliers.

84 Dual Decomposition (continued). Many names for the same class of bounds: dual decomposition [Komodakis et al. 2007]; TRW, MPLP [Wainwright et al. 2005; Globerson & Jaakkola 2007]; soft arc consistency [Cooper & Schiex 2004]; max-sum diffusion [Werner 2007].

85 Dual Decomposition (continued). Many ways to optimize the bound: sub-gradient descent [Komodakis et al. 2007; Jojic et al. 2010]; coordinate descent [Werner 2007; Globerson & Jaakkola 2007; Sontag et al. 2009; Ihler et al. 2012]; proximal optimization [Ravikumar et al. 2010]; ADMM [Meshi & Globerson 2011; Martins et al. 2011; Forouzan & Ihler 2013].

86 Optimizing the bound. Can optimize the bound in various ways, e.g. by sub-gradient descent: repeatedly update $\lambda(B)$ in the decomposition $f_1(A,B) + \lambda(B)$, $f_2(B,C) - \lambda(B)$.
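
A minimal sketch of the sub-gradient update for this two-factor decomposition; the tables and the diminishing step size are made-up assumptions:

```python
# Sub-gradient ascent on the (concave) dual bound
#   bound(lam) = min_{a,b}[f1(a,b)+lam(b)] + min_{b,c}[f2(b,c)-lam(b)].
import numpy as np

f1 = np.array([[3.0, 0.0], [1.0, 2.0]])   # f1[a, b] (made up)
f2 = np.array([[0.0, 4.0], [2.0, 1.0]])   # f2[b, c]
lam, best = np.zeros(2), -np.inf

for t in range(50):
    g1 = f1 + lam[None, :]                 # piece 1 with its copy of B
    g2 = f2 - lam[:, None]                 # piece 2 with its copy of B
    best = max(best, g1.min() + g2.min())  # current lower bound on min F
    b1 = np.unravel_index(g1.argmin(), g1.shape)[1]   # piece 1's choice of b
    b2 = np.unravel_index(g2.argmin(), g2.shape)[0]   # piece 2's choice of b
    if b1 == b2:                           # the two copies of B agree
        break
    grad = np.zeros(2)
    grad[b1], grad[b2] = grad[b1] + 1.0, grad[b2] - 1.0   # supergradient
    lam += grad / (t + 1.0)                # diminishing step size

print("best bound:", best)                 # approaches min_{a,b,c} f1 + f2
```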


90 Various Update Schemes. Can use any decomposition updates (message passing, subgradient, augmented Lagrangian, etc.). FGLP: update the original factors. JGLP: update the clique functions of the join graph. MBE-MM: mini-bucket with moment matching, applying cost-shifting within each bucket only.

91 Factor-Graph Linear Programming (FGLP): update the original factors. Tighten all factors over $x_i$ simultaneously: compute the max-marginal of each factor over $x_i$, then shift each factor so that all of its max-marginals over $x_i$ equal their average.
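
A sketch of one FGLP-style update in log space for a single variable B shared by two factors; the tables are random stand-ins:

```python
# One FGLP-style max-marginal matching step (log space, made-up tables).
import numpy as np

rng = np.random.default_rng(2)
f1 = np.log(rng.random((2, 2)))       # log f1[a, b]
f2 = np.log(rng.random((2, 2)))       # log f2[b, c]

def upper_bound(f1, f2):
    return f1.max() + f2.max()        # decomposition bound on max_{abc} f1+f2

print("before:", upper_bound(f1, f2))
g1 = f1.max(axis=0)                   # max-marginal of f1 over B
g2 = f2.max(axis=1)                   # max-marginal of f2 over B
avg = (g1 + g2) / 2.0
f1 += (avg - g1)[None, :]             # shift each factor toward the average;
f2 += (avg - g2)[:, None]             # the total f1 + f2 is unchanged
print("after:", upper_bound(f1, f2))  # never larger; often strictly smaller
```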

92 FGLP (Factor Graph Linear Programming)

93

94 Factor Graph Linear Programming

95 Complexity of FGLP

96 Mini-Bucket as Decomposition [Ihler et al. 2012]. The downward pass can be viewed as cost shifting on a join graph (e.g., regions {A,B,C}, {A,C,E}, {B,D,E}, {A,D,E}, {A,E}, {A} with separators {A,C}, {B}, {D,E}, {A,E}, {A}). One can also do cost shifting within mini-buckets: join-graph message passing. Moment-matching version: one message exchange within each bucket during the downward sweep. The optimal bound is defined by the cliques ("regions") and the cost-shifting function scopes ("coordinates"). U = upper bound.

97 MBE-MM: MBE with moment matching. Bucket B splits into mini-buckets {P(b|a), P(d|a,b)} and {P(e|b,c)}; they exchange moment-matching messages $m_{11}, m_{12}$ before each is maximized over B, producing $h_B(a,d)$ and $h_B(c,e)$. Bucket C: P(c|a), $h_B(c,e)$ generates $h_C(a,e)$; bucket D: $h_B(a,d)$ generates $h_D(a)$; bucket E: e = 0, $h_C(a,e)$ generates $h_E(a)$; bucket A: P(a), $h_E(a)$, $h_D(a)$. w = 2. MPE* is an upper bound on MPE (U); generating a solution yields a lower bound (L).

98 MBE-MM (MBE with Moment-Matching)

99 Anytime Approximation. Can tighten the bound in various ways: cost-shifting (improve consistency between cliques) or increasing the i-bound (higher-order consistency). A simple moment-matching step improves the bound significantly.


102 Outline: Mini-bucket elimination; Weighted mini-bucket; Mini-clustering; Iterative belief propagation; Iterative join-graph propagation; Re-parameterization, cost-shifting.
