Symbolic Methods for Probabilistic Inference, Optimization, and Decision-making Scott Sanner


1 Symbolic Methods for Probabilistic Inference, Optimization, and Decision-making Scott Sanner With much thanks to research collaborators: Zahra Zamani, Ehsan Abbasnejad, Karina Valdivia Delgado, Leliane Nunes de Barros, Luis Gustavo Rocha Vianna, Cheng (Simon) Fang

2 Graphical Models are Pervasive in AI Medical Pathfinder: Expert System BUGS: Epidemiology Text LDA and 1,000 Extensions Vision Ising Model! Robotics Dynamics and sensor models

3 Graphical Models + Symbolic Methods Specify a graphical model for problem at hand Text, vision, robotics, etc. Goal: efficient inference and optimization in this model Be it discrete or continuous Symbolic methods (e.g., decision diagrams) facilitate this! Useful as building block in any inference algorithm Exploit structure for compactness, efficient computation Automagically! Exploit more structure than graphical model alone Still partially a dream, but many recent advances

4 Tutorial Outline Part I: Symbolic Methods for Discrete Inference Graphical Models and Influence Diagrams Symbolic Inference with Decision Diagrams Part II: Extensions to Continuous Inference Case Calculus Extended ADD (XADD) Part III: Applications Graphical Model Inference Sequential Decision-making Constrained Optimization

5 Part I: Symbolic Methods for Discrete Inference and Optimization

6 Directed Graphical Models Bayesian Network: compact (factored) specification of a joint probability. E.g., with binary variables B, F, A, H, P: Bubonic Plague (B), Stomach Flu (F), Appendicitis (A), Severe Headache (H), Abdominal Pain (P). The joint factorizes according to model structure: P(B,F,A,H,P) = P(H|B,F) P(P|F,A) P(B) P(F) P(A)

7 Undirected Graphical Models Markov Random Fields (MRFs): variables V1, V2, V3, V4 take on possible configurations; factors denote compatibility of configurations: P(V1,V2,V3,V4) = (1/Z) F(V1,V2) F(V2,V4) F(V1,V3) F(V3,V4). Conditional MRFs (CRFs): P(V1,V2|V3,V4) = (1/Z(V3,V4)) F(V1,V2) F(V2,V4) F(V1,V3) F(V3,V4). How to compute the normalizer Z? Note: the representation above is a factor graph (FG), which works for directed models as well. For FGs of directed models, what conditions additionally hold?
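The normalizer question on this slide can be answered by brute force for small models. A minimal sketch, assuming a hypothetical pairwise compatibility factor F(u,v) = 2 when the endpoints agree and 1 otherwise (the slide does not give the actual factor values):

```python
from itertools import product

# Hypothetical compatibility factor over binary variables:
# favors configurations where the two endpoints agree.
def F(u, v):
    return 2.0 if u == v else 1.0

# Brute-force normalizer for the 4-variable MRF on the slide:
# P(V1..V4) = (1/Z) F(V1,V2) F(V2,V4) F(V1,V3) F(V3,V4)
def partition_function():
    return sum(F(v1, v2) * F(v2, v4) * F(v1, v3) * F(v3, v4)
               for v1, v2, v3, v4 in product([0, 1], repeat=4))

Z = partition_function()

# Sanity check: dividing by Z makes the probabilities sum to 1.
total = sum(F(v1, v2) * F(v2, v4) * F(v1, v3) * F(v3, v4) / Z
            for v1, v2, v3, v4 in product([0, 1], repeat=4))
```

Enumeration is exponential in the number of variables, which is exactly why the factored inference methods in the rest of this part matter.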

8 Dynamical Models & Influence Diagrams Dynamical models: represent times t and t+1 over state variables X1, X2; assume a stationary distribution. Influence diagrams: action nodes [squares] are not random variables but controlled variables; utility nodes <diamonds> give a utility conditioned on state, e.g. U(X1,X2) = if (X1 = X2) then 10 else 0

9 Discrete Inference & Optimization Probabilistic queries in Bayesian networks: want P(Query|Evidence), e.g. P(E|X) = P(E,X) / P(X), where P(E,X) = Σ_A Σ_B P(A,B,E,X) = Σ_A Σ_B P(A|B,E) P(X|B) P(E) P(B). Maximizing expected utility in influence diagrams: want the optimal action a* = argmax_a E[U | A=a, ...], e.g. a* = argmax_a E[U | A=a, X=x] = argmax_a Σ_X' U(X') P(X' | A=a, X=x)

10 Manipulating Discrete Distributions Marginalization: Σ_b P(A, b) = P(A), i.e., sum B out of the table for P(A,B) to obtain the table for P(A)

11 Manipulating Discrete Distributions Maximization: max_b P(A, b), i.e., maximize B out of the table for P(A,B), keeping for each A the best value over B (e.g., 0.3 at B=1, 1.4 at B=0)

12 Manipulating Discrete Distributions Binary multiplication: P(A|B) ⊗ P(B|C) = P(A,B|C), a pointwise product of tables over the union of their variables. The same principle holds for all binary ops: +, -, /, max, etc.

13 Discrete Inference & Optimization Observation 1: all discrete functions can be represented as tables, e.g. P(A,B). Observation 2: all needed operations are computable in closed form: f1 ⊕ f2, f1 ⊗ f2, max(f1, f2), min(f1, f2), Σ_x f(x), (arg)max_x f(x), (arg)min_x f(x). Now we can do exact inference and optimization in discrete graphical models and influence diagrams!
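As a concrete illustration of Observations 1 and 2, here is a minimal tabular-factor sketch; the helper names and the probability values are hypothetical, not the tutorial's code:

```python
from itertools import product

# A factor over named binary variables, stored as a fully enumerated table.
def make_factor(vars_, fn):
    return {assignment: fn(dict(zip(vars_, assignment)))
            for assignment in product([0, 1], repeat=len(vars_))}

def multiply(vars1, f1, vars2, f2):
    """Pointwise product f1 (x) f2 over the union of their variables."""
    vars_out = list(dict.fromkeys(vars1 + vars2))
    def fn(a):
        return (f1[tuple(a[v] for v in vars1)] *
                f2[tuple(a[v] for v in vars2)])
    return vars_out, make_factor(vars_out, fn)

def marginalize(vars_, f, var):
    """Sum out one variable: (sum_x f)(rest)."""
    keep = [v for v in vars_ if v != var]
    out = {}
    for assignment, val in f.items():
        a = dict(zip(vars_, assignment))
        key = tuple(a[v] for v in keep)
        out[key] = out.get(key, 0.0) + val
    return keep, out

# P(A|B) and P(B) with made-up numbers; keys are (A, B) and (B,) tuples.
pA_given_B = {(0, 0): 0.9, (1, 0): 0.1, (0, 1): 0.4, (1, 1): 0.6}
pB = {(0,): 0.7, (1,): 0.3}

vars_ab, joint = multiply(['A', 'B'], pA_given_B, ['B'], pB)  # P(A,B)
vars_a, pA = marginalize(vars_ab, joint, 'B')                 # P(A)
```

Under these assumed numbers, P(A=0) = 0.9·0.7 + 0.4·0.3 = 0.75. Every inference step in the discrete part of this tutorial is some composition of these two operations plus max/min.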

14 Discrete Inference & Optimization Probabilistic queries in Bayesian networks: P(E|X) = P(E,X) / P(X), where P(E,X) = Σ_A Σ_B P(A,B,E,X) = Σ_A Σ_B P(A|B,E) P(X|B) P(E) P(B). Just products and marginals. Maximizing expected utility in influence diagrams: a* = argmax_a E[U | A=a, X=x] = argmax_a Σ_X' U(X') P(X' | A=a, X=x). Product, max, and marginals.

15 Where are we? We can specify discrete models We know operations needed for inference Can we optimize the order of operations?

16 Variable Elimination When marginalizing over y, try to factor out all probabilities independent of y:
P(X1) = Σ_{x2,...,xn,y} P(y|x1,...,xn) P(X1) P(x2) ... P(xn)
= P(X1) [O(1)] · Σ_{x2} P(x2) ... Σ_{xn} P(xn) [O(n)] · Σ_y P(y|x1,...,xn) [O(1)]
Bracketed terms show the number of FLOPs in the tabular case.

17 VE: Variable Order Matters Sums commute, so we can change the order of elimination. Original query, changed variable order:
P(X1) = Σ_{y,xn,...,x2} P(y|x1,...,xn) P(X1) P(x2) ... P(xn)
= Σ_{y,xn,...,x3} P(X1) P(x3) ... P(xn) · Σ_{x2} P(x2) P(y|x1,...,xn) [O(2^{n+1})]
With a different variable order: O(n) becomes O(2^{n+1}). A good variable order minimizes the number of variables in the largest intermediate factor, a.k.a. (roughly) the tree width (TW); here it is n+1. Graphical model inference is ~O(2^TW). Actually 2^{TW+1}, but the point is: exponential in TW.
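The effect of elimination order can be simulated without any numeric tables by tracking only factor scopes. A sketch on a hypothetical star-shaped network (class variable c with leaves e0..e9, querying e0; this is an illustrative model, not the slide's exact one):

```python
# Track only variable scopes during elimination; report the largest
# intermediate table, measured as 2^(#variables) for binary variables.
def max_factor_size(factor_scopes, order):
    scopes = [frozenset(s) for s in factor_scopes]
    largest = 0
    for var in order:
        touching = [s for s in scopes if var in s]
        scopes = [s for s in scopes if var not in s]
        merged = frozenset().union(*touching)   # scope of the new factor
        largest = max(largest, 2 ** len(merged))
        scopes.append(merged - {var})           # var has been summed out
    return largest

n = 10
# P(c) and P(e_i | c) for i = 0..n-1 (scopes only, no numbers needed).
factors = [{"c"}] + [{"c", f"e{i}"} for i in range(n)]

good = [f"e{i}" for i in range(1, n)] + ["c"]   # hub variable c last
bad = ["c"] + [f"e{i}" for i in range(1, n)]    # hub variable c first

size_good = max_factor_size(factors, good)
size_bad = max_factor_size(factors, bad)
```

Eliminating the hub last keeps every intermediate factor at two variables (size 4); eliminating it first builds a factor over all n+1 variables (size 2^11 = 2048), mirroring the slide's O(n) vs. O(2^{n+1}) gap.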

18 Recap Graphical Models and Influence Diagrams why should you care? Versus non-factored models GMs and IDs allow exponential space savings in representation exponential time savings in inference exploit factored structure in variable elimination exponential data reduction for learning smaller models = fewer parameters = less data

19 Where are we? We can specify discrete models We know operations needed for inference We know how to optimize order of operations Is this it? Is there more structure to exploit?

20 Symbolic Inference with Decision Diagrams For Discrete Models

21 DD Definition Decision diagrams (DDs): DAG variant of a decision tree, with ordered decision tests. Used to represent: f: B^n → B (boolean: BDD; sets of subsets, e.g. {{a,b},{a}}: ZDD), f: B^n → Z (integer: MTBDD / ADD), f: B^n → R (real: ADD). We'll focus on ADDs in this tutorial.

22 What's the Big Deal? More than compactness: ordered decision tests in DDs support efficient operations. ADD: -f, f ⊕ g, f ⊗ g, max(f, g). BDD: ¬f, f ∧ g, f ∨ g. ZDD: f \ g, f ∪ g, f ∩ g. Efficient operations are key to inference.

23 Function Representation (Tables) How to represent functions B^n → R? How about a fully enumerated table of F(a,b,c) over all assignments to a, b, c? OK, but can we be more compact?

24 Function Representation (Trees) How about a tree over a, b, c? Sure, and we can simplify it: context-specific independence!

25 Function Representation (ADDs) Why not a directed acyclic graph (DAG)? The Algebraic Decision Diagram (ADD) exploits context-specific independence (CSI) and shared substructure.


27 Trees vs. ADDs AND, OR, XOR over x1, ..., xn: trees can compactly represent AND / OR, but not XOR (linear as an ADD, exponential as a tree). Why? Trees must represent every path.

28 Binary Operations (ADDs) Why do we order variable tests? Ordering enables efficient binary operations. Result: ADD operations can avoid state enumeration.

29 Summary We need B n B / Z / R We need compact representations We need efficient operations DDs are a promising candidate Great, tell me all about DDs OK Not claiming DDs solve all problems but often better than tabular approach

30 Decision Diagrams: Reduce (how to build canonical DDs)

31 How to Reduce an Ordered Tree to an ADD? Recursively build bottom-up. Hash terminal nodes: value → ID (leaf cache). Hash non-terminal functions: (v, ID_high, ID_low) → ID; special case: (v, ID, ID) → ID; all others: keep in the (reduce) cache.

32 GetNode Removes redundant branches; maintains a cache of internal nodes.
Algorithm: GetNode(v, F_h, F_l) → F_r
input: v, F_h, F_l: var and node ids for high/low branches
output: F_r: canonical node id
begin
// If branches redundant, return child
if (F_l = F_h) then return F_l;
// Make new node if not in cache
if (⟨v, F_h, F_l⟩ → id is not in node cache) then
id := currently unallocated id;
insert ⟨v, F_h, F_l⟩ → id in cache;
// Return the cached, canonical node
return id;
end

33 Reduce Algorithm
Algorithm: Reduce(F) → F_r
input: F: node id
output: F_r: canonical node id for reduced ADD
begin
// Check for terminal node
if (F is a terminal node) then return canonical terminal node for the value of F;
// Check reduce cache
if (F → F_r is not in reduce cache) then
// Not in cache, so recurse
F_h := Reduce(F_h); F_l := Reduce(F_l);
// Retrieve canonical form
F_r := GetNode(F_var, F_h, F_l);
// Put in cache
insert F → F_r in reduce cache;
// Return canonical reduced node
return F_r;
end
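The GetNode/Reduce pair on the last two slides can be sketched in a few lines. This is an illustrative toy with its own data layout, not the tutorial's implementation:

```python
node_cache = {}   # (var, high_id, low_id) -> id
leaf_cache = {}   # terminal value -> id
nodes = {}        # id -> ('leaf', value) or ('node', var, high_id, low_id)

def get_leaf(value):
    if value not in leaf_cache:
        leaf_cache[value] = len(nodes)
        nodes[leaf_cache[value]] = ('leaf', value)
    return leaf_cache[value]

def get_node(var, high, low):
    # Redundant test: both branches lead to the same node, so skip it.
    if high == low:
        return high
    key = (var, high, low)
    if key not in node_cache:
        node_cache[key] = len(nodes)
        nodes[node_cache[key]] = ('node', var, high, low)
    return node_cache[key]

def build(tree):
    """Reduce an ordered decision tree, given as nested tuples
    ('var', high_subtree, low_subtree) or a constant leaf."""
    if not isinstance(tree, tuple):
        return get_leaf(tree)
    var, high, low = tree
    return get_node(var, build(high), build(low))

# f(a,b) = if a then (if b then 1 else 0) else (if b then 1 else 0):
# a is irrelevant, so the reduced ADD is just the test on b.
f = build(('a', ('b', 1, 0), ('b', 1, 0)))
g = build(('b', 1, 0))
```

Because the two `('b', 1, 0)` subtrees hash to the same node and `get_node` drops the redundant test on a, `f` and `g` come out as the same canonical id.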

34 Reduce Complexity Linear in the size of the input, which can be a tree or a DAG. Because of caching, Reduce explores each node once and does not need to explore all branches.

35 Canonicity of ADDs via Reduce Claim: if two functions are identical, Reduce will assign both functions the same ID. Proof by induction on var order. Base case: canonical for 0 vars (terminal nodes). Inductive case: assume canonical for k-1 vars; the GetNode result is canonical for the k-th var (only one way to represent it), yielding a unique ID.

36 Impact of Variable Orderings Good orders can matter (compare original var labels vs. relabeled vars). Good orders typically keep related vars together, e.g., a low tree-width order in the transition graphical model. Graph-Based Algorithms for Boolean Function Manipulation, Randal E. Bryant; IEEE Transactions on Computers 1986.

37 In-diagram Reordering Rudell's sifting algorithm: global reordering as pairwise swapping, e.g., Swap(a,b). Only need to redirect arcs; it helps to use pointers, since then we don't need to redirect parents. Can also reorder using Apply later.

38 Beyond Binary Variables Multivalued (MV-)DDs: a DD with multivalued variables is a straightforward k-branch extension, e.g., k=6 with branches R=1, R=2, ..., R=6. Works for ADD extensions as well.

39 Decision Diagrams: Apply (how to do efficient operations on DDs)

40 Recap Recall the Apply recursion: ADD operations can avoid state enumeration. We need to handle the recursive cases and the base cases.

41 Apply Recursion Need to compute F1 op F2, e.g., op ∈ {⊕, ⊖, ⊗, ⊘}. Case 1: F1 & F2 test matching vars: F_h = Apply(F_1,h, F_2,h, op); F_l = Apply(F_1,l, F_2,l, op); F_r = GetNode(var(F_1), F_h, F_l).

42 Apply Recursion Need to compute F1 op F2, e.g., op ∈ {⊕, ⊖, ⊗, ⊘}. Case 2: non-matching vars, with v2 ordered before v1: F_h = Apply(F_1, F_2,h, op); F_l = Apply(F_1, F_2,l, op); F_r = GetNode(var(F_2), F_h, F_l).

43 Apply Base Case: ComputeResult Constant (terminal) nodes and some other cases can be computed without recursion: ComputeResult(F1, F2, op) → F_r.
Table 1: Input and output summaries of ComputeResult (operation and conditions → return value):
F1 op F2, F1 = C1, F2 = C2 → C1 op C2
F1 ⊕ F2, F2 = 0 → F1
F1 ⊕ F2, F1 = 0 → F2
F1 ⊖ F2, F2 = 0 → F1
F1 ⊗ F2, F2 = 1 → F1
F1 ⊗ F2, F1 = 1 → F2
F1 ⊘ F2, F2 = 1 → F1
min(F1, F2), max(F1) ≤ min(F2) → F1
min(F1, F2), max(F2) ≤ min(F1) → F2
similarly for max
otherwise → null

44 Apply Algorithm Note: Apply works for any binary operation! Why?
Algorithm: Apply(F1, F2, op) → F_r
input: F1, F2, op: ADD nodes and op
output: F_r: ADD result node to return
begin
// Check if result can be immediately computed
if (ComputeResult(F1, F2, op) → F_r is not null) then return F_r;
// Check if result already in apply cache
if (⟨F1, F2, op⟩ → F_r is not in apply cache) then
// Not terminal, so recurse
var := GetEarliestVar(var(F1), var(F2));
// Set up nodes for recursion
if (F1 is non-terminal ∧ var = var(F1)) then F_l^{v1} := F_1,l; F_h^{v1} := F_1,h; else F_{l/h}^{v1} := F_1;
if (F2 is non-terminal ∧ var = var(F2)) then F_l^{v2} := F_2,l; F_h^{v2} := F_2,h; else F_{l/h}^{v2} := F_2;
// Recurse and get cached result
F_l := Apply(F_l^{v1}, F_l^{v2}, op); F_h := Apply(F_h^{v1}, F_h^{v2}, op);
F_r := GetNode(var, F_h, F_l);
// Put result in apply cache and return
insert ⟨F1, F2, op⟩ → F_r into apply cache;
return F_r;
end
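The Apply recursion above can be sketched as a self-contained toy. It assumes an alphabetical variable order and takes `op` as any Python callable; the data layout and names are illustrative, not the tutorial's implementation:

```python
import operator

nodes = {}        # id -> ('leaf', value) or ('node', var, high_id, low_id)
leaf_cache, node_cache, apply_cache = {}, {}, {}

def get_leaf(value):
    if value not in leaf_cache:
        leaf_cache[value] = len(nodes)
        nodes[leaf_cache[value]] = ('leaf', value)
    return leaf_cache[value]

def get_node(var, high, low):
    if high == low:               # redundant test
        return high
    key = (var, high, low)
    if key not in node_cache:
        node_cache[key] = len(nodes)
        nodes[node_cache[key]] = ('node', var, high, low)
    return node_cache[key]

def var_of(f):
    return nodes[f][1] if nodes[f][0] == 'node' else None

def branches(f, var):
    """High/low children of f w.r.t. var (f itself if f doesn't test var)."""
    if nodes[f][0] == 'node' and nodes[f][1] == var:
        return nodes[f][2], nodes[f][3]
    return f, f

def apply_op(f1, f2, op):
    # Base case: two terminals compute directly.
    if nodes[f1][0] == 'leaf' and nodes[f2][0] == 'leaf':
        return get_leaf(op(nodes[f1][1], nodes[f2][1]))
    key = (f1, f2, op)
    if key not in apply_cache:
        # Recurse on the earliest variable (alphabetical order here).
        var = min(v for v in (var_of(f1), var_of(f2)) if v is not None)
        f1h, f1l = branches(f1, var)
        f2h, f2l = branches(f2, var)
        apply_cache[key] = get_node(var,
                                    apply_op(f1h, f2h, op),
                                    apply_op(f1l, f2l, op))
    return apply_cache[key]

def evaluate(f, assignment):
    node = nodes[f]
    while node[0] == 'node':
        f = node[2] if assignment[node[1]] else node[3]
        node = nodes[f]
    return node[1]

# f = 2a and g = b as ADDs; f (+) g = 2a + b without enumerating states.
f = get_node('a', get_leaf(2.0), get_leaf(0.0))
g = get_node('b', get_leaf(1.0), get_leaf(0.0))
h = apply_op(f, g, operator.add)
```

Evaluating `h` at each of the four assignments recovers 2a + b, while the apply cache guarantees each pair of nodes is visited at most once.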

45 Apply Properties Apply uses an apply cache: (F1, F2, op) → F_R. Complexity is quadratic: O(|F1| · |F2|), where |F| is measured in node count. Why? The cache implies we touch every pair of nodes at most once! Canonical? Same inductive argument as Reduce.

46 Reduce-Restrict An important operation: have F(x,y,z), want G(x,y) = F|_{z=0}. Trivial when the restricted var is the root node: Restrict(z=0) just takes one branch. In general, the Restrict F|_{v=value} operation performs a Reduce that returns the branch for v=value whenever v is reached. Need a Restrict-Reduce cache for O(|F|) complexity.

47 Marginalization, etc. Use Apply + Reduce-Restrict: Σ_x F(x, ·) = F|_{x=0} ⊕ F|_{x=1}. Likewise for similar operations. ADD: min_x F(x, ·) = min(F|_{x=0}, F|_{x=1}). BDD: ∃x F(x, ·) = F|_{x=0} ∨ F|_{x=1}. BDD: ∀x F(x, ·) = F|_{x=0} ∧ F|_{x=1}.

48 Apply Tricks I Build F(x1, ..., xn) = Σ_{i=1..n} x_i. Don't build a tree and then call Reduce! Just use indicator DDs and Apply to compute x1 ⊕ x2 ⊕ ... ⊕ xn. In general: build any arithmetic expression bottom-up using Apply! E.g., x1 + (x2 + 4x3) * x4 becomes x1 ⊕ ((x2 ⊕ (4 ⊗ x3)) ⊗ x4).

49 Apply Tricks II Build an ordered DD from an unordered DD: when z is out of order, Apply the z-branches back together so that the result has z in order!

50 Affine ADDs

51 ADD Inefficiency Are ADDs enough, or do we need more compactness? Ex. 1: additive reward/utility functions, R(a,b,c) = R(a) + R(b) + R(c) = 4a + 2b + c. Ex. 2: multiplicative value functions, V(a,b,c) = V(a) V(b) V(c) = γ^(4a + 2b + c).

52 (Sanner, McAllester, IJCAI-05) Affine ADD (AADD) Define a new decision diagram: the Affine ADD. Edges are labeled by an offset (c) and a multiplier (b): ⟨c1, b1⟩ on the high branch to F1, ⟨c2, b2⟩ on the low branch to F2. Semantics: if (a) then (c1 + b1·F1) else (c2 + b2·F2)

53 Affine ADD (AADD) Maximize sharing by normalizing nodes to the range [0, 1]. Example: if (a) then (4) else (2). Unnormalized: edges ⟨4, 0⟩ and ⟨2, 0⟩ into the 0 terminal. Normalized: a top-level affine transform ⟨2, 2⟩ recovers the original range, with edges ⟨1, 0⟩ and ⟨0, 0⟩ into the 0 terminal.
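The normalization on this slide amounts to a small affine computation. A sketch for a node whose children are terminals (the function name and tuple layout are hypothetical):

```python
# An AADD node with terminal children represents if(a) then v_high else v_low.
# Each branch is stored as <offset, multiplier> over a [0,1]-normalized
# subgraph (here the 0 terminal), plus a top-level affine transform <c, b>.
def normalize(v_high, v_low):
    lo, hi = min(v_high, v_low), max(v_high, v_low)
    c, b = lo, hi - lo                       # value = c + b * normalized_sub
    if b == 0.0:
        # Constant function: both edges collapse onto the 0 terminal.
        return (c, b), (0.0, 0.0), (0.0, 0.0)
    edge_high = ((v_high - lo) / b, 0.0)     # <offset, multiplier> into terminal 0
    edge_low = ((v_low - lo) / b, 0.0)
    return (c, b), edge_high, edge_low

# The slide's example: if(a) then 4 else 2.
top, edge_high, edge_low = normalize(4.0, 2.0)
```

This reproduces the slide's labels: top-level ⟨2, 2⟩ with branch edges ⟨1, 0⟩ and ⟨0, 0⟩, so the normalized subfunction spans exactly [0, 1] and can be shared with any other node of the same shape.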

54 AADD Reduce Key point: automatically finds additive structure

55 AADD Examples Automatically Constructed! Back to our previous examples. Ex. 1: additive reward/utility functions, R(a,b) = R(a) + R(b) = 2a + b. Ex. 2: multiplicative value functions, V(a,b) = V(a) V(b) = γ^(2a + b), γ < 1.

56 AADD Apply & Normalized Caching Need to normalize Apply cache keys. E.g., given ⟨3 + 4F1⟩ ⊕ ⟨5 + 6F2⟩, before lookup in the Apply cache normalize to ⟨0 + 1F1⟩ ⊕ ⟨0 + 1.5F2⟩. GetNormCacheKey(⟨c1, b1, F1⟩, ⟨c2, b2, F2⟩, op) → ⟨⟨c1', b1'⟩, ⟨c2', b2'⟩⟩ and ModifyResult(⟨c_r, b_r, F_r⟩) → ⟨c_r', b_r', F_r'⟩. Operation and conditions → normalized cache key and computation → result modification:
⟨c1 + b1 F1⟩ ⊕ ⟨c2 + b2 F2⟩, F1 ≠ 0: ⟨c_r + b_r F_r⟩ = ⟨0 + 1 F1⟩ ⊕ ⟨0 + (b2/b1) F2⟩ → ⟨(c1 + c2 + b1 c_r) + b1 b_r F_r⟩
⟨c1 + b1 F1⟩ ⊖ ⟨c2 + b2 F2⟩, F1 ≠ 0: ⟨c_r + b_r F_r⟩ = ⟨0 + 1 F1⟩ ⊖ ⟨0 + (b2/b1) F2⟩ → ⟨(c1 - c2 + b1 c_r) + b1 b_r F_r⟩
⟨c1 + b1 F1⟩ ⊗ ⟨c2 + b2 F2⟩, F1 ≠ 0: ⟨c_r + b_r F_r⟩ = ⟨(c1/b1) + F1⟩ ⊗ ⟨(c2/b2) + F2⟩ → ⟨b1 b2 c_r + b1 b2 b_r F_r⟩
⟨c1 + b1 F1⟩ ⊘ ⟨c2 + b2 F2⟩, F1 ≠ 0: ⟨c_r + b_r F_r⟩ = ⟨(c1/b1) + F1⟩ ⊘ ⟨(c2/b2) + F2⟩ → ⟨(b1/b2) c_r + (b1/b2) b_r F_r⟩
max(⟨c1 + b1 F1⟩, ⟨c2 + b2 F2⟩), F1 ≠ 0: ⟨c_r + b_r F_r⟩ = max(⟨0 + 1 F1⟩, ⟨(c2 - c1)/b1 + (b2/b1) F2⟩) → ⟨(c1 + b1 c_r) + b1 b_r F_r⟩ (same for min)
any ⟨op⟩ not matching above: ⟨c_r + b_r F_r⟩ = ⟨c1 + b1 F1⟩ ⟨op⟩ ⟨c2 + b2 F2⟩ → ⟨c_r + b_r F_r⟩

57 ADDs vs. AADDs Additive functions: Σ_{i=1..n} x_i. Note: no context-specific independence, but subdiagrams are shared: result size O(n^2).

58 ADDs vs. AADDs Additive functions: Σ_i 2^i x_i. Best-case result for the ADD (exponential) vs. the AADD (linear).

59 ADDs vs. AADDs Additive functions: Σ_{i=0..n-1} F(x_i, x_{(i+1) mod n}) over x1, ..., x4. Pairwise factoring is evident in the AADD structure.

60 Main AADD Theorem AADD can yield exponential time/space improvement over ADD and never performs worse! But Apply operations on AADDs can be exponential Why? Reconvergent diagrams possible in AADDs (edge labels), but not ADDs Sometimes Apply explores all paths if no hits in normalized Apply cache

61 Recap: Symbolic Inference with DDs Probabilistic queries in Bayesian networks: P(E|X) = P(E,X) / P(X), where P(E,X) = Σ_A Σ_B P(A,B,E,X) = Σ_A Σ_B P(A|B,E) P(X|B) P(E) P(B), with each factor a DD. Maximizing expected utility in influence diagrams: a* = argmax_a E[U | A=a, X=x] = argmax_a Σ_X' U(X') P(X' | A=a, X=x), with each factor a DD. DDs can be used in any algorithm: VE, loopy BP, junction tree, etc. DDs automatically exploit structure in factors / messages.

62 Approximate Inference Sometimes no DD is compact, but bounded approximation is

63 Problem: Value ADD Too Large Sum: ( i= i x i ) + x 4 -Noise How to approximate?

64 (St-Aubin, Hoey, Boutilier, NIPS-00) Solution: APRICODD Trick Merge leaves and reduce: Error is bounded!
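The APRICODD leaf-merging trick can be sketched independently of the diagram machinery: cluster leaf values into intervals of width at most 2·epsilon and replace each cluster by its midpoint, so no leaf moves by more than epsilon. A minimal illustrative helper (hypothetical, not the APRICODD code):

```python
def merge_leaves(values, epsilon):
    """Map each leaf value to a merged representative.

    Greedily groups sorted values into clusters of width <= 2*epsilon
    and replaces every member by the cluster midpoint, bounding the
    per-leaf error by epsilon."""
    mapping = {}
    cluster = []
    for v in sorted(set(values)):
        if cluster and v - cluster[0] > 2 * epsilon:
            mid = (cluster[0] + cluster[-1]) / 2.0
            for u in cluster:
                mapping[u] = mid
            cluster = []
        cluster.append(v)
    if cluster:
        mid = (cluster[0] + cluster[-1]) / 2.0
        for u in cluster:
            mapping[u] = mid
    return mapping

leaves = [0.0, 0.1, 0.15, 1.0, 1.05, 2.0]
m = merge_leaves(leaves, epsilon=0.1)
max_error = max(abs(v - m[v]) for v in leaves)
```

Fewer distinct leaves means more node sharing after the subsequent Reduce, which is where the size reduction on this slide comes from.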

65 Can use ADD to Maintain Bounds! Change the leaf to represent a range [L,U]; a normal leaf is like [V,V]. When merging leaves, keep track of the min and max values contributing. For operations on ranges, see interval arithmetic.

66 More Compactness? AADDs? Sum: ( i= i x i ) + x 4 -Noise How to approximate? Error-bounded merge

67 (Sanner, Uther, Delgado, AAMAS-10) Solution: MADCAP Trick Merge nodes from bottom up:

68 Key Approximation Message: automatic, efficient methods for finding logical, CSI, additive, and multiplicative structure in bounded approximations!

69 Example Inference Results using Decision Diagrams Do they really work well?

70 Empirical Comparison: Table/ADD/AADD Running time and space of Table, ADD, and AADD representations on Sum: Σ_{i=1..n} 2^i x_i and Prod: Π_{i=1..n} γ^(2^i x_i).

71 Application: Bayes Net Inference Use variable elimination, replacing CPTs with ADDs or AADDs (could do the same for clique/junction-tree algorithms). Exploits: Context-specific independence, where probability has logical structure: P(a|b,c) = if b ? 1 : (if c ? .7 : .3). Additive CPTs, where probability is a discretized linear function: P(a|b1, ..., bn) = c + Σ_i 2^i b_i. Multiplicative CPTs, e.g., noisy-or (multiplicative AADD): P(e|c1, ..., cn) = 1 - Π_i (1 - p_i). If a factor has one of the above compact forms, the AADD exploits it.
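The noisy-or CPT mentioned above is cheap to compute directly; a sketch with hypothetical per-cause activation probabilities p_i:

```python
# Noisy-or: e fires unless every active cause independently fails,
# so P(e=1 | c_1..c_n) = 1 - prod over active causes of (1 - p_i).
def noisy_or(p, causes):
    """p[i]: probability cause i alone triggers e; causes: 0/1 vector."""
    prob_none = 1.0
    for p_i, c_i in zip(p, causes):
        if c_i:
            prob_none *= (1.0 - p_i)
    return 1.0 - prob_none

p = [0.9, 0.5, 0.3]   # made-up activation probabilities
```

The full CPT has 2^n rows, but this product form is exactly the multiplicative structure a noisy-or AADD represents in linear size.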

72 Bayes Net Results: Various Networks Bayes Net Table ADD AADD # Entries Time # Nodes Time # Nodes Time Alarm 1, s s s Barley 470,294 EML * 139,856 EML * 60, m Carpo s s s Hailfinder 9, s 4, s 2, s Insurance 2, s 1, s s Noisy-Or-15 65, s 125, s 1, s Noisy-Max , s 202, s 40, s * EML: Exceeded Memory Limit (1GB)

73 Application: MDP Solving SPUDD: Factored MDP Solver (Hoey et al., 99). Originally uses ADDs; can use AADDs. Implements factored value iteration, with each factor a DD: V^{n+1}(x1, ..., xi) = R(x1, ..., xi) + γ max_a Σ_{x1', ..., xi'} P(x1' | x1, ..., xi, a) ... P(xi' | x1, ..., xi, a) V^n(x1', ..., xi')

74 Application: SysAdmin SysAdmin MDP (GKP, 2001). Network of computers c1, ..., ck with various network topologies. Every computer is running or crashed. At each time step, the status of c_i is affected by its own previous status and the status of incoming connections in the previous state. Reward: +1 for every computer running (additive).

75 Results I: SysAdmin (10% Approx)

76 Results II: SysAdmin

77 Traffic Domain Binary cell transmission model (CTM). Actions: light changes. Objective: maximize empty cells in the network.

78 Results Traffic

79 Application: POMDPs Provided an AADD implementation for Guy Shani s factored POMDP solver Final value function size results: ADD AADD Network Management Rock Sample

80 Inference with Decision Diagrams vs. Compilations (d-dnnf, etc.) Important Distinctions

81 Definitions / diagrams from A Knowledge Compilation Map, Darwiche and Marquis, JAIR 02. BDDs in NNF: can express a BDD as an NNF formula, and can represent NNF diagrammatically.

82 d-DNNF Definitions / diagrams from A Knowledge Compilation Map, Darwiche and Marquis, JAIR 02. Decomposable NNF: the sets of leaf vars of each conjunction's conjuncts are disjoint. Deterministic NNF: the disjuncts of each disjunction have pairwise disjoint models (their conjunction is unsatisfiable).

83 d-DNNF Definitions / diagrams from A Knowledge Compilation Map, Darwiche and Marquis, JAIR 02. d-DNNF is used to compile a single formula; it does not support efficient binary operations (∧, ∨, ...). d-DNNF shares some polytime operations with OBDD / ADD, notably (weighted) model counting (CT), used in many inference tasks; Size(d-DNNF) ≤ Size(OBDD), so these can be more efficient on d-DNNF. A child language is a subset of its parent: children inherit the polytime operations of parents, and sizes of children ≤ parents. (Ordered BDD: in previous slides I call this simply a BDD.)

84 Compilations vs Decision Diagrams Summary: if you can compile the problem into a single formula, then compilation is likely preferable to DDs, provided you only need the ops that the compilation supports. Not all compilations are efficient for all binary operations, e.g., the ops needed for progression / regression approaches; the fixed ordering of DDs helps support these operations, so compilations are typically not a good idea in sequential probabilistic inference or decision-making. Note: other compilations exist (e.g., arithmetic circuits). Great software:

85 Part I Summary: Symbolic Methods for Discrete Inference Graphical Models and Influence Diagrams Representation: products of discrete factors Inference: operations on discrete factors Order of operations matters Symbolic Inference with Decision Diagrams (DDs) DDs more compact than tables or trees Logical (AND, OR, XOR) Context-specific independence (CSI) & shared substructure Additive and multiplicative structure (Affine ADD) DD operations exploit structure DDs support efficient bounded approximations Gogate and Domingos (UAI-13) have recent approx. prob. inference contributions and a good reference list.


More information

Product rule. Chain rule

Product rule. Chain rule Probability Recap CS 188: Artificial Intelligence ayes Nets: Independence Conditional probability Product rule Chain rule, independent if and only if: and are conditionally independent given if and only

More information

Graphical Models and Kernel Methods

Graphical Models and Kernel Methods Graphical Models and Kernel Methods Jerry Zhu Department of Computer Sciences University of Wisconsin Madison, USA MLSS June 17, 2014 1 / 123 Outline Graphical Models Probabilistic Inference Directed vs.

More information

Machine Learning Lecture 14

Machine Learning Lecture 14 Many slides adapted from B. Schiele, S. Roth, Z. Gharahmani Machine Learning Lecture 14 Undirected Graphical Models & Inference 23.06.2015 Bastian Leibe RWTH Aachen http://www.vision.rwth-aachen.de/ leibe@vision.rwth-aachen.de

More information

4.1 Notation and probability review

4.1 Notation and probability review Directed and undirected graphical models Fall 2015 Lecture 4 October 21st Lecturer: Simon Lacoste-Julien Scribe: Jaime Roquero, JieYing Wu 4.1 Notation and probability review 4.1.1 Notations Let us recall

More information

Bayesian Networks: Construction, Inference, Learning and Causal Interpretation. Volker Tresp Summer 2016

Bayesian Networks: Construction, Inference, Learning and Causal Interpretation. Volker Tresp Summer 2016 Bayesian Networks: Construction, Inference, Learning and Causal Interpretation Volker Tresp Summer 2016 1 Introduction So far we were mostly concerned with supervised learning: we predicted one or several

More information

On the Relationship between Sum-Product Networks and Bayesian Networks

On the Relationship between Sum-Product Networks and Bayesian Networks On the Relationship between Sum-Product Networks and Bayesian Networks by Han Zhao A thesis presented to the University of Waterloo in fulfillment of the thesis requirement for the degree of Master of

More information

Bayesian networks. Chapter Chapter

Bayesian networks. Chapter Chapter Bayesian networks Chapter 14.1 3 Chapter 14.1 3 1 Outline Syntax Semantics Parameterized distributions Chapter 14.1 3 2 Bayesian networks A simple, graphical notation for conditional independence assertions

More information

Sums of Products. Pasi Rastas November 15, 2005

Sums of Products. Pasi Rastas November 15, 2005 Sums of Products Pasi Rastas November 15, 2005 1 Introduction This presentation is mainly based on 1. Bacchus, Dalmao and Pitassi : Algorithms and Complexity results for #SAT and Bayesian inference 2.

More information

PROBABILISTIC REASONING SYSTEMS

PROBABILISTIC REASONING SYSTEMS PROBABILISTIC REASONING SYSTEMS In which we explain how to build reasoning systems that use network models to reason with uncertainty according to the laws of probability theory. Outline Knowledge in uncertain

More information

Markov Networks. l Like Bayes Nets. l Graphical model that describes joint probability distribution using tables (AKA potentials)

Markov Networks. l Like Bayes Nets. l Graphical model that describes joint probability distribution using tables (AKA potentials) Markov Networks l Like Bayes Nets l Graphical model that describes joint probability distribution using tables (AKA potentials) l Nodes are random variables l Labels are outcomes over the variables Markov

More information

Bayes Networks. CS540 Bryan R Gibson University of Wisconsin-Madison. Slides adapted from those used by Prof. Jerry Zhu, CS540-1

Bayes Networks. CS540 Bryan R Gibson University of Wisconsin-Madison. Slides adapted from those used by Prof. Jerry Zhu, CS540-1 Bayes Networks CS540 Bryan R Gibson University of Wisconsin-Madison Slides adapted from those used by Prof. Jerry Zhu, CS540-1 1 / 59 Outline Joint Probability: great for inference, terrible to obtain

More information

Motivation for introducing probabilities

Motivation for introducing probabilities for introducing probabilities Reaching the goals is often not sufficient: it is important that the expected costs do not outweigh the benefit of reaching the goals. 1 Objective: maximize benefits - costs.

More information

CPSC 540: Machine Learning

CPSC 540: Machine Learning CPSC 540: Machine Learning Undirected Graphical Models Mark Schmidt University of British Columbia Winter 2016 Admin Assignment 3: 2 late days to hand it in today, Thursday is final day. Assignment 4:

More information

Outline. CSE 573: Artificial Intelligence Autumn Agent. Partial Observability. Markov Decision Process (MDP) 10/31/2012

Outline. CSE 573: Artificial Intelligence Autumn Agent. Partial Observability. Markov Decision Process (MDP) 10/31/2012 CSE 573: Artificial Intelligence Autumn 2012 Reasoning about Uncertainty & Hidden Markov Models Daniel Weld Many slides adapted from Dan Klein, Stuart Russell, Andrew Moore & Luke Zettlemoyer 1 Outline

More information

Probabilistic Models

Probabilistic Models Bayes Nets 1 Probabilistic Models Models describe how (a portion of) the world works Models are always simplifications May not account for every variable May not account for all interactions between variables

More information

Bayesian networks. Soleymani. CE417: Introduction to Artificial Intelligence Sharif University of Technology Spring 2018

Bayesian networks. Soleymani. CE417: Introduction to Artificial Intelligence Sharif University of Technology Spring 2018 Bayesian networks CE417: Introduction to Artificial Intelligence Sharif University of Technology Spring 2018 Soleymani Slides have been adopted from Klein and Abdeel, CS188, UC Berkeley. Outline Probability

More information

CSE 473: Artificial Intelligence Autumn 2011

CSE 473: Artificial Intelligence Autumn 2011 CSE 473: Artificial Intelligence Autumn 2011 Bayesian Networks Luke Zettlemoyer Many slides over the course adapted from either Dan Klein, Stuart Russell or Andrew Moore 1 Outline Probabilistic models

More information

Markov Networks. l Like Bayes Nets. l Graph model that describes joint probability distribution using tables (AKA potentials)

Markov Networks. l Like Bayes Nets. l Graph model that describes joint probability distribution using tables (AKA potentials) Markov Networks l Like Bayes Nets l Graph model that describes joint probability distribution using tables (AKA potentials) l Nodes are random variables l Labels are outcomes over the variables Markov

More information

Boolean decision diagrams and SAT-based representations

Boolean decision diagrams and SAT-based representations Boolean decision diagrams and SAT-based representations 4th July 200 So far we have seen Kripke Structures 2 Temporal logics (and their semantics over Kripke structures) 3 Model checking of these structures

More information

Intelligent Systems (AI-2)

Intelligent Systems (AI-2) Intelligent Systems (AI-2) Computer Science cpsc422, Lecture 18 Oct, 21, 2015 Slide Sources Raymond J. Mooney University of Texas at Austin D. Koller, Stanford CS - Probabilistic Graphical Models CPSC

More information

Announcements. CS 188: Artificial Intelligence Spring Probability recap. Outline. Bayes Nets: Big Picture. Graphical Model Notation

Announcements. CS 188: Artificial Intelligence Spring Probability recap. Outline. Bayes Nets: Big Picture. Graphical Model Notation CS 188: Artificial Intelligence Spring 2010 Lecture 15: Bayes Nets II Independence 3/9/2010 Pieter Abbeel UC Berkeley Many slides over the course adapted from Dan Klein, Stuart Russell, Andrew Moore Current

More information

Lecture 4 October 18th

Lecture 4 October 18th Directed and undirected graphical models Fall 2017 Lecture 4 October 18th Lecturer: Guillaume Obozinski Scribe: In this lecture, we will assume that all random variables are discrete, to keep notations

More information

Bayesian Networks BY: MOHAMAD ALSABBAGH

Bayesian Networks BY: MOHAMAD ALSABBAGH Bayesian Networks BY: MOHAMAD ALSABBAGH Outlines Introduction Bayes Rule Bayesian Networks (BN) Representation Size of a Bayesian Network Inference via BN BN Learning Dynamic BN Introduction Conditional

More information

CS 2750: Machine Learning. Bayesian Networks. Prof. Adriana Kovashka University of Pittsburgh March 14, 2016

CS 2750: Machine Learning. Bayesian Networks. Prof. Adriana Kovashka University of Pittsburgh March 14, 2016 CS 2750: Machine Learning Bayesian Networks Prof. Adriana Kovashka University of Pittsburgh March 14, 2016 Plan for today and next week Today and next time: Bayesian networks (Bishop Sec. 8.1) Conditional

More information

STA 4273H: Statistical Machine Learning

STA 4273H: Statistical Machine Learning STA 4273H: Statistical Machine Learning Russ Salakhutdinov Department of Statistics! rsalakhu@utstat.toronto.edu! http://www.utstat.utoronto.ca/~rsalakhu/ Sidney Smith Hall, Room 6002 Lecture 3 Linear

More information

Reduced Ordered Binary Decision Diagram with Implied Literals: A New knowledge Compilation Approach

Reduced Ordered Binary Decision Diagram with Implied Literals: A New knowledge Compilation Approach Reduced Ordered Binary Decision Diagram with Implied Literals: A New knowledge Compilation Approach Yong Lai, Dayou Liu, Shengsheng Wang College of Computer Science and Technology Jilin University, Changchun

More information

CS 188: Artificial Intelligence. Bayes Nets

CS 188: Artificial Intelligence. Bayes Nets CS 188: Artificial Intelligence Probabilistic Inference: Enumeration, Variable Elimination, Sampling Pieter Abbeel UC Berkeley Many slides over this course adapted from Dan Klein, Stuart Russell, Andrew

More information

CS 188: Artificial Intelligence Spring Announcements

CS 188: Artificial Intelligence Spring Announcements CS 188: Artificial Intelligence Spring 2011 Lecture 14: Bayes Nets II Independence 3/9/2011 Pieter Abbeel UC Berkeley Many slides over the course adapted from Dan Klein, Stuart Russell, Andrew Moore Announcements

More information

Undirected Graphical Models

Undirected Graphical Models Outline Hong Chang Institute of Computing Technology, Chinese Academy of Sciences Machine Learning Methods (Fall 2012) Outline Outline I 1 Introduction 2 Properties Properties 3 Generative vs. Conditional

More information

Factored Markov Decision Process with Imprecise Transition Probabilities

Factored Markov Decision Process with Imprecise Transition Probabilities Factored Markov Decision Process with Imprecise Transition Probabilities Karina V. Delgado EACH-University of Sao Paulo Sao Paulo - Brazil Leliane N. de Barros IME-University of Sao Paulo Sao Paulo - Brazil

More information

BDD Based Upon Shannon Expansion

BDD Based Upon Shannon Expansion Boolean Function Manipulation OBDD and more BDD Based Upon Shannon Expansion Notations f(x, x 2,, x n ) - n-input function, x i = or f xi=b (x,, x n ) = f(x,,x i-,b,x i+,,x n ), b= or Shannon Expansion

More information

Stat 521A Lecture 18 1

Stat 521A Lecture 18 1 Stat 521A Lecture 18 1 Outline Cts and discrete variables (14.1) Gaussian networks (14.2) Conditional Gaussian networks (14.3) Non-linear Gaussian networks (14.4) Sampling (14.5) 2 Hybrid networks A hybrid

More information

Reduced Ordered Binary Decision Diagrams

Reduced Ordered Binary Decision Diagrams Reduced Ordered Binary Decision Diagrams Lecture #12 of Advanced Model Checking Joost-Pieter Katoen Lehrstuhl 2: Software Modeling & Verification E-mail: katoen@cs.rwth-aachen.de December 13, 2016 c JPK

More information

Based on slides by Richard Zemel

Based on slides by Richard Zemel CSC 412/2506 Winter 2018 Probabilistic Learning and Reasoning Lecture 3: Directed Graphical Models and Latent Variables Based on slides by Richard Zemel Learning outcomes What aspects of a model can we

More information

Machine Learning 4771

Machine Learning 4771 Machine Learning 4771 Instructor: Tony Jebara Topic 16 Undirected Graphs Undirected Separation Inferring Marginals & Conditionals Moralization Junction Trees Triangulation Undirected Graphs Separation

More information

Bayesian Networks: Independencies and Inference

Bayesian Networks: Independencies and Inference Bayesian Networks: Independencies and Inference Scott Davies and Andrew Moore Note to other teachers and users of these slides. Andrew and Scott would be delighted if you found this source material useful

More information

Bayesian Networks Inference with Probabilistic Graphical Models

Bayesian Networks Inference with Probabilistic Graphical Models 4190.408 2016-Spring Bayesian Networks Inference with Probabilistic Graphical Models Byoung-Tak Zhang intelligence Lab Seoul National University 4190.408 Artificial (2016-Spring) 1 Machine Learning? Learning

More information

Bayesian networks. Chapter 14, Sections 1 4

Bayesian networks. Chapter 14, Sections 1 4 Bayesian networks Chapter 14, Sections 1 4 Artificial Intelligence, spring 2013, Peter Ljunglöf; based on AIMA Slides c Stuart Russel and Peter Norvig, 2004 Chapter 14, Sections 1 4 1 Bayesian networks

More information

Lecture 6: Graphical Models

Lecture 6: Graphical Models Lecture 6: Graphical Models Kai-Wei Chang CS @ Uniersity of Virginia kw@kwchang.net Some slides are adapted from Viek Skirmar s course on Structured Prediction 1 So far We discussed sequence labeling tasks:

More information

Chris Bishop s PRML Ch. 8: Graphical Models

Chris Bishop s PRML Ch. 8: Graphical Models Chris Bishop s PRML Ch. 8: Graphical Models January 24, 2008 Introduction Visualize the structure of a probabilistic model Design and motivate new models Insights into the model s properties, in particular

More information

Bayesian Machine Learning

Bayesian Machine Learning Bayesian Machine Learning Andrew Gordon Wilson ORIE 6741 Lecture 4 Occam s Razor, Model Construction, and Directed Graphical Models https://people.orie.cornell.edu/andrew/orie6741 Cornell University September

More information

Binary Decision Diagrams

Binary Decision Diagrams Binary Decision Diagrams Binary Decision Diagrams (BDDs) are a class of graphs that can be used as data structure for compactly representing boolean functions. BDDs were introduced by R. Bryant in 1986.

More information

3 Undirected Graphical Models

3 Undirected Graphical Models Massachusetts Institute of Technology Department of Electrical Engineering and Computer Science 6.438 Algorithms For Inference Fall 2014 3 Undirected Graphical Models In this lecture, we discuss undirected

More information

Computer Science CPSC 322. Lecture 23 Planning Under Uncertainty and Decision Networks

Computer Science CPSC 322. Lecture 23 Planning Under Uncertainty and Decision Networks Computer Science CPSC 322 Lecture 23 Planning Under Uncertainty and Decision Networks 1 Announcements Final exam Mon, Dec. 18, 12noon Same general format as midterm Part short questions, part longer problems

More information

Intelligent Systems:

Intelligent Systems: Intelligent Systems: Undirected Graphical models (Factor Graphs) (2 lectures) Carsten Rother 15/01/2015 Intelligent Systems: Probabilistic Inference in DGM and UGM Roadmap for next two lectures Definition

More information

Probabilistic Reasoning. (Mostly using Bayesian Networks)

Probabilistic Reasoning. (Mostly using Bayesian Networks) Probabilistic Reasoning (Mostly using Bayesian Networks) Introduction: Why probabilistic reasoning? The world is not deterministic. (Usually because information is limited.) Ways of coping with uncertainty

More information

CTL Model Checking. Wishnu Prasetya.

CTL Model Checking. Wishnu Prasetya. CTL Model Checking Wishnu Prasetya wishnu@cs.uu.nl www.cs.uu.nl/docs/vakken/pv Background Example: verification of web applications à e.g. to prove existence of a path from page A to page B. Use of CTL

More information

Soft Computing. Lecture Notes on Machine Learning. Matteo Matteucci.

Soft Computing. Lecture Notes on Machine Learning. Matteo Matteucci. Soft Computing Lecture Notes on Machine Learning Matteo Matteucci matteucci@elet.polimi.it Department of Electronics and Information Politecnico di Milano Matteo Matteucci c Lecture Notes on Machine Learning

More information

Objectives. Probabilistic Reasoning Systems. Outline. Independence. Conditional independence. Conditional independence II.

Objectives. Probabilistic Reasoning Systems. Outline. Independence. Conditional independence. Conditional independence II. Copyright Richard J. Povinelli rev 1.0, 10/1//2001 Page 1 Probabilistic Reasoning Systems Dr. Richard J. Povinelli Objectives You should be able to apply belief networks to model a problem with uncertainty.

More information

CSE 473: Artificial Intelligence Probability Review à Markov Models. Outline

CSE 473: Artificial Intelligence Probability Review à Markov Models. Outline CSE 473: Artificial Intelligence Probability Review à Markov Models Daniel Weld University of Washington [These slides were created by Dan Klein and Pieter Abbeel for CS188 Intro to AI at UC Berkeley.

More information

Bayesian Networks. Motivation

Bayesian Networks. Motivation Bayesian Networks Computer Sciences 760 Spring 2014 http://pages.cs.wisc.edu/~dpage/cs760/ Motivation Assume we have five Boolean variables,,,, The joint probability is,,,, How many state configurations

More information

Inference in Bayesian Networks

Inference in Bayesian Networks Andrea Passerini passerini@disi.unitn.it Machine Learning Inference in graphical models Description Assume we have evidence e on the state of a subset of variables E in the model (i.e. Bayesian Network)

More information

14 : Theory of Variational Inference: Inner and Outer Approximation

14 : Theory of Variational Inference: Inner and Outer Approximation 10-708: Probabilistic Graphical Models 10-708, Spring 2014 14 : Theory of Variational Inference: Inner and Outer Approximation Lecturer: Eric P. Xing Scribes: Yu-Hsin Kuo, Amos Ng 1 Introduction Last lecture

More information

Introduction to Bayes Nets. CS 486/686: Introduction to Artificial Intelligence Fall 2013

Introduction to Bayes Nets. CS 486/686: Introduction to Artificial Intelligence Fall 2013 Introduction to Bayes Nets CS 486/686: Introduction to Artificial Intelligence Fall 2013 1 Introduction Review probabilistic inference, independence and conditional independence Bayesian Networks - - What

More information

Lecture 10: Introduction to reasoning under uncertainty. Uncertainty

Lecture 10: Introduction to reasoning under uncertainty. Uncertainty Lecture 10: Introduction to reasoning under uncertainty Introduction to reasoning under uncertainty Review of probability Axioms and inference Conditional probability Probability distributions COMP-424,

More information

CS 343: Artificial Intelligence

CS 343: Artificial Intelligence CS 343: Artificial Intelligence Bayes Nets: Sampling Prof. Scott Niekum The University of Texas at Austin [These slides based on those of Dan Klein and Pieter Abbeel for CS188 Intro to AI at UC Berkeley.

More information

Probabilistic Reasoning Systems

Probabilistic Reasoning Systems Probabilistic Reasoning Systems Dr. Richard J. Povinelli Copyright Richard J. Povinelli rev 1.0, 10/7/2001 Page 1 Objectives You should be able to apply belief networks to model a problem with uncertainty.

More information

CS357: CTL Model Checking (two lectures worth) David Dill

CS357: CTL Model Checking (two lectures worth) David Dill CS357: CTL Model Checking (two lectures worth) David Dill 1 CTL CTL = Computation Tree Logic It is a propositional temporal logic temporal logic extended to properties of events over time. CTL is a branching

More information

Bayes Nets III: Inference

Bayes Nets III: Inference 1 Hal Daumé III (me@hal3.name) Bayes Nets III: Inference Hal Daumé III Computer Science University of Maryland me@hal3.name CS 421: Introduction to Artificial Intelligence 10 Apr 2012 Many slides courtesy

More information

4 : Exact Inference: Variable Elimination

4 : Exact Inference: Variable Elimination 10-708: Probabilistic Graphical Models 10-708, Spring 2014 4 : Exact Inference: Variable Elimination Lecturer: Eric P. ing Scribes: Soumya Batra, Pradeep Dasigi, Manzil Zaheer 1 Probabilistic Inference

More information

Undirected Graphical Models: Markov Random Fields

Undirected Graphical Models: Markov Random Fields Undirected Graphical Models: Markov Random Fields 40-956 Advanced Topics in AI: Probabilistic Graphical Models Sharif University of Technology Soleymani Spring 2015 Markov Random Field Structure: undirected

More information

Bayesian Networks: Representation, Variable Elimination

Bayesian Networks: Representation, Variable Elimination Bayesian Networks: Representation, Variable Elimination CS 6375: Machine Learning Class Notes Instructor: Vibhav Gogate The University of Texas at Dallas We can view a Bayesian network as a compact representation

More information

Probabilistic Graphical Models (I)

Probabilistic Graphical Models (I) Probabilistic Graphical Models (I) Hongxin Zhang zhx@cad.zju.edu.cn State Key Lab of CAD&CG, ZJU 2015-03-31 Probabilistic Graphical Models Modeling many real-world problems => a large number of random

More information

Alternative Parameterizations of Markov Networks. Sargur Srihari

Alternative Parameterizations of Markov Networks. Sargur Srihari Alternative Parameterizations of Markov Networks Sargur srihari@cedar.buffalo.edu 1 Topics Three types of parameterization 1. Gibbs Parameterization 2. Factor Graphs 3. Log-linear Models with Energy functions

More information

Bayesian Networks Representation

Bayesian Networks Representation Bayesian Networks Representation Machine Learning 10701/15781 Carlos Guestrin Carnegie Mellon University March 19 th, 2007 Handwriting recognition Character recognition, e.g., kernel SVMs a c z rr r r

More information

University of Washington Department of Electrical Engineering EE512 Spring, 2006 Graphical Models

University of Washington Department of Electrical Engineering EE512 Spring, 2006 Graphical Models University of Washington Department of Electrical Engineering EE512 Spring, 2006 Graphical Models Jeff A. Bilmes Lecture 1 Slides March 28 th, 2006 Lec 1: March 28th, 2006 EE512

More information