PSDDs for Tractable Learning in Structured and Unstructured Spaces


1 PSDDs for Tractable Learning in Structured and Unstructured Spaces Guy Van den Broeck DeLBP Aug 18, 2017

2 References
Probabilistic Sentential Decision Diagrams. Doga Kisa, Guy Van den Broeck, Arthur Choi and Adnan Darwiche. KR, 2014.
Learning with Massive Logical Constraints. Doga Kisa, Guy Van den Broeck, Arthur Choi and Adnan Darwiche. ICML LTPM workshop, 2014.
Tractable Learning for Structured Probability Spaces. Arthur Choi, Guy Van den Broeck and Adnan Darwiche. IJCAI, 2015.
Tractable Learning for Complex Probability Queries. Jessa Bekker, Jesse Davis, Arthur Choi, Adnan Darwiche and Guy Van den Broeck. NIPS, 2015.
Learning the Structure of PSDDs. Yitao Liang, Jessa Bekker and Guy Van den Broeck. UAI, 2017.
Towards Compact Interpretable Models: Learning and Shrinking PSDDs. Yitao Liang and Guy Van den Broeck. IJCAI XAI workshop, 2017.

3 (P)SDDs in Melbourne Sunday: Logical Foundations for Uncertainty and Machine Learning Workshop Adnan Darwiche: On the Role of Logic in Probabilistic Inference and Machine Learning YooJung Choi: Optimal Feature Selection for Decision Robustness in Bayesian Networks Sunday: Explainable AI Workshop Yitao Liang: Towards Compact Interpretable Models: Learning and Shrinking PSDDs Tuesday: IJCAI YooJung Choi (again)

4 Structured vs. unstructured probability spaces?

5 Running Example Courses: Logic (L) Knowledge Representation (K) Probability (P) Artificial Intelligence (A) Data

6 Running Example Courses: Logic (L) Knowledge Representation (K) Probability (P) Artificial Intelligence (A) Data Constraints Must take at least one of Probability or Logic. Probability is a prerequisite for AI. The prerequisite for KR is either AI or Logic.

7 Probability Space unstructured L K P A

8 Structured Probability Space unstructured L K P A Must take at least one of Probability or Logic. Probability is a prerequisite for AI. The prerequisite for KR is either AI or Logic. 7 out of 16 instantiations are impossible structured L K P A
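As a sanity check (a sketch added here, not code from the talk), the three constraints can be written as Boolean conditions and all 16 instantiations enumerated:

```python
from itertools import product

def valid(L, K, P, A):
    return ((P or L)                # must take at least one of Probability or Logic
            and (not A or P)        # Probability is a prerequisite for AI
            and (not K or A or L))  # the prerequisite for KR is either AI or Logic

worlds = list(product((False, True), repeat=4))
possible = [w for w in worlds if valid(*w)]
print(len(worlds) - len(possible), "impossible,", len(possible), "possible")  # 7 impossible, 9 possible
```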

10 Learning with Constraints Data Constraints (Background Knowledge) (Physics) Learn Statistical Model (Distribution) Learn a statistical model that assigns zero probability to instantiations that violate the constraints.

11 Example: Video [Lu, W. L., Ting, J. A., Little, J. J., & Murphy, K. P. (2013). Learning to track and identify players from broadcast sports videos.]

13 Example: Robotics [Wong, L. L., Kaelbling, L. P., & Lozano-Perez, T., Collision-free state estimation. ICRA 2012]

18 Example: Language Non-local dependencies: At least one verb in each sentence Sentence compression If a modifier is kept, its subject is also kept Information extraction Semantic role labeling and many more! [Chang, M., Ratinov, L., & Roth, D. (2008). Constraints as prior knowledge], [Chang, M. W., Ratinov, L., & Roth, D. (2012). Structured learning with constrained conditional models.]

19 Example: Deep Learning [Graves, A., Wayne, G., Reynolds, M., Harley, T., Danihelka, I., Grabska-Barwińska, A., et al. (2016). Hybrid computing using a neural network with dynamic external memory. Nature, 538(7626).]

24 What are people doing now? Ignore constraints Handcraft into models Use specialized distributions Find non-structured encoding Try to learn constraints Hack your way around + Accuracy? Specialized skill? Intractable inference? Intractable learning? Waste parameters? Risk predicting out of space? you are on your own

28 Structured Probability Spaces Everywhere in ML! Configuration problems, inventory, video, text, deep learning Planning and diagnosis (physics) Causal models: cooking scenarios (interpreting videos) Combinatorial objects: parse trees, rankings, directed acyclic graphs, trees, simple paths, game traces, etc. Some representations: constrained conditional models, mixed networks, probabilistic logics. No statistical ML boxes out there that take constraints as input! Goal: Constraints as important as data! General purpose!

29 Specification Language: Logic

30 Structured Probability Space unstructured L K P A Must take at least one of Probability or Logic. Probability is a prerequisite for AI. The prerequisite for KR is either AI or Logic. 7 out of 16 instantiations are impossible structured L K P A

31 Boolean Constraints unstructured L K P A P ∨ L A ⇒ P K ⇒ (A ∨ L) 7 out of 16 instantiations are impossible structured L K P A

32 Combinatorial Objects: Rankings rank sushi rank sushi 1 fatty tuna 2 sea urchin 3 salmon roe 4 shrimp 5 tuna 6 squid 7 tuna roll 8 sea eel 9 egg 10 cucumber roll 1 shrimp 2 sea urchin 3 salmon roe 4 fatty tuna 5 tuna 6 squid 7 tuna roll 8 sea eel 9 egg 10 cucumber roll 10 items: 3,628,800 rankings 20 items: 2,432,902,008,176,640,000 rankings

34 Combinatorial Objects: Rankings rank sushi 1 fatty tuna 2 sea urchin 3 salmon roe 4 shrimp 5 tuna 6 squid 7 tuna roll 8 sea eel 9 egg 10 cucumber roll rank sushi 1 shrimp 2 sea urchin 3 salmon roe 4 fatty tuna 5 tuna 6 squid 7 tuna roll 8 sea eel 9 egg 10 cucumber roll A ij item i at position j (n items require n 2 Boolean variables) An item may be assigned to more than one position A position may contain more than one item

38 Encoding Rankings in Logic A ij : item i at position j pos 1 pos 2 pos 3 pos 4 constraint: each item i assigned to a unique position (n constraints) item 1 A 11 A 12 A 13 A 14 item 2 A 21 A 22 A 23 A 24 item 3 A 31 A 32 A 33 A 34 item 4 A 41 A 42 A 43 A 44 constraint: each position j assigned a unique item (n constraints)
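A brute-force check of this encoding (an illustrative sketch, with n = 3 to keep 2^(n²) small): only the permutation matrices survive the row and column constraints.

```python
from itertools import product

def is_ranking(bits, n):
    A = [bits[i * n:(i + 1) * n] for i in range(n)]  # A[i][j]: item i at position j
    items_ok = all(sum(row) == 1 for row in A)       # each item assigned a unique position
    positions_ok = all(sum(A[i][j] for i in range(n)) == 1
                       for j in range(n))            # each position assigned a unique item
    return items_ok and positions_ok

n = 3
count = sum(is_ranking(bits, n) for bits in product((0, 1), repeat=n * n))
print(count)  # 6 rankings (= 3!) out of 2^9 = 512 assignments
```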

43 Structured Space for Paths cf. Nature paper Good variable assignment (represents route) 184 Bad variable assignment (does not represent route) 16,777,032 Space easily encoded in logical constraints See [Choi, Tavabi, Darwiche, AAAI 2016] Unstructured probability space: 184 + 16,777,032 = 2^24

44 Deep Architecture Logic + Probability

45 Logical Circuits [circuit diagram over variables L, K, P, A]

46 Property: Decomposability [in the circuit, the children of each AND gate mention disjoint sets of variables]

48 Property: Determinism Input: L, K, P, A [on any input, at most one child of each OR gate evaluates to true]

49 Sentential Decision Diagram (SDD) Input: L, K, P, A [the same circuit, evaluated bottom-up on the input]

53 Tractable for Logical Inference Is structured space empty? (SAT) Count size of structured space (#SAT) Check equivalence of spaces Algorithms linear in circuit size (pass up, pass down, similar to backprop)
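A toy illustration of the #SAT pass (a sketch added here, not the talk's linear-time algorithms): on a smooth, deterministic, decomposable circuit, model counting is one bottom-up pass where AND multiplies and OR sums.

```python
# Nodes: ('lit', var, sign), ('and', children), ('or', children).
def models(node):
    kind = node[0]
    if kind == 'lit':
        return 1                       # one assignment of that variable
    if kind == 'and':                  # decomposability: children over disjoint variables
        result = 1
        for child in node[1]:
            result *= models(child)
        return result
    return sum(models(child) for child in node[1])  # determinism: children mutually exclusive

L, nL = ('lit', 'L', True), ('lit', 'L', False)
P, nP = ('lit', 'P', True), ('lit', 'P', False)
# A smooth, deterministic, decomposable circuit for "P or L":
circuit = ('or', [('and', [nL, P]), ('and', [L, P]), ('and', [L, nP])])
print(models(circuit))  # 3 of the 4 assignments over {L, P} satisfy P or L
```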

55 PSDD: Probabilistic SDD [SDD circuit over L, K, P, A with a normalized parameter on each OR-gate input] Input: L, K, P, A Pr(L,K,P,A) = 0.3 x 1.0 x 0.8 x 0.4 x 0.25 = 0.024
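The product-of-parameters evaluation can be sketched in a few lines (a hypothetical two-variable PSDD over L and P, not the slide's circuit): decision nodes are parameterized mixtures over (prime, sub) pairs, and evaluation is one bottom-up pass.

```python
def evaluate(node, world):
    kind = node[0]
    if kind == 'true':
        return 1.0
    if kind == 'lit':                                  # literal: agrees with the world or not
        _, var, sign = node
        return 1.0 if world[var] == sign else 0.0
    _, branches = node                                 # ('dec', [(theta, prime, sub), ...])
    return sum(theta * evaluate(prime, world) * evaluate(sub, world)
               for theta, prime, sub in branches)

# Tiny PSDD respecting the constraint "P or L": branch on L, distribution on P.
sub_when_L = ('dec', [(0.7, ('lit', 'P', True), ('true',)),
                      (0.3, ('lit', 'P', False), ('true',))])
psdd = ('dec', [(0.6, ('lit', 'L', True), sub_when_L),
                (0.4, ('lit', 'L', False), ('lit', 'P', True))])  # if not L, P must hold

probs = {(l, p): evaluate(psdd, {'L': l, 'P': p})
         for l in (True, False) for p in (True, False)}
# probs sums to 1 and assigns probability 0 to the impossible world (not L, not P)
```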

59 PSDD nodes induce a normalized distribution! [each circuit node defines a normalized distribution over the variables below it] Can read probabilistic independences off the circuit structure

63 Tractable for Probabilistic Inference MAP inference: Find most-likely assignment (otherwise NP-complete) Computing conditional probabilities Pr(x | y) (otherwise PP-complete) Sample from Pr(x | y) Algorithms linear in circuit size (pass up, pass down, similar to backprop)

64 Learning PSDDs Logic + Probability + ML

65 Parameters are Interpretable [PSDD with annotated parameters: probability that a student takes course L; that a student takes course P; probability of P given L] Explainable AI DARPA Program

69 Learning Algorithms Parameter learning: Closed-form max likelihood from complete data One pass over data to estimate Pr(x | y) Not a lot to say: very easy! Circuit learning (naïve): Compile constraints to SDD circuit Use SAT solver technology Circuit does not depend on data
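To illustrate why parameter learning is closed-form (a sketch with made-up data, not the paper's code): every PSDD parameter is a conditional probability, so one counting pass over complete data yields the maximum-likelihood estimates.

```python
from collections import Counter

# complete (L, P) samples, illustrative only
data = [(True, True), (True, False), (False, True), (True, True), (False, True)]
counts = Counter(data)
n = len(data)

n_L = sum(c for (l, _), c in counts.items() if l)
theta_L = n_L / n                              # estimate of Pr(L)
theta_P_given_L = counts[(True, True)] / n_L   # estimate of Pr(P | L)
print(theta_L, theta_P_given_L)
```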

73 Learning Preference Distributions Special-purpose distribution: Mixture-of-Mallows # of components from 1 to 20 EM with 10 random seeds implementation of Lu & Boutilier PSDD This is the naive approach, circuit does not depend on data!

74 Learn Circuit from Data Even in unstructured spaces

76 Tractable Learning Bayesian networks Markov networks Do not support linear-time exact inference

77 Tractable Learning Historically: polytrees, Chow-Liu trees, etc. More recently: SPNs and Cutset Networks, both of which are Arithmetic Circuits (ACs) [Darwiche, JACM 2003]

78 PSDDs are Arithmetic Circuits [side-by-side diagram: a PSDD decision node with parameters θ1 … θn over element pairs (p1, s1) … (pn, sn), and the equivalent arithmetic circuit that sums the products θi · pi · si]
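In symbols (notation assumed from the PSDD literature, not shown on the slide): a decision node n with elements (p_i, s_i) and parameters θ_i computes

```latex
\Pr\nolimits_{n}(\mathbf{x}, \mathbf{y}) \;=\; \sum_{i=1}^{k} \theta_i \, \Pr\nolimits_{p_i}(\mathbf{x}) \, \Pr\nolimits_{s_i}(\mathbf{y}),
\qquad \sum_{i=1}^{k} \theta_i = 1,
```

where x are the variables on the prime (left vtree) side and y those on the sub (right) side; replacing each θ-weighted sum and each product by arithmetic-circuit nodes yields the AC on the right of the diagram.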

79 Tractable Learning [chart placing models along two axes, strong properties vs. representational freedom: DNNs, Cutset Networks, SPNs, and PSDDs] Perhaps the most powerful circuit proposed to date

84 PSDDs for the Logic-Phobic

85 PSDDs for the Logic-Phobic Bottom-up each node is a distribution

88 PSDDs for the Logic-Phobic Multiply independent distributions

90 PSDDs for the Logic-Phobic Weighted mixture of lower level distributions

93 Variable Trees (vtrees) PSDD Vtree Correspondence

94 Learning Variable Trees How much do vars depend on each other? Learn vtree by hierarchical clustering
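A greedy bottom-up sketch of the idea (an illustration added here; the paper's method may differ in its dependence score and clustering details): score variable groups by pairwise mutual information and repeatedly merge the most dependent pair into a binary vtree.

```python
import math
from itertools import combinations

def mutual_info(data, i, j):
    # pairwise MI between binary variables i and j, estimated from data
    n = len(data)
    mi = 0.0
    for a in (0, 1):
        for b in (0, 1):
            p_ab = sum(1 for row in data if row[i] == a and row[j] == b) / n
            p_a = sum(1 for row in data if row[i] == a) / n
            p_b = sum(1 for row in data if row[j] == b) / n
            if p_ab > 0:
                mi += p_ab * math.log(p_ab / (p_a * p_b))
    return mi

def learn_vtree(data, num_vars):
    groups = [(v,) for v in range(num_vars)]   # leaves are single variables
    trees = {g: g[0] for g in groups}
    while len(groups) > 1:
        # merge the two groups whose variables are most dependent
        g1, g2 = max(combinations(groups, 2),
                     key=lambda gs: max(mutual_info(data, i, j)
                                        for i in gs[0] for j in gs[1]))
        merged = g1 + g2
        groups = [g for g in groups if g not in (g1, g2)] + [merged]
        trees[merged] = (trees[g1], trees[g2])
    return trees[groups[0]]

# toy data over 4 variables: 0 and 1 always agree, 2 and 3 always agree
data = [(0,0,1,1), (1,1,0,0), (0,0,0,0), (1,1,1,1), (0,0,1,1), (1,1,0,0)]
vtree = learn_vtree(data, 4)
# the correlated pairs end up together: ((0, 1), (2, 3))
```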

96 Learning Primitives

98 Learning Primitives Primitives maintain PSDD properties and structured space!

100 LearnPSDD 1 Vtree learning 2 Construct the most naïve PSDD 3 LearnPSDD (search for better structure): generate candidate operations, simulate operations, execute the best

103 Experiments on 20 datasets Compare with O-SPN: smaller size in 14, better LL in 11, win on both in 6 Compare with L-SPN: smaller size in 14, better LL in 6, win on both in 2 Comparable in performance & Smaller in size

105 Ensembles of PSDDs EM/Bagging

108 State-of-the-Art Performance State of the art in 6 datasets

109 What happens if you ignore constraints?

111 What happens if you ignore constraints? Roadmap Compile logic into an SDD Convert to a PSDD: parameter estimation LearnPSDD: generate candidate operations, simulate operations, execute the best

115 What happens if you ignore constraints? Discrete multi-valued data A: a1, a2, a3 [encodings of A into Boolean indicator variables a1, a2, a3] Never omit domain constraints

116 Complex queries and Learning from constraints

117 Incomplete Data

a classical complete dataset:
id X Y Z
1 x1 y2 z1
2 x2 y1 z2
3 x2 y1 z2
4 x1 y1 z1
5 x1 y2 z2
closed-form (maximum-likelihood estimates are unique)

a classical incomplete dataset:
id X Y Z
1 x1 y2 ?
2 x2 y1 ?
3 ? ? z2
4 ? y1 z1
5 x1 y2 z2
EM algorithm (on PSDDs)

a new type of incomplete dataset (each example is a logical constraint):
1 X Z
2 x2 and (y2 or z2)
3 x2 y1
4 X Y Z 1
5 x1 and y2 and z2
Missed in the ML literature
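A brief gloss on why such datasets are learnable (added notation, not from the slide): an example that is a logical constraint e contributes the probability the model assigns to e, which a PSDD evaluates in a single pass, so the log-likelihood of the dataset is

```latex
\log L(\theta \mid \mathcal{D}) \;=\; \sum_{e \in \mathcal{D}} \log \Pr\nolimits_{\theta}(e)
```

Classical complete and incomplete examples are special cases where e is a full or partial instantiation.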

120 Structured Datasets a classical complete dataset (e.g., total rankings) vs. a classical incomplete dataset (e.g., top-k rankings) [example tables of sushi rankings; in the top-k dataset, entries below rank k are '?']

121 Structured Datasets a classical complete dataset (e.g., total rankings) [example tables of sushi rankings] vs. a new type of incomplete dataset (e.g., partial rankings): 1 (fatty tuna > sea urchin) and (tuna > sea eel) 2 (fatty tuna is 1st) and (salmon roe > egg) 3 tuna > squid 4 egg is last 5 egg > squid > shrimp (represents constraints on possible total rankings)

122 Learning from Incomplete Data Movielens Dataset: 3,900 movies, 6,040 users, 1M ratings take ratings from 64 most rated movies ratings 1-5 converted to pairwise prefs. PSDD for partial rankings 4 tiers 18,711 parameters movies by expected tier rank movie 1 The Godfather 2 The Usual Suspects 3 Casablanca 4 The Shawshank Redemption 5 Schindler's List 6 One Flew Over the Cuckoo's Nest 7 The Godfather: Part II 8 Monty Python and the Holy Grail 9 Raiders of the Lost Ark 10 Star Wars IV: A New Hope

123 PSDD Sizes

127 Structured Queries no other Star Wars movie in top-5 at least one comedy in top-5 rank movie 1 Star Wars V: The Empire Strikes Back 2 Star Wars IV: A New Hope 3 The Godfather 4 The Shawshank Redemption 5 The Usual Suspects rank movie 1 Star Wars V: The Empire Strikes Back 2 American Beauty 3 The Godfather 4 The Usual Suspects 5 The Shawshank Redemption diversified recommendations via logical constraints

128 Conclusions Structured spaces are everywhere PSDDs build on logical circuits 1. Tractability 2. Semantics 3. Natural encoding of structured spaces Learning is effective 1. From constraints encoding structured space State of the art learning preference distributions 2. From standard unstructured datasets using search State of the art on standard tractable learning datasets Novel settings for inference and learning Structured spaces / learning from constraints / complex queries

131 Conclusions [diagram: PSDD at the intersection of Statistical ML (Probability), Symbolic AI (Logic), and Connectionism (Deep)]

132 Questions? PSDD with 15,000 nodes LearnPSDD code: Other PSDD code: SDD code:


Uncertainty and Bayesian Networks Uncertainty and Bayesian Networks Tutorial 3 Tutorial 3 1 Outline Uncertainty Probability Syntax and Semantics for Uncertainty Inference Independence and Bayes Rule Syntax and Semantics for Bayesian Networks

More information

Probabilistic Logic CNF for Reasoning

Probabilistic Logic CNF for Reasoning Probabilistic Logic CNF for Reasoning Song- Chun Zhu, Sinisa Todorovic, and Ales Leonardis At CVPR, Providence, Rhode Island June 16, 2012 Goal input tracking parsing reasoning activity recognition in

More information

PROBABILISTIC REASONING SYSTEMS

PROBABILISTIC REASONING SYSTEMS PROBABILISTIC REASONING SYSTEMS In which we explain how to build reasoning systems that use network models to reason with uncertainty according to the laws of probability theory. Outline Knowledge in uncertain

More information

Bayesian Networks: Construction, Inference, Learning and Causal Interpretation. Volker Tresp Summer 2016

Bayesian Networks: Construction, Inference, Learning and Causal Interpretation. Volker Tresp Summer 2016 Bayesian Networks: Construction, Inference, Learning and Causal Interpretation Volker Tresp Summer 2016 1 Introduction So far we were mostly concerned with supervised learning: we predicted one or several

More information

Machine Learning Summer School

Machine Learning Summer School Machine Learning Summer School Lecture 3: Learning parameters and structure Zoubin Ghahramani zoubin@eng.cam.ac.uk http://learning.eng.cam.ac.uk/zoubin/ Department of Engineering University of Cambridge,

More information

Approximate Inference by Compilation to Arithmetic Circuits

Approximate Inference by Compilation to Arithmetic Circuits Approximate Inference by Compilation to Arithmetic Circuits Daniel Lowd Department of Computer and Information Science University of Oregon Eugene, OR 97403-1202 lowd@cs.uoregon.edu Pedro Domingos Department

More information

Learning Bayesian Network Parameters under Equivalence Constraints

Learning Bayesian Network Parameters under Equivalence Constraints Learning Bayesian Network Parameters under Equivalence Constraints Tiansheng Yao 1,, Arthur Choi, Adnan Darwiche Computer Science Department University of California, Los Angeles Los Angeles, CA 90095

More information

KU Leuven Department of Computer Science

KU Leuven Department of Computer Science Implementation and Performance of Probabilistic Inference Pipelines: Data and Results Dimitar Shterionov Gerda Janssens Report CW 679, March 2015 KU Leuven Department of Computer Science Celestijnenlaan

More information

CS 570: Machine Learning Seminar. Fall 2016

CS 570: Machine Learning Seminar. Fall 2016 CS 570: Machine Learning Seminar Fall 2016 Class Information Class web page: http://web.cecs.pdx.edu/~mm/mlseminar2016-2017/fall2016/ Class mailing list: cs570@cs.pdx.edu My office hours: T,Th, 2-3pm or

More information

Mixed Membership Matrix Factorization

Mixed Membership Matrix Factorization Mixed Membership Matrix Factorization Lester Mackey 1 David Weiss 2 Michael I. Jordan 1 1 University of California, Berkeley 2 University of Pennsylvania International Conference on Machine Learning, 2010

More information

Automatic Differentiation Equipped Variable Elimination for Sensitivity Analysis on Probabilistic Inference Queries

Automatic Differentiation Equipped Variable Elimination for Sensitivity Analysis on Probabilistic Inference Queries Automatic Differentiation Equipped Variable Elimination for Sensitivity Analysis on Probabilistic Inference Queries Anonymous Author(s) Affiliation Address email Abstract 1 2 3 4 5 6 7 8 9 10 11 12 Probabilistic

More information

Sampling Algorithms for Probabilistic Graphical models

Sampling Algorithms for Probabilistic Graphical models Sampling Algorithms for Probabilistic Graphical models Vibhav Gogate University of Washington References: Chapter 12 of Probabilistic Graphical models: Principles and Techniques by Daphne Koller and Nir

More information

Learning to Learn and Collaborative Filtering

Learning to Learn and Collaborative Filtering Appearing in NIPS 2005 workshop Inductive Transfer: Canada, December, 2005. 10 Years Later, Whistler, Learning to Learn and Collaborative Filtering Kai Yu, Volker Tresp Siemens AG, 81739 Munich, Germany

More information

Computer Science and Logic A Match Made in Heaven

Computer Science and Logic A Match Made in Heaven A Match Made in Heaven Luca Aceto Reykjavik University Reykjavik, 3 April 2009 Thanks to Moshe Vardi from whom I have drawn inspiration (read stolen ideas ) for this presentation. Why This Talk Today?

More information

CS 188: Artificial Intelligence. Bayes Nets

CS 188: Artificial Intelligence. Bayes Nets CS 188: Artificial Intelligence Probabilistic Inference: Enumeration, Variable Elimination, Sampling Pieter Abbeel UC Berkeley Many slides over this course adapted from Dan Klein, Stuart Russell, Andrew

More information

STA 4273H: Statistical Machine Learning

STA 4273H: Statistical Machine Learning STA 4273H: Statistical Machine Learning Russ Salakhutdinov Department of Computer Science! Department of Statistical Sciences! rsalakhu@cs.toronto.edu! h0p://www.cs.utoronto.ca/~rsalakhu/ Lecture 7 Approximate

More information

arxiv: v1 [stat.ml] 15 Nov 2016

arxiv: v1 [stat.ml] 15 Nov 2016 Recoverability of Joint Distribution from Missing Data arxiv:1611.04709v1 [stat.ml] 15 Nov 2016 Jin Tian Department of Computer Science Iowa State University Ames, IA 50014 jtian@iastate.edu Abstract A

More information

Bayesian Networks BY: MOHAMAD ALSABBAGH

Bayesian Networks BY: MOHAMAD ALSABBAGH Bayesian Networks BY: MOHAMAD ALSABBAGH Outlines Introduction Bayes Rule Bayesian Networks (BN) Representation Size of a Bayesian Network Inference via BN BN Learning Dynamic BN Introduction Conditional

More information

Preliminaries. Data Mining. The art of extracting knowledge from large bodies of structured data. Let s put it to use!

Preliminaries. Data Mining. The art of extracting knowledge from large bodies of structured data. Let s put it to use! Data Mining The art of extracting knowledge from large bodies of structured data. Let s put it to use! 1 Recommendations 2 Basic Recommendations with Collaborative Filtering Making Recommendations 4 The

More information

Final. Introduction to Artificial Intelligence. CS 188 Spring You have approximately 2 hours and 50 minutes.

Final. Introduction to Artificial Intelligence. CS 188 Spring You have approximately 2 hours and 50 minutes. CS 188 Spring 2014 Introduction to Artificial Intelligence Final You have approximately 2 hours and 50 minutes. The exam is closed book, closed notes except your two-page crib sheet. Mark your answers

More information

Examination Artificial Intelligence Module Intelligent Interaction Design December 2014

Examination Artificial Intelligence Module Intelligent Interaction Design December 2014 Examination Artificial Intelligence Module Intelligent Interaction Design December 2014 Introduction This exam is closed book, you may only use a simple calculator (addition, substraction, multiplication

More information

Model Counting for Probabilistic Reasoning

Model Counting for Probabilistic Reasoning Model Counting for Probabilistic Reasoning Beyond NP Workshop Stefano Ermon CS Department, Stanford Combinatorial Search and Optimization Progress in combinatorial search since the 1990s (SAT, SMT, MIP,

More information

Models, Data, Learning Problems

Models, Data, Learning Problems Universität Potsdam Institut für Informatik Lehrstuhl Maschinelles Lernen Models, Data, Learning Problems Tobias Scheffer Overview Types of learning problems: Supervised Learning (Classification, Regression,

More information

Generative Models for Sentences

Generative Models for Sentences Generative Models for Sentences Amjad Almahairi PhD student August 16 th 2014 Outline 1. Motivation Language modelling Full Sentence Embeddings 2. Approach Bayesian Networks Variational Autoencoders (VAE)

More information

On the Relationship between Sum-Product Networks and Bayesian Networks

On the Relationship between Sum-Product Networks and Bayesian Networks On the Relationship between Sum-Product Networks and Bayesian Networks International Conference on Machine Learning, 2015 Han Zhao Mazen Melibari Pascal Poupart University of Waterloo, Waterloo, ON, Canada

More information

The exam is closed book, closed calculator, and closed notes except your one-page crib sheet.

The exam is closed book, closed calculator, and closed notes except your one-page crib sheet. CS 188 Fall 2015 Introduction to Artificial Intelligence Final You have approximately 2 hours and 50 minutes. The exam is closed book, closed calculator, and closed notes except your one-page crib sheet.

More information

Announcements. CS 188: Artificial Intelligence Fall Causality? Example: Traffic. Topology Limits Distributions. Example: Reverse Traffic

Announcements. CS 188: Artificial Intelligence Fall Causality? Example: Traffic. Topology Limits Distributions. Example: Reverse Traffic CS 188: Artificial Intelligence Fall 2008 Lecture 16: Bayes Nets III 10/23/2008 Announcements Midterms graded, up on glookup, back Tuesday W4 also graded, back in sections / box Past homeworks in return

More information

Computational Genomics. Systems biology. Putting it together: Data integration using graphical models

Computational Genomics. Systems biology. Putting it together: Data integration using graphical models 02-710 Computational Genomics Systems biology Putting it together: Data integration using graphical models High throughput data So far in this class we discussed several different types of high throughput

More information

Dynamic models 1 Kalman filters, linearization,

Dynamic models 1 Kalman filters, linearization, Koller & Friedman: Chapter 16 Jordan: Chapters 13, 15 Uri Lerner s Thesis: Chapters 3,9 Dynamic models 1 Kalman filters, linearization, Switching KFs, Assumed density filters Probabilistic Graphical Models

More information

Collaborative topic models: motivations cont

Collaborative topic models: motivations cont Collaborative topic models: motivations cont Two topics: machine learning social network analysis Two people: " boy Two articles: article A! girl article B Preferences: The boy likes A and B --- no problem.

More information

STA 4273H: Statistical Machine Learning

STA 4273H: Statistical Machine Learning STA 4273H: Statistical Machine Learning Russ Salakhutdinov Department of Statistics! rsalakhu@utstat.toronto.edu! http://www.utstat.utoronto.ca/~rsalakhu/ Sidney Smith Hall, Room 6002 Lecture 3 Linear

More information

Symbolic Variable Elimination in Discrete and Continuous Graphical Models. Scott Sanner Ehsan Abbasnejad

Symbolic Variable Elimination in Discrete and Continuous Graphical Models. Scott Sanner Ehsan Abbasnejad Symbolic Variable Elimination in Discrete and Continuous Graphical Models Scott Sanner Ehsan Abbasnejad Inference for Dynamic Tracking No one previously did this inference exactly in closed-form! Exact

More information

Modeling User Rating Profiles For Collaborative Filtering

Modeling User Rating Profiles For Collaborative Filtering Modeling User Rating Profiles For Collaborative Filtering Benjamin Marlin Department of Computer Science University of Toronto Toronto, ON, M5S 3H5, CANADA marlin@cs.toronto.edu Abstract In this paper

More information

Introduction to Artificial Intelligence (AI)

Introduction to Artificial Intelligence (AI) Introduction to Artificial Intelligence (AI) Computer Science cpsc502, Lecture 9 Oct, 11, 2011 Slide credit Approx. Inference : S. Thrun, P, Norvig, D. Klein CPSC 502, Lecture 9 Slide 1 Today Oct 11 Bayesian

More information

Bayesian Networks: Construction, Inference, Learning and Causal Interpretation. Volker Tresp Summer 2014

Bayesian Networks: Construction, Inference, Learning and Causal Interpretation. Volker Tresp Summer 2014 Bayesian Networks: Construction, Inference, Learning and Causal Interpretation Volker Tresp Summer 2014 1 Introduction So far we were mostly concerned with supervised learning: we predicted one or several

More information

Mixed Membership Matrix Factorization

Mixed Membership Matrix Factorization Mixed Membership Matrix Factorization Lester Mackey University of California, Berkeley Collaborators: David Weiss, University of Pennsylvania Michael I. Jordan, University of California, Berkeley 2011

More information

Recovering Probability Distributions from Missing Data

Recovering Probability Distributions from Missing Data Proceedings of Machine Learning Research 77:574 589, 2017 ACML 2017 Recovering Probability Distributions from Missing Data Jin Tian Iowa State University jtian@iastate.edu Editors: Yung-Kyun Noh and Min-Ling

More information

Introduction. next up previous Next: Problem Description Up: Bayesian Networks Previous: Bayesian Networks

Introduction. next up previous Next: Problem Description Up: Bayesian Networks Previous: Bayesian Networks Next: Problem Description Up: Bayesian Networks Previous: Bayesian Networks Introduction Autonomous agents often find themselves facing a great deal of uncertainty. It decides how to act based on its perceived

More information

Probability and Information Theory. Sargur N. Srihari

Probability and Information Theory. Sargur N. Srihari Probability and Information Theory Sargur N. srihari@cedar.buffalo.edu 1 Topics in Probability and Information Theory Overview 1. Why Probability? 2. Random Variables 3. Probability Distributions 4. Marginal

More information

Price: $25 (incl. T-Shirt, morning tea and lunch) Visit:

Price: $25 (incl. T-Shirt, morning tea and lunch) Visit: Three days of interesting talks & workshops from industry experts across Australia Explore new computing topics Network with students & employers in Brisbane Price: $25 (incl. T-Shirt, morning tea and

More information

Knowledge Extraction from DBNs for Images

Knowledge Extraction from DBNs for Images Knowledge Extraction from DBNs for Images Son N. Tran and Artur d Avila Garcez Department of Computer Science City University London Contents 1 Introduction 2 Knowledge Extraction from DBNs 3 Experimental

More information

Generative Models for Discrete Data

Generative Models for Discrete Data Generative Models for Discrete Data ddebarr@uw.edu 2016-04-21 Agenda Bayesian Concept Learning Beta-Binomial Model Dirichlet-Multinomial Model Naïve Bayes Classifiers Bayesian Concept Learning Numbers

More information

CV-width: A New Complexity Parameter for CNFs

CV-width: A New Complexity Parameter for CNFs CV-width: A New Complexity Parameter for CNFs Umut Oztok and Adnan Darwiche 1 Abstract. We present new complexity results on the compilation of CNFs into DNNFs and OBDDs. In particular, we introduce a

More information

Information Retrieval and Organisation

Information Retrieval and Organisation Information Retrieval and Organisation Chapter 13 Text Classification and Naïve Bayes Dell Zhang Birkbeck, University of London Motivation Relevance Feedback revisited The user marks a number of documents

More information

Algorithms for NLP. Language Modeling III. Taylor Berg-Kirkpatrick CMU Slides: Dan Klein UC Berkeley

Algorithms for NLP. Language Modeling III. Taylor Berg-Kirkpatrick CMU Slides: Dan Klein UC Berkeley Algorithms for NLP Language Modeling III Taylor Berg-Kirkpatrick CMU Slides: Dan Klein UC Berkeley Announcements Office hours on website but no OH for Taylor until next week. Efficient Hashing Closed address

More information

Discriminative Learning of Sum-Product Networks. Robert Gens Pedro Domingos

Discriminative Learning of Sum-Product Networks. Robert Gens Pedro Domingos Discriminative Learning of Sum-Product Networks Robert Gens Pedro Domingos X1 X1 X1 X1 X2 X2 X2 X2 X3 X3 X3 X3 X4 X4 X4 X4 X5 X5 X5 X5 X6 X6 X6 X6 Distributions X 1 X 1 X 1 X 1 X 2 X 2 X 2 X 2 X 3 X 3

More information

Cardinal and Ordinal Preferences. Introduction to Logic in Computer Science: Autumn Dinner Plans. Preference Modelling

Cardinal and Ordinal Preferences. Introduction to Logic in Computer Science: Autumn Dinner Plans. Preference Modelling Cardinal and Ordinal Preferences Introduction to Logic in Computer Science: Autumn 2007 Ulle Endriss Institute for Logic, Language and Computation University of Amsterdam A preference structure represents

More information

Introduction to the Tensor Train Decomposition and Its Applications in Machine Learning

Introduction to the Tensor Train Decomposition and Its Applications in Machine Learning Introduction to the Tensor Train Decomposition and Its Applications in Machine Learning Anton Rodomanov Higher School of Economics, Russia Bayesian methods research group (http://bayesgroup.ru) 14 March

More information

Sums of Products. Pasi Rastas November 15, 2005

Sums of Products. Pasi Rastas November 15, 2005 Sums of Products Pasi Rastas November 15, 2005 1 Introduction This presentation is mainly based on 1. Bacchus, Dalmao and Pitassi : Algorithms and Complexity results for #SAT and Bayesian inference 2.

More information

Prof. Dr. Ralf Möller Dr. Özgür L. Özçep Universität zu Lübeck Institut für Informationssysteme. Tanya Braun (Exercises)

Prof. Dr. Ralf Möller Dr. Özgür L. Özçep Universität zu Lübeck Institut für Informationssysteme. Tanya Braun (Exercises) Prof. Dr. Ralf Möller Dr. Özgür L. Özçep Universität zu Lübeck Institut für Informationssysteme Tanya Braun (Exercises) Slides taken from the presentation (subset only) Learning Statistical Models From

More information

A Relaxed Tseitin Transformation for Weighted Model Counting

A Relaxed Tseitin Transformation for Weighted Model Counting A Relaxed Tseitin Transformation for Weighted Model Counting Wannes Meert, Jonas Vlasselaer Computer Science Department KU Leuven, Belgium Guy Van den Broeck Computer Science Department University of California,

More information

Intelligent Systems (AI-2)

Intelligent Systems (AI-2) Intelligent Systems (AI-2) Computer Science cpsc422, Lecture 19 Oct, 23, 2015 Slide Sources Raymond J. Mooney University of Texas at Austin D. Koller, Stanford CS - Probabilistic Graphical Models D. Page,

More information

CS 188: Artificial Intelligence Fall 2011

CS 188: Artificial Intelligence Fall 2011 CS 188: Artificial Intelligence Fall 2011 Lecture 20: HMMs / Speech / ML 11/8/2011 Dan Klein UC Berkeley Today HMMs Demo bonanza! Most likely explanation queries Speech recognition A massive HMM! Details

More information

A Preference Logic With Four Kinds of Preferences

A Preference Logic With Four Kinds of Preferences A Preference Logic With Four Kinds of Preferences Zhang Zhizheng and Xing Hancheng School of Computer Science and Engineering, Southeast University No.2 Sipailou, Nanjing, China {seu_zzz; xhc}@seu.edu.cn

More information

STA 4273H: Statistical Machine Learning

STA 4273H: Statistical Machine Learning STA 4273H: Statistical Machine Learning Russ Salakhutdinov Department of Statistics! rsalakhu@utstat.toronto.edu! http://www.utstat.utoronto.ca/~rsalakhu/ Sidney Smith Hall, Room 6002 Lecture 11 Project

More information

Planning by Probabilistic Inference

Planning by Probabilistic Inference Planning by Probabilistic Inference Hagai Attias Microsoft Research 1 Microsoft Way Redmond, WA 98052 Abstract This paper presents and demonstrates a new approach to the problem of planning under uncertainty.

More information

Acknowledgments 2. Part 0: Overview 17

Acknowledgments 2. Part 0: Overview 17 Contents Acknowledgments 2 Preface for instructors 11 Which theory course are we talking about?.... 12 The features that might make this book appealing. 13 What s in and what s out............... 14 Possible

More information

Quantifying uncertainty & Bayesian networks

Quantifying uncertainty & Bayesian networks Quantifying uncertainty & Bayesian networks CE417: Introduction to Artificial Intelligence Sharif University of Technology Spring 2016 Soleymani Artificial Intelligence: A Modern Approach, 3 rd Edition,

More information

9/12/17. Types of learning. Modeling data. Supervised learning: Classification. Supervised learning: Regression. Unsupervised learning: Clustering

9/12/17. Types of learning. Modeling data. Supervised learning: Classification. Supervised learning: Regression. Unsupervised learning: Clustering Types of learning Modeling data Supervised: we know input and targets Goal is to learn a model that, given input data, accurately predicts target data Unsupervised: we know the input only and want to make

More information

Machine learning for automated theorem proving: the story so far. Sean Holden

Machine learning for automated theorem proving: the story so far. Sean Holden Machine learning for automated theorem proving: the story so far Sean Holden University of Cambridge Computer Laboratory William Gates Building 15 JJ Thomson Avenue Cambridge CB3 0FD, UK sbh11@cl.cam.ac.uk

More information

A Stochastic l-calculus

A Stochastic l-calculus A Stochastic l-calculus Content Areas: probabilistic reasoning, knowledge representation, causality Tracking Number: 775 Abstract There is an increasing interest within the research community in the design

More information

Computational Complexity of Bayesian Networks

Computational Complexity of Bayesian Networks Computational Complexity of Bayesian Networks UAI, 2015 Complexity theory Many computations on Bayesian networks are NP-hard Meaning (no more, no less) that we cannot hope for poly time algorithms that

More information

Online Forest Density Estimation

Online Forest Density Estimation Online Forest Density Estimation Frédéric Koriche CRIL - CNRS UMR 8188, Univ. Artois koriche@cril.fr UAI 16 1 Outline 1 Probabilistic Graphical Models 2 Online Density Estimation 3 Online Forest Density

More information

Partially Observable Markov Decision Processes (POMDPs) Pieter Abbeel UC Berkeley EECS

Partially Observable Markov Decision Processes (POMDPs) Pieter Abbeel UC Berkeley EECS Partially Observable Markov Decision Processes (POMDPs) Pieter Abbeel UC Berkeley EECS Many slides adapted from Jur van den Berg Outline POMDPs Separation Principle / Certainty Equivalence Locally Optimal

More information

Rank and Rating Aggregation. Amy Langville Mathematics Department College of Charleston

Rank and Rating Aggregation. Amy Langville Mathematics Department College of Charleston Rank and Rating Aggregation Amy Langville Mathematics Department College of Charleston Hamilton Institute 8/6/2008 Collaborators Carl Meyer, N. C. State University College of Charleston (current and former

More information

Intelligent Systems (AI-2)

Intelligent Systems (AI-2) Intelligent Systems (AI-2) Computer Science cpsc422, Lecture 19 Oct, 24, 2016 Slide Sources Raymond J. Mooney University of Texas at Austin D. Koller, Stanford CS - Probabilistic Graphical Models D. Page,

More information

Bayesian Networks. Marcello Cirillo PhD Student 11 th of May, 2007

Bayesian Networks. Marcello Cirillo PhD Student 11 th of May, 2007 Bayesian Networks Marcello Cirillo PhD Student 11 th of May, 2007 Outline General Overview Full Joint Distributions Bayes' Rule Bayesian Network An Example Network Everything has a cost Learning with Bayesian

More information

Lecture 15. Probabilistic Models on Graph

Lecture 15. Probabilistic Models on Graph Lecture 15. Probabilistic Models on Graph Prof. Alan Yuille Spring 2014 1 Introduction We discuss how to define probabilistic models that use richly structured probability distributions and describe how

More information

Arithmetic circuits of the noisy-or models

Arithmetic circuits of the noisy-or models Arithmetic circuits of the noisy-or models Jiří Vomlel Institute of Information Theory and Automation of the ASCR Academy of Sciences of the Czech Republic Pod vodárenskou věží 4, 182 08 Praha 8. Czech

More information

Bayesian belief networks. Inference.

Bayesian belief networks. Inference. Lecture 13 Bayesian belief networks. Inference. Milos Hauskrecht milos@cs.pitt.edu 5329 Sennott Square Midterm exam Monday, March 17, 2003 In class Closed book Material covered by Wednesday, March 12 Last

More information

CS 484 Data Mining. Classification 7. Some slides are from Professor Padhraic Smyth at UC Irvine

CS 484 Data Mining. Classification 7. Some slides are from Professor Padhraic Smyth at UC Irvine CS 484 Data Mining Classification 7 Some slides are from Professor Padhraic Smyth at UC Irvine Bayesian Belief networks Conditional independence assumption of Naïve Bayes classifier is too strong. Allows

More information