PSDDs for Tractable Learning in Structured and Unstructured Spaces
1 PSDDs for Tractable Learning in Structured and Unstructured Spaces Guy Van den Broeck DeLBP Aug 18, 2017
2 References
Probabilistic Sentential Decision Diagrams. Doga Kisa, Guy Van den Broeck, Arthur Choi and Adnan Darwiche. KR, 2014.
Learning with Massive Logical Constraints. Doga Kisa, Guy Van den Broeck, Arthur Choi and Adnan Darwiche. ICML LTPM workshop, 2014.
Tractable Learning for Structured Probability Spaces. Arthur Choi, Guy Van den Broeck and Adnan Darwiche. IJCAI, 2015.
Tractable Learning for Complex Probability Queries. Jessa Bekker, Jesse Davis, Arthur Choi, Adnan Darwiche and Guy Van den Broeck. NIPS, 2015.
Learning the Structure of PSDDs. Yitao Liang, Jessa Bekker and Guy Van den Broeck. UAI, 2017.
Towards Compact Interpretable Models: Learning and Shrinking PSDDs. Yitao Liang and Guy Van den Broeck. IJCAI XAI workshop, 2017.
3 (P)SDDs in Melbourne
Sunday, Logical Foundations for Uncertainty and Machine Learning Workshop: Adnan Darwiche, On the Role of Logic in Probabilistic Inference and Machine Learning; YooJung Choi, Optimal Feature Selection for Decision Robustness in Bayesian Networks.
Sunday, Explainable AI Workshop: Yitao Liang, Towards Compact Interpretable Models: Learning and Shrinking PSDDs.
Tuesday, IJCAI: YooJung Choi (again).
4 Structured vs. unstructured probability spaces?
5 Running Example
Courses: Logic (L), Knowledge Representation (K), Probability (P), Artificial Intelligence (A). Data.
Constraints: Must take at least one of Probability or Logic. Probability is a prerequisite for AI. The prerequisite for KR is either AI or Logic.
7 Structured Probability Space
Unstructured space: all 16 instantiations of L, K, P, A. Constraints: Must take at least one of Probability or Logic. Probability is a prerequisite for AI. The prerequisite for KR is either AI or Logic. Under these constraints, 7 out of 16 instantiations are impossible; the remaining 9 form the structured space.
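The running example is small enough to check by brute force. A minimal sketch (the variable names and constraint encoding are mine) that enumerates all 2^4 instantiations and keeps those satisfying the three constraints:

```python
from itertools import product

# The three constraints from the slide, over Booleans L, K, P, A:
#   1. must take at least one of Probability or Logic:  P or L
#   2. Probability is a prerequisite for AI:            A implies P
#   3. the prerequisite for KR is either AI or Logic:   K implies (A or L)
valid = [
    (L, K, P, A)
    for L, K, P, A in product([False, True], repeat=4)
    if (P or L) and (not A or P) and (not K or A or L)
]
print(len(valid))  # 9 of the 16 instantiations remain; 7 are impossible
```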
9 Learning with Constraints
Data + Constraints (background knowledge, physics) → Learn → Statistical model (distribution). Learn a statistical model that assigns zero probability to instantiations that violate the constraints.
11 Example: Video [Lu, W. L., Ting, J. A., Little, J. J., & Murphy, K. P. (2013). Learning to track and identify players from broadcast sports videos.]
13 Example: Robotics [Wong, L. L., Kaelbling, L. P., & Lozano-Perez, T., Collision-free state estimation. ICRA 2012]
15 Example: Language
Non-local dependencies: at least one verb in each sentence. Sentence compression: if a modifier is kept, its subject is also kept. Information extraction. Semantic role labeling. And many more! [Chang, M., Ratinov, L., & Roth, D. (2008). Constraints as prior knowledge], [Chang, M. W., Ratinov, L., & Roth, D. (2012). Structured learning with constrained conditional models.]
19 Example: Deep Learning [Graves, A., Wayne, G., Reynolds, M., Harley, T., Danihelka, I., Grabska-Barwińska, A., et al. (2016). Hybrid computing using a neural network with dynamic external memory. Nature, 538(7626)]
23 What are people doing now?
Ignore constraints. Handcraft into models. Use specialized distributions. Find non-structured encoding. Try to learn constraints. Hack your way around.
Each option has a cost: accuracy? specialized skill? intractable inference? intractable learning? wasted parameters? risk of predicting out of space? You are on your own.
25 Structured Probability Spaces
Everywhere in ML! Configuration problems, inventory, video, text, deep learning. Planning and diagnosis (physics). Causal models: cooking scenarios (interpreting videos). Combinatorial objects: parse trees, rankings, directed acyclic graphs, trees, simple paths, game traces, etc.
Some representations: constrained conditional models, mixed networks, probabilistic logics. No statistical ML boxes out there take constraints as input!
Goal: constraints as important as data! General purpose!
29 Specification Language: Logic
30 Structured Probability Space (recap). Must take at least one of Probability or Logic. Probability is a prerequisite for AI. The prerequisite for KR is either AI or Logic. 7 out of 16 instantiations of L, K, P, A are impossible.
31 Boolean Constraints: P ∨ L; A ⇒ P; K ⇒ (A ∨ L). 7 out of 16 instantiations of L, K, P, A are impossible.
32 Combinatorial Objects: Rankings
[Two example rankings of 10 sushi types: fatty tuna, sea urchin, salmon roe, shrimp, tuna, squid, tuna roll, sea eel, egg, cucumber roll.] 10 items: 3,628,800 rankings; 20 items: 2,432,902,008,176,640,000 rankings.
A_ij denotes item i at position j (n items require n² Boolean variables). Without further constraints, an item may be assigned to more than one position, and a position may contain more than one item.
35 Encoding Rankings in Logic
A_ij: item i at position j, arranged in an n × n grid of Boolean variables (rows = items, columns = positions; A_11 through A_44 for n = 4).
Constraint: each item i is assigned to a unique position (n constraints). Constraint: each position j is assigned a unique item (n constraints).
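For small n the two families of uniqueness constraints can be checked exhaustively. A sketch (n = 3 for speed; the grid encoding follows the slide's A_ij layout):

```python
from itertools import product

n = 3
# one Boolean A[i][j] per (item i, position j); a full assignment is a 0/1 grid
assignments = product([0, 1], repeat=n * n)

def is_ranking(bits):
    grid = [bits[i * n:(i + 1) * n] for i in range(n)]
    rows_ok = all(sum(row) == 1 for row in grid)           # each item: unique position
    cols_ok = all(sum(grid[i][j] for i in range(n)) == 1   # each position: unique item
                  for j in range(n))
    return rows_ok and cols_ok

count = sum(1 for bits in assignments if is_ranking(bits))
print(count)  # 6 = 3! rankings, out of 2**9 = 512 Boolean assignments
```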
39 Structured Space for Paths (cf. Nature paper)
Good variable assignments (represent a route): 184. Bad variable assignments (do not represent a route): 16,777,032. The space is easily encoded in logical constraints; see [Choi, Tavabi, Darwiche, AAAI 2016]. Unstructured probability space: 184 + 16,777,032 = 2^24 assignments.
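The slide's numbers are consistent with routes between opposite corners of a 4×4 grid graph: such a grid has 24 edges (hence 2^24 edge assignments), and a depth-first search over simple paths recovers the 184 good assignments. A brute-force sketch (the grid size is my inference from the counts, not stated on the slide):

```python
def count_simple_paths(n=4):
    """Count self-avoiding corner-to-corner paths in an n-by-n grid graph."""
    def neighbors(x, y):
        for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            if 0 <= x + dx < n and 0 <= y + dy < n:
                yield (x + dx, y + dy)

    goal = (n - 1, n - 1)

    def dfs(v, visited):
        if v == goal:
            return 1
        return sum(dfs(w, visited | {w})
                   for w in neighbors(*v) if w not in visited)

    return dfs((0, 0), {(0, 0)})

print(count_simple_paths())  # 184 routes
print(2 ** 24 - 184)         # 16,777,032 assignments that are not routes
```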
44 Deep Architecture Logic + Probability
45 Logical Circuits
[Circuit diagram over variables L, K, P, A.]
46 Property: Decomposability. The children of every AND gate mention disjoint sets of variables.
48 Property: Determinism. On every input (e.g., L, K, P, A), at most one child of each OR gate evaluates to true.
49 Sentential Decision Diagram (SDD): a logical circuit satisfying both properties.
52 Tractable for Logical Inference
Is the structured space empty? (SAT) Count the size of the structured space. (#SAT) Check equivalence of spaces. Algorithms are linear in circuit size (pass up, pass down, similar to backprop).
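Model counting with one bottom-up pass can be sketched on a toy circuit. The tuple encoding is mine, and the recursion assumes the circuit is decomposable, deterministic, and smooth (every OR child mentions the same variables), as the example circuit is:

```python
# node kinds: ('lit', name) for a literal, ('and', kids), ('or', kids)
def model_count(node):
    kind = node[0]
    if kind == 'lit':
        return 1
    if kind == 'and':                     # decomposable: multiply child counts
        result = 1
        for child in node[1]:
            result *= model_count(child)
        return result
    return sum(model_count(c) for c in node[1])  # deterministic: add child counts

# (L and P) or (L and not P) or (not L and P), i.e. "L or P" over {L, P}
circuit = ('or', [
    ('and', [('lit', 'L'), ('lit', 'P')]),
    ('and', [('lit', 'L'), ('lit', 'not P')]),
    ('and', [('lit', 'not L'), ('lit', 'P')]),
])
print(model_count(circuit))  # 3 models of "L or P" over two variables
```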
55 PSDD: Probabilistic SDD
[The SDD circuit with a probability parameter on each OR-gate wire.] On input L, K, P, A, multiplying the parameters along the selected wires gives Pr(L,K,P,A) = 0.3 × 1.0 × 0.8 × 0.4 × 0.25 = 0.024.
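The computation on the slide is just the product of the parameters along the sub-circuit selected by the input; a one-line check (the five parameter values are read off the slide):

```python
from math import prod

theta = [0.3, 1.0, 0.8, 0.4, 0.25]  # wire parameters along the selected branch
p = prod(theta)
print(round(p, 6))  # 0.024
```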
59 PSDD nodes induce a normalized distribution! Probabilistic independences can be read off the circuit structure.
62 Tractable for Probabilistic Inference
MAP inference: find the most likely assignment (otherwise NP-complete). Computing conditional probabilities Pr(x | y) (otherwise PP-complete). Sampling from Pr(x | y). Algorithms are linear in circuit size (pass up, pass down, similar to backprop).
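The pass-up computation of Pr(x | y) can be sketched on a toy sum-product circuit: two evaluations of the same circuit (one with evidence x and y, one with evidence y alone) and a division. The node encoding and the tiny two-variable circuit are mine:

```python
def evaluate(node, evidence):
    """One bottom-up pass; unset variables are marginalized out."""
    kind = node[0]
    if kind == 'leaf':                     # ('leaf', var, polarity)
        _, var, polarity = node
        if var not in evidence:
            return 1.0                     # sums this variable out
        return 1.0 if evidence[var] == polarity else 0.0
    if kind == 'prod':                     # ('prod', children)
        result = 1.0
        for child in node[1]:
            result *= evaluate(child, evidence)
        return result
    return sum(w * evaluate(c, evidence) for w, c in node[1])  # ('sum', [(w, c)])

# toy PSDD-style circuit: L ~ Bernoulli(0.3), K ~ Bernoulli(0.6), independent
sum_L = ('sum', [(0.3, ('leaf', 'L', True)), (0.7, ('leaf', 'L', False))])
sum_K = ('sum', [(0.6, ('leaf', 'K', True)), (0.4, ('leaf', 'K', False))])
root = ('prod', [sum_L, sum_K])

pr_LK = evaluate(root, {'L': True, 'K': True})  # Pr(L, K)
pr_K = evaluate(root, {'K': True})              # Pr(K)
print(round(pr_LK / pr_K, 6))                   # Pr(L | K) = 0.3
```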
64 Learning PSDDs Logic + Probability + ML
65 Parameters are Interpretable (Explainable AI DARPA Program)
Each parameter has a precise local meaning: e.g., the probability that a student takes course L, the probability that a student takes course P, or the probability of P given L.
69 Learning Algorithms
Parameter learning: closed-form maximum likelihood from complete data; one pass over the data to estimate each Pr(x | y). Not a lot to say: very easy!
Circuit learning (naïve): compile the constraints to an SDD circuit using SAT-solver technology. The circuit does not depend on the data.
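Closed-form parameter estimation is just relative-frequency counting in one pass over the data. A sketch on a hypothetical complete dataset (the parameter shown, Pr(P | L), matches the interpretable-parameter example earlier in the talk):

```python
# hypothetical toy dataset of complete examples over (L, K, P, A); each PSDD
# parameter is a conditional probability estimated by relative frequency
data = [
    (1, 0, 1, 0),
    (1, 1, 1, 1),
    (1, 0, 0, 0),
    (1, 1, 1, 0),
    (0, 1, 1, 0),
]
num = sum(1 for L, K, P, A in data if L and P)  # examples with L and P
den = sum(1 for L, K, P, A in data if L)        # examples with L
theta_P_given_L = num / den
print(theta_P_given_L)  # 0.75
```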
72 Learning Preference Distributions
Special-purpose baseline: mixture-of-Mallows (number of components from 1 to 20, EM with 10 random seeds, implementation of Lu & Boutilier) vs. PSDD. This is the naïve approach: the circuit does not depend on the data!
74 Learn Circuit from Data Even in unstructured spaces
75 Tractable Learning
Bayesian networks and Markov networks do not support linear-time exact inference.
77 Tractable Learning
Historically: polytrees, Chow-Liu trees, etc. More recently: SPNs and cutset networks; both are Arithmetic Circuits (ACs) [Darwiche, JACM 2003].
78 PSDDs are Arithmetic Circuits
[Diagram: a PSDD decision node with primes p_1, …, p_n, subs s_1, …, s_n and parameters θ_1, …, θ_n corresponds to the AC sum of products Σ_i θ_i · p_i · s_i.]
79 Tractable Learning
[Spectrum diagram placing DNNs, cutset networks, and SPNs between strong properties and representational freedom.] PSDDs: perhaps the most powerful circuit proposed to date.
84 PSDDs for the Logic-Phobic
Bottom-up, each node is a distribution: product nodes multiply independent distributions; sum nodes take a weighted mixture of lower-level distributions.
93 Variable Trees (vtrees)
[Diagram: correspondence between a PSDD and its vtree.]
94 Learning Variable Trees
How much do variables depend on each other? Learn a vtree by hierarchical clustering.
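A minimal sketch of the vtree-learning idea: measure how much variables depend on each other with empirical mutual information, then pair up the most dependent clusters bottom-up. The greedy single-linkage scheme and the nested-tuple vtree encoding are simplifications of mine:

```python
import math
from collections import Counter
from itertools import combinations

def mutual_information(rows, i, j):
    """Empirical mutual information (in nats) between discrete columns i and j."""
    n = len(rows)
    cij = Counter((r[i], r[j]) for r in rows)
    ci = Counter(r[i] for r in rows)
    cj = Counter(r[j] for r in rows)
    return sum((c / n) * math.log(c * n / (ci[a] * cj[b]))
               for (a, b), c in cij.items())

def learn_vtree(rows, num_vars):
    """Greedily merge the two most-dependent clusters until one tree remains."""
    clusters = [({v}, v) for v in range(num_vars)]  # (variable set, subtree)
    while len(clusters) > 1:
        i, j = max(combinations(range(len(clusters)), 2),
                   key=lambda p: max(mutual_information(rows, a, b)
                                     for a in clusters[p[0]][0]
                                     for b in clusters[p[1]][0]))
        merged = (clusters[i][0] | clusters[j][0],
                  (clusters[i][1], clusters[j][1]))
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    return clusters[0][1]

# columns 0 and 1 are perfectly correlated, column 2 is nearly independent,
# so the learned vtree puts variables 0 and 1 under one internal node
rows = [(0, 0, 0), (1, 1, 0), (0, 0, 1), (1, 1, 1), (0, 0, 0), (1, 1, 1)]
print(learn_vtree(rows, 3))  # (2, (0, 1))
```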
96 Learning Primitives
The primitives maintain the PSDD properties and the structured space!
99 LearnPSDD
1. Vtree learning. 2. Construct the most naïve PSDD. 3. Search for better structure: generate candidate operations, simulate them, execute the best.
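The outer search loop (generate candidates, simulate, execute the best) is ordinary greedy hill climbing. A generic sketch where `score` and `candidate_ops` are hypothetical stand-ins for the PSDD log-likelihood and the structure-change operations:

```python
def greedy_search(model, candidate_ops, score, max_iters=100):
    """Repeatedly simulate all candidate operations and execute the best one."""
    best = score(model)
    for _ in range(max_iters):
        candidates = [op(model) for op in candidate_ops(model)]  # simulate
        if not candidates:
            break
        champion = max(candidates, key=score)
        if score(champion) <= best:  # no operation improves the score: stop
            break
        model, best = champion, score(champion)
    return model

# toy stand-in: "models" are integers, the score peaks at 5
ops = lambda m: [lambda x: x + 1, lambda x: x - 1]
print(greedy_search(0, ops, lambda m: -(m - 5) ** 2))  # 5
```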
101 Experiments on 20 datasets
Compared with O-SPN: smaller size on 14, better log-likelihood on 11, win on both on 6. Compared with L-SPN: smaller size on 14, better log-likelihood on 6, win on both on 2. Comparable in performance and smaller in size.
104 Ensembles of PSDDs
Learned with EM/bagging.
107 State-of-the-Art Performance
State of the art on 6 datasets.
109 What happens if you ignore constraints?
Roadmap: compile the logic into an SDD; convert to a PSDD (parameter estimation); run LearnPSDD (generate candidate operations, simulate them, execute the best).
112 What happens if you ignore constraints?
Discrete multi-valued data: a variable A with values a_1, a_2, a_3 is encoded with one Boolean indicator per value. Never omit the domain constraints!
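The domain constraint being dropped here is the usual exactly-one (one-hot) constraint on the indicator variables. A quick enumeration shows how much of the Boolean space it rules out:

```python
from itertools import product

# indicators a1, a2, a3 for the three values of multi-valued variable A
valid = [bits for bits in product([0, 1], repeat=3) if sum(bits) == 1]
print(len(valid))  # 3 valid encodings out of 2**3 = 8 Boolean assignments
```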
116 Complex queries and Learning from constraints
117 Incomplete Data
A classical complete dataset (closed form: maximum-likelihood estimates are unique):
id X Y Z: 1: x1 y2 z1; 2: x2 y1 z2; 3: x2 y1 z2; 4: x1 y1 z1; 5: x1 y2 z2.
A classical incomplete dataset (EM algorithm, on PSDDs):
id X Y Z: 1: x1 y2 ?; 2: x2 y1 ?; 3: ? ? z2; 4: ? y1 z1; 5: x1 y2 z2.
A new type of incomplete dataset, where each example is a logical constraint over X, Y, Z (e.g., example 2 is "x2 and (y2 or z2)", example 5 is "x1 and y2 and z2"). Missed in the ML literature.
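The likelihood of a logically specified example is the probability mass of the instantiations that satisfy it. A sketch on a hypothetical uniform joint distribution (a PSDD computes this sum tractably; the brute-force enumeration here is only for illustration):

```python
from itertools import product

def pr_constraint(joint, constraint):
    """Total probability of the worlds satisfying the constraint."""
    return sum(p for world, p in joint.items() if constraint(world))

# hypothetical uniform joint over three variables, each with values 1 and 2
# (world (1, 2, 1) means X = x1, Y = y2, Z = z1)
joint = {w: 1 / 8 for w in product([1, 2], repeat=3)}

# example 2 from the slide: "x2 and (y2 or z2)"
p = pr_constraint(joint, lambda w: w[0] == 2 and (w[1] == 2 or w[2] == 2))
print(p)  # 0.375, i.e. 3 of the 8 equally likely worlds
```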
120 Structured Datasets
A classical complete dataset: total rankings (each example ranks every sushi type, e.g., 1st fatty tuna, 2nd sea urchin, 3rd salmon roe, …).
A classical incomplete dataset: top-k rankings, with the remaining entries unknown ("?").
A new type of incomplete dataset: partial rankings, where each example is a constraint on the possible total rankings, e.g., "(fatty tuna > sea urchin) and (tuna > sea eel)", "(fatty tuna is 1st) and (salmon roe > egg)", "tuna > squid", "egg is last", "egg > squid > shrimp".
122 Learning from Incomplete Data
MovieLens dataset: 3,900 movies, 6,040 users, 1M ratings. Take ratings for the 64 most-rated movies; ratings 1-5 are converted to pairwise preferences. PSDD for partial rankings: 4 tiers, 18,711 parameters.
Movies by expected tier: 1 The Godfather, 2 The Usual Suspects, 3 Casablanca, 4 The Shawshank Redemption, 5 Schindler's List, 6 One Flew Over the Cuckoo's Nest, 7 The Godfather: Part II, 8 Monty Python and the Holy Grail, 9 Raiders of the Lost Ark, 10 Star Wars IV: A New Hope.
123 PSDD Sizes
124 Structured Queries
Top-5 recommendation: 1 Star Wars V: The Empire Strikes Back, 2 Star Wars IV: A New Hope, 3 The Godfather, 4 The Shawshank Redemption, 5 The Usual Suspects.
Adding the constraints "no other Star Wars movie in the top 5" and "at least one comedy in the top 5" yields: 1 Star Wars V: The Empire Strikes Back, 2 American Beauty, 3 The Godfather, 4 The Usual Suspects, 5 The Shawshank Redemption.
Diversified recommendations via logical constraints.
128 Conclusions
Structured spaces are everywhere. PSDDs build on logical circuits: 1. tractability, 2. semantics, 3. natural encoding of structured spaces.
Learning is effective: 1. from constraints encoding the structured space (state of the art for learning preference distributions); 2. from standard unstructured datasets using search (state of the art on standard tractable-learning datasets).
Novel settings for inference and learning: structured spaces, learning from constraints, complex queries.
131 Conclusions
[Diagram: PSDDs at the intersection of statistical ML (probability), symbolic AI (logic), and connectionism (deep learning).]
132 Questions? PSDD with 15,000 nodes LearnPSDD code: Other PSDD code: SDD code:
More informationMixed Membership Matrix Factorization
Mixed Membership Matrix Factorization Lester Mackey 1 David Weiss 2 Michael I. Jordan 1 1 University of California, Berkeley 2 University of Pennsylvania International Conference on Machine Learning, 2010
More informationAutomatic Differentiation Equipped Variable Elimination for Sensitivity Analysis on Probabilistic Inference Queries
Automatic Differentiation Equipped Variable Elimination for Sensitivity Analysis on Probabilistic Inference Queries Anonymous Author(s) Affiliation Address email Abstract 1 2 3 4 5 6 7 8 9 10 11 12 Probabilistic
More informationSampling Algorithms for Probabilistic Graphical models
Sampling Algorithms for Probabilistic Graphical models Vibhav Gogate University of Washington References: Chapter 12 of Probabilistic Graphical models: Principles and Techniques by Daphne Koller and Nir
More informationLearning to Learn and Collaborative Filtering
Appearing in NIPS 2005 workshop Inductive Transfer: Canada, December, 2005. 10 Years Later, Whistler, Learning to Learn and Collaborative Filtering Kai Yu, Volker Tresp Siemens AG, 81739 Munich, Germany
More informationComputer Science and Logic A Match Made in Heaven
A Match Made in Heaven Luca Aceto Reykjavik University Reykjavik, 3 April 2009 Thanks to Moshe Vardi from whom I have drawn inspiration (read stolen ideas ) for this presentation. Why This Talk Today?
More informationCS 188: Artificial Intelligence. Bayes Nets
CS 188: Artificial Intelligence Probabilistic Inference: Enumeration, Variable Elimination, Sampling Pieter Abbeel UC Berkeley Many slides over this course adapted from Dan Klein, Stuart Russell, Andrew
More informationSTA 4273H: Statistical Machine Learning
STA 4273H: Statistical Machine Learning Russ Salakhutdinov Department of Computer Science! Department of Statistical Sciences! rsalakhu@cs.toronto.edu! h0p://www.cs.utoronto.ca/~rsalakhu/ Lecture 7 Approximate
More informationarxiv: v1 [stat.ml] 15 Nov 2016
Recoverability of Joint Distribution from Missing Data arxiv:1611.04709v1 [stat.ml] 15 Nov 2016 Jin Tian Department of Computer Science Iowa State University Ames, IA 50014 jtian@iastate.edu Abstract A
More informationBayesian Networks BY: MOHAMAD ALSABBAGH
Bayesian Networks BY: MOHAMAD ALSABBAGH Outlines Introduction Bayes Rule Bayesian Networks (BN) Representation Size of a Bayesian Network Inference via BN BN Learning Dynamic BN Introduction Conditional
More informationPreliminaries. Data Mining. The art of extracting knowledge from large bodies of structured data. Let s put it to use!
Data Mining The art of extracting knowledge from large bodies of structured data. Let s put it to use! 1 Recommendations 2 Basic Recommendations with Collaborative Filtering Making Recommendations 4 The
More informationFinal. Introduction to Artificial Intelligence. CS 188 Spring You have approximately 2 hours and 50 minutes.
CS 188 Spring 2014 Introduction to Artificial Intelligence Final You have approximately 2 hours and 50 minutes. The exam is closed book, closed notes except your two-page crib sheet. Mark your answers
More informationExamination Artificial Intelligence Module Intelligent Interaction Design December 2014
Examination Artificial Intelligence Module Intelligent Interaction Design December 2014 Introduction This exam is closed book, you may only use a simple calculator (addition, substraction, multiplication
More informationModel Counting for Probabilistic Reasoning
Model Counting for Probabilistic Reasoning Beyond NP Workshop Stefano Ermon CS Department, Stanford Combinatorial Search and Optimization Progress in combinatorial search since the 1990s (SAT, SMT, MIP,
More informationModels, Data, Learning Problems
Universität Potsdam Institut für Informatik Lehrstuhl Maschinelles Lernen Models, Data, Learning Problems Tobias Scheffer Overview Types of learning problems: Supervised Learning (Classification, Regression,
More informationGenerative Models for Sentences
Generative Models for Sentences Amjad Almahairi PhD student August 16 th 2014 Outline 1. Motivation Language modelling Full Sentence Embeddings 2. Approach Bayesian Networks Variational Autoencoders (VAE)
More informationOn the Relationship between Sum-Product Networks and Bayesian Networks
On the Relationship between Sum-Product Networks and Bayesian Networks International Conference on Machine Learning, 2015 Han Zhao Mazen Melibari Pascal Poupart University of Waterloo, Waterloo, ON, Canada
More informationThe exam is closed book, closed calculator, and closed notes except your one-page crib sheet.
CS 188 Fall 2015 Introduction to Artificial Intelligence Final You have approximately 2 hours and 50 minutes. The exam is closed book, closed calculator, and closed notes except your one-page crib sheet.
More informationAnnouncements. CS 188: Artificial Intelligence Fall Causality? Example: Traffic. Topology Limits Distributions. Example: Reverse Traffic
CS 188: Artificial Intelligence Fall 2008 Lecture 16: Bayes Nets III 10/23/2008 Announcements Midterms graded, up on glookup, back Tuesday W4 also graded, back in sections / box Past homeworks in return
More informationComputational Genomics. Systems biology. Putting it together: Data integration using graphical models
02-710 Computational Genomics Systems biology Putting it together: Data integration using graphical models High throughput data So far in this class we discussed several different types of high throughput
More informationDynamic models 1 Kalman filters, linearization,
Koller & Friedman: Chapter 16 Jordan: Chapters 13, 15 Uri Lerner s Thesis: Chapters 3,9 Dynamic models 1 Kalman filters, linearization, Switching KFs, Assumed density filters Probabilistic Graphical Models
More informationCollaborative topic models: motivations cont
Collaborative topic models: motivations cont Two topics: machine learning social network analysis Two people: " boy Two articles: article A! girl article B Preferences: The boy likes A and B --- no problem.
More informationSTA 4273H: Statistical Machine Learning
STA 4273H: Statistical Machine Learning Russ Salakhutdinov Department of Statistics! rsalakhu@utstat.toronto.edu! http://www.utstat.utoronto.ca/~rsalakhu/ Sidney Smith Hall, Room 6002 Lecture 3 Linear
More informationSymbolic Variable Elimination in Discrete and Continuous Graphical Models. Scott Sanner Ehsan Abbasnejad
Symbolic Variable Elimination in Discrete and Continuous Graphical Models Scott Sanner Ehsan Abbasnejad Inference for Dynamic Tracking No one previously did this inference exactly in closed-form! Exact
More informationModeling User Rating Profiles For Collaborative Filtering
Modeling User Rating Profiles For Collaborative Filtering Benjamin Marlin Department of Computer Science University of Toronto Toronto, ON, M5S 3H5, CANADA marlin@cs.toronto.edu Abstract In this paper
More informationIntroduction to Artificial Intelligence (AI)
Introduction to Artificial Intelligence (AI) Computer Science cpsc502, Lecture 9 Oct, 11, 2011 Slide credit Approx. Inference : S. Thrun, P, Norvig, D. Klein CPSC 502, Lecture 9 Slide 1 Today Oct 11 Bayesian
More informationBayesian Networks: Construction, Inference, Learning and Causal Interpretation. Volker Tresp Summer 2014
Bayesian Networks: Construction, Inference, Learning and Causal Interpretation Volker Tresp Summer 2014 1 Introduction So far we were mostly concerned with supervised learning: we predicted one or several
More informationMixed Membership Matrix Factorization
Mixed Membership Matrix Factorization Lester Mackey University of California, Berkeley Collaborators: David Weiss, University of Pennsylvania Michael I. Jordan, University of California, Berkeley 2011
More informationRecovering Probability Distributions from Missing Data
Proceedings of Machine Learning Research 77:574 589, 2017 ACML 2017 Recovering Probability Distributions from Missing Data Jin Tian Iowa State University jtian@iastate.edu Editors: Yung-Kyun Noh and Min-Ling
More informationIntroduction. next up previous Next: Problem Description Up: Bayesian Networks Previous: Bayesian Networks
Next: Problem Description Up: Bayesian Networks Previous: Bayesian Networks Introduction Autonomous agents often find themselves facing a great deal of uncertainty. It decides how to act based on its perceived
More informationProbability and Information Theory. Sargur N. Srihari
Probability and Information Theory Sargur N. srihari@cedar.buffalo.edu 1 Topics in Probability and Information Theory Overview 1. Why Probability? 2. Random Variables 3. Probability Distributions 4. Marginal
More informationPrice: $25 (incl. T-Shirt, morning tea and lunch) Visit:
Three days of interesting talks & workshops from industry experts across Australia Explore new computing topics Network with students & employers in Brisbane Price: $25 (incl. T-Shirt, morning tea and
More informationKnowledge Extraction from DBNs for Images
Knowledge Extraction from DBNs for Images Son N. Tran and Artur d Avila Garcez Department of Computer Science City University London Contents 1 Introduction 2 Knowledge Extraction from DBNs 3 Experimental
More informationGenerative Models for Discrete Data
Generative Models for Discrete Data ddebarr@uw.edu 2016-04-21 Agenda Bayesian Concept Learning Beta-Binomial Model Dirichlet-Multinomial Model Naïve Bayes Classifiers Bayesian Concept Learning Numbers
More informationCV-width: A New Complexity Parameter for CNFs
CV-width: A New Complexity Parameter for CNFs Umut Oztok and Adnan Darwiche 1 Abstract. We present new complexity results on the compilation of CNFs into DNNFs and OBDDs. In particular, we introduce a
More informationInformation Retrieval and Organisation
Information Retrieval and Organisation Chapter 13 Text Classification and Naïve Bayes Dell Zhang Birkbeck, University of London Motivation Relevance Feedback revisited The user marks a number of documents
More informationAlgorithms for NLP. Language Modeling III. Taylor Berg-Kirkpatrick CMU Slides: Dan Klein UC Berkeley
Algorithms for NLP Language Modeling III Taylor Berg-Kirkpatrick CMU Slides: Dan Klein UC Berkeley Announcements Office hours on website but no OH for Taylor until next week. Efficient Hashing Closed address
More informationDiscriminative Learning of Sum-Product Networks. Robert Gens Pedro Domingos
Discriminative Learning of Sum-Product Networks Robert Gens Pedro Domingos X1 X1 X1 X1 X2 X2 X2 X2 X3 X3 X3 X3 X4 X4 X4 X4 X5 X5 X5 X5 X6 X6 X6 X6 Distributions X 1 X 1 X 1 X 1 X 2 X 2 X 2 X 2 X 3 X 3
More informationCardinal and Ordinal Preferences. Introduction to Logic in Computer Science: Autumn Dinner Plans. Preference Modelling
Cardinal and Ordinal Preferences Introduction to Logic in Computer Science: Autumn 2007 Ulle Endriss Institute for Logic, Language and Computation University of Amsterdam A preference structure represents
More informationIntroduction to the Tensor Train Decomposition and Its Applications in Machine Learning
Introduction to the Tensor Train Decomposition and Its Applications in Machine Learning Anton Rodomanov Higher School of Economics, Russia Bayesian methods research group (http://bayesgroup.ru) 14 March
More informationSums of Products. Pasi Rastas November 15, 2005
Sums of Products Pasi Rastas November 15, 2005 1 Introduction This presentation is mainly based on 1. Bacchus, Dalmao and Pitassi : Algorithms and Complexity results for #SAT and Bayesian inference 2.
More informationProf. Dr. Ralf Möller Dr. Özgür L. Özçep Universität zu Lübeck Institut für Informationssysteme. Tanya Braun (Exercises)
Prof. Dr. Ralf Möller Dr. Özgür L. Özçep Universität zu Lübeck Institut für Informationssysteme Tanya Braun (Exercises) Slides taken from the presentation (subset only) Learning Statistical Models From
More informationA Relaxed Tseitin Transformation for Weighted Model Counting
A Relaxed Tseitin Transformation for Weighted Model Counting Wannes Meert, Jonas Vlasselaer Computer Science Department KU Leuven, Belgium Guy Van den Broeck Computer Science Department University of California,
More informationIntelligent Systems (AI-2)
Intelligent Systems (AI-2) Computer Science cpsc422, Lecture 19 Oct, 23, 2015 Slide Sources Raymond J. Mooney University of Texas at Austin D. Koller, Stanford CS - Probabilistic Graphical Models D. Page,
More informationCS 188: Artificial Intelligence Fall 2011
CS 188: Artificial Intelligence Fall 2011 Lecture 20: HMMs / Speech / ML 11/8/2011 Dan Klein UC Berkeley Today HMMs Demo bonanza! Most likely explanation queries Speech recognition A massive HMM! Details
More informationA Preference Logic With Four Kinds of Preferences
A Preference Logic With Four Kinds of Preferences Zhang Zhizheng and Xing Hancheng School of Computer Science and Engineering, Southeast University No.2 Sipailou, Nanjing, China {seu_zzz; xhc}@seu.edu.cn
More informationSTA 4273H: Statistical Machine Learning
STA 4273H: Statistical Machine Learning Russ Salakhutdinov Department of Statistics! rsalakhu@utstat.toronto.edu! http://www.utstat.utoronto.ca/~rsalakhu/ Sidney Smith Hall, Room 6002 Lecture 11 Project
More informationPlanning by Probabilistic Inference
Planning by Probabilistic Inference Hagai Attias Microsoft Research 1 Microsoft Way Redmond, WA 98052 Abstract This paper presents and demonstrates a new approach to the problem of planning under uncertainty.
More informationAcknowledgments 2. Part 0: Overview 17
Contents Acknowledgments 2 Preface for instructors 11 Which theory course are we talking about?.... 12 The features that might make this book appealing. 13 What s in and what s out............... 14 Possible
More informationQuantifying uncertainty & Bayesian networks
Quantifying uncertainty & Bayesian networks CE417: Introduction to Artificial Intelligence Sharif University of Technology Spring 2016 Soleymani Artificial Intelligence: A Modern Approach, 3 rd Edition,
More information9/12/17. Types of learning. Modeling data. Supervised learning: Classification. Supervised learning: Regression. Unsupervised learning: Clustering
Types of learning Modeling data Supervised: we know input and targets Goal is to learn a model that, given input data, accurately predicts target data Unsupervised: we know the input only and want to make
More informationMachine learning for automated theorem proving: the story so far. Sean Holden
Machine learning for automated theorem proving: the story so far Sean Holden University of Cambridge Computer Laboratory William Gates Building 15 JJ Thomson Avenue Cambridge CB3 0FD, UK sbh11@cl.cam.ac.uk
More informationA Stochastic l-calculus
A Stochastic l-calculus Content Areas: probabilistic reasoning, knowledge representation, causality Tracking Number: 775 Abstract There is an increasing interest within the research community in the design
More informationComputational Complexity of Bayesian Networks
Computational Complexity of Bayesian Networks UAI, 2015 Complexity theory Many computations on Bayesian networks are NP-hard Meaning (no more, no less) that we cannot hope for poly time algorithms that
More informationOnline Forest Density Estimation
Online Forest Density Estimation Frédéric Koriche CRIL - CNRS UMR 8188, Univ. Artois koriche@cril.fr UAI 16 1 Outline 1 Probabilistic Graphical Models 2 Online Density Estimation 3 Online Forest Density
More informationPartially Observable Markov Decision Processes (POMDPs) Pieter Abbeel UC Berkeley EECS
Partially Observable Markov Decision Processes (POMDPs) Pieter Abbeel UC Berkeley EECS Many slides adapted from Jur van den Berg Outline POMDPs Separation Principle / Certainty Equivalence Locally Optimal
More informationRank and Rating Aggregation. Amy Langville Mathematics Department College of Charleston
Rank and Rating Aggregation Amy Langville Mathematics Department College of Charleston Hamilton Institute 8/6/2008 Collaborators Carl Meyer, N. C. State University College of Charleston (current and former
More informationIntelligent Systems (AI-2)
Intelligent Systems (AI-2) Computer Science cpsc422, Lecture 19 Oct, 24, 2016 Slide Sources Raymond J. Mooney University of Texas at Austin D. Koller, Stanford CS - Probabilistic Graphical Models D. Page,
More informationBayesian Networks. Marcello Cirillo PhD Student 11 th of May, 2007
Bayesian Networks Marcello Cirillo PhD Student 11 th of May, 2007 Outline General Overview Full Joint Distributions Bayes' Rule Bayesian Network An Example Network Everything has a cost Learning with Bayesian
More informationLecture 15. Probabilistic Models on Graph
Lecture 15. Probabilistic Models on Graph Prof. Alan Yuille Spring 2014 1 Introduction We discuss how to define probabilistic models that use richly structured probability distributions and describe how
More informationArithmetic circuits of the noisy-or models
Arithmetic circuits of the noisy-or models Jiří Vomlel Institute of Information Theory and Automation of the ASCR Academy of Sciences of the Czech Republic Pod vodárenskou věží 4, 182 08 Praha 8. Czech
More informationBayesian belief networks. Inference.
Lecture 13 Bayesian belief networks. Inference. Milos Hauskrecht milos@cs.pitt.edu 5329 Sennott Square Midterm exam Monday, March 17, 2003 In class Closed book Material covered by Wednesday, March 12 Last
More informationCS 484 Data Mining. Classification 7. Some slides are from Professor Padhraic Smyth at UC Irvine
CS 484 Data Mining Classification 7 Some slides are from Professor Padhraic Smyth at UC Irvine Bayesian Belief networks Conditional independence assumption of Naïve Bayes classifier is too strong. Allows
More information