Treedy: A Heuristic for Counting and Sampling Subsets


Slide 1: Treedy: A Heuristic for Counting and Sampling Subsets. Teppo Niinimäki, Mikko Koivisto. July 12, 2013. University of Helsinki, Department of Computer Science.

Slide 2: Outline
- Problem definition
- Application: Bayesian network learning
- Algorithms
- Experiments


Slide 4: Problem overview. Subset counting problem: compute W(Q) = Σ_{S ⊆ Q} w(S). [Figure: the ground set N, a query set Q ⊆ N, and the collection C.]

Slides 5–7: Problem setting. Given:
- Ground set N of n elements. Example: N = {A, B, C, D} (n = 4).
- Collection C ⊆ P(N), downward closed. Example: C = {∅, A, B, C, D, AB, AC, AD, BC, BD, CD}.
- Weights w(S) ≥ 0 for S ∈ C, and w(S) = 0 otherwise. Example (continued):

  S ∈ C:  ∅   A   B   C   D   AB  AC  AD  BC  BD  CD
  w(S):   80  85  70  13  50  99  60  90  11  14  12

Slides 8–9: Problem definition. Query set Q ⊆ N. Example: Q = {B, C, D}.
Counting problem: compute W(Q) = Σ_{S ⊆ Q} w(S). Example:

  S ⊆ Q:  ∅   B   C   D   BC  BD  CD  BCD
  w(S):   80  70  13  50  11  14  12  0

  W(Q) = 250

Slide 10: Sampling problem: draw S ⊆ Q such that Pr(S) = w(S) / W(Q). Example (Q = {B, C, D}, W(Q) = 250):

  S ⊆ Q:  ∅     B     C     D     BC    BD    CD    BCD
  w(S):   80    70    13    50    11    14    12    0
  Pr(S):  0.320 0.280 0.052 0.200 0.044 0.056 0.048 0.000

Slide 11: Exact computation is too slow, so approximate with a tolerance d ∈ [0, 1]:
- Counting problem: compute W(Q) = Σ_{S ⊆ Q} w(S) within relative error d.
- Sampling problem: draw S ⊆ Q with Pr(S) = w(S) / W(Q) within total variation distance d.
Example: for Q = {B, C, D} and d = 0.2, W(Q) = 250, so any estimate in [200, 300] is acceptable.
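To make the two queries concrete, here is a minimal brute-force sketch (not from the talk) of exact counting and sampling on the slide example; the variable names and the dictionary representation of C are illustrative assumptions.

```python
import random

# Slide example: the downward-closed collection C with its weights;
# sets are frozensets, and w(S) = 0 for any S outside C.
weights = {
    frozenset(): 80, frozenset("A"): 85, frozenset("B"): 70,
    frozenset("C"): 13, frozenset("D"): 50, frozenset("AB"): 99,
    frozenset("AC"): 60, frozenset("AD"): 90, frozenset("BC"): 11,
    frozenset("BD"): 14, frozenset("CD"): 12,
}

def count(weights, Q):
    """Exact W(Q): sum of w(S) over all S in C with S a subset of Q."""
    return sum(w for S, w in weights.items() if S <= Q)

def sample(weights, Q):
    """Draw S subset of Q with probability w(S) / W(Q)."""
    relevant = [(S, w) for S, w in weights.items() if S <= Q]
    r = random.uniform(0, sum(w for _, w in relevant))
    for S, w in relevant:
        r -= w
        if r <= 0:
            return S
    return relevant[-1][0]  # guard against floating-point rounding

print(count(weights, frozenset("BCD")))  # 250, as on slide 9
```

Both operations touch every set in C, which is exactly the cost the approximation algorithms later in the talk try to avoid.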

Slide 12: Outline (next section: Application: Bayesian network learning).

Slide 13: Bayesian network. Structure: a directed acyclic graph, with S_v denoting the parent set of node v. Each node carries conditional probabilities, and the joint distribution factorizes as p(x) = ∏_{v ∈ N} p(x_v | x_{S_v}).
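As a quick illustration of the factorization (a made-up two-node example, not from the slides):

```python
# Toy DAG A -> B: p(a, b) = p(a) * p(b | a); the numbers are invented.
p_A = {0: 0.6, 1: 0.4}
p_B_given_A = {(0, 0): 0.9, (0, 1): 0.1, (1, 0): 0.2, (1, 1): 0.8}  # (a, b) -> p

def joint(a, b):
    # p(x) = product over nodes v of p(x_v | x_{S_v}); here S_A = {}, S_B = {A}
    return p_A[a] * p_B_given_A[(a, b)]

assert abs(sum(joint(a, b) for a in (0, 1) for b in (0, 1)) - 1.0) < 1e-12
```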

Slide 14: Structure learning. Task: given data, find the probability of an arc a → b. [Figure: a data matrix of variables × samples and a queried arc.] Exact computation is infeasible for large instances, so approximate with an MCMC sampler.

Slide 15: Order-MCMC (Friedman & Koller, 2003). Sample node orderings: draw v1 ≺ v2 ≺ ... ≺ vn from the (posterior) distribution

  Pr(v1 ≺ v2 ≺ ... ≺ vn) ∝ ∏_{i=1}^{n} W_i({v1, ..., v_{i−1}}),

where W_i(Q) = Σ_{S_i ⊆ Q} w_i(S_i), C_i = { potential parent sets for v_i }, and w_i depends on the data. Evaluating one ordering thus takes n subset counting queries.

Slide 16: Order-MCMC: sample structures. Given an ordering, for every node v_i sample a parent set S_i ⊆ {v1, ..., v_{i−1}} such that Pr(S_i) = w_i(S_i) / W_i({v1, ..., v_{i−1}}). This takes n subset sampling queries.
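A hedged sketch of how one order-MCMC step maps onto these primitives, reusing the hypothetical count and sample helpers from the earlier sketch (a real implementation keeps a separate weight table w_i per node; here one shared table stands in for all of them, purely for illustration):

```python
import math

def log_order_score(order, weights):
    # log of prod_i W_i({v_1, ..., v_{i-1}}): one counting query per node
    return sum(math.log(count(weights, frozenset(order[:i])))
               for i in range(len(order)))

def sample_structure(order, weights):
    # One subset sampling query per node: parents must precede the node,
    # so Pr(S_i) = w_i(S_i) / W_i({v_1, ..., v_{i-1}})
    return {v: sample(weights, frozenset(order[:i]))
            for i, v in enumerate(order)}
```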

Slide 17: Outline (next section: Algorithms).

Slides 18–19: Algorithm outline. A set S is relevant if both S ∈ C and S ⊆ Q; W(Q) is the sum over the relevant sets.
Collector algorithm for counting: visit some sets S ⊆ N in some order, add up the weights of the relevant ones, and stop once the approximation tolerance d is met.
From counting to sampling: sample among the sets visited by the collector algorithm; the resulting distribution is within total variation distance d. A generic sketch of this loop follows below.
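A generic collector loop might look like this sketch (the (S, w, bound) stream interface is a hypothetical framing; the slides only say to stop "once the approximation tolerance d is met", so the stopping rule below is one plausible choice, picked to match the worked examples later):

```python
def collector_count(stream, Q, d):
    """`stream` yields (S, w(S), bound) with `bound` an upper bound on the
    total weight of all sets not yet visited. Returns the collected weight
    and the visited relevant sets, for use by the sampling reduction."""
    W, visited = 0.0, []
    for S, w, bound in stream:
        if S <= Q:                        # relevant: S is in C and S ⊆ Q
            W += w
            visited.append((S, w))
        if bound <= d * (W + bound):      # true W(Q) lies in [W, W + bound]
            break
    return W, visited

# Sampling reduction: draw from `visited` proportionally to w(S).
```

Sorted and Treedy below can be read as two instantiations of this loop that differ in how the candidate stream is ordered and how cheaply the bound is maintained.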

Slides 20–23: Four counting algorithms.
- Exact: an exact baseline algorithm; visits all relevant sets.
- Ideal: an idealized approximation; visits the minimum number of heaviest relevant sets.
- Sorted: an improved version of the heuristic used by Friedman and Koller in order-MCMC.
- Treedy: a novel heuristic based on tree traversal.

Slides 24–25: Sorted.
Preprocessing: sort C by weight and compute the partial sums W_1, W_2, ..., W_|C|.
Query: traverse C starting from the heaviest set, add up the relevant weights into W, and stop once the weight of the remaining sets is small enough.
Example (Q = BCD, d = 0.2), with C sorted by weight and W_i the total weight from position i onward:

  S ∈ C:  AB  AD  A   ∅   B   AC  D   BD  C   CD  BC
  w(S):   99  90  85  80  70  60  50  14  13  12  11
  W_i:    584 485 395 310 230 160 100 50  36  23  11

The relevant sets ∅, B, and D are added (W = 200); after D the remaining weight is at most 50, so the traversal stops.
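A sketch of Sorted under the assumptions above (descending weight order, suffix sums as the remaining-weight bounds, and the same stopping rule as in the collector sketch):

```python
def sorted_preprocess(weights):
    # Sort C by weight (descending); remaining[i] = total weight of items[i:]
    items = sorted(weights.items(), key=lambda sw: -sw[1])
    remaining = [0.0] * (len(items) + 1)
    for i in range(len(items) - 1, -1, -1):
        remaining[i] = remaining[i + 1] + items[i][1]
    return items, remaining

def sorted_count(items, remaining, Q, d):
    W = 0.0
    for i, (S, w) in enumerate(items):
        if S <= Q:
            W += w
        if remaining[i + 1] <= d * (W + remaining[i + 1]):
            break              # true W(Q) lies in [W, W + remaining[i+1]]
    return W
```

On the slide example this stops after visiting D: W = 200 with at most 50 weight outstanding, so the true W(Q) = 250 sits at the top of the guaranteed interval [200, 250].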

Slides 26–38: Treedy.
Preprocessing:
- Start with a lexicographical tree for C.
- Compute weight potentials φ(S): in the example, φ(S) is the total weight of the subtree rooted at S.
- Sort siblings by φ(S) and reorganize the links.
- Compute aggregate potentials ψ(S): in the example, ψ(S) is the sum of φ over S and its later siblings.
Example (weights as before; children listed in sorted order):

  ∅ (w 80, φ 584, ψ 584)
  ├─ A (w 85, φ 334, ψ 504): AB (w 99, φ 99, ψ 249), AD (w 90, φ 90, ψ 150), AC (w 60, φ 60, ψ 60)
  ├─ B (w 70, φ 95, ψ 170): BD (w 14, φ 14, ψ 25), BC (w 11, φ 11, ψ 11)
  ├─ D (w 50, φ 50, ψ 75)
  └─ C (w 13, φ 25, ψ 25): CD (w 12, φ 12, ψ 12)

Query:
- Traverse the tree greedily w.r.t. ψ.
- Add up the relevant weights into W'.
- Ignore and skip irrelevant branches.
- Stop once the weight of the remaining branches is small enough.
A worked query and a code sketch follow below.

Slides 39–43: Treedy query example (Q = BCD, d = 0.2), on the tree above:
- Visit the root ∅: relevant, so W' = 80. Branch A is skipped outright (A ⊈ Q, so nothing in its subtree can be relevant); the pending bound becomes Ψ = ψ(B) = 170.
- Visit B: W' = 150. Now pending are B's children (ψ(BD) = 25) and B's next sibling D (ψ(D) = 75), so Ψ = 100.
- Visit D, the pending branch with the largest ψ: W' = 200. Pending are ψ(BD) = 25 and ψ(C) = 25, so Ψ = 50.
- Stop: Ψ = 50 ≤ d · (W' + Ψ) = 50, so the remaining branches can be left unvisited; the true W(Q) = 250 lies in [W', W' + Ψ] = [200, 250].
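The following sketch implements this traversal under the stated assumptions (parent of S is S minus its lexicographically largest element, ψ computed as above, stopping rule as in the collector sketch). The names are illustrative, and the weights table from the first sketch is reused.

```python
import heapq

class Node:
    def __init__(self, S, w):
        self.S, self.w = S, w
        self.children = []          # will be sorted by phi, descending
        self.next_sibling = None
        self.phi = self.psi = 0.0

def treedy_preprocess(weights):
    """Build the lexicographical tree for C and fill in phi and psi."""
    nodes = {S: Node(S, w) for S, w in weights.items()}
    for S, node in nodes.items():
        if S:  # C is downward closed, so the parent S \ {max S} is in C
            nodes[S - {max(S)}].children.append(node)
    root = nodes[frozenset()]
    def fill(node):
        node.phi = node.w + sum(fill(c) for c in node.children)  # subtree weight
        node.children.sort(key=lambda c: -c.phi)
        run = 0.0
        for c in reversed(node.children):  # psi: phi of self + later siblings
            run += c.phi
            c.psi = run
        for a, b in zip(node.children, node.children[1:]):
            a.next_sibling = b
        return node.phi
    fill(root)
    root.psi = root.phi
    return root

def treedy_count(root, Q, d):
    """Greedy best-first traversal w.r.t. psi."""
    heap, W, bound = [], 0.0, 0.0
    def push(node):
        nonlocal bound
        # Skip irrelevant branch heads: every set in such a branch is a
        # superset of its head, hence also irrelevant.
        while node is not None and not node.S <= Q:
            node = node.next_sibling
        if node is not None:
            bound += node.psi
            heapq.heappush(heap, (-node.psi, id(node), node))
    push(root)
    while heap:
        _, _, node = heapq.heappop(heap)
        bound -= node.psi
        W += node.w
        push(node.children[0] if node.children else None)
        push(node.next_sibling)
        if bound <= d * (W + bound):  # remaining branches weigh little enough
            break
    return W

root = treedy_preprocess(weights)
print(treedy_count(root, frozenset("BCD"), 0.2))  # 200, as in the walkthrough
```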

Slide 44: Example summary (Q = BCD, d = 0.2). [Table: for each S ∈ C, with weights as above, the step at which each of Exact, Ideal, Sorted, and Treedy visits it.]

Slide 45: Outline (next section: Experiments).

Slide 46: Experiment setting: data.
- Artificial instances with different types of random weights: flat, steep, mixture, shuffled.
- Bayesian network learning: data sampled from the ALARM and HAILFINDER networks, plus UCI repository datasets Votes2, Chess, 10xPromoters, and Splice.

Slide 47: Experiment setting: counting queries.
- Artificial instances: random query sets of different sizes.
- Bayesian network learning: run order-MCMC using the different counting algorithms.
- The size of the sets in C is limited to at most k ∈ {3, 4, 5}, and runtime is measured as a function of the tolerance d ∈ [10⁻⁸, 0.5].

Slide 48: [Figure: runtimes of Exact, Ideal, Sorted, and Treedy for the four types of artificial weight functions (flat, steep, mixture, shuffled); n = 60, k = 5.]

Slide 49: Bayesian network learning. [Table of instance statistics: name, n, #samples, k, and #MCMC steps for Alarm, Hailfinder, Votes2, Chess, 10xPromoters, and Splice.]

Slide 50: Bayesian network learning. [Figure: runtime of order-MCMC with Exact, Ideal, Sorted, and Treedy on data generated from the ALARM network with 50, 200, 1000, and 5000 samples; n = 37, k = 5.]

Slide 51: Bayesian network learning. [Figure: runtime of order-MCMC with Exact, Ideal, Sorted, and Treedy on data generated from the HAILFINDER network with 50, 200, 1000, and 5000 samples; n = 56, k = 4.]

Slide 52: Bayesian network learning. [Figure: runtime of order-MCMC with Exact, Ideal, Sorted, and Treedy on the UCI datasets Votes2, Chess, 10xPromoters, and Splice.]

Slide 53: Conclusion.
- Problem: approximate subset counting and sampling queries.
- Application: Bayesian network structure learning.
- Two heuristic methods: Sorted and Treedy.
- Significant speed-ups over the Exact algorithm; however, still room for improvement compared to the Ideal algorithm.
- Implementation (C++): cs.helsinki.fi/u/tzniinim/uai2013


Slide 55 (backup): Bayesian network learning. [Figure: runtime of Exact, Ideal, Sorted, and Treedy on Votes2, Chess, 10xPromoters, and Splice as a function of query set size.]
