Probabilistic Partial Evaluation: Exploiting rule structure in probabilistic inference

Probabilistic Partial Evaluation: Exploiting rule structure in probabilistic inference. David Poole, University of British Columbia.

Overview: Belief Networks; Variable Elimination Algorithm; Parent Contexts & Structured Representations; Structure-preserving Inference; Conclusion.

Belief (Bayesian) Networks. $P(x_1,\ldots,x_n) = \prod_{i=1}^{n} P(x_i \mid x_{i-1},\ldots,x_1) = \prod_{i=1}^{n} P(x_i \mid \pi_{x_i})$, where $\pi_{x_i}$ are the parents of $x_i$: a set of variables such that the other predecessors are independent of $x_i$ given its parents.

Variable Elimination Algorithm. Given: a Bayesian network, a query variable, observations, and an elimination ordering on the remaining variables. 1. Set the observed variables. 2. Sum out variables according to the elimination ordering. 3. Renormalize.
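The sketch below shows the table-based version of this "sum out" step in Python. It is illustrative only, not the paper's code: the factor representation (a pair of a variable list and a table keyed by value tuples) and the helper names multiply, sum_out and eliminate are assumptions made for the example.

```python
from itertools import product

# A factor is a pair (variables, table): 'variables' is a list of names and
# 'table' maps a tuple of truth values (one per variable, in order) to a number.

def multiply(f1, f2):
    """Pointwise product of two factors, defined on the union of their variables."""
    vars1, t1 = f1
    vars2, t2 = f2
    out_vars = vars1 + [v for v in vars2 if v not in vars1]
    table = {}
    for vals in product([True, False], repeat=len(out_vars)):
        asg = dict(zip(out_vars, vals))
        table[vals] = (t1[tuple(asg[v] for v in vars1)]
                       * t2[tuple(asg[v] for v in vars2)])
    return out_vars, table

def sum_out(var, factor):
    """Marginalize var out of a factor by adding its two instantiations."""
    vars_, t = factor
    i = vars_.index(var)
    out_vars = vars_[:i] + vars_[i + 1:]
    table = {}
    for vals, p in t.items():
        key = vals[:i] + vals[i + 1:]
        table[key] = table.get(key, 0.0) + p
    return out_vars, table

def eliminate(var, factors):
    """One VE step: multiply the factors that mention var, then sum var out."""
    touching = [f for f in factors if var in f[0]]
    rest = [f for f in factors if var not in f[0]]
    if not touching:
        return factors
    prod = touching[0]
    for f in touching[1:]:
        prod = multiply(prod, f)
    return rest + [sum_out(var, prod)]
```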

Summing Out a Variable. [Figure: a fragment of a belief network in which h is a parent of e; c, d, e are parents of a; and e, f, g are parents of b.] Sum out e: $\sum_{e} P(a \mid c,d,e)\, P(b \mid e,f,g)\, P(e \mid h) = P(a,b \mid c,d,f,g,h)$.
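For a binary variable e, this sum-out operation simply expands into two terms, one per value of e:

$$\sum_{e} P(a \mid c,d,e)\,P(b \mid e,f,g)\,P(e \mid h) = P(a \mid c,d,e{=}\mathrm{true})\,P(b \mid e{=}\mathrm{true},f,g)\,P(e{=}\mathrm{true} \mid h) + P(a \mid c,d,e{=}\mathrm{false})\,P(b \mid e{=}\mathrm{false},f,g)\,P(e{=}\mathrm{false} \mid h).$$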

Structured Probability Tables. $P(a \mid c,d,e)$ is represented as a decision tree: split on d; when d is true, split on e, giving probabilities p1 (e true) and p2 (e false); when d is false, split on c, giving p3 (c true) and p4 (c false). Similarly, $P(b \mid e,f,g)$ splits on f; when f is true, split on e, giving p5 and p6; when f is false, split on g, giving p7 and p8. For example, $p_2 = P(a{=}t \mid d{=}t, e{=}f)$.

Eliminating e, preserving structure. We only need to consider a & b together when d = true ∧ f = true; in this context c & g are irrelevant. In all other contexts we can consider a & b separately. When d = false ∧ f = false, e is irrelevant; in this context the probabilities shouldn't be affected by eliminating e.

Contextual Independence. Given a set of variables C, a context on C is an assignment of one value to each variable in C. Suppose X, Y and C are disjoint sets of variables. X and Y are contextually independent given context $c \in \mathrm{val}(C)$ if $P(X \mid Y{=}y_1 \wedge C{=}c) = P(X \mid Y{=}y_2 \wedge C{=}c)$ for all $y_1, y_2 \in \mathrm{val}(Y)$ such that $P(y_1 \mid c) > 0$ and $P(y_2 \mid c) > 0$.

Parent Contexts. A parent context for variable $x_i$ is a context c on a subset of the predecessors of $x_i$ such that $x_i$ is contextually independent of the other predecessors given c. For variable $x_i$ and assignment $x_{i-1}{=}v_{i-1},\ldots,x_1{=}v_1$ of values to its preceding variables, there is a parent context $\pi_{x_i}^{v_{i-1}\ldots v_1}$. Then $P(x_1{=}v_1,\ldots,x_n{=}v_n) = \prod_{i=1}^{n} P(x_i{=}v_i \mid x_{i-1}{=}v_{i-1},\ldots,x_1{=}v_1) = \prod_{i=1}^{n} P(x_i{=}v_i \mid \pi_{x_i}^{v_{i-1}\ldots v_1})$.

Idea behind probabilistic partial evaluation. Maintain rules that are statements of probabilities in contexts. When eliminating a variable, you can ignore all rules that don't involve that variable. This wins when a variable appears in only a few parent contexts. Eliminating a variable looks like resolution!

Rule-based representation of our example:
a ← d ∧ e : p1
a ← d ∧ ¬e : p2
a ← ¬d ∧ c : p3
a ← ¬d ∧ ¬c : p4
b ← f ∧ e : p5
b ← f ∧ ¬e : p6
b ← ¬f ∧ g : p7
b ← ¬f ∧ ¬g : p8
e ← h : p9
e ← ¬h : p10
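One way to encode such a rule base in code is sketched below. The encoding is an illustrative assumption (not the paper's implementation): a rule "head ← body : p" becomes a tuple (head, body, p) where head and body are dicts of variable assignments, and the numeric probabilities are placeholders.

```python
# Illustrative encoding of the example rule base.
# A rule "head <- body : p" is (head, body, p); head and body assign truth
# values to variables. The values of p1..p10 below are placeholders only.
p1, p2, p3, p4, p5 = 0.9, 0.2, 0.8, 0.1, 0.7
p6, p7, p8, p9, p10 = 0.3, 0.6, 0.4, 0.5, 0.25

rules = [
    ({"a": True}, {"d": True,  "e": True},  p1),
    ({"a": True}, {"d": True,  "e": False}, p2),
    ({"a": True}, {"d": False, "c": True},  p3),
    ({"a": True}, {"d": False, "c": False}, p4),
    ({"b": True}, {"f": True,  "e": True},  p5),
    ({"b": True}, {"f": True,  "e": False}, p6),
    ({"b": True}, {"f": False, "g": True},  p7),
    ({"b": True}, {"f": False, "g": False}, p8),
    ({"e": True}, {"h": True},  p9),
    ({"e": True}, {"h": False}, p10),
]
```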

Eliminating e:
a ← d ∧ e : p1
a ← d ∧ ¬e : p2
b ← f ∧ e : p5
b ← f ∧ ¬e : p6
e ← h : p9
e ← ¬h : p10
The rules a ← ¬d ∧ c : p3, a ← ¬d ∧ ¬c : p4, b ← ¬f ∧ g : p7 and b ← ¬f ∧ ¬g : p8 do not mention e and are unaffected by eliminating e.

Variable partial evaluation. If we are eliminating e and have rules
x ← y ∧ e : p1
x ← y ∧ ¬e : p2
e ← z : p3
where no other rules compatible with y contain e in the body, and y & z are compatible contexts, we create the rule
x ← y ∧ z : p1·p3 + p2·(1 − p3).
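As a small check of this combination formula, here is a hypothetical helper (not from the paper) that returns the probability of the new rule from p1, p2 and p3:

```python
def partial_evaluate(p1, p2, p3):
    """Combine  x <- y & e : p1,  x <- y & ~e : p2,  e <- z : p3  when e is
    being eliminated, giving  x <- y & z : p1*p3 + p2*(1 - p3).
    Either e is true (probability p3) and x holds with probability p1,
    or e is false (probability 1 - p3) and x holds with probability p2."""
    return p1 * p3 + p2 * (1 - p3)

# Example: p1 = 0.9, p2 = 0.2, p3 = 0.5  ->  0.9*0.5 + 0.2*0.5 = 0.55
assert abs(partial_evaluate(0.9, 0.2, 0.5) - 0.55) < 1e-12
```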

Splitting Rules. A rule a ← b : p1 can be split on variable d, forming rules a ← b ∧ d : p1 and a ← b ∧ ¬d : p1.
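A sketch of this splitting operation, using the same illustrative (head, body, probability) encoding as above:

```python
def split_rule(rule, var):
    """Split a rule  head <- body : p  on a variable not in its body, producing
    two rules with the same probability, one for each value of var."""
    head, body, p = rule
    assert var not in body
    return [(head, {**body, var: True},  p),
            (head, {**body, var: False}, p)]

# a <- b : p1  splits on d into  a <- b & d : p1  and  a <- b & ~d : p1
print(split_rule(({"a": True}, {"b": True}, 0.3), "d"))
```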

Why Split? If there are different contexts for a rule given e and for a rule given ¬e, you need to split the contexts to make them directly comparable. For example, the rule a ← b ∧ e : p1 is split on c into a ← b ∧ c ∧ e : p1 and a ← b ∧ ¬c ∧ e : p1, so that these can be compared with a ← b ∧ c ∧ ¬e : p2 and a ← b ∧ ¬c ∧ ¬e : p3.

Combining Heads. Rules a ← c : p1 and b ← c : p2, where a and b refer to different variables, can be combined, producing a ∧ b ← c : p1·p2. Thus in the context with a, b, and c all true, the latter rule can be used instead of the first two.
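A sketch of head combination under the same illustrative encoding (since heads are dicts, a combined head can hold more than one literal):

```python
def combine_heads(rule1, rule2):
    """Combine  a <- c : p1  and  b <- c : p2  (identical bodies, heads on
    different variables) into  a & b <- c : p1 * p2."""
    head1, body1, p1 = rule1
    head2, body2, p2 = rule2
    assert body1 == body2, "bodies (contexts) must be identical"
    assert not set(head1) & set(head2), "heads must mention different variables"
    return ({**head1, **head2}, body1, p1 * p2)

# a <- c : 0.9  and  b <- c : 0.8  combine to  a & b <- c : 0.72
print(combine_heads(({"a": True}, {"c": True}, 0.9),
                    ({"b": True}, {"c": True}, 0.8)))
```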

Splitting Compatible Bodies. To give the rules for a and for b compatible bodies, split them:
a ← d ∧ e : p1 becomes a ← d ∧ f ∧ e : p1 and a ← d ∧ ¬f ∧ e : p1
b ← f ∧ e : p5 becomes b ← d ∧ f ∧ e : p5 and b ← ¬d ∧ f ∧ e : p5
a ← d ∧ ¬e : p2 becomes a ← d ∧ f ∧ ¬e : p2 and a ← d ∧ ¬f ∧ ¬e : p2
b ← f ∧ ¬e : p6 becomes b ← d ∧ f ∧ ¬e : p6 and b ← ¬d ∧ f ∧ ¬e : p6
e ← h : p9 and e ← ¬h : p10 remain.

Combining Rules. The rules whose bodies are now identical,
a ← d ∧ f ∧ e : p1 and b ← d ∧ f ∧ e : p5,
a ← d ∧ f ∧ ¬e : p2 and b ← d ∧ f ∧ ¬e : p6,
have their heads combined, and e is then summed out of the result using e ← h : p9 and e ← ¬h : p10. The remaining split rules (with contexts d ∧ ¬f and ¬d ∧ f) are handled separately.

Result of eliminating e. The resultant rules encode the probabilities of {a, b} in the contexts d ∧ f ∧ h and d ∧ f ∧ ¬h. For all other contexts we consider a and b separately. The resulting number of rules is 24. A tree-structured probability table for $P(a, b \mid c, d, f, g, h, i)$ has 72 leaves (the same as the number of rules if a and b were combined in all contexts). VE has a table of size 256.

Evidence. We can set the values of all evidence variables before summing out the remaining non-query variables. Suppose $e_1{=}o_1 \wedge \ldots \wedge e_s{=}o_s$ is observed:
Remove any rule that contains $e_i{=}o_i'$, where $o_i' \neq o_i$, in the body.
Remove any term $e_i{=}o_i$ from the body of a rule.
Replace any $e_i{=}o_i'$, where $o_i' \neq o_i$, in the head of a rule by false.
Replace any $e_i{=}o_i$ in the head of a rule by true.
In rule heads, use true ∧ a ≡ a and false ∧ a ≡ false.
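A sketch of these evidence operations on the illustrative rule encoding used above (the function name and representation are assumptions, not the paper's code):

```python
def apply_evidence(rules, evidence):
    """Condition a rule base on observations, following the steps above.
    A rule is (head, body, prob); head and body map variables to values.
    Heads may simplify to the constants True or False."""
    result = []
    for head, body, prob in rules:
        # Remove any rule whose body contradicts an observation.
        if any(v in evidence and body[v] != evidence[v] for v in body):
            continue
        # Remove observed terms from the body (they are now satisfied).
        new_body = {v: val for v, val in body.items() if v not in evidence}
        # Replace observed literals in the head by true/false and simplify
        # the conjunction:  true & a = a,  false & a = false.
        if any(v in evidence and head[v] != evidence[v] for v in head):
            new_head = False
        else:
            new_head = {v: val for v, val in head.items() if v not in evidence}
            new_head = new_head if new_head else True
        result.append((new_head, new_body, prob))
    return result
```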

Conclusions. A new notion of parent contexts and a rule-based representation for Bayesian networks. A new algorithm for probabilistic inference that preserves rule structure. It exploits more structure than tree-based representations of conditional probability and allows for finer-grained approximation than in a Bayesian network.