Inference in Bayesian networks


Chapter 14.4-5

Outline
- Exact inference by enumeration
- Approximate inference by stochastic simulation

Inference tasks
Simple queries: compute the posterior marginal P(X_i | E = e), e.g., P(NoGas | Gauge = empty, Lights = on, Starts = false)
Conjunctive queries: P(X_i, X_j | E = e) = P(X_i | E = e) P(X_j | X_i, E = e)
Optimal decisions: decision networks include utility information; probabilistic inference is required for P(outcome | action, evidence)
Value of information: which evidence to seek next?
Sensitivity analysis: which probability values are most critical?
Explanation: why do I need a new starter motor?

Inference by enumeration
A slightly intelligent way to sum out variables from the joint without actually constructing its explicit representation.
Simple query on the burglary network (variables B, E, A, J, M):
P(B | j, m) = P(B, j, m) / P(j, m) = α P(B, j, m) = α Σ_e Σ_a P(B, e, a, j, m)
Rewrite full joint entries using products of CPT entries:
P(B | j, m) = α Σ_e Σ_a P(B) P(e) P(a | B, e) P(j | a) P(m | a)
            = α P(B) Σ_e P(e) Σ_a P(a | B, e) P(j | a) P(m | a)
Recursive depth-first enumeration: O(n) space, O(d^n) time

Enumeration algorithm

function Enumeration-Ask(X, e, bn) returns a distribution over X
   inputs: X, the query variable
           e, observed values for variables E
           bn, a Bayesian network with variables {X} ∪ E ∪ Y
   Q(X) ← a distribution over X, initially empty
   for each value x_i of X do
       extend e with value x_i for X
       Q(x_i) ← Enumerate-All(Vars[bn], e)
   return Normalize(Q(X))

function Enumerate-All(vars, e) returns a real number
   if Empty?(vars) then return 1.0
   Y ← First(vars)
   if Y has value y in e
       then return P(y | Pa(Y)) × Enumerate-All(Rest(vars), e)
       else return Σ_y P(y | Pa(Y)) × Enumerate-All(Rest(vars), e_y)
            where e_y is e extended with Y = y
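
A minimal Python rendering of Enumeration-Ask and Enumerate-All follows (a sketch, not the book's code): the network is passed in as a topologically ordered variable list, a parents map, and CPTs giving P(var = true | parents). The burglary CPT numbers in the usage example are the standard textbook values and are an assumption here, since the slide above does not list them.

def enumeration_ask(X, e, varlist, parents, cpt):
    """Return the normalized distribution P(X | e) by full enumeration."""
    def p(var, value, event):
        # P(var = value | parents(var)), with parent values read from event
        pt = cpt[var][tuple(event[par] for par in parents[var])]
        return pt if value else 1.0 - pt

    def enumerate_all(remaining, event):
        if not remaining:
            return 1.0
        Y, rest = remaining[0], remaining[1:]
        if Y in event:                          # evidence (or query) variable: just multiply
            return p(Y, event[Y], event) * enumerate_all(rest, event)
        total = 0.0
        for y in (True, False):                 # hidden variable: sum it out
            extended = {**event, Y: y}
            total += p(Y, y, extended) * enumerate_all(rest, extended)
        return total

    q = {x: enumerate_all(varlist, {**e, X: x}) for x in (True, False)}
    norm = sum(q.values())
    return {x: v / norm for x, v in q.items()}

# Usage on the burglary network (CPT values assumed from the standard textbook example):
varlist = ["B", "E", "A", "J", "M"]
parents = {"B": [], "E": [], "A": ["B", "E"], "J": ["A"], "M": ["A"]}
cpt = {"B": {(): 0.001}, "E": {(): 0.002},
       "A": {(True, True): 0.95, (True, False): 0.94,
             (False, True): 0.29, (False, False): 0.001},
       "J": {(True,): 0.90, (False,): 0.05},
       "M": {(True,): 0.70, (False,): 0.01}}
print(enumeration_ask("B", {"J": True, "M": True}, varlist, parents, cpt))
# roughly {True: 0.284, False: 0.716}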

Complexity of exact inference
Multiply connected networks:
- can reduce 3SAT to exact inference, hence NP-hard
- equivalent to counting 3SAT models, hence #P-complete
[Figure: the 3SAT-to-Bayes-net reduction, with root variables A, B, C, D (prior 0.5 each) feeding clause nodes such as 1. A v B v C, 2. C v D v ¬A, 3. B v C v ¬D, which in turn feed an AND node.]

Inference by stochastic simulation
Basic idea:
1) Draw N samples from a sampling distribution S
2) Compute an approximate posterior probability P̂
3) Show this converges to the true probability P
Outline:
- Sampling from an empty network
- Rejection sampling: reject samples disagreeing with evidence
- Likelihood weighting: use evidence to weight samples
[Figure: a coin with P(heads) = 0.5]
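
The three steps can be seen in miniature with the coin from the slide. Here is a self-contained Python sketch (the function name and sample count are mine) that draws N samples from a fair coin and checks that the estimate approaches the true probability:

import random

def estimate_heads(p_heads=0.5, n=100_000):
    """Draw n samples from the coin and return the fraction that came up heads."""
    heads = sum(random.random() < p_heads for _ in range(n))
    return heads / n

print(estimate_heads())   # approaches 0.5 as n grows, i.e. the estimate is consistent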

Sampling from an empty network

function Prior-Sample(bn) returns an event sampled from bn
   inputs: bn, a belief network specifying joint distribution P(X_1, ..., X_n)
   x ← an event with n elements
   for i = 1 to n do
       x_i ← a random sample from P(X_i | parents(X_i)) given the values of Parents(X_i) in x
   return x
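
A minimal Python sketch of Prior-Sample on the sprinkler network used in the example slides below (Cloudy, Sprinkler, Rain, WetGrass). Most CPT entries appear on the slides (0.5, .10, .80, .20, .99, .01, and the .90 implied by the 0.324 computation two slides later); P(Sprinkler | ¬Cloudy) = 0.50 and P(WetGrass | Sprinkler, ¬Rain) = 0.90 are assumed from the standard textbook version of the network.

import random

VARS = ["Cloudy", "Sprinkler", "Rain", "WetGrass"]          # topological order
PARENTS = {"Cloudy": [], "Sprinkler": ["Cloudy"], "Rain": ["Cloudy"],
           "WetGrass": ["Sprinkler", "Rain"]}
CPT = {  # tuple of parent values -> P(variable = true | parents)
    "Cloudy":    {(): 0.5},
    "Sprinkler": {(True,): 0.10, (False,): 0.50},
    "Rain":      {(True,): 0.80, (False,): 0.20},
    "WetGrass":  {(True, True): 0.99, (True, False): 0.90,
                  (False, True): 0.90, (False, False): 0.01},
}

def p(var, value, event):
    """P(var = value | parents(var)), with parent values read from event."""
    pt = CPT[var][tuple(event[parent] for parent in PARENTS[var])]
    return pt if value else 1.0 - pt

def prior_sample():
    """Prior-Sample: sample each variable in topological order given its sampled parents."""
    event = {}
    for var in VARS:
        event[var] = random.random() < p(var, True, event)
    return event

print(prior_sample())   # e.g. {'Cloudy': True, 'Sprinkler': False, 'Rain': True, 'WetGrass': True}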

Example
[Figure residue from seven example slides: each repeats the sprinkler network (Cloudy → Sprinkler, Rain; Sprinkler, Rain → WetGrass) with its CPTs and highlights one step of Prior-Sample, sampling the event analysed on the next slide, Cloudy = true, Sprinkler = false, Rain = true, WetGrass = true, one variable at a time.]

Sampling from an empty network contd.
Probability that Prior-Sample generates a particular event:
S_PS(x_1 ... x_n) = Π_{i=1..n} P(x_i | parents(X_i)) = P(x_1 ... x_n)
i.e., the true prior probability
E.g., S_PS(t, f, t, t) = 0.5 × 0.9 × 0.8 × 0.9 = 0.324 = P(t, f, t, t)
Let N_PS(x_1 ... x_n) be the number of samples generated for the event x_1, ..., x_n.
Then we have
lim_{N→∞} P̂(x_1, ..., x_n) = lim_{N→∞} N_PS(x_1, ..., x_n)/N = S_PS(x_1, ..., x_n) = P(x_1 ... x_n)
That is, estimates derived from Prior-Sample are consistent.
Shorthand: P̂(x_1, ..., x_n) ≈ P(x_1 ... x_n)

Rejection sampling
P̂(X | e) is estimated from samples agreeing with e

function Rejection-Sampling(X, e, bn, N) returns an estimate of P(X | e)
   local variables: N, a vector of counts over X, initially zero
   for j = 1 to N do
       x ← Prior-Sample(bn)
       if x is consistent with e then
           N[x] ← N[x] + 1 where x is the value of X in x
   return Normalize(N[X])

E.g., estimate P(Rain | Sprinkler = true, WetGrass = true) using 100 samples:
27 samples have Sprinkler = true, WetGrass = true;
of these, 8 have Rain = true and 19 have Rain = false.
P̂(Rain | Sprinkler = true, WetGrass = true) = Normalize(⟨8, 19⟩) = ⟨0.296, 0.704⟩
Similar to a basic real-world empirical estimation procedure
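
A minimal sketch of rejection sampling in Python, reusing the sprinkler network and prior_sample from the Prior-Sample sketch above; the query mirrors the slide's example.

from collections import Counter

# Assumes VARS, PARENTS, CPT, p and prior_sample from the Prior-Sample sketch above.

def rejection_sampling(query_var, evidence, n=10_000):
    """Estimate P(query_var | evidence) by discarding samples that disagree with the evidence."""
    counts = Counter()
    for _ in range(n):
        sample = prior_sample()
        if all(sample[var] == val for var, val in evidence.items()):
            counts[sample[query_var]] += 1
    total = sum(counts.values()) or 1      # guard against every sample being rejected
    return {value: counts[value] / total for value in (True, False)}

# The slide's 100-sample run gave roughly <0.296, 0.704> for this query.
print(rejection_sampling("Rain", {"Sprinkler": True, "WetGrass": True}))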

Analysis of rejection sampling
P̂(X | e) = α N_PS(X, e)            (algorithm defn.)
         = N_PS(X, e) / N_PS(e)    (normalized by N_PS(e))
         ≈ P(X, e) / P(e)          (property of Prior-Sample)
         = P(X | e)                (defn. of conditional probability)
Hence rejection sampling returns consistent posterior estimates.
Problem: hopelessly expensive if P(e) is small
P(e) drops off exponentially with the number of evidence variables!

Likelihood weighting
Idea: fix the evidence variables, sample only the nonevidence variables, and weight each sample by the likelihood it accords the evidence.

function Likelihood-Weighting(X, e, bn, N) returns an estimate of P(X | e)
   local variables: W, a vector of weighted counts over X, initially zero
   for j = 1 to N do
       x, w ← Weighted-Sample(bn, e)
       W[x] ← W[x] + w where x is the value of X in x
   return Normalize(W[X])

function Weighted-Sample(bn, e) returns an event and a weight
   x ← an event with n elements; w ← 1
   for i = 1 to n do
       if X_i has a value x_i in e
           then w ← w × P(X_i = x_i | parents(X_i))
           else x_i ← a random sample from P(X_i | parents(X_i))
   return x, w
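
A minimal Python sketch of Weighted-Sample and Likelihood-Weighting, again reusing the sprinkler network helpers (VARS, PARENTS, CPT, p) from the Prior-Sample sketch above.

import random
from collections import defaultdict

# Assumes VARS, PARENTS, CPT and p from the Prior-Sample sketch above.

def weighted_sample(evidence):
    """Fix evidence variables, sample the rest in topological order, return (event, weight)."""
    event, weight = {}, 1.0
    for var in VARS:
        if var in evidence:
            event[var] = evidence[var]
            weight *= p(var, evidence[var], event)      # multiply in the evidence likelihood
        else:
            event[var] = random.random() < p(var, True, event)
    return event, weight

def likelihood_weighting(query_var, evidence, n=10_000):
    """Estimate P(query_var | evidence) from weighted counts of the sampled query values."""
    weights = defaultdict(float)
    for _ in range(n):
        event, weight = weighted_sample(evidence)
        weights[event[query_var]] += weight
    total = sum(weights.values()) or 1.0
    return {value: weights[value] / total for value in (True, False)}

print(likelihood_weighting("Rain", {"Sprinkler": True, "WetGrass": True}))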

Likelihood weighting example
[Figure residue from seven example slides: Weighted-Sample runs on the sprinkler network with evidence Sprinkler = true and WetGrass = true. Cloudy and Rain are sampled (here Cloudy = true, Rain = true), while each evidence variable multiplies its likelihood into the weight: w = 1.0 × 0.1 × 0.99 = 0.099.]

Likelihood weighting analysis
Sampling probability for Weighted-Sample is
S_WS(z, e) = Π_{i=1..l} P(z_i | parents(Z_i))
Note: it pays attention to evidence in ancestors only, so it lies somewhere in between the prior and the posterior distribution.
Weight for a given sample z, e is
w(z, e) = Π_{i=1..m} P(e_i | parents(E_i))
Weighted sampling probability is
S_WS(z, e) w(z, e) = Π_{i=1..l} P(z_i | parents(Z_i)) × Π_{i=1..m} P(e_i | parents(E_i)) = P(z, e)
(by the standard global semantics of the network)
Hence likelihood weighting returns consistent estimates,
but performance still degrades with many evidence variables,
because a few samples then have nearly all the total weight.
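
As a concrete check, using the CPT entries from the example slides above: for z = (Cloudy = true, Rain = true) and e = (Sprinkler = true, WetGrass = true), the sample is drawn with probability S_WS(z, e) = P(c) P(r | c) = 0.5 × 0.8 = 0.4 and carries weight w(z, e) = P(s | c) P(w | s, r) = 0.1 × 0.99 = 0.099, so S_WS(z, e) w(z, e) = 0.4 × 0.099 = 0.0396 = P(c, s, r, w), exactly the joint probability the network assigns to that event.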

Summary
Exact inference by enumeration: NP-hard on general graphs
Approximate inference by LW (likelihood weighting):
- LW does poorly when there is lots of (downstream) evidence
- LW is generally insensitive to topology
- Convergence can be very slow with probabilities close to 1 or 0
- Can handle arbitrary combinations of discrete and continuous variables