
Probabilistic Reasoning Systems
Dr. Richard J. Povinelli
Copyright Richard J. Povinelli, rev 1.0, 2001

Objectives
You should
- be able to apply belief networks to model a problem with uncertainty.
- be able to briefly describe other approaches to modeling uncertainty, such as Dempster-Shafer theory and fuzzy sets/logic.

Outline
- Conditional independence
- Bayesian networks: syntax and semantics
- Exact inference: by enumeration, by variable elimination
- Approximate inference: by stochastic simulation, by Markov chain Monte Carlo

Independence
Two random variables A and B are (absolutely) independent iff
P(A | B) = P(A), or equivalently P(A, B) = P(A | B) P(B) = P(A) P(B);
for example, A and B might be two coin tosses.
If n Boolean variables are independent, the full joint distribution is
P(X1, ..., Xn) = Π_i P(Xi),
and hence can be specified by just n numbers.
Absolute independence is a very strong requirement, seldom met.

Conditional independence
Consider the dentist problem with three random variables: Toothache, Cavity, Catch (the steel probe catches in my tooth).
The full joint distribution has 2^3 − 1 = 7 independent entries.
If I have a cavity, the probability that the probe catches in it doesn't depend on whether I have a toothache:
(1) P(Catch | Toothache, Cavity) = P(Catch | Cavity)
i.e., Catch is conditionally independent of Toothache given Cavity.
The same independence holds if I haven't got a cavity:
(2) P(Catch | Toothache, ¬Cavity) = P(Catch | ¬Cavity)

Conditional independence II
Equivalent statements to (1):
(1a) P(Toothache | Catch, Cavity) = P(Toothache | Cavity). Why?
(1b) P(Toothache, Catch | Cavity) = P(Toothache | Cavity) P(Catch | Cavity). Why?
The full joint distribution can now be written as
P(Toothache, Catch, Cavity) = P(Toothache, Catch | Cavity) P(Cavity)
= P(Toothache | Cavity) P(Catch | Cavity) P(Cavity),
i.e., 2 + 2 + 1 = 5 independent numbers (equations 1 and 2 remove 2).
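To make the 5-number decomposition concrete, here is a minimal Python sketch; the slides give no probability values, so the numbers below are invented purely for illustration. It assembles the full joint from P(Cavity), P(Toothache | Cavity), and P(Catch | Cavity), then checks equation (1) numerically.

```python
# Illustrative (invented) numbers: the 5 independent parameters.
p_cavity = 0.2                           # P(Cavity = true)
p_toothache = {True: 0.6, False: 0.1}    # P(Toothache = true | Cavity)
p_catch = {True: 0.9, False: 0.2}        # P(Catch = true | Cavity)

def joint(t, c, cav):
    """P(Toothache, Catch, Cavity) = P(T | Cav) P(C | Cav) P(Cav)."""
    p = p_cavity if cav else 1 - p_cavity
    p *= p_toothache[cav] if t else 1 - p_toothache[cav]
    p *= p_catch[cav] if c else 1 - p_catch[cav]
    return p

# Equation (1): P(Catch | Toothache, Cavity) should equal P(Catch | Cavity).
num = joint(True, True, True)                           # P(T, C, Cav)
den = sum(joint(True, c, True) for c in (True, False))  # P(T, Cav)
print(num / den)      # 0.9
print(p_catch[True])  # 0.9 -- the independence holds by construction
```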

Conditional independence III
Equivalent statements to (1):
(1a) P(Toothache | Catch, Cavity) = P(Toothache | Cavity). Why?
P(Toothache | Catch, Cavity)
= P(Catch | Toothache, Cavity) P(Toothache | Cavity) / P(Catch | Cavity)   (Bayes' rule)
= P(Catch | Cavity) P(Toothache | Cavity) / P(Catch | Cavity)   (from 1)
= P(Toothache | Cavity)
(1b) P(Toothache, Catch | Cavity) = P(Toothache | Cavity) P(Catch | Cavity). Why?
P(Toothache, Catch | Cavity)
= P(Toothache | Catch, Cavity) P(Catch | Cavity)   (product rule)
= P(Toothache | Cavity) P(Catch | Cavity)   (from 1a)

Belief networks
A simple, graphical notation for conditional independence assertions, and hence for compact specification of full joint distributions.
Syntax:
- a set of nodes, one per variable
- a directed, acyclic graph (a link means "directly influences")
- a conditional distribution for each node given its parents: P(Xi | Parents(Xi))
In the simplest case, the conditional distribution is represented as a conditional probability table (CPT).

Example
I'm at work, neighbor John calls to say my alarm is ringing, but neighbor Mary doesn't call. Sometimes it's set off by minor earthquakes. Is there a burglar?
Variables: Burglary, Earthquake, Alarm, JohnCalls, MaryCalls.
Network topology reflects causal knowledge.
Note: with at most k parents per node, a network over n d-valued variables needs O(d^k · n) numbers vs. O(d^n) for the full joint.

Semantics
Global semantics defines the full joint distribution as the product of the local conditional distributions:
P(X1, ..., Xn) = Π_{i=1..n} P(Xi | Parents(Xi))
e.g., P(J ∧ M ∧ A ∧ ¬B ∧ ¬E)
= P(¬B) P(¬E) P(A | ¬B, ¬E) P(J | A) P(M | A)
Local semantics: each node is conditionally independent of its nondescendants given its parents.
Theorem: local semantics ⇔ global semantics.

Markov blanket
Each node is conditionally independent of all others given its Markov blanket: parents + children + children's parents.
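As a quick numeric check of the global semantics, the sketch below evaluates the product P(J ∧ M ∧ A ∧ ¬B ∧ ¬E) using the standard AIMA CPT values for the burglary network; the transcript omits the tables, so these numbers are assumed.

```python
# Standard AIMA burglary-network numbers (assumed; not in the transcript).
p_not_b = 0.999   # P(¬Burglary)
p_not_e = 0.998   # P(¬Earthquake)
p_a     = 0.001   # P(Alarm | ¬Burglary, ¬Earthquake)
p_j     = 0.90    # P(JohnCalls | Alarm)
p_m     = 0.70    # P(MaryCalls | Alarm)

# Global semantics: the joint entry is the product of the local conditionals.
p = p_not_b * p_not_e * p_a * p_j * p_m
print(p)  # ~0.000628: alarm sounds with no burglary and no earthquake,
          # and both John and Mary call
```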

Constructing belief networks
We need a method such that a series of locally testable assertions of conditional independence guarantees the required global semantics.
1. Choose an ordering of variables X1, ..., Xn.
2. For i = 1 to n: add Xi to the network, selecting parents from X1, ..., Xi−1 such that
   P(Xi | Parents(Xi)) = P(Xi | X1, ..., Xi−1).
This choice of parents guarantees the global semantics:
P(X1, ..., Xn) = Π_{i=1..n} P(Xi | X1, ..., Xi−1)   (chain rule)
= Π_{i=1..n} P(Xi | Parents(Xi))   (by construction)

Example
Suppose we choose the ordering MaryCalls, JohnCalls, Alarm, Burglary, Earthquake.
P(J | M) = P(J)? No
P(A | J, M) = P(A | J)? No. P(A | J, M) = P(A)? No
P(B | A, J, M) = P(B | A)? Yes
P(B | A, J, M) = P(B)? No
P(E | B, A, J, M) = P(E | A)? No
P(E | B, A, J, M) = P(E | A, B)? Yes
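Why the ordering matters: the parent sets implied by the yes/no answers above give a larger network than the causal ordering does. Below is a small sketch that counts the CPT entries each ordering requires; the diagnostic parent sets are inferred from the questions above (and match the standard AIMA example), so treat them as an assumption.

```python
def cpt_entries(parents):
    """Independent numbers needed for Boolean variables:
    a node with k parents needs 2**k CPT entries."""
    return sum(2 ** len(p) for p in parents.values())

# Causal ordering B, E, A, J, M -- the burglary network itself.
causal = {"B": [], "E": [], "A": ["B", "E"], "J": ["A"], "M": ["A"]}

# Diagnostic ordering M, J, A, B, E, with parents chosen as the
# yes/no questions above dictate (assumed, per the AIMA example).
diagnostic = {"M": [], "J": ["M"], "A": ["J", "M"],
              "B": ["A"], "E": ["A", "B"]}

print(cpt_entries(causal))      # 10
print(cpt_entries(diagnostic))  # 13 -- more numbers, less natural structure
```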

CAT: Belief Network
Using the previous example, make a better belief network. Why is yours better? Work with a partner for 5 minutes.

Example: Car diagnosis
Initial evidence: the engine won't start.
Testable variables (thin ovals), diagnosis variables (thick ovals).
Hidden variables (shaded) ensure sparse structure and reduce parameters.

Example: Car insurance
Predict claim costs (medical, liability, property) given data on the application form (other unshaded nodes).

Inference tasks
- Simple queries: compute the posterior marginal P(Xi | E = e), e.g., P(NoGas | Gauge = empty, Lights = on, Starts = false).
- Conjunctive queries: P(Xi, Xj | E = e) = P(Xi | E = e) P(Xj | Xi, E = e).
- Optimal decisions: decision networks include utility information; probabilistic inference is required for P(outcome | action, evidence).
- Value of information: which evidence to seek next?
- Sensitivity analysis: which probability values are most critical?
- Explanation: why do I need a new starter motor?

Inference by enumeration
A slightly intelligent way to sum out variables from the joint without actually constructing its explicit representation.
Simple query on the burglary network:
P(B | J = true, M = true)
= P(B, J = true, M = true) / P(J = true, M = true)
= α P(B, J = true, M = true)
= α Σ_e Σ_a P(B, e, a, J = true, M = true)
Rewriting full joint entries as products of CPT entries:
P(B = true | J = true, M = true)
= α Σ_e Σ_a P(B = true) P(e) P(a | B = true, e) P(J = true | a) P(M = true | a)
= α P(B = true) Σ_e P(e) Σ_a P(a | B = true, e) P(J = true | a) P(M = true | a)

Inference by variable elimination
Enumeration is inefficient because of repeated computation: e.g., it evaluates P(J = true | a) P(M = true | a) for each value of e.
Variable elimination carries out the summations right-to-left, storing intermediate results (factors) to avoid recomputation:
P(B | J = true, M = true)
= α P(B) Σ_e P(e) Σ_a P(a | B, e) P(J = true | a) P(M = true | a)
= α P(B) Σ_e P(e) Σ_a P(a | B, e) P(J = true | a) f_M(a)
= α P(B) Σ_e P(e) Σ_a P(a | B, e) f_J(a) f_M(a)
= α P(B) Σ_e P(e) Σ_a f_A(a, b, e) f_J(a) f_M(a)
= α P(B) Σ_e P(e) f_AJM(b, e)   (sum out A)
= α P(B) f_EAJM(b)   (sum out E)
= α f_B(b) × f_EAJM(b)
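The sketch below runs the enumeration query just derived, P(B | J = true, M = true), end to end. The CPT numbers are the standard AIMA values, assumed here because the transcript omits the tables.

```python
from itertools import product

# Standard AIMA burglary-network CPTs (assumed; not in the transcript).
P_B = 0.001                                          # P(Burglary)
P_E = 0.002                                          # P(Earthquake)
P_A = {(True, True): 0.95, (True, False): 0.94,
       (False, True): 0.29, (False, False): 0.001}   # P(Alarm | B, E)
P_J = {True: 0.90, False: 0.05}                      # P(JohnCalls | A)
P_M = {True: 0.70, False: 0.01}                      # P(MaryCalls | A)

def joint(b, e, a, j, m):
    """Full joint entry as a product of CPT entries (global semantics)."""
    p = (P_B if b else 1 - P_B) * (P_E if e else 1 - P_E)
    p *= P_A[(b, e)] if a else 1 - P_A[(b, e)]
    p *= P_J[a] if j else 1 - P_J[a]
    p *= P_M[a] if m else 1 - P_M[a]
    return p

def query_burglary(j, m):
    """P(B | j, m) by enumeration: sum out the hidden variables E and A."""
    dist = {b: sum(joint(b, e, a, j, m)
                   for e, a in product((True, False), repeat=2))
            for b in (True, False)}
    alpha = 1 / sum(dist.values())   # normalization constant
    return {b: alpha * p for b, p in dist.items()}

print(query_burglary(j=True, m=True))  # P(B = true | j, m) ~ 0.284
```

Variable elimination computes the same answer but caches f_J(a) and f_M(a) as factors instead of re-evaluating them inside the sum over e.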

Complexity of exact inference
Singly connected networks (polytrees):
- any two nodes are connected by at most one (undirected) path
- time and space cost of variable elimination are O(d^k n)
Multiply connected networks:
- can reduce 3SAT to exact inference ⇒ NP-hard
- equivalent to counting 3SAT models ⇒ #P-complete

Inference by stochastic simulation
Basic idea:
1. Draw N samples from a sampling distribution S.
2. Compute an approximate posterior probability P̂.
3. Show this converges to the true probability P.

Logic sampling
Simulate events to estimate the desired probability.
What is P(WetGrass | Cloudy)?

CAT: Logic Sampling
Using the previous belief network, determine P(Rain | WetGrass, Cloudy). How many simulations did you run? Work with a partner for 5 minutes.
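A minimal logic-sampling sketch for the CAT exercise above. The sprinkler-network CPTs are the standard AIMA values, assumed here since the transcript does not include them: each run samples every variable in topological order, and runs inconsistent with the evidence are rejected.

```python
import random

# Standard AIMA sprinkler-network CPTs (assumed; not in the transcript).
P_C = 0.5                                            # P(Cloudy)
P_S = {True: 0.10, False: 0.50}                      # P(Sprinkler | Cloudy)
P_R = {True: 0.80, False: 0.20}                      # P(Rain | Cloudy)
P_W = {(True, True): 0.99, (True, False): 0.90,
       (False, True): 0.90, (False, False): 0.00}    # P(WetGrass | S, R)

def sample():
    """Simulate one complete event, sampling parents before children."""
    c = random.random() < P_C
    s = random.random() < P_S[c]
    r = random.random() < P_R[c]
    w = random.random() < P_W[(s, r)]
    return c, s, r, w

def estimate(n=100_000):
    """Estimate P(Rain | WetGrass = true, Cloudy = true) by logic sampling."""
    kept = rain = 0
    for _ in range(n):
        c, s, r, w = sample()
        if c and w:          # reject samples inconsistent with the evidence
            kept += 1
            rain += r
    return rain / kept

print(estimate())  # ~0.98 with these numbers; accuracy grows with n
```

Note how many simulations are wasted: only runs consistent with the evidence are kept, which is why rejection-style logic sampling degrades as the evidence becomes more specific.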