Introduction to Artificial Intelligence: Belief networks
Chapter 15.1–2
Dieter Fox
Based on AIMA Slides © S. Russell and P. Norvig, 1998

Outline
Bayesian networks: syntax and semantics
Inference tasks

Belief networks
A simple, graphical notation for conditional independence assertions and hence for compact specification of full joint distributions.
Syntax:
a set of nodes, one per variable
a directed, acyclic graph (link ≈ "directly influences")
a conditional distribution for each node given its parents: P(X_i | Parents(X_i))
In the simplest case, the conditional distribution is represented as a conditional probability table (CPT).

Example
I'm at work, neighbor John calls to say my alarm is ringing, but neighbor Mary doesn't call. Sometimes it's set off by minor earthquakes. Is there a burglar?
Variables: Burglary, Earthquake, Alarm, JohnCalls, MaryCalls
Network topology reflects causal knowledge: Burglary → Alarm, Earthquake → Alarm, Alarm → JohnCalls, Alarm → MaryCalls.
CPTs:
P(B) = 0.001    P(E) = 0.002

B  E  P(A | B, E)
T  T  0.95
T  F  0.94
F  T  0.29
F  F  0.001

A  P(J | A)
T  0.90
F  0.05

A  P(M | A)
T  0.70
F  0.01

Note: with at most k parents per node, a network over n variables (d values each) needs O(d^k n) numbers vs. O(d^n) for the full joint; here 1 + 1 + 4 + 2 + 2 = 10 numbers vs. 2^5 − 1 = 31.
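As an illustration (a sketch, not part of the slides; the `network` dictionary layout is a hypothetical encoding, not an API from the slides), the whole network can be written down as a parent list plus one CPT per node, and counting the stored probabilities confirms the 10-vs-31 comparison above:

```python
# Hypothetical plain-Python encoding of the burglary network: each node has a
# parent list and a CPT giving P(node = true | parent values).
network = {
    "Burglary":   {"parents": [], "cpt": {(): 0.001}},
    "Earthquake": {"parents": [], "cpt": {(): 0.002}},
    "Alarm":      {"parents": ["Burglary", "Earthquake"],
                   "cpt": {(True, True): 0.95, (True, False): 0.94,
                           (False, True): 0.29, (False, False): 0.001}},
    "JohnCalls":  {"parents": ["Alarm"], "cpt": {(True,): 0.90, (False,): 0.05}},
    "MaryCalls":  {"parents": ["Alarm"], "cpt": {(True,): 0.70, (False,): 0.01}},
}

# 1 + 1 + 4 + 2 + 2 = 10 stored numbers, versus 2^5 - 1 = 31 for the full joint.
print(sum(len(node["cpt"]) for node in network.values()))  # 10
```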

Semantics
Global semantics defines the full joint distribution as the product of the local conditional distributions:
P(X_1, ..., X_n) = Π_{i=1..n} P(X_i | Parents(X_i))
e.g., P(J ∧ M ∧ A ∧ ¬B ∧ ¬E) = P(¬B) P(¬E) P(A | ¬B, ¬E) P(J | A) P(M | A)
Local semantics: each node is conditionally independent of its nondescendants given its parents.
Theorem: local semantics ⇔ global semantics
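Plugging the CPT values from the example network into this product gives a quick worked check (an added evaluation, not on the slides):
P(J ∧ M ∧ A ∧ ¬B ∧ ¬E) = 0.999 × 0.998 × 0.001 × 0.90 × 0.70 ≈ 6.3 × 10⁻⁴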

Markov blanket
Each node is conditionally independent of all others given its Markov blanket: parents + children + children's parents.
(Figure: node X with parents U_1, ..., U_m, children Y_1, ..., Y_n, and the children's other parents Z_1j, ..., Z_nj.)
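As a small sketch (not from the slides; the `markov_blanket` helper and the parents-dictionary layout are assumptions for illustration), the blanket can be read directly off the graph structure:

```python
def markov_blanket(node, parents):
    """Markov blanket = parents + children + children's other parents."""
    children = [x for x, ps in parents.items() if node in ps]
    blanket = set(parents[node])          # parents
    for c in children:
        blanket.add(c)                    # children
        blanket.update(parents[c])        # children's parents (co-parents)
    blanket.discard(node)
    return blanket

parents = {"Burglary": [], "Earthquake": [],
           "Alarm": ["Burglary", "Earthquake"],
           "JohnCalls": ["Alarm"], "MaryCalls": ["Alarm"]}
print(markov_blanket("Alarm", parents))
# {'Burglary', 'Earthquake', 'JohnCalls', 'MaryCalls'}
```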

Constructing belief networks
Need a method such that a series of locally testable assertions of conditional independence guarantees the required global semantics.
1. Choose an ordering of variables X_1, ..., X_n
2. For i = 1 to n:
   add X_i to the network
   select parents from X_1, ..., X_{i−1} such that P(X_i | Parents(X_i)) = P(X_i | X_1, ..., X_{i−1})
This choice of parents guarantees the global semantics:
P(X_1, ..., X_n) = Π_{i=1..n} P(X_i | X_1, ..., X_{i−1})   (chain rule)
                 = Π_{i=1..n} P(X_i | Parents(X_i))   (by construction)
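The loop above can be sketched in code. This is only a schematic illustration: `independent_given` is a hypothetical oracle for the locally testable assertion P(X_i | parents) = P(X_i | X_1, ..., X_{i−1}), which in practice the modeller answers from causal or domain knowledge, as in the ordering example that follows.

```python
from itertools import combinations

def build_parents(ordering, independent_given):
    """Sketch of the construction loop: for each variable, pick a minimal
    parent set among its predecessors that satisfies the independence test."""
    parents = {}
    for i, X in enumerate(ordering):
        predecessors = ordering[:i]
        chosen = list(predecessors)                 # the full set always qualifies
        for size in range(len(predecessors) + 1):   # prefer the smallest parent set
            candidates = [list(c) for c in combinations(predecessors, size)
                          if independent_given(X, list(c), predecessors)]
            if candidates:
                chosen = candidates[0]
                break
        parents[X] = chosen
    return parents
```

With the causal ordering B, E, A, J, M and the true independencies of the burglary domain as the oracle, this procedure recovers the parent sets of the original network.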

Example
Suppose we choose the ordering M, J, A, B, E and add the variables in that order, asking at each step which of the already-added variables must be parents (the slides' figures show the network growing as MaryCalls, JohnCalls, Alarm, Burglary, Earthquake are added):
P(J | M) = P(J)? No
P(A | J, M) = P(A | J)? No
P(A | J, M) = P(A)? No
P(B | A, J, M) = P(B | A)? Yes
P(B | A, J, M) = P(B)? No
P(E | B, A, J, M) = P(E | A)? No
P(E | B, A, J, M) = P(E | A, B)? Yes

Example: Car diagnosis
Initial evidence: engine won't start.
Testable variables (thin ovals), diagnosis variables (thick ovals); hidden variables (shaded) ensure sparse structure and reduce parameters.
(Figure: nodes include battery age, alternator broken, fanbelt broken, battery dead, no charging, battery flat, no oil, no gas, fuel line blocked, starter broken, lights, oil light, gas gauge, engine won't start.)

Example: Car insurance
Predict claim costs (medical, liability, property) given data on the application form (the other unshaded nodes).
(Figure: nodes include Age, GoodStudent, RiskAversion, SeniorTrain, SocioEcon, Mileage, VehicleYear, ExtraCar, DrivingSkill, MakeModel, DrivingHist, Antilock, DrivQuality, Airbag, CarValue, HomeBase, AntiTheft, Ruggedness, Accident, OwnDamage, Theft, Cushioning, OtherCost, OwnCost, MedicalCost, LiabilityCost, PropertyCost.)

Inference in Bayesian networks
Instantiate some nodes (evidence nodes) and query other nodes: P(Burglary | JohnCalls) = ?
A burglary occurs only once every 1000 days, but John calls 50 times in 1000 days, i.e., for each burglary we receive about 50 false alarms. Hence P(Burglary | JohnCalls) = 0.016, whereas P(Burglary | JohnCalls, MaryCalls) = 0.29.
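The quoted numbers can be reproduced by brute-force enumeration over the burglary network's CPTs. The following is a minimal sketch (the helper names are assumptions for illustration, not the slides' code):

```python
from itertools import product

# CPT values from the example network.
P_B = {True: 0.001, False: 0.999}                 # P(Burglary)
P_E = {True: 0.002, False: 0.998}                 # P(Earthquake)
P_A = {(True, True): 0.95, (True, False): 0.94,   # P(Alarm = true | B, E)
       (False, True): 0.29, (False, False): 0.001}
P_J = {True: 0.90, False: 0.05}                   # P(JohnCalls = true | A)
P_M = {True: 0.70, False: 0.01}                   # P(MaryCalls = true | A)

def pick(p_true, value):
    """P(X = value) given P(X = true)."""
    return p_true if value else 1.0 - p_true

def joint(b, e, a, j, m):
    """Full joint = product of the local conditional distributions."""
    return (P_B[b] * P_E[e] * pick(P_A[(b, e)], a)
            * pick(P_J[a], j) * pick(P_M[a], m))

def p_burglary_given(evidence):
    """P(Burglary = true | evidence) by summing the joint over hidden variables."""
    totals = {True: 0.0, False: 0.0}
    for b, e, a, j, m in product((True, False), repeat=5):
        world = {"B": b, "E": e, "A": a, "J": j, "M": m}
        if all(world[var] == val for var, val in evidence.items()):
            totals[b] += joint(b, e, a, j, m)
    return totals[True] / (totals[True] + totals[False])

print(round(p_burglary_given({"J": True}), 3))             # 0.016
print(round(p_burglary_given({"J": True, "M": True}), 3))  # 0.284 (~0.29)
```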

Types of inference
1. Diagnostic (from effects to causes): P(Burglary | JohnCalls) = 0.016
2. Causal (from causes to effects): P(JohnCalls | Burglary) = 0.86
3. Intercausal (between causes of a common effect): P(Burglary | Alarm) = 0.376, but P(Burglary | Alarm, Earthquake) = 0.003
4. Mixed (combinations of 1–3): P(Alarm | JohnCalls, ¬Earthquake) = 0.03

Inference tasks
Queries: compute the posterior marginal P(X_i | E = e), e.g., P(NoGas | Gauge = empty, Lights = on, Starts = false)
Optimal decisions: decision networks include utility information; probabilistic inference is required for P(outcome | action, evidence)
Value of information: which evidence to seek next?
Sensitivity analysis: which probability values are most critical?
Explanation: why do I need a new starter motor?

Compact conditional distributions
A CPT grows exponentially with the number of parents, and becomes infinite with a continuous-valued parent or child.
Solution: canonical distributions that are defined compactly.
Deterministic nodes are the simplest case: X = f(Parents(X)) for some function f.
E.g., Boolean functions: NorthAmerican ⇔ Canadian ∨ US ∨ Mexican
E.g., numerical relationships among continuous variables: ∂Level/∂t = inflow + precipitation − outflow − evaporation
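As a small added sketch (not from the slides; `deterministic_cpt` is a hypothetical helper), a deterministic Boolean node needs no assessed probabilities at all, since its CPT contains only 0/1 entries derived from f:

```python
from itertools import product

def deterministic_cpt(f, n_parents):
    """Degenerate CPT of a deterministic Boolean node X = f(parents):
    every entry is 0.0 or 1.0, so nothing needs to be assessed."""
    return {values: 1.0 if f(*values) else 0.0
            for values in product((True, False), repeat=n_parents)}

# Boolean example from the slides: NorthAmerican <=> Canadian v US v Mexican
cpt = deterministic_cpt(lambda canadian, us, mexican: canadian or us or mexican, 3)
print(cpt[(False, False, False)])  # 0.0
print(cpt[(True, False, False)])   # 1.0
```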

Compact conditional distributions contd.
Noisy-OR distributions model multiple noninteracting causes:
1) Parents U_1, ..., U_k include all causes (can add a leak node)
2) Independent failure probability q_i for each cause alone:
P(X | U_1, ..., U_j, ¬U_{j+1}, ..., ¬U_k) = 1 − Π_{i=1..j} q_i

Cold  Flu  Malaria  P(Fever)  P(¬Fever)
F     F    F        0.0       1.0
F     F    T        0.9       0.1
F     T    F        0.8       0.2
F     T    T        0.98      0.02 = 0.2 × 0.1
T     F    F        0.4       0.6
T     F    T        0.94      0.06 = 0.6 × 0.1
T     T    F        0.88      0.12 = 0.6 × 0.2
T     T    T        0.988     0.012 = 0.6 × 0.2 × 0.1

Number of parameters is linear in the number of parents.
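The fever table is generated entirely by this rule. A minimal sketch (the helper name is an assumption, not the slides' code) that reproduces its entries:

```python
def noisy_or(q, present_causes):
    """P(effect = true | causes): noisy-OR with inhibition probabilities q_i.
    q maps each cause to the probability it fails to produce the effect on
    its own; `present_causes` lists the causes that are true."""
    p_all_inhibited = 1.0
    for cause in present_causes:
        p_all_inhibited *= q[cause]
    return 1.0 - p_all_inhibited

# Inhibition probabilities from the fever example above.
q = {"Cold": 0.6, "Flu": 0.2, "Malaria": 0.1}

print(noisy_or(q, []))                          # 0.0
print(noisy_or(q, ["Flu", "Malaria"]))          # 1 - 0.2*0.1 = 0.98
print(noisy_or(q, ["Cold", "Flu", "Malaria"]))  # 1 - 0.6*0.2*0.1 = 0.988
```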