Bayesian networks (Chapter 14): Outline, Example, Syntax, Semantics, Parameterized distributions.


Transcription:

Outline
Syntax
Semantics
Parameterized distributions
(Chapter 14.1-3)

Bayesian networks
A simple, graphical notation for conditional independence assertions and hence for compact specification of full joint distributions.
Syntax:
- a set of nodes, one per variable
- a directed, acyclic graph (link means "directly influences")
- a conditional distribution for each node given its parents: P(X_i | Parents(X_i))
In the simplest case, the conditional distribution is represented as a conditional probability table (CPT) giving the distribution over X_i for each combination of parent values.

Example
Topology of the network encodes conditional independence assertions.
In the dentistry network with nodes Weather, Cavity, Toothache, Catch:
- Weather is independent of the other variables
- Toothache and Catch are conditionally independent given Cavity

Example
I'm at work, neighbor John calls to say my alarm is ringing, but neighbor Mary doesn't call. Sometimes the alarm is set off by minor earthquakes. Is there a burglar?
Variables: Burglary, Earthquake, Alarm, JohnCalls, MaryCalls.
Network topology reflects causal knowledge:
- a burglar can set the alarm off
- an earthquake can set the alarm off
- the alarm can cause Mary to call
- the alarm can cause John to call
Conditional probability tables:
P(B) = .001    P(E) = .002

B  E   P(A | B, E)
t  t   .95
t  f   .94
f  t   .29
f  f   .001

A   P(J | A)     A   P(M | A)
t   .90          t   .70
f   .05          f   .01
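The network and its CPTs can be written down directly in code. The sketch below is not from the slides: it represents each CPT as a plain Python dict (using the numbers from the tables above; the dict layout and the name `joint` are illustrative choices) and multiplies local entries to get a full-joint entry.

```python
# Illustrative sketch: the burglary network's CPTs as plain dicts.
P_B = {True: 0.001, False: 0.999}           # P(Burglary)
P_E = {True: 0.002, False: 0.998}           # P(Earthquake)
P_A = {                                     # P(Alarm=true | Burglary, Earthquake)
    (True, True): 0.95, (True, False): 0.94,
    (False, True): 0.29, (False, False): 0.001,
}
P_J = {True: 0.90, False: 0.05}             # P(JohnCalls=true | Alarm)
P_M = {True: 0.70, False: 0.01}             # P(MaryCalls=true | Alarm)

def joint(b, e, a, j, m):
    """Full-joint entry as the product of local CPT entries."""
    p = P_B[b] * P_E[e]
    p *= P_A[(b, e)] if a else 1 - P_A[(b, e)]
    p *= P_J[a] if j else 1 - P_J[a]
    p *= P_M[a] if m else 1 - P_M[a]
    return p
```

Summing `joint` over all 32 assignments gives 1, confirming that the product of local CPTs is a proper joint distribution.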

Compactness
A CPT for Boolean X_i with k Boolean parents has 2^k rows for the combinations of parent values.
Each row requires one number p for X_i = true (the number for X_i = false is just 1 - p).
If each variable has no more than k parents, the complete network requires O(n · 2^k) numbers.
I.e., grows linearly with n, vs. O(2^n) for the full joint distribution.
For the burglary net, 1 + 1 + 4 + 2 + 2 = 10 numbers (vs. 2^5 - 1 = 31).

Global semantics
Global semantics defines the full joint distribution as the product of the local conditional distributions:
P(x_1, ..., x_n) = Π_{i=1}^{n} P(x_i | parents(X_i))
e.g., P(j ∧ m ∧ a ∧ ¬b ∧ ¬e)
    = P(j | a) P(m | a) P(a | ¬b, ¬e) P(¬b) P(¬e)
    = .9 × .7 × .001 × .999 × .998 ≈ .00063

Local semantics
Each node is conditionally independent of its nondescendants given its parents.
[Figure: node X with parents U_1 ... U_m, children Y_1 ... Y_n, and nondescendants Z_1j ... Z_nj]
Theorem: local semantics ⇔ global semantics

Markov blanket
Each node is conditionally independent of all others given its Markov blanket: parents + children + children's parents.

Constructing Bayesian networks
Need a method such that a series of locally testable assertions of conditional independence guarantees the required global semantics.
1. Choose an ordering of variables X_1, ..., X_n
2. For i = 1 to n:
   add X_i to the network
   select parents from X_1, ..., X_{i-1} such that P(X_i | Parents(X_i)) = P(X_i | X_1, ..., X_{i-1})
This choice of parents guarantees the global semantics:
P(X_1, ..., X_n) = Π_{i=1}^{n} P(X_i | X_1, ..., X_{i-1})   (chain rule)
                 = Π_{i=1}^{n} P(X_i | Parents(X_i))        (by construction)
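The worked example above can be checked numerically; this snippet (illustrative, not from the slides) just multiplies the five CPT entries:

```python
# P(j ∧ m ∧ a ∧ ¬b ∧ ¬e) = P(j|a) P(m|a) P(a|¬b,¬e) P(¬b) P(¬e)
p = 0.90 * 0.70 * 0.001 * 0.999 * 0.998
# p is about 0.000628, the ≈ .00063 quoted on the slide
```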

Example
Suppose we choose the ordering M, J, A, B, E (MaryCalls, JohnCalls, Alarm, Burglary, Earthquake).
P(J | M) = P(J)?  No
P(A | J, M) = P(A | J)?  P(A | J, M) = P(A)?  No
P(B | A, J, M) = P(B | A)?  Yes
P(B | A, J, M) = P(B)?  No
P(E | B, A, J, M) = P(E | A)?  No
P(E | B, A, J, M) = P(E | A, B)?  Yes

Example contd.
Deciding conditional independence is hard in noncausal directions.
(Causal models and conditional independence seem hardwired for humans!)
Assessing conditional probabilities is hard in noncausal directions.
The network is less compact: 1 + 2 + 4 + 2 + 4 = 13 numbers needed.
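The two parameter counts follow from the 2^k-rows-per-CPT rule for Boolean nodes. A small sketch (the function name is a hypothetical choice, not from the slides):

```python
def cpt_params(parent_counts):
    """Independent numbers needed for a Boolean network: one number per
    row, 2^k rows for a node with k Boolean parents."""
    return sum(2 ** k for k in parent_counts)

# Causal ordering B, E, A, J, M: parent counts 0, 0, 2, 1, 1 -> 10 numbers
# Ordering M, J, A, B, E:        parent counts 0, 1, 2, 1, 2 -> 13 numbers
causal_total = cpt_params([0, 0, 2, 1, 1])
noncausal_total = cpt_params([0, 1, 2, 1, 2])
```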

Example: Car diagnosis
Initial evidence: car won't start.
Testable variables (green), "broken, so fix it" variables (orange); hidden variables (gray) ensure sparse structure and reduce parameters.
[Figure: diagnosis network over battery age, battery dead, alternator broken, fanbelt broken, no charging, battery meter, battery flat, no oil, no gas, fuel line blocked, starter broken, lights, oil light, gas gauge, dipstick, car won't start]

Example: Car insurance
[Figure: insurance network over Age, GoodStudent, SeniorTrain, RiskAversion, DrivingSkill, DrivingHist, DrivQuality, MakeModel, SocioEcon, Antilock, Mileage, VehicleYear, ExtraCar, Airbag, CarValue, HomeBase, AntiTheft, Ruggedness, Accident, OwnDamage, Theft, Cushioning, OtherCost, OwnCost, MedicalCost, LiabilityCost, PropertyCost]

Compact conditional distributions
A CPT grows exponentially with the number of parents; a CPT becomes infinite with a continuous-valued parent or child.
Solution: canonical distributions that are defined compactly.
Deterministic nodes are the simplest case: X = f(Parents(X)) for some function f.
E.g., Boolean functions: NorthAmerican ⇔ Canadian ∨ US ∨ Mexican
E.g., numerical relationships among continuous variables:
∂Level/∂t = inflow + precipitation - outflow - evaporation

Compact conditional distributions contd.
Noisy-OR distributions model multiple noninteracting causes:
1) parents U_1 ... U_k include all causes (can add a leak node)
2) independent failure probability q_i for each cause alone
⇒ P(¬X | U_1 ... U_j, ¬U_{j+1} ... ¬U_k) = Π_{i=1}^{j} q_i

Cold  Flu  Malaria   P(Fever)   P(¬Fever)
F     F    F         0.0        1.0
F     F    T         0.9        0.1
F     T    F         0.8        0.2
F     T    T         0.98       0.02 = 0.2 × 0.1
T     F    F         0.4        0.6
T     F    T         0.94       0.06 = 0.6 × 0.1
T     T    F         0.88       0.12 = 0.6 × 0.2
T     T    T         0.988      0.012 = 0.6 × 0.2 × 0.1

Number of parameters is linear in the number of parents.

Hybrid (discrete + continuous) networks
Discrete (Subsidy? and Buys?); continuous (Harvest and Cost).
Option 1: discretization (possibly large errors, large CPTs)
Option 2: finitely parameterized canonical families:
1) continuous variable, discrete + continuous parents (e.g., Cost)
2) discrete variable, continuous parents (e.g., Buys?)
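The noisy-OR rule is short enough to sketch directly. The code below is illustrative (not from the slides); the per-cause failure probabilities q_i are read off the Cold/Flu/Malaria table above, and there is no leak node, so no active causes means no fever.

```python
# Hedged sketch of noisy-OR with the slide's failure probabilities.
Q = {"Cold": 0.6, "Flu": 0.2, "Malaria": 0.1}

def p_fever(active_causes):
    """P(Fever = true | exactly these causes active): one minus the
    product of the failure probabilities of the active causes."""
    p_no_fever = 1.0
    for cause in active_causes:
        p_no_fever *= Q[cause]
    return 1.0 - p_no_fever
```

With three parents this needs only three numbers, versus seven independent rows for a full CPT, which is the "linear in the number of parents" point above.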
Continuous child variables
Need one conditional density function for the child variable given continuous parents, for each possible assignment to the discrete parents.
Most common is the linear Gaussian model, e.g.:
P(Cost = c | Harvest = h, Subsidy? = true)
  = N(a_t h + b_t, σ_t)(c)
  = (1 / (σ_t √(2π))) exp(-(1/2) ((c - (a_t h + b_t)) / σ_t)^2)
Mean Cost varies linearly with Harvest; the variance is fixed.
Linear variation is unreasonable over the full range, but works OK if the likely range of Harvest is narrow.
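The density above translates to a few lines of code. This is a minimal sketch, assuming the parameters a, b, sigma are hypothetical values supplied by the caller rather than anything fixed by the slides:

```python
import math

def linear_gaussian(c, h, a, b, sigma):
    """Density N(a*h + b, sigma)(c): the child's mean is linear in the
    continuous parent h, while the variance sigma^2 is fixed."""
    mu = a * h + b
    z = (c - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))
```

At c equal to the mean a*h + b, the density is 1/(sigma * sqrt(2*pi)), the peak of the Gaussian.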

Continuous child variables contd.
[Figure: P(Cost | Harvest, Subsidy? = true) plotted as a density surface over Cost and Harvest]
An all-continuous network with LG distributions ⇒ the full joint distribution is a multivariate Gaussian.
A discrete + continuous LG network is a conditional Gaussian network, i.e., a multivariate Gaussian over all continuous variables for each combination of discrete variable values.

Discrete variable with continuous parents
The probability of Buys? given Cost should be a soft threshold.
The probit distribution uses the integral of the Gaussian:
Φ(x) = ∫_{-∞}^{x} N(0, 1)(t) dt
P(Buys? = true | Cost = c) = Φ((-c + μ)/σ)

Why the probit?
1. It's sort of the right shape
2. It can be viewed as a hard threshold whose location is subject to noise

Discrete variable contd.
The sigmoid (or logit) distribution is also used in neural networks:
P(Buys? = true | Cost = c) = 1 / (1 + exp(-2(-c + μ)/σ))
The sigmoid has a similar shape to the probit but much longer tails.

Summary
Bayesian networks provide a natural representation for (causally induced) conditional independence.
Topology + CPTs = compact representation of the joint distribution.
Generally easy for (non)experts to construct.
Canonical distributions (e.g., noisy-OR) = compact representation of CPTs.
Continuous variables ⇒ parameterized distributions (e.g., linear Gaussian).
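As a closing illustration, the probit and sigmoid soft thresholds described above can be sketched as follows (mu and sigma are hypothetical parameters; math.erf supplies the Gaussian CDF):

```python
import math

def probit(c, mu, sigma):
    """P(Buys? = true | Cost = c) = Phi((-c + mu) / sigma),
    using Phi(x) = (1 + erf(x / sqrt(2))) / 2."""
    x = (-c + mu) / sigma
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def sigmoid(c, mu, sigma):
    """Logit alternative: similar shape, much longer tails."""
    return 1.0 / (1.0 + math.exp(-2.0 * (-c + mu) / sigma))
```

At c = μ both give probability 0.5; far below μ the probit saturates toward 1 faster than the sigmoid, which is the "longer tails" remark above.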