Bayes Nets for representing and reasoning about uncertainty

Similar documents
Bayes Nets for representing and reasoning about uncertainty

Bayes Nets for representing

Probabilistic and Bayesian Analytics

Probability Basics. Robot Image Credit: Viktoriya Sukhanova 123RF.com

Probabilistic and Bayesian Analytics Based on a Tutorial by Andrew W. Moore, Carnegie Mellon University

Sources of Uncertainty

Machine Learning

Bayes Networks. CS540 Bryan R Gibson University of Wisconsin-Madison. Slides adapted from those used by Prof. Jerry Zhu, CS540-1

Machine Learning

CS 331: Artificial Intelligence Probability I. Dealing with Uncertainty

PROBABILISTIC REASONING SYSTEMS

Machine Learning

Lecture 10: Introduction to reasoning under uncertainty. Uncertainty

Modeling and reasoning with uncertainty

CISC 4631 Data Mining

CS 188: Artificial Intelligence. Bayes Nets

CS 4649/7649 Robot Intelligence: Planning

Announcements. CS 188: Artificial Intelligence Fall Causality? Example: Traffic. Topology Limits Distributions. Example: Reverse Traffic

Bayesian Networks: Construction, Inference, Learning and Causal Interpretation. Volker Tresp Summer 2016

Probability review. Adopted from notes of Andrew W. Moore and Eric Xing from CMU. Copyright Andrew W. Moore Slide 1

Algorithms for Playing and Solving games*

Bayesian networks. Soleymani. CE417: Introduction to Artificial Intelligence Sharif University of Technology Spring 2018

Bayes and Naïve Bayes Classifiers CS434

Bayes Nets III: Inference

Axioms of Probability? Notation. Bayesian Networks. Bayesian Networks. Today we ll introduce Bayesian Networks.

CS 5522: Artificial Intelligence II

Bayesian belief networks

Basic Probability and Statistics

A [somewhat] Quick Overview of Probability. Shannon Quinn CSCI 6900

Bayes Nets: Independence

An AI-ish view of Probability, Conditional Probability & Bayes Theorem

10/18/2017. An AI-ish view of Probability, Conditional Probability & Bayes Theorem. Making decisions under uncertainty.

Probabilistic Reasoning. (Mostly using Bayesian Networks)

Bayesian Networks. Axioms of Probability Theory. Conditional Probability. Inference by Enumeration. Inference by Enumeration CSE 473

CS 188: Artificial Intelligence Spring Announcements

Lecture 6: Graphical Models

CS 343: Artificial Intelligence

Informatics 2D Reasoning and Agents Semester 2,

Basic Probabilistic Reasoning SEG

Bayesian Approaches Data Mining Selected Technique

CS 188: Artificial Intelligence Fall 2008

Bayesian Networks. Motivation

Announcements. Inference. Mid-term. Inference by Enumeration. Reminder: Alarm Network. Introduction to Artificial Intelligence. V22.

Probability Review Lecturer: Ji Liu Thank Jerry Zhu for sharing his slides

Bayesian Networks: Construction, Inference, Learning and Causal Interpretation. Volker Tresp Summer 2014

Bayesian Networks: Independencies and Inference

Bayesian Networks BY: MOHAMAD ALSABBAGH

Bayesian Reasoning. Adapted from slides by Tim Finin and Marie desjardins.

Directed Graphical Models or Bayesian Networks

Approximate Inference

Machine Learning. CS Spring 2015 a Bayesian Learning (I) Uncertainty

Uncertainty. Introduction to Artificial Intelligence CS 151 Lecture 2 April 1, CS151, Spring 2004

1 Probabilities. 1.1 Basics 1 PROBABILITIES

Probabilistic Models

Outline. CSE 473: Artificial Intelligence Spring Bayes Nets: Big Picture. Bayes Net Semantics. Hidden Markov Models. Example Bayes Net: Car

COMP5211 Lecture Note on Reasoning under Uncertainty

Lecture 8. Probabilistic Reasoning CS 486/686 May 25, 2006

Probability. CS 3793/5233 Artificial Intelligence Probability 1

Probabilistic representation and reasoning

CS 188: Artificial Intelligence. Our Status in CS188

Probabilistic representation and reasoning

CMPT Machine Learning. Bayesian Learning Lecture Scribe for Week 4 Jan 30th & Feb 4th

Dealing with Uncertainty. CS 331: Artificial Intelligence Probability I. Outline. Random Variables. Random Variables.

Announcements. CS 188: Artificial Intelligence Spring Probability recap. Outline. Bayes Nets: Big Picture. Graphical Model Notation

CS 188: Artificial Intelligence Spring Announcements

EE562 ARTIFICIAL INTELLIGENCE FOR ENGINEERS

Introduction to Artificial Intelligence. Unit # 11

Bayesian Networks. Distinguished Prof. Dr. Panos M. Pardalos

Probabilistic and Bayesian Analytics

Objectives. Probabilistic Reasoning Systems. Outline. Independence. Conditional independence. Conditional independence II.

Implementing Machine Reasoning using Bayesian Network in Big Data Analytics

Quantifying uncertainty & Bayesian networks

Bayesian Networks. Vibhav Gogate The University of Texas at Dallas

Recall from last time: Conditional probabilities. Lecture 2: Belief (Bayesian) networks. Bayes ball. Example (continued) Example: Inference problem

CS 188: Artificial Intelligence Fall 2009

CSE 473: Artificial Intelligence Autumn 2011

Probabilistic Models. Models describe how (a portion of) the world works

Bayesian Networks Representation

Probabilistic Reasoning Systems

Introduction to Machine Learning

Review: Bayesian learning and inference

Introduction to Bayes Nets. CS 486/686: Introduction to Artificial Intelligence Fall 2013

Intelligent Systems (AI-2)

Introduction to Machine Learning

Bayes Nets: Sampling

Chapter 4: Techniques of Circuit Analysis

Our learning goals for Bayesian Nets

Uncertainty. Logic and Uncertainty. Russell & Norvig. Readings: Chapter 13. One problem with logical-agent approaches: C:145 Artificial

Uncertainty and Bayesian Networks

Discrete Probability and State Estimation

1 Probabilities. 1.1 Basics 1 PROBABILITIES

Product rule. Chain rule

Recall from last time. Lecture 3: Conditional independence and graph structure. Example: A Bayesian (belief) network.

VC-dimension for characterizing classifiers

CS 5522: Artificial Intelligence II

Probabilistic Classification

CS 5100: Founda.ons of Ar.ficial Intelligence

VC-dimension for characterizing classifiers

Learning Bayesian Networks (part 1) Goals for the lecture

Bayes Networks 6.872/HST.950

Transcription:

Bayes Nets for representing and reasoning about uncertainty Note to other teachers and users of these slides. ndrew would be delighted if you found this source material useful in giing your own lectures. Feel free to use these slides erbatim, or to modify them to fit your own needs. PowerPoint originals are aailable. If you make use of a significant portion of these slides in your own lecture, please include this message, or the following link to the source repository of ndrew s tutorials: http://www.cs.cmu.edu/~awm/tutorials. Comments and corrections gratefully receied. ndrew W. Moore ssociate Professor School of Computer Science Carnegie Mellon Uniersity www.cs.cmu.edu/~awm awm@cs.cmu.edu 42-268-7599 Copyright 2, ndrew W. Moore Oct 5th, 2 What we ll discuss Recall the numerous and dramatic benefits of Joint Distributions for describing uncertain worlds Reel with terror at the problem with using Joint Distributions Discoer how Bayes Net methodology allows us to built Joint Distributions in manageable chunks Discoer there s still a lurking problem Start to sole that problem Copyright 2, ndrew W. Moore Bayes Nets: Slide 2

Why this matters In ndrew s opinion, the most important technology in the Machine Learning / I field to hae emerged in the last years. clean, clear, manageable language and methodology for expressing what you re certain and uncertain about lready, many practical applications in medicine, factories, helpdesks: this problem these symptoms) anomalousness of this obseration choosing next diagnostic test these obserations Copyright 2, ndrew W. Moore Bayes Nets: Slide 3 Why this matters In ndrew s opinion, the most important technology in the Machine Learning / I field ctie to hae Data emerged in the last years. Collection clean, clear, manageable language and methodology for expressing what you re Inference certain and uncertain about lready, many practical applications nomaly in medicine, factories, helpdesks: Detection this problem these symptoms) anomalousness of this obseration choosing next diagnostic test these obserations Copyright 2, ndrew W. Moore Bayes Nets: Slide 4

Ways to deal with Uncertainty Three-alued logic: True / False / Maybe Fuzzy logic (truth alues between and ) Non-monotonic reasoning (especially focused on Penguin informatics) Dempster-Shafer theory (and an extension known as quasi-bayesian theory) Possibabilistic Logic Probability Copyright 2, ndrew W. Moore Bayes Nets: Slide 5 Discrete Random Variables is a Boolean-alued random ariable if denotes an eent, and there is some degree of uncertainty as to whether occurs. Examples The US president in 223 will be male You wake up tomorrow with a headache You hae Ebola Copyright 2, ndrew W. Moore Bayes Nets: Slide 6

Probabilities We write ) as the fraction of possible worlds in which is true We could at this point spend 2 hours on the philosophy of this. But we won t. Copyright 2, ndrew W. Moore Bayes Nets: Slide 7 Visualizing Eent space of all possible worlds Worlds in which is true ) rea of reddish oal Its area is Worlds in which is False Copyright 2, ndrew W. Moore Bayes Nets: Slide 8

Interpreting the axioms < ) < True) False) or B) ) + B) - and B) The area of can t get any smaller than nd a zero area would mean no world could eer hae true Copyright 2, ndrew W. Moore Bayes Nets: Slide 9 Interpreting the axioms < ) < True) False) or B) ) + B) - and B) The area of can t get any bigger than nd an area of would mean all worlds will hae true Copyright 2, ndrew W. Moore Bayes Nets: Slide

Interpreting the axioms < ) < True) False) or B) ) + B) - and B) B Copyright 2, ndrew W. Moore Bayes Nets: Slide Interpreting the axioms < ) < True) False) or B) ) + B) - and B) B or B) and B) B Simple addition and subtraction Copyright 2, ndrew W. Moore Bayes Nets: Slide 2

These xioms are Not to be Trifled With There hae been attempts to do different methodologies for uncertainty Fuzzy Logic Three-alued logic Dempster-Shafer Non-monotonic reasoning But the axioms of probability are the only system with this property: If you gamble using them you can t be unfairly exploited by an opponent using some other system [di Finetti 93] Copyright 2, ndrew W. Moore Bayes Nets: Slide 3 Theorems from the xioms < ) <, True), False) or B) ) + B) - and B) From these we can proe: not ) ~) -) How? Copyright 2, ndrew W. Moore Bayes Nets: Slide 4

Side Note I am inflicting these proofs on you for two reasons:. These kind of manipulations will need to be second nature to you if you use probabilistic analytics in depth 2. Suffering is good for you Copyright 2, ndrew W. Moore Bayes Nets: Slide 5 nother important theorem < ) <, True), False) or B) ) + B) - and B) From these we can proe: ) ^ B) + ^ ~B) How? Copyright 2, ndrew W. Moore Bayes Nets: Slide 6

Conditional Probability B) Fraction of worlds in which B is true that also hae true H Hae a headache F Coming down with Flu F H) / F) /4 H F) /2 H Headaches are rare and flu is rarer, but if you re coming down with flu there s a 5-5 chance you ll hae a headache. Copyright 2, ndrew W. Moore Bayes Nets: Slide 7 Conditional Probability F H F) Fraction of flu-inflicted worlds in which you hae a headache H Hae a headache F Coming down with Flu H) / F) /4 H F) /2 H #worlds with flu and headache ------------------------------------ #worlds with flu rea of H and F region ------------------------------ rea of F region H ^ F) ----------- F) Copyright 2, ndrew W. Moore Bayes Nets: Slide 8

Definition of Conditional Probability ^ B) B) ----------- B) Corollary: The Chain Rule ^ B) B) B) Copyright 2, ndrew W. Moore Bayes Nets: Slide 9 Bayes Rule ^ B) B) B) B ) ----------- --------------- ) ) This is Bayes Rule Bayes, Thomas (763) n essay towards soling a problem in the doctrine of chances. Philosophical Transactions of the Royal Society of London, 53:37-48 Copyright 2, ndrew W. Moore Bayes Nets: Slide 2

Using Bayes Rule to Gamble R R B B $. R B B The Win enelope has a dollar and four beads in it The Lose enelope has three beads and no money Triial question: someone draws an enelope at random and offers to sell it to you. How much should you pay? Copyright 2, ndrew W. Moore Bayes Nets: Slide 2 Using Bayes Rule to Gamble $. The Win enelope has a dollar and four beads in it The Lose enelope has three beads and no money Interesting question: before deciding, you are allowed to see one bead drawn from the enelope. Suppose it s black: How much should you pay? Suppose it s red: How much should you pay? Copyright 2, ndrew W. Moore Bayes Nets: Slide 22

Calculation $. Copyright 2, ndrew W. Moore Bayes Nets: Slide 23 Multialued Random Variables Suppose can take on more than 2 alues is a random ariable with arity k if it can take on exactly one alue out of {, 2,.. k } Thus P ) if i j ) i j ( 2 k Copyright 2, ndrew W. Moore Bayes Nets: Slide 24

Copyright 2, ndrew W. Moore Bayes Nets: Slide 25 n easy fact about Multialued Random Variables: Using the axioms of probability < ) <, True), False) or B) ) + B) - and B) nd assuming that obeys It s easy to proe that j i P j i if ) ( ) ( 2 k P ) ( ) ( 2 i j i j P P Copyright 2, ndrew W. Moore Bayes Nets: Slide 26 n easy fact about Multialued Random Variables: Using the axioms of probability < ) <, True), False) or B) ) + B) - and B) nd assuming that obeys It s easy to proe that j i P j i if ) ( ) ( 2 k P ) ( ) ( 2 i j i j P P nd thus we can proe ) ( k j j P

Copyright 2, ndrew W. Moore Bayes Nets: Slide 27 nother fact about Multialued Random Variables: Using the axioms of probability < ) <, True), False) or B) ) + B) - and B) nd assuming that obeys It s easy to proe that j i P j i if ) ( ) ( 2 k P ) ( ]) [ ( 2 i j i j B P B P Copyright 2, ndrew W. Moore Bayes Nets: Slide 28 nother fact about Multialued Random Variables: Using the axioms of probability < ) <, True), False) or B) ) + B) - and B) nd assuming that obeys It s easy to proe that j i P j i if ) ( ) ( 2 k P ) ( ]) [ ( 2 i j i j B P B P nd thus we can proe ) ( ) ( k j j B P B P

More General Forms of Bayes Rule B ) ) B) B ) ) + B ~ ) ~ ) B X) X) B X) B X) Copyright 2, ndrew W. Moore Bayes Nets: Slide 29 More General Forms of Bayes Rule i B) n k B ) ) B i ) ) k i k Copyright 2, ndrew W. Moore Bayes Nets: Slide 3

Useful Easy-to-proe facts B) + B) n k k B) Copyright 2, ndrew W. Moore Bayes Nets: Slide 3 The Joint Distribution Recipe for making a joint distribution of M ariables: Example: Boolean ariables, B, C Copyright 2, ndrew W. Moore Bayes Nets: Slide 32

The Joint Distribution Recipe for making a joint distribution of M ariables:. Make a truth table listing all combinations of alues of your ariables (if there are M Boolean ariables then the table will hae 2 M rows). Example: Boolean ariables, B, C B C Copyright 2, ndrew W. Moore Bayes Nets: Slide 33 The Joint Distribution Recipe for making a joint distribution of M ariables:. Make a truth table listing all combinations of alues of your ariables (if there are M Boolean ariables then the table will hae 2 M rows). 2. For each combination of alues, say how probable it is. Example: Boolean ariables, B, C B C Prob.3.5..5.5..25. Copyright 2, ndrew W. Moore Bayes Nets: Slide 34

The Joint Distribution Recipe for making a joint distribution of M ariables:. Make a truth table listing all combinations of alues of your ariables (if there are M Boolean ariables then the table will hae 2 M rows). 2. For each combination of alues, say how probable it is. 3. If you subscribe to the axioms of probability, those numbers must sum to. Example: Boolean ariables, B, C B.5.25 C.5 Prob.3.5..5.5..25...5. C.3 B. Copyright 2, ndrew W. Moore Bayes Nets: Slide 35 Using the Joint Once you hae the JD you can ask for the probability of any logical expression inoling your attribute E) rows matching E row) Copyright 2, ndrew W. Moore Bayes Nets: Slide 36

Using the Joint Poor Male).4654 E) rows matching E row) Copyright 2, ndrew W. Moore Bayes Nets: Slide 37 Using the Joint Poor).764 E) rows matching E row) Copyright 2, ndrew W. Moore Bayes Nets: Slide 38

Inference with the Joint E E 2 ) E E2) E ) 2 rows matching E and E rows matching E row) 2 2 row) Copyright 2, ndrew W. Moore Bayes Nets: Slide 39 Inference with the Joint E E 2 ) E E2) E ) 2 rows matching E and E rows matching E row) 2 2 row) Male Poor).4654 /.764.62 Copyright 2, ndrew W. Moore Bayes Nets: Slide 4

Joint distributions Good news Once you hae a joint distribution, you can ask important questions about stuff that inoles a lot of uncertainty Bad news Impossible to create for more than about ten attributes because there are so many numbers needed when you build the damn thing. Copyright 2, ndrew W. Moore Bayes Nets: Slide 4 Using fewer numbers Suppose there are two eents: M: Manuela teaches the class (otherwise it s ndrew) S: It is sunny The joint p.d.f. for these eents contain four entries. If we want to build the joint p.d.f. we ll hae to inent those four numbers. OR WILL WE?? We don t hae to specify with bottom leel conjunctie eents such as ~M^S) IF instead it may sometimes be more conenient for us to specify things like: M), S). But just M) and S) don t derie the joint distribution. So you can t answer all questions. Copyright 2, ndrew W. Moore Bayes Nets: Slide 42

Using fewer numbers Suppose there are two eents: M: Manuela teaches the class (otherwise it s ndrew) S: It is sunny The joint p.d.f. for these eents contain four entries. If we want to build the joint p.d.f. we ll hae to inent those four numbers. OR WILL WE?? We don t hae to specify with bottom leel conjunctie eents such as ~M^S) IF instead it may sometimes be more conenient for us to specify things like: M), S). But just M) and S) don t derie the joint distribution. So you can t answer all questions. What extra assumption can you make? Copyright 2, ndrew W. Moore Bayes Nets: Slide 43 Independence The sunshine leels do not depend on and do not influence who is teaching. This can be specified ery simply: S M) S) This is a powerful statement! It required extra domain knowledge. different kind of knowledge than numerical probabilities. It needed an understanding of causation. Copyright 2, ndrew W. Moore Bayes Nets: Slide 44

Independence From S M) S), the rules of probability imply: (can you proe these?) ~S M) ~S) M S) M) M ^ S) M) S) ~M ^ S) ~M) S), (PM^~S) M)~S), ~M^~S) ~M)~S) Copyright 2, ndrew W. Moore Bayes Nets: Slide 45 Independence From S M) S), the rules of probability imply: (can you proe these?) nd in general: ~S M) ~S) Mu ^ S) Mu) S) M S) M) for each of the four combinations of M ^ S) M) S) utrue/false True/False ~M ^ S) ~M) S), (PM^~S) M)~S), ~M^~S) ~M)~S) Copyright 2, ndrew W. Moore Bayes Nets: Slide 46

We e stated: M).6 S).3 S M) S) Independence From these statements, we can derie the full joint pdf. T T F F M T F T F S Prob nd since we now hae the joint pdf, we can make any queries we like. Copyright 2, ndrew W. Moore Bayes Nets: Slide 47 more interesting case M : Manuela teaches the class S : It is sunny L : The lecturer arries slightly late. ssume both lecturers are sometimes delayed by bad weather. ndrew is more likely to arrie late than Manuela. Copyright 2, ndrew W. Moore Bayes Nets: Slide 48

more interesting case M : Manuela teaches the class S : It is sunny L : The lecturer arries slightly late. ssume both lecturers are sometimes delayed by bad weather. ndrew is more likely to arrie late than Manuela. Let s begin with writing down knowledge we re happy about: S M) S), S).3, M).6 Lateness is not independent of the weather and is not independent of the lecturer. Copyright 2, ndrew W. Moore Bayes Nets: Slide 49 more interesting case M : Manuela teaches the class S : It is sunny L : The lecturer arries slightly late. ssume both lecturers are sometimes delayed by bad weather. ndrew is more likely to arrie late than Manuela. Let s begin with writing down knowledge we re happy about: S M) S), S).3, M).6 Lateness is not independent of the weather and is not independent of the lecturer. We already know the Joint of S and M, so all we need now is L Su, M) in the 4 cases of u/ True/False. Copyright 2, ndrew W. Moore Bayes Nets: Slide 5

more interesting case M : Manuela teaches the class S : It is sunny L : The lecturer arries slightly late. ssume both lecturers are sometimes delayed by bad weather. ndrew is more likely to arrie late than Manuela. S M) S) S).3 M).6 L M ^ S).5 L M ^ ~S). L ~M ^ S). L ~M ^ ~S).2 Now we can derie a full joint p.d.f. with a mere six numbers instead of seen* *Saings are larger for larger numbers of ariables. Copyright 2, ndrew W. Moore Bayes Nets: Slide 5 more interesting case M : Manuela teaches the class S : It is sunny L : The lecturer arries slightly late. ssume both lecturers are sometimes delayed by bad weather. ndrew is more likely to arrie late than Manuela. S M) S) S).3 M).6 L M ^ S).5 L M ^ ~S). L ~M ^ S). L ~M ^ ~S).2 Question: Express Lx ^ My ^ Sz) in terms that only need the aboe expressions, where x,y and z may each be True or False. Copyright 2, ndrew W. Moore Bayes Nets: Slide 52

bit of notation S M) S) S).3 M).6 L M ^ S).5 L M ^ ~S). L ~M ^ S). L ~M ^ ~S).2 s).3 S M M).6 L M^S).5 L M^~S). L ~M^S). L ~M^~S).2 L Copyright 2, ndrew W. Moore Bayes Nets: Slide 53 S M) S) S).3 M).6 s).3 bit of notation L M ^ S).5 L M ^ ~S) Read. the absence of an arrow L ~M ^ S) between. S and M to mean it L ~M ^ ~S) would.2not help me predict M if I knew the alue of S M).6 S M This kind of stuff will be thoroughly formalized later L M^S).5 L M^~S). L ~M^S). L ~M^~S).2 L Read the two arrows into L to mean that if I want to know the alue of L it may help me to know M and to know S. Copyright 2, ndrew W. Moore Bayes Nets: Slide 54

n een cuter trick Suppose we hae these three eents: M : Lecture taught by Manuela L : Lecturer arries late R : Lecture concerns robots Suppose: ndrew has a higher chance of being late than Manuela. ndrew has a higher chance of giing robotics lectures. What kind of independence can we find? How about: L M) L)? R M) R)? L R) L)? Copyright 2, ndrew W. Moore Bayes Nets: Slide 55 Conditional independence Once you know who the lecturer is, then whether they arrie late doesn t affect whether the lecture concerns robots. R M,L) R M) and R ~M,L) R ~M) We express this in the following way: R and L are conditionally independent gien M..which is also notated by the following diagram. L Copyright 2, ndrew W. Moore Bayes Nets: Slide 56 M R Gien knowledge of M, knowing anything else in the diagram won t help us with L, etc.

Conditional Independence formalized R and L are conditionally independent gien M if for all x,y,z in {T,F}: Rx My ^ Lz) Rx My) More generally: Let S and S2 and S3 be sets of ariables. Set-of-ariables S and set-of-ariables S2 are conditionally independent gien S3 if for all assignments of alues to the ariables in the sets, S s assignments S 2 s assignments & S 3 s assignments) S s assignments S3 s assignments) Copyright 2, ndrew W. Moore Bayes Nets: Slide 57 Example: Shoe-size is conditionally independent of Gloe-size gien height weight and age R and L are conditionally independent gien M if means for all x,y,z in {T,F}: forall s,g,h,w,a Rx My ShoeSizes Heighth,Weightw,gea) ^ Lz) Rx My) ShoeSizes Heighth,Weightw,gea,GloeSizeg) More generally: Let S and S2 and S3 be sets of ariables. Set-of-ariables S and set-of-ariables S2 are conditionally independent gien S3 if for all assignments of alues to the ariables in the sets, S s assignments S 2 s assignments & S 3 s assignments) S s assignments S3 s assignments) Copyright 2, ndrew W. Moore Bayes Nets: Slide 58

Example: Shoe-size is conditionally independent of Gloe-size gien height weight and age R and L are conditionally independent gien M if does not mean for all x,y,z in {T,F}: forall s,g,h Rx My ^ Lz) ShoeSizes Heighth) Rx My) ShoeSizes Heighth, GloeSizeg) More generally: Let S and S2 and S3 be sets of ariables. Set-of-ariables S and set-of-ariables S2 are conditionally independent gien S3 if for all assignments of alues to the ariables in the sets, S s assignments S 2 s assignments & S 3 s assignments) S s assignments S3 s assignments) Copyright 2, ndrew W. Moore Bayes Nets: Slide 59 Conditional independence L M R We can write down M). nd then, since we know L is only directly influenced by M, we can write down the alues of L M) and L ~M) and know we e fully specified L s behaior. Ditto for R. M).6 L M).85 L ~M).7 R M).3 R ~M).6 R and L conditionally independent gien M Copyright 2, ndrew W. Moore Bayes Nets: Slide 6

Conditional independence M M).6 L L M).85 L ~M).7 R M).3 R ~M).6 R Conditional Independence: R M,L) R M), R ~M,L) R ~M) gain, we can obtain any member of the Joint prob dist that we desire: Lx ^ Ry ^ Mz) Copyright 2, ndrew W. Moore Bayes Nets: Slide 6 ssume fie ariables T: The lecture started by :35 L: The lecturer arries late R: The lecture concerns robots M: The lecturer is Manuela S: It is sunny T only directly influenced by L (i.e. T is conditionally independent of R,M,S gien L) L only directly influenced by M and S (i.e. L is conditionally independent of R gien M & S) R only directly influenced by M (i.e. R is conditionally independent of L,S, gien M) M and S are independent Copyright 2, ndrew W. Moore Bayes Nets: Slide 62

Making a Bayes net T: The lecture started by :35 L: The lecturer arries late R: The lecture concerns robots M: The lecturer is Manuela S: It is sunny S M L R T Step One: add ariables. Just choose the ariables you d like to be included in the net. Copyright 2, ndrew W. Moore Bayes Nets: Slide 63 Making a Bayes net T: The lecture started by :35 L: The lecturer arries late R: The lecture concerns robots M: The lecturer is Manuela S: It is sunny S M L R T Step Two: add links. The link structure must be acyclic. If node X is gien parents Q,Q 2,..Q n you are promising that any ariable that s a non-descendent of X is conditionally independent of X gien {Q,Q 2,..Q n } Copyright 2, ndrew W. Moore Bayes Nets: Slide 64

Making a Bayes net T: The lecture started by :35 L: The lecturer arries late R: The lecture concerns robots M: The lecturer is Manuela S: It is sunny s).3 S M M).6 L M^S).5 L M^~S). L ~M^S). L ~M^~S).2 L T T L).3 T ~L).8 R R M).3 R ~M).6 Step Three: add a probability table for each node. The table for node X must list X Parent Values) for each possible combination of parent alues Copyright 2, ndrew W. Moore Bayes Nets: Slide 65 Making a Bayes net T: The lecture started by :35 L: The lecturer arries late R: The lecture concerns robots M: The lecturer is Manuela S: It is sunny s).3 S M M).6 L M^S).5 L M^~S). L ~M^S). L ~M^~S).2 L T T L).3 T ~L).8 Copyright 2, ndrew W. Moore Bayes Nets: Slide 66 R R M).3 R ~M).6 Two unconnected ariables may still be correlated Each node is conditionally independent of all nondescendants in the tree, gien its parents. You can deduce many other conditional independence relations from a Bayes net. See the next lecture.

Bayes Nets Formalized Bayes net (also called a belief network) is an augmented directed acyclic graph, represented by the pair V, E where: V is a set of ertices. E is a set of directed edges joining ertices. No loops of any length are allowed. Each ertex in V contains the following information: The name of a random ariable probability distribution table indicating how the probability of this ariable s alues depends on all possible combinations of parental alues. Copyright 2, ndrew W. Moore Bayes Nets: Slide 67 Building a Bayes Net. Choose a set of releant ariables. 2. Choose an ordering for them 3. ssume they re called X.. X m (where X is the first in the ordering, X is the second, etc) 4. For i to m:. dd the X i node to the network 2. Set Parents(X i ) to be a minimal subset of {X X i- } such that we hae conditional independence of X i and all other members of {X X i- } gien Parents(X i ) 3. Define the probability table of X i k ssignments of Parents(X i ) ). Copyright 2, ndrew W. Moore Bayes Nets: Slide 68

Example Bayes Net Building Suppose we re building a nuclear power station. There are the following random ariables: GRL : Gauge Reads Low. CTL : Core temperature is low. FG : Gauge is faulty. F : larm is faulty S : larm sounds If alarm working properly, the alarm is meant to sound if the gauge stops reading a low temp. If gauge working properly, the gauge is meant to read the temp of the core. Copyright 2, ndrew W. Moore Bayes Nets: Slide 69 Computing a Joint Entry How to compute an entry in a joint distribution? E.G: What is S ^ ~M ^ L ~R ^ T)? s).3 S M M).6 L M^S).5 L M^~S). L ~M^S). L ~M^~S).2 L T T L).3 T ~L).8 R R M).3 R ~M).6 Copyright 2, ndrew W. Moore Bayes Nets: Slide 7

s).3 L M^S).5 L M^~S). L ~M^S). L ~M^~S).2 Computing with Bayes Net S L T M T L).3 T ~L).8 M).6 R R M).3 R ~M).6 T ^ ~R ^ L ^ ~M ^ S) T ~R ^ L ^ ~M ^ S) * ~R ^ L ^ ~M ^ S) T L) * ~R ^ L ^ ~M ^ S) T L) * ~R L ^ ~M ^ S) * L^~M^S) T L) * ~R ~M) * L^~M^S) T L) * ~R ~M) * L ~M^S)*~M^S) T L) * ~R ~M) * L ~M^S)*~M S)*S) T L) * ~R ~M) * L ~M^S)*~M)*S). Copyright 2, ndrew W. Moore Bayes Nets: Slide 7 The general case X x ^ X 2 x 2 ^.X n- x n- ^ X n x n ) X n x n ^ X n- x n- ^.X 2 x 2 ^ X x ) X n x n X n- x n- ^.X 2 x 2 ^ X x ) * X n- x n- ^. X 2 x 2 ^ X x ) X n x n X n- x n- ^.X 2 x 2 ^ X x ) * X n- x n-. X 2 x 2 ^ X x ) * X n-2 x n-2 ^. X 2 x 2 ^ X x ) : : n P X x X x K X x i n i P ( )(( ) ( ))) i i i ( X x ) ssignment s of Parents( X )) i i i So any entry in joint pdf table can be computed. nd so any conditional probability can be computed. Copyright 2, ndrew W. Moore Bayes Nets: Slide 72 i

Where are we now? We hae a methodology for building Bayes nets. We don t require exponential storage to hold our probability table. Only exponential in the maximum number of parents of any node. We can compute probabilities of any gien assignment of truth alues to the ariables. nd we can do it in time linear with the number of nodes. So we can also compute answers to any questions. s).3 L M^S).5 L M^~S). L ~M^S). L ~M^~S).2 S L T E.G. What could we do to compute R T,~S)? M T L).3 T ~L).8 M).6 Copyright 2, ndrew W. Moore Bayes Nets: Slide 73 R R M).3 R ~M).6 Where are we now? Step We : Compute hae a methodology R ^ T ^ ~S) for building Bayes nets. We don t require exponential storage to hold our probability Step 2: Compute ~R ^ T ^ ~S) table. Only exponential in the maximum number of parents of any node. Step 3: Return We can compute probabilities of any gien assignment of truth alues to the ariables. nd we can do it in time R ^ T ^ ~S) linear with the number of nodes. ------------------------------------- R So we ^ T ^ can ~S)+ also ~R compute ^ T ^ ~S) answers to any questions. s).3 L M^S).5 L M^~S). L ~M^S). L ~M^~S).2 S L T E.G. What could we do to compute R T,~S)? M T L).3 T ~L).8 M).6 Copyright 2, ndrew W. Moore Bayes Nets: Slide 74 R R M).3 R ~M).6

Where are we now? Step We : Compute hae a methodology R ^ T ^ ~S) for building Sum Bayes of all the nets. rows in the Joint that match R ^ T ^ ~S We don t require exponential storage to hold our probability Step 2: Compute ~R ^ T ^ ~S) table. Only exponential in the maximum number of parents of any node. Sum of all the rows in the Joint Step 3: Return We can compute probabilities of any that gien match assignment ~R ^ T ^ ~Sof truth alues to the ariables. nd we can do it in time R ^ T ^ ~S) linear with the number of nodes. ------------------------------------- R So we ^ T ^ can ~S)+ also ~R compute ^ T ^ ~S) answers to any questions. s).3 L M^S).5 L M^~S). L ~M^S). L ~M^~S).2 S L T E.G. What could we do to compute R T,~S)? M T L).3 T ~L).8 M).6 Copyright 2, ndrew W. Moore Bayes Nets: Slide 75 R R M).3 R ~M).6 Where are we now? Step We : Compute hae a methodology R ^ T ^ ~S) for building Sum Bayes of all the nets. rows in the Joint that match R ^ T ^ ~S We don t require exponential storage to hold our probability Step 2: Compute ~R ^ T ^ ~S) table. Only exponential in the maximum number of parents of any node. Sum of all the rows in the Joint Step 3: Return We can compute probabilities of any that gien match assignment ~R ^ T ^ ~Sof truth alues to the ariables. nd we can do it in time R ^ T ^ ~S) linear with the number of nodes. ------------------------------------- R So we ^ T ^ can ~S)+ also ~R compute ^ T ^ ~S) answers Each to any of these questions. obtained by the computing a joint s).3 S M M).6 probability entry method of the earlier slides L M^S).5 L M^~S). L ~M^S). L ~M^~S).2 L T T L).3 T ~L).8 E.G. What could we do to compute R T,~S)? Copyright 2, ndrew W. Moore Bayes Nets: Slide 76 R R M).3 R ~M).6 4 joint computes 4 joint computes

The good news We can do inference. We can compute any conditional probability: Some ariable Some other ariable alues ) E E 2 ) E E2) E ) 2 joint entries matching E and E joint entries matching E joint entry) joint entry) 2 2 Copyright 2, ndrew W. Moore Bayes Nets: Slide 77 The good news We can do inference. We can compute any conditional probability: Some ariable Some other ariable alues ) E E 2 ) E E2) E ) 2 joint entries matching E and E joint entries matching E joint entry) joint entry) 2 2 Suppose you hae m binary-alued ariables in your Bayes Net and expression E 2 mentions k ariables. How much work is the aboe computation? Copyright 2, ndrew W. Moore Bayes Nets: Slide 78

The sad, bad news Conditional probabilities by enumerating all matching entries in the joint are expensie: Exponential in the number of ariables. Copyright 2, ndrew W. Moore Bayes Nets: Slide 79 The sad, bad news Conditional probabilities by enumerating all matching entries in the joint are expensie: Exponential in the number of ariables. But perhaps there are faster ways of querying Bayes nets? In fact, if I eer ask you to manually do a Bayes Net inference, you ll find there are often many tricks to sae you time. So we e just got to program our computer to do those tricks too, right? Copyright 2, ndrew W. Moore Bayes Nets: Slide 8

The sad, bad news Conditional probabilities by enumerating all matching entries in the joint are expensie: Exponential in the number of ariables. But perhaps there are faster ways of querying Bayes nets? In fact, if I eer ask you to manually do a Bayes Net inference, you ll find there are often many tricks to sae you time. So we e just got to program our computer to do those tricks too, right? Sadder and worse news: General querying of Bayes nets is NP-complete. Copyright 2, ndrew W. Moore Bayes Nets: Slide 8 Bayes nets inference algorithms poly-tree is a directed acyclic graph in which no two nodes hae more than one path between them. S X X M 2 L T poly tree X 3 X 5 R X 4 S X X 2 M L X 3 R X 4 X T 5 Not a poly tree (but still a legal Bayes net) If net is a poly-tree, there is a linear-time algorithm (see a later ndrew lecture). The best general-case algorithms conert a general net to a polytree (often at huge expense) and calls the poly-tree algorithm. nother popular, practical approach (doesn t assume poly-tree): Stochastic Simulation. Copyright 2, ndrew W. Moore Bayes Nets: Slide 82

Sampling from the Joint Distribution s).3 S M M).6 L M^S).5 L M^~S). L ~M^S). L ~M^~S).2 L T T L).3 T ~L).8 R R M).3 R ~M).6 It s pretty easy to generate a set of ariable-assignments at random with the same probability as the underlying joint distribution. How? Copyright 2, ndrew W. Moore Bayes Nets: Slide 83 Sampling from the Joint Distribution s).3 S M M).6 L M^S).5 L M^~S). L ~M^S). L ~M^~S).2 L T T L).3 T ~L).8 R R M).3 R ~M).6. Randomly choose S. S True with prob.3 2. Randomly choose M. M True with prob.6 3. Randomly choose L. The probability that L is true depends on the assignments of S and M. E.G. if steps and 2 had produced STrue, MFalse, then probability that L is true is. 4. Randomly choose R. Probability depends on M. 5. Randomly choose T. Probability depends on L Copyright 2, ndrew W. Moore Bayes Nets: Slide 84

general sampling algorithm Let s generalize the example on the preious slide to a general Bayes Net. s in Slides 6-7, call the ariables X.. X n, where Parents(X i ) must be a subset of {X.. X i- }. For i to n:. Find parents, if any, of X i. ssume n(i) parents. Call them X p(i,), X p(i,2), X p(i,n(i)). 2. Recall the alues that those parents were randomly gien: x p(i,), x p(i,2), x p(i,n(i)). 3. Look up in the lookup-table for: X i True X p(i,) x p(i,),x p(i,2) x p(i,2) X p(i,n(i)) x p(i,n(i)) ) 4. Randomly set x i True according to this probability x, x 2, x n are now a sample from the joint distribution of X, X 2, X n. Copyright 2, ndrew W. Moore Bayes Nets: Slide 85 Stochastic Simulation Example Someone wants to know R True T True ^ S False ) We ll do lots of random samplings and count the number of occurrences of the following: N c : Num. samples in which TTrue and SFalse. N s : Num. samples in which RTrue, TTrue and SFalse. N : Number of random samplings Now if N is big enough: N c /N is a good estimate of TTrue and SFalse). N s /N is a good estimate of RTrue,TTrue, SFalse). R T^~S) R^T^~S)/T^~S), so N s / N c can be a good estimate of R T^~S). Copyright 2, ndrew W. Moore Bayes Nets: Slide 86

General Stochastic Simulation Someone wants to know E E 2 ) We ll do lots of random samplings and count the number of occurrences of the following: N c : Num. samples in which E 2 N s : Num. samples in which E and E 2 N : Number of random samplings Now if N is big enough: N c /N is a good estimate of E 2 ). N s /N is a good estimate of E, E 2 ). E E 2 ) E ^ E 2 )/E 2 ), so N s / N c can be a good estimate of E E 2 ). Copyright 2, ndrew W. Moore Bayes Nets: Slide 87 Likelihood weighting Problem with Stochastic Sampling: With lots of constraints in E, or unlikely eents in E, then most of the simulations will be thrown away, (they ll hae no effect on Nc, or Ns). Imagine we re part way through our simulation. In E2 we hae the constraint Xi We re just about to generate a alue for Xi at random. Gien the alues assigned to the parents, we see that Xi parents) p. Now we know that with stochastic sampling: we ll generate Xi proportion p of the time, and proceed. nd we ll generate a different alue proportion -p of the time, and the simulation will be wasted. Instead, always generate Xi, but weight the answer by weight p to compensate. Copyright 2, ndrew W. Moore Bayes Nets: Slide 88

Likelihood weighting Set N c :, N s :. Generate a random assignment of all ariables that matches E 2. This process returns a weight w. 2. Define w to be the probability that this assignment would hae been generated instead of an unmatching assignment during its generation in the original algorithm.fact: w is a product of all likelihood factors inoled in the generation. 3. N c : N c + w 4. If our sample matches E then N s : N s + w 5. Go to gain, N s / N c estimates E E 2 ) Copyright 2, ndrew W. Moore Bayes Nets: Slide 89 Case Study I Pathfinder system. (Heckerman 99, Probabilistic Similarity Networks, MIT Press, Cambridge M). Diagnostic system for lymph-node diseases. 6 diseases and symptoms and test-results. 4, probabilities Expert consulted to make net. 8 hours to determine ariables. 35 hours for net topology. 4 hours for probability table alues. pparently, the experts found it quite easy to inent the causal links and probabilities. Pathfinder is now outperforming the world experts in diagnosis. Being extended to seeral dozen other medical domains. Copyright 2, ndrew W. Moore Bayes Nets: Slide 9

Questions What are the strengths of probabilistic networks compared with propositional logic? What are the weaknesses of probabilistic networks compared with propositional logic? What are the strengths of probabilistic networks compared with predicate logic? What are the weaknesses of probabilistic networks compared with predicate logic? (How) could predicate logic and probabilistic networks be combined? Copyright 2, ndrew W. Moore Bayes Nets: Slide 9 What you should know The meanings and importance of independence and conditional independence. The definition of a Bayes net. Computing probabilities of assignments of ariables (i.e. members of the joint p.d.f.) with a Bayes net. The slow (exponential) method for computing arbitrary, conditional probabilities. The stochastic simulation method and likelihood weighting. Copyright 2, ndrew W. Moore Bayes Nets: Slide 92