
Introduction to Probabilistic Reasoning
Brian C. Williams
16.410/16.413
November 17th, 2010
Image credit: NASA.

Assignment

Homework:
- Problem Set #8: Linear Programming, due today, Wednesday, November 17th.
- Problem Set #9: Probabilistic Reasoning, out today, due Wednesday, November 24th.

Readings:
- Today: Review of probabilities and probabilistic reasoning. AIMA Chapter 13; AIMA Chapter 14, Sections 1-5.
- Monday: HMMs, localization & mapping. AIMA Chapter 15, Sections 1-3.

Notation

- S, Q, R, P: logical sentences.
- Φ: background theory (a sentence).
- ¬ (not), ∧ (and), ∨ (or), → (implies), ↔ (iff): standard logical connectives, where "iff" means "if and only if."
- M(S), ⊨, ⊥: models of sentence S, entails, false.
- A, B, C: sets.
- U, φ: universe of all elements, empty set.
- ∪, ∩, ~, −: set union, intersection, complement, and difference.
- ≡: equivalent to.
- V: variable or vector of variables.
- V_i: the ith variable of vector V.
- v, v_i: a particular assignment to V; short form for V = v, V_i = v_i.
- V^t: V at time t.
- V^{i:j}: a sequence of V from time i to time j.

Notation (continued)

- S: states or state vector.
- O: observables or observation vector.
- X: mode or mode vector.
- S0, S1: stuck-at-0 / stuck-at-1 modes.
- Prefix(k) L: returns the first k elements of list L.
- Sort L by R: sorts list L in increasing order based on relation R.
- s_i: the ith sample in the sample space.
- P(X): the probability of X occurring.
- P(X | Y): the probability of X, conditioned on Y occurring.
- (A ⊥ C | B): A is conditionally independent of C given B.

Outline
- Motivation
- Set Theoretic View of Propositional Logic
- From Propositional Logic to Probabilities
- Probabilistic Inference
  - General Queries and Inference Methods
  - Bayes Net Inference
  - Model-based Diagnosis (Optional)

Motivation: Apollo 13

Multiple faults occur: three shorts, the tank-line and pressure jacket burst, and a panel flies off. (Lecture 16 framed this as a CSP.) Image source: NASA.

- How do we compare the space of alternative diagnoses?
- How do we explain the cause of failure?
- How do we prefer diagnoses that explain the failure?

Due to the Unknown mode, there tends to be an exponential number of diagnoses: beyond the Good mode and the known failure modes F1, ..., Fn, every candidate may also assign a component the UNKNOWN failure mode.

1. Introduce fault models. They are more constraining, and hence easier to rule out, but they increase the size of the candidate space.
2. Enumerate the most likely diagnoses X_i based on probability (see the sketch below):
   Prefix(k)( Sort {X_i} by decreasing P(X_i | O) )
   Most of the probability mass is covered by a few diagnoses.

Example: a circuit of three components A, B, and C with intermediate signals X and Y, input in = 0, and observed output out = 0.

Idea: Include known fault modes (S0 = stuck at 0, S1 = stuck at 1, ...) as well as Unknown (U). Diagnoses (42 of 64 candidates):
- Fully Explained Failures: [A=G, B=G, C=S0], [A=G, B=S1, C=S0], [A=S0, B=G, C=G], ...
- Partially Explained: [A=G, B=U, C=S0], [A=U, B=S1, C=G], [A=S0, B=U, C=G], ...
- Faults Isolated, No Explanation: [A=G, B=G, C=U], [A=G, B=U, C=G], [A=U, B=G, C=G]
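
To make the enumeration concrete, here is a minimal Python sketch. Everything in it is an assumption for illustration: the three-component candidate space, the mode priors, and the toy consistency test stand in for a real diagnosis model.

```python
from itertools import product

# Illustrative mode set: Good, stuck-at-0, stuck-at-1, Unknown.
MODES = ["G", "S0", "S1", "U"]

# Assumed (made-up) mode priors; Unknown gets very little mass.
PRIOR = {"G": 0.99, "S0": 0.004, "S1": 0.004, "U": 0.002}

def score(candidate, consistent):
    """Unnormalized P(X | O): the candidate's prior probability if it is
    consistent with the observations, else 0."""
    if not consistent(candidate):
        return 0.0
    p = 1.0
    for mode in candidate:
        p *= PRIOR[mode]
    return p

def k_best_diagnoses(k, consistent, n_components=3):
    """Prefix(k)( Sort {X_i} by decreasing P(X_i | O) )."""
    candidates = product(MODES, repeat=n_components)
    ranked = sorted(candidates,
                    key=lambda c: score(c, consistent),
                    reverse=True)
    return ranked[:k]

# Toy consistency test: the observations rule out the all-Good assignment.
print(k_best_diagnoses(5, lambda c: c != ("G", "G", "G")))
```

Because mode priors multiply, the top of the ranking is dominated by candidates with a single known fault, matching the slide's point that a few diagnoses cover most of the probability mass.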

Estimate P(S^t | o^{1:t}) Dynamically with a Bayes Filter

(Figure: belief over the state space, shown as an initial belief, the posterior belief after an action is taken, and the posterior belief after sensing. Image by MIT OpenCourseWare.)

Localizing a Robot within a Topological Map

From location x1, taking action a: p(x2 | x1, a) = 0.9, p(x3 | x1, a) = 0.05, p(x4 | x1, a) = 0.05. Observations can be features such as corridor features, junction features, etc.

(Map figure source unknown. All rights reserved. This content is excluded from our Creative Commons license. For more information, see http://ocw.mit.edu/fairuse.)
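
Below is a minimal discrete Bayes filter sketch for a map like this one. The transition row from x1 uses the slide's numbers; the remaining rows and the observation model are illustrative assumptions.

```python
# Minimal discrete Bayes filter over four map locations x1..x4.
# T[x][x'] = p(x' | x, a) for a single fixed action a. The x1 row follows
# the slide; the other rows are assumed for illustration.
T = {
    "x1": {"x1": 0.0, "x2": 0.9, "x3": 0.05, "x4": 0.05},
    "x2": {"x1": 0.0, "x2": 0.1, "x3": 0.9,  "x4": 0.0},
    "x3": {"x1": 0.0, "x2": 0.0, "x3": 0.1,  "x4": 0.9},
    "x4": {"x1": 0.0, "x2": 0.0, "x3": 0.0,  "x4": 1.0},
}

# Assumed observation model: p(o | x) for seeing a "junction" feature.
OBS = {"x1": 0.1, "x2": 0.8, "x3": 0.1, "x4": 0.8}

def predict(belief):
    """Posterior belief after an action: sum over predecessor states."""
    return {x2: sum(belief[x1] * T[x1][x2] for x1 in belief) for x2 in T}

def update(belief, p_obs):
    """Posterior belief after sensing: weight by p(o | x), renormalize."""
    unnorm = {x: belief[x] * p_obs[x] for x in belief}
    z = sum(unnorm.values())
    return {x: p / z for x, p in unnorm.items()}

belief = {"x1": 1.0, "x2": 0.0, "x3": 0.0, "x4": 0.0}  # initial belief
belief = update(predict(belief), OBS)                   # one filter step
print(belief)  # mass concentrates on x2, where the junction was seen
```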

Outline
- Motivation
- Set Theoretic View of Propositional Logic
- From Propositional Logic to Probabilities
- Probabilistic Inference

Propositional Logic: Set Theoretic Semantics

Given a sentence S, its models M(S) form a subset of the universe U of all interpretations.

Set Theoretic Semantics: S ≡ True
M(True) ≡ U, the universe.

Set Theoretic Semantics: S ≡ not Q
M(not Q) ≡ U − M(Q).

Set Theoretic Semantics: S ≡ Q and R
M(Q and R) ≡ M(Q) ∩ M(R).

Set Theoretic Semantics: False
M(False) ≡ φ.

Set Theoretic Semantics: Q or R
M(Q or R) ≡ M(Q) ∪ M(R).

Set Theoretic Semantics: Q implies R
"Q implies R" is True iff Q entails R, i.e., iff M(Q) ⊆ M(R).

Set Theoretic Semantics: Q implies R (continued)
Equivalently, "Q implies R" is True iff "Q and not R" is inconsistent, i.e., iff M(Q and not R) ≡ φ, since M(Q and not R) ≡ M(Q) ∩ (U − M(R)).

Axioms of Sets over Universe U
1. A ∪ B ≡ B ∪ A (Commutativity)
2. A ∪ (B ∪ C) ≡ (A ∪ B) ∪ C (Associativity)
3. A ∪ (B ∩ C) ≡ (A ∪ B) ∩ (A ∪ C) (Distributivity)
4. ~(~A) ≡ A (~ Elimination)
5. ~(A ∪ B) ≡ (~A) ∩ (~B) (De Morgan's)
6. A ∩ (~A) ≡ φ
7. A ∩ U ≡ A (Identity)
The propositional logic axioms follow from these axioms (see the sketch below).
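
These semantics can be checked mechanically. The following sketch (assuming a toy universe of interpretations over just two propositions) represents models as Python sets, so entailment becomes the subset test:

```python
from itertools import product

# Universe U: all interpretations (truth assignments) over propositions q, r.
U = set(product([False, True], repeat=2))  # each element is a pair (q, r)

def M(sentence):
    """Models of a sentence: the interpretations that satisfy it."""
    return {i for i in U if sentence(*i)}

Q = lambda q, r: q
R = lambda q, r: r
Q_and_R = lambda q, r: q and r
Q_or_R  = lambda q, r: q or r
not_Q   = lambda q, r: not q

assert M(not_Q) == U - M(Q)        # M(not Q) = U minus M(Q)
assert M(Q_and_R) == M(Q) & M(R)   # intersection
assert M(Q_or_R)  == M(Q) | M(R)   # union
assert M(Q_and_R) <= M(Q)          # "Q and R" entails Q (subset test)
# "Q implies R" is valid iff M(Q) is a subset of M(R) -- not so here:
assert not (M(Q) <= M(R))
# ... equivalently, iff M(Q and not R) is empty:
assert M(lambda q, r: q and not r) != set()
print("set-theoretic semantics checks pass")
```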

Outline
- Motivation
- Set Theoretic View of Propositional Logic
- From Propositional Logic to Probabilities
  - Degrees of Belief
  - Discrete Random Variables
- Probabilistic Inference

Why Do We Need Degrees of Belief?

Given theory Φ and sentence Q, logic distinguishes only three cases:
- Q is never true (unsatisfiable): M(Φ) ∩ M(Q) ≡ φ.
- Q could be true (satisfiable): M(Φ) ∩ M(Q) ≠ φ.
- Q must be true (entailment): M(Φ) ⊆ M(Q).

Consider Q = "One of the valves to Engine A is stuck closed" versus Q = "All of the components in the propulsion system are broken in a way never seen before." Logic can only classify both as "could be true," yet we believe the first far more strongly than the second; degrees of belief capture this distinction.

Degrees of Belief

Probability P: Events → [0, 1] is a measure of the likelihood that an event is true.
- Events are like propositions: Φ, Q, Φ ∧ Q, Φ ∨ Q, ¬Q, ...
- The sample space {s_i} is like the set of interpretations: mutually exclusive, collectively exhaustive, finest-grained events.
- P is defined over the samples s_i, and for any event Φ:
  P(Φ) = Σ_{s_i ∈ Φ} P(s_i)
  This is like counting weighted models in logic.

Axioms of Probability

P: Events → Reals
1. For any event A, P(A) ≥ 0.
2. P(U) = 1.
3. If A ∩ B ≡ φ, then P(A ∪ B) = P(A) + P(B).
All of conventional probability theory follows from these axioms.

Conditional Probability: P(Q | Φ)

Is Q true given Φ? That is, what is P(Q given Φ)? Conditioning renormalizes the probability mass onto the samples consistent with Φ:

P(s_j | Φ) = P(s_j) / P(Φ) if s_j ∈ Φ, and 0 otherwise.

More generally, for an event Q:

P(Q | Φ) = P(Q, Φ) / P(Φ)

Degree of Belief

Given theory Φ and sentence Q:
- Q is inconsistent with Φ (M(Φ) ∩ M(Q) ≡ φ): P(Q | Φ) = 0.
- Q could be true: P(Q | Φ) = P(Q, Φ) / P(Φ), between 0 and 1.
- Q must be true (M(Φ) ⊆ M(Q)): P(Q | Φ) = 1.

Representing the Sample Space Using Discrete Random Variables

Discrete random variable X:
- Domain D_X = {x_1, x_2, ..., x_n}: mutually exclusive, collectively exhaustive values.
- P(X): D_X → [0, 1], with Σ_i P(x_i) = 1.

Joint distribution over X_1, ..., X_n:
- Domain: Π_i D_{X_i}
- P(X_1, ..., X_n): Π_i D_{X_i} → [0, 1]

Example, for Y ∈ {cloudy, sunny} and Z ∈ {raining, dry}:

P(Y, Z)       Z = raining   Z = dry
Y = cloudy       .197         .20
Y = sunny        .003         .60

Notation: P(x_1, ..., x_n) ≡ P(X_1 = x_1 ∧ ... ∧ X_n = x_n).
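
As a small executable sketch of the last two ideas, using the Y/Z table above: the joint distribution maps samples to weights, and an event's probability is the sum of the weights of its samples (weighted model counting):

```python
# Joint distribution P(Y, Z) from the table above.
P_YZ = {
    ("cloudy", "raining"): 0.197, ("cloudy", "dry"): 0.200,
    ("sunny",  "raining"): 0.003, ("sunny",  "dry"): 0.600,
}

def P(event):
    """P(event) = sum of P(s_i) over samples s_i in the event."""
    return sum(p for s, p in P_YZ.items() if event(s))

cloudy  = lambda s: s[0] == "cloudy"
raining = lambda s: s[1] == "raining"

print(P(cloudy))                             # 0.397
print(P(lambda s: cloudy(s) or raining(s)))  # 0.4
assert abs(P(lambda s: True) - 1.0) < 1e-9   # axiom 2: P(U) = 1
```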

Outline
- Motivation
- Set Theoretic View of Propositional Logic
- From Propositional Logic to Probabilities
- Probabilistic Inference
  - General Queries and Inference Methods
  - Bayes Net Inference
  - Model-based Diagnosis (Optional)

Types of Probabilistic Queries

Let X = <S; O>, where S are the state variables and O the observables. (A sketch of the first three queries follows this list.)
- Belief Assessment: b(S_i) = P(S_i | o)
- Most Probable Explanation (MPE): s* = arg max P(s | o) over all s ∈ D_S
- Maximum A Posteriori Hypothesis (MAP): given A ⊆ S, a* = arg max P(a | o) over all a ∈ D_A
- Maximum Expected Utility (MEU): given decision variables D, d* = arg max_d Σ_x u(x) P(x | d) over all d ∈ D_D
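
The first three queries can be sketched against a small joint distribution. The joint below, over binary state variables S1, S2 and observable O, is made up purely for illustration:

```python
from itertools import product

# Illustrative joint P(S1, S2, O) over binary variables; the numbers are
# assumptions chosen only to be nonnegative and sum to 1.
JOINT = {
    (0, 0, 0): 0.20, (0, 0, 1): 0.05,
    (0, 1, 0): 0.10, (0, 1, 1): 0.05,
    (1, 0, 0): 0.05, (1, 0, 1): 0.15,
    (1, 1, 0): 0.10, (1, 1, 1): 0.30,
}

def mpe(o):
    """MPE: the full state (s1, s2) maximizing P(s | o).
    P(s | o) is proportional to P(s, o), so argmax over the joint suffices."""
    states = list(product([0, 1], repeat=2))
    return max(states, key=lambda s: JOINT[(s[0], s[1], o)])

def map_s1(o):
    """MAP over the subset A = {S1}: marginalize S2 out, then maximize."""
    p_s1 = {s1: sum(JOINT[(s1, s2, o)] for s2 in [0, 1]) for s1 in [0, 1]}
    return max(p_s1, key=p_s1.get)

def belief_s1(o):
    """Belief assessment: b(s1) = P(S1 = s1 | O = o)."""
    p_s1 = {s1: sum(JOINT[(s1, s2, o)] for s2 in [0, 1]) for s1 in [0, 1]}
    z = sum(p_s1.values())
    return {s1: p / z for s1, p in p_s1.items()}

print(mpe(o=1))        # (1, 1): the single most likely full explanation
print(map_s1(o=1))     # 1
print(belief_s1(o=1))  # {0: 0.1818..., 1: 0.8181...}
```

Note that MPE maximizes over complete assignments while MAP first sums out the nuisance variables, so the two answers can differ in general.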

Common Inference Methods and Approximations Used to Answer Queries

1. Product (chain) rule.
2. Product rule + conditional independence.
3. Eliminate variables by marginalizing.
4. Exploit the distribution during elimination.
5. Conditional probability via elimination.
6. Conditional probability via Bayes' Rule.

Approximations:
1. Uniform Likelihood Assumption.
2. Naïve Bayes Assumption.

Outline
- Motivation
- Set Theoretic View of Propositional Logic
- From Propositional Logic to Probabilities
- Probabilistic Inference
  - General Queries and Inference Methods
  - Bayes Net Inference
  - Model-based Diagnosis (Optional)

Bayesian Network

Example network: Burglary → Alarm ← Earthquake; TV → JohnCall ← Alarm; Alarm → MaryCall ← Nap. For instance, MaryCall has the conditional probability P(M | A, N).

Input:
- A directed acyclic graph: nodes denote random variables; an arc from P to C makes P a parent of child C.
- A conditional probability P(C | parents(C)) for each node.
Output: answers to the queries given earlier.

Conditional Independence for a Bayesian Network

Query: how does P(M | N, A, B, E, T, J) simplify?
Answer: M ⊥ B, E, T, J | N, A, so P(M | N, A, B, E, T, J) = P(M | N, A).
In general, a variable is conditionally independent of its non-descendants, given its parents.

Computing the Joint Distribution P(X) for a Bayesian Network

(Same network: Burglary, Earthquake, TV, Alarm, Nap, JohnCall, MaryCall.)
The joint distribution assigns a probability to every interpretation, and queries are computed from the joint. How do we compute the joint distribution P(J, M, T, A, N, B, E)?

Product (Chain) Rule (Method 1)

Variables: S (Smoking) ∈ {no, light, heavy}; C (Cancer) ∈ {none, benign, malig}.

P(S):
  P(S=no)     0.80
  P(S=light)  0.15
  P(S=heavy)  0.05

P(C | S)       Smoking = no   light   heavy
  P(C=none)              0.96   0.88    0.60
  P(C=benign)            0.03   0.08    0.30
  P(C=malig)             0.01   0.04    0.10

By the product rule, P(C, S) = P(C | S) P(S); e.g., P(C=none, S=no) = 0.96 × 0.80 = 0.768.

P(C, S)    Smoking \ Cancer =  none    benign   malig
  no                           0.768   0.024    0.008
  light                        0.132   0.012    0.006
  heavy                        0.030   0.015    0.005
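
This sketch reproduces the joint table directly from the two given tables via the product rule, using exactly the lecture's numbers:

```python
# P(S) and P(C | S) from the slide.
P_S = {"no": 0.80, "light": 0.15, "heavy": 0.05}
P_C_given_S = {
    "no":    {"none": 0.96, "benign": 0.03, "malig": 0.01},
    "light": {"none": 0.88, "benign": 0.08, "malig": 0.04},
    "heavy": {"none": 0.60, "benign": 0.30, "malig": 0.10},
}

# Product rule: P(C, S) = P(C | S) * P(S).
P_CS = {(c, s): P_C_given_S[s][c] * P_S[s]
        for s in P_S for c in P_C_given_S[s]}

print(P_CS[("none", "no")])   # 0.96 * 0.80 = 0.768, matching the table
assert abs(sum(P_CS.values()) - 1.0) < 1e-9  # a valid joint sums to 1
```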

Product Rule + Conditional Independence (Method 2)

If A is conditionally independent of C given B (i.e., A ⊥ C | B), then P(A | B, C) = P(A | B).

Suppose A ⊥ B, C | E and B ⊥ C | E; find P(A, B, C | E):
- Product rule: P(A, B, C | E) = P(A | B, C, E) P(B | C, E) P(C | E)
- Independence: = P(A | E) P(B | E) P(C | E)

Product Rule for Bayes Nets

X_i is independent of its ancestors, given its parents, so the chain rule expansion
P(X_1, ..., X_n) = Π_i P(X_i | X_1, ..., X_{i-1})
simplifies to
P(X_1, ..., X_n) = Π_i P(X_i | parents(X_i)).
For the alarm network: P(J, M, T, A, N, B, E) = P(J | A, T) P(M | A, N) P(A | B, E) P(T) P(N) P(B) P(E).
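
A sketch of this factorization for the alarm network follows. The graph structure is read off the slide (assuming JohnCall's parents are Alarm and TV, as the layout suggests); every CPT number below is a made-up placeholder, since the lecture does not give them:

```python
# Alarm-network factorization sketch. All CPT values are illustrative
# assumptions, not the lecture's numbers.
P_B = {True: 0.001, False: 0.999}   # Burglary
P_E = {True: 0.002, False: 0.998}   # Earthquake
P_T = {True: 0.5,   False: 0.5}     # TV on
P_N = {True: 0.3,   False: 0.7}     # Nap

def P_A(a, b, e):   # P(Alarm | Burglary, Earthquake)
    p = {(True, True): 0.95, (True, False): 0.94,
         (False, True): 0.29, (False, False): 0.001}[(b, e)]
    return p if a else 1 - p

def P_J(j, a, t):   # P(JohnCall | Alarm, TV)
    p = {(True, True): 0.5, (True, False): 0.9,
         (False, True): 0.01, (False, False): 0.05}[(a, t)]
    return p if j else 1 - p

def P_M(m, a, n):   # P(MaryCall | Alarm, Nap)
    p = {(True, True): 0.1, (True, False): 0.7,
         (False, True): 0.001, (False, False): 0.01}[(a, n)]
    return p if m else 1 - p

def joint(j, m, t, a, n, b, e):
    """P(J,M,T,A,N,B,E) = P(J|A,T) P(M|A,N) P(A|B,E) P(T) P(N) P(B) P(E)."""
    return (P_J(j, a, t) * P_M(m, a, n) * P_A(a, b, e)
            * P_T[t] * P_N[n] * P_B[b] * P_E[e])

# Both call; alarm on; burglary; no quake, TV off, no nap:
print(joint(j=True, m=True, t=False, a=True, n=False, b=True, e=False))
```

The payoff is in the parameter count: seven small tables instead of one table with 2^7 entries.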

Computing Probabilistic Queries

Let X = <S; O>.
- Belief Assessment: b(S_i) = P(S_i | o)
- Most Probable Explanation (MPE): s* = arg max P(s | o) over all s ∈ D_S
- Maximum A Posteriori Hypothesis (MAP): given A ⊆ S, a* = arg max P(a | o) over all a ∈ D_A

Solution: some combination of
1. Start with the joint distribution.
2. Eliminate some variables (marginalize).
3. Condition on observations.
4. Find the assignment maximizing probability.

Eliminate Variables by Marginalizing (Method 3)

Given P(A, B), find P(A) by summing over B, for A ∈ {cloudy, sunny} and B ∈ {raining, dry}:

P(A, B)       B = raining   B = dry     P(A)
A = cloudy       .197         .200       .397
A = sunny        .003         .600       .603
P(B)             .2           .8
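
Method 3 on the lecture's own table, in a few lines:

```python
# Joint P(A, B) from the slide: A in {cloudy, sunny}, B in {raining, dry}.
P_AB = {
    ("cloudy", "raining"): 0.197, ("cloudy", "dry"): 0.200,
    ("sunny",  "raining"): 0.003, ("sunny",  "dry"): 0.600,
}

# Marginalize B out: P(a) = sum over b of P(a, b).
P_A = {}
for (a, b), p in P_AB.items():
    P_A[a] = P_A.get(a, 0.0) + p
print(P_A)  # {'cloudy': 0.397, 'sunny': 0.603}

# Marginalizing A out instead gives P(B).
P_B = {}
for (a, b), p in P_AB.items():
    P_B[b] = P_B.get(b, 0.0) + p
print(P_B)  # {'raining': 0.2, 'dry': 0.8}
```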

Exploit the Distribution During Elimination (Method 4)

Bayes net chain rule (alarm network): X_i is independent of its ancestors, given its parents, so the joint factors as in Method 2.

Issue: the size of P(X) is exponential in the number of variables.
Solution: move the sums inside the product, so each sum touches only the factors that mention the summed variable, e.g. Σ_B Σ_E P(B) P(E) P(A | B, E) = Σ_B P(B) Σ_E P(E) P(A | B, E).

Compute Conditional Probability via Elimination (Method 5)

Example: P(A | B)? Divide each entry of P(A, B) by the marginal P(B), where P(B=raining) = .2 and P(B=dry) = .8:

P(A | B)      B = raining   B = dry
A = cloudy     .197/.2       .20/.8
A = sunny      .003/.2       .60/.8

Note: the denominator P(B = b) is constant for all a_i in A.
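
And Method 5 on the same table: divide each joint entry by the marginal of the conditioning variable:

```python
# P(A, B) and P(B) as above (the lecture's numbers).
P_AB = {
    ("cloudy", "raining"): 0.197, ("cloudy", "dry"): 0.200,
    ("sunny",  "raining"): 0.003, ("sunny",  "dry"): 0.600,
}
P_B = {"raining": 0.2, "dry": 0.8}

# P(a | b) = P(a, b) / P(b); the denominator is constant across a.
P_A_given_B = {(a, b): p / P_B[b] for (a, b), p in P_AB.items()}

print(P_A_given_B[("cloudy", "raining")])  # 0.197 / 0.2 = 0.985
print(P_A_given_B[("sunny",  "dry")])      # 0.600 / 0.8 = 0.75
```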

Computing Probabilistic Queries

Let X = <S, O>.
- Belief Assessment: partition S = <S_i, Y>; b(S_i) = P(S_i | o), with the remaining variables Y summed out.
- Most Probable Explanation (MPE): s* = arg max P(s | o) over all s ∈ D_S.
- Maximum A Posteriori Hypothesis (MAP): partition S = <A, Y>, given A ⊆ S; a* = arg max P(a | o) over all a ∈ D_A, with Y summed out.

Outline
- Motivation
- Set Theoretic View of Propositional Logic
- From Propositional Logic to Probabilities
- Probabilistic Inference
  - General Queries and Inference Methods
  - Bayes Net Inference
  - Model-based Diagnosis (Optional)

Estimating States of Dynamic Systems (next week)

(Figure: a sequence of states over time, from S^0 to S^T.)

Given a sequence of commands and observations:
- Infer the current distribution of (most likely) states.
- Infer the most likely trajectories.

MIT OpenCourseWare
http://ocw.mit.edu

16.410 / 16.413 Principles of Autonomy and Decision Making
Fall 2010

For information about citing these materials or our Terms of Use, visit: http://ocw.mit.edu/terms.