Bayesian belief networks


CS 2001, Lecture 1: Bayesian belief networks
Milos Hauskrecht, milos@cs.pitt.edu, 5329 Sennott Square, 4-8845

Milos' research interests:
- Artificial intelligence: planning, reasoning and optimization in the presence of uncertainty
- Machine learning
- Applications: medicine, finance and investments
Main research focus: models of high-dimensional stochastic problems and their efficient solutions.

KB for medical diagnosis

We want to build a KB system for the diagnosis of pneumonia.

Problem description:
- Disease: pneumonia
- Patient symptoms (findings, lab tests): fever, cough, paleness, WBC (white blood cell) count, chest pain, etc.
- Representation of a patient case: statements that hold (are true) for that patient, e.g.
  Fever = True
  Cough = False
  WBCcount = High
- Diagnostic task: infer whether the patient suffers from pneumonia or not, given the symptoms.

Uncertainty

To make diagnostic inference possible we need to represent rules or axioms that relate symptoms and diagnoses. Problem: the disease/symptoms relation is not deterministic (it varies from patient to patient).

Disease -> Symptoms uncertainty: a patient suffering from pneumonia may not have a fever all the time, may or may not have a cough, and the white blood cell test can be in the normal range.

Symptoms -> Disease uncertainty: high fever is typical for many diseases (e.g. bacterial diseases) and does not point specifically to pneumonia; fever, cough, paleness and high WBC count combined do not always point to pneumonia.

Modeling the uncertainty

How do we describe and represent relations in the presence of uncertainty? How do we manipulate such knowledge to make inferences? Humans can reason with uncertainty: Pneumonia? given Pale, Fever, Cough, WBC count.

Methods for representing uncertainty

KB systems based on propositional and first-order logic often represent uncertain statements and axioms of the domain as rules with various certainty factors. This approach was very popular in the 70s and 80s (MYCIN):

If
1. the stain of the organism is gram-positive, and
2. the morphology of the organism is coccus, and
3. the growth conformation of the organism is chains,
Then with certainty 0.7 the identity of the organism is streptococcus.

Problems:
- Chaining of multiple inference rules (propagation of uncertainty)
- Combination of rules with the same conclusions
- After some number of combinations, the results are not intuitive.

Representing certainty factors

Facts (propositional statements about the world) are assigned a certainty number reflecting the belief that the statement is satisfied:
  CF(Pneumonia = True) = 0.7

Rules incorporate tests on the certainty values:
  (A in [0.5, 1]) AND (B in [0.7, 1]) -> C with CF = 0.8

Methods for combination of conclusions:
  (A in [0.5, 1]) AND (B in [0.7, 1]) -> C with CF = 0.8
  (E in [0.8, 1]) AND (D in [0.9, 1]) -> C with CF = 0.9
Which combination rule is right? (See the sketch at the end of this slide.)
  CF(C) = max[0.9, 0.8] = 0.9
  CF(C) = 0.9 * 0.8 = 0.72
  CF(C) = 0.9 + 0.8 - 0.9 * 0.8 = 0.98 ?

Probability theory

A well-defined, coherent theory for representing uncertainty and for reasoning with it.

Representation: propositional statements are assignments of values to random variables:
  Pneumonia = True
  WBCcount = high

Probabilities over statements model the degree of belief in these statements:
  P(Pneumonia = True) = 0.001
  P(WBCcount = high) = 0.005
  P(Pneumonia = True, Fever = True) = 0.0009
  P(Pneumonia = False, WBCcount = normal, Cough = False) = 0.97
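A minimal sketch (not from the lecture) making the combination ambiguity concrete: two rules conclude C with CF 0.9 and CF 0.8, and each ad-hoc combination function gives a different answer, which is the core problem with certainty factors.

```python
cf1, cf2 = 0.9, 0.8

print(max(cf1, cf2))          # 0.9  -- take the stronger rule
print(cf1 * cf2)              # 0.72 -- product
print(cf1 + cf2 - cf1 * cf2)  # 0.98 -- noisy-OR-style combination
```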

Joint probability distribution

A joint probability distribution (for a set of variables) defines probabilities for all possible assignments of values to the variables in the set. P(Pneumonia, WBCcount) is a 2 x 3 table:

                    WBCcount
  Pneumonia    high     normal   low
  True         0.0008   0.0001   0.0001
  False        0.0042   0.9929   0.0019

Marginalization (summing of rows or columns) sums out variables:
  P(WBCcount)  = (0.005, 0.993, 0.002)
  P(Pneumonia) = (0.001, 0.999)

Conditional probability distribution

The conditional probability distribution of A given (fixed) B is defined in terms of joint probabilities:
  P(A | B) = P(A, B) / P(B)

Joint probabilities can in turn be expressed in terms of conditional probabilities:
  P(A, B) = P(A | B) P(B)

Conditional probability is useful for diagnostic reasoning about the effect of symptoms (findings) on the disease:
  P(Pneumonia = True | Fever = True, WBCcount = high, Cough = True)
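A short sketch of these operations on the table above, using numpy (an illustration, not part of the lecture):

```python
import numpy as np

# Joint P(Pneumonia, WBCcount) from the table above.
# Rows: Pneumonia = True, False; columns: WBCcount = high, normal, low.
joint = np.array([[0.0008, 0.0001, 0.0001],
                  [0.0042, 0.9929, 0.0019]])

p_pneumonia = joint.sum(axis=1)  # marginalize out WBCcount  -> [0.001, 0.999]
p_wbc = joint.sum(axis=0)        # marginalize out Pneumonia -> [0.005, 0.993, 0.002]

# Conditioning: P(Pneumonia | WBCcount = high) = P(Pneumonia, high) / P(high)
print(joint[:, 0] / p_wbc[0])    # [0.16, 0.84]
```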

Modeling uncertainty with probabilities

Full joint distribution: the joint distribution over all random variables defining the domain. It is sufficient to represent the complete domain and to do any type of probabilistic reasoning.

Problems:
- Space complexity: storing the full joint distribution requires O(d^n) numbers (n = number of random variables, d = number of values per variable).
- Inference complexity: computing some queries requires O(d^n) steps.
- Acquisition problem: who is going to define all of the probability entries?

Pneumonia example: complexities

Space complexity: Pneumonia (2 values: T, F), Fever (2: T, F), Cough (2: T, F), WBCcount (3: high, normal, low), Paleness (2: T, F). Number of assignments: 2*2*2*3*2 = 48, so we need to define at least 47 probabilities.

Time complexity: assume we need to compute the probability of Pneumonia = T from the full joint:

  P(Pneumonia = T) = sum over i in {T,F}, j in {T,F}, k in {high, normal, low}, u in {T,F} of
                     P(Pneumonia = T, Fever = i, Cough = j, WBCcount = k, Pale = u)

a sum over 2*2*3*2 = 24 combinations.
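A sketch of this brute-force marginalization, assuming a hypothetical dictionary full_joint that maps each complete assignment (pneumonia, fever, cough, wbc, pale) to its joint probability:

```python
from itertools import product

def p_pneumonia_true(full_joint):
    # Sum the full joint over all assignments consistent with Pneumonia = T.
    total = 0.0
    for fever, cough, wbc, pale in product([True, False], [True, False],
                                           ['high', 'normal', 'low'],
                                           [True, False]):
        total += full_joint[(True, fever, cough, wbc, pale)]
    return total  # 2*2*3*2 = 24 joint entries summed
```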

Modeling uncertainty with probabilities

Knowledge-based system era (70s to early 80s): extensional, non-probabilistic models. Probability techniques were avoided because of the space, time and acquisition bottlenecks in defining full joint distributions. This had a negative effect on the advancement of KB systems and AI in the 80s in general.

Breakthrough (late 80s, beginning of 90s): Bayesian belief networks.
- Give solutions to the space and acquisition bottlenecks
- Significant improvements in the time cost of inference

Bayesian belief networks (BBNs)

Bayesian belief networks represent the full joint distribution more compactly, with a smaller number of parameters. They take advantage of conditional and marginal independences among components in the distribution:

  A and B are independent:                       P(A, B) = P(A) P(B)
  A and B are conditionally independent given C: P(A, B | C) = P(A | C) P(B | C)
                                                 P(A | C, B) = P(A | C)
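A small sketch (an illustration, not from the lecture) of testing marginal independence numerically for two discrete variables: joint_ab[a, b] = P(A = a, B = b), and A, B are independent iff the joint factorizes as P(A) P(B).

```python
import numpy as np

def is_independent(joint_ab, tol=1e-9):
    p_a = joint_ab.sum(axis=1, keepdims=True)  # column vector P(A)
    p_b = joint_ab.sum(axis=0, keepdims=True)  # row vector P(B)
    return np.allclose(joint_ab, p_a * p_b, atol=tol)

# A product distribution passes the check:
print(is_independent(np.outer([0.3, 0.7], [0.6, 0.4])))  # True
```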

Alarm system example

Assume your house has an alarm system against burglary. You live in a seismically active area, and the alarm system can occasionally be set off by an earthquake. You have two neighbors, Mary and John, who do not know each other. If they hear the alarm they call you, but this is not guaranteed.

We want to represent the probability distribution of the events: Burglary, Earthquake, Alarm, MaryCalls and JohnCalls.

Causal relations: Burglary -> Alarm, Earthquake -> Alarm, Alarm -> JohnCalls, Alarm -> MaryCalls.

A Bayesian belief network consists of:
1. A graph reflecting direct (causal) dependencies between variables
2. Local conditional distributions relating variables and their parents:
   Burglary: P(B); Earthquake: P(E); Alarm: P(A | B, E); JohnCalls: P(J | A); MaryCalls: P(M | A)

Bayesian belief network (alarm example)

  P(B):                  P(E):
    T       F              T       F
    0.001   0.999          0.002   0.998

  P(A | B, E):
    B  E    A=T     A=F
    T  T    0.95    0.05
    T  F    0.94    0.06
    F  T    0.29    0.71
    F  F    0.001   0.999

  P(J | A):              P(M | A):
    A    J=T    J=F        A    M=T    M=F
    T    0.90   0.10       T    0.70   0.30
    F    0.05   0.95       F    0.01   0.99

(These tables are transcribed into code in the sketch after this slide.)

Bayesian belief networks (general)

Two components: B = (S, Theta_S).
1. A directed acyclic graph S: nodes correspond to random variables, and (missing) links encode independences.
2. Parameters Theta_S: local conditional probability distributions P(X_i | pa(X_i)) for every variable-parent configuration, where pa(X_i) stands for the parents of X_i.
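The alarm network's local distributions, transcribed from the tables above as plain Python dictionaries (a minimal representation chosen for illustration; only P(var = True) is stored, since P(var = False) is the complement):

```python
p_burglary = 0.001                                      # P(B = T)
p_earthquake = 0.002                                    # P(E = T)
p_alarm = {(True, True): 0.95, (True, False): 0.94,     # P(A = T | B, E)
           (False, True): 0.29, (False, False): 0.001}
p_john = {True: 0.90, False: 0.05}                      # P(J = T | A)
p_mary = {True: 0.70, False: 0.01}                      # P(M = T | A)
```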

Full joint distribution in BBNs

The full joint distribution is defined in terms of the local conditional distributions (obtained via the chain rule):

  P(X_1, X_2, ..., X_n) = product over i = 1..n of P(X_i | pa(X_i))

Example: assume the assignment of values B = T, E = T, A = T, J = T, M = F. Then its probability is:

  P(B=T, E=T, A=T, J=T, M=F) = P(B=T) P(E=T) P(A=T | B=T, E=T) P(J=T | A=T) P(M=F | A=T)

(computed numerically in the sketch after this slide).

Independences in BBNs

Three basic independence structures:
1. JohnCalls is independent of Burglary given Alarm.
2. Burglary is independent of Earthquake (not knowing Alarm); Burglary and Earthquake become dependent given Alarm!!
3. MaryCalls is independent of JohnCalls given Alarm.
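Evaluating the factored joint for B=T, E=T, A=T, J=T, M=F, using the CPT dictionaries from the earlier sketch:

```python
p = (p_burglary                 # P(B = T)
     * p_earthquake             # P(E = T)
     * p_alarm[(True, True)]    # P(A = T | B = T, E = T)
     * p_john[True]             # P(J = T | A = T)
     * (1 - p_mary[True]))      # P(M = F | A = T)
print(p)  # 0.001 * 0.002 * 0.95 * 0.9 * 0.3 = 5.13e-07
```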

Independences in BBNs

Other dependences/independences in the network (with a RadioReport node added as a child of Earthquake):
- Earthquake and Burglary are dependent given MaryCalls.
- Burglary and MaryCalls are dependent (not knowing Alarm).
- Burglary and RadioReport are independent given Earthquake.
- Burglary and RadioReport are dependent given MaryCalls.

Parameter complexity problem

In the BBN, the full joint distribution is expressed as a product of conditionals of smaller complexity:

  P(X_1, X_2, ..., X_n) = product over i = 1..n of P(X_i | pa(X_i))

Parameters to be defined (alarm network, all variables binary):
- Full joint: 2^5 - 1 = 31
- BBN: 2^2 + 2(2) + 2(1) = 10
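A short sketch of this parameter count: a binary node with k binary parents needs 2^k numbers (one P(var = T) per parent configuration), while the full joint over 5 binary variables needs 2^5 - 1 free entries.

```python
n_parents = {'Burglary': 0, 'Earthquake': 0, 'Alarm': 2,
             'JohnCalls': 1, 'MaryCalls': 1}
bbn_params = sum(2 ** k for k in n_parents.values())  # 1 + 1 + 4 + 2 + 2 = 10
full_joint_params = 2 ** 5 - 1                        # 31
print(bbn_params, full_joint_params)
```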

Model acquisition problem

The structure of the BBN typically reflects causal relations (BBNs are also sometimes referred to as causal networks). The causal structure is very intuitive in many application domains and is relatively easy to obtain from a domain expert.

The probability parameters of a BBN correspond to conditional distributions relating random variables and their parents. The complexity of the local distributions is much smaller than that of the full joint, so it is easier to estimate the probability parameters by consulting an expert or by learning them from data.

BBNs built in practice

In various areas:
- Intelligent user interfaces (Microsoft)
- Troubleshooting, diagnosis of a technical device
- Medical diagnosis: Pathfinder (Intellipath), CPSC, Munin, QMR-DT
- Collaborative filtering
- Military applications
- Insurance, credit applications

Diagnosis of car engine: diagnose the engine start problem.

Car insurance example: predict claim costs (medical, liability) based on application data.

Alarm network (ICU)

CPCS: Computer-based Patient Case Simulation system (CPCS-PM), developed by Parker and Miller at the University of Pittsburgh; 422 nodes and 867 arcs.

QMR-DT: medical diagnosis in internal medicine; a bipartite network of disease/finding relations.