Bayesian Networks. Course: CS40022 Instructor: Dr. Pallab Dasgupta


Bayesian Networks. Course: CS40022. Instructor: Dr. Pallab Dasgupta, Department of Computer Science & Engineering, Indian Institute of Technology Kharagpur

Example: Burglar alarm at home. Fairly reliable at detecting a burglary, but responds at times to minor earthquakes. Two neighbors, on hearing the alarm, call the police. John always calls when he hears the alarm, but sometimes confuses the telephone ringing with the alarm and calls then, too. Mary likes loud music and sometimes misses the alarm altogether.

Belief Network Example

Burglary: P(B) = 0.001    Earthquake: P(E) = 0.002

Alarm:
  B E | P(A)
  T T | 0.95
  T F | 0.95
  F T | 0.29
  F F | 0.001

JohnCalls:
  A | P(J)
  T | 0.90
  F | 0.05

MaryCalls:
  A | P(M)
  T | 0.70
  F | 0.01

The joint probability distribution. A generic entry in the joint probability distribution P(x_1, …, x_n) is given by:

P(x_1, …, x_n) = ∏_{i=1}^{n} P(x_i | Parents(X_i))

The joint probability distribution. Probability of the event that the alarm has sounded but neither a burglary nor an earthquake has occurred, and both Mary and John call:

P(J ∧ M ∧ A ∧ ¬B ∧ ¬E) = P(J | A) P(M | A) P(A | ¬B ∧ ¬E) P(¬B) P(¬E)
                        = 0.9 × 0.7 × 0.001 × 0.999 × 0.998 ≈ 0.00062
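As a concrete check of the product rule, here is a minimal Python sketch (the dictionary layout and the function name `joint` are my own conventions, not from the slides) that encodes the CPTs from the belief network example and multiplies out this entry:

```python
# Minimal sketch: joint probability for the burglary network.
P_B = 0.001                       # P(Burglary)
P_E = 0.002                       # P(Earthquake)
P_A = {(True, True): 0.95,        # P(Alarm | B, E), rows of the CPT above
       (True, False): 0.95,
       (False, True): 0.29,
       (False, False): 0.001}
P_J = {True: 0.90, False: 0.05}   # P(JohnCalls | A)
P_M = {True: 0.70, False: 0.01}   # P(MaryCalls | A)

def joint(b, e, a, j, m):
    """P(b, e, a, j, m) = P(b) P(e) P(a | b, e) P(j | a) P(m | a)."""
    pb = P_B if b else 1 - P_B
    pe = P_E if e else 1 - P_E
    pa = P_A[(b, e)] if a else 1 - P_A[(b, e)]
    pj = P_J[a] if j else 1 - P_J[a]
    pm = P_M[a] if m else 1 - P_M[a]
    return pb * pe * pa * pj * pm

# Alarm sounded, no burglary, no earthquake, both call:
print(joint(False, False, True, True, True))   # ~0.00062
```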

Conditional independence. By the chain rule,

P(x_1, …, x_n) = P(x_n | x_{n-1}, …, x_1) P(x_{n-1} | x_{n-2}, …, x_1) … P(x_2 | x_1) P(x_1)
               = ∏_{i=1}^{n} P(x_i | x_{i-1}, …, x_1)

The belief network represents conditional independence:

P(x_i | x_{i-1}, …, x_1) = P(x_i | Parents(X_i))

Incremental Network Construction (a sketch of the loop follows below):
1. Choose the set of relevant variables X_i that describe the domain
2. Choose an ordering for the variables (a very important step)
3. While there are variables left:
   a. Pick a variable X_i and add a node for it
   b. Set Parents(X_i) to some minimal set of existing nodes such that the conditional independence property is satisfied
   c. Define the conditional probability table for X_i
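As referenced above, a minimal sketch of the construction loop. The ordering and the minimal parent sets encode the modeller's domain judgment, so they are supplied as data here; the names are illustrative:

```python
# Sketch: a network is just a parent map plus CPTs, built in order.
ordering = ["Burglary", "Earthquake", "Alarm", "JohnCalls", "MaryCalls"]
parents_of = {                       # hypothetical modeller-chosen minimal parents
    "Burglary": [], "Earthquake": [],
    "Alarm": ["Burglary", "Earthquake"],
    "JohnCalls": ["Alarm"], "MaryCalls": ["Alarm"],
}

network = {}
for var in ordering:                              # step 3a: pick the next variable
    network[var] = {"parents": parents_of[var],   # step 3b: minimal parent set
                    "cpt": {}}                    # step 3c: to be filled in
```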

Conditional Independence Relations. If every undirected path from a node in X to a node in Y is d-separated by a given set of evidence nodes E, then X and Y are conditionally independent given E. A set of nodes E d-separates two sets of nodes X and Y if every undirected path from a node in X to a node in Y is blocked given E.

Conditional Independence Relations. A path is blocked given a set of nodes E if there is a node Z on the path for which one of three conditions holds:
1. Z is in E and Z has one arrow on the path leading in and one arrow out
2. Z is in E and Z has both path arrows leading out
3. Neither Z nor any descendant of Z is in E, and both path arrows lead in to Z
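A sketch of these three conditions as a path-blocking test, assuming the DAG is given as a map from node to children and the path as a list of nodes (the helper names are mine). It is run here on the car-starting network from the next slide:

```python
# Sketch: is an undirected path blocked given evidence set E?
def descendants(z, children):
    """All descendants of z, not including z itself."""
    out, stack = set(), list(children.get(z, ()))
    while stack:
        n = stack.pop()
        if n not in out:
            out.add(n)
            stack.extend(children.get(n, ()))
    return out

def blocked(path, E, children):
    """True if some interior node Z satisfies one of the three conditions."""
    for i in range(1, len(path) - 1):
        a, z, b = path[i - 1], path[i], path[i + 1]
        a_in = z in children.get(a, set())   # arrow a -> z leads in to z
        b_in = z in children.get(b, set())   # arrow b -> z leads in to z
        if z in E and (a_in != b_in):        # 1: one arrow in, one arrow out
            return True
        if z in E and not a_in and not b_in: # 2: both path arrows lead out
            return True
        if a_in and b_in and z not in E and not (descendants(z, children) & E):
            return True                      # 3: unobserved collider
    return False

# Car-starting network from the next slide:
children = {"Battery": {"Radio", "Ignition"},
            "Ignition": {"Starts"}, "Petrol": {"Starts"}}
path = ["Radio", "Battery", "Ignition", "Starts", "Petrol"]
print(blocked(path, set(), children))        # True: Radio, Petrol independent
print(blocked(path, {"Starts"}, children))   # False: dependent given Starts
```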

Conditional independence in belief networks.
[Figure: car network with edges Battery → Radio, Battery → Ignition, and Ignition, Petrol → Starts.]
Whether there is petrol and whether the radio plays are independent given evidence about whether the ignition takes place. Petrol and Radio are independent if it is known whether the battery works.

Conditional independence in belief networks.
[Figure: the same Battery/Radio/Ignition/Petrol/Starts network.]
Petrol and Radio are independent given no evidence at all. But they are dependent given evidence about whether the car starts: if the car does not start, then the radio playing is increased evidence that we are out of petrol.

Inferences using belief networks.
- Diagnostic inferences (from effects to causes): given that JohnCalls, infer that P(Burglary | JohnCalls) = 0.016
- Causal inferences (from causes to effects): given Burglary, infer that P(JohnCalls | Burglary) = 0.86 and P(MaryCalls | Burglary) = 0.67

Inferences using belief networks.
- Intercausal inferences (between causes of a common effect): given Alarm, we have P(Burglary | Alarm) = 0.376. If we add evidence that Earthquake is true, then P(Burglary | Alarm ∧ Earthquake) goes down to 0.003
- Mixed inferences: setting the effect JohnCalls to true and the cause Earthquake to false gives P(Alarm | JohnCalls ∧ ¬Earthquake) = 0.03
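These posteriors are stated without derivation; a brute-force check by enumeration over the full joint (reusing the `joint` function from the earlier sketch; the variable order (B, E, A, J, M) is my own convention) reproduces them:

```python
from itertools import product

def query(target, evidence):
    """P(target | evidence) by summing the joint over all consistent worlds;
    variables are indexed as (B, E, A, J, M); evidence maps index -> value."""
    dist = {True: 0.0, False: 0.0}
    for world in product([True, False], repeat=5):
        if all(world[i] == v for i, v in evidence.items()):
            dist[world[target]] += joint(*world)
    return dist[True] / (dist[True] + dist[False])

B, E, A, J, M = range(5)
print(query(B, {J: True}))            # P(Burglary | JohnCalls)          ~ 0.016
print(query(J, {B: True}))            # P(JohnCalls | Burglary)          ~ 0.86
print(query(B, {A: True}))            # P(Burglary | Alarm)              ~ 0.376
print(query(B, {A: True, E: True}))   # P(Burglary | Alarm, Earthquake)  ~ 0.003
print(query(A, {J: True, E: False}))  # P(Alarm | JohnCalls, ~Earthquake) ~ 0.03
```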

The four patterns:
[Figure: four small graphs showing the position of the evidence E relative to the query Q — Diagnostic (E below Q), Causal (E above Q), Intercausal (E at another cause of a common effect), Mixed (evidence both above and below Q).]

Answering queries. We consider cases where the belief network is a polytree: there is at most one undirected path between any two nodes.

Answering queries.
[Figure: the query node X with parents U_1, …, U_m and children Y_1, …, Y_n; each child Y_i has other parents Z_ij; the causal evidence E⁺_X lies above X and the evidential support E⁻_X below.]

Answering queries.
- U = U_1, …, U_m are the parents of node X
- Y = Y_1, …, Y_n are the children of node X
- X is the query variable
- E is a set of evidence variables
The aim is to compute P(X | E).

Definitions.
- E⁺_X is the causal support for X: the evidence variables "above" X that are connected to X through its parents
- E⁻_X is the evidential support for X: the evidence variables "below" X that are connected to X through its children
- E_{Ui\X} refers to all the evidence connected to node U_i except via the path from X
- E_{Yi\X} refers to all the evidence connected to node Y_i except via the path through its parent X

The computation of P(X | E):

P(X | E) = P(X | E⁺_X, E⁻_X) = P(E⁻_X | X, E⁺_X) P(X | E⁺_X) / P(E⁻_X | E⁺_X)

Since X d-separates E⁺_X from E⁻_X, we can use conditional independence to simplify the first term in the numerator, and we can treat the denominator as a constant:

P(X | E) = α P(E⁻_X | X) P(X | E⁺_X)

The computation of P(X | E⁺_X). We consider all possible configurations of the parents of X and how likely they are given E⁺_X. Let U be the vector of parents U_1, …, U_m, and let u be an assignment of values to them:

P(X | E⁺_X) = Σ_u P(X | u, E⁺_X) P(u | E⁺_X)

The computation of P(X | E⁺_X):

P(X | E⁺_X) = Σ_u P(X | u, E⁺_X) P(u | E⁺_X)

U d-separates X from E⁺_X, so the first term simplifies to P(X | u). We can simplify the second term by noting that E⁺_X d-separates each U_i from the others, and the probability of a conjunction of independent variables is equal to the product of their individual probabilities:

P(X | E⁺_X) = Σ_u P(X | u) ∏_i P(u_i | E⁺_X)

The computation of P(X | E⁺_X):

P(X | E⁺_X) = Σ_u P(X | u) ∏_i P(u_i | E⁺_X)

The last term can be simplified by partitioning E⁺_X into E_{U1\X}, …, E_{Um\X} and noting that E_{Ui\X} d-separates U_i from all the other evidence in E⁺_X:

P(X | E⁺_X) = Σ_u P(X | u) ∏_i P(u_i | E_{Ui\X})

P(X | u) is a lookup in the conditional probability table of X. P(u_i | E_{Ui\X}) is a recursive (smaller) sub-problem.
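When the parents carry no upstream evidence, the recursive term P(u_i | E_{Ui\X}) bottoms out in the parents' priors. A sketch for the alarm node of the earlier example (no evidence anywhere, so the sum is just the prior P(Alarm); P_A, P_B, P_E are reused from the first sketch):

```python
from itertools import product

# Causal support with no upstream evidence:
# P(X | E+) = sum_u P(X | u) * prod_i P(u_i), each P(u_i) a parent prior.
p_alarm = 0.0
for b, e in product([True, False], repeat=2):
    pb = P_B if b else 1 - P_B
    pe = P_E if e else 1 - P_E
    p_alarm += P_A[(b, e)] * pb * pe

print(p_alarm)   # ~0.0025: the prior probability that the alarm sounds
```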

The computation of P(E⁻_X | X). Let Z_i be the parents of Y_i other than X, and let z_i be an assignment of values to the parents. The evidence in each Y_i box is conditionally independent of the others given X:

P(E⁻_X | X) = ∏_i P(E_{Yi\X} | X)

The computation of P(E⁻_X | X):

P(E⁻_X | X) = ∏_i P(E_{Yi\X} | X)

Averaging over Y_i and z_i yields:

P(E⁻_X | X) = ∏_i Σ_{yi} Σ_{zi} P(E_{Yi\X} | X, y_i, z_i) P(y_i, z_i | X)

The computation of P(E⁻_X | X):

P(E⁻_X | X) = ∏_i Σ_{yi} Σ_{zi} P(E_{Yi\X} | X, y_i, z_i) P(y_i, z_i | X)

Breaking E_{Yi\X} into the two independent components E⁻_{Yi} and E⁺_{Yi\X}:

P(E⁻_X | X) = ∏_i Σ_{yi} Σ_{zi} P(E⁻_{Yi} | X, y_i, z_i) P(E⁺_{Yi\X} | X, y_i, z_i) P(y_i, z_i | X)

The computation of P(E⁻_X | X):

P(E⁻_X | X) = ∏_i Σ_{yi} Σ_{zi} P(E⁻_{Yi} | X, y_i, z_i) P(E⁺_{Yi\X} | X, y_i, z_i) P(y_i, z_i | X)

E⁻_{Yi} is independent of X and z_i given y_i, and E⁺_{Yi\X} is independent of X and y_i:

P(E⁻_X | X) = ∏_i Σ_{yi} P(E⁻_{Yi} | y_i) Σ_{zi} P(E⁺_{Yi\X} | z_i) P(y_i, z_i | X)

The computation of P(E⁻_X | X):

P(E⁻_X | X) = ∏_i Σ_{yi} P(E⁻_{Yi} | y_i) Σ_{zi} P(E⁺_{Yi\X} | z_i) P(y_i, z_i | X)

Applying Bayes' rule to P(E⁺_{Yi\X} | z_i):

P(E⁻_X | X) = ∏_i Σ_{yi} P(E⁻_{Yi} | y_i) Σ_{zi} [P(z_i | E⁺_{Yi\X}) P(E⁺_{Yi\X}) / P(z_i)] P(y_i, z_i | X)

The computation of P(E⁻_X | X):

P(E⁻_X | X) = ∏_i Σ_{yi} P(E⁻_{Yi} | y_i) Σ_{zi} [P(z_i | E⁺_{Yi\X}) P(E⁺_{Yi\X}) / P(z_i)] P(y_i, z_i | X)

Rewriting the conjunction of y_i and z_i as P(y_i | z_i, X) P(z_i | X):

P(E⁻_X | X) = ∏_i Σ_{yi} P(E⁻_{Yi} | y_i) Σ_{zi} [P(z_i | E⁺_{Yi\X}) P(E⁺_{Yi\X}) / P(z_i)] P(y_i | z_i, X) P(z_i | X)

The computation of P(E⁻_X | X):

P(E⁻_X | X) = ∏_i Σ_{yi} P(E⁻_{Yi} | y_i) Σ_{zi} [P(z_i | E⁺_{Yi\X}) P(E⁺_{Yi\X}) / P(z_i)] P(y_i | z_i, X) P(z_i | X)

P(z_i | X) = P(z_i) because Z_i and X are d-separated, so the P(z_i) terms cancel. Also, P(E⁺_{Yi\X}) is a constant β_i:

P(E⁻_X | X) = ∏_i Σ_{yi} P(E⁻_{Yi} | y_i) Σ_{zi} β_i P(z_i | E⁺_{Yi\X}) P(y_i | z_i, X)

The computation of P(E⁻_X | X):

P(E⁻_X | X) = ∏_i Σ_{yi} P(E⁻_{Yi} | y_i) Σ_{zi} β_i P(z_i | E⁺_{Yi\X}) P(y_i | z_i, X)

The parents of Y_i (the Z_ij) are independent of each other, so P(z_i | E⁺_{Yi\X}) factors into a product over j. We also combine the β_i into one single β.

The computation of P(E⁻_X | X):

P(E⁻_X | X) = β ∏_i Σ_{yi} P(E⁻_{Yi} | y_i) Σ_{zi} P(y_i | z_i, X) ∏_j P(z_ij | E_{Zij\Yi})

- P(E⁻_{Yi} | y_i) is a recursive instance of P(E⁻_X | X)
- P(y_i | z_i, X) is a conditional probability table entry for Y_i
- P(z_ij | E_{Zij\Yi}) is a recursive sub-instance of the P(X | E) calculation
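To see the two supports combine, consider the smallest possible polytree, a chain U → X → Y with U and Y both observed. This is a sketch only: the CPT numbers are invented, and the chain is a degenerate case of the formulas above (one parent, one child, no Z_ij):

```python
# Sketch: P(X | E) = alpha * P(E- | X) * P(X | E+) on a chain U -> X -> Y,
# with U observed (causal support) and Y observed (evidential support).
P_X_given_U = {True: 0.8, False: 0.1}    # P(X=true | U), invented numbers
P_Y_given_X = {True: 0.9, False: 0.2}    # P(Y=true | X), invented numbers
u_obs, y_obs = True, True                # evidence: U = true, Y = true

posterior = {}
for x in (True, False):
    causal = P_X_given_U[u_obs] if x else 1 - P_X_given_U[u_obs]  # P(x | E+)
    evidential = P_Y_given_X[x]                                   # P(y_obs | x)
    posterior[x] = evidential * causal

alpha = 1.0 / sum(posterior.values())    # normalize
print({x: alpha * p for x, p in posterior.items()})
# X=true: 0.9*0.8 = 0.72; X=false: 0.2*0.2 = 0.04 -> P(X|E) ~ 0.947 / 0.053
```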

Inference in multiply connected belief networks.
- Clustering methods: transform the net into a probabilistically equivalent (but topologically different) polytree by merging offending nodes
- Conditioning methods: instantiate variables to definite values, and then evaluate a polytree for each possible instantiation

Inference in multiply connected belief networks.
- Stochastic simulation methods: use the network to generate a large number of concrete models of the domain that are consistent with the network distribution. They give an approximation of the exact evaluation.
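A minimal stochastic-simulation sketch in the rejection-sampling style (the slides do not name a particular algorithm, so that choice is mine), again reusing the CPTs from the first sketch:

```python
import random

def sample_world():
    """Draw one concrete model by sampling each node given its parents."""
    b = random.random() < P_B
    e = random.random() < P_E
    a = random.random() < P_A[(b, e)]
    j = random.random() < P_J[a]
    m = random.random() < P_M[a]
    return b, e, a, j, m

def estimate(n=1_000_000):
    """Approximate P(Burglary | JohnCalls): keep only samples consistent
    with the evidence, then count how often Burglary holds among them."""
    hits = kept = 0
    for _ in range(n):
        b, e, a, j, m = sample_world()
        if j:
            kept += 1
            hits += b
    return hits / kept

print(estimate())   # ~0.016, approaching the exact value as n grows
```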

Default reasoning. Some conclusions are made by default unless counter-evidence is obtained (non-monotonic reasoning). Points to ponder:
- What is the semantic status of default rules?
- What happens when the evidence matches the premises of two default rules with conflicting conclusions?
- If a belief is retracted later, how can a system keep track of which conclusions need to be retracted as a consequence?

Issues in rule-based methods for uncertain reasoning. Locality: in logical reasoning systems, if we have A ⇒ B, then we can conclude B given evidence A, without worrying about any other rules. In probabilistic systems, we need to consider all available evidence.

Issues in rule-based methods for uncertain reasoning. Detachment: once a logical proof is found for a proposition B, we can use it regardless of how it was derived (it can be detached from its justification). In probabilistic reasoning, the source of the evidence is important for subsequent reasoning.

Issues in rule-based methods for uncertain reasoning. Truth functionality: in logic, the truth of complex sentences can be computed from the truth of the components. Probability combination does not work this way, except under strong independence assumptions. A famous example of a truth-functional system for uncertain reasoning is the certainty factors model, developed for the MYCIN medical diagnosis program.

Dempster-Shafer Theory. Designed to deal with the distinction between uncertainty and ignorance. We use a belief function Bel(X): the probability that the evidence supports the proposition X. When we do not have any evidence about X, we assign Bel(X) = 0 as well as Bel(¬X) = 0.

Dempster-Shafer Theory. For example, if we do not know whether a coin is fair, then:
  Bel(Heads) = Bel(¬Heads) = 0
If we are given that the coin is fair with 90% certainty, then:
  Bel(Heads) = 0.9 × 0.5 = 0.45
  Bel(¬Heads) = 0.9 × 0.5 = 0.45
Note that we still have a gap of 0.1 that is not accounted for by the evidence.
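The same numbers fall out of a mass assignment over subsets of the frame {Heads, Tails}: the 90%-fair evidence puts 0.45 on each singleton, and the remaining 0.1 sits on the whole frame as ignorance. A sketch (the representation is my own):

```python
# Sketch: belief from a mass assignment over subsets of the frame {H, T}.
# Bel(A) sums the mass of every subset contained in A.
mass = {
    frozenset({"H"}): 0.45,         # 0.9 certainty of fairness * 0.5
    frozenset({"T"}): 0.45,
    frozenset({"H", "T"}): 0.10,    # ignorance: mass on the whole frame
}

def bel(A):
    return sum(m for s, m in mass.items() if s <= A)

print(bel(frozenset({"H"})))        # 0.45 = Bel(Heads)
print(bel(frozenset({"T"})))        # 0.45 = Bel(~Heads)
print(bel(frozenset({"H", "T"})))   # 1.0; the 0.1 gap lies between Bel and 1
```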

Fuzzy Logic. Fuzzy set theory is a means of specifying how well an object satisfies a vague description. Truth is a value between 0 and 1. Uncertainty stems from lack of evidence, but given the dimensions of a man, concluding whether he is fat has no uncertainty involved.

Fuzzy Logic. The rules for evaluating the fuzzy truth, T, of a complex sentence are:
  T(A ∧ B) = min(T(A), T(B))
  T(A ∨ B) = max(T(A), T(B))
  T(¬A) = 1 − T(A)
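These rules translate directly into code (a trivial sketch; the membership degrees are invented):

```python
# Sketch: min/max/complement fuzzy truth combination.
def t_and(a, b): return min(a, b)    # T(A ^ B)
def t_or(a, b):  return max(a, b)    # T(A v B)
def t_not(a):    return 1 - a        # T(~A)

tall, heavy = 0.6, 0.8               # invented membership degrees
print(t_and(tall, heavy))            # 0.6
print(t_or(tall, heavy))             # 0.8
print(t_not(tall))                   # 0.4
```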