COMP9414: Artificial Intelligence Reasoning Under Uncertainty

Wayne Wobcke
Room J17-433
wobcke@cse.unsw.edu.au
Based on slides by Maurice Pagnucco

References:
- Ivan Bratko, Prolog Programming for Artificial Intelligence, Addison-Wesley, 2001 (Chapter 15.6); Fourth Edition, Pearson Education, 2012 (Chapter 16)
- Stuart J. Russell and Peter Norvig, Artificial Intelligence: A Modern Approach, Third Edition, Pearson Education, 2010 (Chapters 13, 14)

Overview
- Problems with Logical Approach
- What Do the Numbers Mean?
- Review of Probability Theory
- Conditional Probability and Bayes Rule
- Bayesian Belief Networks
- Semantics of Bayesian Networks
- Inference in Bayesian Networks
- Conclusion

Problems with Logical Approach
- One drawback of the logical approach to reasoning is that an agent can rarely ascertain the truth of all propositions in the environment
- In fact, propositions (and their logical structure) may be inappropriate for modelling some domains, especially those involving uncertainty
- Rational decisions for a decision-theoretic agent depend on the importance of goals and the likelihood that they can be achieved
- Consider trying to formalise a medical diagnosis system:
  ∀p (Symptom(p, AbdominalPain) ⇒ Disease(p, Appendicitis))
- This rule is not correct, since patients with abdominal pain may be suffering from other diseases:
  ∀p (Symptom(p, AbdominalPain) ⇒ Disease(p, Appendicitis) ∨ Disease(p, Ulcer) ∨ Disease(p, Indig) ∨ ...)
- We could try to write a causal rule:
  ∀p (Disease(p, Ulcer) ⇒ Symptom(p, AbdominalPain))

Sources of Uncertainty
- Difficulties arise with the logical approach due to:
  - incompleteness: the agent may not have a complete theory for the domain
  - ignorance: the agent may not have enough information about the domain
  - noise: the information the agent does have may be unreliable
  - non-determinism: the environment itself may be inherently unpredictable
- Probability gives us a way of summarising this uncertainty, e.g. we may believe that there is a probability of 0.75 that the patient suffers from appendicitis if they have abdominal pains

What Do the Numbers Mean?
- Statistical/Frequentist View: the long-range frequency of a set of events, e.g. the probability of the event of heads appearing on the toss of a coin is the long-range frequency of heads that appear on coin tosses
- Objective View: probabilities are real aspects of the world
- Personal/Subjective/Bayesian View: a measure of belief in a proposition based on the agent's knowledge, e.g. the probability of heads is a measure of your belief that the coin will land heads, based on your beliefs about the coin; other agents may assign a different probability based on their beliefs (subjective)

Sample Space and Events
- Flip a coin three times; the possible outcomes are TTT, TTH, THT, THH, HTT, HTH, HHT, HHH
- Set of all possible outcomes: S = {TTT, TTH, THT, THH, HTT, HTH, HHT, HHH}
- Any subset of the sample space is known as an event
- Any singleton subset of the sample space is known as a simple event
[Figure: the eight outcomes drawn as the sample space, with an event shown as a subset of outcomes and a simple event as a single outcome]
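To make these definitions concrete, here is a small Python sketch (added here, not part of the original slides) that enumerates the sample space for three flips of a coin and computes the probability of one event; the "exactly two heads" event and the assumption of equally likely outcomes are illustrative choices.

```python
from itertools import product

# Sample space for three coin flips: all 8 outcomes, assumed equally likely.
sample_space = [''.join(flips) for flips in product('HT', repeat=3)]
print(sample_space)   # ['HHH', 'HHT', 'HTH', 'HTT', 'THH', 'THT', 'TTH', 'TTT']

# An event is any subset of the sample space, e.g. "exactly two heads".
event = {outcome for outcome in sample_space if outcome.count('H') == 2}
print(event)          # {'HHT', 'HTH', 'THH'}

# With equally likely simple events, P(event) = |event| / |sample space|.
print(len(event) / len(sample_space))   # 0.375
```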

Prior Probability
- P(A) is the prior or unconditional probability that proposition A is true
- For example, P(Appendicitis) = 0.3
- In the absence of any other information, the agent believes there is a probability of 0.3 (30%) of the event of the patient suffering from appendicitis
- As soon as we get new information we must reason with conditional probabilities

Random Variables
- Propositions are random variables that can take on several values:
  P(Weather = Sunny) = 0.8, P(Weather = Rain) = 0.1, P(Weather = Cloudy) = 0.09, P(Weather = Snow) = 0.01
- Every random variable X has a domain of possible values x1, x2, ..., xn
- The probabilities of all possible values, P(Weather) = <0.8, 0.1, 0.09, 0.01>, form a probability distribution
- P(Weather, Appendicitis) is a combination of random variables represented by the cross product (we can also use logical connectives, e.g. P(A ∧ B), to represent compound events)

Axioms of Probability
1. 0 ≤ P(A) ≤ 1: all probabilities are between 0 and 1
2. P(True) = 1 and P(False) = 0: valid propositions have probability 1, unsatisfiable propositions have probability 0
3. P(A ∨ B) = P(A) + P(B) − P(A ∧ B)
From these we can determine the probabilities of all other propositions. For example:
  P(A ∨ ¬A) = P(A) + P(¬A) − P(A ∧ ¬A)
  P(True) = P(A) + P(¬A) − P(False)
  1 = P(A) + P(¬A) − 0
  Therefore P(¬A) = 1 − P(A)

Conditional Probability
- When new information is gained we can no longer use prior probabilities
- The conditional or posterior probability P(A|B) is the probability of A given that all we know is B, e.g. P(Appendicitis | AbdominalPain) = 0.75
- Product Rule: P(A ∧ B) = P(A|B).P(B)
- Therefore P(A|B) = P(A ∧ B)/P(B), provided P(B) > 0
- For random variables, P(X|Y) stands for P(X = xi | Y = yj) for all i, j, so P(X, Y) = P(X|Y).P(Y) is a set of equations
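The following short sketch (added, not from the slides) checks that the Weather distribution above sums to 1 and applies axiom 3 and the product rule; the joint probability used in the product-rule step is a made-up illustrative number chosen so the answer matches the slide's 0.75.

```python
# The Weather distribution from the slides: the values' probabilities sum to 1.
weather = {'Sunny': 0.8, 'Rain': 0.1, 'Cloudy': 0.09, 'Snow': 0.01}
assert abs(sum(weather.values()) - 1.0) < 1e-9

# Axiom 3 specialised to A and not-A gives P(not A) = 1 - P(A).
p_appendicitis = 0.3
p_not_appendicitis = 1 - p_appendicitis                 # 0.7

# Product rule: P(A and B) = P(A|B) * P(B), so P(A|B) = P(A and B) / P(B).
# Hypothetical numbers for illustration only.
p_abdominal_pain = 0.2
p_app_and_pain = 0.15
p_app_given_pain = p_app_and_pain / p_abdominal_pain    # 0.75
print(p_not_appendicitis, p_app_given_pain)
```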

Normalisation
[Figure: the conditional probability distribution over the eight coin-flip outcomes given that the first coin is H. Outcomes starting with T have conditional probability 0.0; outcomes starting with H have their prior probabilities divided by the probability 0.45 that the first coin is H, e.g. 0.1/0.45 = 2/9 and 0.15/0.45 = 3/9.]

Joint Probability Distribution
- A complete specification of the probabilities of all propositions in the domain
- Suppose we have random variables X1, X2, ..., Xn
- An atomic (simple) event is an assignment of particular values to all variables
- The joint probability distribution P(X1, X2, ..., Xn) assigns probabilities to all possible atomic events
- For example, a simple medical domain with two Boolean random variables:

                   AbdominalPain   ¬AbdominalPain
  Appendicitis         0.04             0.06
  ¬Appendicitis        0.01             0.89

Joint Probability Distribution (continued)
- Simple events are mutually exclusive and jointly exhaustive
- The probability of a complex event is the sum of the probabilities of the compatible simple events:
  P(Appendicitis) = 0.04 + 0.06 = 0.10
  P(Appendicitis ∨ AbdominalPain) = 0.04 + 0.06 + 0.01 = 0.11
  P(Appendicitis | AbdominalPain) = P(Appendicitis ∧ AbdominalPain)/P(AbdominalPain) = 0.04/(0.04 + 0.01) = 0.8
- Problem: with many random variables the number of probabilities is vast

Bayes Rule
  P(B|A) = P(A|B).P(B)/P(A)
- Modern AI systems abandon joint probabilities and work with conditional probabilities utilising Bayes Rule
- Deriving Bayes Rule:
  P(A ∧ B) = P(A|B).P(B)  (definition)
  P(B ∧ A) = P(B|A).P(A)  (definition)
  So P(A|B).P(B) = P(B|A).P(A), since P(A ∧ B) = P(B ∧ A)
  Hence P(B|A) = P(A|B).P(B)/P(A) if P(A) ≠ 0
- Note: if P(A) = 0, P(B|A) is undefined
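A small sketch (added, not from the slides) that stores the joint distribution above as a dictionary and reproduces the marginal and conditional probabilities quoted on the slide; conditioning is just normalisation by the probability of the evidence.

```python
# Joint distribution over (Appendicitis, AbdominalPain) from the slide.
joint = {
    (True,  True):  0.04,   # Appendicitis and AbdominalPain
    (True,  False): 0.06,
    (False, True):  0.01,
    (False, False): 0.89,
}

# Marginals: sum the probabilities of all compatible atomic events.
p_app = sum(p for (app, _), p in joint.items() if app)                      # 0.10
p_app_or_pain = sum(p for (app, pain), p in joint.items() if app or pain)   # 0.11

# Conditioning is normalisation: divide by the probability of the evidence.
p_pain = sum(p for (_, pain), p in joint.items() if pain)                   # 0.05
p_app_given_pain = joint[(True, True)] / p_pain                             # 0.8

print(p_app, p_app_or_pain, p_app_given_pain)
```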

Applying Bayes Rule
- Example (Russell & Norvig, 1995): a doctor knows that meningitis causes a stiff neck 50% of the time, the chance of a patient having meningitis is 1/50000, and the chance of a patient having a stiff neck is 1/20:
  P(StiffNeck | Meningitis) = 0.5
  P(Meningitis) = 1/50000
  P(StiffNeck) = 1/20
  P(Meningitis | StiffNeck) = P(StiffNeck | Meningitis).P(Meningitis)/P(StiffNeck) = (0.5 × 1/50000)/(1/20) = 0.0002

Using Bayes Rule
- Suppose we have two conditional probabilities for appendicitis:
  P(Appendicitis | AbdominalPain) = 0.8
  P(Appendicitis | Nausea) = 0.1
- P(Appendicitis | AbdominalPain ∧ Nausea) = P(AbdominalPain ∧ Nausea | Appendicitis).P(Appendicitis)/P(AbdominalPain ∧ Nausea)
- We need to know P(AbdominalPain ∧ Nausea | Appendicitis); with more symptoms that is a daunting task

Conditional Independence
- Observe: appendicitis is a direct cause of both abdominal pain and nausea
- If we know the patient is suffering from appendicitis, then the probability of nausea should not depend on the presence of abdominal pain; likewise the probability of abdominal pain should not depend on nausea
- We say that nausea and abdominal pain are conditionally independent given appendicitis
- An event X is independent of an event Y conditional on background knowledge K if knowing Y does not affect the probability of X given K:
  P(X|K) = P(X|Y, K)

Bayesian Belief Networks
- A Bayesian belief network (also Bayesian network, probabilistic network, causal network, knowledge map) is a directed acyclic graph (DAG) where:
  - each node corresponds to a random variable
  - directed links connect pairs of nodes; a directed link from node X to node Y means that X has a direct influence on Y
  - each node has a conditional probability table quantifying the effect of the parents on the node
- Independence assumption of Bayesian networks: each random variable is (conditionally) independent of its non-descendants given its parents
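A one-screen sketch (added here) of the meningitis calculation; it simply plugs the three numbers above into Bayes rule.

```python
# Bayes rule applied to the meningitis example (Russell & Norvig, 1995).
p_stiffneck_given_meningitis = 0.5
p_meningitis = 1 / 50000
p_stiffneck = 1 / 20

p_meningitis_given_stiffneck = (
    p_stiffneck_given_meningitis * p_meningitis / p_stiffneck
)
print(p_meningitis_given_stiffneck)   # 0.0002
```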

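The conditional independence statement P(X|K) = P(X|Y, K) can be checked numerically. The sketch below (added, with made-up CPT-style numbers, so the figures are purely illustrative) builds a joint distribution over Appendicitis, AbdominalPain and Nausea in which the two symptoms are conditionally independent given the disease, and verifies that learning AbdominalPain does not change the probability of Nausea once Appendicitis is known.

```python
from itertools import product

# Hypothetical numbers (not from the slides), constructed so that AbdominalPain
# and Nausea are conditionally independent given Appendicitis.
p_app = 0.1
p_pain_given_app = {True: 0.9, False: 0.05}    # P(AbdominalPain | Appendicitis)
p_nausea_given_app = {True: 0.7, False: 0.1}   # P(Nausea | Appendicitis)

joint = {}
for app, pain, nausea in product([True, False], repeat=3):
    p = p_app if app else 1 - p_app
    p *= p_pain_given_app[app] if pain else 1 - p_pain_given_app[app]
    p *= p_nausea_given_app[app] if nausea else 1 - p_nausea_given_app[app]
    joint[(app, pain, nausea)] = p

def prob(pred):
    """Probability of the set of atomic events satisfying pred."""
    return sum(p for event, p in joint.items() if pred(*event))

# P(Nausea | Appendicitis) versus P(Nausea | AbdominalPain, Appendicitis).
p1 = prob(lambda a, pn, n: a and n) / prob(lambda a, pn, n: a)
p2 = prob(lambda a, pn, n: a and pn and n) / prob(lambda a, pn, n: a and pn)
print(round(p1, 6), round(p2, 6))   # both 0.7: AbdominalPain adds nothing
```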
Bayesian Belief Networks
- Example (Pearl, 1988): You have a new burglar alarm at home that is quite reliable at detecting burglars but may also respond at times to an earthquake. You also have two neighbours, John and Mary, who promise to call you at work when they hear the alarm. John always calls when he hears the alarm but sometimes confuses the telephone ringing with the alarm and calls then; also, Mary likes loud music and sometimes misses the alarm. Given the evidence of who has or has not called, we would like to estimate the probability of a burglary.

[Figure: the burglary network. Burglary and Earthquake are parents of Alarm; Alarm is the parent of JohnCalls and MaryCalls. P(Burglary) = 0.001 and P(Earthquake) = 0.002; P(Alarm | Burglary, Earthquake) is given by the table below; P(JohnCalls | Alarm) = 0.90, P(JohnCalls | ¬Alarm) = 0.05; P(MaryCalls | Alarm) = 0.70, P(MaryCalls | ¬Alarm) = 0.01.]

Conditional Probability Table
- Each row contains the conditional probability of each node value for a conditioning case (i.e. a possible combination of values for the parent nodes)
- P(Alarm | Burglary ∧ Earthquake):

  Burglary   Earthquake   P(Alarm)   P(¬Alarm)
  True       True         0.950      0.050
  True       False        0.940      0.060
  False      True         0.290      0.710
  False      False        0.001      0.999

Semantics of Bayesian Networks
- Probabilities summarise a potentially infinite set of possible circumstances
- A Bayesian network provides a complete description of the domain: the joint probability distribution can be determined from the belief network
  P(X1, X2, ..., Xn) = ∏ (i = 1 to n) P(Xi | Parents(Xi))
- For example,
  P(J ∧ M ∧ A ∧ ¬B ∧ ¬E) = P(J|A).P(M|A).P(A|¬B ∧ ¬E).P(¬B).P(¬E)
                         = 0.90 × 0.70 × 0.001 × 0.999 × 0.998 = 0.000628
- A Bayesian network is a complete and non-redundant representation of the domain (and can be far more compact than the joint probability distribution)
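The factorisation can be written out directly in code. This sketch (added, not from the slides) encodes the CPTs above as Python dictionaries and reproduces the joint probability computed on the slide.

```python
# CPTs for the burglary network from the slides.
P_B = 0.001                # P(Burglary)
P_E = 0.002                # P(Earthquake)
P_A = {(True, True): 0.95, (True, False): 0.94,
       (False, True): 0.29, (False, False): 0.001}   # P(Alarm | Burglary, Earthquake)
P_J = {True: 0.90, False: 0.05}                      # P(JohnCalls | Alarm)
P_M = {True: 0.70, False: 0.01}                      # P(MaryCalls | Alarm)

def joint(j, m, a, b, e):
    """P(J, M, A, B, E) as the product of each node given its parents."""
    p = (P_B if b else 1 - P_B) * (P_E if e else 1 - P_E)
    p *= P_A[(b, e)] if a else 1 - P_A[(b, e)]
    return p * (P_J[a] if j else 1 - P_J[a]) * (P_M[a] if m else 1 - P_M[a])

# P(J and M and A and not B and not E) = 0.000628, as on the slide.
print(joint(j=True, m=True, a=True, b=False, e=False))
```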

Semantics of Bayesian Networks
- Factorisation of the joint probability distribution
- Chain Rule: use conditional probabilities to decompose conjunctions:
  P(X1 ∧ X2 ∧ ... ∧ Xn) = P(X1).P(X2|X1).P(X3|X1 ∧ X2). ... .P(Xn|X1 ∧ X2 ∧ ... ∧ Xn−1)
- Now order the variables X1, X2, ..., Xn in a belief network so that a variable comes after its parents, and let πXi be the tuple of parents of variable Xi (this is a complex random variable)
- Using the chain rule we have
  P(X1 ∧ X2 ∧ ... ∧ Xn) = P(X1).P(X2|X1).P(X3|X1 ∧ X2). ... .P(Xn|X1 ∧ X2 ∧ ... ∧ Xn−1)
- Each P(Xi | X1 ∧ X2 ∧ ... ∧ Xi−1) has the property that it is not conditioned on a descendant of Xi (given the ordering of variables in the belief network)
- Therefore, by conditional independence we have P(Xi | X1 ∧ X2 ∧ ... ∧ Xi−1) = P(Xi | πXi)
- That is, rewriting the chain rule,
  P(X1, X2, ..., Xn) = ∏ (i = 1 to n) P(Xi | πXi)

Calculation using Bayesian Networks
- Fact 1: Consider a random variable X with parents Y1, Y2, ..., Yn:
  P(X | Y1 ∧ ... ∧ Yn ∧ Z) = P(X | Y1 ∧ ... ∧ Yn) if Z doesn't involve a descendant of X (including X itself)
- Fact 2: If Y1, ..., Yn are pairwise disjoint and exhaust all possibilities:
  P(X) = Σi P(X ∧ Yi) = Σi P(X|Yi).P(Yi)
  P(X|Z) = Σi P(X ∧ Yi | Z)
- e.g. P(J|B) = P(J ∧ B)/P(B), where
  P(J ∧ B) = Σ P(J ∧ B ∧ e ∧ a ∧ m) and P(B) = Σ P(j ∧ B ∧ e ∧ a ∧ m),
  with j ranging over J, ¬J, e over E, ¬E, a over A, ¬A and m over M, ¬M
- The atomic probabilities with J can be read off the network, e.g.
  P(J ∧ B ∧ E ∧ A ∧ M) = P(J|A).P(B).P(E).P(A|B ∧ E).P(M|A) = 0.90 × 0.001 × 0.002 × 0.95 × 0.70 = 0.000001197
  P(J ∧ B ∧ ¬E ∧ A ∧ M) = 0.000591016
  P(J ∧ B ∧ E ∧ ¬A ∧ M) = 5 × 10^-11
  P(J ∧ B ∧ ¬E ∧ ¬A ∧ M) = 2.99 × 10^-8
  P(J ∧ B ∧ E ∧ A ∧ ¬M) = 0.000000513
  P(J ∧ B ∧ ¬E ∧ A ∧ ¬M) = 0.000253292
  P(J ∧ B ∧ E ∧ ¬A ∧ ¬M) = 4.95 × 10^-9
  P(J ∧ B ∧ ¬E ∧ ¬A ∧ ¬M) = 2.96406 × 10^-6
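The sums can be done mechanically by enumerating atomic events. The sketch below (added; it reuses the same CPT encoding as the earlier sketch) computes P(J ∧ B) and P(B) by summation and recovers P(J|B).

```python
from itertools import product

# CPTs for the burglary network, as in the earlier sketch.
P_B, P_E = 0.001, 0.002
P_A = {(True, True): 0.95, (True, False): 0.94,
       (False, True): 0.29, (False, False): 0.001}
P_J = {True: 0.90, False: 0.05}
P_M = {True: 0.70, False: 0.01}

def joint(j, m, a, b, e):
    p = (P_B if b else 1 - P_B) * (P_E if e else 1 - P_E)
    p *= P_A[(b, e)] if a else 1 - P_A[(b, e)]
    return p * (P_J[a] if j else 1 - P_J[a]) * (P_M[a] if m else 1 - P_M[a])

# Fact 2 as enumeration: sum the atomic events compatible with the query/evidence.
p_j_and_b = sum(joint(True, m, a, True, e)
                for e, a, m in product([True, False], repeat=3))
p_b = sum(joint(j, m, a, True, e)
          for j, e, a, m in product([True, False], repeat=4))

print(p_j_and_b)         # ~0.000849017
print(p_j_and_b / p_b)   # P(J|B) ~ 0.849017
```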

Calculation using Bayesian Networks (continued)
- Similarly, the atomic probabilities with ¬J are:
  P(¬J ∧ B ∧ E ∧ A ∧ M) = 0.000000133
  P(¬J ∧ B ∧ ¬E ∧ A ∧ M) = 6.56684 × 10^-5
  P(¬J ∧ B ∧ E ∧ ¬A ∧ M) = 9.5 × 10^-10
  P(¬J ∧ B ∧ ¬E ∧ ¬A ∧ M) = 5.6886 × 10^-7
  P(¬J ∧ B ∧ E ∧ A ∧ ¬M) = 0.000000057
  P(¬J ∧ B ∧ ¬E ∧ A ∧ ¬M) = 2.81436 × 10^-5
  P(¬J ∧ B ∧ E ∧ ¬A ∧ ¬M) = 9.405 × 10^-8
  P(¬J ∧ B ∧ ¬E ∧ ¬A ∧ ¬M) = 5.63171 × 10^-5
- Therefore,
  P(J|B) = P(J ∧ B)/P(B) = Σ P(J ∧ B ∧ e ∧ a ∧ m) / Σ P(j ∧ B ∧ e ∧ a ∧ m) = 0.000849017/0.001
  P(J|B) = 0.849017
- We can often simplify the calculation without using full joint probabilities, but not always

Inference in Bayesian Networks
- Diagnostic Inference: from effects to causes, e.g. P(Burglary | JohnCalls) = 0.016
- Causal Inference: from causes to effects, e.g. P(JohnCalls | Burglary) = 0.85; P(MaryCalls | Burglary) = 0.67
- Intercausal Inference ("explaining away"): P(Burglary | Alarm) = 0.3736, but adding evidence, P(Burglary | Alarm ∧ Earthquake) = 0.003; despite the fact that burglaries and earthquakes are independent, the presence of one makes the other much less likely
- Mixed Inference: combinations of the patterns above
  Diagnostic + Causal: P(Alarm | JohnCalls ∧ ¬Earthquake)
  Intercausal + Diagnostic: P(Burglary | JohnCalls ∧ ¬Earthquake)
[Figure: schematic of the four inference patterns (Diagnostic, Causal, Intercausal, Mixed), showing the position of the query node Q relative to the evidence node(s) E in each case. Q = query; E = evidence.]
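Each of these query types can be answered by brute-force enumeration over the joint distribution defined by the network. The sketch below (added, not from the slides; the query helper is my own) does exactly that and approximately reproduces the diagnostic, causal and intercausal figures quoted above.

```python
from itertools import product

# CPTs for the burglary network, as in the earlier sketches.
P_B, P_E = 0.001, 0.002
P_A = {(True, True): 0.95, (True, False): 0.94,
       (False, True): 0.29, (False, False): 0.001}
P_J = {True: 0.90, False: 0.05}
P_M = {True: 0.70, False: 0.01}

def joint(j, m, a, b, e):
    p = (P_B if b else 1 - P_B) * (P_E if e else 1 - P_E)
    p *= P_A[(b, e)] if a else 1 - P_A[(b, e)]
    return p * (P_J[a] if j else 1 - P_J[a]) * (P_M[a] if m else 1 - P_M[a])

def query(target, evidence):
    """P(target | evidence) by enumerating all atomic events; target and
    evidence are predicates over a world tuple (j, m, a, b, e)."""
    worlds = list(product([True, False], repeat=5))
    p_evidence = sum(joint(*w) for w in worlds if evidence(w))
    p_both = sum(joint(*w) for w in worlds if evidence(w) and target(w))
    return p_both / p_evidence

# Diagnostic: P(Burglary | JohnCalls) ~ 0.016
print(query(lambda w: w[3], lambda w: w[0]))
# Causal: P(JohnCalls | Burglary) ~ 0.849
print(query(lambda w: w[0], lambda w: w[3]))
# Intercausal ("explaining away"): P(Burglary | Alarm) ~ 0.374,
# but P(Burglary | Alarm and Earthquake) ~ 0.003.
print(query(lambda w: w[3], lambda w: w[2]))
print(query(lambda w: w[3], lambda w: w[2] and w[4]))
```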

Example Causal Inference
- P(JohnCalls | Burglary):
  P(J|B) = P(J|A ∧ B).P(A|B) + P(J|¬A ∧ B).P(¬A|B)
         = P(J|A).P(A|B) + P(J|¬A).P(¬A|B)
         = P(J|A).P(A|B) + P(J|¬A).(1 − P(A|B))
- Now
  P(A|B) = P(A|B ∧ E).P(E|B) + P(A|B ∧ ¬E).P(¬E|B)
         = P(A|B ∧ E).P(E) + P(A|B ∧ ¬E).P(¬E)
         = 0.95 × 0.002 + 0.94 × 0.998 = 0.94002
- Therefore P(J|B) = 0.90 × 0.94002 + 0.05 × 0.05998 = 0.849017
- Fact 3: P(X|Z) = P(X|Y ∧ Z).P(Y|Z) + P(X|¬Y ∧ Z).P(¬Y|Z), since X ∧ Z ≡ (X ∧ Y ∧ Z) ∨ (X ∧ ¬Y ∧ Z) (a conditional version of Fact 2)

Example Diagnostic Inference
- P(Earthquake | Alarm):
  P(E|A) = P(A|E).P(E)/P(A)
- Numerator:
  P(A|E).P(E) = P(A|B ∧ E).P(B).P(E) + P(A|¬B ∧ E).P(¬B).P(E)
              = 0.95 × 0.001 × 0.002 + 0.29 × 0.999 × 0.002 = 5.8132 × 10^-4
- Now
  P(A) = P(A|B ∧ E).P(B).P(E) + P(A|¬B ∧ E).P(¬B).P(E) + P(A|B ∧ ¬E).P(B).P(¬E) + P(A|¬B ∧ ¬E).P(¬B).P(¬E)
- And
  P(A|B ∧ ¬E).P(B).P(¬E) + P(A|¬B ∧ ¬E).P(¬B).P(¬E) = 0.94 × 0.001 × 0.998 + 0.001 × 0.999 × 0.998 = 0.001935122
- So P(A) = 5.8132 × 10^-4 + 0.001935122 = 0.002516442
- Therefore P(E|A) = 5.8132 × 10^-4 / 0.002516442 = 0.2310087
- Fact 4: P(X ∧ Y) = P(X).P(Y) if X and Y are independent (used above for Burglary and Earthquake)
- A short code sketch reproducing both of these calculations appears after the Conclusion.

Conclusion
- Due to noise or uncertainty it may be advantageous to reason with probabilities
- Dealing with joint probabilities can become difficult due to the large number of values involved
- Use of Bayes Rule and conditional probabilities may be a way around this
- Bayesian belief networks allow compact representation of probabilities and efficient reasoning with probabilities
- They work by exploiting the notion of conditional independence
- Elegant recursive algorithms can be given to automate the process of inference in Bayesian networks
- This is currently one of the hot topics in AI
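As referenced above, here is a minimal closing sketch (added, not part of the lecture) that reproduces the two worked examples using the closed-form decompositions of Facts 3 and 4 rather than enumeration of the full joint.

```python
# Closed-form versions of the two worked examples, using Facts 3 and 4.
P_B, P_E = 0.001, 0.002
P_A_given = {(True, True): 0.95, (True, False): 0.94,
             (False, True): 0.29, (False, False): 0.001}
P_J_given_A, P_J_given_not_A = 0.90, 0.05

# Causal inference: P(J|B).
# Fact 3: P(A|B) = P(A|B,E).P(E) + P(A|B,not E).P(not E)  (B and E independent).
p_a_given_b = P_A_given[(True, True)] * P_E + P_A_given[(True, False)] * (1 - P_E)
p_j_given_b = P_J_given_A * p_a_given_b + P_J_given_not_A * (1 - p_a_given_b)
print(p_j_given_b)        # 0.849017

# Diagnostic inference: P(E|A) = P(A and E) / P(A).
# Fact 4: P(B and E) = P(B).P(E), since Burglary and Earthquake are independent.
p_a_and_e = (P_A_given[(True, True)] * P_B * P_E
             + P_A_given[(False, True)] * (1 - P_B) * P_E)
p_a = p_a_and_e + (P_A_given[(True, False)] * P_B * (1 - P_E)
                   + P_A_given[(False, False)] * (1 - P_B) * (1 - P_E))
print(p_a_and_e / p_a)    # ~0.2310
```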