Introduction to Bayesian Networks

Anders Ringgaard Kristensen

Outline
- Causal networks
- Bayesian networks
- Evidence
- Conditional independence and d-separation
- Compilation
  - The moral graph
  - The triangulated graph
  - The junction tree

A quiz
You have signed up for a quiz in a TV show. The rules are as follows:
- The host of the show will show you 3 doors.
- Behind one of the doors a treasure is hidden.
- You just have to choose the right door and the treasure is yours.
You have two choices:
- Initially you choose a door and tell the host which one you have chosen.
- The host will then open one of the other doors. He always opens a door where the treasure is not hidden.
- You can now either
  - keep your initial choice, and the host will open the door you first mentioned, or
  - change your choice, and the host will open the new door you have chosen.

A quiz: let's try!
[Figure: three doors numbered 1, 2 and 3]

Can we model the quiz?
Identify the variables:
- True placement, True ∈ {1, 2, 3}
- First choice, Choice 1 ∈ {1, 2, 3}
- Door opened, Opened ∈ {1, 2, 3}
- Second choice, Choice 2 ∈ {Keep, Change}
- Reward, Gain ∈ {0, 1000}

Identify relations
[Figure: network over the nodes True, Choice 1, Opened, Choice 2 and Gain. Annotations in the figure: "Chosen initially at random" (twice), "Decided by the player", and three edges labelled "Causal".]
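The quiz model can be checked by brute-force enumeration. The sketch below is not part of the original slides. It assumes that True and Choice 1 are uniform and independent, and that the host picks uniformly among the doors he is allowed to open (the slides do not state the host's tie-breaking rule). The function name `win_probability` is hypothetical.

```python
from fractions import Fraction
from itertools import product

DOORS = (1, 2, 3)

def win_probability(strategy):
    """Exact P(Gain = 1000) for Choice 2 = 'Keep' or 'Change'."""
    total = Fraction(0)
    for treasure, choice1 in product(DOORS, DOORS):
        # Assumption: True and Choice 1 are uniform and independent.
        p_start = Fraction(1, 9)
        # The host never opens the chosen door or the treasure door.
        allowed = [d for d in DOORS if d != choice1 and d != treasure]
        for opened in allowed:
            # Assumption: the host picks uniformly among allowed doors.
            p = p_start * Fraction(1, len(allowed))
            if strategy == "Keep":
                final = choice1
            else:
                # Switch to the only door that is neither chosen nor opened.
                final = next(d for d in DOORS if d not in (choice1, opened))
            if final == treasure:
                total += p
    return total

print(win_probability("Keep"), win_probability("Change"))  # prints: 1/3 2/3
```

Under these assumptions, changing doors wins twice as often as keeping the first choice.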

Notation
- C: random variable, chance node
- [Figure: Parent 1 and Parent 2, each with an edge into Child]
- Edges into a chance node (yellow circle) correspond to a set of conditional probabilities. They express the influence of the values of the parents on the value of the child.

Bayesian networks
- Basically a static method
- A static version of data filtering
- Like dynamic linear models, we may:
  - Model observed phenomena by underlying unobservable variables.
  - Combine with our knowledge on animal production.
- Like Markov decision processes, there is a structure and a set of parameters.
- All parameters are probabilities.

The textbook
- A general textbook on Bayesian networks and decision graphs.
- Written by Professor Finn Verner Jensen from Ålborg University, one of the leading research centers for Bayesian networks.
- Many agricultural examples due to close collaboration with KVL and DJF through the Dina network (Danish Informatics Network in the Agricultural Sciences).

Probabilities
What is the probability that a farmer observes a particular cow in heat during a 3-week period?
- P(Heat = "yes") = a
- P(Heat = "no") = b
- a + b = 1 (no other options)
- The value of Heat ("yes" or "no") is observable.
What is the probability that the cow is pregnant?
- P(Pregnant = "yes") = c
- P(Pregnant = "no") = d
- c + d = 1 (no other options)
- The value of Pregnant ("yes" or "no") is not observable.

Conditional probabilities
Now, assume that the cow is pregnant. What is the conditional probability that the farmer observes it in heat?
- P(Heat = "yes" | Pregnant = "yes") = a_{p+}
- P(Heat = "no" | Pregnant = "yes") = b_{p+}
- Again, a_{p+} + b_{p+} = 1
Now, assume that the cow is not pregnant. Accordingly:
- P(Heat = "yes" | Pregnant = "no") = a_{p-}
- P(Heat = "no" | Pregnant = "no") = b_{p-}
- Again, a_{p-} + b_{p-} = 1
Each value of Pregnant defines a full probability distribution for Heat. Such a distribution is called conditional.

A small Bayesian net

Pregnant
  Pregnant = "yes"    Pregnant = "no"
  c = 0.5             d = 0.5

Heat
                      Heat = "yes"      Heat = "no"
  Pregnant = "yes"    a_{p+} = 0.02     b_{p+} = 0.98
  Pregnant = "no"     a_{p-} = 0.60     b_{p-} = 0.40

Let us build the net!
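As a minimal sketch (not from the original slides), the two tables of this net can be written as Python dictionaries and multiplied into the joint distribution P(Pregnant, Heat). Summing out Pregnant then gives the marginal distribution of Heat. The variable names are illustrative only; the numbers are the ones on the slide.

```python
# Prior for Pregnant (c and d on the slide).
p_pregnant = {"yes": 0.5, "no": 0.5}

# Conditional table P(Heat | Pregnant), indexed as [pregnant][heat].
p_heat_given_pregnant = {
    "yes": {"yes": 0.02, "no": 0.98},
    "no": {"yes": 0.60, "no": 0.40},
}

# Joint distribution: P(Pregnant, Heat) = P(Pregnant) * P(Heat | Pregnant).
joint = {
    (preg, heat): p_pregnant[preg] * p_heat_given_pregnant[preg][heat]
    for preg in p_pregnant
    for heat in ("yes", "no")
}

# Marginal of Heat: sum the joint over the values of Pregnant.
p_heat = {
    heat: sum(joint[(preg, heat)] for preg in p_pregnant)
    for heat in ("yes", "no")
}

print(p_heat)  # approximately {'yes': 0.31, 'no': 0.69}
```

With these numbers the farmer would expect to see heat in about 31% of cows when nothing is known about pregnancy.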

Experience with the net: Evidence
- By entering information on an observed value of Heat we can revise our belief in the value of the unobservable variable Pregnant.
- The observed value of a variable is called evidence.
- The revision of beliefs is done by use of Bayes' Theorem.

Bayes' Theorem for our net
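The formula on this slide did not survive extraction. As a reconstruction under that caveat, Bayes' Theorem applied to the two-node net, using the symbols and numbers from the slide "A small Bayesian net", would read:

```latex
\begin{align*}
P(\text{Pregnant}=\text{yes} \mid \text{Heat}=\text{yes})
  &= \frac{a_{p+}\, c}{a_{p+}\, c + a_{p-}\, d}
   = \frac{0.02 \cdot 0.5}{0.02 \cdot 0.5 + 0.60 \cdot 0.5}
   = \frac{0.01}{0.31} \approx 0.032 \\[4pt]
P(\text{Pregnant}=\text{yes} \mid \text{Heat}=\text{no})
  &= \frac{b_{p+}\, c}{b_{p+}\, c + b_{p-}\, d}
   = \frac{0.98 \cdot 0.5}{0.98 \cdot 0.5 + 0.40 \cdot 0.5}
   = \frac{0.49}{0.69} \approx 0.710
\end{align*}
```

Observing heat thus lowers the belief in pregnancy from 0.5 to about 0.03, while observing no heat raises it to about 0.71.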

Let us extend the example
- A sow model:
  - Insemination
  - Several heat observations
  - Pregnancy test
- Consistent combination of information from different sources

Why build a Bayesian network
- Because you wish to estimate certainties for the values of variables that are not observable (or only observable at an unacceptable cost). Such variables are called "hypothesis variables".
- The estimates are obtained by observing "information variables" that either
  - influence the value of the hypothesis variable ("risk factors"), or
  - depend on the hypothesis variable ("symptoms").
- Diagnostics/troubleshooting

Diagnostics/troubleshooting
[Figure: Risk 1, Risk 2 and Risk 3 → State → Symp 1, Symp 2, Symp 3 and Symp 4]

The sow pregnancy model
[Figure: Insem. (risk factor) → Pregn. (hypothesis variable) → Heat 1, Heat 2, Heat 3 and Test (symptoms)]
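To illustrate how several symptoms combine, here is a minimal sketch that is not from the original slides. It assumes the symptoms are conditionally independent given Pregn. (as the network structure implies), leaves out Insem., reuses the cow heat table from the small net as a stand-in for each heat observation, and uses a purely hypothetical pregnancy test that is positive with probability 0.9 for pregnant sows and 0.1 for non-pregnant ones. The function name `posterior_pregnant` is also hypothetical.

```python
def posterior_pregnant(prior_yes, likelihoods):
    """Posterior P(Pregn = yes | observations).

    likelihoods is a list of pairs
    (P(obs | Pregn = yes), P(obs | Pregn = no)), one pair per observed
    symptom. Symptoms are assumed conditionally independent given Pregn.
    """
    weight_yes = prior_yes
    weight_no = 1.0 - prior_yes
    for like_yes, like_no in likelihoods:
        weight_yes *= like_yes
        weight_no *= like_no
    # Normalise so that the two posterior probabilities sum to 1.
    return weight_yes / (weight_yes + weight_no)

NO_HEAT = (0.98, 0.40)        # P(Heat = no | yes), P(Heat = no | no), from the small net
POSITIVE_TEST = (0.90, 0.10)  # hypothetical test characteristics

one_obs = posterior_pregnant(0.5, [NO_HEAT])
all_obs = posterior_pregnant(0.5, [NO_HEAT, NO_HEAT, NO_HEAT, POSITIVE_TEST])
print(round(one_obs, 3), round(all_obs, 3))
```

Each additional observation that agrees with pregnancy pushes the posterior further towards "yes", which is the "consistent combination of information from different sources" the slides refer to.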

Transmission of evidence
[Figure: Age (Num.) → Calved (Yes/No) → Lact. (Yes/No)]
- Age of a heifer/cow influences the probability that it has calved.
- Information on the Calved variable influences the probability that the animal is lactating.
- Thus, information on Age will influence our belief in the state of Lact.
- If, however, Calved is observed, there will be no influence of Age on Lact.!
- Evidence may be transmitted through a serial connection, unless the state of the intermediate variable is known.
- Age and Lact. are d-separated given Calved. They are conditionally independent given observation of Calved.
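The serial connection can be checked numerically by enumerating the joint distribution. The sketch below is not from the original slides: all probabilities are hypothetical, Age is reduced to two classes ("young", "old") for brevity, and the helper names `joint` and `p_lact_yes` are made up for illustration.

```python
from itertools import product

# Hypothetical tables for Age -> Calved -> Lact.
P_AGE = {"young": 0.4, "old": 0.6}
P_CALVED_YES = {"young": 0.1, "old": 0.9}   # P(Calved = yes | Age)
P_LACT_YES = {"yes": 0.8, "no": 0.01}       # P(Lact = yes | Calved)

def joint(age, calved, lact):
    """P(Age, Calved, Lact) as the product of the three tables."""
    p_c = P_CALVED_YES[age] if calved == "yes" else 1 - P_CALVED_YES[age]
    p_l = P_LACT_YES[calved] if lact == "yes" else 1 - P_LACT_YES[calved]
    return P_AGE[age] * p_c * p_l

def p_lact_yes(**evidence):
    """P(Lact = yes | evidence), with evidence given as age=... and/or calved=..."""
    num = den = 0.0
    for age, calved, lact in product(P_AGE, ("yes", "no"), ("yes", "no")):
        world = {"age": age, "calved": calved, "lact": lact}
        if all(world[name] == value for name, value in evidence.items()):
            p = joint(age, calved, lact)
            den += p
            if lact == "yes":
                num += p
    return num / den

# Calved unobserved: Age changes the belief in Lact.
print(p_lact_yes(age="young"), p_lact_yes(age="old"))
# Calved observed: Age no longer matters.
print(p_lact_yes(age="young", calved="yes"), p_lact_yes(age="old", calved="yes"))
```

With Calved unobserved the two beliefs differ markedly; once Calved is entered as evidence they coincide, which is what d-separation predicts.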

Diverging connections
[Figure: Litter size (Num.) ← Breed (Landrace/Yorkshire/Duroc) → Color (White/Black/Brown)]
- The breed of a sow influences litter size as well as color.
- Observing the value of Color will tell us something about the Breed and, thus, indirectly about the Litter size.
- If, however, Breed is observed, there will be no influence of Color on Litter size!
- Evidence may be transmitted through a diverging connection, unless the state of the intermediate variable is known.
- Litter size and Color are d-separated given Breed. They are conditionally independent given observation of Breed.
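A small numerical sketch of the diverging connection, not taken from the original slides: the breed frequencies, color probabilities and mean litter sizes below are all hypothetical, and the function names are made up. Evidence on Color reaches Litter size only through the posterior distribution of Breed, so the expected litter size is computed as a Breed-weighted average.

```python
# Hypothetical numbers for Litter size <- Breed -> Color.
P_BREED = {"Landrace": 0.5, "Yorkshire": 0.3, "Duroc": 0.2}
P_COLOR = {  # P(Color | Breed)
    "Landrace": {"White": 0.95, "Black": 0.03, "Brown": 0.02},
    "Yorkshire": {"White": 0.95, "Black": 0.03, "Brown": 0.02},
    "Duroc": {"White": 0.05, "Black": 0.05, "Brown": 0.90},
}
MEAN_LITTER = {"Landrace": 13.0, "Yorkshire": 12.0, "Duroc": 10.0}  # E[Litter size | Breed]

def breed_posterior(color=None, breed=None):
    """P(Breed | evidence); either argument may be left unobserved (None)."""
    weights = {}
    for b, prior in P_BREED.items():
        w = prior
        if color is not None:
            w *= P_COLOR[b][color]
        if breed is not None and b != breed:
            w = 0.0  # an observed breed rules out all other breeds
        weights[b] = w
    total = sum(weights.values())
    return {b: w / total for b, w in weights.items()}

def expected_litter(color=None, breed=None):
    """E[Litter size | evidence], averaging the breed means over P(Breed | evidence)."""
    posterior = breed_posterior(color, breed)
    return sum(posterior[b] * MEAN_LITTER[b] for b in posterior)

print(expected_litter(), expected_litter(color="Brown"))
print(expected_litter(breed="Landrace"), expected_litter(breed="Landrace", color="Brown"))
```

With Breed unobserved, a brown sow lowers the expected litter size because Duroc becomes more likely. With Breed observed, adding Color leaves the expectation unchanged.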

Converging connections
[Figure: Mastitis (Yes/No) → Temp. (Num.) ← Heat (Yes/No)]
- If nothing is known about Temp., the values of Mastitis and Heat are independent.
- If, however, Temp. is observed at a high level, the supplementary information that the cow is in heat will decrease our belief in the state "Yes" for Mastitis.
- This is the "explaining away" effect.
- Evidence may only be transmitted through a converging connection if the connecting variable (or a descendant) is observed.
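The "explaining away" effect can be reproduced with a few lines of enumeration. This sketch is not from the original slides: the priors and the temperature table are hypothetical, Temp. is reduced to high / not high, and the function name `p_mastitis` is made up.

```python
from itertools import product

# Hypothetical numbers for Mastitis -> Temp. <- Heat.
P_MASTITIS = 0.10
P_HEAT = 0.05
P_TEMP_HIGH = {  # P(Temp. = high | Mastitis, Heat), keyed by (mastitis, heat)
    (False, False): 0.05,
    (True, False): 0.80,
    (False, True): 0.60,
    (True, True): 0.90,
}

def p_mastitis(temp_high=None, heat=None):
    """P(Mastitis = yes | evidence); None means the variable is unobserved."""
    num = den = 0.0
    for m, h, t in product((True, False), repeat=3):
        if heat is not None and h != heat:
            continue
        if temp_high is not None and t != temp_high:
            continue
        p = (P_MASTITIS if m else 1 - P_MASTITIS) * (P_HEAT if h else 1 - P_HEAT)
        p *= P_TEMP_HIGH[(m, h)] if t else 1 - P_TEMP_HIGH[(m, h)]
        den += p
        if m:
            num += p
    return num / den

print(p_mastitis(), p_mastitis(heat=True))                # Temp. unobserved
print(p_mastitis(temp_high=True))                         # high temperature only
print(p_mastitis(temp_high=True, heat=True))              # high temperature and heat
```

Without evidence on Temp., observing Heat leaves the belief in Mastitis at its prior. Once a high temperature is observed, additionally learning that the cow is in heat lowers the belief in Mastitis, because heat already accounts for the temperature.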

Example: Mastitis detection
[Figure: network with the nodes Previous case, Milk yield, Mastitis index, Mastitis, Heat, Conductivity and Temperature]

Compilation of Bayesian networks
Compilation:
- Create a moral graph
  - Add edges between all pairs of nodes having a common child.
  - Remove all directions.
- Triangulate the moral graph
  - Add edges until all cycles of more than 3 nodes have a chord.
- Identify the cliques of the triangulated graph and organize them into a junction tree.
The software system does it automatically (and can show all intermediate stages).
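The first compilation step, moralization, is simple enough to sketch directly. The code below is an illustration only and not how any particular software system implements it. The function name `moral_graph` and the dictionary representation of the network are assumptions. The example network is the converging connection Mastitis → Temp. ← Heat from the earlier slide.

```python
from itertools import combinations

def moral_graph(parents):
    """Return the undirected edges of the moral graph.

    parents maps each node to the list of its parents in the directed network.
    Each edge is returned as a frozenset of its two end nodes.
    """
    edges = set()
    for child, node_parents in parents.items():
        # Remove all directions: keep every parent-child link as an undirected edge.
        for parent in node_parents:
            edges.add(frozenset((parent, child)))
        # "Marry" the parents: connect all pairs of nodes having this common child.
        for a, b in combinations(node_parents, 2):
            edges.add(frozenset((a, b)))
    return edges

network = {"Mastitis": [], "Heat": [], "Temp": ["Mastitis", "Heat"]}
moral = moral_graph(network)
print(sorted(sorted(edge) for edge in moral))
# prints: [['Heat', 'Mastitis'], ['Heat', 'Temp'], ['Mastitis', 'Temp']]
```

Here moralization adds the edge between Mastitis and Heat. The resulting triangle has no cycle of more than 3 nodes, so it needs no further triangulation and forms a single clique.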

Why use Bayesian networks?
A consistent framework for
- Representing and dealing with uncertainty
- Combination of information from different sources
- Combination of numerical knowledge with structural expert knowledge