Bayesian networks II. Model building. Anders Ringgaard Kristensen


Outline
Determining the graphical structure: milk test, mastitis diagnosis, pregnancy.
Determining the conditional probabilities.
Modeling methods and tricks.
Object oriented Bayesian networks.

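Milk test
Variables: Infected? { Yes, No } and Test? { Positive, Negative }.
The sensitivity and specificity of the test determine the conditional probabilities.
Direction of the edge: Infected? → Test?. The edge follows the causal direction, which is against the reasoning direction (we reason from the test result back to the infection state).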

Daily measurements
[Figure: network with nodes Inf 1, ..., Inf 7 and Test 1, ..., Test 7.]
Are the infection states of different days independent? Probably not! Markov property. Duration of disease.

Dependence between test results
[Figure: network with nodes Inf 1, ..., Inf 7 and Test 1, ..., Test 7.]
The correctness of a test depends on whether the test was correct yesterday. To determine whether it was correct yesterday, we need the true infection state yesterday and the test result yesterday.

Dependence between test results
[Figure: network with nodes Inf 1, ..., Inf 7, Cor 1, ..., Cor 6 and Test 1, ..., Test 7.]
A simplifying intermediate variable Cor_i { yes, no } indicates whether test i was correct.
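The structure of the milk test network can be written down as a table of parent sets. The sketch below is only an illustration: it assumes 7 days, that each test depends on the infection state of the same day and on the correctness of the previous test, and that each infection state depends on the (up to) two preceding infection states. The function names are hypothetical.

```python
# Parent sets of the milk test network (a sketch, not the author's model file).
N_DAYS = 7


def milk_test_parents(n_days=N_DAYS):
    """Return a dict mapping each node name to the list of its parents."""
    parents = {}
    for i in range(1, n_days + 1):
        # Infection state: assumed to depend on the two preceding days.
        parents[f"Inf {i}"] = [f"Inf {j}" for j in (i - 2, i - 1) if j >= 1]
        # Test result: today's infection state and yesterday's correctness.
        parents[f"Test {i}"] = [f"Inf {i}"] + ([f"Cor {i - 1}"] if i > 1 else [])
        # Correctness is only needed as a parent of the next test (Cor 1..6).
        if i < n_days:
            parents[f"Cor {i}"] = [f"Inf {i}", f"Test {i}"]
    return parents


def is_acyclic(parents):
    """Check that the nodes can be ordered so that parents come first."""
    remaining = dict(parents)
    placed = set()
    while remaining:
        ready = [n for n, ps in remaining.items() if all(p in placed for p in ps)]
        if not ready:
            return False  # every remaining node waits for another one: a cycle
        for n in ready:
            placed.add(n)
            del remaining[n]
    return True


network = milk_test_parents()
for node in ("Inf 3", "Test 3", "Cor 3"):
    print(node, "<-", network[node])
print("acyclic:", is_acyclic(network))
```

The acyclicity check is included because a Bayesian network must be a directed acyclic graph; it is easy to break that property when adding "yesterday" edges by hand.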

Mastitis diagnosis, AMS
Variables: Mastitis { No, Subclinical, Clinical }, Heat { No, Yes }, Conductivity, Temperature.
Use separate variables (don't pool mastitis and heat).
Check conditional independence: are Conductivity and Temperature independent, given Mastitis?

If not conditionally independent
[Figure: the mastitis network with nodes Mastitis, Heat, Conductivity and Temperature.]
Case 1: conductivity influences temperature.

If not conditionally independent
[Figure: the mastitis network with nodes Mastitis, Heat, Conductivity and Temperature.]
Case 2: temperature influences conductivity. The causal direction may be difficult to determine.

If not conditionally independent
[Figure: the mastitis network with nodes Mastitis, Heat, Conductivity, Temperature and Other disease.]
If the direction of an edge cannot be determined, a variable is often missing! Here the missing variable is Other disease.

Pregnancy (again, again)
A goat is mated, and six weeks later we want to test it for pregnancy. We have three tests available: a blood test, a urine test and a scanning.
The variables of our problem are: Pregnant { yes, no }, Blood { positive, negative }, Urine { positive, negative }, Scan { positive, negative }.

Pregnancy test BN
[Figure: network with nodes Pregnant, Blood, Urine and Scan.]
Check for conditional independence.

Pregnancy test BN, revised
[Figure: network with nodes Pregnant, Hormone, Blood, Urine and Scan.]
The blood test and the urine test both measure a hormone level. The scanning does something completely different.
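A small numerical sketch can show why the intermediate Hormone variable matters. It assumes the structure Pregnant → Hormone, Hormone → Blood and Hormone → Urine, with Hormone reduced to two hypothetical states (high, low). All probabilities below are invented for illustration only; they do not come from the slides.

```python
# Hypothetical numbers, chosen only to illustrate the effect.
P_HORMONE_HIGH = {"yes": 0.9, "no": 0.01}  # P(Hormone = high | Pregnant)
P_BLOOD_POS = {"high": 0.7, "low": 0.1}    # P(Blood = positive | Hormone)
P_URINE_POS = {"high": 0.8, "low": 0.1}    # P(Urine = positive | Hormone)


def hormone_dist(pregnant):
    """Distribution of Hormone given the pregnancy state."""
    p_high = P_HORMONE_HIGH[pregnant]
    return {"high": p_high, "low": 1.0 - p_high}


def p_urine_pos(pregnant):
    """P(Urine = positive | Pregnant), summing out Hormone."""
    return sum(p * P_URINE_POS[h] for h, p in hormone_dist(pregnant).items())


def p_urine_pos_given_blood_pos(pregnant):
    """P(Urine = positive | Pregnant, Blood = positive), summing out Hormone."""
    dist = hormone_dist(pregnant)
    joint = sum(p * P_BLOOD_POS[h] * P_URINE_POS[h] for h, p in dist.items())
    blood = sum(p * P_BLOOD_POS[h] for h, p in dist.items())
    return joint / blood


print("P(Urine+ | Pregnant)         =", p_urine_pos("yes"))
print("P(Urine+ | Pregnant, Blood+) =", p_urine_pos_given_blood_pos("yes"))
```

The two printed values differ: even when the pregnancy state is known, a positive blood test raises the probability of a positive urine test. Blood and Urine are therefore not conditionally independent given Pregnant alone, which is what the revised network expresses.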

Outline
Determining the graphical structure: milk test, mastitis diagnosis, pregnancy.
Determining the conditional probabilities.
Modeling methods and tricks.

Determining the probabilities Statistical model with parameters estimated from data. Law of nature. Experts of the domain.

Milk test example
[Figure: milk test network with nodes Inf 1, ..., Inf 7, Cor 1, ..., Cor 6 and Test 1, ..., Test 7.]
The conditional probability P(Test_i | Inf_i) is supplied by the test retailer.

P(Test_i | Inf_i)
              P(Test_i = yes | Inf_i)   P(Test_i = no | Inf_i)
Inf_i = yes   0.99                      0.01
Inf_i = no    0.01                      0.99
Defined by the sensitivity and specificity of the test (both 0.99 here).

Milk test example
[Figure: milk test network with nodes Inf 1, ..., Inf 7, Cor 1, ..., Cor 6 and Test 1, ..., Test 7.]
The conditional probability P(Cor_i | Inf_i, Test_i) is trivial.

P(Cor_i | Inf_i, Test_i)
Inf_i   Test_i   P(Cor_i = yes | Inf_i, Test_i)   P(Cor_i = no | Inf_i, Test_i)
yes     yes      1                                0
yes     no       0                                1
no      yes      0                                1
no      no       1                                0
If Inf_i and Test_i agree, the test is correct!

Milk test example
[Figure: milk test network with nodes Inf 1, ..., Inf 7, Cor 1, ..., Cor 6 and Test 1, ..., Test 7.]
The conditional probabilities P(Test_i | Inf_i, Cor_{i-1}) need some assumptions.

P(Test_i | Inf_i, Cor_{i-1})
Assumptions: A correct test has a 99.9% chance of being correct next time. An incorrect test has a 30% chance of being incorrect next time; thus, it is still most likely to be correct. These values agree with the example file provided from the homepage, but disagree with the textbook.

P(Test_i | Inf_i, Cor_{i-1})
Inf_i   Cor_{i-1}   P(Test_i = yes | Inf_i, Cor_{i-1})   P(Test_i = no | Inf_i, Cor_{i-1})
yes     yes         0.999                                0.001
yes     no          0.7                                  0.3
no      yes         0.001                                0.999
no      no          0.3                                  0.7
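The table above follows mechanically from the two assumptions (99.9% and 30%). A minimal sketch of that derivation is shown below; the constant and function names are hypothetical, the two numbers are those stated in the assumptions.

```python
# The two assumed parameters from the slide.
P_STAY_CORRECT = 0.999   # a correct test is correct again next time
P_STAY_INCORRECT = 0.3   # an incorrect test is incorrect again next time


def p_test_yes(inf, cor_prev):
    """P(Test_i = yes | Inf_i = inf, Cor_{i-1} = cor_prev)."""
    # Probability that today's test is correct, given yesterday's correctness.
    p_correct = P_STAY_CORRECT if cor_prev == "yes" else 1.0 - P_STAY_INCORRECT
    # A correct test says "yes" exactly when the cow is infected.
    return p_correct if inf == "yes" else 1.0 - p_correct


cpt = {}
for inf in ("yes", "no"):
    for cor_prev in ("yes", "no"):
        p_yes = p_test_yes(inf, cor_prev)
        cpt[(inf, cor_prev)] = (p_yes, 1.0 - p_yes)
        print(inf, cor_prev, round(p_yes, 3), round(1.0 - p_yes, 3))
```

Deriving the table from two named parameters, instead of typing in eight numbers, makes it easier to change the assumptions later (for example to the values used in the textbook).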

Milk test example
[Figure: milk test network with nodes Inf 1, ..., Inf 7, Cor 1, ..., Cor 6 and Test 1, ..., Test 7.]
The probabilities P(Inf_1) must be modeled.

P(Inf_1)
Assume that the milk test is made at single cow level at the farm. We need the probability λ that the milk from a particular cow is infected on an arbitrary day, i.e. P(Inf_i = yes) = λ.
The farmer has no knowledge about λ, but the dairy performs a very precise bulk tank test: if the milk from just one cow is infected, the bulk tank test will be positive. On average, the bulk tank test is positive once a month.

P(Inf_1)
Further assumptions: λ is the same for all cows. Cows are infected independently.
Under those assumptions, with 50 cows in the herd and a negative bulk tank test on 29 out of 30 days:
(1 - λ)^50 = 29/30, so λ = 1 - (29/30)^0.02 ≈ 0.0007
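The calculation of λ can be checked with a few lines of code. The sketch below also applies Bayes' rule to a single, isolated test, using the sensitivity and specificity of 0.99 from the P(Test | Inf) table; this second step ignores the dependence between consecutive tests and is only meant as an illustration. All names are hypothetical.

```python
N_COWS = 50               # herd size implied by the exponent in the slide
P_BULK_NEGATIVE = 29 / 30  # bulk tank test is positive about once a month

# (1 - lam)^N_COWS = P_BULK_NEGATIVE  =>  lam = 1 - P_BULK_NEGATIVE^(1/N_COWS)
lam = 1.0 - P_BULK_NEGATIVE ** (1.0 / N_COWS)

SENSITIVITY = 0.99  # P(Test = yes | Inf = yes)
SPECIFICITY = 0.99  # P(Test = no  | Inf = no)


def posterior_infected(prior, sensitivity, specificity):
    """P(Inf = yes | Test = yes) for one isolated test, by Bayes' rule."""
    p_positive = sensitivity * prior + (1.0 - specificity) * (1.0 - prior)
    return sensitivity * prior / p_positive


print("lambda =", lam)
print("P(Inf = yes | Test = yes) =", posterior_infected(lam, SENSITIVITY, SPECIFICITY))
```

Because λ is so small, even a test with 99% sensitivity and specificity leaves the posterior probability of infection after one positive result far below one half. This is one reason for combining several daily tests in the network.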

Milk test example
[Figure: milk test network with nodes Inf 1, ..., Inf 7, Cor 1, ..., Cor 6 and Test 1, ..., Test 7.]
The conditional probabilities P(Inf_i | Inf_{i-1}, Inf_{i-2}) must be modeled.

P(Inf_i | Inf_{i-1}, Inf_{i-2})
Assume the following properties of the infection: A not-infected cow has probability q of becoming infected. An infection always lasts for at least 2 days. After 2 days, the probability of recovery is π.
Define a state space model: s_i ∈ {nn, ny, yn, yy}, where e.g. ny means: not infected on day i-1 but infected on day i.

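P(Inf_i | Inf_{i-1}, Inf_{i-2})
Transition probabilities from the state on day i (rows) to the state on day i+1 (columns). The last column, P(yes), is the probability of being infected on day i+1.
Day i   nn      ny    yn    yy      P(yes)
nn      1-q     q     0     0       q
ny      0       0     0     1       1
yn      1-q     q     0     0       q
yy      0       0     π     1-π     1-π
Only assumptions on the minimum duration, q and π are needed.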

A procedure
There are basically 3 parameters: the duration (minimum 2 days), the probability q of becoming infected, and the daily probability π of recovery (after 2 days).
The 3 parameters should be estimated from data. If data are not available, we may have to rely on experts. Experts' guesses may be calibrated to the overall probability of infection on a given day.

Limiting state distribution

Choice of parameters
Let the probability of recovery (after 2 days) be π = 0.4. Let the probability of becoming diseased be q = 0.0002.
The steady state distribution a is calculated in the Limit.R program.
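The Limit.R program is not reproduced in these slides. As a stand-in, the following sketch computes the steady state distribution of the four-state model (nn, ny, yn, yy) by repeated multiplication with the transition matrix. It is an assumption that this matches what Limit.R does; only the transition probabilities and the values π = 0.4 and q = 0.0002 are taken from the slides.

```python
STATES = ["nn", "ny", "yn", "yy"]


def transition_matrix(q, pi):
    """Rows: state on day i; columns: state on day i+1 (order as in STATES)."""
    return [
        [1 - q, q, 0.0, 0.0],     # from nn: stay healthy or become infected
        [0.0, 0.0, 0.0, 1.0],     # from ny: infection lasts at least 2 days
        [1 - q, q, 0.0, 0.0],     # from yn: healthy again, may be reinfected
        [0.0, 0.0, pi, 1 - pi],   # from yy: recover with probability pi
    ]


def steady_state(P, n_iter=1000):
    """Approximate the limiting distribution a with a = a P (power iteration)."""
    n = len(P)
    a = [1.0 / n] * n
    for _ in range(n_iter):
        a = [sum(a[i] * P[i][j] for i in range(n)) for j in range(n)]
    return a


a = steady_state(transition_matrix(q=0.0002, pi=0.4))
for state, prob in zip(STATES, a):
    print(state, prob)

# A cow is infected on the current day in the states ny and yy.
print("P(infected on a given day) =", a[1] + a[3])
```

With these parameter values the probability of being infected on a given day comes out close to 0.0007, in line with the value of λ estimated from the bulk tank test. This is the kind of calibration check that can be used when q and π are expert guesses.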

Milk test example
[Figure: milk test network with nodes Inf 1, ..., Inf 7, Cor 1, ..., Cor 6 and Test 1, ..., Test 7.]
The conditional probabilities P(Inf_i | Inf_{i-1}) must be modeled.

P(Inf_i | Inf_{i-1})
Some assumption is needed to compensate for the fact that we don't know the infection state two days ago.

A stud farm: Genealogical tree
[Figure: genealogical tree with the horses Ann, Brian, Cecily, Fred, Dorothy, Eric, Gwenn, Henry, Irene and John.]
John is suffering from a serious hereditary disease caused by a recessive gene.
State space for a horse: aa, aA or AA. The genotype aa is diseased. The genotype aA is a carrier.
We want to cull all carriers!

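Conditional probabilities from genetics
Distribution (P(aa), P(aA), P(AA)) of the genotype of the offspring, given the genotypes of the mother and the father:
Mother \ Father   aa              aA                  AA
aa                (1, 0, 0)       (0.5, 0.5, 0)       (0, 1, 0)
aA                (0.5, 0.5, 0)   (0.25, 0.5, 0.25)   (0, 0.5, 0.5)
AA                (0, 1, 0)       (0, 0.5, 0.5)       (0, 0, 1)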
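The table is "provided by nature": each parent passes on one of its two alleles with probability 1/2. The sketch below regenerates the table from that rule by enumerating the four equally likely allele combinations. Function and variable names are hypothetical.

```python
from fractions import Fraction
from itertools import product

GENOTYPES = ("aa", "aA", "AA")


def child_distribution(mother, father):
    """Return (P(aa), P(aA), P(AA)) for the offspring of the two genotypes."""
    counts = {g: 0 for g in GENOTYPES}
    # Each parent contributes one of its two alleles, each with probability 1/2.
    for m_allele, f_allele in product(mother, father):
        # Normalise the spelling so that "Aa" and "aA" count as the same genotype.
        genotype = "".join(sorted((m_allele, f_allele), reverse=True))
        counts[genotype] += 1
    return tuple(Fraction(counts[g], 4) for g in GENOTYPES)


for mother in GENOTYPES:
    row = [child_distribution(mother, father) for father in GENOTYPES]
    print(mother, [tuple(float(p) for p in dist) for dist in row])
```

Exact fractions are used so that the generated rows can be compared with the table without rounding issues.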

Unknown parents
Two unknown parents: Assume that the distribution reflects the population probabilities of being healthy or carrier (if they had been diseased, they would not have survived until breeding age).
One unknown parent: Introduce a dummy parent reflecting the population distribution.

The diseased state aa
The state aa is impossible for all horses other than John. Two options:
Delete the state and adjust all probabilities accordingly.
Keep the state and enter the evidence that the horses are either healthy or carriers.
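For the first option, "adjusting the probabilities" amounts to removing the aa entry from a distribution and renormalising what is left. A minimal sketch, assuming distributions are given in the order (P(aa), P(aA), P(AA)); the function name is hypothetical.

```python
from fractions import Fraction


def drop_diseased(dist):
    """Condition (P(aa), P(aA), P(AA)) on 'not aa'; return (P(aA), P(AA))."""
    _p_diseased, p_carrier, p_healthy = dist
    remaining = p_carrier + p_healthy
    if remaining == 0:
        # Happens for two aa parents, which is excluded for all horses but John.
        raise ValueError("all probability mass is on the deleted state aa")
    return (p_carrier / remaining, p_healthy / remaining)


# Offspring of two carriers: (1/4, 1/2, 1/4) becomes (2/3, 1/3).
print(drop_diseased((Fraction(1, 4), Fraction(1, 2), Fraction(1, 4))))
```

The second option (keeping the state and entering evidence) gives the same posterior probabilities; it leaves the renormalisation to the inference engine instead of doing it by hand in every table.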

Obtaining the probabilities Sources: Pure data estimation (frequency counts) Model and parameter estimation from data Provided by nature Subjective expert assessments

Outline
Determining the graphical structure: milk test, mastitis diagnosis, pregnancy.
Determining the conditional probabilities.
Modeling methods and tricks.

Logical constraints: SIR model
We observe the spread of a contagious disease in a population of, say, 50 animals. Each animal is either Susceptible, Infective or Removed (recovered/dead).
Let S_i, I_i and R_i be the numbers of susceptible, infective and removed animals, respectively, at time i.

The SIR model
[Figure: network with nodes S 1, S 2, S 3, I 1, I 2, I 3, R 1, R 2, R 3.]
Basic problem: S_i, I_i and R_i are not independent. This cannot be solved by directed (causal) edges. Even though the conditional probabilities of the model may be correct, it may happen that S_i + I_i + R_i ≠ 50.

The SIR model
[Figure: the SIR network with added constraint nodes C 1, C 2, C 3, each a child of S_i, I_i and R_i.]
C_i { Valid, Invalid }
P(C_i = Valid | S_i, I_i, R_i) = 1 if S_i + I_i + R_i = 50
P(C_i = Valid | S_i, I_i, R_i) = 0 if S_i + I_i + R_i ≠ 50
Enter the evidence C_i = Valid and propagate.
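The effect of the constraint node can be illustrated for a single time step by brute-force enumeration. The sketch assumes, purely for illustration, that S, I and R have independent uniform distributions before the constraint is applied; in the real model these distributions come from the network. Names are hypothetical.

```python
N = 50  # population size from the slide


def uniform_prior(n):
    """Hypothetical marginal: every count 0..n equally likely."""
    return [1.0 / (n + 1)] * (n + 1)


def posterior_given_valid(p_s, p_i, p_r, n):
    """P(S, I, R | C = Valid), where C = Valid iff S + I + R = n."""
    weights = {}
    for s in range(n + 1):
        for i in range(n + 1):
            for r in range(n + 1):
                # Deterministic table of the constraint node C.
                p_valid = 1.0 if s + i + r == n else 0.0
                w = p_s[s] * p_i[i] * p_r[r] * p_valid
                if w > 0.0:
                    weights[(s, i, r)] = w
    total = sum(weights.values())  # this is P(C = Valid)
    return {config: w / total for config, w in weights.items()}


prior = uniform_prior(N)
posterior = posterior_given_valid(prior, prior, prior, N)
print("configurations with positive probability:", len(posterior))
print("all sum to N:", all(sum(config) == N for config in posterior))
```

After the evidence C = Valid has been entered, only configurations with S + I + R = 50 keep a positive probability, so the three variables have become dependent without any directed edge between them.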

Logical constraints Refer also to the sock-sorting problem in the textbook.

Object oriented Bayesian networks
For large networks with many more or less identical items, an object oriented approach may be relevant. Object oriented networks are like all other Bayesian networks, but the object oriented approach makes the construction phase easier.
Example: We wish to construct a model of a cow herd with a number of cows.
We construct a herd model in which the variables are defined at herd level: Conception rate, Heat detection rate.
We construct a cow model describing variables relating to individual cows. This cow model is used as a template for the construction of separate cow objects having the same structure: Pregnancy status, Heat detection(s), Pregnancy diagnosis.
Thus, we can easily create as many cows as we want and afterwards link each cow to the herd model.

Object oriented Bayesian networks: Cow class
A model of a dairy cow, used as a template for cow objects. The cow has been inseminated, and we observe it for heat. We also perform a pregnancy diagnosis.
The value of the variable Pregnant depends on a herd level conception rate (CR). The outcomes of the Heat variables depend on the pregnancy state and the herd level heat detection rate (HDR).

Object oriented Bayesian networks: Herd class A model of a dairy herd. It contains the herd level variables CR and HDR and a number of cow objects. Each cow object is created from the cow class (template). The model is shown with the objects collapsed.

Object oriented Bayesian networks: Herd class expanded
With the expanded view, the structure of the objects is visible. The expanded model is just an ordinary Bayesian network.
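The idea that an expanded object oriented network is just an ordinary network can be sketched in code. The template below is a simplification under stated assumptions: the cow class has a single Heat node, and the Diagnosis node is assumed to have Pregnant as its only parent (the slides do not show the edges of the pregnancy diagnosis). The "herd." prefix and all names are hypothetical conventions of this sketch, not part of any Bayesian network tool.

```python
# Cow class: node -> parents. Parents starting with "herd." refer to
# herd level variables that are shared by all cow objects.
COW_CLASS = {
    "Pregnant": ["herd.CR"],
    "Heat": ["Pregnant", "herd.HDR"],
    "Diagnosis": ["Pregnant"],  # assumption: the diagnosis depends on Pregnant only
}


def expand_herd(n_cows):
    """Instantiate n_cows cow objects and return one flat parent table."""
    network = {"CR": [], "HDR": []}  # herd level variables
    for k in range(1, n_cows + 1):
        prefix = f"cow{k}."
        for node, parents in COW_CLASS.items():
            network[prefix + node] = [
                # Herd level parents are shared; other parents are local copies.
                p[len("herd."):] if p.startswith("herd.") else prefix + p
                for p in parents
            ]
    return network


herd = expand_herd(3)
for node, parents in herd.items():
    print(node, "<-", parents)
```

Every cow object gets its own copies of the class nodes, while CR and HDR exist only once. Adding a cow is a matter of changing one number, which is the convenience the object oriented approach offers during construction.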