Bayesian Networks and Decision Graphs


1 Bayesian Networks and Decision Graphs Chapter 3 Chapter 3 p. 1/47

2 Building models Milk from a cow may be infected. To detect whether or not the milk is infected, you can apply a test which may either give a positive or a negative test result. The test is not perfect: It may give false positives as well as false negatives. Chapter 3 p. 2/47

3 Building models Milk from a cow may be infected. To detect whether or not the milk is infected, you can apply a test which may either give a positive or a negative test result. The test is not perfect: It may give false positives as well as false negatives. Hypothesis events Inf: y,n Information events Test: pos,neg Chapter 3 p. 2/47

4 7-day model I Infections develop over time: Inf 1 Inf 2 Inf 3 Inf 4 Inf 5 Inf 6 Inf 7 Test 1 Test 2 Test 3 Test 4 Test 5 Test 6 Test 7 Chapter 3 p. 3/47

5 7-day model I Infections develop over time: Inf 1 Inf 2 Inf 3 Inf 4 Inf 5 Inf 6 Inf 7 Test 1 Test 2 Test 3 Test 4 Test 5 Test 6 Test 7 Assumption: The Markov property: If I know the present, then the past has no influence on the future, i.e. Inf i-1 is d-separated from Inf i+1 given Inf i. But what if yesterday's Inf-state has an impact on tomorrow's Inf-state? Chapter 3 p. 3/47

6 7-day model II Yesterday's Inf-state has an impact on tomorrow's Inf-state: Inf 1 Inf 2 Inf 3 Inf 4 Inf 5 Inf 6 Inf 7 Test 1 Test 2 Test 3 Test 4 Test 5 Test 6 Test 7 Chapter 3 p. 4/47

7 7-day model III The test-failure is dependent on whether or not the test failed yesterday: Inf 1 Inf 2 Inf 3 Inf 4 Inf 5 Inf 6 Inf 7 Test 1 Test 2 Test 3 Test 4 Test 5 Test 6 Test 7 Chapter 3 p. 5/47

8 Sore throat I wake up one morning with a sore throat. It may be the beginning of a cold or I may suffer from angina. If it is a severe angina, then I will not go to work. To gain more insight, I can take my temperature and look down my throat for yellow spots. Chapter 3 p. 6/47

9 Sore throat I wake up one morning with a sore throat. It may be the beginning of a cold or I may suffer from angina. If it is a severe angina, then I will not go to work. To gain more insight, I can take my temperature and look down my throat for yellow spots. Hypothesis variables: Cold? - {n, y} Angina? - {no, mild, severe} Information variables: Sore throat? - {n, y} See spots? - {n, y} Fever? - {no, low, high} Chapter 3 p. 6/47

10 Model for sore throat Cold? Angina? Fever? Sore Throat? See spots? Chapter 3 p. 7/47

12 Insemination of a cow Six weeks after the insemination of a cow, there are two tests: a Blood test and a Urine test. Pregnant? Blood test Urine test Chapter 3 p. 8/47

13 Insemination of a cow Six weeks after the insemination of a cow, there are two tests: a Blood test and a Urine test. Pregnant? Blood test Urine test Check the conditional independences: If we know that the cow is pregnant, will a negative blood test then change our expectation for the urine test? If it will, then the model does not reflect reality! Chapter 3 p. 8/47

14 Insemination of a cow: A more correct model Pregnant? Mediating variable Hormonal changes Blood test Urine test But does this actually make a difference? Chapter 3 p. 9/47

15 Insemination of a cow: A more correct model Pregnant? Mediating variable Hormonal changes Blood test Urine test But does this actually make a difference? Assume that both tests are negative in the incorrect model: This will overestimate the probability for Pregnant?=n. Pregnant? Blood test Urine test Chapter 3 p. 9/47

16 Why mediating variables? Why do we introduce mediating variables: Necessary to catch the correct conditional independences. Can ease the specification of the probabilities in the model. For example: If you find that there is a dependence between two variables A and B, but cannot determine a causal relation: Try with a mediating variable! [Diagram: the unresolved dependence A ?? B is replaced by structures in which a mediating variable C (possibly together with a further variable D) is connected to both A and B.] Chapter 3 p. 10/47

17 A simplified poker game The game consists of: Two players. Three cards to each player. Two rounds of changing cards (max two cards in the second round) What kind of hand does my opponent have? Chapter 3 p. 11/47

18 A simplified poker game The game consists of: Two players. Three cards to each player. Two rounds of changing cards (max two cards in the second round) Hypothesis variable: What kind of hand does my opponent have? OH - {no, 1a, 2v, fl, st, 3v, sf} Information variables: FC - {0, 1, 2, 3} and SC - {0, 1, 2} Chapter 3 p. 11/47

19 A simplified poker game The game consists of: Two players. Three cards to each player. Two rounds of changing cards (max two cards in the second round) Hypothesis variable: What kind of hand does my opponent have? OH - {no, 1a, 2v, fl, st, 3v, sf} Information variables: FC - {0, 1, 2, 3} and SC - {0, 1, 2} But how do we find P(FC), P(SC | FC) and P(OH | SC, FC)? Chapter 3 p. 11/47

20 A simplified poker game: Mediating variables Introduce mediating variables: The opponent's initial hand, OH 0. The opponent's hand after the first change of cards, OH 1. OH 0 OH 1 OH FC SC Note: The states of OH 0 and OH 1 are different from OH. Chapter 3 p. 12/47

21 Naïve Bayes models Hyp P(Hyp) Inf 1 P(Inf 1 | Hyp) Inf n P(Inf n | Hyp) We want the posterior probability of the hypothesis variable Hyp given the observations {Inf 1 = e 1, ..., Inf n = e n}: P(Hyp | Inf 1 = e 1, ..., Inf n = e n) = P(Inf 1 = e 1, ..., Inf n = e n | Hyp) P(Hyp) / P(Inf 1 = e 1, ..., Inf n = e n) = µ P(Inf 1 = e 1 | Hyp) · · · P(Inf n = e n | Hyp) P(Hyp), where µ is the normalization constant 1 / P(Inf 1 = e 1, ..., Inf n = e n). Note: The model assumes that the information variables are independent given the hypothesis variable. Chapter 3 p. 13/47
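To make the computation concrete, here is a minimal sketch of the naïve Bayes posterior in Python; the variable names and all numbers are illustrative assumptions, not values taken from the slides:

```python
# Minimal naive Bayes posterior sketch (illustrative numbers only).

def naive_bayes_posterior(prior, likelihoods, evidence):
    """prior: {hyp_state: P(Hyp)}.
    likelihoods: {info_var: {hyp_state: {observation: P(obs | hyp_state)}}}.
    evidence: {info_var: observed value}."""
    unnorm = {}
    for h, p_h in prior.items():
        p = p_h
        for var, obs in evidence.items():
            p *= likelihoods[var][h][obs]       # P(Inf_i = e_i | Hyp = h)
        unnorm[h] = p
    z = sum(unnorm.values())                    # proportional to P(e_1, ..., e_n)
    return {h: p / z for h, p in unnorm.items()}

# Hypothetical two-state hypothesis with two information variables.
prior = {"y": 0.1, "n": 0.9}
likelihoods = {
    "Inf1": {"y": {"pos": 0.9, "neg": 0.1}, "n": {"pos": 0.2, "neg": 0.8}},
    "Inf2": {"y": {"pos": 0.7, "neg": 0.3}, "n": {"pos": 0.1, "neg": 0.9}},
}
print(naive_bayes_posterior(prior, likelihoods, {"Inf1": "pos", "Inf2": "neg"}))
```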

22 Summary: Catching the structure 1. Identify the relevant events and organize them in variables: Hypothesis variables - Includes the events that are not directly observable. Information variables - Information channels. 2. Determine causal relations between the variables. 3. Check conditional independences in the model. 4. Introduce mediating variables. Chapter 3 p. 14/47

23 Where do the numbers come from? Theoretical insight. Statistics (large databases) Subjective estimates Chapter 3 p. 15/47

24 Infected milk Inf Test We need the probabilities: P(Test | Inf) - provided by the factory. P(Inf) - cow or farm specific. Determining P(Inf): Assume that the farmer has 50 cows. The milk is poured into a container, and the dairy tests the milk with a very precise test. On average, the milk is infected once per month. Chapter 3 p. 16/47

25 Infected milk Inf Test We need the probabilities: P(Test | Inf) - provided by the factory. P(Inf) - cow or farm specific. Determining P(Inf): Assume that the farmer has 50 cows. The milk is poured into a container, and the dairy tests the milk with a very precise test. On average, the milk is infected once per month. Calculations: P(#Cows-infected ≥ 1) = 1/30, hence P(#Cows-infected < 1) = 29/30. If P(Inf = y) = x, then P(Inf = n) = (1 − x), and (1 − x)^50 = 29/30 yields x ≈ 0.0007. Chapter 3 p. 16/47
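The value of x is easy to check numerically:

```python
# Solve (1 - x)**50 = 29/30 for the per-cow infection probability x.
x = 1 - (29 / 30) ** (1 / 50)
print(round(x, 6))  # about 0.000678, i.e. roughly 0.0007
```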

26 7-day model I Infections develop over time: Inf i Inf i+1 Test i Test i+1 From experience we have: Risk of becoming infected? Chance of getting cured from one day to another? 0.3 Chapter 3 p. 17/47

27 7-day model I Infections develop over time: Inf i Inf i+1 Test i Test i+1 From experience we have: Risk of becoming infected? Chance of getting cured from one day to another? 0.3 This gives us the table P(Inf i+1 | Inf i) over the states {y, n}; for example, P(Inf i+1 = n | Inf i = y) = 0.3. Chapter 3 p. 17/47

29 7-day model II Inf i-1 Inf i Inf i+1 Test i-1 Test i Test i+1 The table P(Inf i+1 = y | Inf i-1, Inf i) (over Inf i-1 ∈ {y, n} and Inf i ∈ {y, n}) encodes that: An infection always lasts at least two days. After two days, the chance of being cured is 0.4. However, we also need to specify P(Test i+1 | Inf i+1, Test i, Inf i): A correct test has a 99.9% chance of being correct the next time. An incorrect test has a 90% chance of being incorrect the next time. This can be done much more easily by introducing mediating variables! Chapter 3 p. 18/47

30 7-day model III Inf i-1 Inf i Inf i+1 Cor i-1 Cor i Test i-1 Test i Test i+1 We need the probabilities P(Cor i = y | Inf i, Test i) (over Test i ∈ {Pos, Neg} and Inf i ∈ {y, n}) and P(Test i = Pos | Inf i, Cor i-1) (over Inf i ∈ {y, n} and Cor i-1 ∈ {y, n}). Chapter 3 p. 19/47

31 7-day model III Inf i-1 Inf i Inf i+1 Cor i-1 Cor i Test i-1 Test i Test i+1 We need the probabilities: P(Cor i = y | Inf i, Test i) is deterministic: it is 1 when the test matches the infection state, i.e. for (Inf i = y, Test i = Pos) and (Inf i = n, Test i = Neg), and 0 otherwise. P(Test i = Pos | Inf i, Cor i-1) is then filled in from the correctness rules on the previous slide. Chapter 3 p. 19/47
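As a sketch, the two tables can be laid out as small arrays; the deterministic Cor-table follows the slide, while the numbers in the Test-table are an assumption derived from the 99.9%/90% rules two slides back rather than values read off the deck:

```python
import numpy as np

# P(Cor_i = y | Inf_i, Test_i): the test is "correct" exactly when it matches
# the infection state. Rows: Test_i in (Pos, Neg); columns: Inf_i in (y, n).
p_cor_y = np.array([[1.0, 0.0],
                    [0.0, 1.0]])

# P(Test_i = Pos | Inf_i, Cor_{i-1}). Assumption: the 99.9% / 90% rules are read
# as P(test i correct | Cor_{i-1} = y) = 0.999 and P(correct | Cor_{i-1} = n) = 0.1,
# where "correct" means Pos iff Inf_i = y.
# Rows: Inf_i in (y, n); columns: Cor_{i-1} in (y, n).
p_correct = {"y": 0.999, "n": 0.1}
p_test_pos = np.array([
    [p_correct["y"], p_correct["n"]],            # Inf_i = y: Pos when correct
    [1 - p_correct["y"], 1 - p_correct["n"]],    # Inf_i = n: Pos when incorrect
])
print(p_cor_y)
print(p_test_pos)
```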

32 Stud farm Genealogical structure for the horses in a stud farm: L Ann Brian Cecily K Fred Dorothy Eric Gwen Henry Irene John e We get evidence e that John is sick. Chapter 3 p. 20/47

33 Stud farm: Conditional probabilities I The disease is carried by a recessive gene: aa: Sick, aA: Carrier, AA: Healthy. We should specify the probabilities P(Offspring | Father, Mother), where Father and Mother each range over {aa, aA, AA}. Chapter 3 p. 21/47

34 Stud farm: Conditional probabilities I The disease is carried by a recessive gene: aa: Sick, aA: Carrier, AA: Healthy. P(Offspring | Father, Mother), with entries ordered (aa, aA, AA): Father = aa: Mother = aa (1, 0, 0), Mother = aA (0.5, 0.5, 0), Mother = AA (0, 1, 0); Father = aA: Mother = aa (0.5, 0.5, 0), Mother = aA (0.25, 0.5, 0.25), Mother = AA (0, 0.5, 0.5); Father = AA: Mother = aa (0, 1, 0), Mother = aA (0, 0.5, 0.5), Mother = AA (0, 0, 1). Chapter 3 p. 21/47
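The table can also be generated rather than typed in, by summing over which allele each parent passes on; a minimal sketch:

```python
from itertools import product

# Build P(Offspring | Father, Mother) for a one-locus recessive disease:
# each parent passes one of its two alleles on with probability 0.5.
GENOTYPES = ["aa", "aA", "AA"]

def normalize(pair):
    # 'Aa' and 'aA' denote the same genotype; use the spelling from the slides.
    return "aA" if set(pair) == {"a", "A"} else pair

def offspring_dist(father, mother):
    dist = {g: 0.0 for g in GENOTYPES}
    for allele_f, allele_m in product(father, mother):
        dist[normalize(allele_f + allele_m)] += 0.25
    return dist

for f in GENOTYPES:
    for m in GENOTYPES:
        print(f, m, offspring_dist(f, m))
# e.g. aA x aA -> {'aa': 0.25, 'aA': 0.5, 'AA': 0.25}, matching the table above.
```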

35 Stud farm: Conditional probabilities II But the other horses are not sick: John: aa, aA, AA. Other horses: aA, AA. Prior probabilities: P(aA) = 0.01 and P(AA) = 0.99. Conditional probabilities for John, P(John | Henry, Irene) with entries (aa, aA, AA): Henry = aA: Irene = aA (0.25, 0.5, 0.25), Irene = AA (0, 0.5, 0.5); Henry = AA: Irene = aA (0, 0.5, 0.5), Irene = AA (0, 0, 1). For the other horses we still need P(Offspring | Father, Mother) over the two states (aA, AA). Chapter 3 p. 22/47

36 Stud farm: Conditional probabilities II But the other horses are not sick: John: aa, aA, AA. Other horses: aA, AA. Prior probabilities: P(aA) = 0.01 and P(AA) = 0.99. Conditional probabilities for John, P(John | Henry, Irene) with entries (aa, aA, AA): Henry = aA: Irene = aA (0.25, 0.5, 0.25), Irene = AA (0, 0.5, 0.5); Henry = AA: Irene = aA (0, 0.5, 0.5), Irene = AA (0, 0, 1). For the other horses, P(Offspring | Father, Mother) with entries (aA, AA): Father = aA: Mother = aA (2/3, 1/3), Mother = AA (0.5, 0.5); Father = AA: Mother = aA (0.5, 0.5), Mother = AA (0, 1). Drop the first state and normalize: (0.25, 0.5, 0.25) → (2/3, 1/3). Chapter 3 p. 22/47
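The "drop the first state and normalize" step is ordinary conditioning on the offspring not being sick; a quick check:

```python
# Condition P(Offspring) = (aa, aA, AA) = (0.25, 0.5, 0.25) on "not aa".
dist = {"aa": 0.25, "aA": 0.5, "AA": 0.25}
not_sick = {g: p for g, p in dist.items() if g != "aa"}
z = sum(not_sick.values())
print({g: p / z for g, p in not_sick.items()})  # aA -> 2/3, AA -> 1/3
```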

37 A simplified poker game I OH 0 OH 1 OH FC SC In order to find P(OH 0 ) over the states (no, 1a, 2cons, 2s, 2v, fl, st, 3v, sf) we have to go into combinatorics: each probability is of the form #good hands / #possible hands. Chapter 3 p. 23/47

38 A simplified poker game I OH 0 OH 1 OH FC SC P(OH 0 ) assigns a probability to each of the states (no, 1a, 2cons, 2s, 2v, fl, st, 3v, sf), e.g. P(OH 0 = no) = 0.167, and is found by combinatorics: for example, P(OH 0 = st) = #hands giving a straight / C(52, 3). Similar considerations apply to P(OH 1 | OH 0, FC); e.g. in P(OH 1 | 2cons, 1) the states no, 1a, fl, 3v and sf all get probability 0. Chapter 3 p. 23/47

39 A simplified poker game II OH 0 OH 1 OH FC SC Theoretical considerations are not enough: P(FC | OH 0 ) = ? What is my opponent's strategy? Assume the strategy: no → 3; 1a → 2; 2s, 2cons, 2v → 1; 2cons 2s → 1 (keep 2s); 2cons 2v, 2s 2v → 1 (keep 2v); fl, st, 3v, sf → 0. Note: the states 2cons 2s, 2cons 2v, 2s 2v are redundant. Chapter 3 p. 24/47

40 A simplified poker game II OH 0 OH 1 OH FC SC Theoretical considerations are not enough: P(FC | OH 0 ) = ? What is my opponent's strategy? Assume the strategy: no → 3; 1a → 2; 2s, 2cons, 2v → 1; 2cons 2s → 1 (keep 2s); 2cons 2v, 2s 2v → 1 (keep 2v); fl, st, 3v, sf → 0. Note: the states 2cons 2s, 2cons 2v, 2s 2v are redundant. However, knowing my system my opponent may bluff. Chapter 3 p. 24/47
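A deterministic strategy like the one assumed above translates directly into the table P(FC | OH 0); a minimal sketch (state names as on the slide, mapping given by the assumed non-bluffing strategy):

```python
# P(FC | OH_0) from the assumed (non-bluffing) change strategy:
# each initial hand maps deterministically to a number of changed cards.
FC_STATES = [0, 1, 2, 3]
strategy = {
    "no": 3, "1a": 2,
    "2s": 1, "2cons": 1, "2v": 1,
    "2cons 2s": 1, "2cons 2v": 1, "2s 2v": 1,
    "fl": 0, "st": 0, "3v": 0, "sf": 0,
}

def p_fc_given_oh0(oh0):
    # A degenerate distribution here; the same shape could hold a softer table
    # if the opponent is believed to bluff.
    return [1.0 if fc == strategy[oh0] else 0.0 for fc in FC_STATES]

print(p_fc_given_oh0("1a"))  # [0.0, 0.0, 1.0, 0.0]
```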

41 Transmission of symbol strings A language L over {a, b} is transmitted through a channel. Each word is surrounded by c. In the transmission some characters may be corrupted by noise and may be confused with others. A five-letter word has been transmitted. Hypothesis variables: Information variables: Chapter 3 p. 25/47

42 Transmission of symbol strings A language L over {a, b} is transmitted through a channel. Each word is surrounded by c. In the transmission some characters may be corrupted by noise and may be confused with others. A five-letter word has been transmitted. Hypothesis variables: T 1, T 2, T 3, T 4, T 5 (States: a, b) Information variables: R 1, R 2, R 3, R 4, R 5 (States: a, b, c) T 1 T 2 T 3 T 4 T 5 R 1 R 2 R 3 R 4 R 5 Chapter 3 p. 25/47

43 Transmission of symbol strings A language L over {a, b} is transmitted through a channel. Each word is surrounded by c. In the transmission some characters may be corrupted by noise and may be confused with others. A five-letter word has been transmitted. Hypothesis variables: T 1, T 2, T 3, T 4, T 5 (States: a, b) Information variables: R 1, R 2, R 3, R 4, R 5 (States: a, b, c) T 1 T 2 T 3 T 4 T 5 R 1 R 2 R 3 R 4 R 5 P(R i | T i) can be determined through statistics: a table over T i ∈ {a, b} and R i ∈ {a, b, c}. Are the T i's independent? Chapter 3 p. 25/47

44 Transmission of symbol strings T 1 T 2 T 3 T 4 T 5 R 1 R 2 R 3 R 4 R 5 To find P(T i+1 | T i): Look at the permitted words and their frequencies: a table of the first two letters (aa, ab, ba, bb) against the last three letters (aaa, aab, aba, abb, baa, bab, bba, bbb). Chapter 3 p. 26/47

45 Transmission of symbol strings T 1 T 2 T 3 T 4 T 5 R 1 R 2 R 3 R 4 R 5 To find P(T i+1 | T i): Look at the permitted words and their frequencies: a table of the first two letters (aa, ab, ba, bb) against the last three letters (aaa, aab, aba, abb, baa, bab, bba, bbb). P(T 2 = a | T 1 = a) = P(T 2 = a, T 1 = a) / P(T 1 = a) = 0.6 Chapter 3 p. 26/47
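The same computation is easy to mechanize once the word frequencies are known; a sketch with placeholder frequencies (the slide's actual numbers are not reproduced here):

```python
# P(T_2 | T_1) from frequencies of the permitted five-letter words over {a, b}.
# The frequencies below are placeholders, not the slide's values.
word_freq = {"aabaa": 0.10, "aabab": 0.15, "ababb": 0.25,
             "babaa": 0.20, "bbaab": 0.10, "bbbba": 0.20}

def p_t2_given_t1(t2, t1):
    # P(T_2 = t2 | T_1 = t1) = P(T_1 = t1, T_2 = t2) / P(T_1 = t1)
    num = sum(f for w, f in word_freq.items() if w.startswith(t1 + t2))
    den = sum(f for w, f in word_freq.items() if w.startswith(t1))
    return num / den

print(p_t2_given_t1("a", "a"))  # 0.25 / 0.50 = 0.5 with these placeholder numbers
```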

46 Cold or angina? I Cold? Angina? Fever? Sore Throat? See spots? Subjective estimates: P(Cold?) = (0.97, 0.03), P(Angina?) = (0.993, 0.005, 0.002) and P(See spots? | Angina?) over Angina? ∈ {no, mild, severe}. But how do we find e.g. P(Sore throat? | Angina?, Cold?)? Chapter 3 p. 27/47

47 Cold or angina? II Cold? Angina? Fever? Sore Throat? See spots? If neither Cold? nor Angina?, then P(Sore throat? = y) = 0.05. If only Cold?, then P(Sore throat? = y) = 0.4. If only Angina? = mild, then P(Sore throat? = y) = 0.7. If Angina? = severe, then P(Sore throat? = y) = 1. This gives the partially specified table P(Sore throat? = y | Cold?, Angina?) over Angina? ∈ {no, mild, severe}: Cold? = no: (0.05, 0.7, 1); Cold? = yes: (0.4, ?, 1). Chapter 3 p. 28/47

48 Cold or angina? III We have the partial specification of P(Sore throat? = yes | Cold?, Angina?) over Angina? ∈ {no, mild, severe}: Cold? = no: (0.05, 0.7, 1); Cold? = yes: (0.4, ?, 1). In order to find P(Sore throat? = yes | Cold? = yes, Angina? = mild) assume that: Out of 100 mornings, I have a background sore throat on 5 of them. Of the 95 left, 40% are cold-sore: 38. Of the 57 left, 70% are mild-angina-sore: 39.9. In total: 5 + 38 + 39.9 = 82.9, i.e. a probability of 0.829. Chapter 3 p. 29/47

49 Cold or angina? III We have the partial specification of P(Sore throat? = yes | Cold?, Angina?) over Angina? ∈ {no, mild, severe}: Cold? = no: (0.05, 0.7, 1); Cold? = yes: (0.4, ?, 1). In order to find P(Sore throat? = yes | Cold? = yes, Angina? = mild) assume that: Out of 100 mornings, I have a background sore throat on 5 of them. Of the 95 left, 40% are cold-sore: 38. Of the 57 left, 70% are mild-angina-sore: 39.9. In total: 5 + 38 + 39.9 = 82.9. The completed table therefore has P(Sore throat? = yes | Cold? = yes, Angina? = mild) = 0.829. Chapter 3 p. 29/47
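The counting argument amounts to combining the three causes as if their failures to produce a sore throat were independent (the idea formalized on the next slides); a quick check of the 0.829:

```python
# Combine background (0.05), cold (0.4) and mild angina (0.7) as independent
# causes: P(sore) = 1 - product of the probabilities that each cause fails.
p_background, p_cold, p_mild_angina = 0.05, 0.4, 0.7
p_sore = 1 - (1 - p_background) * (1 - p_cold) * (1 - p_mild_angina)
print(round(p_sore, 3))  # 0.829, matching the 82.9 out of 100 mornings
```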


54 Several independent causes, in general Cause 1, ..., Cause n are all parents of Effect. Cause i results in Effect with probability x i. P(Effect = yes | Combination of causes) = ?? Chapter 3 p. 33/47

55 Several independent causes, in general Cause 1, ..., Cause n are all parents of Effect. Cause i results in Effect with probability x i. P(Effect = yes | Combination of causes) = ?? Way to look at it: Cause i results in Effect unless it is inhibited by something. The Inhibitor has probability q i = 1 − x i. Assumption: The Inhibitors are independent. That is, the probability of Inh i and Inh j is q i q j. Thus, P(Effect = yes | Cause i, Cause j, Cause k) = 1 − q i q j q k. Chapter 3 p. 33/47

56 Noisy or A 1, ..., A n are the parents of B, with link probabilities 1 − q 1, ..., 1 − q n. All nodes are binary. All causes for B are listed explicitly. In general: P(B = y | A i1 = · · · = A ik = y, the rest = n) = 1 − q i1 · · · q ik. If only A 1 and A 2 are on: P(B = y | A 1 = y, A 2 = y, the rest = n) = 1 − q 1 q 2. Note: If P(B = y | All = n) = x > 0, then introduce a background cause C which is always on, with q C = 1 − x. Chapter 3 p. 34/47
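A complete noisy-or table can be generated from the inhibitor probabilities q i; a minimal sketch with illustrative q values:

```python
from itertools import product

# Noisy-or: P(B = y | parents) = 1 - product of q_i over the parents that are on.
q = [0.2, 0.5, 0.9]      # illustrative inhibitor probabilities q_1, q_2, q_3
q_background = 1.0       # replace by 1 - x to model an always-on background cause

def p_b_yes(parent_states):
    prob_all_inhibited = q_background
    for q_i, on in zip(q, parent_states):
        if on:
            prob_all_inhibited *= q_i
    return 1 - prob_all_inhibited

for states in product([False, True], repeat=len(q)):
    print(states, round(p_b_yes(states), 3))
```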

57 Divorcing [Diagrams: B with parents A 1, A 2, A 3, A 4, and divorced versions in which mediating variables C 1 and C 2 take over groups of the A i as parents of B, reducing the size of B's table.] Can this always be done? Chapter 3 p. 35/47

58 Divorcing [Diagrams: B with parents A 1, A 2, A 3, A 4, and divorced versions in which mediating variables C 1 and C 2 take over groups of the A i as parents of B.] Can this always be done? Example variables: Price, Condition, Size, Age, Brand, Buy? Chapter 3 p. 35/47

59 Divorcing: An example A: a 1, a 2, a 3; B: b 1, b 2, b 3; D: d 1, d 2, d 3; E: e 1, e 2, e 3 are the parents of F, and P(F | a 1, b 2, D, E) = P(F | a 2, b 1, D, E), P(F | a 1, b 1, D, E) = P(F | a 2, b 2, D, E), P(F | a 3, b i, D, E) = P(F | a j, b 3, D, E). Chapter 3 p. 36/47

60 Divorcing: An example A: a 1, a 2, a 3; B: b 1, b 2, b 3; C: c 1, c 2, c 3; D: d 1, d 2, d 3; E: e 1, e 2, e 3. P(F | a 1, b 2, D, E) = P(F | a 2, b 1, D, E), P(F | a 1, b 1, D, E) = P(F | a 2, b 2, D, E), P(F | a 3, b i, D, E) = P(F | a j, b 3, D, E). Introduce C as a child of A and B and a parent of F: P(C | a 1, b 2) = P(C | a 2, b 1) = (1, 0, 0), P(C | a 1, b 1) = P(C | a 2, b 2) = (0, 1, 0), P(C | a 3, b i) = P(C | a j, b 3) = (0, 0, 1), and P(F | c 1, D, E) = P(F | a 1, b 2, D, E), P(F | c 2, D, E) = P(F | a 1, b 1, D, E), P(F | c 3, D, E) = P(F | a 3, b i, D, E). Chapter 3 p. 36/47
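In code, the mediating variable C is simply a (here deterministic) relabelling of the (A, B) configurations that F cannot distinguish; a sketch:

```python
# Divorcing A and B from F's parent set: C groups the (A, B) configurations
# that give identical conditional distributions for F.
def c_given_ab(a, b):
    # Deterministic version of P(C | A, B) from the slide, returning the state.
    if a == "a3" or b == "b3":
        return "c3"
    if (a, b) in {("a1", "b2"), ("a2", "b1")}:
        return "c1"
    return "c2"                      # (a1, b1) and (a2, b2)

# F then only needs P(F | C, D, E): three C-states instead of nine (A, B) pairs.
print(c_given_ab("a1", "b2"), c_given_ab("a2", "b2"), c_given_ab("a3", "b1"))
```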

61 Logical constraints I have washed two pairs of socks, and now it is hard to distinguish them. Still it is important for me to couple them correctly. The color and pattern give indications. Variables: pattern P i: {p 1, p 2}, color C i: {c 1, c 2} and type Sock i: {t 1, t 2} for each sock i = 1, ..., 4. Chapter 3 p. 37/47

62 Logical constraints I have washed two pairs of socks, and now it is hard to distinguish them. Still it is important for me to couple them correctly. The color and pattern give indications. Variables: pattern P i: {p 1, p 2}, color C i: {c 1, c 2} and type Sock i: {t 1, t 2} for each sock i = 1, ..., 4. Add a constraint variable Const.: {y, n} with P(Const. = y | Sock 1, ..., Sock 4) (rows (Sock 1, Sock 2), columns (Sock 3, Sock 4)) equal to 1 exactly for the configurations with two socks of each type, and 0 otherwise; entering the evidence Const. = y enforces the constraint. Chapter 3 p. 37/47

63 Expert disagreement There are three experts, each providing a table P i(D = y | B, C), i = 1, 2, 3, for a network over A, B, C and D in which D has parents B and C. I believe twice as much in P 3 as I do in the others! Chapter 3 p. 38/47

64 Expert disagreement There are three experts, each providing a table P i(D = y | B, C), i = 1, 2, 3. I believe twice as much in P 3 as I do in the others! Add a variable S and encode the confidence in P(S): P(S) = (0.25, 0.25, 0.5); hence P(D = y | B, C, S) has the entry triples (expert 1, expert 2, expert 3): (0.4, 0.4, 0.6), (0.7, 0.9, 0.7), (0.6, 0.4, 0.5) and (0.9, 0.7, 0.9) for the four (B, C) configurations. Chapter 3 p. 38/47
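Marginalizing S out of P(D | B, C, S) with the weights P(S) gives the combined table; a quick sketch (the assignment of the four triples to specific (B, C) configurations follows the slide's reading order and should be taken as an assumption):

```python
# Weighted combination of the three experts via the auxiliary variable S.
p_s = [0.25, 0.25, 0.5]               # confidence in experts 1, 2 and 3
p_d_yes = {                           # triples (P1, P2, P3) per configuration
    ("B=y", "C=y"): [0.4, 0.4, 0.6],
    ("B=y", "C=n"): [0.7, 0.9, 0.7],
    ("B=n", "C=y"): [0.6, 0.4, 0.5],
    ("B=n", "C=n"): [0.9, 0.7, 0.9],
}
# P(D = y | B, C) = sum_s P(S = s) * P_s(D = y | B, C)
for config, experts in p_d_yes.items():
    combined = sum(w * p for w, p in zip(p_s, experts))
    print(config, round(combined, 3))
```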

65 Interventions Clean the spark plugs: Fuel SP FM St? Chapter 3 p. 39/47

66 Interventions Clean the spark plugs: Fuel SP Clean FM St? SP-C St-C? Chapter 3 p. 39/47

67 Joint probabilities I Cold? Angina? Fever? Sore Throat? See spots? It is not unusual to suffer from both cold and angina, so we look for the joint probability: P(Angina?, Cold? | ē) Chapter 3 p. 40/47

68 Joint probabilities I Cold? Angina? Fever? Sore Throat? See spots? It is not unusual to suffer from both cold and angina, so we look for the joint probability: P(Angina?, Cold? | ē). From the fundamental rule we have: P(Angina?, Cold? | ē) = P(Angina? | Cold?, ē) P(Cold? | ē). The probability P(Cold? | ē) can be found by propagating ē. P(Angina? | Cold?, ē) can be found from P(Angina? | Cold? = yes, ē) and P(Angina? | Cold? = no, ē). Chapter 3 p. 40/47

69 Joint probabilities II From the fundamental rule we have: P(Angina?, Cold? | ē) = P(Angina? | Cold?, ē) P(Cold? | ē). Assume that: e = (Fever? = no, SeeSpots? = yes, SoreThroat? = no). We can calculate P(Cold? | ē), as well as P(Angina? | Cold? = yes, ē) and P(Angina? | Cold? = no, ē), and from these the table P(Angina?, Cold? | ē) over Cold? ∈ {no, yes} and Angina? ∈ {no, mild, severe}. Chapter 3 p. 41/47

70 Joint probabilities II From the fundamental rule we have: P(Angina?, Cold? | ē) = P(Angina? | Cold?, ē) P(Cold? | ē). Assume that: e = (Fever? = no, SeeSpots? = yes, SoreThroat? = no). We can calculate: P(Cold? | ē) = (0.997(n), 0.003(y)). As well as: P(Angina? | Cold? = yes, ē) = (0(n), 1(m), 0(s)) and P(Angina? | Cold? = no, ē) = (0(n), 0.971(m), 0.029(s)). We can now calculate P(Angina?, Cold? | ē) over Angina? ∈ {no, mild, severe}: Cold? = no: (0, 0.968, 0.029); Cold? = yes: (0, 0.003, 0). Chapter 3 p. 41/47
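The joint table is just the product of the propagated numbers; a quick check:

```python
# P(Angina?, Cold? | e) = P(Angina? | Cold?, e) * P(Cold? | e)
p_cold = {"no": 0.997, "yes": 0.003}
p_angina_given_cold = {
    "yes": {"no": 0.0, "mild": 1.0,   "severe": 0.0},
    "no":  {"no": 0.0, "mild": 0.971, "severe": 0.029},
}
joint = {(c, a): p_cold[c] * p_angina_given_cold[c][a]
         for c in p_cold for a in ("no", "mild", "severe")}
for config, value in sorted(joint.items()):
    print(config, round(value, 3))   # e.g. ('no', 'mild') -> 0.968
```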

71 Most probable explanation (MPE) We can find the most probable configuration of Cold? and Angina? from: P(Angina?, Cold? | ē) However, this can be achieved much faster: Use maximization instead of summation when marginalizing out a variable. Chapter 3 p. 42/47

72 Most probable explanation (MPE) We can find the most probable configuration of Cold? and Angina? from: P(Angina?, Cold? | ē) However, this can be achieved much faster: Use maximization instead of summation when marginalizing out a variable. This gives us MPE(Cold?) = no and MPE(Angina?) = mild. Chapter 3 p. 42/47

73 Is the evidence reliable? Since I see Fever? = no and SoreThroat? = no it seems questionable that I see spots! Can this warning be given by the system? Is the evidence coherent? Chapter 3 p. 43/47

74 Is the evidence reliable? Since I see Fever? = no and SoreThroat? = no it seems questionable that I see spots! Can this warning be given by the system? Is the evidence coherent? For a coherent case covered by the model we expect the pieces of evidence to support each other: P(e 1, e 2) > P(e 1) P(e 2). We can measure this using: conf(e 1, e 2) = log 2 [ P(e 1) P(e 2) / P(e 1, e 2) ]. Thus, if conf(e 1, e 2) > 0 we take it as an indication that the evidence is conflicting. Chapter 3 p. 43/47
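A minimal sketch of the conflict measure; the probabilities passed in are placeholders and would in practice come from propagation in the network:

```python
import math

def conflict(marginals, joint):
    """conf(e_1, ..., e_n) = log2( prod_i P(e_i) / P(e_1, ..., e_n) )."""
    return math.log2(math.prod(marginals) / joint)

# Placeholder numbers: a positive value hints that the evidence is conflicting.
print(conflict([0.9, 0.05, 0.8], joint=0.002))
```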

75 Example conf(Fever? = no, SeeSpots? = yes, SoreThroat? = no) = log 2 [ P(Fever? = no) P(SeeSpots? = yes) P(SoreThroat? = no) / P(Fever? = no, SeeSpots? = yes, SoreThroat? = no) ], which comes out positive. Thus, we take it as an indication that the evidence is conflicting! Chapter 3 p. 44/47

76 What are the crucial findings? We would like to answer questions such as: What are the crucial findings? What if one of the findings were changed or removed? What set of findings would be sufficient for the conclusion? Assume the conclusion that I suffer from mild angina: Chapter 3 p. 45/47

77 What are the crucial findings? We would like to answer questions such as: What are the crucial findings? What if one of the findings were changed or removed? What set of findings would be sufficient for the conclusion? Assume the conclusion that I suffer from mild angina: SeeSpots? = yes alone is not enough: P(Angina? | SeeSpots? = yes) = (0(n), 0.024(m), 0.976(s)). However, SeeSpots? = yes and SoreThroat? = no is sufficient: P(Angina? | SeeSpots? = yes, SoreThroat? = no) = (0(n), 0.884(m), 0.116(s)). In this case findings on Fever? are irrelevant, e.g.: P(Angina? | SeeSpots? = yes, SoreThroat? = no, Fever? = high) = (0(n), 0.683(m), 0.317(s)). Chapter 3 p. 45/47

78 Sensitivity to variations in parameters The initial tables: P(Sore throat? | Angina?, Cold?), over Angina? ∈ {no, mild, severe} and Cold? ∈ {no, yes}, and P(See spots? | Angina?). Chapter 3 p. 46/47

79 Sensitivity to variations in parameters The initial tables: P(Sore throat? | Angina?, Cold?), over Angina? ∈ {no, mild, severe} and Cold? ∈ {no, yes}, and P(See spots? | Angina?). Assume that we have the parameters: t in one entry of P(Sore throat? | Angina?, Cold?) and s in P(See spots? = yes | Angina?) = (0, s, 1). We want e.g.: P(Angina? = mild | ē)(t); P(Angina? = mild | ē)(s); P(Angina? = mild | ē)(s, t). Chapter 3 p. 46/47

80 Sensitivity analysis Theorem: P(ē)(t) = αt + β = x(t). Thus, we also have that P(Angina? = mild, ē)(t) = a t + b = y(t), and therefore: P(Angina? = mild | ē)(t) = y(t) / x(t). Chapter 3 p. 47/47

81 Sensitivity analysis Theorem: P(ē)(t) = αt + β = x(t). Thus, we also have that P(Angina? = mild, ē)(t) = a t + b = y(t), and therefore: P(Angina? = mild | ē)(t) = y(t) / x(t). For the original value of t we have x(t) and y(t); if we change t to 0.002 and propagate we get x(0.002) and y(0.002). From these two points we can now determine the coefficients α, β, a and b! Chapter 3 p. 47/47
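Because x(t) and y(t) are linear in t, the two propagations determine the coefficients; a sketch with placeholder propagation results (the slide's numbers are not reproduced):

```python
# Fit x(t) = alpha*t + beta and y(t) = a*t + b from two propagations.
t0, x0, y0 = 0.001, 0.00032, 0.00012   # original t and propagation results (placeholders)
t1, x1, y1 = 0.002, 0.00035, 0.00016   # results after changing t and propagating (placeholders)

alpha = (x1 - x0) / (t1 - t0)
beta = x0 - alpha * t0
a = (y1 - y0) / (t1 - t0)
b = y0 - a * t0

def posterior_mild(t):
    # P(Angina? = mild | e)(t) = y(t) / x(t)
    return (a * t + b) / (alpha * t + beta)

print(posterior_mild(0.0015))
```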
