COMP61011 Probabilistic Classifiers Part 1, Bayes Theorem
Reverend Thomas Bayes, 1702-1761
$p(T \mid W) = \dfrac{p(W \mid T)\, p(T)}{p(W)}$
Bayes Theorem forms the backbone of the past 20 years of ML research into probabilistic models. Used everywhere: e.g. finding sunken shipwrecks, your last Google search, determining the guilt of defendants in a trial, assessing the outcome of a breast cancer screening.
Thinking in Probabilities
Day  Outlook   Temperature  Humidity  Wind    Tennis?
D1   Sunny     Hot          High      Weak    No
D2   Sunny     Hot          High      Strong  No
D3   Overcast  Hot          High      Weak    Yes
D4   Rain      Mild         High      Weak    Yes
D5   Rain      Cool         Normal    Weak    Yes
D6   Rain      Cool         Normal    Strong  No
D7   Overcast  Cool         Normal    Strong  Yes
D8   Sunny     Mild         High      Weak    No
D9   Sunny     Cool         Normal    Weak    Yes
D10  Rain      Mild         Normal    Weak    Yes
D11  Sunny     Mild         Normal    Strong  Yes
D12  Overcast  Mild         High      Strong  Yes
D13  Overcast  Hot          Normal    Weak    Yes
D14  Rain      Mild         High      Strong  No
$p(\text{wind} = \text{strong}) = 6/14 = 0.4286$
The chances of the wind being strong, among all days.
Thinking in Probabilities
Keep only the 9 days where tennis = yes:
Day  Outlook   Temperature  Humidity  Wind    Tennis?
D3   Overcast  Hot          High      Weak    Yes
D4   Rain      Mild         High      Weak    Yes
D5   Rain      Cool         Normal    Weak    Yes
D7   Overcast  Cool         Normal    Strong  Yes
D9   Sunny     Cool         Normal    Weak    Yes
D10  Rain      Mild         Normal    Weak    Yes
D11  Sunny     Mild         Normal    Strong  Yes
D12  Overcast  Mild         High      Strong  Yes
D13  Overcast  Hot          Normal    Weak    Yes
$p(\text{wind} = \text{strong} \mid \text{tennis} = \text{yes}) = 3/9 = 0.333$
The chances of a strong wind day, given that the person enjoyed tennis.
Thinking in Probabilities
Keep only the 6 days where wind = strong:
Day  Outlook   Temperature  Humidity  Wind    Tennis?
D2   Sunny     Hot          High      Strong  No
D6   Rain      Cool         Normal    Strong  No
D7   Overcast  Cool         Normal    Strong  Yes
D11  Sunny     Mild         Normal    Strong  Yes
D12  Overcast  Mild         High      Strong  Yes
D14  Rain      Mild         High      Strong  No
$p(\text{tennis} = \text{yes} \mid \text{wind} = \text{strong}) = 3/6 = 0.5$
The chances of the person enjoying tennis, given that it is a strong wind day.
Thinking in Probabilities
From the full 14-day table, work out:
$p(\text{temp} = \text{hot} \mid \text{tennis} = \text{yes})$
$p(\text{tennis} = \text{yes} \mid \text{temp} = \text{hot})$
$p(\text{tennis} = \text{yes} \mid \text{temp} = \text{hot}, \text{humidity} = \text{high})$
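The counting on these slides can be reproduced in a few lines. This is only a sketch: the helper names `matches` and `prob` are made up for illustration, and the table is retyped from the slides.

```python
from fractions import Fraction

# The 14-day tennis table from the slides (D1..D14, in order).
ROWS = """
Sunny Hot High Weak No
Sunny Hot High Strong No
Overcast Hot High Weak Yes
Rain Mild High Weak Yes
Rain Cool Normal Weak Yes
Rain Cool Normal Strong No
Overcast Cool Normal Strong Yes
Sunny Mild High Weak No
Sunny Cool Normal Weak Yes
Rain Mild Normal Weak Yes
Sunny Mild Normal Strong Yes
Overcast Mild High Strong Yes
Overcast Hot Normal Weak Yes
Rain Mild High Strong No
"""
COLUMNS = ("outlook", "temp", "humidity", "wind", "tennis")
DAYS = [dict(zip(COLUMNS, line.split())) for line in ROWS.strip().splitlines()]


def matches(day, condition):
    """True if the day agrees with every column=value pair in condition."""
    return all(day[column] == value for column, value in condition.items())


def prob(event, given=None):
    """Estimate p(event | given) by counting rows (hypothetical helper).

    Raises ZeroDivisionError if no day satisfies `given`.
    """
    pool = [day for day in DAYS if matches(day, given or {})]
    hits = [day for day in pool if matches(day, event)]
    return Fraction(len(hits), len(pool))


print(float(prob({"wind": "Strong"})))                     # 6/14
print(float(prob({"wind": "Strong"}, {"tennis": "Yes"})))  # 3/9
print(float(prob({"tennis": "Yes"}, {"wind": "Strong"})))  # 3/6

# The three exercise queries from the slide:
print(prob({"temp": "Hot"}, {"tennis": "Yes"}))
print(prob({"tennis": "Yes"}, {"temp": "Hot"}))
print(prob({"tennis": "Yes"}, {"temp": "Hot", "humidity": "High"}))
```

Conditioning is just filtering: `given` shrinks the pool of days before anything is counted.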
A Problem to Solve
The facts:
  1% of the female population have breast cancer
  80% of women with breast cancer get positive mammography
  9.6% of women without breast cancer also get positive mammography
The question: A woman has a positive mammography. What is the probability that she has breast cancer?
Quick guess:
  a) less than 1%
  b) somewhere between 1% and 70%
  c) between 70% and 80%
  d) more than 80%
Write down the probabilities of everything
Define variables:
  C: 1 = presence of cancer, 0 = no cancer
  M: 1 = positive mammography, 0 = negative mammography
The prior probability of cancer in the population is 1%, so $p(C=1) = 0.01$
The probability of positive test given there is cancer, $p(M=1 \mid C=1) = 0.8$
If there is no cancer, we still have $p(M=1 \mid C=0) = 0.096$
The question is: what is $p(C=1 \mid M=1)$?
Working with Concrete Numbers
10,000 patients
  $p(C=1) = 0.01$: 100 cancer
    $p(M=1 \mid C=1) = 0.8$: 80 cancer, positive test
    20 cancer, negative test
  $p(C=0) = 0.99$: 9,900 no cancer
    $p(M=1 \mid C=0) = 0.096$: 950.4 no cancer, positive test
    8,949.6 no cancer, negative test
How many people from 10,000 get M=1? 80 + 950.4 = 1,030.4
How many people from 10,000 get C=1 and M=1? 80
$p(C=1 \mid M=1) = 80 / (80 + 950.4) = 7.76\%$
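The same branch arithmetic, written out as a short script. It is a sketch using only the three numbers given in the problem; the variable names are illustrative.

```python
# Natural-frequency version of the mammography problem.
patients = 10_000
p_cancer = 0.01                # p(C=1)
p_pos_given_cancer = 0.8       # p(M=1 | C=1)
p_pos_given_no_cancer = 0.096  # p(M=1 | C=0)

# First split: cancer vs no cancer.
cancer = patients * p_cancer
no_cancer = patients - cancer

# Second split: positive vs negative test within each group.
cancer_pos = cancer * p_pos_given_cancer
cancer_neg = cancer - cancer_pos
no_cancer_pos = no_cancer * p_pos_given_no_cancer
no_cancer_neg = no_cancer - no_cancer_pos

# Among everyone who tested positive, what fraction has cancer?
all_pos = cancer_pos + no_cancer_pos
p_cancer_given_pos = cancer_pos / all_pos

print(f"positive tests: {all_pos:.1f} out of {patients}")
print(f"p(C=1 | M=1) = {p_cancer_given_pos:.4f}")
```

Most positive tests come from the much larger no-cancer group, which is why the answer is so far below 80%.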
Surprising result
Do you trust your Doctor?
Although the probability of a positive mammography given cancer is 80%, the probability of cancer given a positive mammography is only about 7.8%.
8/10 doctors would have said: c) between 70% and 80%. WRONG.
Common mistake: the probability that a woman with a positive mammography has cancer is not the same as the probability that a woman with cancer has a positive mammography.
One must also consider:
  the background chances (prior) of having breast cancer,
  the chances of receiving a false alarm in the test.
A return to tennis, and... an interesting symmetry
W: 1 = strong, 0 = weak
T: 1 = yes, 0 = no
$p(W=1 \mid T=1)\, p(T=1) = p(T=1 \mid W=1)\, p(W=1)$
Try it again for a different assignment, e.g. T=1, W=0
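The symmetry can be checked for every assignment of W and T by counting. A small sketch: the two bit strings below are the Wind and Tennis? columns of the 14-day table (D1 to D14) retyped with the slide's coding, and `count` is a hypothetical helper.

```python
from fractions import Fraction
from itertools import product

# Wind (1 = strong, 0 = weak) and Tennis (1 = yes, 0 = no) for D1..D14.
W = [int(c) for c in "01000110001101"]
T = [int(c) for c in "00111010111110"]
PAIRS = list(zip(W, T))
N = len(PAIRS)


def count(w=None, t=None):
    """Number of days matching the given wind and/or tennis value."""
    return sum(
        1
        for a, b in PAIRS
        if (w is None or a == w) and (t is None or b == t)
    )


for w, t in product((0, 1), repeat=2):
    left = Fraction(count(w, t), count(t=t)) * Fraction(count(t=t), N)   # p(W|T) p(T)
    right = Fraction(count(w, t), count(w=w)) * Fraction(count(w=w), N)  # p(T|W) p(W)
    joint = Fraction(count(w, t), N)                                     # p(W, T)
    assert left == right == joint
    print(f"W={w}, T={t}: p(W|T)p(T) = p(T|W)p(W) = {left}")
```

Both products reduce to the joint probability p(W, T), which is why they always agree.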
Bayes Rule
From the previous slide, we know that
$p(T \mid W)\, p(W) = p(W \mid T)\, p(T)$
(i.e. true for any assignment of values to the variables)
This leads to what is known as Bayes Rule:
$p(T \mid W) = \dfrac{p(W \mid T)\, p(T)}{p(W)}$
Solving the medical problem with Bayes Rule
$p(C=1 \mid M=1) = \dfrac{p(M=1 \mid C=1)\, p(C=1)}{p(M=1)}$
  $p(C=1 \mid M=1)$: we want this
  $p(M=1 \mid C=1)$: we know this
  $p(C=1)$: we know this
  $p(M=1)$: we don't know this
10,000 patients
  $p(C=1) = 0.01$: 100 cancer
    $p(M=1 \mid C=1) = 0.8$: 80 cancer, positive test
    20 cancer, negative test
  $p(C=0) = 0.99$: 9,900 no cancer
    $p(M=1 \mid C=0) = 0.096$: 950.4 no cancer, positive test
    8,949.6 no cancer, negative test
$p(M=1) = \dfrac{80}{10{,}000} + \dfrac{950.4}{10{,}000} = (0.01 \times 0.8) + (0.99 \times 0.096)$
To get $p(M=1)$, just multiply probabilities along the branches and add the results.
Solving the medical problem with Bayes Rule
$p(C=1 \mid M=1) = \dfrac{p(M=1 \mid C=1)\, p(C=1)}{p(M=1)} = \dfrac{p(M=1 \mid C=1)\, p(C=1)}{p(M=1 \mid C=1)\, p(C=1) + p(M=1 \mid C=0)\, p(C=0)}$
Notice the denominator now contains the same term as the numerator.
We only need to know two terms here: $p(M=1 \mid C=1)\, p(C=1)$ and $p(M=1 \mid C=0)\, p(C=0)$
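Solving the medical problem with Bayes Rule
$p(C=1 \mid M=1) \propto p(M=1 \mid C=1)\, p(C=1) = 0.8 \times 0.01 = 0.008$
$p(C=0 \mid M=1) \propto p(M=1 \mid C=0)\, p(C=0) = 0.096 \times 0.99 = 0.09504$
$p(C=1 \mid M=1) = \dfrac{0.008}{0.008 + 0.09504} = 0.0776 = 7.76\%$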
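The two-term calculation fits in one small function. This is a minimal sketch for a binary hypothesis and a single observed piece of evidence; the name `posterior` and its argument names are illustrative, not from the slides.

```python
def posterior(prior, likelihood_if_true, likelihood_if_false):
    """Bayes Rule for a binary hypothesis H given observed evidence E=1.

    prior               = p(H=1)
    likelihood_if_true  = p(E=1 | H=1)
    likelihood_if_false = p(E=1 | H=0)
    Returns p(H=1 | E=1).
    """
    joint_true = likelihood_if_true * prior          # p(E=1 | H=1) p(H=1)
    joint_false = likelihood_if_false * (1 - prior)  # p(E=1 | H=0) p(H=0)
    evidence = joint_true + joint_false              # p(E=1)
    return joint_true / evidence


# Mammography problem: p(C=1) = 0.01, p(M=1|C=1) = 0.8, p(M=1|C=0) = 0.096
p_cancer_given_positive = posterior(0.01, 0.8, 0.096)
print(f"{p_cancer_given_positive:.4f}")  # 0.0776
```

The denominator is never supplied directly; it is rebuilt from the same two products that appear on the slide.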
Another Problem to Solve with Probabilities
Your car is making a noise. What are the chances that the tank is empty?
The chances of the car making noise, if the tank really is empty: $p(\text{noisy}=1 \mid \text{empty}=1) = 0.9$
The chances of the car making noise, if the tank is not empty: $p(\text{noisy}=1 \mid \text{empty}=0) = 0.2$
The chances of the tank being empty, regardless of anything else: $p(\text{empty}=1) = 0.5$
$p(\text{empty}=1 \mid \text{noisy}=1) = ?$
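Bayes Rule
$p(\text{empty}=1 \mid \text{noisy}=1) \propto p(\text{noisy}=1 \mid \text{empty}=1)\, p(\text{empty}=1) = 0.9 \times 0.5 = 0.45$
$p(\text{empty}=0 \mid \text{noisy}=1) \propto p(\text{noisy}=1 \mid \text{empty}=0)\, p(\text{empty}=0) = 0.2 \times 0.5 = 0.1$
$p(\text{empty}=1 \mid \text{noisy}=1) = \dfrac{0.45}{0.45 + 0.1} = 0.8182$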
Another Problem to Solve
A person tests positive for a certain medical disease. What are the chances that they really do have the disease?
The chances of the test being positive, if the person really is ill: $p(\text{test}=1 \mid \text{disease}=1) = 0.9$
The chances of the test being positive, if the person is in fact well: $p(\text{test}=1 \mid \text{disease}=0) = 0.01$
The chances of the condition, in the general population: $p(\text{disease}=1) = 0.05$
$p(\text{disease}=1 \mid \text{test}=1) = ?$
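Bayes Rule
$p(\text{disease}=1 \mid \text{test}=1) \propto p(\text{test}=1 \mid \text{disease}=1)\, p(\text{disease}=1) = 0.9 \times 0.05 = 0.045$
$p(\text{disease}=0 \mid \text{test}=1) \propto p(\text{test}=1 \mid \text{disease}=0)\, p(\text{disease}=0) = 0.01 \times 0.95 = 0.0095$
$p(\text{disease}=1 \mid \text{test}=1) = \dfrac{0.045}{0.045 + 0.0095} = 0.8257$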
Another Problem to Solve
Marie is getting married tomorrow, at an outdoor ceremony in the desert. In recent years, it has rained only 5 days each year. Unfortunately, the weatherman has predicted rain for tomorrow. When it actually rains, the weatherman correctly forecasts rain 90% of the time. When it doesn't rain, he incorrectly forecasts rain 20% of the time. What are the chances it will rain on the day of Marie's wedding?
The chances of the forecast saying rain, if it really does rain: $p(\text{forecastrain}=1 \mid \text{rain}=1) = 0.9$
The chances of the forecast saying rain, if it will be fine: $p(\text{forecastrain}=1 \mid \text{rain}=0) = 0.2$
The chances of rain, in the general case: $p(\text{rain}=1) = 5/365 = 0.0137$
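Bayes Rule
$p(\text{rain}=1 \mid \text{forecastrain}=1) \propto p(\text{forecastrain}=1 \mid \text{rain}=1)\, p(\text{rain}=1) = 0.9 \times 0.0137 = 0.0123$
$p(\text{rain}=0 \mid \text{forecastrain}=1) \propto p(\text{forecastrain}=1 \mid \text{rain}=0)\, p(\text{rain}=0) = 0.2 \times 0.9863 = 0.1973$
$p(\text{rain}=1 \mid \text{forecastrain}=1) = \dfrac{0.0123}{0.0123 + 0.1973} = 0.0587$
Only about a 5.9% chance of rain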
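All three problems share one shape: a prior, a likelihood under each hypothesis, and a normalisation. A hedged sketch that runs them through one hypothetical helper (`posterior` is an illustrative name, and the numbers are the ones on the slides):

```python
def posterior(prior, likelihood_if_true, likelihood_if_false):
    """p(H=1 | E=1) for a binary hypothesis H and observed evidence E=1."""
    joint_true = likelihood_if_true * prior
    joint_false = likelihood_if_false * (1 - prior)
    return joint_true / (joint_true + joint_false)


# (name, p(H=1), p(E=1 | H=1), p(E=1 | H=0))
problems = [
    ("tank empty given noisy car", 0.5, 0.9, 0.2),
    ("disease given positive test", 0.05, 0.9, 0.01),
    ("rain given rain forecast", 5 / 365, 0.9, 0.2),
]

results = {}
for name, prior, lik_true, lik_false in problems:
    results[name] = posterior(prior, lik_true, lik_false)
    print(f"{name}: {results[name]:.4f}")
```

Without rounding the intermediate products, the rain posterior comes out at about 0.0588 rather than 0.0587; the slide's figure uses values rounded to four decimal places.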
$p(W \mid T)\, p(T)$ as a network
Network: T → W, with $p(T)$ attached to node T and $p(W \mid T)$ attached to node W.
But what about the other features (Outlook, Temperature, Humidity) in the 14-day tennis table?
This? We'll run out of data.
Network: T → (Wind, Temp, Humid, Outlook), with $p(T)$ attached to node T and $p(\text{Wind}, \text{Temp}, \text{Humid}, \text{Outlook} \mid T)$ attached to the joint feature node.
The 14-day tennis table is all the data we have to estimate it from.
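To see why the data runs out, count how many feature combinations the joint table needs and how many the 14 days actually cover. This is an illustrative sketch with the table retyped from the slides; the variable names are made up.

```python
from itertools import product

# The 14-day tennis table from the slides (D1..D14, in order).
ROWS = """
Sunny Hot High Weak No
Sunny Hot High Strong No
Overcast Hot High Weak Yes
Rain Mild High Weak Yes
Rain Cool Normal Weak Yes
Rain Cool Normal Strong No
Overcast Cool Normal Strong Yes
Sunny Mild High Weak No
Sunny Cool Normal Weak Yes
Rain Mild Normal Weak Yes
Sunny Mild Normal Strong Yes
Overcast Mild High Strong Yes
Overcast Hot Normal Weak Yes
Rain Mild High Strong No
"""
data = [tuple(line.split()) for line in ROWS.strip().splitlines()]
features = [row[:-1] for row in data]  # drop the Tennis? label

# Distinct values observed for each of the four features.
values = [sorted({row[i] for row in features}) for i in range(4)]

possible = list(product(*values))  # every combination the joint table must cover
seen = set(features)               # combinations that appear in the data
naive_cells = sum(len(v) for v in values)  # entries if each feature is modelled alone

print(f"possible feature combinations: {len(possible)}")
print(f"combinations seen in the data: {len(seen)}")
print(f"per-class entries, one table per feature: {naive_cells}")
```

Every observed combination appears exactly once, so most cells of the joint table would be estimated from zero examples.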
$p(T \mid X) = \dfrac{p(X \mid T)\, p(T)}{p(X)}$
$X = \{\text{wind}, \text{temp}, \text{humidity}, \text{outlook}\}$, with $X_1 = \text{wind}$, $X_2 = \text{temp}$, etc.
Let's assume that all the variables are INDEPENDENT given T:
$p(T \mid X) = \dfrac{p(T) \prod_{i=1}^{4} p(X_i \mid T)}{p(X)}$
This is the NAÏVE BAYES assumption.
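$X = \{\text{wind}, \text{temp}, \text{humidity}, \text{outlook}\}$, with $X_1 = \text{wind}$, $X_2 = \text{temp}$, etc.
$p(T \mid X) = \dfrac{p(T) \prod_{i=1}^{4} p(X_i \mid T)}{p(X)}$
Network: T → Wind, T → Temp, T → Humid, T → Outlook, with $p(T)$ attached to node T and $p(X_i \mid T)$ attached to each feature node.
Naïve Bayes
- special case of a BAYESIAN NETWORK
- can add more links to specify dependencies
- e.g. temperature affects humidity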
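A minimal sketch of a Naïve Bayes classifier on the tennis table, estimating every probability by plain counting. It assumes no smoothing, so any feature value never seen with a class sends that class's score to zero. The function name `naive_bayes_posterior` and the query day are illustrative choices, not from the slides.

```python
from collections import Counter

# The 14-day tennis table from the slides (D1..D14, in order).
ROWS = """
Sunny Hot High Weak No
Sunny Hot High Strong No
Overcast Hot High Weak Yes
Rain Mild High Weak Yes
Rain Cool Normal Weak Yes
Rain Cool Normal Strong No
Overcast Cool Normal Strong Yes
Sunny Mild High Weak No
Sunny Cool Normal Weak Yes
Rain Mild Normal Weak Yes
Sunny Mild Normal Strong Yes
Overcast Mild High Strong Yes
Overcast Hot Normal Weak Yes
Rain Mild High Strong No
"""
data = [line.split() for line in ROWS.strip().splitlines()]

# Counts needed for p(T) and each p(X_i | T).
class_counts = Counter(row[-1] for row in data)
feature_counts = Counter()  # (feature index, value, label) -> count
for row in data:
    label = row[-1]
    for i, value in enumerate(row[:-1]):
        feature_counts[(i, value, label)] += 1


def naive_bayes_posterior(x):
    """Return {label: p(T=label | x)} for x = (outlook, temp, humidity, wind)."""
    scores = {}
    for label, n_label in class_counts.items():
        score = n_label / len(data)  # p(T)
        for i, value in enumerate(x):
            score *= feature_counts[(i, value, label)] / n_label  # p(X_i | T)
        scores[label] = score
    total = sum(scores.values())  # plays the role of p(X); zero if every score is zero
    return {label: score / total for label, score in scores.items()}


post = naive_bayes_posterior(("Sunny", "Cool", "High", "Strong"))
for label, p in sorted(post.items()):
    print(f"p(tennis={label} | x) = {p:.4f}")
```

This query day does not appear anywhere in the table, yet the per-feature counts still give an answer. That is exactly what the independence assumption buys.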
Network: T → Wind, Temp, Humid, Outlook, with $p(T)$ attached to node T.
The linkage can be learnt from data. This is called structure learning, but it is NP-hard.
Calculating the final probability $p(T \mid X_1, \ldots, X_n)$ is called inference. It is computationally intensive when we have very complicated graphs.
In spite of this, BNs are a very flexible way of learning.
They are a subclass of a wider class of probabilistic modelling algorithms.
State of the art in modern Machine Learning.