Probabilistic and Bayesian Analytics
This version of the PowerPoint presentation has been labeled to let you know which bits will be covered in the main class (Tue/Thu morning lectures) and which parts will be covered in the review sessions.

Probabilistic and Bayesian Analytics

Note to other teachers and users of these slides: Andrew would be delighted if you found this source material useful in giving your own lectures. Feel free to use these slides verbatim, or to modify them to fit your own needs. PowerPoint originals are available. If you make use of a significant portion of these slides in your own lecture, please include this message, or the following link to the source repository of Andrew's tutorials: www.cs.cmu.edu/~awm/tutorials. Comments and corrections gratefully received.

Andrew W. Moore, Associate Professor, School of Computer Science, Carnegie Mellon University, awm@cs.cmu.edu
Copyright © 2001, Andrew W. Moore, Aug 25th, 2001

Probability
The world is a very uncertain place. 30 years of Artificial Intelligence and Database research danced around this fact. And then a few AI researchers decided to use some ideas from the eighteenth century.
What we're going to do
We will review the fundamentals of probability. It's really going to be worth it. In this lecture, you'll see an example of probabilistic analytics in action: Bayes Classifiers.

Discrete Random Variables (Review Session)
A is a Boolean-valued random variable if A denotes an event, and there is some degree of uncertainty as to whether A occurs.
Examples:
A = The US president in 2023 will be male
A = You wake up tomorrow with a headache
A = You have Ebola
Probabilities (Review Session)
We write P(A) as "the fraction of possible worlds in which A is true". We could at this point spend 2 hours on the philosophy of this. But we won't.

Visualizing A (Review Session)
Event space of all possible worlds; its total area is 1. The worlds in which A is true form a region (the reddish oval) whose area is P(A); the remaining worlds are those in which A is false.
The Axioms of Probability (Review Session)
0 <= P(A) <= 1
P(True) = 1
P(False) = 0
P(A or B) = P(A) + P(B) - P(A and B)
Where do these axioms come from? Were they discovered? Answers coming up later.

Interpreting the axioms (Review Session)
The area of A can't get any smaller than 0. And a zero area would mean no world could ever have A true.
Interpreting the axioms (Review Session)
The area of A can't get any bigger than 1. And an area of 1 would mean all worlds will have A true.

Interpreting the axioms (Review Session)
(Venn diagram of overlapping regions A and B in the event space.)
Interpreting the axioms (Review Session)
P(A or B) = P(A) + P(B) - P(A and B): simple addition and subtraction of areas (the overlap P(A and B) would otherwise be counted twice).

These Axioms are Not to be Trifled With
There have been attempts to do different methodologies for uncertainty: Fuzzy Logic, Three-valued logic, Dempster-Shafer, Non-monotonic reasoning. But the axioms of probability are the only system with this property: if you gamble using them you can't be unfairly exploited by an opponent using some other system. [de Finetti 1931]
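The "area of worlds" interpretation can be checked mechanically on any finite event space. A minimal sketch, with four made-up worlds over two Boolean facts A and B (the areas are illustrative numbers, not from the slides):

```python
# A tiny "event space of all possible worlds": every assignment of two
# Boolean facts (A, B), each world carrying an assumed, illustrative area.
worlds = {
    (True, True): 0.2,
    (True, False): 0.3,
    (False, True): 0.1,
    (False, False): 0.4,
}

def prob(event):
    """P(event) = total area of the worlds where the event holds."""
    return sum(area for world, area in worlds.items() if event(world))

p_a = prob(lambda w: w[0])
p_b = prob(lambda w: w[1])
p_a_or_b = prob(lambda w: w[0] or w[1])
p_a_and_b = prob(lambda w: w[0] and w[1])

# The addition axiom holds by simple addition and subtraction of areas.
assert abs(p_a_or_b - (p_a + p_b - p_a_and_b)) < 1e-12
```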
Theorems from the Axioms (Review Session)
0 <= P(A) <= 1, P(True) = 1, P(False) = 0, P(A or B) = P(A) + P(B) - P(A and B)
From these we can prove: P(not A) = P(~A) = 1 - P(A). How?

Side Note
I am inflicting these proofs on you for two reasons:
1. These kinds of manipulations will need to be second nature to you if you use probabilistic analytics in depth
2. Suffering is good for you
Another important theorem
From the axioms we can prove: P(A) = P(A ^ B) + P(A ^ ~B). How?

Multivalued Random Variables
Suppose A can take on more than 2 values. A is a random variable with arity k if it can take on exactly one value out of {v1, v2, ..., vk}. Thus
P(A = vi and A = vj) = 0 if i != j
P(A = v1 or A = v2 or ... or A = vk) = 1
An easy fact about Multivalued Random Variables (Review Session)
Using the axioms of probability, and assuming that A obeys the multivalued axioms above, it's easy to prove that
P(A = v1 or A = v2 or ... or A = vi) = sum_{j=1..i} P(A = vj)
And thus we can prove
sum_{j=1..k} P(A = vj) = 1
Another fact about Multivalued Random Variables (Review Session)
Using the axioms and the same multivalued assumptions, it's easy to prove that
P(B and [A = v1 or A = v2 or ... or A = vi]) = sum_{j=1..i} P(B and A = vj)
And thus we can prove
P(B) = sum_{j=1..k} P(B and A = vj)
Elementary Probability in Pictures (Review Session)
P(~A) + P(A) = 1
P(B) = P(B ^ A) + P(B ^ ~A)
Elementary Probability in Pictures (Review Session)
sum_{j=1..k} P(A = vj) = 1
P(B) = sum_{j=1..k} P(B ^ A = vj)
Conditional Probability
P(A|B) = fraction of worlds in which B is true that also have A true.
H = "Have a headache", F = "Coming down with Flu"
P(H) = 1/10, P(F) = 1/40, P(H|F) = 1/2
"Headaches are rare and flu is rarer, but if you're coming down with flu there's a 50-50 chance you'll have a headache."

Conditional Probability
P(H|F) = fraction of flu-inflicted worlds in which you have a headache
= (#worlds with flu and headache) / (#worlds with flu)
= (area of "H and F" region) / (area of "F" region)
= P(H ^ F) / P(F)
Definition of Conditional Probability
P(A|B) = P(A ^ B) / P(B)
Corollary, the Chain Rule:
P(A ^ B) = P(A|B) P(B)

Probabilistic Inference
H = "Have a headache", F = "Coming down with Flu". P(H) = 1/10, P(F) = 1/40, P(H|F) = 1/2.
One day you wake up with a headache. You think: "Drat! 50% of flus are associated with headaches so I must have a 50-50 chance of coming down with flu." Is this reasoning good?
Probabilistic Inference
P(F ^ H) = P(H|F) P(F) = (1/2)(1/40) = 1/80
P(F|H) = P(F ^ H) / P(H) = (1/80) / (1/10) = 1/8

What we just did:
P(A|B) = P(A ^ B) / P(B) = P(B|A) P(A) / P(B)
This is Bayes Rule.
Bayes, Thomas (1763) An essay towards solving a problem in the doctrine of chances. Philosophical Transactions of the Royal Society of London, 53:370-418
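The headache/flu calculation above is short enough to sketch directly. Using the slides' numbers, the posterior is 1/8, not the naively guessed 50-50:

```python
# Bayes Rule on the headache/flu numbers from the slides:
# P(H) = 1/10, P(F) = 1/40, P(H|F) = 1/2.
p_h = 1 / 10
p_f = 1 / 40
p_h_given_f = 1 / 2

# P(F|H) = P(H|F) P(F) / P(H)
p_f_given_h = p_h_given_f * p_f / p_h
assert abs(p_f_given_h - 1 / 8) < 1e-12  # 1/8, not the "50-50" guess
```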
Using Bayes Rule to Gamble
The "Win" envelope has a dollar and four beads in it. The "Lose" envelope has three beads and no money.
Trivial question: someone draws an envelope at random and offers to sell it to you. How much should you pay?

Using Bayes Rule to Gamble
Interesting question: before deciding, you are allowed to see one bead drawn from the envelope.
Suppose it's black: how much should you pay?
Suppose it's red: how much should you pay?
Calculation (worked through in class)

More General Forms of Bayes Rule
P(A|B) = P(B|A) P(A) / P(B)
P(A|B) = P(B|A) P(A) / [ P(B|A) P(A) + P(B|~A) P(~A) ]
P(A|B ^ X) = P(B|A ^ X) P(A|X) / P(B|X)
More General Forms of Bayes Rule (Review Session)
P(A = vi | B) = P(B | A = vi) P(A = vi) / sum_{k=1..nA} P(B | A = vk) P(A = vk)

Useful Easy-to-prove facts (Review Session)
P(A|B) + P(~A|B) = 1
sum_{k=1..nA} P(A = vk | B) = 1
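The multivalued form of Bayes Rule is just "likelihood times prior, then normalize over all values of A". A sketch with made-up priors and likelihoods (the values v1, v2, v3 and their numbers are illustrative, not from the slides):

```python
# Multivalued Bayes Rule: P(A=vi | B) proportional to P(B | A=vi) P(A=vi),
# normalized by the sum over all values. All numbers below are made up.
prior = {"v1": 0.5, "v2": 0.3, "v3": 0.2}          # P(A = vi)
likelihood = {"v1": 0.10, "v2": 0.40, "v3": 0.25}  # P(B | A = vi)

evidence = sum(likelihood[v] * prior[v] for v in prior)  # P(B)
posterior = {v: likelihood[v] * prior[v] / evidence for v in prior}

# The easy-to-prove fact: the posterior sums to 1.
assert abs(sum(posterior.values()) - 1.0) < 1e-12
```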
The Joint Distribution
Recipe for making a joint distribution of M variables. Example: Boolean variables A, B, C.
1. Make a truth table listing all combinations of values of your variables (if there are M Boolean variables then the table will have 2^M rows).
The Joint Distribution
2. For each combination of values, say how probable it is.
3. If you subscribe to the axioms of probability, those numbers must sum to 1.
(Example: a truth table over A, B, C with a Prob column for each of the eight rows, pictured as areas in a Venn diagram.)
Using the Joint
Once you have the JD you can ask for the probability of any logical expression involving your attributes:
P(E) = sum over rows matching E of P(row)
Example (joint over gender, hours worked, wealth): P(Poor ^ Male) = 0.4654
Using the Joint
P(Poor) = 0.7604

Inference with the Joint
P(E1 | E2) = P(E1 ^ E2) / P(E2) = (sum over rows matching E1 and E2 of P(row)) / (sum over rows matching E2 of P(row))
Inference with the Joint
P(Male | Poor) = 0.4654 / 0.7604 = 0.612

Inference is a big deal
"I've got this evidence. What's the chance that this conclusion is true?"
"I've got a sore neck: how likely am I to have meningitis?"
"I see my lights are out and it's 9pm. What's the chance my spouse is already asleep?"
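The "sum the matching rows" recipe above is a few lines of code. A sketch over three Boolean attributes; the joint table's probabilities are made-up numbers that sum to 1, as the axioms demand:

```python
# Inference with a joint distribution over three Boolean attributes (A, B, C).
joint = {
    # (A, B, C): P(row)
    (0, 0, 0): 0.10, (0, 0, 1): 0.05, (0, 1, 0): 0.10, (0, 1, 1): 0.05,
    (1, 0, 0): 0.25, (1, 0, 1): 0.10, (1, 1, 0): 0.05, (1, 1, 1): 0.30,
}

def p(event):
    """P(E): sum of P(row) over rows matching E."""
    return sum(pr for row, pr in joint.items() if event(row))

def p_cond(e1, e2):
    """P(E1 | E2) = (rows matching E1 and E2) / (rows matching E2)."""
    return p(lambda r: e1(r) and e2(r)) / p(e2)

# e.g. P(A = 1 | C = 1)
print(p_cond(lambda r: r[0] == 1, lambda r: r[2] == 1))
```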
Inference is a big deal
There's a thriving set of industries growing based around Bayesian Inference. Highlights are: Medicine, Pharma, Help Desk Support, Engine Fault Diagnosis.
Where do Joint Distributions come from?
Idea One: Expert Humans.
Idea Two: Simpler probabilistic facts and some algebra.
Example: Suppose you knew P(A) = 0.7, P(B|A) = 0.2, P(B|~A) = 0.1, P(C|A^B) = 0.1, P(C|A^~B) = 0.8, P(C|~A^B) = 0.3, P(C|~A^~B) = 0.1. Then you can automatically compute the JD using the chain rule:
P(A=x ^ B=y ^ C=z) = P(C=z | A=x ^ B=y) P(B=y | A=x) P(A=x)
In another lecture: Bayes Nets, a systematic way to do this.

Where do Joint Distributions come from?
Idea Three: Learn them from data!
Prepare to see one of the most impressive learning algorithms you'll come across in the entire course.
Learning a joint distribution
Build a JD table for your attributes in which the probabilities are unspecified. Then fill in each row with
P^(row) = (# records matching row) / (total number of records)
e.g. one row holds the fraction of all records in which A and B are true but C is false.

Example of Learning a Joint
This joint was obtained by learning from three attributes in the UCI "Adult" Census Database [Kohavi 1995].
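The filling-in step is literally just counting. A sketch with made-up Boolean records (A, B, C):

```python
from collections import Counter

# Learning a joint distribution: count how often each combination of
# attribute values appears. The records are made-up Boolean triples (A, B, C).
records = [
    (1, 1, 0), (1, 1, 0), (0, 1, 1), (1, 0, 0),
    (0, 0, 0), (1, 1, 1), (1, 1, 0), (0, 1, 1),
]

counts = Counter(records)
total = len(records)
joint = {row: n / total for row, n in counts.items()}  # P^(row)

# Note: rows never seen in the data get probability 0, which is the root
# of the overfitting problem discussed later in these slides.
assert abs(sum(joint.values()) - 1.0) < 1e-12
```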
Where are we?
We have recalled the fundamentals of probability. We have become content with what JDs are and how to use them. And we even know how to learn JDs from data.

Density Estimation
Our Joint Distribution learner is our first example of something called Density Estimation. A Density Estimator learns a mapping from a set of attributes to a probability:
Input Attributes -> Density Estimator -> Probability
Density Estimation
Compare it against the two other major kinds of models:
Input Attributes -> Classifier -> Prediction of categorical output
Input Attributes -> Density Estimator -> Probability
Input Attributes -> Regressor -> Prediction of real-valued output

Evaluating Density Estimation
Test-set criterion for estimating performance on future data* (*see the Decision Tree or Cross Validation lecture for more detail).
Classifier: test set accuracy. Regressor: test set accuracy. Density Estimator: ?
Evaluating a density estimator
Given a record x, a density estimator M can tell you how likely the record is: P^(x|M).
Given a dataset with R records, a density estimator can tell you how likely the dataset is (under the assumption that all records were independently generated from the Density Estimator's JD):
P^(dataset|M) = P^(x1 ^ x2 ^ ... ^ xR | M) = prod_{k=1..R} P^(xk|M)

A small dataset: Miles Per Gallon
192 Training Set Records, from the UCI repository (thanks to Ross Quinlan).
mpg / modelyear / maker:
good 75to78 asia
bad 70to74 america
bad 75to78 europe
bad 70to74 america
bad 70to74 asia
bad 75to78 america
... (and so on for all 192 records)
A small dataset: Miles Per Gallon
Applying the evaluation formula to the 192 training records:
P^(dataset|M) = P^(x1 ^ x2 ^ ... ^ xR | M) = prod_{k=1..R} P^(xk|M) = (in this case) 3.4 x 10^-203
Log Probabilities
Since probabilities of datasets get so small we usually use log probabilities:
log P^(dataset|M) = log prod_{k=1..R} P^(xk|M) = sum_{k=1..R} log P^(xk|M)

A small dataset: Miles Per Gallon
Applying the log formula to the 192 training records gives the training-set log-likelihood.
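The log-probability trick is worth seeing in code: the sum of logs equals the log of the product, but does not underflow for large datasets. The estimator here is a hypothetical learned joint over a pair of Boolean attributes, with made-up probabilities:

```python
import math

# Scoring a density estimator on a dataset: sum of log P^(x_k | M).
# est is a hypothetical learned joint over one Boolean attribute pair.
est = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.2, (1, 1): 0.3}
dataset = [(0, 0), (1, 1), (0, 0), (1, 0)]

log_likelihood = sum(math.log(est[x]) for x in dataset)

# Equivalent to log of the product (for a dataset this small the product
# is still representable, so we can check the identity directly).
product = 1.0
for x in dataset:
    product *= est[x]
assert abs(log_likelihood - math.log(product)) < 1e-9
```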
Summary: The Good News
We have a way to learn a Density Estimator from data. Density estimators can do many good things:
Can sort the records by probability, and thus spot weird records (anomaly detection)
Can do inference: P(E1|E2). Automatic Doctor / Help Desk etc.
Ingredient for Bayes Classifiers (see later)

Summary: The Bad News
Density estimation by directly learning the joint is trivial, mindless and dangerous.
Using a test set
An independent test set with 196 cars has a worse log likelihood (actually it's a billion quintillion quintillion quintillion quintillion times less likely). Density estimators can overfit. And the full joint density estimator is the overfittiest of them all!

Overfitting Density Estimators
If this ever happens, it means there are certain combinations that we learn are impossible:
log P^(testset|M) = sum_{k=1..R} log P^(xk|M) = -infinity if P^(xk|M) = 0 for any k
Using a test set
The only reason that our test set didn't score -infinity is that my code is hard-wired to always predict a probability of at least one in 10^20. We need Density Estimators that are less prone to overfitting.

Naïve Density Estimation
The problem with the Joint Estimator is that it just mirrors the training data. We need something which generalizes more usefully. The naïve model generalizes strongly: assume that each attribute is distributed independently of any of the other attributes.
Independently Distributed Data
Let x[i] denote the i-th field of record x. The independently distributed assumption says that for any i, v, u1, u2, ..., u(i-1), u(i+1), ..., uM:
P(x[i] = v | x[1] = u1, x[2] = u2, ..., x[i-1] = u(i-1), x[i+1] = u(i+1), ..., x[M] = uM) = P(x[i] = v)
Or in other words, x[i] is independent of {x[1], x[2], ..., x[i-1], x[i+1], ..., x[M]}. This is often written as
x[i] _||_ {x[1], x[2], ..., x[i-1], x[i+1], ..., x[M]}

A note about independence
Assume A and B are Boolean Random Variables. Then "A and B are independent" if and only if P(A|B) = P(A). "A and B are independent" is often notated as A _||_ B.
Independence Theorems (Review Session)
Assume P(A|B) = P(A). Then P(A ^ B) = P(A) P(B).
Assume P(A|B) = P(A). Then P(B|A) = P(B).

Independence Theorems (Review Session)
Assume P(A|B) = P(A). Then P(~A|B) = P(~A).
Assume P(A|B) = P(A). Then P(A|~B) = P(A).
Multivalued Independence
For multivalued Random Variables A and B, A _||_ B if and only if:
for all u, v: P(A = u | B = v) = P(A = u)
from which you can then prove things like
for all u, v: P(A = u ^ B = v) = P(A = u) P(B = v)
for all u, v: P(B = v | A = u) = P(B = v)

Back to Naïve Density Estimation
Let x[i] denote the i-th field of record x. Naïve DE assumes x[i] is independent of {x[1], x[2], ..., x[i-1], x[i+1], ..., x[M]}.
Example: Suppose that each record is generated by randomly shaking a green dice and a red dice.
Dataset 1: A = red value, B = green value
Dataset 2: A = red value, B = sum of values
Dataset 3: A = sum of values, B = difference of values
Which of these datasets violates the naïve assumption?
Using the Naïve Distribution
Once you have a Naïve Distribution you can easily compute any row of the joint distribution. Suppose A, B, C and D are independently distributed. What is P(A ^ ~B ^ C ^ ~D)?

Using the Naïve Distribution
P(A ^ ~B ^ C ^ ~D)
= P(A | ~B ^ C ^ ~D) P(~B ^ C ^ ~D)
= P(A) P(~B ^ C ^ ~D)
= P(A) P(~B | C ^ ~D) P(C ^ ~D)
= P(A) P(~B) P(C ^ ~D)
= P(A) P(~B) P(C | ~D) P(~D)
= P(A) P(~B) P(C) P(~D)
Naïve Distribution General Case
Suppose x[1], x[2], ..., x[M] are independently distributed. Then
P(x[1] = u1, x[2] = u2, ..., x[M] = uM) = prod_{k=1..M} P(x[k] = uk)
So if we have a Naïve Distribution we can construct any row of the implied Joint Distribution on demand. So we can do any inference. But how do we learn a Naïve Density Estimator?

Learning a Naïve Density Estimator
P^(x[i] = u) = (# records in which x[i] = u) / (total number of records)
Another trivial learning algorithm!
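Both halves of the recipe, learning the marginals and multiplying them to get a joint row on demand, fit in a few lines. A sketch with made-up Boolean records:

```python
from collections import Counter

# Learning a naive density estimator: one marginal per attribute.
# Records are made-up triples (x[1], x[2], x[3]) of Booleans.
records = [(1, 0, 1), (1, 1, 1), (0, 0, 1), (1, 0, 0), (1, 1, 1), (0, 0, 1)]
M = len(records[0])
total = len(records)

# P^(x[i] = u) = (# records in which x[i] = u) / total
marginals = [Counter(r[i] for r in records) for i in range(M)]

def naive_row_prob(row):
    """Any row of the implied joint: the product of the marginals."""
    p = 1.0
    for i, u in enumerate(row):
        p *= marginals[i][u] / total
    return p

print(naive_row_prob((1, 0, 1)))
```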
Joint DE vs Naïve DE: Contrast
Joint DE: can model anything. No problem to model "C is a noisy copy of A". Given 100 records and more than 6 Boolean attributes, will screw up badly.
Naïve DE: can model only very boring distributions. "C is a noisy copy of A" is outside Naïve's scope. Given 100 records and 10,000 multivalued attributes, will be fine.

Empirical Results: "Hopeless" (Review Session)
The "hopeless" dataset consists of 40,000 records and 21 Boolean attributes called a, b, c, ..., u. Each attribute in each record is generated 50-50 randomly as 0 or 1.
Average test set log probability during 10 folds of k-fold cross-validation (described in a future Andrew lecture). Despite the vast amount of data, Joint overfits hopelessly and does much worse.
Empirical Results: "Logical" (Review Session)
The "logical" dataset consists of 40,000 records and 4 Boolean attributes called a, b, c, d where a, b, c are generated 50-50 randomly as 0 or 1. D = A ^ ~C, except that in 10% of records it is flipped.
The DE learned by Joint; the DE learned by Naive.
Empirical Results: "MPG" (Review Session)
The "MPG" dataset consists of 392 records and 8 attributes.
A tiny part of the DE learned by Joint; the DE learned by Naive.
Empirical Results: "Weight vs. MPG" (Review Session)
Suppose we train only from the "Weight" and "MPG" attributes.
The DE learned by Joint; the DE learned by Naive.
"Weight vs. MPG": The best that Naïve can do (Review Session)
The DE learned by Joint; the DE learned by Naive.

Reminder: The Good News
We have two ways to learn a Density Estimator from data. (*In other lectures we'll see vastly more impressive Density Estimators: Mixture Models, Bayesian Networks, Density Trees, Kernel Densities and many more.)
Density estimators can do many good things:
Anomaly detection
Can do inference: P(E1|E2). Automatic Doctor / Help Desk etc.
Ingredient for Bayes Classifiers
Bayes Classifiers
A formidable and sworn enemy of decision trees.
Input Attributes -> Classifier -> Prediction of categorical output

How to build a Bayes Classifier
Assume you want to predict output Y which has arity nY and values v1, v2, ..., v(nY). Assume there are m input attributes called X1, X2, ..., Xm. Break dataset into nY smaller datasets called DS1, DS2, ..., DS(nY). Define DSi = records in which Y = vi. For each DSi, learn Density Estimator Mi to model the input distribution among the Y = vi records.
How to build a Bayes Classifier (continued)
Mi estimates P(X1, X2, ..., Xm | Y = vi).
Idea: When a new set of input values (X1 = u1, X2 = u2, ..., Xm = um) comes along to be evaluated, predict the value of Y that makes P(X1, X2, ..., Xm | Y = vi) most likely:
Ypredict = argmax_v P(X1 = u1 ... Xm = um | Y = v)
Is this a good idea?
How to build a Bayes Classifier (continued)
This is a Maximum Likelihood classifier. It can get silly if some Ys are very unlikely.
Much Better Idea: predict the value of Y that makes P(Y = vi | X1, ..., Xm) most likely:
Ypredict = argmax_v P(Y = v | X1 = u1 ... Xm = um)
Is this a good idea?
Terminology
MLE (Maximum Likelihood Estimator): Ypredict = argmax_v P(X1 = u1 ... Xm = um | Y = v)
MAP (Maximum A-Posteriori Estimator): Ypredict = argmax_v P(Y = v | X1 = u1 ... Xm = um)

Getting what we need
Ypredict = argmax_v P(Y = v | X1 = u1 ... Xm = um)
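The "silly if some Ys are very unlikely" point is easy to see numerically. A sketch with made-up numbers (a rare class with a high symptom likelihood): MLE picks the rare class, MAP weighs in the prior and does not.

```python
# MLE vs MAP on a two-class toy problem with made-up numbers.
# Suppose Y = "ebola" is very rare but the symptom likelihood is high.
prior = {"ebola": 0.001, "flu": 0.999}       # P(Y = v)
likelihood = {"ebola": 0.9, "flu": 0.05}     # P(X1 = u1 ... Xm = um | Y = v)

mle = max(likelihood, key=lambda v: likelihood[v])
map_ = max(prior, key=lambda v: likelihood[v] * prior[v])

assert mle == "ebola"  # maximum likelihood ignores how unlikely the class is
assert map_ == "flu"   # MAP weighs in the prior
```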
Getting a posterior probability
P(Y = v | X1 = u1 ... Xm = um)
= P(X1 = u1 ... Xm = um | Y = v) P(Y = v) / P(X1 = u1 ... Xm = um)
= P(X1 = u1 ... Xm = um | Y = v) P(Y = v) / sum_{j=1..nY} P(X1 = u1 ... Xm = um | Y = vj) P(Y = vj)

Bayes Classifiers in a nutshell
1. Learn the distribution over inputs for each value Y = vi.
2. This gives P(X1, X2, ..., Xm | Y = vi).
3. Estimate P(Y = vi) as the fraction of records with Y = vi.
4. For a new prediction:
Ypredict = argmax_v P(Y = v | X1 = u1 ... Xm = um) = argmax_v P(X1 = u1 ... Xm = um | Y = v) P(Y = v)
Bayes Classifiers in a nutshell
We can use our favorite Density Estimator for step 1. Right now we have two options: the Joint Density Estimator and the Naïve Density Estimator.

Joint Density Bayes Classifier
Ypredict = argmax_v P(X1 = u1 ... Xm = um | Y = v) P(Y = v)
In the case of the joint Bayes Classifier this degenerates to a very simple rule: predict the most common value of Y among records in which X1 = u1, X2 = u2, ..., Xm = um. Note that if no records have the exact set of inputs X1 = u1, X2 = u2, ..., Xm = um, then P(X1, ..., Xm | Y = vi) = 0 for all values of Y. In that case we just have to guess Y's value.
Joint BC Results: "Logical"
The "logical" dataset consists of 40,000 records and 4 Boolean attributes called a, b, c, d where a, b, c are generated 50-50 randomly as 0 or 1. D = A ^ ~C, except that in 10% of records it is flipped. The Classifier learned by Joint BC.

Joint BC Results: "All Irrelevant"
The "all irrelevant" dataset consists of 40,000 records and 15 Boolean attributes called a, b, c, ..., o where a, b, c are generated 50-50 randomly as 0 or 1. Y (output) = 1 with probability 0.75, 0 with probability 0.25.
Naïve Bayes Classifier
Ypredict = argmax_v P(Y = v) P(X1 = u1 ... Xm = um | Y = v)
In the case of the naive Bayes Classifier this can be simplified:
Ypredict = argmax_v P(Y = v) prod_{j=1..m} P(Xj = uj | Y = v)

Naïve Bayes Classifier
Technical Hint: If you have 10,000 input attributes that product will underflow in floating point math. You should use logs:
Ypredict = argmax_v [ log P(Y = v) + sum_{j=1..m} log P(Xj = uj | Y = v) ]
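The whole naive Bayes recipe, counting priors and conditionals and then taking the argmax of the log score, can be sketched in a few lines. The training records here are made-up and chosen so no conditional is zero (the slides' recipe has no smoothing, so a zero count would make the log blow up):

```python
import math
from collections import Counter, defaultdict

# A minimal naive Bayes classifier over Boolean attributes, following the
# slides' recipe (no smoothing, so zero counts would still be painful).
def train(records, labels):
    n = len(records)
    m = len(records[0])
    prior = {v: c / n for v, c in Counter(labels).items()}   # P^(Y = v)
    cond = defaultdict(dict)                                 # P^(Xj = u | Y = v)
    for v in prior:
        rows = [r for r, y in zip(records, labels) if y == v]
        for j in range(m):
            for u in (0, 1):
                cond[v][(j, u)] = sum(r[j] == u for r in rows) / len(rows)
    return prior, cond

def predict(x, prior, cond):
    # argmax over v of log P(Y = v) + sum_j log P(Xj = uj | Y = v)
    def score(v):
        return math.log(prior[v]) + sum(
            math.log(cond[v][(j, u)]) for j, u in enumerate(x))
    return max(prior, key=score)

records = [(1, 1), (1, 0), (0, 1), (0, 0), (0, 1), (1, 0)]
labels  = ["pos",  "pos",  "pos",  "neg",  "neg",  "neg"]
prior, cond = train(records, labels)
print(predict((1, 1), prior, cond))
```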
BC Results: "XOR"
The "XOR" dataset consists of 40,000 records and 2 Boolean inputs called a and b, generated 50-50 randomly as 0 or 1. c (output) = a XOR b. The Classifier learned by Joint BC; the Classifier learned by Naive BC.

Naive BC Results: "Logical" (Review Session)
The "logical" dataset consists of 40,000 records and 4 Boolean attributes called a, b, c, d where a, b, c are generated 50-50 randomly as 0 or 1. D = A ^ ~C, except that in 10% of records it is flipped. The Classifier learned by Naive BC.
Naive BC Results: "Logical" (Review Session)
This result surprised Andrew until he had thought about it a little. The Classifier learned by Joint BC.

Naïve BC Results: "All Irrelevant" (Review Session)
The "all irrelevant" dataset consists of 40,000 records and 15 Boolean attributes called a, b, c, ..., o where a, b, c are generated 50-50 randomly as 0 or 1. Y (output) = 1 with probability 0.75, 0 with probability 0.25. The Classifier learned by Naive BC.
BC Results: "MPG", 392 records (Review Session)
The Classifier learned by Naive BC.

BC Results: "MPG", 40 records (Review Session)
More Facts About Bayes Classifiers
Many other density estimators can be slotted in*. Density estimation can be performed with real-valued inputs*. Bayes Classifiers can be built with real-valued inputs*.
Rather Technical Complaint: Bayes Classifiers don't try to be maximally discriminative; they merely try to honestly model what's going on*.
Zero probabilities are painful for Joint and Naïve. A hack (justifiable with the magic words "Dirichlet Prior") can help*.
Naïve Bayes is wonderfully cheap. And survives 10,000 attributes cheerfully!
*See future Andrew Lectures

What you should know
Probability: fundamentals of Probability and Bayes Rule. What's a Joint Distribution. How to do inference (i.e. P(E1|E2)) once you have a JD.
Density Estimation: what is DE and what is it good for. How to learn a Joint DE. How to learn a naïve DE.
57 What you should know: Bayes Classifiers. How to build one. How to predict with a BC. Contrast between naïve and joint BCs. Copyright 2001, Andrew W. Moore Probabilistic Analytics: Slide 113

Interesting Questions: Suppose you were evaluating NaiveBC, JointBC, and Decision Trees. Invent a problem where only NaiveBC would do well. Invent a problem where only Dtree would do well. Invent a problem where only JointBC would do well. Invent a problem where only NaiveBC would do poorly. Invent a problem where only Dtree would do poorly. Invent a problem where only JointBC would do poorly. Copyright 2001, Andrew W. Moore Probabilistic Analytics: Slide 114