Bayesian Statistics

1. In the new era of big data, machine learning and artificial intelligence, it is important for students to know the vocabulary of Bayesian statistics, which has been competing with the classical school (frequentists) throughout the history of statistics.

2. I will provide two examples to contrast the two schools. Hopefully, in the end you will fall in love with the Bayesian approach.

Example I: Estimating Proportion

1. Imagine you are a millionaire planning to buy a lake. You love eating walleye, so you want to buy a lake with a lot of walleyes in it. The parameter of interest is the proportion of walleyes in the fish population of a lake.

2. Suppose you have no prior information about the proportion of walleyes. That means you believe the proportion could be 0, or 10%, or 20%, ..., or 100%, all with equal probabilities. The walleye density is low if the proportion is 0.1, while the density is high if the proportion is 0.9.

3. Using the jargon of Bayesian statistics, you think the proportion of walleyes is a random variable, and the prior distribution is flat (kind of like a uniform distribution).

4. Here comes the first difference between Frequentist and Bayes: the Bayesian school treats any unknown parameter (here, the proportion of walleyes) as a random variable, while the Frequentist school treats a parameter as an unknown constant. As a result, the Bayesian school implicitly admits that we will never know the unknown parameter for sure (since it is a random variable).

5. The prior distribution is subjective. One person's prior distribution can differ from another person's. For the same person, the prior distribution can evolve over time as more information becomes available.

6. For example, suppose a friend tells you that there used to be a lot of walleyes in the lake, and you trust him. Then you may assign higher probabilities to those big proportions. The new prior distribution can be, say, P(θ = 0.1) = 0.2 and P(θ = 0.9) = 0.8.
7. Here comes the second difference between Frequentist and Bayes: Bayes thinks probability measures the degree of belief, while the Frequentist treats probability as the long-run frequency. A probability of 0.8 indicates that you have strong faith in what your friend tells you. In that regard, probability is also subjective in the Bayesian world.

8. For the Bayesian school, statistical inference amounts to using information (data) to update the belief. More explicitly, Bayes' theorem states that

    P(θ | data) = P(θ) P(data | θ) / P(data)    (Bayes' theorem)    (1)

where

(a) P(θ) is called the prior distribution of the unknown parameter θ: the belief you have about θ before seeing the data.

(b) P(data | θ) is called the likelihood: the probability that you observe the given sample of data conditional on the parameter.

(c) P(data) is the unconditional or marginal probability of observing the given sample.

(d) Most importantly, P(θ | data) is the posterior distribution: the updated belief about θ after the information has been digested.

Simply put, the Bayesian method is concerned with moving from P(θ) to P(θ | data), that is, moving from the prior distribution to the posterior distribution, using Bayes' theorem (1). You can find a discussion of Bayes' theorem in any statistics book or on the Internet.

9. We can show that P(data) is free of θ, since θ has been integrated (summed) out:

    P(data) = Σ_θ P(data, θ) = Σ_θ P(θ) P(data | θ)    (2)

where P(data, θ) is the joint distribution. That means, for the purpose of understanding θ, we can ignore the denominator in (1) and write

    P(θ | data) ∝ P(θ) P(data | θ)    (3)

where ∝ means "is proportional to". In short, to obtain the updated belief (the posterior), we may only need to figure out the prior and the likelihood.

10. A technical note: when the information set is big (as the sample size goes to infinity), the central limit theorem implies that the likelihood typically converges to a normal distribution. So in the limit the likelihood (a normal distribution) dominates the prior, and in general the posterior is a bell-shaped curve.
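To make (3) concrete, here is a small worked update of my own (not part of the original example). Take the two-point prior from item 6, P(θ = 0.1) = 0.2 and P(θ = 0.9) = 0.8, and suppose the first fish you catch turns out to be a walleye. Then

    P(θ = 0.1 | data) ∝ 0.2 × 0.1 = 0.02,    P(θ = 0.9 | data) ∝ 0.8 × 0.9 = 0.72.

Dividing each by the sum 0.02 + 0.72 = 0.74 gives posterior probabilities of about 0.027 and 0.973: a single walleye already shifts the belief strongly toward the high proportion.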
11. For the Bayesian school, the most important result is delivered by plotting the posterior distribution. Next I will show you how to do so for Example I.

12. First we need data (a sample). Suppose you spend a whole day catching 10 fishes in that lake, and 3 of them are walleyes. For the Frequentist, the estimate of the population proportion is just the sample proportion 3/10 = 30%. That is almost it! The Frequentist now believes we almost know the proportion of walleyes, and it is 30%. But the Frequentist admits that there can be sampling uncertainty (another Frequentist possibly catches 4 or even 5 walleyes out of 10 fishes). So they compute the standard error se and report a confidence interval (0.3 − 1.96 se, 0.3 + 1.96 se). They tell you that with 95% probability the true proportion is inside that interval. That is it! Then the Frequentist goes to the party.

13. The Bayesian school is unhappy with just an interval: they want to know the whole (posterior) distribution of θ (the possible values and their corresponding probabilities).

(a) For simplicity, assume a flat prior distribution P(θ = k) = 1/11, k = 0, 0.1, 0.2, ..., 1. In words, you believe low and high densities of the walleye population are equally probable.

(b) The likelihood, i.e. the probability of getting 3 walleyes out of 10 fishes for a given θ, is given by a binomial distribution:

    Likelihood = P(3 successes out of 10 trials) = C(10,3) θ^3 (1 − θ)^(10−3)    (4)

where C denotes the number of combinations. Basically, if the probability of success (catching a walleye) is θ, then the probability of having m successes out of n trials is C(n,m) θ^m (1 − θ)^(n−m). Please google "binomial distribution" to learn more. (A worked evaluation of (4) appears right after this list.)

(c) According to (1) and (2), next we need to multiply the prior by the likelihood and divide by the sum of that product. The prior distribution for θ and its posterior distribution after we catch 3 walleyes out of 10 fishes are plotted below.

[Figure: prior (red, flat) and posterior (blue, bell-shaped) distributions over θ; a green line is drawn to highlight the value θ = 0.3 that occurs with the highest probability.]

(a) You can think of that number as the Bayesian point estimate of the proportion, θ̂_Bayesian = 0.3, which in this case is identical to the Frequentist point estimate θ̂_Frequentist = 0.3.

(b) The posterior distribution clearly shows that other values are possible. For instance, either θ = 0.2 or θ = 0.4 can be true with substantial probability. Nevertheless, θ ≤ 0.1 or θ ≥ 0.6 are unlikely given this sample.

(c) So the information (catching 3 walleyes out of 10) is used to update our belief about the proportion: we move from the flat prior distribution (the red one) to the bell-shaped posterior distribution (the blue one). The bell shape confirms the dominance of the likelihood. After showing this graph, the Bayesian person finally can join the party!
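A quick check of my own (arithmetic not in the original notes): evaluating the likelihood (4) at θ = 0.3 gives

    C(10,3) (0.3)^3 (0.7)^7 = 120 × 0.027 × 0.0824 ≈ 0.267,

the largest value among the grid points, which is why the posterior above peaks at θ = 0.3 (with a flat prior, the posterior is proportional to the likelihood).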
14. The Stata code is:

clear
set obs 11                     // 11 grid points for theta
sca n = 10                     // sample size: 10 fishes caught
sca m = 3                      // number of walleyes observed
gen pv = ([_n]-1)*0.1          // grid of theta values: 0, 0.1, ..., 1
gen pr = 1/11                  // flat prior
gen lv = binomial(n, m, pv) - binomial(n, (m-1), pv)   // likelihood P(exactly m successes)
gen po = lv*pr                 // prior times likelihood
qui sum po
qui replace po = po/r(sum)     // normalize so the posterior sums to one
twoway (connected po pv, ms(th)) (connected pr pv, ms(oh)), ytitle("distribution")

where the Stata function binomial(n, m, pv) reports the cumulative probability P(m or fewer successes out of n trials), so the difference of the two cumulative terms gives P(exactly m successes).
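A side note of my own: newer versions of Stata also ship the function binomialp(), which returns the probability mass directly, so (assuming your Stata has it) the likelihood line can be shortened to

gen lv = binomialp(n, m, pv)   // P(exactly m successes), same as the difference above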
15. What if we catch 6 walleyes out of 10 fishes? You only need to change sca m = 6 in my code, and you get the posterior plotted below.

[Figure: posterior distribution over θ after catching 6 walleyes out of 10 fishes.]

Now the most likely proportion is 0.6! If I am that walleye-loving millionaire, I may decide to buy the lake.

16. Of course, to be safe, the millionaire can try to get a bigger sample (catch 20 fishes and count how many are walleyes). He can also use the current bell-shaped posterior distribution (rather than the naive flat one) as the new prior distribution, and try to get a second-round posterior distribution after catching the bigger sample of fishes. The point is, the Bayesian method is typically used in an iterative fashion (see the sketch after item 17).

17. To summarize, the Bayesian method uses information to keep updating the belief. After more information arrives, we can update the belief again and again (by plotting the posterior distribution again and again). Then informed decisions can be made based on the posterior distribution. Bayesian statistics is on-going statistics.
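To illustrate the iterative updating described in item 16, here is a minimal Stata sketch of my own. The second sample (8 walleyes out of 20 fishes) is hypothetical, and the lines are meant to be run after the code in item 14:

sca n2 = 20                    // hypothetical second sample: 20 fishes
sca m2 = 8                     // hypothetical count: 8 walleyes
replace pr = po                // the first posterior becomes the new prior
replace lv = binomial(n2, m2, pv) - binomial(n2, (m2-1), pv)   // new likelihood
replace po = lv*pr
qui sum po
qui replace po = po/r(sum)     // second-round posterior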
Example II: Mission Impossible for Frequentist (played not by Tom Cruise)

1. There are some problems with which the Frequentist simply cannot help. Suppose that, instead of the proportion of walleyes, the millionaire wants to know the total number of all fishes (not just walleyes). He will only buy the lake if that number is greater than, say, 50. I don't think you have learned any classical statistical model for estimating the size of a population. So this is a mission almost impossible for a Frequentist.¹

(¹ I say "almost" because the maximum likelihood method can be used by a Frequentist.)

2. Don't forget those Bayesian guys. This is how the Bayesian method works for this tricky problem. Let's first catch 10 fishes, put red paint on them (or tattoo them if you can), and send them back into the lake. After one day, let's catch another 10 fishes and count how many have red paint.

3. Suppose we have 3 red ones out of the 10 fishes. Intuitively we can do the math:

    10 red fishes / population = 3 / 10  ⟹  population ≈ 33.

This calculation assumes the red fishes spread evenly in the lake. What if they do not (a tattooed guy may like to hang out with another tattooed guy)? So there should be uncertainty associated with the estimate 33. In this case, there is no way a Frequentist can give you something like a standard error. Only the Bayesian method can be used to account for that inherent uncertainty.

4. Let the unknown parameter be the population size, θ = n. The key insight is that the probability of catching red fishes depends on n: P(success) = 10/n. Hence we can still use the binomial distribution to solve this problem:

    P(3 successes out of 10 trials) = C(10,3) (10/n)^3 (1 − 10/n)^(10−3)    (5)

5. For the prior, again let's use a flat one, P(n = k) = 0.1, k = 10, 20, ..., 100, as a starting point. The posterior distribution for the population size after catching 3 red fishes out of 10 is plotted below.

[Figure: prior (red, flat) and posterior (blue) distributions over the population size n, with a green line marking the most likely population size n = 30.]
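As a quick sanity check (my own arithmetic), evaluate (5) at n = 30, where the success probability is 10/30 = 1/3:

    C(10,3) (1/3)^3 (2/3)^7 = 120 × (1/27) × (128/2187) ≈ 0.260,

which is the largest likelihood value on the grid n = 10, 20, ..., 100, so with a flat prior the posterior indeed peaks at n = 30.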
The Stata code is:

clear
set obs 10                     // grid of population sizes: 10, 20, ..., 100
sca m = 3                      // number of red fishes recaptured
gen nv = ([_n])*10             // candidate population sizes
gen pv = 10/nv                 // success probability 10/n
gen pr = 1/10                  // flat prior
gen lv = binomial(10, m, pv) - binomial(10, (m-1), pv)   // likelihood P(exactly m successes)
gen po = lv*pr
qui sum po
qui replace po = po/r(sum)     // normalize the posterior
twoway (connected po nv, ms(th)) (connected pr nv, ms(oh)), ytitle("distribution")

6. Exercise: please modify my code to do a finer search for the population size, n = 10, 11, 12, ..., 100 (one possible modification is sketched after item 7).

7. The possible population sizes and their corresponding probabilities are listed by

. list nv po

[Table output omitted: columns nv and po.]
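Here is one possible modification for the exercise, as a sketch of my own (assuming the finer grid still runs from 10 to 100). Only the number of grid points, the grid itself, and the flat prior need to change:

clear
set obs 91                     // 91 grid points: 10, 11, ..., 100
sca m = 3
gen nv = _n + 9                // candidate population sizes 10, 11, ..., 100
gen pv = 10/nv
gen pr = 1/91                  // flat prior over 91 points
gen lv = binomial(10, m, pv) - binomial(10, (m-1), pv)
gen po = lv*pr
qui sum po
qui replace po = po/r(sum)
twoway (connected po nv, ms(th)) (connected pr nv, ms(oh)), ytitle("distribution")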
Bayes is Lovable

1. First of all, economists adore Bayes. For example, they use a utility function to measure how happy the millionaire is after buying that lake (and eating all those poor fishes). Because the fish population is random, economists compute something called expected utility:

    E(u(c)) = Σ_j u(c_j) P(c_j) = √10 × 0 + √20 × P(n = 20) + ⋯ + √100 × P(n = 100)

(the posterior probability of n = 10 is zero: if the population were only the 10 painted fishes, every recaptured fish would be red). Here we assume the consumption is fish, and we use the square root function because it satisfies diminishing marginal utility. Because the probabilities are readily available from the posterior distribution, the Bayesian result can be incorporated seamlessly into consumer theory (a short Stata computation is sketched at the end of this note).

2. In fact, any theory or problem that involves uncertainty (probability) can use the help of Bayesian statistics. Alan M. Turing used Bayesian methods to crack the encrypted German military code during WWII; the British navy used Bayesian methods to narrow down the sea area when hunting German U-boats; Google engineers use Bayesian methods to guess whether a picture shows a dog or a cat; Dr. Li uses Bayesian methods to show off in front of his young kids... If you believe the world is full of uncertainty, so that we can never know the truth (which lies somewhere unknown in the middle), please give Bayes a serious thought.

3. To learn more about Bayesian theory, I recommend the book "Doing Bayesian Data Analysis: A Tutorial with R and BUGS" by John K. Kruschke. That book gives a good introduction to R as well.
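Here is the expected-utility computation promised above, as a short Stata sketch of my own. It is meant to be run after the Example II code, and it treats consumption as the whole fish population nv, which is a simplifying assumption:

gen eu = sqrt(nv)*po           // u(c_j) x P(c_j) at each grid point
qui sum eu
di "expected utility = " r(sum)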