Probability Theory Review

1 Probability Theory Review. Brendan O'Connor. Recitation, Sept 11 & 12.

2 Mathematical Tools for Machine Learning: probability theory, linear algebra, calculus. Wikipedia is a great reference.

3 Probability Theory in AI. Fundamental problem: reasoning about uncertainty. "Uncertainty in Artificial Intelligence": a history of non-probabilistic approaches (fuzzy logic, nonmonotonic reasoning, Dempster-Shafer, three-valued logics...). Pearl 1988 (among others): probability theory is the way to go. Judea Pearl: 2011 Turing Award winner. Probabilistic approaches completely dominate modern machine learning.

4 Why Probability Theory? Dutch book argument (Bruno de Finetti): if you violate these rules, you will lose money. Kolmogorov axioms (Andrey Kolmogorov). Cox axioms: probabilities of boolean variables; logic.

5-7 Cox axioms. Probability theory as an extension of boolean logic. Let B(X | Z) be a degree-of-belief function: belief in proposition X, conditional on background knowledge Z. Assume:
1. Degrees of belief are well-ordered
2. B(~X) = f[ B(X) ]
3. B(X and Y) = g( B(X), B(Y | X) )
Then B(.) can be mapped to a P(.) where:
1. P(true) = 1
2. P(false) = 0
3. P(X and Y) = P(X) P(Y | X)
4. P(~X) = 1 - P(X)
Equivalently, on the log scale with G(X) = log P(X):
1. G(true) = 0
2. G(false) = -infinity
3. G(X and Y) = G(X) + G(Y | X)
4. G(~X) = log(1 - exp[G(X)])
(Note: there are controversies about these axioms; see the references.)

8 Many views of probability. Meaning of a probability: epistemic (knowledge/belief) vs. long-run frequencies. Related: Bayesian vs. frequentist statistics.

9 Discrete Random Variables. Two views of RVs: (1) the outcome of a random process, or (2) a quantity with a true value that you don't know. Binary (boolean) variable: Values(A) = {0, 1}, or {-1, +1}, or two symbols (icons lost in transcription). Categorical variable: Values(A) = {0, 1, 2}, or {55, 32, 10}, or three symbols. Probability (mass) function: P(A=a) -> [0, 1], e.g. P(A=1) = 0.7, P(A=0) = 0.3. Many notations: P_A(a), P(A), P(a).

10 Crucial definitions/laws. They hold for all RVs; make sure you know them!
Conditional probability and chain rule: P(A | B) = P(A, B) / P(B); P(A, B) = P(A | B) P(B)
Law of total probability: P(A) = Σ_b P(A, B=b) = Σ_b P(A | B=b) P(B=b)
Disjunction (union): P(A or B) = P(A) + P(B) - P(A, B)
Negation (complement): P(~A) = 1 - P(A)
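These laws can be checked mechanically on any toy joint distribution. A minimal Python sketch (the joint's numbers are made up for illustration, not from the slides):

```python
# Checking the laws on a made-up joint distribution over two binary RVs.
# joint[(a, b)] = P(A=a, B=b); the entries must sum to 1.
joint = {(0, 0): 0.30, (0, 1): 0.10, (1, 0): 0.20, (1, 1): 0.40}

def P_A(a):  # marginal of A, via the law of total probability
    return sum(joint[(a, b)] for b in (0, 1))

def P_B(b):  # marginal of B
    return sum(joint[(a, b)] for a in (0, 1))

def P_A_given_B(a, b):  # definition of conditional probability
    return joint[(a, b)] / P_B(b)

# chain rule: P(A, B) = P(A | B) P(B)
assert abs(joint[(1, 1)] - P_A_given_B(1, 1) * P_B(1)) < 1e-12
# disjunction: P(A=1 or B=1) = P(A=1) + P(B=1) - P(A=1, B=1)
p_union = sum(p for (a, b), p in joint.items() if a == 1 or b == 1)
assert abs(p_union - (P_A(1) + P_B(1) - joint[(1, 1)])) < 1e-12
# negation: P(A=0) = 1 - P(A=1)
assert abs(P_A(0) - (1 - P_A(1))) < 1e-12
```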

11 Example: three variables A, B, C with a joint probability table (most table values were lost in transcription; one legible row read A=0, B=0, C=1 with probability 0.10). Titanic example: a contingency table of Class (1st / lower) x Gender (male / female) x Age (adult / child) x Survival (no / yes), with per-cell counts and totals (counts lost in transcription). Any close conditional independences? P(Survive | adult, lower, female) ≈ P(Survive | child, lower, female), i.e. Survive ⊥ Age | {lower, female}.

12-15 Data: 1035 (A, B) pairs. (These slides build up a 2x2 table of counts for two icon-valued variables A and B; the icons and counts were lost in transcription.) Estimate probabilities as relative frequencies: P(A=a) = #{A=a} / N.
Sum Rule: Σ_{a ∈ Values(A)} P(A=a) = 1, so the two values' estimated probabilities sum to 1.
Negation (complement): P(A != a) = 1 - P(A=a).

16 Data: 1035 (A, B) pairs. Disjunctions: P(A=a or B=b) = #{A=a or B=b} / N. Equivalently, by P(A or B) = P(A) + P(B) - P(A, B): P(A=a or B=b) = P(A=a) + P(B=b) - P(A=a, B=b). (The specific counts were lost in transcription.)

17-22 Data: 1035 (A, B) pairs. Joint and conditional probabilities:
P(A=a, B=b) = #{A=a, B=b} / N
P(A=a | B=b) = #{A=a, B=b} / #{B=b}   (one slide's joint count: #{A=a, B=b} = 15)
Every probability has a universe of background knowledge: P(A=a, B=b, in the table); P(A=a | B=b, and in the table).
A conditional probability is the fraction of worlds in which B is true that also have A true:
Definition of conditional probability: P(A | B) = P(A, B) / P(B), so P(A=a | B=b) = P(A=a, B=b) / P(B=b).
Chain rule: P(A, B) = P(A | B) P(B).

23-24 Data: 1035 (A, B) pairs. Law of total probability: P(A) = Σ_b P(A, B=b). Break A into B-subgroups and add joint probabilities across subgroups: writing b1, b2 for the two values of B (icons lost in transcription), P(A=a) = P(A=a, B=b1) + P(A=a, B=b2). Equivalently, break the universe into B-subgroups and average conditional probabilities across subgroups: P(A) = Σ_b P(A | B=b) P(B=b), i.e. P(A=a) = P(A=a | B=b1) P(B=b1) + P(A=a | B=b2) P(B=b2).
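The counting recipe from these slides can be sketched in Python. The slides' actual 1035-pair dataset was lost in transcription, so the pairs below are hypothetical stand-ins:

```python
# Sketch of count-based probability estimation with hypothetical (A, B) pairs.
from collections import Counter

pairs = [("h", "s"), ("h", "h"), ("s", "s"), ("h", "s"), ("s", "s"), ("h", "h")]
N = len(pairs)
joint_counts = Counter(pairs)               # #{A=a, B=b}
b_counts = Counter(b for _, b in pairs)     # #{B=b}
a_counts = Counter(a for a, _ in pairs)     # #{A=a}

def p_joint(a, b):       # P(A=a, B=b) = #{A=a, B=b} / N
    return joint_counts[(a, b)] / N

def p_a_given_b(a, b):   # P(A=a | B=b) = #{A=a, B=b} / #{B=b}
    return joint_counts[(a, b)] / b_counts[b]

# law of total probability: the marginal is the sum of joints over B's values...
p_a = sum(p_joint("h", b) for b in b_counts)
assert abs(p_a - a_counts["h"] / N) < 1e-12
# ...and also the average of conditionals weighted by P(B=b)
p_a2 = sum(p_a_given_b("h", b) * b_counts[b] / N for b in b_counts)
assert abs(p_a2 - p_a) < 1e-12
```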

25 Crucial definitions/laws. They hold for all RVs.
Conditional probability and chain rule: P(A | B) = P(A, B) / P(B); P(A, B) = P(A | B) P(B)
Law of total probability: P(A) = Σ_b P(A, B=b) = Σ_b P(A | B=b) P(B=b)
Disjunction (union): P(A or B) = P(A) + P(B) - P(A, B)
Negation (complement): P(~A) = 1 - P(A)

26-28 Background-conditional variants. Every law has a version conditioned on extra background knowledge Z:
P(~A) = 1 - P(A)  becomes  P(~A | Z) = 1 - P(A | Z)
P(A, B) = P(A | B) P(B)  becomes  P(A, B | Z) = P(A | B, Z) P(B | Z)
etc.

29-30 Independence. A and B are independent when P(A | B) = P(A). Alternate formulation: P(A, B) = P(A | B) P(B) = P(A) P(B).

31-32 Independence and the chain rule. In general, P(A, B, C) = P(A | B, C) P(B | C) P(C). But when A, B, C are independent, the conditioning drops out: P(A, B, C) = P(A) P(B) P(C). If you have independence, you can estimate the joint probability of thousands of RVs.
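A sketch of this: build a joint over three binary RVs that factorizes by construction, then check that the chain rule (which holds for any joint) agrees with the product of marginals. The marginal values are made up:

```python
# Chain rule vs. independence on a constructed three-variable joint.
from itertools import product

pA, pB, pC = 0.7, 0.4, 0.9          # made-up marginals P(X=1)
joint = {(a, b, c): (pA if a else 1 - pA) * (pB if b else 1 - pB) * (pC if c else 1 - pC)
         for a, b, c in product((0, 1), repeat=3)}

def marg(var, val):                  # marginal by summing out the other vars
    return sum(p for xs, p in joint.items() if xs[var] == val)

def cond(var, val, given):           # P(X_var=val | the vars fixed in `given`)
    num = sum(p for xs, p in joint.items()
              if xs[var] == val and all(xs[v] == w for v, w in given.items()))
    den = sum(p for xs, p in joint.items()
              if all(xs[v] == w for v, w in given.items()))
    return num / den

lhs = joint[(1, 1, 1)]
# chain rule holds for any joint: P(ABC) = P(A|BC) P(B|C) P(C)
chain = cond(0, 1, {1: 1, 2: 1}) * cond(1, 1, {2: 1}) * marg(2, 1)
assert abs(lhs - chain) < 1e-12
# under independence, the conditionals drop their conditions:
assert abs(lhs - marg(0, 1) * marg(1, 1) * marg(2, 1)) < 1e-12
```

With independence, storing the joint over n binary RVs takes n numbers instead of 2^n - 1, which is why the assumption is so useful at scale.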

33 Notation. These all mean the same joint probability: P(AB), P(A ∧ B), P(A, B). Written with explicit values: P(A=a, B=b), and for conditionals P(A=a | B=b). Shorthands like P(A) = Σ_b P(A, B=b) mean "for every value a": P(A=a) = Σ_b P(A=a, B=b).

34-35 Discrete vs. continuous RVs.
Discrete: probability mass function. P(X=x) ∈ [0, 1] and Σ_{x ∈ Values(X)} P(X=x) = 1.
Continuous: probability density function. There is no probability at a single point! p(x) ∈ [0, ∞), P(X ∈ [a, b]) = ∫_a^b p(x) dx, and ∫ p(x) dx = 1.
Quiz: can p(x) > 1? (The slide plots the Beta(4, 2) density, p_Beta(x; 4, 2).)
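The quiz answer is yes: a density can exceed 1 pointwise as long as it integrates to 1. A quick sketch using the Beta(4, 2) density from the slide's plot (its closed form, 20 x^3 (1 - x), is standard but not stated on the slide):

```python
# A density can exceed 1 pointwise; only the total area must equal 1.
def p_beta_4_2(x):
    # Beta(4, 2) density: x^3 (1-x) / B(4, 2), with B(4, 2) = 1/20
    return 20 * x**3 * (1 - x)

# the peak (at x = 3/4) is well above 1:
peak = p_beta_4_2(0.75)          # ≈ 2.11

# ...yet the density still integrates to 1 (midpoint rule on [0, 1]):
n = 100_000
area = sum(p_beta_4_2((i + 0.5) / n) for i in range(n)) / n
assert abs(area - 1.0) < 1e-6
```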

36 Bayes Rule. We want P(H | E) but only have P(E | H). Why? E.g., P(E | H) has been measured, or H causes E. (H: have the disease? E: outcome of a medical test.) Since P(H | E) P(E) = P(E | H) P(H),
P(H | E) = P(E | H) P(H) / P(E)
Posterior = Likelihood x Prior / Normalizer. (Rev. Thomas Bayes; Pierre-Simon Laplace.)

37 Bayes Rule and its pesky denominator.
P(H | E) = P(E | H) P(H) / P(E) = P(E | H) P(H) / Σ_h' P(E | h') P(h')   (Likelihood x Prior)
P(H | E) = (1/Z) P(E | H) P(H), where Z is whatever lets the posterior, summed across h, sum to 1.
P(H | E) ∝ P(E | H) P(H): the unnormalized posterior. By itself it does not sum to 1!
Z is the Zustandssumme, "sum over states," from 19th-century statistical physics. (It is called a partition function for certain types of models, e.g., Markov random fields.) Computing Z can be very difficult in high dimensions; tons of work in statistical machine learning is essentially devoted to computing Z or variants of it.

38-42 Bayes Rule: discrete example, over h ∈ {a, b, c}.
Prior P(H=h): [0.2, 0.2, 0.6]   (sums to 1? yes)
Multiply by likelihood P(E | H=h): [0.2, 0.05, 0.05]   (sums to 1? no, and it needn't)
Unnormalized posterior P(E | H=h) P(H=h): [0.04, 0.01, 0.03]   (sums to 1? no)
Normalize: posterior (1/Z) P(E | H=h) P(H=h): [0.500, 0.125, 0.375]   (sums to 1? yes)
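The whole pipeline is three lines of Python; this reproduces the slide's numbers:

```python
# Discrete Bayes rule: multiply prior by likelihood, then normalize.
prior      = [0.2, 0.2, 0.6]     # P(H=h) for h in (a, b, c); sums to 1
likelihood = [0.2, 0.05, 0.05]   # P(E | H=h); need not sum to 1
unnorm = [l * p for l, p in zip(likelihood, prior)]   # ≈ [0.04, 0.01, 0.03]
Z = sum(unnorm)                  # the "pesky denominator" P(E) ≈ 0.08
posterior = [u / Z for u in unnorm]                   # ≈ [0.5, 0.125, 0.375]
```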

43-44 Bayes Rule: discrete, uniform prior. A uniform distribution is an uninformative prior.
Prior P(H=h): [0.33, 0.33, 0.33]   (sums to 1? yes)
Multiply by likelihood P(E | H=h): [0.2, 0.05, 0.05]   (no)
Unnormalized posterior: [0.066, 0.016, 0.016]   (no)
Normalize: posterior: [0.66, 0.16, 0.16]   (yes)
A uniform prior implies that the posterior is just the renormalized likelihood.

45-46 Bayes Rule: continuous parameter. Same pipeline, now with densities (the slides show plots, lost in transcription): a Beta(2, 2) prior density; the likelihood L(θ) for data of 0 heads and 5 tails; the unnormalized posterior, likelihood x prior; and the normalized posterior. With a uniform (uninformative) prior instead, the posterior is again just the renormalized likelihood.

47 Bayes Rule: more data swamps the prior. Panels (plots lost in transcription) for 1 head & 1 tail, 3 heads & 1 tail, and 8 heads & 1 tail:
Prior: p_Beta(θ; 2, 2) = (1/Z) θ (1 - θ)
Likelihood: θ^#H (1 - θ)^#T
Posterior: (1/Z) θ^(#H+1) (1 - θ)^(#T+1)
As data accumulates, the likelihood dominates and the posterior concentrates near the empirical frequency of heads.
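A grid-approximation sketch of these panels (the head/tail counts are the slide's; the grid size is an arbitrary implementation choice). The posterior mode moves toward the empirical frequency as data grows:

```python
# Grid approximation of a Beta(2,2)-prior, Bernoulli-likelihood posterior.
n_grid = 1001
thetas = [i / (n_grid - 1) for i in range(n_grid)]

def grid_posterior(heads, tails):
    # unnormalized posterior: theta(1-theta) * theta^H (1-theta)^T, renormalized
    unnorm = [t * (1 - t) * t**heads * (1 - t)**tails for t in thetas]
    Z = sum(unnorm)
    return [u / Z for u in unnorm]

def posterior_mode(heads, tails):
    post = grid_posterior(heads, tails)
    return thetas[max(range(n_grid), key=post.__getitem__)]

# more data pulls the mode toward #H / (#H + #T):
for h, t in [(1, 1), (3, 1), (8, 1)]:
    print(h, t, posterior_mode(h, t))
```

The closed-form mode is (#H + 1) / (#H + #T + 2), so the printed modes are near 0.5, 2/3, and 9/11 respectively.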

48-49 Bayes Rule: odds-ratio form.
Probability form: P(H | E) = P(E | H) P(H) / P(E), where posterior, likelihood, and prior all live in [0, 1].
Odds form: P(h1 | E) / P(h2 | E) = [P(E | h1) / P(E | h2)] x [P(h1) / P(h2)], i.e. posterior odds = likelihood ratio x prior odds, each in [0, ∞).
Binary probability rescalings (value for a fair coin, p = 1/2, in parentheses):
prob: p, range [0, 1]   (1/2)
log-prob: log p, range (-∞, 0]   (log 1/2)
odds: p / (1 - p), range [0, +∞)   (1)
log-odds (logit): log[p / (1 - p)], range (-∞, +∞)   (0)
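A sketch of the odds form with made-up priors and likelihoods, cross-checked against the probability form:

```python
# Odds-ratio Bayes rule: posterior odds = likelihood ratio * prior odds.
import math

def odds(p):        # p / (1 - p), in [0, inf)
    return p / (1 - p)

def logit(p):       # log-odds, in (-inf, +inf)
    return math.log(odds(p))

# two hypotheses, with made-up numbers:
prior_h1, prior_h2 = 0.2, 0.8
lik_h1, lik_h2 = 0.9, 0.3

posterior_odds = (lik_h1 / lik_h2) * (prior_h1 / prior_h2)   # = 3 * 0.25 = 0.75

# cross-check against the probability form of Bayes rule:
Z = lik_h1 * prior_h1 + lik_h2 * prior_h2
post_h1 = lik_h1 * prior_h1 / Z
assert abs(posterior_odds - odds(post_h1)) < 1e-9
assert abs(logit(0.5)) < 1e-12     # a fair coin has log-odds 0
```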

50 i.i.d. likelihood. Bernoulli random process: each X_i comes out H with probability θ, else T. The likelihood of θ is the probability of the data under parameter θ:
L(θ) = P(X_1, X_2, ..., X_n | θ)
     = Π_i P(X_i | θ)                       (independent... and identically distributed!)
     = Π_i {θ if X_i = H else 1 - θ}
     = Π_i θ^1{X_i=H} (1 - θ)^1{X_i=T}
     = θ^#{X_i=H} (1 - θ)^#{X_i=T}
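A sketch of the derivation's endpoints: the term-by-term product and the closed count form give the same likelihood (the coin sequence here is made up):

```python
# i.i.d. Bernoulli likelihood: product form vs. closed count form.
def likelihood_product(xs, theta):
    out = 1.0
    for x in xs:                     # one factor per observation
        out *= theta if x == "H" else (1 - theta)
    return out

def likelihood_counts(xs, theta):
    h = sum(x == "H" for x in xs)    # #{X_i = H}
    t = len(xs) - h                  # #{X_i = T}
    return theta**h * (1 - theta)**t

xs = list("HHTHT")
assert abs(likelihood_product(xs, 0.6) - likelihood_counts(xs, 0.6)) < 1e-12
# with 3 heads and 2 tails at theta = 0.6: 0.6^3 * 0.4^2 ≈ 0.03456
```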

51 Maximum likelihood. If we must commit to a single point value of θ, what should we believe? One answer: the maximum-likelihood value of θ, the one that makes the data most probable.
L(θ) = θ^#{X_i=H} (1 - θ)^#{X_i=T}
argmax_θ L(θ) = argmax_θ log L(θ)   (most important trick: take the log! Same argmax, since log is monotonic.)
log L(θ) = #{X_i=H} log θ + #{X_i=T} log(1 - θ)
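A sketch: maximizing the log-likelihood over a grid recovers the closed-form Bernoulli MLE, #H / n (the counts are made up):

```python
# Bernoulli MLE via the log-likelihood.
import math

def log_lik(h, t, theta):
    # #H log(theta) + #T log(1 - theta)
    return h * math.log(theta) + t * math.log(1 - theta)

h, t = 7, 3
grid = [i / 1000 for i in range(1, 1000)]        # avoid log(0) at the endpoints
mle = max(grid, key=lambda th: log_lik(h, t, th))
assert abs(mle - h / (h + t)) < 1e-3             # matches the closed form, 0.7
```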

52 Maximum a Posteriori (MAP).
p(θ | D) = P(D | θ) p(θ) / P(D)
p(θ | D) ∝ P(D | θ) p(θ)
argmax_θ p(θ | D) = argmax_θ P(D | θ) p(θ)
Another trick: the unnormalized posterior is good enough for MAP, since the normalizer does not depend on θ. (The slide repeats the prior / likelihood / unnormalized posterior / posterior plots.)
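A sketch of that trick: the argmax of likelihood x prior over a grid never touches Z. This assumes the earlier slides' Beta(2, 2) prior and the 0-heads-5-tails data:

```python
# MAP estimation from the unnormalized posterior only.
def unnorm_posterior(theta, h, t):
    prior = theta * (1 - theta)               # Beta(2, 2) prior, up to a constant
    return theta**h * (1 - theta)**t * prior  # likelihood * prior, no Z needed

h, t = 0, 5                                   # the slides' 0-heads, 5-tails data
grid = [i / 1000 for i in range(1, 1000)]
theta_map = max(grid, key=lambda th: unnorm_posterior(th, h, t))
# closed form: the posterior is Beta(h+2, t+2), with mode (h+1) / (h+t+2)
assert abs(theta_map - (h + 1) / (h + t + 2)) < 1e-3   # ≈ 1/7
```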


54 Entropy.
H(X) = -Σ_x p(x) log p(x)
H(X) = E[-log p(X)]
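A sketch of the definition (using base-2 logs so entropy is in bits; the slide leaves the base unspecified, and natural log is also common):

```python
# Entropy of a discrete distribution: H(X) = -sum_x p(x) log p(x).
import math

def entropy(ps, base=2):
    # terms with p(x) = 0 contribute 0, by the convention 0 log 0 = 0
    return -sum(p * math.log(p, base) for p in ps if p > 0)

print(entropy([0.5, 0.5]))     # fair coin: 1 bit
print(entropy([1.0]))          # certain outcome: 0 bits
print(entropy([0.25] * 4))     # uniform over 4 outcomes: 2 bits
```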


More information

Probability Review Lecturer: Ji Liu Thank Jerry Zhu for sharing his slides

Probability Review Lecturer: Ji Liu Thank Jerry Zhu for sharing his slides Probability Review Lecturer: Ji Liu Thank Jerry Zhu for sharing his slides slide 1 Inference with Bayes rule: Example In a bag there are two envelopes one has a red ball (worth $100) and a black ball one

More information

Introduction into Bayesian statistics

Introduction into Bayesian statistics Introduction into Bayesian statistics Maxim Kochurov EF MSU November 15, 2016 Maxim Kochurov Introduction into Bayesian statistics EF MSU 1 / 7 Content 1 Framework Notations 2 Difference Bayesians vs Frequentists

More information

Naïve Bayes. Jia-Bin Huang. Virginia Tech Spring 2019 ECE-5424G / CS-5824

Naïve Bayes. Jia-Bin Huang. Virginia Tech Spring 2019 ECE-5424G / CS-5824 Naïve Bayes Jia-Bin Huang ECE-5424G / CS-5824 Virginia Tech Spring 2019 Administrative HW 1 out today. Please start early! Office hours Chen: Wed 4pm-5pm Shih-Yang: Fri 3pm-4pm Location: Whittemore 266

More information

Probability is related to uncertainty and not (only) to the results of repeated experiments

Probability is related to uncertainty and not (only) to the results of repeated experiments Uncertainty probability Probability is related to uncertainty and not (only) to the results of repeated experiments G. D Agostini, Probabilità e incertezze di misura - Parte 1 p. 40 Uncertainty probability

More information

Bayesian Statistics. State University of New York at Buffalo. From the SelectedWorks of Joseph Lucke. Joseph F. Lucke

Bayesian Statistics. State University of New York at Buffalo. From the SelectedWorks of Joseph Lucke. Joseph F. Lucke State University of New York at Buffalo From the SelectedWorks of Joseph Lucke 2009 Bayesian Statistics Joseph F. Lucke Available at: https://works.bepress.com/joseph_lucke/6/ Bayesian Statistics Joseph

More information

Bayesian Learning. HT2015: SC4 Statistical Data Mining and Machine Learning. Maximum Likelihood Principle. The Bayesian Learning Framework

Bayesian Learning. HT2015: SC4 Statistical Data Mining and Machine Learning. Maximum Likelihood Principle. The Bayesian Learning Framework HT5: SC4 Statistical Data Mining and Machine Learning Dino Sejdinovic Department of Statistics Oxford http://www.stats.ox.ac.uk/~sejdinov/sdmml.html Maximum Likelihood Principle A generative model for

More information

CS 540: Machine Learning Lecture 2: Review of Probability & Statistics

CS 540: Machine Learning Lecture 2: Review of Probability & Statistics CS 540: Machine Learning Lecture 2: Review of Probability & Statistics AD January 2008 AD () January 2008 1 / 35 Outline Probability theory (PRML, Section 1.2) Statistics (PRML, Sections 2.1-2.4) AD ()

More information

Decision making and problem solving Lecture 1. Review of basic probability Monte Carlo simulation

Decision making and problem solving Lecture 1. Review of basic probability Monte Carlo simulation Decision making and problem solving Lecture 1 Review of basic probability Monte Carlo simulation Why probabilities? Most decisions involve uncertainties Probability theory provides a rigorous framework

More information

ECE521 Tutorial 11. Topic Review. ECE521 Winter Credits to Alireza Makhzani, Alex Schwing, Rich Zemel and TAs for slides. ECE521 Tutorial 11 / 4

ECE521 Tutorial 11. Topic Review. ECE521 Winter Credits to Alireza Makhzani, Alex Schwing, Rich Zemel and TAs for slides. ECE521 Tutorial 11 / 4 ECE52 Tutorial Topic Review ECE52 Winter 206 Credits to Alireza Makhzani, Alex Schwing, Rich Zemel and TAs for slides ECE52 Tutorial ECE52 Winter 206 Credits to Alireza / 4 Outline K-means, PCA 2 Bayesian

More information

Computational Cognitive Science

Computational Cognitive Science Computational Cognitive Science Lecture 9: Bayesian Estimation Chris Lucas (Slides adapted from Frank Keller s) School of Informatics University of Edinburgh clucas2@inf.ed.ac.uk 17 October, 2017 1 / 28

More information

Introduction to Machine Learning

Introduction to Machine Learning Introduction to Machine Learning CS4375 --- Fall 2018 Bayesian a Learning Reading: Sections 13.1-13.6, 20.1-20.2, R&N Sections 6.1-6.3, 6.7, 6.9, Mitchell 1 Uncertainty Most real-world problems deal with

More information

Lecture 1: Bayesian Framework Basics

Lecture 1: Bayesian Framework Basics Lecture 1: Bayesian Framework Basics Melih Kandemir melih.kandemir@iwr.uni-heidelberg.de April 21, 2014 What is this course about? Building Bayesian machine learning models Performing the inference of

More information

Bayesian Analysis (Optional)

Bayesian Analysis (Optional) Bayesian Analysis (Optional) 1 2 Big Picture There are two ways to conduct statistical inference 1. Classical method (frequentist), which postulates (a) Probability refers to limiting relative frequencies

More information

Consider an experiment that may have different outcomes. We are interested to know what is the probability of a particular set of outcomes.

Consider an experiment that may have different outcomes. We are interested to know what is the probability of a particular set of outcomes. CMSC 310 Artificial Intelligence Probabilistic Reasoning and Bayesian Belief Networks Probabilities, Random Variables, Probability Distribution, Conditional Probability, Joint Distributions, Bayes Theorem

More information

Random Variables Chris Piech CS109, Stanford University. Piech, CS106A, Stanford University

Random Variables Chris Piech CS109, Stanford University. Piech, CS106A, Stanford University Random Variables Chris Piech CS109, Stanford University Piech, CS106A, Stanford University Piech, CS106A, Stanford University Let s Play a Game Game set-up We have a fair coin (come up heads with p = 0.5)

More information

Probability theory basics

Probability theory basics Probability theory basics Michael Franke Basics of probability theory: axiomatic definition, interpretation, joint distributions, marginalization, conditional probability & Bayes rule. Random variables:

More information

Introduction to Machine Learning

Introduction to Machine Learning Uncertainty Introduction to Machine Learning CS4375 --- Fall 2018 a Bayesian Learning Reading: Sections 13.1-13.6, 20.1-20.2, R&N Sections 6.1-6.3, 6.7, 6.9, Mitchell Most real-world problems deal with

More information

With Question/Answer Animations. Chapter 7

With Question/Answer Animations. Chapter 7 With Question/Answer Animations Chapter 7 Chapter Summary Introduction to Discrete Probability Probability Theory Bayes Theorem Section 7.1 Section Summary Finite Probability Probabilities of Complements

More information

Uncertainty and knowledge. Uncertainty and knowledge. Reasoning with uncertainty. Notes

Uncertainty and knowledge. Uncertainty and knowledge. Reasoning with uncertainty. Notes Approximate reasoning Uncertainty and knowledge Introduction All knowledge representation formalism and problem solving mechanisms that we have seen until now are based on the following assumptions: All

More information

UNCERTAINTY. In which we see what an agent should do when not all is crystal-clear.

UNCERTAINTY. In which we see what an agent should do when not all is crystal-clear. UNCERTAINTY In which we see what an agent should do when not all is crystal-clear. Outline Uncertainty Probabilistic Theory Axioms of Probability Probabilistic Reasoning Independency Bayes Rule Summary

More information

Exercise 1: Basics of probability calculus

Exercise 1: Basics of probability calculus : Basics of probability calculus Stig-Arne Grönroos Department of Signal Processing and Acoustics Aalto University, School of Electrical Engineering stig-arne.gronroos@aalto.fi [21.01.2016] Ex 1.1: Conditional

More information

Bayesian Methods: Naïve Bayes

Bayesian Methods: Naïve Bayes Bayesian Methods: aïve Bayes icholas Ruozzi University of Texas at Dallas based on the slides of Vibhav Gogate Last Time Parameter learning Learning the parameter of a simple coin flipping model Prior

More information

Lecture 10: Introduction to reasoning under uncertainty. Uncertainty

Lecture 10: Introduction to reasoning under uncertainty. Uncertainty Lecture 10: Introduction to reasoning under uncertainty Introduction to reasoning under uncertainty Review of probability Axioms and inference Conditional probability Probability distributions COMP-424,

More information

Chapter 2 Classical Probability Theories

Chapter 2 Classical Probability Theories Chapter 2 Classical Probability Theories In principle, those who are not interested in mathematical foundations of probability theory might jump directly to Part II. One should just know that, besides

More information

Probabilistic Reasoning. Doug Downey, Northwestern EECS 348 Spring 2013

Probabilistic Reasoning. Doug Downey, Northwestern EECS 348 Spring 2013 Probabilistic Reasoning Doug Downey, Northwestern EECS 348 Spring 2013 Limitations of logic-based agents Qualification Problem Action s preconditions can be complex Action(Grab, t) => Holding(t).unless

More information

Bayesian inference. Fredrik Ronquist and Peter Beerli. October 3, 2007

Bayesian inference. Fredrik Ronquist and Peter Beerli. October 3, 2007 Bayesian inference Fredrik Ronquist and Peter Beerli October 3, 2007 1 Introduction The last few decades has seen a growing interest in Bayesian inference, an alternative approach to statistical inference.

More information

Probability. Machine Learning and Pattern Recognition. Chris Williams. School of Informatics, University of Edinburgh. August 2014

Probability. Machine Learning and Pattern Recognition. Chris Williams. School of Informatics, University of Edinburgh. August 2014 Probability Machine Learning and Pattern Recognition Chris Williams School of Informatics, University of Edinburgh August 2014 (All of the slides in this course have been adapted from previous versions

More information

Probability Review and Naïve Bayes

Probability Review and Naïve Bayes Probability Review and Naïve Bayes Instructor: Alan Ritter Some slides adapted from Dan Jurfasky and Brendan O connor What is Probability? The probability the coin will land heads is 0.5 Q: what does this

More information

Probability and Information Theory. Sargur N. Srihari

Probability and Information Theory. Sargur N. Srihari Probability and Information Theory Sargur N. srihari@cedar.buffalo.edu 1 Topics in Probability and Information Theory Overview 1. Why Probability? 2. Random Variables 3. Probability Distributions 4. Marginal

More information

Bayesian Networks BY: MOHAMAD ALSABBAGH

Bayesian Networks BY: MOHAMAD ALSABBAGH Bayesian Networks BY: MOHAMAD ALSABBAGH Outlines Introduction Bayes Rule Bayesian Networks (BN) Representation Size of a Bayesian Network Inference via BN BN Learning Dynamic BN Introduction Conditional

More information

Sociology 6Z03 Topic 10: Probability (Part I)

Sociology 6Z03 Topic 10: Probability (Part I) Sociology 6Z03 Topic 10: Probability (Part I) John Fox McMaster University Fall 2014 John Fox (McMaster University) Soc 6Z03: Probability I Fall 2014 1 / 29 Outline: Probability (Part I) Introduction Probability

More information

Basics of Bayesian analysis. Jorge Lillo-Box MCMC Coffee Season 1, Episode 5

Basics of Bayesian analysis. Jorge Lillo-Box MCMC Coffee Season 1, Episode 5 Basics of Bayesian analysis Jorge Lillo-Box MCMC Coffee Season 1, Episode 5 Disclaimer This presentation is based heavily on Statistics, Data Mining, and Machine Learning in Astronomy: A Practical Python

More information

Lecture 9: Naive Bayes, SVM, Kernels. Saravanan Thirumuruganathan

Lecture 9: Naive Bayes, SVM, Kernels. Saravanan Thirumuruganathan Lecture 9: Naive Bayes, SVM, Kernels Instructor: Outline 1 Probability basics 2 Probabilistic Interpretation of Classification 3 Bayesian Classifiers, Naive Bayes 4 Support Vector Machines Probability

More information

The enumeration of all possible outcomes of an experiment is called the sample space, denoted S. E.g.: S={head, tail}

The enumeration of all possible outcomes of an experiment is called the sample space, denoted S. E.g.: S={head, tail} Random Experiment In random experiments, the result is unpredictable, unknown prior to its conduct, and can be one of several choices. Examples: The Experiment of tossing a coin (head, tail) The Experiment

More information

Lecture 23 Maximum Likelihood Estimation and Bayesian Inference

Lecture 23 Maximum Likelihood Estimation and Bayesian Inference Lecture 23 Maximum Likelihood Estimation and Bayesian Inference Thais Paiva STA 111 - Summer 2013 Term II August 7, 2013 1 / 31 Thais Paiva STA 111 - Summer 2013 Term II Lecture 23, 08/07/2013 Lecture

More information

Probability and Estimation. Alan Moses

Probability and Estimation. Alan Moses Probability and Estimation Alan Moses Random variables and probability A random variable is like a variable in algebra (e.g., y=e x ), but where at least part of the variability is taken to be stochastic.

More information

Time Series and Dynamic Models

Time Series and Dynamic Models Time Series and Dynamic Models Section 1 Intro to Bayesian Inference Carlos M. Carvalho The University of Texas at Austin 1 Outline 1 1. Foundations of Bayesian Statistics 2. Bayesian Estimation 3. The

More information