Probability. Machine Learning and Pattern Recognition. Chris Williams. School of Informatics, University of Edinburgh. August 2014

Size: px
Start display at page:

Download "Probability. Machine Learning and Pattern Recognition. Chris Williams. School of Informatics, University of Edinburgh. August 2014"

Transcription

1 Probability Machine Learning and Pattern Recognition Chris Williams School of Informatics, University of Edinburgh August 2014 (All of the slides in this course have been adapted from previous versions by Charles Sutton, Amos Storkey, David Barber.) 1 / 31

2 Outline What is probability? Random Variables (discrete and continuous) Expectation Joint Distributions Marginal Probability Conditional Probability Chain Rule Bayes Rule Independence Conditional Independence Some Probability Distributions (for reference) Reading: Murphy secs / 31

3 What is probability? Quantification of uncertainty Frequentist interpretation: long run frequenies of events Example: The probability of a particular coin landing heads up is 0.43 Bayesian interpretation: quantify our degrees of belief about something Example: the probability of it raining tomorrow is 0.3 Not possible to repeat tomorrow many times Basic rules of probability are the same, no matter which interpretation is adopted 3 / 31

4 Random Variables A random variable (RV) X denotes a quantity that is subject to variations due to chance May denote the result of an experiment (e.g. flipping a coin) or the measurement of a real-world fluctuating quantity (e.g. temperature) Use capital letters to denote random variables and lower case letters to denote values that they take, e.g. p(x = x) An RV may be discrete or continuous A discrete variable takes on values from a finite or countably infinite set Probability mass function p(x = x) for discrete random variables 4 / 31

5 Examples: Colour of a car blue, green, red Number of children in a family 0, 1, 2, 3, 4, 5, 6, > 6 Toss two coins, let X = (number of heads) 2. X can take on the values 0, 1 and 4. Example p(colour = red) = 0.3 x p(x) = 1 5 / 31

6 Continuous RVs Continuous RVs take on values that vary continuously within one or more real intervals Probability density function (pdf) p(x) for a continuous random variable X therefore p(a X b) = b a p(x)dx p(x X x + δx) p(x)δx p(x)dx = 1 (but values of p(x) can be greater than 1) Examples (coming soon): Gaussian, Gamma, Exponential, Beta 6 / 31

7 Expectation Consider a function f(x) mapping from x onto numerical values E[f(x)] = f(x)p(x) x = f(x)p(x)dx for discrete and continuous variables resp. f(x) = x, we obtain the mean, µ x f(x) = (x µ x ) 2 we obtain the variance 7 / 31

8 Joint distributions Properties of several random variables are important for modelling complex problems p(x 1 = x 1, X 2 = x 2,..., X D = x D ), is read as and Examples about Grade and Intelligence (from Koller and Friedman, 2009) Intelligence = low Intelligence = high Grade = A Grade = B Grade = C / 31

9 Marginal Probability The sum rule p(x) = y p(x, y) p(grade = A)?? Replace sum by an integral for continuous RVs 9 / 31

10 Conditional Probability Let X and Y be two disjoint groups of variables, such that p(y = y) > 0. Then the conditional probability distribution (CPD) of X given Y = y is given by p(x = x Y = y) = p(x y) = p(x, y) p(y) Product rule p(x, Y) = p(x)p(y X) = p(y)p(x Y) Example: In the grades example, what is p(intelligence = high Grade = A)? x p(x = x Y = y) = 1 for all y Can we say anything about y p(x = x Y = y)? 10 / 31

11 Chain Rule The chain rule is derived by repeated application of the product rule p(x 1,..., X D ) = p(x 1,..., X D 1 )p(x D X 1,..., X D 1 ) = p(x 1,..., X D 2 )p(x D 1 X 1,..., X D 2 ) p(x D X 1,..., X D 1 ) =... D = p(x 1 ) p(x i X 1,..., X i 1 ) i=2 Exercise: give six decompositions of p(x, y, z) using the chain rule 11 / 31

12 Bayes Rule From the product rule, p(x Y) = p(y X)p(X) p(y) From the sum rule the denominator is p(y) = X p(y X)p(X) 12 / 31

13 Probabilistic Inference using Bayes Rule Tuberculosis (TB) and a skin test (Test) p(t B = yes) = (for subjects who get tested) p(t est = yes T B = yes) = 0.95 p(t est = no T B = no) = 0.95 Person gets a positive test result. What is p(t B = yes T est = yes)? p(t est = yes T B = yes)p(t B = yes) p(t B = yes T est = yes) = p(t est = yes) = NB: These are fictitious numbers 13 / 31

14 Independence Let X and Y be two disjoint groups of variables. Then X is said to be independent of Y if and only if p(x Y) = p(x) for all possible values x and y of X and Y; otherwise X is said to be dependent on Y Using the definition of conditional probability, we get an equivalent expression for the independence condition p(x, Y) = p(x)p(y) X independent of Y Y independent of X Independence of a set of variables. X 1,...., X D are independent iff p(x 1,..., X D ) = D p(x i ) i=1 14 / 31

15 Conditional Independence Let X, Y and Z be three disjoint groups of variables. X is said to be conditionally independent of Y given Z iff p(x y, z) = p(x z) for all possible values of x, y and z. Equivalently p(x, y z) = p(x z)p(y z) [show this] Notation, I(X, Y Z) 15 / 31

16 Bernoulli Distribution X is a random variable that either takes the value 0 or the value 1. Let p(x = 1 p) = p and so p(x = 0 p) = 1 p. Then X has a Bernoulli distribution / 31

17 Categorical Distribution X is a random variable that takes one of the values 1, 2,..., D. Let p(x = i p) = p i, with D i=1 p i = 1. Then X has a catgorical (aka multinoulli) distribution (see Murphy 2012, p. 35)) / 31

18 Binomial Distribution The binomial distribution is obtained from the total number of 1 s in n independent Bernoulli trials. X is a random variable that takes one of the values 0, 1, 2,..., n. ( ) n Let p(x = r p) = p r (1 p) (n r). r Then X is binomially distributed / 31

19 Multinomial Distribution The multinomial distribution is obtained from the total count for each outcome in n independent multivariate trials with D possible outcomes. X is a random vector of length D taking values x with x i Z + (non-negative integers) and D i=1 x i = n. Let n! p(x = x p) = x 1!... x D! px px D m Then X is multinomially distributed. 19 / 31

20 Poisson Distribution The Poisson distribution is obtained from binomial distribution in the limit n with p/n = λ. X is a random variable taking non-negative integer values 0, 1, 2,.... Let p(x = x λ) = λx exp( λ) x! Then X is Poisson distributed / 31

21 Uniform Distribution X is a random variable taking values x [a, b]. Let p(x = x) = 1/[b a] Then X is uniformly distributed. Note Cannot have a uniform distribution on an unbounded region / 31

22 Gaussian Distribution X is a random variable taking values x R (real values). Let p(x = x µ, σ 2 ) = ( ) 1 exp (x µ)2 2πσ 2 2σ 2 Then X is Gaussian distributed with mean µ and variance σ / 31

23 Gamma Distribution The Gamma distribution has a rate parameter β > 0 (or a scale parameter 1/β) and a shape parameter α > 0. X is a random variable taking values x R + (non-negative real values). Let p(x = x α, β) = 1 Γ(α) xα 1 β α exp( βx) Then X is Gamma distributed. Note the Gamma function. 23 / 31

24 Exponential Distribution The exponential distribution is a Gamma distribution with α = 1. The exponential distribution is often used for arrival times. X is a random variable taking values x R +. Let p(x = x λ) = λ exp( λx) Then X is exponentially distributed / 31

25 Laplace Distribution The Laplace distribution is obtained from the difference between two independent identically exponentially distributed variables. X is a random variable taking values x R. Let p(x = x λ) = (λ/2) exp( λ x ) Then X is Laplace distributed / 31

26 Beta Distribution X is a random variable taking values x [0, 1]. Let p(x = x a, b) = Γ(a + b) Γ(a)Γ(b) xa 1 (1 x) b 1 Then X is β(a, b) distributed a = b = a = 2, b = 3 26 / 31

27 The Kronecker Delta Think of a discrete distribution with all its probability mass on one value. So p(x = i) = 1 iff (if and only if) i = j. We can write this using the Kronecker Delta: p(x = i) = δ ij δ ij = 1 iff i = j and is zero otherwise. 27 / 31

28 The Dirac Delta Think of a real valued distribution with all its probability density on one value. There is an infinite density peak at one point (lets call this point a). We can write this using the Dirac delta: p(x = x) = δ(x a) which has the properties δ(x a) = 0 if x a, δ(x a) = if x = a, dx δ(x a) = 1 and dx f(x)δ(x a) = f(a). You could think of it as a Gaussian distribution in the limit of zero variance. 28 / 31

29 Other Distributions Chi-squared distribution with k degrees of freedom is a Gamma distribution with β = 1/2 and k = 2/α. Dirichlet distribution: will be used on this course. Weibull distribution (a generalisation of the exponential) Geometric distribution Negative binomial distribution. Wishart distribution (a distribution over matrices). Use Wikipedia and Mathworld. Good summaries for distributions. 29 / 31

30 Things you must never (ever) forget Probabilities must be between 0 and 1 (though probability densities can be greater than 1). Distributions must sum (or integrate) to / 31

31 Summary Joint distributions Conditional Probability Sum and Product Rules Standard Probability distributions Reading: Murphy secs / 31

PMR Introduction. Probabilistic Modelling and Reasoning. Amos Storkey. School of Informatics, University of Edinburgh

PMR Introduction. Probabilistic Modelling and Reasoning. Amos Storkey. School of Informatics, University of Edinburgh PMR Introduction Probabilistic Modelling and Reasoning Amos Storkey School of Informatics, University of Edinburgh Amos Storkey PMR Introduction School of Informatics, University of Edinburgh 1/50 Outline

More information

Bayesian Models in Machine Learning

Bayesian Models in Machine Learning Bayesian Models in Machine Learning Lukáš Burget Escuela de Ciencias Informáticas 2017 Buenos Aires, July 24-29 2017 Frequentist vs. Bayesian Frequentist point of view: Probability is the frequency of

More information

Introduction to Machine Learning

Introduction to Machine Learning What does this mean? Outline Contents Introduction to Machine Learning Introduction to Probabilistic Methods Varun Chandola December 26, 2017 1 Introduction to Probability 1 2 Random Variables 3 3 Bayes

More information

Introduction to Machine Learning

Introduction to Machine Learning Introduction to Machine Learning Introduction to Probabilistic Methods Varun Chandola Computer Science & Engineering State University of New York at Buffalo Buffalo, NY, USA chandola@buffalo.edu Chandola@UB

More information

Grundlagen der Künstlichen Intelligenz

Grundlagen der Künstlichen Intelligenz Grundlagen der Künstlichen Intelligenz Uncertainty & Probabilities & Bandits Daniel Hennes 16.11.2017 (WS 2017/18) University Stuttgart - IPVS - Machine Learning & Robotics 1 Today Uncertainty Probability

More information

Review of Probability Mark Craven and David Page Computer Sciences 760.

Review of Probability Mark Craven and David Page Computer Sciences 760. Review of Probability Mark Craven and David Page Computer Sciences 760 www.biostat.wisc.edu/~craven/cs760/ Goals for the lecture you should understand the following concepts definition of probability random

More information

Course Introduction. Probabilistic Modelling and Reasoning. Relationships between courses. Dealing with Uncertainty. Chris Williams.

Course Introduction. Probabilistic Modelling and Reasoning. Relationships between courses. Dealing with Uncertainty. Chris Williams. Course Introduction Probabilistic Modelling and Reasoning Chris Williams School of Informatics, University of Edinburgh September 2008 Welcome Administration Handout Books Assignments Tutorials Course

More information

Review of probability

Review of probability Review of probability Computer Sciences 760 Spring 2014 http://pages.cs.wisc.edu/~dpage/cs760/ Goals for the lecture you should understand the following concepts definition of probability random variables

More information

Machine Learning. Probability Basics. Marc Toussaint University of Stuttgart Summer 2014

Machine Learning. Probability Basics. Marc Toussaint University of Stuttgart Summer 2014 Machine Learning Probability Basics Basic definitions: Random variables, joint, conditional, marginal distribution, Bayes theorem & examples; Probability distributions: Binomial, Beta, Multinomial, Dirichlet,

More information

Basics on Probability. Jingrui He 09/11/2007

Basics on Probability. Jingrui He 09/11/2007 Basics on Probability Jingrui He 09/11/2007 Coin Flips You flip a coin Head with probability 0.5 You flip 100 coins How many heads would you expect Coin Flips cont. You flip a coin Head with probability

More information

MA/ST 810 Mathematical-Statistical Modeling and Analysis of Complex Systems

MA/ST 810 Mathematical-Statistical Modeling and Analysis of Complex Systems MA/ST 810 Mathematical-Statistical Modeling and Analysis of Complex Systems Review of Basic Probability The fundamentals, random variables, probability distributions Probability mass/density functions

More information

Part IA Probability. Definitions. Based on lectures by R. Weber Notes taken by Dexter Chua. Lent 2015

Part IA Probability. Definitions. Based on lectures by R. Weber Notes taken by Dexter Chua. Lent 2015 Part IA Probability Definitions Based on lectures by R. Weber Notes taken by Dexter Chua Lent 2015 These notes are not endorsed by the lecturers, and I have modified them (often significantly) after lectures.

More information

Introduction to Probability and Statistics (Continued)

Introduction to Probability and Statistics (Continued) Introduction to Probability and Statistics (Continued) Prof. icholas Zabaras Center for Informatics and Computational Science https://cics.nd.edu/ University of otre Dame otre Dame, Indiana, USA Email:

More information

Probability Theory for Machine Learning. Chris Cremer September 2015

Probability Theory for Machine Learning. Chris Cremer September 2015 Probability Theory for Machine Learning Chris Cremer September 2015 Outline Motivation Probability Definitions and Rules Probability Distributions MLE for Gaussian Parameter Estimation MLE and Least Squares

More information

Chapter 5. Chapter 5 sections

Chapter 5. Chapter 5 sections 1 / 43 sections Discrete univariate distributions: 5.2 Bernoulli and Binomial distributions Just skim 5.3 Hypergeometric distributions 5.4 Poisson distributions Just skim 5.5 Negative Binomial distributions

More information

V7 Foundations of Probability Theory

V7 Foundations of Probability Theory V7 Foundations of Probability Theory Probability : degree of confidence that an event of an uncertain nature will occur. Events : we will assume that there is an agreed upon space of possible outcomes

More information

Human-Oriented Robotics. Probability Refresher. Kai Arras Social Robotics Lab, University of Freiburg Winter term 2014/2015

Human-Oriented Robotics. Probability Refresher. Kai Arras Social Robotics Lab, University of Freiburg Winter term 2014/2015 Probability Refresher Kai Arras, University of Freiburg Winter term 2014/2015 Probability Refresher Introduction to Probability Random variables Joint distribution Marginalization Conditional probability

More information

Lecture 1: Probability Fundamentals

Lecture 1: Probability Fundamentals Lecture 1: Probability Fundamentals IB Paper 7: Probability and Statistics Carl Edward Rasmussen Department of Engineering, University of Cambridge January 22nd, 2008 Rasmussen (CUED) Lecture 1: Probability

More information

PROBABILITY THEORY REVIEW

PROBABILITY THEORY REVIEW PROBABILITY THEORY REVIEW CMPUT 466/551 Martha White Fall, 2017 REMINDERS Assignment 1 is due on September 28 Thought questions 1 are due on September 21 Chapters 1-4, about 40 pages If you are printing,

More information

Fundamentals. CS 281A: Statistical Learning Theory. Yangqing Jia. August, Based on tutorial slides by Lester Mackey and Ariel Kleiner

Fundamentals. CS 281A: Statistical Learning Theory. Yangqing Jia. August, Based on tutorial slides by Lester Mackey and Ariel Kleiner Fundamentals CS 281A: Statistical Learning Theory Yangqing Jia Based on tutorial slides by Lester Mackey and Ariel Kleiner August, 2011 Outline 1 Probability 2 Statistics 3 Linear Algebra 4 Optimization

More information

NPFL108 Bayesian inference. Introduction. Filip Jurčíček. Institute of Formal and Applied Linguistics Charles University in Prague Czech Republic

NPFL108 Bayesian inference. Introduction. Filip Jurčíček. Institute of Formal and Applied Linguistics Charles University in Prague Czech Republic NPFL108 Bayesian inference Introduction Filip Jurčíček Institute of Formal and Applied Linguistics Charles University in Prague Czech Republic Home page: http://ufal.mff.cuni.cz/~jurcicek Version: 21/02/2014

More information

Naïve Bayes classification

Naïve Bayes classification Naïve Bayes classification 1 Probability theory Random variable: a variable whose possible values are numerical outcomes of a random phenomenon. Examples: A person s height, the outcome of a coin toss

More information

Introduction to probability and statistics

Introduction to probability and statistics Introduction to probability and statistics Alireza Fotuhi Siahpirani & Brittany Baur sroy@biostat.wisc.edu Computational Network Biology Biostatistics & Medical Informatics 826 https://compnetbiocourse.discovery.wisc.edu

More information

Probability and Estimation. Alan Moses

Probability and Estimation. Alan Moses Probability and Estimation Alan Moses Random variables and probability A random variable is like a variable in algebra (e.g., y=e x ), but where at least part of the variability is taken to be stochastic.

More information

Machine Learning using Bayesian Approaches

Machine Learning using Bayesian Approaches Machine Learning using Bayesian Approaches Sargur N. Srihari University at Buffalo, State University of New York 1 Outline 1. Progress in ML and PR 2. Fully Bayesian Approach 1. Probability theory Bayes

More information

Artificial Intelligence

Artificial Intelligence Artificial Intelligence Probabilities Marc Toussaint University of Stuttgart Winter 2018/19 Motivation: AI systems need to reason about what they know, or not know. Uncertainty may have so many sources:

More information

15-388/688 - Practical Data Science: Basic probability. J. Zico Kolter Carnegie Mellon University Spring 2018

15-388/688 - Practical Data Science: Basic probability. J. Zico Kolter Carnegie Mellon University Spring 2018 15-388/688 - Practical Data Science: Basic probability J. Zico Kolter Carnegie Mellon University Spring 2018 1 Announcements Logistics of next few lectures Final project released, proposals/groups due

More information

Random Variables and Their Distributions

Random Variables and Their Distributions Chapter 3 Random Variables and Their Distributions A random variable (r.v.) is a function that assigns one and only one numerical value to each simple event in an experiment. We will denote r.vs by capital

More information

Machine Learning. Bayes Basics. Marc Toussaint U Stuttgart. Bayes, probabilities, Bayes theorem & examples

Machine Learning. Bayes Basics. Marc Toussaint U Stuttgart. Bayes, probabilities, Bayes theorem & examples Machine Learning Bayes Basics Bayes, probabilities, Bayes theorem & examples Marc Toussaint U Stuttgart So far: Basic regression & classification methods: Features + Loss + Regularization & CV All kinds

More information

An Introduction to Bayesian Machine Learning

An Introduction to Bayesian Machine Learning 1 An Introduction to Bayesian Machine Learning José Miguel Hernández-Lobato Department of Engineering, Cambridge University April 8, 2013 2 What is Machine Learning? The design of computational systems

More information

PMR Learning as Inference

PMR Learning as Inference Outline PMR Learning as Inference Probabilistic Modelling and Reasoning Amos Storkey Modelling 2 The Exponential Family 3 Bayesian Sets School of Informatics, University of Edinburgh Amos Storkey PMR Learning

More information

Naïve Bayes classification. p ij 11/15/16. Probability theory. Probability theory. Probability theory. X P (X = x i )=1 i. Marginal Probability

Naïve Bayes classification. p ij 11/15/16. Probability theory. Probability theory. Probability theory. X P (X = x i )=1 i. Marginal Probability Probability theory Naïve Bayes classification Random variable: a variable whose possible values are numerical outcomes of a random phenomenon. s: A person s height, the outcome of a coin toss Distinguish

More information

Lecture 2: Repetition of probability theory and statistics

Lecture 2: Repetition of probability theory and statistics Algorithms for Uncertainty Quantification SS8, IN2345 Tobias Neckel Scientific Computing in Computer Science TUM Lecture 2: Repetition of probability theory and statistics Concept of Building Block: Prerequisites:

More information

System Simulation Part II: Mathematical and Statistical Models Chapter 5: Statistical Models

System Simulation Part II: Mathematical and Statistical Models Chapter 5: Statistical Models System Simulation Part II: Mathematical and Statistical Models Chapter 5: Statistical Models Fatih Cavdur fatihcavdur@uludag.edu.tr March 20, 2012 Introduction Introduction The world of the model-builder

More information

PATTERN RECOGNITION AND MACHINE LEARNING CHAPTER 2: PROBABILITY DISTRIBUTIONS

PATTERN RECOGNITION AND MACHINE LEARNING CHAPTER 2: PROBABILITY DISTRIBUTIONS PATTERN RECOGNITION AND MACHINE LEARNING CHAPTER 2: PROBABILITY DISTRIBUTIONS Parametric Distributions Basic building blocks: Need to determine given Representation: or? Recall Curve Fitting Binary Variables

More information

Cheng Soon Ong & Christian Walder. Canberra February June 2018

Cheng Soon Ong & Christian Walder. Canberra February June 2018 Cheng Soon Ong & Christian Walder Research Group and College of Engineering and Computer Science Canberra February June 2018 (Many figures from C. M. Bishop, "Pattern Recognition and ") 1of 143 Part IV

More information

MFM Practitioner Module: Quantitative Risk Management. John Dodson. September 23, 2015

MFM Practitioner Module: Quantitative Risk Management. John Dodson. September 23, 2015 MFM Practitioner Module: Quantitative Risk Management September 23, 2015 Mixtures Mixtures Mixtures Definitions For our purposes, A random variable is a quantity whose value is not known to us right now

More information

Learning Bayesian network : Given structure and completely observed data

Learning Bayesian network : Given structure and completely observed data Learning Bayesian network : Given structure and completely observed data Probabilistic Graphical Models Sharif University of Technology Spring 2017 Soleymani Learning problem Target: true distribution

More information

IE 230 Probability & Statistics in Engineering I. Closed book and notes. 60 minutes.

IE 230 Probability & Statistics in Engineering I. Closed book and notes. 60 minutes. Closed book and notes. 60 minutes. A summary table of some univariate continuous distributions is provided. Four Pages. In this version of the Key, I try to be more complete than necessary to receive full

More information

Lecture 17: The Exponential and Some Related Distributions

Lecture 17: The Exponential and Some Related Distributions Lecture 7: The Exponential and Some Related Distributions. Definition Definition: A continuous random variable X is said to have the exponential distribution with parameter if the density of X is e x if

More information

Probability and Information Theory. Sargur N. Srihari

Probability and Information Theory. Sargur N. Srihari Probability and Information Theory Sargur N. srihari@cedar.buffalo.edu 1 Topics in Probability and Information Theory Overview 1. Why Probability? 2. Random Variables 3. Probability Distributions 4. Marginal

More information

Lecture 2: Conjugate priors

Lecture 2: Conjugate priors (Spring ʼ) Lecture : Conjugate priors Julia Hockenmaier juliahmr@illinois.edu Siebel Center http://www.cs.uiuc.edu/class/sp/cs98jhm The binomial distribution If p is the probability of heads, the probability

More information

Why study probability? Set theory. ECE 6010 Lecture 1 Introduction; Review of Random Variables

Why study probability? Set theory. ECE 6010 Lecture 1 Introduction; Review of Random Variables ECE 6010 Lecture 1 Introduction; Review of Random Variables Readings from G&S: Chapter 1. Section 2.1, Section 2.3, Section 2.4, Section 3.1, Section 3.2, Section 3.5, Section 4.1, Section 4.2, Section

More information

Some Probability and Statistics

Some Probability and Statistics Some Probability and Statistics David M. Blei COS424 Princeton University February 13, 2012 Card problem There are three cards Red/Red Red/Black Black/Black I go through the following process. Close my

More information

Things to remember when learning probability distributions:

Things to remember when learning probability distributions: SPECIAL DISTRIBUTIONS Some distributions are special because they are useful They include: Poisson, exponential, Normal (Gaussian), Gamma, geometric, negative binomial, Binomial and hypergeometric distributions

More information

Bayesian Learning. HT2015: SC4 Statistical Data Mining and Machine Learning. Maximum Likelihood Principle. The Bayesian Learning Framework

Bayesian Learning. HT2015: SC4 Statistical Data Mining and Machine Learning. Maximum Likelihood Principle. The Bayesian Learning Framework HT5: SC4 Statistical Data Mining and Machine Learning Dino Sejdinovic Department of Statistics Oxford http://www.stats.ox.ac.uk/~sejdinov/sdmml.html Maximum Likelihood Principle A generative model for

More information

Aarti Singh. Lecture 2, January 13, Reading: Bishop: Chap 1,2. Slides courtesy: Eric Xing, Andrew Moore, Tom Mitchell

Aarti Singh. Lecture 2, January 13, Reading: Bishop: Chap 1,2. Slides courtesy: Eric Xing, Andrew Moore, Tom Mitchell Machine Learning 0-70/5 70/5-78, 78, Spring 00 Probability 0 Aarti Singh Lecture, January 3, 00 f(x) µ x Reading: Bishop: Chap, Slides courtesy: Eric Xing, Andrew Moore, Tom Mitchell Announcements Homework

More information

DEEP LEARNING CHAPTER 3 PROBABILITY & INFORMATION THEORY

DEEP LEARNING CHAPTER 3 PROBABILITY & INFORMATION THEORY DEEP LEARNING CHAPTER 3 PROBABILITY & INFORMATION THEORY OUTLINE 3.1 Why Probability? 3.2 Random Variables 3.3 Probability Distributions 3.4 Marginal Probability 3.5 Conditional Probability 3.6 The Chain

More information

Parameter Learning With Binary Variables

Parameter Learning With Binary Variables With Binary Variables University of Nebraska Lincoln CSCE 970 Pattern Recognition Outline Outline 1 Learning a Single Parameter 2 More on the Beta Density Function 3 Computing a Probability Interval Outline

More information

A Few Special Distributions and Their Properties

A Few Special Distributions and Their Properties A Few Special Distributions and Their Properties Econ 690 Purdue University Justin L. Tobias (Purdue) Distributional Catalog 1 / 20 Special Distributions and Their Associated Properties 1 Uniform Distribution

More information

Lecture 3. Probability - Part 2. Luigi Freda. ALCOR Lab DIAG University of Rome La Sapienza. October 19, 2016

Lecture 3. Probability - Part 2. Luigi Freda. ALCOR Lab DIAG University of Rome La Sapienza. October 19, 2016 Lecture 3 Probability - Part 2 Luigi Freda ALCOR Lab DIAG University of Rome La Sapienza October 19, 2016 Luigi Freda ( La Sapienza University) Lecture 3 October 19, 2016 1 / 46 Outline 1 Common Continuous

More information

Chapter 5 continued. Chapter 5 sections

Chapter 5 continued. Chapter 5 sections Chapter 5 sections Discrete univariate distributions: 5.2 Bernoulli and Binomial distributions Just skim 5.3 Hypergeometric distributions 5.4 Poisson distributions Just skim 5.5 Negative Binomial distributions

More information

Conditional distributions

Conditional distributions Conditional distributions Will Monroe July 6, 017 with materials by Mehran Sahami and Chris Piech Independence of discrete random variables Two random variables are independent if knowing the value of

More information

(Ch 3.4.1, 3.4.2, 4.1, 4.2, 4.3)

(Ch 3.4.1, 3.4.2, 4.1, 4.2, 4.3) 3 Probability Distributions (Ch 3.4.1, 3.4.2, 4.1, 4.2, 4.3) Probability Distribution Functions Probability distribution function (pdf): Function for mapping random variables to real numbers. Discrete

More information

Pattern Recognition and Machine Learning. Bishop Chapter 2: Probability Distributions

Pattern Recognition and Machine Learning. Bishop Chapter 2: Probability Distributions Pattern Recognition and Machine Learning Chapter 2: Probability Distributions Cécile Amblard Alex Kläser Jakob Verbeek October 11, 27 Probability Distributions: General Density Estimation: given a finite

More information

Massachusetts Institute of Technology

Massachusetts Institute of Technology 6.04/6.4: Probabilistic Systems Analysis Fall 00 Quiz Solutions: October, 00 Problem.. 0 points Let R i be the amount of time Stephen spends at the ith red light. R i is a Bernoulli random variable with

More information

Multivariate probability distributions and linear regression

Multivariate probability distributions and linear regression Multivariate probability distributions and linear regression Patrik Hoyer 1 Contents: Random variable, probability distribution Joint distribution Marginal distribution Conditional distribution Independence,

More information

Statistics for scientists and engineers

Statistics for scientists and engineers Statistics for scientists and engineers February 0, 006 Contents Introduction. Motivation - why study statistics?................................... Examples..................................................3

More information

Review of Probabilities and Basic Statistics

Review of Probabilities and Basic Statistics Alex Smola Barnabas Poczos TA: Ina Fiterau 4 th year PhD student MLD Review of Probabilities and Basic Statistics 10-701 Recitations 1/25/2013 Recitation 1: Statistics Intro 1 Overview Introduction to

More information

Brief Review of Probability

Brief Review of Probability Brief Review of Probability Nuno Vasconcelos (Ken Kreutz-Delgado) ECE Department, UCSD Probability Probability theory is a mathematical language to deal with processes or experiments that are non-deterministic

More information

RVs and their probability distributions

RVs and their probability distributions RVs and their probability distributions RVs and their probability distributions In these notes, I will use the following notation: The probability distribution (function) on a sample space will be denoted

More information

Lecture Notes 1 Probability and Random Variables. Conditional Probability and Independence. Functions of a Random Variable

Lecture Notes 1 Probability and Random Variables. Conditional Probability and Independence. Functions of a Random Variable Lecture Notes 1 Probability and Random Variables Probability Spaces Conditional Probability and Independence Random Variables Functions of a Random Variable Generation of a Random Variable Jointly Distributed

More information

Computational Cognitive Science

Computational Cognitive Science Computational Cognitive Science Lecture 9: Bayesian Estimation Chris Lucas (Slides adapted from Frank Keller s) School of Informatics University of Edinburgh clucas2@inf.ed.ac.uk 17 October, 2017 1 / 28

More information

Lecture Notes 1 Probability and Random Variables. Conditional Probability and Independence. Functions of a Random Variable

Lecture Notes 1 Probability and Random Variables. Conditional Probability and Independence. Functions of a Random Variable Lecture Notes 1 Probability and Random Variables Probability Spaces Conditional Probability and Independence Random Variables Functions of a Random Variable Generation of a Random Variable Jointly Distributed

More information

Random variables. DS GA 1002 Probability and Statistics for Data Science.

Random variables. DS GA 1002 Probability and Statistics for Data Science. Random variables DS GA 1002 Probability and Statistics for Data Science http://www.cims.nyu.edu/~cfgranda/pages/dsga1002_fall17 Carlos Fernandez-Granda Motivation Random variables model numerical quantities

More information

CS 630 Basic Probability and Information Theory. Tim Campbell

CS 630 Basic Probability and Information Theory. Tim Campbell CS 630 Basic Probability and Information Theory Tim Campbell 21 January 2003 Probability Theory Probability Theory is the study of how best to predict outcomes of events. An experiment (or trial or event)

More information

(Ch 3.4.1, 3.4.2, 4.1, 4.2, 4.3)

(Ch 3.4.1, 3.4.2, 4.1, 4.2, 4.3) 3 Probability Distributions (Ch 3.4.1, 3.4.2, 4.1, 4.2, 4.3) Probability Distribution Functions Probability distribution function (pdf): Function for mapping random variables to real numbers. Discrete

More information

Topic 2: Review of Probability Theory

Topic 2: Review of Probability Theory CS 8850: Advanced Machine Learning Fall 2017 Topic 2: Review of Probability Theory Instructor: Daniel L. Pimentel-Alarcón c Copyright 2017 2.1 Why Probability? Many (if not all) applications of machine

More information

Review: Probability. BM1: Advanced Natural Language Processing. University of Potsdam. Tatjana Scheffler

Review: Probability. BM1: Advanced Natural Language Processing. University of Potsdam. Tatjana Scheffler Review: Probability BM1: Advanced Natural Language Processing University of Potsdam Tatjana Scheffler tatjana.scheffler@uni-potsdam.de October 21, 2016 Today probability random variables Bayes rule expectation

More information

STAT2201. Analysis of Engineering & Scientific Data. Unit 3

STAT2201. Analysis of Engineering & Scientific Data. Unit 3 STAT2201 Analysis of Engineering & Scientific Data Unit 3 Slava Vaisman The University of Queensland School of Mathematics and Physics What we learned in Unit 2 (1) We defined a sample space of a random

More information

Recitation 2: Probability

Recitation 2: Probability Recitation 2: Probability Colin White, Kenny Marino January 23, 2018 Outline Facts about sets Definitions and facts about probability Random Variables and Joint Distributions Characteristics of distributions

More information

Introduction to Probability and Statistics (Continued)

Introduction to Probability and Statistics (Continued) Introduction to Probability and Statistics (Continued) Prof. icholas Zabaras Center for Informatics and Computational Science https://cics.nd.edu/ University of otre Dame otre Dame, Indiana, USA Email:

More information

Chapter 1. Sets and probability. 1.3 Probability space

Chapter 1. Sets and probability. 1.3 Probability space Random processes - Chapter 1. Sets and probability 1 Random processes Chapter 1. Sets and probability 1.3 Probability space 1.3 Probability space Random processes - Chapter 1. Sets and probability 2 Probability

More information

01 Probability Theory and Statistics Review

01 Probability Theory and Statistics Review NAVARCH/EECS 568, ROB 530 - Winter 2018 01 Probability Theory and Statistics Review Maani Ghaffari January 08, 2018 Last Time: Bayes Filters Given: Stream of observations z 1:t and action data u 1:t Sensor/measurement

More information

Single Maths B: Introduction to Probability

Single Maths B: Introduction to Probability Single Maths B: Introduction to Probability Overview Lecturer Email Office Homework Webpage Dr Jonathan Cumming j.a.cumming@durham.ac.uk CM233 None! http://maths.dur.ac.uk/stats/people/jac/singleb/ 1 Introduction

More information

5. Conditional Distributions

5. Conditional Distributions 1 of 12 7/16/2009 5:36 AM Virtual Laboratories > 3. Distributions > 1 2 3 4 5 6 7 8 5. Conditional Distributions Basic Theory As usual, we start with a random experiment with probability measure P on an

More information

Probabilistic Machine Learning

Probabilistic Machine Learning Probabilistic Machine Learning Bayesian Nets, MCMC, and more Marek Petrik 4/18/2017 Based on: P. Murphy, K. (2012). Machine Learning: A Probabilistic Perspective. Chapter 10. Conditional Independence Independent

More information

Random Variables. Definition: A random variable (r.v.) X on the probability space (Ω, F, P) is a mapping

Random Variables. Definition: A random variable (r.v.) X on the probability space (Ω, F, P) is a mapping Random Variables Example: We roll a fair die 6 times. Suppose we are interested in the number of 5 s in the 6 rolls. Let X = number of 5 s. Then X could be 0, 1, 2, 3, 4, 5, 6. X = 0 corresponds to the

More information

Algorithms for Uncertainty Quantification

Algorithms for Uncertainty Quantification Algorithms for Uncertainty Quantification Tobias Neckel, Ionuț-Gabriel Farcaș Lehrstuhl Informatik V Summer Semester 2017 Lecture 2: Repetition of probability theory and statistics Example: coin flip Example

More information

1: PROBABILITY REVIEW

1: PROBABILITY REVIEW 1: PROBABILITY REVIEW Marek Rutkowski School of Mathematics and Statistics University of Sydney Semester 2, 2016 M. Rutkowski (USydney) Slides 1: Probability Review 1 / 56 Outline We will review the following

More information

ECE 313 Probability with Engineering Applications Fall 2000

ECE 313 Probability with Engineering Applications Fall 2000 Exponential random variables Exponential random variables arise in studies of waiting times, service times, etc X is called an exponential random variable with parameter λ if its pdf is given by f(u) =

More information

Introduction to Probabilistic Machine Learning

Introduction to Probabilistic Machine Learning Introduction to Probabilistic Machine Learning Piyush Rai Dept. of CSE, IIT Kanpur (Mini-course 1) Nov 03, 2015 Piyush Rai (IIT Kanpur) Introduction to Probabilistic Machine Learning 1 Machine Learning

More information

Preliminary Statistics Lecture 2: Probability Theory (Outline) prelimsoas.webs.com

Preliminary Statistics Lecture 2: Probability Theory (Outline) prelimsoas.webs.com 1 School of Oriental and African Studies September 2015 Department of Economics Preliminary Statistics Lecture 2: Probability Theory (Outline) prelimsoas.webs.com Gujarati D. Basic Econometrics, Appendix

More information

Sample Spaces, Random Variables

Sample Spaces, Random Variables Sample Spaces, Random Variables Moulinath Banerjee University of Michigan August 3, 22 Probabilities In talking about probabilities, the fundamental object is Ω, the sample space. (elements) in Ω are denoted

More information

Bayesian RL Seminar. Chris Mansley September 9, 2008

Bayesian RL Seminar. Chris Mansley September 9, 2008 Bayesian RL Seminar Chris Mansley September 9, 2008 Bayes Basic Probability One of the basic principles of probability theory, the chain rule, will allow us to derive most of the background material in

More information

Some Probability and Statistics

Some Probability and Statistics Some Probability and Statistics David M. Blei COS424 Princeton University February 12, 2007 D. Blei ProbStat 01 1 / 42 Who wants to scribe? D. Blei ProbStat 01 2 / 42 Random variable Probability is about

More information

1 Random Variable: Topics

1 Random Variable: Topics Note: Handouts DO NOT replace the book. In most cases, they only provide a guideline on topics and an intuitive feel. 1 Random Variable: Topics Chap 2, 2.1-2.4 and Chap 3, 3.1-3.3 What is a random variable?

More information

Review of Probability. CS1538: Introduction to Simulations

Review of Probability. CS1538: Introduction to Simulations Review of Probability CS1538: Introduction to Simulations Probability and Statistics in Simulation Why do we need probability and statistics in simulation? Needed to validate the simulation model Needed

More information

Probability and Distributions

Probability and Distributions Probability and Distributions What is a statistical model? A statistical model is a set of assumptions by which the hypothetical population distribution of data is inferred. It is typically postulated

More information

Topic 2: Probability & Distributions. Road Map Probability & Distributions. ECO220Y5Y: Quantitative Methods in Economics. Dr.

Topic 2: Probability & Distributions. Road Map Probability & Distributions. ECO220Y5Y: Quantitative Methods in Economics. Dr. Topic 2: Probability & Distributions ECO220Y5Y: Quantitative Methods in Economics Dr. Nick Zammit University of Toronto Department of Economics Room KN3272 n.zammit utoronto.ca November 21, 2017 Dr. Nick

More information

Statistical Methods in Particle Physics

Statistical Methods in Particle Physics Statistical Methods in Particle Physics Lecture 3 October 29, 2012 Silvia Masciocchi, GSI Darmstadt s.masciocchi@gsi.de Winter Semester 2012 / 13 Outline Reminder: Probability density function Cumulative

More information

1 Uniform Distribution. 2 Gamma Distribution. 3 Inverse Gamma Distribution. 4 Multivariate Normal Distribution. 5 Multivariate Student-t Distribution

1 Uniform Distribution. 2 Gamma Distribution. 3 Inverse Gamma Distribution. 4 Multivariate Normal Distribution. 5 Multivariate Student-t Distribution A Few Special Distributions Their Properties Econ 675 Iowa State University November 1 006 Justin L Tobias (ISU Distributional Catalog November 1 006 1 / 0 Special Distributions Their Associated Properties

More information

THE QUEEN S UNIVERSITY OF BELFAST

THE QUEEN S UNIVERSITY OF BELFAST THE QUEEN S UNIVERSITY OF BELFAST 0SOR20 Level 2 Examination Statistics and Operational Research 20 Probability and Distribution Theory Wednesday 4 August 2002 2.30 pm 5.30 pm Examiners { Professor R M

More information

Brief Review of Probability

Brief Review of Probability Maura Department of Economics and Finance Università Tor Vergata Outline 1 Distribution Functions Quantiles and Modes of a Distribution 2 Example 3 Example 4 Distributions Outline Distribution Functions

More information

Discrete Binary Distributions

Discrete Binary Distributions Discrete Binary Distributions Carl Edward Rasmussen November th, 26 Carl Edward Rasmussen Discrete Binary Distributions November th, 26 / 5 Key concepts Bernoulli: probabilities over binary variables Binomial:

More information

1 Probability and Random Variables

1 Probability and Random Variables 1 Probability and Random Variables The models that you have seen thus far are deterministic models. For any time t, there is a unique solution X(t). On the other hand, stochastic models will result in

More information

1 Presessional Probability

1 Presessional Probability 1 Presessional Probability Probability theory is essential for the development of mathematical models in finance, because of the randomness nature of price fluctuations in the markets. This presessional

More information

Machine Learning CMPT 726 Simon Fraser University. Binomial Parameter Estimation

Machine Learning CMPT 726 Simon Fraser University. Binomial Parameter Estimation Machine Learning CMPT 726 Simon Fraser University Binomial Parameter Estimation Outline Maximum Likelihood Estimation Smoothed Frequencies, Laplace Correction. Bayesian Approach. Conjugate Prior. Uniform

More information

Chapter 3, 4 Random Variables ENCS Probability and Stochastic Processes. Concordia University

Chapter 3, 4 Random Variables ENCS Probability and Stochastic Processes. Concordia University Chapter 3, 4 Random Variables ENCS6161 - Probability and Stochastic Processes Concordia University ENCS6161 p.1/47 The Notion of a Random Variable A random variable X is a function that assigns a real

More information

Probability Distributions Columns (a) through (d)

Probability Distributions Columns (a) through (d) Discrete Probability Distributions Columns (a) through (d) Probability Mass Distribution Description Notes Notation or Density Function --------------------(PMF or PDF)-------------------- (a) (b) (c)

More information