CS145: Probability & Computing


CS145: Probability & Computing
Lecture 10: Continuous Bayes Rule, Joint and Marginal Probability Densities
Instructor: Eli Upfal, Brown University Computer Science
Figure credits: Bertsekas & Tsitsiklis, Introduction to Probability, 2008; Pitman, Probability, 1999

CS145: Lecture 10 Outline
- Bayes Rule: Classification from Continuous Data
- Joint and Marginal Probability Densities

Continuous Random Variables
- For any discrete random variable, the CDF is discontinuous and piecewise constant.
- If the CDF is monotonically increasing and continuous, we have a continuous random variable:
  $0 \le F_X(x) \le 1$, $F_X(x_2) \ge F_X(x_1)$ if $x_2 > x_1$, $\lim_{x \to -\infty} F_X(x) = 0$, $\lim_{x \to +\infty} F_X(x) = 1$.
- The probability that a continuous random variable $X$ lies in the interval $(x_1, x_2]$ is then
  $P(x_1 < X \le x_2) = F_X(x_2) - F_X(x_1)$.
[Figure: CDF $F_X(x)$ plotted against $x$, with $a$ and $b$ marked on the axis.]

Probability Density Function (PDF)
- If the CDF is differentiable, its first derivative is called the probability density function (PDF):
  $f_X(x) = \frac{dF_X(x)}{dx} = F'_X(x)$, and conversely $F_X(x) = \int_{-\infty}^{x} f_X(t)\,dt$.
- By the fundamental theorem of calculus:
  $\int_{x_1}^{x_2} f_X(x)\,dx = F_X(x_2) - F_X(x_1) = P(x_1 < X \le x_2)$.
- For any valid PDF: $f_X(x) \ge 0$ and $\int_{-\infty}^{+\infty} f_X(x)\,dx = 1$.
[Figure: a PDF $f_X(x)$ supported on $[a, b]$ and the corresponding CDF.]

Classification Problems
Athitsos et al., CVPR 2004 & PAMI 2008
- Which of the 10 digits did a person write by hand?
- Which of the basic ASL phonemes did a person sign?
- Is an email spam or not spam (ham)?
- Is this image taken in an indoor or outdoor environment?
- Is a pedestrian visible from a self-driving car's camera?
- What language is a webpage or document written in?
- How many stars would a user rate a movie that they've never seen?

Continuous & Discrete Variables
$X \to$ discrete random variable with PMF $p_X(x)$: $p_X(x) \ge 0$, $\sum_x p_X(x) = 1$.
Example: Probability of each category in a classification problem.
$Y \to$ continuous random variable with conditional PDF $f_{Y|X}(y \mid x)$: $f_{Y|X}(y \mid x) \ge 0$, $\int_{-\infty}^{+\infty} f_{Y|X}(y \mid x)\,dy = 1$ for all $x$.
Example: Output of a temperature sensor, motion sensor, camera, etc. For each possible $X = x$, the distribution of this sensor is different.

Marginal Distributions are Mixtures
$X \to$ discrete random variable with PMF $p_X(x)$: $p_X(x) \ge 0$, $\sum_x p_X(x) = 1$.
$Y \to$ continuous random variable with conditional PDF $f_{Y|X}(y \mid x)$: $f_{Y|X}(y \mid x) \ge 0$, $\int_{-\infty}^{+\infty} f_{Y|X}(y \mid x)\,dy = 1$ for all $x$.
Marginal Distribution: Summing over all possible values of $X$, the marginal CDF (PDF) is a mixture of the conditional CDFs (PDFs):
$F_Y(y) = P(Y \le y) = \sum_x P(X = x \cap Y \le y) = \sum_x p_X(x) P(Y \le y \mid X = x) = \sum_x p_X(x) F_{Y|X}(y \mid x)$
- Then taking the derivative of each term in the sum: $f_Y(y) = \sum_x p_X(x) f_{Y|X}(y \mid x)$
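As a rough illustration of the mixture formula, the following Python sketch evaluates $f_Y(y) = \sum_x p_X(x) f_{Y|X}(y \mid x)$ for a two-class example. The exponential conditionals and the rates 1.0 and 5.0 are assumptions chosen only for illustration, and the helper names are hypothetical.

```python
import math

def mixture_pdf(y, prior, cond_pdfs):
    """Marginal density f_Y(y) = sum_x p_X(x) * f_{Y|X}(y | x)."""
    return sum(prior[x] * cond_pdfs[x](y) for x in prior)

def exponential_pdf(rate):
    """Return the density y -> rate * exp(-rate * y) for y >= 0."""
    return lambda y: rate * math.exp(-rate * y) if y >= 0 else 0.0

# Hypothetical two-class example (values chosen for illustration only).
prior = {0: 0.9, 1: 0.1}
cond_pdfs = {0: exponential_pdf(1.0), 1: exponential_pdf(5.0)}

# Midpoint Riemann sum over [0, 40]: a valid mixture density integrates to about 1.
step = 0.001
total = sum(mixture_pdf((k + 0.5) * step, prior, cond_pdfs) * step
            for k in range(40000))
```

Because each conditional density integrates to 1 and the prior sums to 1, `total` should come out very close to 1.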

Example: Hard Drive Lifetimes
- Suppose 90% of hard drives in some laptop computer model have an exponentially distributed lifetime with parameter $\lambda_0$:
  $f_{Y|X}(y \mid 0) = \lambda_0 e^{-\lambda_0 y}$, $p_X(0) = 0.9$
- However, 10% of hard drives have a manufacturing defect that gives them a shorter lifetime:
  $f_{Y|X}(y \mid 1) = \lambda_1 e^{-\lambda_1 y}$, $p_X(1) = 0.1$, $\lambda_1 > \lambda_0$
- Recall the mean of the exponential distribution: $E[Y \mid X = 0] = \frac{1}{\lambda_0} > \frac{1}{\lambda_1} = E[Y \mid X = 1]$
Exponential Distributions: $F_Y(y) = 1 - e^{-\lambda y}$, $f_Y(y) = \lambda e^{-\lambda y}$

Example: Hard Drive Lifetimes
- Setup as before: non-defective drives ($X = 0$, $p_X(0) = 0.9$) have rate $\lambda_0$; defective drives ($X = 1$, $p_X(1) = 0.1$) have rate $\lambda_1 > \lambda_0$.
- If your hard drive has operated for $t$ seconds and has not yet failed, what is the probability it is defective?
  $P(X = 1 \mid Y > t) = \frac{P(Y > t \mid X = 1) P(X = 1)}{P(Y > t)} = \frac{0.1\, e^{-\lambda_1 t}}{0.1\, e^{-\lambda_1 t} + 0.9\, e^{-\lambda_0 t}}$
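The survival posterior above is easy to evaluate numerically. This is a minimal sketch; the function name and the example rates used in the note below are assumptions, not values from the lecture.

```python
import math

def prob_defective_given_survival(t, rate_ok, rate_defective, p_defective=0.1):
    """P(X = 1 | Y > t), using P(Y > t | X = x) = exp(-rate_x * t)."""
    num = p_defective * math.exp(-rate_defective * t)
    den = num + (1.0 - p_defective) * math.exp(-rate_ok * t)
    return num / den
```

At `t = 0` the function returns the prior 0.1. With `rate_defective > rate_ok` (for instance the hypothetical rates 5.0 and 1.0), the value decreases as `t` grows: the longer a drive survives, the less likely it is to be defective.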

Example: Hard Drive Lifetimes
- Setup as before: non-defective drives ($X = 0$, $p_X(0) = 0.9$) have rate $\lambda_0$; defective drives ($X = 1$, $p_X(1) = 0.1$) have rate $\lambda_1 > \lambda_0$.
- If your hard drive fails after exactly $t$ seconds of operation, what is the probability it is defective?
  $P(X = 1 \mid Y = t) = \frac{P(Y = t \mid X = 1) P(X = 1)}{P(Y = t)}$
Problem: For a continuous variable $Y$, $P(Y = t) = 0$, so this ratio is undefined.

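Discrete Inference from Continuous Data
$X \to$ discrete random variable with PMF $p_X(x)$
$Y \to$ continuous random variable with conditional PDF $f_{Y|X}(y \mid x)$
- $Y$ has a different PDF for each possible discrete $X = x$
- Given $Y = y$, we want to find the probability of each possible $X = x$
- But how can we condition on an event of probability zero?
Condition on a small interval of width $\delta$ and let $\delta \to 0$ (L'Hôpital's rule):
$P(X = x \mid Y = y) \approx P(X = x \mid y \le Y \le y + \delta) = \frac{p_X(x) P(y \le Y \le y + \delta \mid X = x)}{P(y \le Y \le y + \delta)} \approx \frac{p_X(x) f_{Y|X}(y \mid x)\,\delta}{f_Y(y)\,\delta} = \frac{p_X(x) f_{Y|X}(y \mid x)}{f_Y(y)}$
The denominator can be evaluated using a version of the total probability theorem:
$f_Y(y) = \sum_x p_X(x) f_{Y|X}(y \mid x)$
[Figure: PDF $f_Y(y)$ with a strip of width $\delta$ between $y$ and $y + \delta$.]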
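The limiting argument can be checked numerically. The sketch below compares the interval-conditioned posterior $P(X = x \mid y \le Y \le y + \delta)$ with the density-based posterior. It assumes exponential conditionals, and the rates and function names are hypothetical choices made for illustration.

```python
import math

def interval_posterior(y, delta, prior, rates):
    """P(X = x | y <= Y <= y + delta) for exponential conditionals."""
    # P(y <= Y <= y + delta | X = x) = exp(-rate*y) - exp(-rate*(y + delta))
    joint = {x: prior[x] * (math.exp(-rates[x] * y)
                            - math.exp(-rates[x] * (y + delta)))
             for x in prior}
    total = sum(joint.values())
    return {x: joint[x] / total for x in joint}

def density_posterior(y, prior, rates):
    """p_{X|Y}(x | y) = p_X(x) f_{Y|X}(y | x) / f_Y(y)."""
    joint = {x: prior[x] * rates[x] * math.exp(-rates[x] * y) for x in prior}
    total = sum(joint.values())   # f_Y(y) by total probability
    return {x: joint[x] / total for x in joint}

# Hypothetical parameters, for illustration only.
prior = {0: 0.9, 1: 0.1}
rates = {0: 1.0, 1: 5.0}
```

As `delta` shrinks, `interval_posterior(y, delta, prior, rates)` approaches `density_posterior(y, prior, rates)`, which is what the approximation on the slide claims.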

Example: Hard Drive Lifetimes
- Setup as before: non-defective drives ($X = 0$, $p_X(0) = 0.9$) have rate $\lambda_0$; defective drives ($X = 1$, $p_X(1) = 0.1$) have rate $\lambda_1 > \lambda_0$.
- If your hard drive fails after exactly $t$ seconds of operation, what is the probability it is defective?
  $P(X = 1 \mid Y = t) = \frac{f_{Y|X}(t \mid 1)\, p_X(1)}{f_Y(t)} = \frac{0.1\, \lambda_1 e^{-\lambda_1 t}}{0.1\, \lambda_1 e^{-\lambda_1 t} + 0.9\, \lambda_0 e^{-\lambda_0 t}}$
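A short sketch of this density-based posterior follows. The function name and the example rates in the note are assumptions made for illustration.

```python
import math

def prob_defective_given_failure_time(t, rate_ok, rate_defective, p_defective=0.1):
    """P(X = 1 | Y = t), using densities f_{Y|X}(t | x) = rate_x * exp(-rate_x * t)."""
    num = p_defective * rate_defective * math.exp(-rate_defective * t)
    den = num + (1.0 - p_defective) * rate_ok * math.exp(-rate_ok * t)
    return num / den
```

With the hypothetical rates `rate_ok = 1.0` and `rate_defective = 5.0`, a failure at `t = 0` gives 0.5 / 1.4, well above the prior 0.1, because early failures are much more likely for defective drives. For large `t` the posterior drops below the prior.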

Classification from Continuous Data
Suppose $X$ is discrete and $Y$ is continuous, and we observe $Y = y$.
Prior probability mass function for $X$: $p_X(x)$
Conditional probability density function for $Y$: $f_{Y|X}(y \mid x)$
Posterior probability mass function of $X$ given $Y = y$:
$p_{X|Y}(x \mid y) = \frac{p_X(x) f_{Y|X}(y \mid x)}{f_Y(y)}$, where $f_Y(y) = \sum_x p_X(x) f_{Y|X}(y \mid x)$
[Figure: scatter plot of weight versus height (red = female, blue = male), with the class-conditional densities $f_{Y|X}(y \mid 1)$, $f_{Y|X}(y \mid 0)$ and the marginal $f_Y(y)$ over height.]
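The posterior formula can be sketched in a few lines of Python. The Gaussian class-conditional densities and the means and standard deviations below are assumptions made for illustration. They are not read off the lecture's height data.

```python
import math

def gaussian_pdf(y, mean, std):
    """Density of a normal distribution with the given mean and standard deviation."""
    z = (y - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))

def posterior(y, prior, params):
    """Return p_{X|Y}(x | y) for every class x, via Bayes' rule with densities."""
    joint = {x: prior[x] * gaussian_pdf(y, *params[x]) for x in prior}
    f_y = sum(joint.values())   # marginal density f_Y(y)
    return {x: joint[x] / f_y for x in joint}

# Hypothetical height model in inches: class -> (mean, std).
prior = {0: 0.5, 1: 0.5}
params = {0: (70.0, 3.0), 1: (64.5, 2.5)}
post = posterior(67.0, prior, params)
```

The returned dictionary always sums to 1, and observations far into one class's range (for example 75 or 60 under these assumed parameters) give a posterior close to 1 for that class.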

Moments of Continuous Data
Suppose $X$ is discrete and $Y$ is continuous.
Prior probability mass function for $X$: $p_X(x)$
Conditional probability density function for $Y$: $f_{Y|X}(y \mid x)$
Marginal density: $f_Y(y) = \sum_x p_X(x) f_{Y|X}(y \mid x)$
Marginal expected value of random variable $Y$:
$E[Y] = \int_{-\infty}^{+\infty} y f_Y(y)\,dy = \sum_x p_X(x) \int_{-\infty}^{+\infty} y f_{Y|X}(y \mid x)\,dy = \sum_x p_X(x) E[Y \mid X = x]$
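As a small worked instance of $E[Y] = \sum_x p_X(x) E[Y \mid X = x]$, the sketch below uses exponential conditionals, for which $E[Y \mid X = x] = 1/\lambda_x$. The rates are hypothetical and the function name is an assumption.

```python
def marginal_mean(prior, conditional_means):
    """E[Y] = sum_x p_X(x) * E[Y | X = x]."""
    return sum(prior[x] * conditional_means[x] for x in prior)

# Hard-drive style example with hypothetical rates: E[Y | X = x] = 1 / rate_x.
prior = {0: 0.9, 1: 0.1}
rates = {0: 1.0, 1: 5.0}
mean_lifetime = marginal_mean(prior, {x: 1.0 / rates[x] for x in rates})
```

Here `mean_lifetime` is 0.9 * 1.0 + 0.1 * 0.2 = 0.92.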

Mixed Continuous/Discrete Distributions
Distribution of random variable $X$:
- With probability 0.5, $X$ has a continuous uniform distribution on $[0, 1]$
- With probability 0.5, $X = 0.5$
Representations of the distribution of $X$:
- The CDF $F_X(x) = P(X \le x)$ is (always) well defined
- Formally, a PDF does not exist in this case
- Informally, we can illustrate the distribution via a hybrid of a probability density function (PDF) and a probability mass function (PMF)
- Related concepts in physics & engineering: Dirac delta function, impulse response
[Figure: schematic drawing of a combination of a PDF (height 1/2 on $[0, 1]$) and a PMF (mass 1/2 at $x = 1/2$), and the corresponding CDF, which jumps from 1/4 to 3/4 at $x = 1/2$.]
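The CDF of this mixed distribution can be written directly as a 50/50 mixture of a Uniform[0, 1] CDF and a step at 0.5. A minimal sketch (the function name is an assumption):

```python
def mixed_cdf(x):
    """CDF of X: with prob. 0.5 uniform on [0, 1], with prob. 0.5 equal to 0.5."""
    uniform_part = min(max(x, 0.0), 1.0)     # CDF of Uniform[0, 1]
    atom_part = 1.0 if x >= 0.5 else 0.0     # CDF of the point mass at 0.5
    return 0.5 * uniform_part + 0.5 * atom_part
```

Just below 0.5 the value is close to 1/4, and at 0.5 it is 3/4. The jump of size 1/2 is the probability mass at that point.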

CS145: Lecture 10 Outline
- Bayes Rule: Classification from Continuous Data
- Joint and Marginal Probability Densities

Joint Probability Distributions
Definition: The joint distribution function of $X$ and $Y$ is
$F(x, y) = \Pr(X \le x, Y \le y)$.
The variables $X$ and $Y$ have joint density function $f$ if for all $x, y$,
$F(x, y) = \int_{-\infty}^{y} \int_{-\infty}^{x} f(u, v)\,du\,dv$,
and
$f(x, y) = \frac{\partial^2 F(x, y)}{\partial x\,\partial y}$
when the derivative exists.
For small $\delta$: $P(x \le X \le x + \delta,\ y \le Y \le y + \delta) \approx f_{X,Y}(x, y)\,\delta^2$

Example: Joint Distribution
$f_{XY}(x, y) = 5!\, x (y - x)(1 - y)$ for $0 < x < y < 1$
$P(X > 0.25, Y > 0.5) = \int_{0.5}^{1} \int_{0.25}^{y} f_{XY}(x, y)\,dx\,dy$
$\int_{0}^{1} \int_{0}^{y} x (y - x)(1 - y)\,dx\,dy = \frac{1}{5!}$
(Pitman's Probability, 1999)
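Both integrals in this example can be approximated with a simple midpoint Riemann sum over the unit square. This is a sketch, not an efficient integrator. The function names and grid size are assumptions.

```python
def joint_pdf(x, y):
    """f_{XY}(x, y) = 5! x (y - x)(1 - y) on 0 < x < y < 1, else 0."""
    return 120.0 * x * (y - x) * (1.0 - y) if 0.0 < x < y < 1.0 else 0.0

def integrate(x_min, y_min, n=400):
    """Midpoint Riemann sum of joint_pdf over {x > x_min, y > y_min} in the unit square."""
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        x = (i + 0.5) * h
        if x <= x_min:
            continue
        for j in range(n):
            y = (j + 0.5) * h
            if y > y_min:
                total += joint_pdf(x, y) * h * h
    return total
```

`integrate(0.0, 0.0)` should be close to 1, confirming the normalizing constant $5!$. `integrate(0.25, 0.5)` approximates $P(X > 0.25, Y > 0.5)$. Carrying out the polynomial integration by hand gives $37/64 \approx 0.578$.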

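2D Uniform Distributions
Density function:
$f(x, y) = \begin{cases} 1 & (x, y) \in [0, 1]^2 \\ 0 & \text{otherwise.} \end{cases}$
Probability distribution (for $x, y \ge 0$):
$F(x, y) = \Pr(X \le x, Y \le y) = \int_{0}^{\min[1, x]} \int_{0}^{\min[1, y]} 1\,dv\,du$
$F(x, y) = \begin{cases} 0 & \text{if } x < 0 \text{ or } y < 0 \\ xy & \text{if } (x, y) \in [0, 1]^2 \\ x & \text{if } 0 \le x \le 1,\ y > 1 \\ y & \text{if } 0 \le y \le 1,\ x > 1 \\ 1 & \text{if } x > 1,\ y > 1. \end{cases}$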

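Marginal Distributions
Definition: Given a joint distribution function $F_{X,Y}(x, y)$, the marginal distribution function of $X$ is
$F_X(x) = \Pr(X \le x) = \int_{-\infty}^{x} \int_{-\infty}^{\infty} f_{X,Y}(u, y)\,dy\,du$
and the corresponding marginal density function is
$f_X(x) = \int_{-\infty}^{\infty} f_{X,Y}(x, y)\,dy$.
Example: Uniform Distributions
$f_{X,Y}(x, y) = \begin{cases} 1 & (x, y) \in [0, 1]^2 \\ 0 & \text{otherwise,} \end{cases}$
$f_X(x) = \int_{-\infty}^{\infty} f_{X,Y}(x, y)\,dy = \begin{cases} \int_0^1 1\,dy = 1 & x \in [0, 1] \\ 0 & \text{otherwise.} \end{cases}$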
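Integrating out $y$ can be approximated numerically. The sketch below applies a midpoint Riemann sum to the uniform density on the unit square and, as a second check, to the density $5!\,x(y - x)(1 - y)$ from the earlier joint distribution example. The function names and grid size are assumptions.

```python
def marginal_x(joint_pdf, x, y_low=0.0, y_high=1.0, n=2000):
    """Approximate f_X(x) = integral of f_{X,Y}(x, y) dy with a midpoint Riemann sum."""
    h = (y_high - y_low) / n
    return sum(joint_pdf(x, y_low + (k + 0.5) * h) for k in range(n)) * h

def uniform_square_pdf(x, y):
    """Uniform density on the unit square."""
    return 1.0 if 0.0 <= x <= 1.0 and 0.0 <= y <= 1.0 else 0.0

def order_stat_pdf(x, y):
    """5! x (y - x)(1 - y) on 0 < x < y < 1, else 0."""
    return 120.0 * x * (y - x) * (1.0 - y) if 0.0 < x < y < 1.0 else 0.0
```

For the uniform density the marginal is 1 on $[0, 1]$ and 0 elsewhere. For the second density, integrating by hand gives $f_X(x) = 20\,x(1 - x)^3$ on $(0, 1)$, which the numerical marginal should reproduce.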

Independence
Definition: The random variables $X$ and $Y$ are independent if for all $x, y$,
$\Pr((X \le x) \cap (Y \le y)) = \Pr(X \le x) \Pr(Y \le y)$.
Two random variables are independent if and only if
$F(x, y) = F_X(x) F_Y(y)$,
or equivalently, when the joint density exists,
$f(x, y) = f_X(x) f_Y(y)$.
If $X$ and $Y$ are independent then $E[XY] = E[X] E[Y]$.

Classification Problems
Athitsos et al., CVPR 2004 & PAMI 2008
- Which of the 10 digits did a person write by hand?
- Which of the basic ASL phonemes did a person sign?
- Is this image taken in an indoor or outdoor environment?
- Is a pedestrian visible from a self-driving car's camera?
If the raw data is a list of $M$ continuous features (like pixel intensities), scale models to high-dimensional data via independence assumptions:
$f_{Y|X}(y_1, y_2, \ldots, y_M \mid x) = \prod_{i=1}^{M} f_{Y_i|X}(y_i \mid x)$
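Under this conditional independence assumption, the posterior over classes can be computed by summing per-feature log densities. The sketch below assumes Gaussian per-feature densities. That choice, the function names, and the parameter layout (a list of `(mean, std)` pairs per class) are illustrative assumptions, not something the slides specify.

```python
import math

def log_gaussian_pdf(y, mean, std):
    """Log density of a normal distribution."""
    z = (y - mean) / std
    return -0.5 * z * z - math.log(std * math.sqrt(2.0 * math.pi))

def naive_bayes_posterior(features, prior, params):
    """Posterior over classes, assuming features are independent given X."""
    # log p_X(x) + sum_i log f_{Y_i|X}(y_i | x); logs avoid underflow for large M.
    log_joint = {
        x: math.log(prior[x]) + sum(log_gaussian_pdf(y, m, s)
                                    for y, (m, s) in zip(features, params[x]))
        for x in prior
    }
    shift = max(log_joint.values())   # subtract the max before exponentiating
    unnorm = {x: math.exp(v - shift) for x, v in log_joint.items()}
    total = sum(unnorm.values())
    return {x: u / total for x, u in unnorm.items()}
```

Working in log space matters once $M$ is large (for example, thousands of pixels), because the product of many small densities underflows in floating point.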

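Buffon's Needle
- Parallel lines at distance $d$
- Needle of length $\ell$ (assume $\ell < d$)
- Find $P(\text{needle intersects one of the lines})$
- $X \in [0, d/2]$: distance of the needle midpoint to the nearest line
- $\Theta \in [0, \pi/2]$: acute angle between the needle and the lines
- Model: $X$, $\Theta$ uniform, independent:
  $f_{X,\Theta}(x, \theta) = \frac{4}{\pi d}$ for $0 \le x \le d/2$, $0 \le \theta \le \pi/2$
- Intersect if $X \le \frac{\ell}{2} \sin\Theta$
$P\left(X \le \frac{\ell}{2} \sin\Theta\right) = \iint_{x \le \frac{\ell}{2} \sin\theta} f_X(x) f_\Theta(\theta)\,dx\,d\theta = \frac{4}{\pi d} \int_{0}^{\pi/2} \int_{0}^{(\ell/2) \sin\theta} dx\,d\theta = \frac{4}{\pi d} \int_{0}^{\pi/2} \frac{\ell}{2} \sin\theta\,d\theta = \frac{2\ell}{\pi d}$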
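The result $2\ell / (\pi d)$ can be checked with a Monte Carlo simulation of the same model: sample $X$ uniformly on $[0, d/2]$ and $\Theta$ uniformly on $[0, \pi/2]$, and count how often $X \le (\ell/2)\sin\Theta$. The function names, trial count, and seed are assumptions made for this sketch.

```python
import math
import random

def buffon_exact(length, spacing):
    """Closed-form intersection probability 2*l / (pi*d), valid for l < d."""
    return 2.0 * length / (math.pi * spacing)

def buffon_monte_carlo(length, spacing, trials=200_000, seed=0):
    """Estimate the intersection probability by sampling X and Theta independently."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        x = rng.uniform(0.0, spacing / 2.0)        # midpoint distance to nearest line
        theta = rng.uniform(0.0, math.pi / 2.0)    # acute angle with the lines
        if x <= (length / 2.0) * math.sin(theta):
            hits += 1
    return hits / trials
```

For example, with `length = 1.0` and `spacing = 2.0` the exact value is $1/\pi$, and the estimate should agree to within sampling error.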