Multivariate random variables


Multivariate random variables DS GA 1002 Statistical and Mathematical Models http://www.cims.nyu.edu/~cfgranda/pages/dsga1002_fall16 Carlos Fernandez-Granda

Joint distributions. A tool to characterize several uncertain numerical quantities of interest within the same probabilistic model. We can group the variables into a random vector X = (X_1, X_2, ..., X_n)^T.

Discrete random variables Continuous random variables Joint distributions of discrete and continuous random variables

Joint probability mass function. The joint pmf of X and Y is defined as

p_{X,Y}(x, y) := P(X = x, Y = y)

It is the probability of X and Y being equal to x and y respectively. By the definition of a probability measure,

p_{X,Y}(x, y) ≥ 0 for any x ∈ R_X, y ∈ R_Y,    Σ_{x ∈ R_X} Σ_{y ∈ R_Y} p_{X,Y}(x, y) = 1

Joint probability mass function. The joint pmf of a discrete random vector X is

p_X(x) := P(X_1 = x_1, X_2 = x_2, ..., X_n = x_n)

It is the probability of X being equal to x. By the definition of a probability measure,

p_X(x) ≥ 0,    Σ_{x_1 ∈ R_1} Σ_{x_2 ∈ R_2} ... Σ_{x_n ∈ R_n} p_X(x) = 1

Joint probability mass function. By the Law of Total Probability, for any set S ⊆ R_X × R_Y,

P((X, Y) ∈ S) = P( ∪_{(x,y) ∈ S} {X = x, Y = y} )   (union of disjoint events)
             = Σ_{(x,y) ∈ S} P(X = x, Y = y)
             = Σ_{(x,y) ∈ S} p_{X,Y}(x, y)

Similarly, for any discrete set S ⊆ R^n,

P(X ∈ S) = Σ_{x ∈ S} p_X(x)

Marginalization. To compute the marginal pmf of X from the joint pmf p_{X,Y}:

p_X(x) = P(X = x)
       = P( ∪_{y ∈ R_Y} {X = x, Y = y} )   (union of disjoint events)
       = Σ_{y ∈ R_Y} P(X = x, Y = y)
       = Σ_{y ∈ R_Y} p_{X,Y}(x, y)

This is called marginalizing over Y.

Marginalization. The marginal pmf of a subvector X_I, I ⊆ {1, 2, ..., n}, is

p_{X_I}(x_I) = Σ_{x_{j_1} ∈ R_{j_1}} Σ_{x_{j_2} ∈ R_{j_2}} ... Σ_{x_{j_{n−m}} ∈ R_{j_{n−m}}} p_X(x)

where {j_1, j_2, ..., j_{n−m}} := {1, 2, ..., n} \ I.

Conditional probability mass function. The conditional pmf of Y given X is

p_{Y|X}(y|x) = P(Y = y | X = x) = p_{X,Y}(x, y) / p_X(x),   as long as p_X(x) > 0

It is a valid pmf parametrized by x. Chain rule for discrete random variables:

p_{X,Y}(x, y) = p_X(x) p_{Y|X}(y|x)

Conditional probability mass function. The conditional pmf of a random subvector X_I, I ⊆ {1, 2, ..., n}, given the subvector X_J, J := {1, 2, ..., n} \ I, is

p_{X_I|X_J}(x_I|x_J) := p_X(x) / p_{X_J}(x_J)

Chain rule for discrete random vectors:

p_X(x) = p_{X_1}(x_1) p_{X_2|X_1}(x_2|x_1) ... p_{X_n|X_1,...,X_{n−1}}(x_n|x_1, ..., x_{n−1})
       = Π_{i=1}^{n} p_{X_i|X_{1,...,i−1}}(x_i | x_1, ..., x_{i−1})

Any order works!

Example: Flights and rain (continued). Probabilistic model for late arrivals at an airport:

P(late, no rain) = 2/20,   P(on time, no rain) = 14/20,
P(late, rain) = 3/20,      P(on time, rain) = 1/20

L = 1 if the plane is late, 0 otherwise
R = 1 if it rains, 0 otherwise

Example: Flights and rain (continued).

                 R = 0    R = 1  |  p_L    p_{L|R}(·|0)   p_{L|R}(·|1)
  L = 0          14/20    1/20   |  15/20     7/8             1/4
  L = 1           2/20    3/20   |   5/20     1/8             3/4
  -------------------------------
  p_R            16/20    4/20
  p_{R|L}(·|0)   14/15    1/15
  p_{R|L}(·|1)    2/5     3/5
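The marginals and conditionals in this example can be checked numerically; a minimal sketch in Python, using the pmf values from the example (exact arithmetic via `fractions`):

```python
from fractions import Fraction as F

# Joint pmf of (L, R) from the flights-and-rain example: p_{L,R}(l, r)
p_LR = {(0, 0): F(14, 20), (0, 1): F(1, 20),
        (1, 0): F(2, 20),  (1, 1): F(3, 20)}

# Marginalize over R to obtain p_L, and over L to obtain p_R
p_L = {l: sum(p for (l2, r), p in p_LR.items() if l2 == l) for l in (0, 1)}
p_R = {r: sum(p for (l, r2), p in p_LR.items() if r2 == r) for r in (0, 1)}

# Conditional pmf p_{L|R}(l|r) = p_{L,R}(l, r) / p_R(r)
p_L_given_R = {(l, r): p_LR[(l, r)] / p_R[r] for (l, r) in p_LR}

print(p_L[1])               # 1/4  (probability the plane is late)
print(p_L_given_R[(1, 1)])  # 3/4  (late given rain)
```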

Independence of discrete random variables. X and Y are independent if and only if

p_{X,Y}(x, y) = p_X(x) p_Y(y)   for all x ∈ R_X, y ∈ R_Y

Equivalently,

p_{X|Y}(x|y) = p_X(x),   p_{Y|X}(y|x) = p_Y(y)   for all x ∈ R_X, y ∈ R_Y

Mutually independent random variables. The n entries X_1, X_2, ..., X_n of a random vector X are mutually independent if and only if

p_X(x) = Π_{i=1}^{n} p_{X_i}(x_i)

Conditionally mutually independent random variables. The components of a subvector X_I, I ⊆ {1, 2, ..., n}, are conditionally mutually independent given another subvector X_J, J ⊆ {1, 2, ..., n}, if and only if

p_{X_I|X_J}(x_I|x_J) = Π_{i ∈ I} p_{X_i|X_J}(x_i|x_J)

Pairwise independence. X_1 and X_2 are outcomes of independent unbiased coin flips, and

X_3 = 1 if X_1 = X_2,  0 if X_1 ≠ X_2.

Are X_1, X_2 and X_3 independent?

Pairwise independence. X_1 and X_2 are independent by assumption. The pmf of X_3 is

p_{X_3}(1) = p_{X_1,X_2}(1, 1) + p_{X_1,X_2}(0, 0) = 1/2,
p_{X_3}(0) = p_{X_1,X_2}(0, 1) + p_{X_1,X_2}(1, 0) = 1/2

Pairwise independence. Are X_1 and X_3 independent?

p_{X_1,X_3}(0, 0) = p_{X_1,X_2}(0, 1) = 1/4 = p_{X_1}(0) p_{X_3}(0),
p_{X_1,X_3}(1, 0) = p_{X_1,X_2}(1, 0) = 1/4 = p_{X_1}(1) p_{X_3}(0),
p_{X_1,X_3}(0, 1) = p_{X_1,X_2}(0, 0) = 1/4 = p_{X_1}(0) p_{X_3}(1),
p_{X_1,X_3}(1, 1) = p_{X_1,X_2}(1, 1) = 1/4 = p_{X_1}(1) p_{X_3}(1)

Yes

Pairwise independence. X_1, X_2 and X_3 are pairwise independent. Are X_1, X_2 and X_3 mutually independent?

p_{X_1,X_2,X_3}(1, 1, 1) = P(X_1 = 1, X_2 = 1) = 1/4
p_{X_1}(1) p_{X_2}(1) p_{X_3}(1) = 1/8

No!
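The counterexample can be verified by brute force; a short sketch that enumerates the four equally likely outcomes of (X_1, X_2) and builds X_3 as the indicator of {X_1 = X_2}:

```python
from fractions import Fraction as F
from itertools import product

# Enumerate the four equally likely outcomes of two fair coin flips
outcomes = [(x1, x2, int(x1 == x2)) for x1, x2 in product((0, 1), repeat=2)]
prob = F(1, 4)  # each outcome of (X_1, X_2) has probability 1/4

def pmf(indices, values):
    """P(X_i = v_i for the chosen coordinates), computed by counting outcomes."""
    return sum(prob for o in outcomes
               if all(o[i] == v for i, v in zip(indices, values)))

# Pairwise independence: the joint pmf factorizes for every pair and value combo
pairs_ok = all(pmf((i, j), (a, b)) == pmf((i,), (a,)) * pmf((j,), (b,))
               for i in range(3) for j in range(3) if i < j
               for a in (0, 1) for b in (0, 1))

# Mutual independence fails: p(1,1,1) = 1/4, but the product of marginals is 1/8
mutual_ok = (pmf((0, 1, 2), (1, 1, 1))
             == pmf((0,), (1,)) * pmf((1,), (1,)) * pmf((2,), (1,)))

print(pairs_ok, mutual_ok)  # True False
```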

Discrete random variables Continuous random variables Joint distributions of discrete and continuous random variables

Continuous random variables. We consider events that are composed of unions of Cartesian products of intervals (Borel sets). The joint cumulative distribution function (cdf) of X and Y is

F_{X,Y}(x, y) := P(X ≤ x, Y ≤ y)

In words, the probability of X and Y being smaller than x and y respectively. The cdf of a random vector X is

F_X(x) := P(X_1 ≤ x_1, X_2 ≤ x_2, ..., X_n ≤ x_n)

Joint cumulative distribution function. Every joint cdf satisfies

lim_{x → −∞} F_{X,Y}(x, y) = 0,   lim_{y → −∞} F_{X,Y}(x, y) = 0,   lim_{x,y → ∞} F_{X,Y}(x, y) = 1

F_{X,Y}(x_1, y_1) ≤ F_{X,Y}(x_2, y_2) if x_1 ≤ x_2, y_1 ≤ y_2   (nondecreasing)

Joint cumulative distribution function. For any two-dimensional interval, by inclusion-exclusion,

P(x_1 < X ≤ x_2, y_1 < Y ≤ y_2)
  = F_{X,Y}(x_2, y_2) − F_{X,Y}(x_1, y_2) − F_{X,Y}(x_2, y_1) + F_{X,Y}(x_1, y_1)

The joint cdf completely characterizes the distribution of the random variables / random vector.

Joint probability density function. If the joint cdf is differentiable,

f_{X,Y}(x, y) := ∂²F_{X,Y}(x, y) / ∂x ∂y,    f_X(x) := ∂ⁿF_X(x) / ∂x_1 ∂x_2 ... ∂x_n

Joint probability density function. The probability of (X, Y) ∈ (x, x + Δ_x) × (y, y + Δ_y) for small Δ_x, Δ_y ≥ 0 is approximately f_{X,Y}(x, y) Δ_x Δ_y. It is a density, not a probability measure! From the monotonicity of the joint cdf,

f_{X,Y}(x, y) ≥ 0,    f_X(x) ≥ 0

Joint probability density function. For any Borel set S ⊆ R²,

P((X, Y) ∈ S) = ∫∫_S f_{X,Y}(x, y) dx dy

In particular,

∫_{x=−∞}^{∞} ∫_{y=−∞}^{∞} f_{X,Y}(x, y) dx dy = 1

Joint probability density function. For any Borel set S ⊆ Rⁿ,

P(X ∈ S) = ∫_S f_X(x) dx

In particular,

∫_{Rⁿ} f_X(x) dx = 1

Example: Triangle lake. [Figure: the triangular region, with labeled subregions A–F used to compute the joint cdf.]

Example: Triangle lake.

F_X(x) =
  0                                 if x_1 < 0 or x_2 < 0,
  2x_1 x_2                          if x_1 ≥ 0, x_2 ≥ 0, x_1 + x_2 ≤ 1,
  2x_1 + 2x_2 − x_1² − x_2² − 1     if x_1 ≤ 1, x_2 ≤ 1, x_1 + x_2 ≥ 1,
  2x_2 − x_2²                       if x_1 ≥ 1, 0 ≤ x_2 ≤ 1,
  2x_1 − x_1²                       if 0 ≤ x_1 ≤ 1, x_2 ≥ 1,
  1                                 if x_1 ≥ 1, x_2 ≥ 1

Marginalization. We can compute the marginal cdf from the joint cdf or from the joint pdf:

F_X(x) = P(X ≤ x) = lim_{y → ∞} F_{X,Y}(x, y)
F_X(x) = P(X ≤ x) = ∫_{u=−∞}^{x} ∫_{y=−∞}^{∞} f_{X,Y}(u, y) dy du

Differentiating, we obtain

f_X(x) = ∫_{y=−∞}^{∞} f_{X,Y}(x, y) dy

Marginalization. The marginal pdf of a subvector X_I, I := {i_1, i_2, ..., i_m}, is

f_{X_I}(x_I) = ∫_{x_{j_1}} ∫_{x_{j_2}} ... ∫_{x_{j_{n−m}}} f_X(x) dx_{j_1} dx_{j_2} ... dx_{j_{n−m}}

where {j_1, j_2, ..., j_{n−m}} := {1, 2, ..., n} \ I.

Example: Triangle lake (continued). Marginal cdf of X_1:

F_{X_1}(x_1) = lim_{x_2 → ∞} F_X(x) =  0 if x_1 < 0;   2x_1 − x_1² if 0 ≤ x_1 ≤ 1;   1 if x_1 ≥ 1

Marginal pdf of X_1:

f_{X_1}(x_1) = dF_{X_1}(x_1)/dx_1 = 2(1 − x_1) if 0 ≤ x_1 ≤ 1, and 0 otherwise

Joint conditional cdf and pdf given an event. If we know that (X, Y) ∈ S for some Borel set S ⊆ R²,

F_{X,Y|(X,Y)∈S}(x, y) := P(X ≤ x, Y ≤ y | (X, Y) ∈ S)
  = P(X ≤ x, Y ≤ y, (X, Y) ∈ S) / P((X, Y) ∈ S)
  = ∫∫_{u ≤ x, v ≤ y, (u,v) ∈ S} f_{X,Y}(u, v) du dv / ∫∫_{(u,v) ∈ S} f_{X,Y}(u, v) du dv

f_{X,Y|(X,Y)∈S}(x, y) := ∂²F_{X,Y|(X,Y)∈S}(x, y) / ∂x ∂y

Conditional cdf and pdf. Distribution of Y given X = x? The event has zero probability! Define

f_{Y|X}(y|x) := f_{X,Y}(x, y) / f_X(x),   if f_X(x) > 0

F_{Y|X}(y|x) := ∫_{u=−∞}^{y} f_{Y|X}(u|x) du

Chain rule for continuous random variables:

f_{X,Y}(x, y) = f_X(x) f_{Y|X}(y|x)

Conditional cdf and pdf.

f_X(x) = lim_{Δ_x → 0} P(x ≤ X ≤ x + Δ_x) / Δ_x

f_{X,Y}(x, y) = lim_{Δ_x, Δ_y → 0} P(x ≤ X ≤ x + Δ_x, y ≤ Y ≤ y + Δ_y) / (Δ_x Δ_y)

Conditional cdf and pdf. The definition is justified by conditioning on a shrinking interval around x:

F_{Y|X}(y|x) = lim_{Δ_x → 0} P(Y ≤ y | x ≤ X ≤ x + Δ_x)
  = lim_{Δ_x → 0} P(x ≤ X ≤ x + Δ_x, Y ≤ y) / P(x ≤ X ≤ x + Δ_x)
  = lim_{Δ_x → 0} ∫_{u=−∞}^{y} f_{X,Y}(x, u) Δ_x du / (f_X(x) Δ_x)
  = ∫_{u=−∞}^{y} f_{X,Y}(x, u) / f_X(x) du

Conditional pdf of a random subvector. The conditional pdf of a random subvector X_I, I ⊆ {1, 2, ..., n}, given the subvector X_{{1,...,n}\I} is

f_{X_I|X_{{1,...,n}\I}}(x_I | x_{{1,...,n}\I}) := f_X(x) / f_{X_{{1,...,n}\I}}(x_{{1,...,n}\I})

Chain rule for continuous random vectors:

f_X(x) = f_{X_1}(x_1) f_{X_2|X_1}(x_2|x_1) ... f_{X_n|X_1,...,X_{n−1}}(x_n|x_1, ..., x_{n−1})
       = Π_{i=1}^{n} f_{X_i|X_{1,...,i−1}}(x_i | x_1, ..., x_{i−1})

Any order works!

Example: Triangle lake (continued). Conditioned on {X_1 = 0.75}, what are the pdf and cdf of X_2?

f_{X_2|X_1}(x_2|x_1) = f_X(x) / f_{X_1}(x_1) = 2 / (2(1 − x_1)) = 1/(1 − x_1),   0 ≤ x_2 ≤ 1 − x_1

F_{X_2|X_1}(x_2|x_1) = ∫_{u=0}^{x_2} f_{X_2|X_1}(u|x_1) du = x_2 / (1 − x_1)

Example: Desert. A car travels through the desert.

Time until the car breaks down: T
State of the motor: M
State of the road: R

Model:
M uniform between 0 (no problem) and 1 (very bad)
R uniform between 0 (no problem) and 1 (very bad)
M and R independent
T exponential with parameter M + R

Example: Desert. Joint pdf?

f_{M,R,T}(m, r, t) = f_M(m) f_{R|M}(r|m) f_{T|M,R}(t|m, r)
  = f_M(m) f_R(r) f_{T|M,R}(t|m, r)   (by independence of M and R)
  = (m + r) e^{−(m+r)t} for t ≥ 0, 0 ≤ m ≤ 1, 0 ≤ r ≤ 1, and 0 otherwise

Example: Desert. The car breaks down after 15 min (0.25 h), T = 0.25, and the road seems OK, R = 0.2. What was the state of the motor M?

f_{M|R,T}(m|r, t) = f_{M,R,T}(m, r, t) / f_{R,T}(r, t)

Marginalizing over M,

f_{R,T}(r, t) = ∫_{m=0}^{1} f_{M,R,T}(m, r, t) dm
  = e^{−tr} ( ∫_{m=0}^{1} m e^{−tm} dm + r ∫_{m=0}^{1} e^{−tm} dm )
  = e^{−tr} ( (1 − (1 + t) e^{−t}) / t² + r (1 − e^{−t}) / t )
  = (e^{−tr} / t²) (1 + tr − e^{−t}(1 + t + tr))   for t ≥ 0, 0 ≤ r ≤ 1

Therefore,

f_{M|R,T}(m|r, t) = (m + r) e^{−(m+r)t} t² e^{tr} / (1 + tr − e^{−t}(1 + t + tr))
  = (m + r) t² e^{−tm} / (1 + tr − e^{−t}(1 + t + tr))

Plugging in r = 0.2 and t = 0.25,

f_{M|R,T}(m|0.2, 0.25) = (m + 0.2) · 0.25² e^{−0.25m} / (1 + 0.25 · 0.2 − e^{−0.25}(1 + 0.25 + 0.25 · 0.2))
  ≈ 1.66 (m + 0.2) e^{−0.25m}   for 0 ≤ m ≤ 1
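A quick numerical sanity check (a sketch, not from the slides): the posterior density of M should integrate to one over m ∈ [0, 1], and the constant 1.66 quoted in the example is t²/(1 + tr − e^{−t}(1 + t + tr)) evaluated at r = 0.2, t = 0.25. Using only the standard library:

```python
import math

def posterior(m, r=0.2, t=0.25):
    """f_{M|R,T}(m|r,t) = (m + r) t^2 e^{-t m} / (1 + t r - e^{-t}(1 + t + t r))."""
    norm = 1 + t * r - math.exp(-t) * (1 + t + t * r)
    return (m + r) * t ** 2 * math.exp(-t * m) / norm

# Midpoint-rule integral of the posterior over m in [0, 1]
n = 10_000
total = sum(posterior((k + 0.5) / n) for k in range(n)) / n
print(round(total, 4))  # ≈ 1.0

# The normalizing constant quoted on the slide
const = 0.25 ** 2 / (1 + 0.05 - math.exp(-0.25) * 1.3)
print(round(const, 2))  # 1.66
```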

State of the car. [Figure: plot of the posterior pdf f_{M|R,T}(m|0.2, 0.25) for m ∈ [0, 1].]

Independent continuous random variables. Two random variables X and Y are independent if and only if

F_{X,Y}(x, y) = F_X(x) F_Y(y)   for all (x, y) ∈ R²

Equivalently,

F_{X|Y}(x|y) = F_X(x),   F_{Y|X}(y|x) = F_Y(y)   for all (x, y) ∈ R²

Independent continuous random variables. Two random variables X and Y with joint pdf f_{X,Y} are independent if and only if

f_{X,Y}(x, y) = f_X(x) f_Y(y)   for all (x, y) ∈ R²

Equivalently,

f_{X|Y}(x|y) = f_X(x),   f_{Y|X}(y|x) = f_Y(y)   for all (x, y) ∈ R²

Mutually independent continuous random variables. The components of a random vector X are mutually independent if and only if

F_X(x) = Π_{i=1}^{n} F_{X_i}(x_i)

Equivalently,

f_X(x) = Π_{i=1}^{n} f_{X_i}(x_i)

Mutually conditionally independent random variables. The components of a subvector X_I, I ⊆ {1, 2, ..., n}, are mutually conditionally independent given another subvector X_J, J ⊆ {1, 2, ..., n}, if and only if

F_{X_I|X_J}(x_I|x_J) = Π_{i ∈ I} F_{X_i|X_J}(x_i|x_J)

Equivalently,

f_{X_I|X_J}(x_I|x_J) = Π_{i ∈ I} f_{X_i|X_J}(x_i|x_J)

Functions of random variables. Let U = g(X, Y) and V = h(X, Y). Then

F_{U,V}(u, v) = P(U ≤ u, V ≤ v) = P(g(X, Y) ≤ u, h(X, Y) ≤ v)
  = ∫∫_{{(x,y) : g(x,y) ≤ u, h(x,y) ≤ v}} f_{X,Y}(x, y) dx dy

Sum of independent random variables. X and Y are independent random variables; what is the pdf of Z = X + Y?

F_Z(z) = P(X + Y ≤ z)
  = ∫_{y=−∞}^{∞} ∫_{x=−∞}^{z−y} f_X(x) f_Y(y) dx dy
  = ∫_{y=−∞}^{∞} F_X(z − y) f_Y(y) dy

f_Z(z) = d/dz lim_{u → ∞} ∫_{y=−u}^{u} F_X(z − y) f_Y(y) dy
  = ∫_{y=−∞}^{∞} f_X(z − y) f_Y(y) dy

The convolution of the individual pdfs.

Example: Coffee beans. A company buys coffee beans from two producers.

Beans from Colombia: C tons/year
Beans from Vietnam: V tons/year

Model:
C uniform between 0 and 1
V uniform between 0 and 2
C and V independent

What is the distribution of the total amount of beans B = C + V?

Example: Coffee beans.

f_B(b) = ∫_{u=−∞}^{∞} f_C(b − u) f_V(u) du = (1/2) ∫_{u=0}^{2} f_C(b − u) du

  = (1/2) ∫_{u=0}^{b} du = b/2          if 0 ≤ b ≤ 1
  = (1/2) ∫_{u=b−1}^{b} du = 1/2        if 1 ≤ b ≤ 2
  = (1/2) ∫_{u=b−1}^{2} du = (3 − b)/2  if 2 ≤ b ≤ 3
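The convolution can also be carried out numerically; a minimal sketch that discretizes f_C (uniform on [0, 1]) and f_V (uniform on [0, 2]) and checks the result against the piecewise formula:

```python
def f_C(x):
    """Uniform pdf on [0, 1]."""
    return 1.0 if 0 <= x <= 1 else 0.0

def f_V(x):
    """Uniform pdf on [0, 2]."""
    return 0.5 if 0 <= x <= 2 else 0.0

def f_B(b, n=2000):
    """Riemann-sum approximation of the convolution integral over u in [0, 2]."""
    h = 2 / n
    return sum(f_C(b - (k + 0.5) * h) * f_V((k + 0.5) * h) for k in range(n)) * h

# Compare with the piecewise result: b/2 on [0,1], 1/2 on [1,2], (3-b)/2 on [2,3]
for b, exact in [(0.5, 0.25), (1.5, 0.5), (2.5, 0.25)]:
    print(b, round(f_B(b), 3), exact)
```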

Example: Coffee beans. [Figure: the uniform pdfs f_C and f_V, and the resulting trapezoidal pdf f_B on [0, 3].]

Gaussian random vector. A Gaussian random vector X has a joint pdf of the form

f_X(x) = (1 / √((2π)ⁿ |Σ|)) exp( −(1/2)(x − μ)ᵀ Σ⁻¹ (x − μ) )

where the mean μ ∈ Rⁿ and the covariance matrix Σ ∈ Rⁿˣⁿ is symmetric and positive definite.

Linear transformation of Gaussian random vectors. Let X be a Gaussian random vector of dimension n with mean μ and covariance matrix Σ. For any matrix A ∈ R^{m×n} and vector b ∈ R^m,

Y = AX + b

is Gaussian with mean Aμ + b and covariance matrix AΣAᵀ.
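The mean and covariance formulas can be applied directly; a small sketch with hypothetical numbers (the μ, Σ, A and b below are illustrative, not from the slides):

```python
# Illustrative 2-d example: Y = A X + b with hypothetical parameters
mu = [1.0, 2.0]                    # mean of X
Sigma = [[2.0, 0.5], [0.5, 1.0]]   # covariance of X (symmetric positive definite)
A = [[1.0, 1.0], [0.0, 3.0]]
b = [0.0, -1.0]

def matvec(M, v):
    return [sum(M[i][j] * v[j] for j in range(len(v))) for i in range(len(M))]

def matmul(M, N):
    return [[sum(M[i][k] * N[k][j] for k in range(len(N)))
             for j in range(len(N[0]))] for i in range(len(M))]

def transpose(M):
    return [list(row) for row in zip(*M)]

# Y is Gaussian with mean A mu + b and covariance A Sigma A^T
mean_Y = [m + bi for m, bi in zip(matvec(A, mu), b)]
cov_Y = matmul(matmul(A, Sigma), transpose(A))

print(mean_Y)  # [3.0, 5.0]
print(cov_Y)   # [[4.0, 4.5], [4.5, 9.0]]
```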

Marginal distributions are Gaussian. Let Z := (X; Y) be a Gaussian random vector with mean μ := (μ_X; μ_Y) and covariance matrix

Σ_Z = [ Σ_X      Σ_{XY} ]
      [ Σ_{XY}ᵀ  Σ_Y    ]

Then X is a Gaussian random vector with mean μ_X and covariance matrix Σ_X.

Marginal distributions are Gaussian. [Figure: a bivariate Gaussian pdf f_{X,Y} together with its Gaussian marginals f_X and f_Y.]

Discrete random variables Continuous random variables Joint distributions of discrete and continuous random variables

Discrete and continuous random variables. How do we model the relation between a continuous random variable C and a discrete random variable D? Via the conditional cdf and pdf of C given D:

F_{C|D}(c|d) := P(C ≤ c | D = d),    f_{C|D}(c|d) := dF_{C|D}(c|d)/dc

By the Law of Total Probability,

F_C(c) = Σ_{d ∈ R_D} p_D(d) F_{C|D}(c|d),    f_C(c) = Σ_{d ∈ R_D} p_D(d) f_{C|D}(c|d)

Mixture models. Data are drawn from a continuous distribution whose parameters are chosen from a discrete set. An important example: Gaussian mixture models.

Grizzlies in Yellowstone. Model for the weight of grizzly bears in Yellowstone:

Males: Gaussian with μ := 240 kg and σ := 40 kg
Females: Gaussian with μ := 140 kg and σ := 20 kg
There are about the same number of females and males

Grizzlies in Yellowstone. The distribution of the weight of all bears can be modeled as a Gaussian mixture with two random variables: S (sex) and W (weight):

f_W(w) = Σ_{s=0}^{1} p_S(s) f_{W|S}(w|s)
  = (1 / (2√(2π))) ( e^{−(w−240)²/3200} / 40 + e^{−(w−140)²/800} / 20 )

Grizzlies in Yellowstone. [Figure: the conditional pdfs f_{W|S}(·|0) and f_{W|S}(·|1) and the mixture pdf f_W, for weights between 0 and 400 kg.]

Continuous and discrete random variables. Conditional pmf of D given C? The event {C = c} has zero probability! Define

p_{D|C}(d|c) := lim_{Δ → 0} P(D = d, c ≤ C ≤ c + Δ) / P(c ≤ C ≤ c + Δ)

By the Law of Total Probability and a limit argument,

p_D(d) = ∫_{c=−∞}^{∞} f_C(c) p_{D|C}(d|c) dc

Bayesian coin flip. Bayesian methods often endow the parameters of discrete distributions with a continuous marginal distribution. You suspect a coin is biased. You are uncertain about the bias B, so you model it as a random variable with pdf

f_B(b) = 2b for b ∈ [0, 1]

What is the probability of heads?

Bayesian coin flip. [Figure: the prior pdf f_B(b) = 2b on [0, 1].]

Bayesian coin flip.

p_X(1) = ∫_{b=−∞}^{∞} f_B(b) p_{X|B}(1|b) db = ∫_{b=0}^{1} 2b² db = 2/3
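The integral can be checked numerically; a small sketch where X = 1 denotes heads and p_{X|B}(1|b) = b by definition of the bias:

```python
# Prior on the bias: f_B(b) = 2b on [0, 1]
def f_B(b):
    return 2 * b if 0 <= b <= 1 else 0.0

# Midpoint-rule approximation of p_X(1) = integral of f_B(b) * b over [0, 1]
n = 100_000
p_heads = sum(f_B((k + 0.5) / n) * ((k + 0.5) / n) for k in range(n)) / n
print(round(p_heads, 6))  # ≈ 0.666667 = 2/3
```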

Chain rule for continuous and discrete random variables.

p_D(d) f_{C|D}(c|d) = p_D(d) lim_{Δ → 0} P(c ≤ C ≤ c + Δ | D = d) / Δ
  = lim_{Δ → 0} P(D = d, c ≤ C ≤ c + Δ) / Δ
  = lim_{Δ → 0} (P(c ≤ C ≤ c + Δ) / Δ) · P(D = d | c ≤ C ≤ c + Δ)
  = f_C(c) p_{D|C}(d|c)

Grizzlies in Yellowstone. You spot a grizzly that weighs about 180 kg. What is the probability that it is male?

p_{S|W}(0|180) = p_S(0) f_{W|S}(180|0) / f_W(180)
  = (1/40) e^{−60²/3200} / ( (1/40) e^{−60²/3200} + (1/20) e^{−40²/800} )
  ≈ 0.545
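This posterior can be reproduced directly from the mixture model; a short sketch using only the standard library (S = 0 encodes male, as in the example):

```python
import math

def normal_pdf(x, mu, sigma):
    """Gaussian pdf with mean mu and standard deviation sigma."""
    return math.exp(-(x - mu) ** 2 / (2 * sigma ** 2)) / (sigma * math.sqrt(2 * math.pi))

# Mixture: equal prior on S (0 = male, 1 = female)
p_S = {0: 0.5, 1: 0.5}
params = {0: (240, 40), 1: (140, 20)}  # (mu, sigma) in kg

def f_W(w):
    """Mixture pdf f_W(w) = sum_s p_S(s) f_{W|S}(w|s)."""
    return sum(p_S[s] * normal_pdf(w, *params[s]) for s in p_S)

# Posterior probability of male given a 180 kg bear (Bayes rule)
post_male = p_S[0] * normal_pdf(180, *params[0]) / f_W(180)
print(round(post_male, 3))  # 0.545
```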

Bayesian coin flip. The coin flip is tails. What is the distribution of the bias now?

f_{B|X}(b|0) = f_B(b) p_{X|B}(0|b) / p_X(0) = 2b(1 − b) / (1/3) = 6b(1 − b)
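The posterior update can be checked numerically; a minimal sketch where tails corresponds to X = 0, so p_{X|B}(0|b) = 1 − b:

```python
# Prior f_B(b) = 2b on [0, 1]; likelihood of tails given bias b is 1 - b
def prior(b):
    return 2 * b if 0 <= b <= 1 else 0.0

def unnormalized_posterior(b):
    return prior(b) * (1 - b)

# Normalizing constant p_X(0) by midpoint rule; analytically it equals 1/3
n = 100_000
p_tails = sum(unnormalized_posterior((k + 0.5) / n) for k in range(n)) / n

def posterior(b):
    return unnormalized_posterior(b) / p_tails

print(round(p_tails, 4))         # 0.3333
print(round(posterior(0.5), 4))  # ≈ 1.5, matching 6b(1 - b) at b = 0.5
```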

Bayesian coin flip. [Figure: the prior f_B(b) = 2b and the posterior f_{B|X}(b|0) = 6b(1 − b) on [0, 1].]