Jointly Distributed Random Variables


Jointly Distributed Random Variables CE 311S

What if there is more than one random variable we are interested in? How should you invest the extra money from your summer internship?

To simplify matters, imagine there are two mutual funds you are thinking about investing in. Fund A invests in cryptocurrency; Fund B invests in municipal bonds. Neither of these funds has a guaranteed rate of return, so we can use probability distributions to describe them.

Based on your investing experience, you believe that Fund A's annual rate of return will be +75% with probability 0.2, +20% with probability 0.5, and -50% with probability 0.3. Fund B has an annual rate of return of +10% with probability 0.6, and -5% with probability 0.4. However, investments are not independent of one another: when the economy is strong, most assets will increase in value; in a recession, most assets will decrease in value.

We can describe this information in a table showing the probability of seeing each combination of rates of return:

 B \ A     +75%    +20%    -50%    Sum
 +10%      0.10    0.45    0.05    0.6
 -5%       0.10    0.05    0.25    0.4
 Sum       0.20    0.50    0.30    1

Each entry in the table shows the probability of a particular combination of rates of return. This is called the joint probability mass function (or joint distribution) of A and B.

 B \ A     +75%    +20%    -50%    Sum
 +10%      0.10    0.45    0.05    0.6
 -5%       0.10    0.05    0.25    0.4
 Sum       0.20    0.50    0.30    1

Some things to notice about the table: each value is nonnegative, and all the values in the table add up to 1. The sum of the values in the first row gives the probability that B = +10% when we aren't looking at A. The sum of the values in each column gives the probability mass function for A when we aren't looking at B.

In general, if X and Y are any two discrete random variables, the joint probability mass function P_XY(x, y) is the probability of seeing both X = x and Y = y. To be a valid joint PMF, we need P_XY(x, y) ≥ 0 for all x and y, and Σ_x Σ_y P_XY(x, y) = 1. The marginal PMF of X gives us the distribution of X when we aren't concerned with Y:

P_X(x) = Σ_y P_XY(x, y)

Likewise, the marginal PMF of Y is

P_Y(y) = Σ_x P_XY(x, y)
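As a concrete illustration (not part of the original slides), here is a minimal Python sketch that checks validity and computes both marginals for the funds' joint PMF above:

```python
from collections import defaultdict

# Joint PMF of the two funds: P_AB[(a, b)] = P(A = a and B = b),
# with rates of return expressed in percent.
P_AB = {
    (75, 10): 0.10, (20, 10): 0.45, (-50, 10): 0.05,
    (75, -5): 0.10, (20, -5): 0.05, (-50, -5): 0.25,
}

assert all(p >= 0 for p in P_AB.values())        # nonnegative entries
assert abs(sum(P_AB.values()) - 1) < 1e-12       # entries sum to 1

P_A, P_B = defaultdict(float), defaultdict(float)
for (a, b), p in P_AB.items():
    P_A[a] += p   # summing over b gives the marginal of A
    P_B[b] += p   # summing over a gives the marginal of B

print(dict(P_A))  # {75: 0.2, 20: 0.5, -50: 0.3}
print(dict(P_B))  # {10: 0.6, -5: 0.4}
```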

 B \ A     +75%    +20%    -50%    Sum
 +10%      0.10    0.45    0.05    0.6
 -5%       0.10    0.05    0.25    0.4
 p_A       0.20    0.50    0.30    1

The marginal PMF of A in this table is given by the column sums.

 B \ A     +75%    +20%    -50%    p_B
 +10%      0.10    0.45    0.05    0.6
 -5%       0.10    0.05    0.25    0.4
 p_A       0.20    0.50    0.30    1

The marginal PMF of B is given by the row sums.

The joint CDF is written F_XY(x, y) = P(X ≤ x and Y ≤ y), and is the sum of P_XY(x', y') over all points with x' ≤ x and y' ≤ y. To use the joint CDF to solve problems, it is helpful to draw a picture to see what areas need to be added and subtracted.

(Figure from Pishro-Nik)

Two discrete random variables X and Y are independent if P_XY(x, y) = P_X(x) P_Y(y) for every possible value of x and y. If this fails (even for one pair of values x and y), they are dependent.

 B \ A     +75%    +20%    -50%    p_B
 +10%      0.10    0.45    0.05    0.6
 -5%       0.10    0.05    0.25    0.4
 p_A       0.20    0.50    0.30    1

Are funds A and B independent? (No: P_AB(+75%, +10%) = 0.10, but P_A(+75%) P_B(+10%) = 0.20 × 0.6 = 0.12.)

Let R be the number of heads on two coin flips, and S be the number of heads on the next three coin flips.

 R \ S     0       1       2       3       p_R
 0         1/32    3/32    3/32    1/32    1/4
 1         1/16    3/16    3/16    1/16    1/2
 2         1/32    3/32    3/32    1/32    1/4
 p_S       1/8     3/8     3/8     1/8     1

Are R and S independent? How did I get the values for P_RS(r, s)?
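On the second question: R counts two flips and S counts the next three, so they are built from disjoint flips and are independent, and each entry is the product P_R(r) P_S(s) of the binomial marginals. A short sketch (not from the slides) that reconstructs the table this way:

```python
from math import comb

# Marginals: R ~ Binomial(2, 1/2), S ~ Binomial(3, 1/2)
P_R = {r: comb(2, r) / 2**2 for r in range(3)}
P_S = {s: comb(3, s) / 2**3 for s in range(4)}

# R and S use disjoint sets of flips, so they are independent and the
# joint PMF is the product of the marginals:
P_RS = {(r, s): P_R[r] * P_S[s] for r in P_R for s in P_S}

print(P_RS[(1, 1)])  # 1/2 * 3/8 = 3/16, matching the table
```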

The expected value of any function h(X, Y) is

E[h(X, Y)] = Σ_x Σ_y h(x, y) P_XY(x, y)

 B \ A     +75%    +20%    -50%    p_B
 +10%      0.10    0.45    0.05    0.6
 -5%       0.10    0.05    0.25    0.4
 p_A       0.20    0.50    0.30    1

Let's say I invest my money equally in the two funds. What is the expected average rate of return E[(A + B)/2]?

To calculate this expected value, you can create two tables: one with the values of (A + B)/2, and the other with the probabilities.

(A + B)/2:

 B \ A     +75%     +20%     -50%
 +10%      +42.5%   +15%     -20%
 -5%       +35%     +7.5%    -27.5%

Probabilities:

 B \ A     +75%    +20%    -50%
 +10%      0.10    0.45    0.05
 -5%       0.10    0.05    0.25

Multiply corresponding values and add: (42.5)(0.10) + (15)(0.45) + ... + (-27.5)(0.25) = 7, so the expected average rate of return is 7%.
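The same computation as a short Python sketch (not from the slides; P_AB is the joint PMF table written as a dictionary):

```python
P_AB = {(75, 10): 0.10, (20, 10): 0.45, (-50, 10): 0.05,
        (75, -5): 0.10, (20, -5): 0.05, (-50, -5): 0.25}

# E[h(A, B)] = sum of h(a, b) * P(A = a, B = b) over all cells
expected_avg = sum((a + b) / 2 * p for (a, b), p in P_AB.items())
print(expected_avg)  # 7.0, i.e. an expected average rate of return of 7%
```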

What are E[A] and E[B]?

Probabilities:

 B \ A     +75%    +20%    -50%
 +10%      0.10    0.45    0.05
 -5%       0.10    0.05    0.25

A:

 B \ A     +75%    +20%    -50%
 +10%      +75%    +20%    -50%
 -5%       +75%    +20%    -50%

Probabilities:

 B \ A     +75%    +20%    -50%
 +10%      0.10    0.45    0.05
 -5%       0.10    0.05    0.25

Multiply corresponding values and add: (75)(0.10) + (20)(0.45) + ... + (-50)(0.25) = 10, so E[A] = 10%. (You would get the same answer by calculating E[A] the usual way, from the marginal distribution.)

B:

 B \ A     +75%    +20%    -50%
 +10%      +10%    +10%    +10%
 -5%       -5%     -5%     -5%

Probabilities:

 B \ A     +75%    +20%    -50%
 +10%      0.10    0.45    0.05
 -5%       0.10    0.05    0.25

Multiply corresponding values and add: (10)(0.10) + (10)(0.45) + ... + (-5)(0.25) = 4, so E[B] = 4%. (You would get the same answer by calculating E[B] the usual way, from the marginal distribution.)

MULTIPLE CONTINUOUS RANDOM VARIABLES

All of the same concepts can be applied to jointly continuous random variables as well. The joint density function f_XY(x, y) is valid if f_XY(x, y) ≥ 0 for all x and y, and if

∫_{-∞}^{∞} ∫_{-∞}^{∞} f_XY(x, y) dy dx = 1

The marginal density functions are

f_X(x) = ∫_{-∞}^{∞} f_XY(x, y) dy    and    f_Y(y) = ∫_{-∞}^{∞} f_XY(x, y) dx

X and Y are independent if f_XY(x, y) = f_X(x) f_Y(y) for all x and y, and

E[h(X, Y)] = ∫_{-∞}^{∞} ∫_{-∞}^{∞} h(x, y) f_XY(x, y) dy dx
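These conditions can also be checked numerically. Here is a minimal Python sketch (not from the original slides; it assumes SciPy and NumPy are installed), using the factorable density f(x, y) = e^{-x-y} on x, y > 0 as an example:

```python
import numpy as np
from scipy.integrate import dblquad, quad

f = lambda y, x: np.exp(-x - y)  # dblquad integrates f(y, x) dy dx

total, _ = dblquad(f, 0, np.inf, 0, np.inf)
print(total)  # ~1.0, so f is a valid joint density

# Marginal of X evaluated at x = 0.5: integrate out y
fX_half, _ = quad(lambda y: np.exp(-0.5 - y), 0, np.inf)
print(fX_half)  # ~exp(-0.5), since X and Y are independent Exp(1) here
```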

Continuous joint random variables are similar to the discrete case, but let's go through some examples. (Key ideas: replace the table of masses with a density function of multiple variables; replace sums with integrals.)

You have a new laptop. Let X represent the time (in years) before the hard drive fails, and Y the time (in years) before your keyboard breaks. Let the joint PDF of X and Y be f_XY(x, y) = k e^{-3x-2y} for x > 0 and y > 0, and 0 otherwise. What is k?
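For reference, k must make the density integrate to 1 over the first quadrant, and the integrand factors:

∫_0^∞ ∫_0^∞ k e^{-3x} e^{-2y} dy dx = k (∫_0^∞ e^{-3x} dx)(∫_0^∞ e^{-2y} dy) = k (1/3)(1/2) = k/6

so k = 6.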

So f_XY(x, y) = 6e^{-3x-2y} for x > 0, y > 0. What is the marginal distribution of X? (This gives us the PDF for the failure time of the hard drive, irrespective of the failure time of the keyboard.)
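Integrating out y (again using the fact that the density factors):

f_X(x) = ∫_0^∞ 6 e^{-3x} e^{-2y} dy = 6 e^{-3x} (1/2) = 3 e^{-3x} for x > 0,

an exponential density with rate 3.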

f_XY(x, y) = 6e^{-3x-2y} for x > 0, y > 0. What is the marginal distribution of Y? (This gives us the PDF for the failure time of the keyboard, irrespective of the failure time of the hard drive.)
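By the same computation, f_Y(y) = ∫_0^∞ 6 e^{-3x} e^{-2y} dx = 2 e^{-2y} for y > 0, an exponential density with rate 2.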

X and Y are independent if f_XY(x, y) = f_X(x) f_Y(y). Are X and Y independent?
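Yes: f_X(x) f_Y(y) = 3e^{-3x} · 2e^{-2y} = 6e^{-3x-2y} = f_XY(x, y) for all x, y > 0.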

What is the expected lifetime of the keyboard?
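Since Y is exponential with rate 2, E[Y] = ∫_0^∞ y · 2e^{-2y} dy = 1/2, so the keyboard lasts half a year on average.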

Assume that I throw the laptop away once either the keyboard or the hard drive breaks. What is the expected lifetime of the laptop?
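The laptop's lifetime is min(X, Y). Since X and Y are independent, P(min(X, Y) > t) = P(X > t) P(Y > t) = e^{-3t} e^{-2t} = e^{-5t}, so min(X, Y) is exponential with rate 5 and the expected lifetime is 1/5 of a year.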

COVARIANCE AND CORRELATION

In the discrete example, we already saw that funds A and B are not independent. It would be useful to have a measure of how dependent they are, though. Define the covariance of two random variables X and Y to be

Cov(X, Y) = E[(X - E[X])(Y - E[Y])]

Why does this tell us how dependent X and Y are? There is a shortcut formula for covariance: Cov(X, Y) = E[XY] - E[X]E[Y].
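The shortcut follows by expanding the product and using linearity of expectation (E[X] and E[Y] are constants):

Cov(X, Y) = E[XY - X E[Y] - Y E[X] + E[X]E[Y]] = E[XY] - 2 E[X]E[Y] + E[X]E[Y] = E[XY] - E[X]E[Y]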

Recall that E[A] = 10 and E[B] = 4.

(A - E[A])(B - E[B]):

 B \ A     +75%    +20%    -50%
 +10%      390     60      -360
 -5%       -585    -90     540

Probabilities:

 B \ A     +75%    +20%    -50%
 +10%      0.10    0.45    0.05
 -5%       0.10    0.05    0.25

Multiply corresponding values and add: (390)(0.10) + (60)(0.45) + ... + (540)(0.25) = 120. Positive covariance means that when A is high, B tends to be high; and when A is low, B tends to be low as well.

We can also use the shortcut formula E[AB] - E[A]E[B] to save some tedious computation.

AB:

 B \ A     +75%    +20%    -50%
 +10%      750     200     -500
 -5%       -375    -100    250

Probabilities:

 B \ A     +75%    +20%    -50%
 +10%      0.10    0.45    0.05
 -5%       0.10    0.05    0.25

E[AB] = 160, so Cov(A, B) = 160 - (10)(4) = 120.
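Both routes to the covariance can be checked with a short Python sketch (not part of the original slides):

```python
P_AB = {(75, 10): 0.10, (20, 10): 0.45, (-50, 10): 0.05,
        (75, -5): 0.10, (20, -5): 0.05, (-50, -5): 0.25}

# Expectation of an arbitrary function h(A, B) under the joint PMF
E = lambda h: sum(h(a, b) * p for (a, b), p in P_AB.items())

EA, EB = E(lambda a, b: a), E(lambda a, b: b)    # 10.0 and 4.0
cov_def = E(lambda a, b: (a - EA) * (b - EB))    # from the definition
cov_short = E(lambda a, b: a * b) - EA * EB      # shortcut formula
print(cov_def, cov_short)                        # both 120.0
```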

The trouble with covariance by itself is that it is hard to interpret, apart from its sign:

If covariance is positive, when X is above average, Y usually is too; and when X is below average, Y usually is too.
If covariance is negative, when X is above average, Y is usually below average, and vice versa.
If X and Y are independent, their covariance is zero. (The converse is not true.)

The magnitude of the covariance does not tell us very much, though: it depends on the units of X and Y.

To make the magnitude useful, we make covariance unitless by dividing by the standard deviations of X and Y. This gives the correlation coefficient:

ρ_XY = Cov(X, Y) / (σ_X σ_Y)

The correlation coefficient is always between -1 and +1, and quantifies the strength of the linear relationship between X and Y:

If ρ_XY = 1, then Y = aX + b for some a > 0.
If ρ_XY = -1, then Y = aX + b for some a < 0.
If ρ_XY = 0, there is no linear relationship between X and Y. (This does not mean they are independent.)
If ρ_XY ∈ (0, 1), there is roughly a linear relationship with positive slope.
If ρ_XY ∈ (-1, 0), there is roughly a linear relationship with negative slope.

In the example with the mutual funds, σ_A = 44.4 and σ_B = 7.35, so ρ_AB = 120/(44.4 × 7.35) = 0.367.
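For completeness, these standard deviations come from the marginals: Var(A) = E[A²] - (E[A])² = (75² · 0.2 + 20² · 0.5 + (-50)² · 0.3) - 10² = 2075 - 100 = 1975, so σ_A ≈ 44.4; and Var(B) = (10² · 0.6 + (-5)² · 0.4) - 4² = 70 - 16 = 54, so σ_B ≈ 7.35.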

Properties of covariance:

1. Cov(X, X) = Var[X] (the covariance of a random variable with itself is just its variance)
2. X, Y independent ⟹ Cov(X, Y) = 0 (but the reverse is not always true)
3. Cov(X, Y) = Cov(Y, X) (order doesn't matter for covariance)
4. Cov(aX, Y) = a Cov(X, Y) (constants can be factored out)
5. Cov(X + c, Y) = Cov(X, Y) (adding a constant does not change covariance)
6. Cov(X + Y, Z) = Cov(X, Z) + Cov(Y, Z) (distributive property)

(You should be able to prove these formulas from the definitions given above.)

By combining these properties, we can obtain the general formula

Cov(Σ_{i=1}^{m} a_i X_i, Σ_{j=1}^{n} b_j Y_j) = Σ_{i=1}^{m} Σ_{j=1}^{n} a_i b_j Cov(X_i, Y_j)

For instance, Cov(X_1 + 2X_2, 3Y_1 + 4Y_2) = 3 Cov(X_1, Y_1) + 4 Cov(X_1, Y_2) + 6 Cov(X_2, Y_1) + 8 Cov(X_2, Y_2).

Let X and Y be independent standard normal random variables. What is Cov(1 + X + XY², 1 + X)?

1. Added constants can be removed: Cov(X + XY², X)
2. Distributive property: Cov(X, X) + Cov(XY², X)
3. Covariance with itself is variance: Var(X) + Cov(XY², X)
4. A standard normal has variance 1: 1 + Cov(XY², X)
5. Shortcut formula: 1 + E[X²Y²] - E[XY²]E[X]
6. Independence: 1 + E[X²]E[Y²] - E[XY²]E[X]
7. Variance shortcut formula: 1 + (Var(X) + E[X]²)(Var(Y) + E[Y]²) - E[XY²]E[X]
8. Standard normal: 1 + (1 + 0)(1 + 0) - E[XY²](0) = 2
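As a sanity check, the result can be verified by simulation. A short sketch (not from the slides, assuming NumPy is available):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10**6
X = rng.standard_normal(n)  # independent standard normals
Y = rng.standard_normal(n)

U = 1 + X + X * Y**2   # first random variable
V = 1 + X              # second random variable
print(np.cov(U, V)[0, 1])  # ~2, matching the algebra above
```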

An important special case is finding the variance of a sum:

Var(aX + bY) = a² Var(X) + b² Var(Y) + 2ab Cov(X, Y)

What does this mean if X and Y are independent? Positively correlated? Negatively correlated?
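To answer the closing questions: if X and Y are independent (or merely uncorrelated), the cross term vanishes and Var(aX + bY) = a² Var(X) + b² Var(Y); positive covariance inflates the variance of the sum, and negative covariance reduces it, which is why pairing negatively correlated assets reduces portfolio risk. For the two funds, Var((A + B)/2) = (1/4)(1975 + 54 + 2 · 120) ≈ 567, so the split portfolio has σ ≈ 23.8 percentage points, compared with σ_A ≈ 44.4 for Fund A alone.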