Problem Solving: Correlation and Covariance. Yi Lu, ECE 313.


Definition. Let X and Y be random variables with finite second moments. Then:
- the correlation: E[XY]
- the covariance: Cov(X, Y) = E[(X − E[X])(Y − E[Y])]
- the correlation coefficient: ρ_{X,Y} = Cov(X, Y)/√(Var(X) Var(Y)) = Cov(X, Y)/(σ_X σ_Y).

Covariance generalizes variance: Var(X) = Cov(X, X).

Shortcut for computing variance: Var(X) = E[X(X − E[X])] = E[X²] − E[X]².

Similar shortcuts exist for computing covariances:
Cov(X, Y) = E[X(Y − E[Y])] = E[(X − E[X])Y] = E[XY] − E[X]E[Y].
In particular, if either X or Y has mean zero, then E[XY] = Cov(X, Y).
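These shortcuts are easy to sanity-check numerically. A minimal sketch in NumPy (the sample averages only approximate the true expectations):

```python
import numpy as np

rng = np.random.default_rng(0)

# Generate correlated samples: Y depends on X plus independent noise.
x = rng.normal(loc=2.0, scale=1.5, size=100_000)
y = 0.8 * x + rng.normal(size=100_000)

# Definition: Cov(X, Y) = E[(X - E[X])(Y - E[Y])]
cov_def = np.mean((x - x.mean()) * (y - y.mean()))

# Shortcut: Cov(X, Y) = E[XY] - E[X]E[Y]
cov_shortcut = np.mean(x * y) - x.mean() * y.mean()

print(cov_def, cov_shortcut)  # the two agree up to floating-point error
```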

Correlated.
- If Cov(X, Y) = 0, X and Y are uncorrelated.
- If Cov(X, Y) > 0, X and Y are positively correlated.
- If Cov(X, Y) < 0, X and Y are negatively correlated.
ρ_{X,Y} is a scaled version of Cov(X, Y).

Uncorrelated vs. Independent. If Cov(X, Y) = 0, X and Y are uncorrelated. Cov(X, Y) = 0 is equivalent to E[XY] = E[X]E[Y].

Does independence of X and Y imply E[XY] = E[X]E[Y]? Yes: independence implies that X and Y are uncorrelated.

Does uncorrelated imply independence? No; independence is a much stronger condition. Independence requires a much larger family of equations to hold, namely F_{X,Y}(u, v) = F_X(u) F_Y(v) for every real value of u and v. Uncorrelated only requires a single equation to hold.
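A standard counterexample illustrates the gap (this particular pair is my choice, not from the slides): take X uniform on {−1, 0, 1} and Y = X². Then Cov(X, Y) = E[X³] − E[X]E[X²] = 0, so the pair is uncorrelated, yet Y is a deterministic function of X.

```python
# X uniform on {-1, 0, 1}, Y = X**2: uncorrelated but clearly dependent.
xs = [-1, 0, 1]
px = 1 / 3

E_X  = sum(x * px for x in xs)          # 0
E_Y  = sum(x**2 * px for x in xs)       # 2/3
E_XY = sum(x * x**2 * px for x in xs)   # E[X^3] = 0

print("Cov(X, Y) =", E_XY - E_X * E_Y)  # 0.0 -> uncorrelated

# But P(X=1, Y=0) = 0 while P(X=1)*P(Y=0) = (1/3)(1/3) != 0 -> dependent.
```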

Correlation is a pairwise property: a set of random variables being uncorrelated is the same as being pairwise uncorrelated. In contrast, mutual independence of 3 or more random variables is a stronger property than pairwise independence.

Play with Covariance.

Linearity. Covariance is linear in each of its two arguments:
Cov(X + Y, U + V) = Cov(X, U) + Cov(X, V) + Cov(Y, U) + Cov(Y, V)
Cov(aX + b, cY + d) = ac Cov(X, Y)
for constants a, b, c, d. Recall that Var(aX + b) = a² Var(X).
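A quick numerical check of the scaling rule (a sketch; the constants a, b, c, d are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.normal(size=1_000_000)
y = 0.5 * x + rng.normal(size=1_000_000)

def cov(u, v):
    return np.mean((u - u.mean()) * (v - v.mean()))

a, b, c, d = 3.0, -2.0, 0.5, 7.0
print(cov(a * x + b, c * y + d))  # approximately a*c*Cov(X, Y)
print(a * c * cov(x, y))          # shifts b, d drop out; scales multiply
```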

Variance of sum of r.v. The variance of a sum of uncorrelated random variables equals the sum of their variances. For example, if X and Y are uncorrelated,
Var(X + Y) = Cov(X + Y, X + Y) = Cov(X, X) + Cov(Y, Y) + 2 Cov(X, Y) = Var(X) + Var(Y).

Consider the sum S_n = X_1 + ⋯ + X_n, where X_1, ..., X_n are uncorrelated (so Cov(X_i, X_j) = 0 if i ≠ j) with E[X_i] = μ and Var(X_i) = σ² for 1 ≤ i ≤ n. Find E[S_n] and Var(S_n).

E[S_n] = nμ (1)

and

Var(S_n) = Cov(S_n, S_n) = Cov(Σ_{i=1}^n X_i, Σ_{j=1}^n X_j)
         = Σ_{i=1}^n Σ_{j=1}^n Cov(X_i, X_j)
         = Σ_{i=1}^n Cov(X_i, X_i) + Σ_{i≠j} Cov(X_i, X_j)
         = Σ_{i=1}^n Var(X_i) + 0 = nσ². (2)

Practice! Simplify the following expressions:
(a) Cov(8X + 3, 5Y − 2)
(b) Cov(10X − 5, 3X + 15)
(c) Cov(X + 2, 10X − 3Y)
(d) ρ_{10X, Y+4}
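A simulation sketch of (1) and (2), assuming the X_i are i.i.d. (one convenient way to make them uncorrelated):

```python
import numpy as np

rng = np.random.default_rng(2)
n, mu, sigma = 10, 1.5, 2.0

# Each row is one realization of (X_1, ..., X_n); sum rows to get S_n.
samples = rng.normal(loc=mu, scale=sigma, size=(200_000, n))
s_n = samples.sum(axis=1)

print(s_n.mean(), n * mu)        # E[S_n] = n*mu
print(s_n.var(), n * sigma**2)   # Var(S_n) = n*sigma^2
```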

Correlation coefficient.

ρ_{X,Y} is a scaled version of Cov(X, Y):
ρ_{X,Y} = Cov(X, Y)/√(Var(X) Var(Y)) = Cov(X, Y)/(σ_X σ_Y).
The scaling amounts to standardizing X and Y. Find
Cov((X − E[X])/σ_X, (Y − E[Y])/σ_Y).

Cov((X − E[X])/σ_X, (Y − E[Y])/σ_Y) = Cov(X/σ_X, Y/σ_Y) = Cov(X, Y)/(σ_X σ_Y) = ρ_{X,Y}.

Find ρ_{aX+b, cY+d}.

ρ_{aX+b, cY+d} = ρ_{X,Y} for a, c > 0.

|ρ_{X,Y}| ≤ 1; ρ_{X,Y} = 1 if and only if Y = aX + b for some a, b with a > 0, and ρ_{X,Y} = −1 if and only if Y = aX + b for some a, b with a < 0.

Unbiased Estimator. Suppose X_1, ..., X_n are independent and identically distributed random variables, with mean μ and variance σ². We estimate μ and σ² by the sample mean and sample variance, defined as follows:

X̄ = (1/n) Σ_{k=1}^n X_k,    σ̂² = (1/(n−1)) Σ_{k=1}^n (X_k − X̄)².

Note the perhaps unexpected appearance of n − 1 in the sample variance. Of course, we should have n ≥ 2 to estimate the variance (assuming we don't know the mean), so it is not surprising that the formula is not defined if n = 1. An estimator is called unbiased if the mean of the estimator is equal to the parameter that is being estimated.

Q1: Why don't we use ML parameter estimation for μ and σ²?
Q2: Why is σ̂² undefined for n = 1?
(a) Is the sample mean an unbiased estimator of μ?
(b) Find the mean square error, E[(μ − X̄)²], for estimation of the mean by the sample mean.
(c) Is the sample variance an unbiased estimator of σ²?
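A simulation sketch for (a) and (c): with the n − 1 divisor the sample variance averages to σ², while an n divisor is biased low (Gaussian samples are my arbitrary choice; any finite-variance distribution behaves the same way):

```python
import numpy as np

rng = np.random.default_rng(3)
n, mu, sigma2 = 5, 0.0, 4.0
trials = 200_000

x = rng.normal(mu, np.sqrt(sigma2), size=(trials, n))
xbar = x.mean(axis=1)

var_n1 = ((x - xbar[:, None])**2).sum(axis=1) / (n - 1)  # sample variance
var_n  = ((x - xbar[:, None])**2).sum(axis=1) / n        # ML-style estimate

print(xbar.mean())    # approx mu              -> sample mean is unbiased
print(var_n1.mean())  # approx sigma2          -> unbiased
print(var_n.mean())   # approx sigma2*(n-1)/n  -> biased low
```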

Minimum mean square error estimation.

Constant estimator. Let Y be a random variable with some known distribution. Suppose Y is not observed but that we wish to estimate Y. We use a constant δ to estimate Y.

The mean square error (MSE) for estimating Y by δ is defined by E[(Y − δ)²].
Q. How do we find a δ that minimizes E[(Y − δ)²]? What is the resulting MSE?

Unconstrained Estimator. We want to estimate Y. We have an observation X. The joint distribution is f_{X,Y}. We use the estimator g(X) for some function g. The resulting mean square error (MSE) is E[(Y − g(X))²].
Q. What is the function g that minimizes the MSE? What is the resulting MSE?

Suppose you observe X = 10. What do you know about Y? You can derive the conditional pdf of Y given X = 10, denoted by f_{Y|X}(v|10). Which value of Y should you pick?

Based on the fact, discussed above, that the minimum MSE constant estimator for a random variable is its mean, it makes sense to estimate Y by the conditional mean:
E[Y | X = 10] = ∫ v f_{Y|X}(v|10) dv.

In general, we can show that
g*(u) = E[Y | X = u] = ∫ v f_{Y|X}(v|u) dv.
The minimum MSE is
MSE = E[Y²] − E[(E[Y|X])²] (3)
    = Var(Y) − Var(E[Y|X]). (4)
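For completeness, the fact invoked above follows from a one-line expansion; a sketch of the standard argument, which the slides leave implicit:

```latex
\mathbb{E}\big[(Y-\delta)^2\big]
  = \mathbb{E}\big[\big((Y-\mathbb{E}[Y]) + (\mathbb{E}[Y]-\delta)\big)^2\big]
  = \operatorname{Var}(Y) + \big(\mathbb{E}[Y]-\delta\big)^2 .
```

The cross term vanishes because E[Y − E[Y]] = 0, so the MSE is minimized uniquely at δ = E[Y], with resulting MSE Var(Y). Applying this conditionally on X = u yields g*(u) = E[Y | X = u].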

Linear estimator. In practice it is not always possible to compute g*(u). The conditional density f_{Y|X}(v|u) may not be available or might be difficult to compute. Worse, there might not even be a good way to decide which joint pdf f_{X,Y} to use in the first place. A reasonable alternative is to consider linear estimators of Y given X.

We use a linear estimator L(X) = aX + b, so we only need to find a and b. The resulting mean square error (MSE) is E[(Y − (aX + b))²].
Q. What are the a and b that minimize the MSE? What is the resulting MSE?

The minimum MSE linear estimator is given by L*(X) = Ê[Y|X], where
Ê[Y|X] = μ_Y + (Cov(Y, X)/Var(X)) (X − μ_X) = μ_Y + σ_Y ρ_{X,Y} ((X − μ_X)/σ_X).
The minimum MSE for linear estimation is
σ_Y² − (Cov(X, Y))²/Var(X) = σ_Y² (1 − ρ²_{X,Y}).

If X and Y are standard (mean zero, variance one), then Ê[Y|X] = ρ_{X,Y} X and the MSE is 1 − ρ²_{X,Y}.
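A plug-in version of these formulas, estimating the moments from data (a sketch; the joint distribution and variable names are my own):

```python
import numpy as np

rng = np.random.default_rng(4)
x = rng.normal(1.0, 2.0, size=500_000)
y = 3.0 * x + rng.normal(size=500_000)   # some joint distribution for (X, Y)

# Plug-in LMMSE: Ehat[Y|X] = mu_Y + (Cov(Y,X)/Var(X)) * (X - mu_X)
cov_yx = np.mean((y - y.mean()) * (x - x.mean()))
a = cov_yx / x.var()
y_hat = y.mean() + a * (x - x.mean())

rho = np.corrcoef(x, y)[0, 1]
print(np.mean((y - y_hat) ** 2))   # empirical MSE
print(y.var() * (1 - rho ** 2))    # sigma_Y^2 * (1 - rho^2)
```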

Three estimators: Constant, Linear, Unconstrained. Which is the best?

E[(Y − g*(X))²] ≤ σ_Y² (1 − ρ²_{X,Y}) ≤ σ_Y², (5)

where the three quantities are, from left to right, the MSE for g*(X) = E[Y|X], the MSE for L*(X) = Ê[Y|X], and the MSE for the constant δ* = E[Y].

All three estimators are linear as functions of the variable to be estimated:
E[aY + bZ + c] = aE[Y] + bE[Z] + c
E[aY + bZ + c | X] = aE[Y|X] + bE[Z|X] + c
Ê[aY + bZ + c | X] = aÊ[Y|X] + bÊ[Z|X] + c

Noisy observation. Let X = Y + N, where Y has the exponential distribution with parameter λ, and N is Gaussian with mean 0 and variance σ_N². Suppose the variables Y and N are independent, and the parameters λ and σ_N² are known and strictly positive. (Recall that E[Y] = 1/λ and Var(Y) = σ_Y² = 1/λ².)
(a) Find Ê[Y|X], the minimum MSE linear estimator of Y given X, and also find the resulting MSE.
(b) Find an unconstrained estimator of Y yielding a strictly smaller MSE than Ê[Y|X] does.
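A simulation sketch one might use to check an answer to (a); it relies on the independence facts Cov(Y, X) = Var(Y) and Var(X) = Var(Y) + σ_N², with λ and σ_N chosen arbitrarily:

```python
import numpy as np

rng = np.random.default_rng(5)
lam, sigma_n = 0.5, 1.0
m = 500_000

y = rng.exponential(scale=1/lam, size=m)  # E[Y]=1/lam, Var(Y)=1/lam**2
n = rng.normal(0.0, sigma_n, size=m)
x = y + n                                 # the noisy observation

# LMMSE using exact moments: Cov(Y,X)=Var(Y), Var(X)=Var(Y)+sigma_n^2.
var_y = 1 / lam**2
a = var_y / (var_y + sigma_n**2)
y_hat = 1/lam + a * (x - 1/lam)           # E[X] = E[Y] since E[N] = 0

print(np.mean((y - y_hat)**2))                   # empirical MSE
print(var_y - var_y**2 / (var_y + sigma_n**2))   # predicted MSE
```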

Uniform distribution. Suppose (X, Y) is uniformly distributed over the triangular region with vertices at (−1, 0), (0, 1), and (1, 1), shown in Figure 1.

[Figure 1: Support of f_{X,Y}: the triangle in the (u, v) plane with vertices (−1, 0), (0, 1), (1, 1).]

(a) Find and sketch the minimum MSE estimator of Y given X = u, g*(u) = E[Y | X = u], for all u such that it is well defined, and find the resulting minimum MSE for using g*(X) = E[Y|X] to estimate Y.
(b) Find and sketch the function Ê[Y | X = u], used for minimum MSE linear estimation of Y from X, and find the resulting MSE for using Ê[Y|X] to estimate Y.
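A simulation sketch for sanity-checking an answer to (a): rejection-sample the triangle, then compare the empirical conditional mean on a thin slice with a candidate g*(u). The slice width, seed, and test points are arbitrary choices of mine:

```python
import numpy as np

rng = np.random.default_rng(6)

# Rejection-sample the triangle with vertices (-1, 0), (0, 1), (1, 1):
# for u in (-1, 1), v lies between (u+1)/2 and min(u+1, 1).
u = rng.uniform(-1, 1, size=2_000_000)
v = rng.uniform(0, 1, size=2_000_000)
keep = (v >= (u + 1) / 2) & (v <= np.minimum(u + 1, 1.0))
u, v = u[keep], v[keep]

# Empirical E[Y | X near u0] from samples in a narrow slice around u0.
for u0 in (-0.5, 0.0, 0.5):
    in_slice = np.abs(u - u0) < 0.01
    print(u0, v[in_slice].mean())  # compare with your analytic g*(u0)
```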

Questions?