Summary of EE226a: MMSE and LLSE

Jean Walrand

Here are the key ideas and results.

I. SUMMARY

The MMSE of $X$ given $Y$ is $E[X|Y]$. The LLSE of $X$ given $Y$ is $L[X|Y] = E(X) + K_{X,Y} K_Y^{-1}(Y - E(Y))$ if $K_Y$ is nonsingular. Theorem 3 states the properties of the LLSE. The linear regression approximates the LLSE when the samples are realizations of i.i.d. random pairs $(X_m, Y_m)$.

II. ESTIMATION: FORMULATION

The estimation problem is a generalized version of the Bayesian decision problem where the set of values of $X$ can be $\mathbb{R}^n$. In applications, one is given a source model $f_X$ and a channel model $f_{Y|X}$ that together define $f_{X,Y}$. One may have to estimate $f_{Y|X}$ by using a training sequence, i.e., by selecting the values of $X$ and observing the channel outputs. Alternatively, one may be able to observe a sequence of values of $(X, Y)$ and use them to estimate $f_{X,Y}$.

Definition 1: Estimation Problems
One is given the joint distribution of $(X, Y)$. The estimation problem is to calculate $Z = g(Y)$ to minimize $E(c(X, Z))$ for a given cost function $c(\cdot, \cdot)$. The random variable $Z = g(Y)$ that minimizes $E(\|X - Z\|^2)$ is the minimum mean squares estimator (MMSE) of $X$ given $Y$. The random variable $Z = AY + b$ that minimizes $E(\|X - Z\|^2)$ is called the linear least squares estimator (LLSE) of $X$ given $Y$; we designate it by $Z = L[X|Y]$.

III. MMSE

Here is the central result about minimum mean squares estimation. You have seen it before, but we recall the proof of this important result.

Theorem 1: MMSE
The MMSE of $X$ given $Y$ is $E[X|Y]$.

Proof: You should recall the definition of $E[X|Y]$: it is a random variable, a function of $Y$, that has the property

$E[(X - E[X|Y])\, h_1(Y)] = 0$ for all $h_1(\cdot)$,   (1)

or equivalently

$E[h_2(Y)\,(X - E[X|Y])] = 0$ for all $h_2(\cdot)$.   (2)

In particular, taking $h_1 \equiv 1$ gives $E(X) = E[E[X|Y]]$. The interpretation is that $X - E[X|Y] \perp h(Y)$ for all $h(\cdot)$. By Pythagoras, one then expects $E[X|Y]$ to be the function of $Y$ that is closest to $X$, as illustrated in Figure 1.

Fig. 1. The conditional expectation as a projection: $E[X|Y] = g(Y)$ is the projection of $X$ onto the set $\{r(Y) : r(\cdot)\ \text{a function}\}$.

Formally, let $h(Y)$ be an arbitrary function. Then

$E(\|X - h(Y)\|^2) = E(\|X - E[X|Y] + E[X|Y] - h(Y)\|^2)$
$= E(\|X - E[X|Y]\|^2) + 2E((E[X|Y] - h(Y))^T(X - E[X|Y])) + E(\|E[X|Y] - h(Y)\|^2)$
$= E(\|X - E[X|Y]\|^2) + E(\|E[X|Y] - h(Y)\|^2)$
$\geq E(\|X - E[X|Y]\|^2).$

In this derivation, the third identity follows from the fact that the cross-term vanishes in view of (2). Note also that this derivation corresponds to the fact that the triangle $\{X, E[X|Y], h(Y)\}$ is a right triangle with hypotenuse $(X, h(Y))$.
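As a quick numerical sanity check of Theorem 1, here is a small sketch; the toy model ($Y$ uniform, $X = Y^2$ plus noise) is my own choice, not part of the notes. It verifies that $E[X|Y]$ achieves a smaller empirical mean squared error than an arbitrary competitor $h(Y)$ and that the estimation error is empirically orthogonal to functions of $Y$, as in (1)-(2).

```python
# Monte Carlo sketch of Theorem 1 (assumed toy model, not from the notes):
# with Y uniform on [0,1] and X = Y^2 + noise, we know E[X|Y] = Y^2 exactly.
import numpy as np

rng = np.random.default_rng(0)
n = 200_000
Y = rng.uniform(0.0, 1.0, n)
X = Y**2 + rng.normal(0.0, 0.3, n)       # E[X|Y] = Y^2 by construction

mmse = Y**2                               # the MMSE estimator E[X|Y]
other = 0.9 * Y                           # an arbitrary competitor h(Y)

print("MSE of E[X|Y]:", np.mean((X - mmse) ** 2))   # close to 0.09, the noise variance
print("MSE of h(Y):  ", np.mean((X - other) ** 2))  # strictly larger

# Orthogonality (1)-(2): E[(X - E[X|Y]) h(Y)] = 0 for every function h.
for h in (lambda y: y, lambda y: np.sin(5 * y), lambda y: np.ones_like(y)):
    print("E[(X - E[X|Y]) h(Y)] =", np.mean((X - mmse) * h(Y)))
```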

Example 1: You recall that if $(X, Y)$ are jointly Gaussian with $K_Y$ nonsingular, then $E[X|Y] = E(X) + K_{X,Y}K_Y^{-1}(Y - E(Y))$. You also recall what happens when $K_Y$ is singular. The key is to find $A$ so that $K_{X,Y} = AK_Y = AQ\Lambda Q^T$; then $E[X|Y] = E(X) + A(Y - E(Y))$. By writing

$Q = [Q_1\ Q_2], \quad \Lambda = \begin{bmatrix} \Lambda_1 & 0 \\ 0 & 0 \end{bmatrix},$

where $\Lambda_1$ corresponds to the nonzero eigenvalues of $K_Y$, one finds

$K_{X,Y} = AQ\Lambda Q^T = A[Q_1\ Q_2]\begin{bmatrix} \Lambda_1 & 0 \\ 0 & 0 \end{bmatrix}\begin{bmatrix} Q_1^T \\ Q_2^T \end{bmatrix} = AQ_1\Lambda_1 Q_1^T,$

so that $K_{X,Y} = AK_Y$ with $A = K_{X,Y}Q_1\Lambda_1^{-1}Q_1^T$.

IV. LLSE

Theorem 2: LLSE
$L[X|Y] = E(X) + K_{X,Y}K_Y^{-1}(Y - E(Y))$ if $K_Y$ is nonsingular.
$L[X|Y] = E(X) + K_{X,Y}Q_1\Lambda_1^{-1}Q_1^T(Y - E(Y))$ if $K_Y$ is singular.

Proof: Let $Z = L[X|Y] = AY + b$ denote the expression in the theorem. One checks directly that $E(X) = E(Z)$ and $\mathrm{cov}(X - Z, Y) = 0$, so that $X - Z \perp BY + d$ for every $B$ and $d$. If we consider any other linear estimator $\hat{Z} = CY + c$, then $E((Z - \hat{Z})^T(X - Z)) = 0$, since $Z - \hat{Z} = AY + b - CY - c = (A - C)Y + (b - c)$ is of the form $BY + d$. It follows that

$E(\|X - \hat{Z}\|^2) = E(\|X - Z + Z - \hat{Z}\|^2) = E(\|X - Z\|^2) + 2E((Z - \hat{Z})^T(X - Z)) + E(\|Z - \hat{Z}\|^2) = E(\|X - Z\|^2) + E(\|Z - \hat{Z}\|^2) \geq E(\|X - Z\|^2).$

Figure 2 illustrates this calculation. It shows that $L[X|Y]$ is the projection of $X$ onto the set of linear functions of $Y$. The projection $Z$ is characterized by the property that $X - Z$ is orthogonal to all the linear functions $BY + d$. This property holds if and only if $E(X - Z) = 0$ and $E((X - Z)Y^T) = 0$, which are equivalent to $E(X - Z) = 0$ and $\mathrm{cov}(X - Z, Y) = 0$. The comparison between $L[X|Y]$ and $E[X|Y]$ is shown in Figure 3.

Fig. 2. The LLSE as a projection: $Z = L[X|Y]$ is the projection of $X$ onto $\{r(Y) = CY + d\}$.

Fig. 3. LLSE vs. MMSE: $L[X|Y]$ is the projection of $X$ onto $\{CY + d\}$, while $E[X|Y]$ is the projection onto the larger set $\{g(Y)\}$.

The following result indicates some properties of the LLSE. They are easy to prove.

Theorem 3: Properties of LLSE
(a) $L[AX_1 + BX_2 \mid Y] = A\,L[X_1 \mid Y] + B\,L[X_2 \mid Y]$.
(b) $L[\,L[X \mid Y, Z] \mid Y\,] = L[X \mid Y]$.
(c) If $X$ and $Y$ are uncorrelated, then $L[X \mid Y] = E(X)$.
(d) Assume that $X$ and $Z$ are conditionally independent given $Y$. Then, in general, $L[X \mid Y, Z] \neq L[X \mid Y]$.
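Before turning to the proofs, here is a numerical sketch of Theorem 2; the data model, with a deliberately redundant third coordinate of $Y$, is my own assumption and not from the notes. It estimates the moments from samples and applies the formula, using the eigendecomposition $Q_1\Lambda_1^{-1}Q_1^T$ of Example 1 since $K_Y$ is singular; the residual $X - L[X|Y]$ then has (numerically) zero mean and zero covariance with $Y$, as the proof requires.

```python
# Sketch of Theorem 2 with a singular K_Y (assumed example, not from the notes):
# the third coordinate of Y repeats information, so K_Y has a zero eigenvalue
# and we use the construction Q1 Lambda1^{-1} Q1^T from Example 1.
import numpy as np

rng = np.random.default_rng(1)
n = 100_000
U = rng.normal(size=(n, 2))
X = U @ np.array([[1.0], [2.0]]) + 0.1 * rng.normal(size=(n, 1))
Y = np.column_stack([U[:, 0], U[:, 1], U[:, 0] + U[:, 1]])   # third column is redundant

mX, mY = X.mean(axis=0), Y.mean(axis=0)
K_XY = (X - mX).T @ (Y - mY) / n
K_Y = (Y - mY).T @ (Y - mY) / n

lam, Q = np.linalg.eigh(K_Y)                       # K_Y = Q Lambda Q^T
keep = lam > 1e-10 * lam.max()                     # indices of the nonzero eigenvalues
K_Y_pinv = Q[:, keep] @ np.diag(1.0 / lam[keep]) @ Q[:, keep].T

A = K_XY @ K_Y_pinv                                # L[X|Y] = E(X) + A (Y - E(Y))
resid = X - (mX + (Y - mY) @ A.T)
print("mean of X - L[X|Y]:", resid.mean())               # about 0
print("cov(X - L[X|Y], Y):", (resid.T @ (Y - mY)) / n)   # about 0 in every entry
```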

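Property (b), whose proof follows, can also be checked numerically. The sketch below (the model and the helper function are assumed, not from the notes) computes every LLSE from sample moments; the intercept and slope of $L[\,L[X|Y,Z] \mid Y\,]$ then agree with those of $L[X|Y]$.

```python
# Numeric sketch of the smoothing property (b) (assumed model, not from the notes).
import numpy as np

rng = np.random.default_rng(2)
n = 500_000
Y = rng.normal(size=n)
Z = 0.5 * Y + rng.normal(size=n)
X = 1.0 + 2.0 * Y - Z + rng.normal(size=n)

def llse_coeffs(target, *obs):
    """Return (b, a) with L[target | obs] = b + a^T obs, computed from sample moments."""
    W = np.column_stack(obs)
    Wc, tc = W - W.mean(axis=0), target - target.mean()
    a = np.linalg.solve(Wc.T @ Wc, Wc.T @ tc)
    return target.mean() - a @ W.mean(axis=0), a

b_yz, a_yz = llse_coeffs(X, Y, Z)
V = b_yz + a_yz[0] * Y + a_yz[1] * Z          # V = L[X | Y, Z]

print("L[X | Y]          :", llse_coeffs(X, Y))
print("L[ L[X|Y,Z] | Y ] :", llse_coeffs(V, Y))   # same intercept and slope
```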
Proof: (b): By the characterization of the projection in the proof of Theorem 2, it suffices to show that $E(L[X|Y,Z] - L[X|Y]) = 0$ and $E((L[X|Y] - L[X|Y,Z])Y^T) = 0$. We already have $E(X - L[X|Y,Z]) = 0$ and $E(X - L[X|Y]) = 0$, which implies $E(L[X|Y,Z] - L[X|Y]) = 0$. Moreover, from $E((X - L[X|Y,Z])Y^T) = 0$ and $E((X - L[X|Y])Y^T) = 0$, we obtain $E((L[X|Y] - L[X|Y,Z])Y^T) = 0$.

(d): Here is a trivial example. Assume that $X = Z = Y^{1/2}$, where $X$ is uniform on $[0, 1]$ (so that $Y = X^2$). Then $X$ and $Z$ are conditionally independent given $Y$, and $L[X|Y, Z] = Z$. However, since $E(X) = 1/2$, $E(Y) = 1/3$, $E(Y^2) = 1/5$, and $E(XY) = 1/4$, we find

$L[X|Y] = E(X) + \frac{E(XY) - E(X)E(Y)}{E(Y^2) - E(Y)^2}(Y - E(Y)) = \frac{1}{2} + \frac{1/12}{4/45}\left(Y - \frac{1}{3}\right) = \frac{3}{16} + \frac{15}{16}Y.$

We see that $L[X|Y, Z] \neq L[X|Y]$ because $Z \neq \frac{3}{16} + \frac{15}{16}Y = \frac{3}{16} + \frac{15}{16}Z^2$.

Figure 4 illustrates property (b), which is called the smoothing property of the LLSE.

Fig. 4. Smoothing property of the LLSE: $L[X|Y]$ is the projection of $L[X|Y, Z] \in \{AY + BZ + c\}$ onto the smaller set $\{CY + d\}$.

V. EXAMPLES

We illustrate the previous ideas on a few examples.

Example 2: Assume that $U, V, W$ are independent and uniform on $[0, 1]$. Calculate $L[(U + V)^2 \mid V^2 + W^2]$.

Solution: Let $X = (U + V)^2$ and $Y = V^2 + W^2$. Then

$K_{X,Y} = \mathrm{cov}(U^2 + 2UV + V^2,\ V^2 + W^2) = 2\,\mathrm{cov}(UV, V^2) + \mathrm{cov}(V^2, V^2).$

Now,

$\mathrm{cov}(UV, V^2) = E(UV^3) - E(UV)E(V^2) = \frac{1}{8} - \frac{1}{12} = \frac{1}{24}$

and

$\mathrm{cov}(V^2, V^2) = E(V^4) - E(V^2)E(V^2) = \frac{1}{5} - \frac{1}{9} = \frac{4}{45}.$

Hence,

$K_{X,Y} = \frac{2}{24} + \frac{4}{45} = \frac{31}{180}.$

Also, $K_Y = \mathrm{var}(V^2) + \mathrm{var}(W^2) = \frac{8}{45}$, $E((U + V)^2) = \frac{7}{6}$, and $E(Y) = \frac{2}{3}$. We conclude that

$L[(U + V)^2 \mid V^2 + W^2] = E((U + V)^2) + K_{X,Y}K_Y^{-1}(Y - E(Y)) = \frac{7}{6} + \frac{31}{32}\left(Y - \frac{2}{3}\right).$
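A Monte Carlo double-check of Example 2 (the simulation itself is my own, not part of the notes): the empirical LLSE coefficients should be close to the slope $31/32$ and to the intercept $7/6 - (31/32)(2/3)$ obtained above.

```python
# Monte Carlo check of Example 2 (simulation assumed, not part of the notes).
import numpy as np

rng = np.random.default_rng(3)
n = 2_000_000
U, V, W = rng.uniform(size=(3, n))
X = (U + V) ** 2
Y = V ** 2 + W ** 2

C = np.cov(X, Y)                          # 2x2 sample covariance of (X, Y)
slope = C[0, 1] / C[1, 1]                 # estimate of K_XY / K_Y
intercept = X.mean() - slope * Y.mean()
print("slope    :", slope, " vs 31/32 =", 31 / 32)
print("intercept:", intercept, " vs 7/6 - (31/32)(2/3) =", 7 / 6 - (31 / 32) * (2 / 3))
```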

Example 3: Let $U, V, W$ be as in the previous example. Calculate $L[\cos(U + V) \mid V + W]$.

Solution: Let $X = \cos(U + V)$ and $Y = V + W$. We find $K_{X,Y} = E(XY) - E(X)E(Y)$. Now,

$E(XY) = E(V\cos(U + V)) + E(W)E(\cos(U + V)) = E(V\cos(U + V)) + \frac{1}{2}E(X).$

Also,

$E(V\cos(U + V)) = \int_0^1\!\!\int_0^1 v\cos(u + v)\,du\,dv = \int_0^1 [v\sin(u + v)]_{u=0}^{u=1}\,dv = \int_0^1 (v\sin(v + 1) - v\sin(v))\,dv$
$= -[v\cos(v + 1)]_0^1 + \int_0^1 \cos(v + 1)\,dv + [v\cos(v)]_0^1 - \int_0^1 \cos(v)\,dv$
$= -\cos(2) + [\sin(v + 1)]_0^1 + \cos(1) - [\sin(v)]_0^1$
$= -\cos(2) + \sin(2) - \sin(1) + \cos(1) - \sin(1).$

Moreover,

$E(X) = \int_0^1\!\!\int_0^1 \cos(u + v)\,du\,dv = \int_0^1 [\sin(u + v)]_{v=0}^{v=1}\,du = \int_0^1 (\sin(u + 1) - \sin(u))\,du$
$= [-\cos(u + 1)]_0^1 + [\cos(u)]_0^1 = -\cos(2) + \cos(1) + \cos(1) - \cos(0) = -\cos(2) + 2\cos(1) - 1.$

In addition,

$E(Y) = E(V) + E(W) = 1$ and $K_Y = \mathrm{var}(V) + \mathrm{var}(W) = \frac{1}{6}.$

Combining these expressions, $L[\cos(U + V) \mid V + W] = E(X) + 6K_{X,Y}(Y - 1)$, with $K_{X,Y} = E(V\cos(U + V)) - \frac{1}{2}E(X)$.

VI. LINEAR REGRESSION VS. LLSE

We discuss the connection between the familiar linear regression procedure and the LLSE. One is given a set of $n$ pairs of numbers $\{(x_m, y_m), m = 1, \ldots, n\}$, as shown in Figure 5. One draws a line through the points. That line approximates the values $y_m$ by $z_m = \alpha x_m + \beta$. The line is chosen to minimize

$\sum_m (z_m - y_m)^2 = \sum_m (\alpha x_m + \beta - y_m)^2.$   (3)

That is, the linear regression is the linear approximation that minimizes the sum of the squared errors. Note that there is no probabilistic framework in this procedure. To find the values of $\alpha$ and $\beta$ that minimize the sum of the squared errors, one differentiates (3) with respect to $\alpha$ and $\beta$ and sets the derivatives to zero. One finds

$\sum_m (\alpha x_m + \beta - y_m) = 0$ and $\sum_m x_m(\alpha x_m + \beta - y_m) = 0.$

Solving these equations, we find

$\alpha = \frac{A(xy) - A(x)A(y)}{A(x^2) - A(x)^2}$ and $\beta = A(y) - \alpha A(x).$

In these expressions, we used the following notation:

$A(y) := \frac{1}{n}\sum_m y_m,\quad A(x) := \frac{1}{n}\sum_m x_m,\quad A(xy) := \frac{1}{n}\sum_m x_m y_m,\quad A(x^2) := \frac{1}{n}\sum_m x_m^2.$

Thus, the point $y_m$ is approximated by

$z_m = A(y) + \frac{A(xy) - A(x)A(y)}{A(x^2) - A(x)^2}(x_m - A(x)).$

Note that if the pairs $(x_m, y_m)$ are realizations of i.i.d. random variables $(X_m, Y_m) =_D (X, Y)$ with finite variances, then, as $n \to \infty$, $A(x) \to E(X)$, $A(x^2) \to E(X^2)$, $A(y) \to E(Y)$, and $A(xy) \to E(XY)$. Consequently, if $n$ is large, we see that

$z_m \approx E(Y) + \frac{\mathrm{cov}(X, Y)}{\mathrm{var}(X)}(x_m - E(X)) = L[Y \mid X = x_m].$

See [2] for examples.
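To make the connection concrete, here is a small sketch; the linear data model is my own assumption, not from the notes. It computes $\alpha$ and $\beta$ from the sample averages $A(\cdot)$ exactly as above, and as $n$ grows the fitted line approaches $L[Y \mid X]$.

```python
# Sketch of Section VI (assumed data model, not from the notes): the regression
# coefficients computed from the averages A(.) approach cov(X,Y)/var(X) and
# E(Y) - alpha E(X), i.e., the fitted line approaches L[Y | X].
import numpy as np

rng = np.random.default_rng(4)

def regression(x, y):
    """alpha and beta from the least-squares formulas of Section VI."""
    A_x, A_y = x.mean(), y.mean()
    A_xy, A_x2 = (x * y).mean(), (x * x).mean()
    alpha = (A_xy - A_x * A_y) / (A_x2 - A_x ** 2)
    return alpha, A_y - alpha * A_x

for n in (100, 10_000, 1_000_000):
    x = rng.uniform(size=n)
    y = 1.0 + 3.0 * x + rng.normal(scale=0.5, size=n)   # here L[Y | X] = 1 + 3X
    print(n, regression(x, y))                           # alpha -> 3, beta -> 1
```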

Fig. 5. Linear regression: the fitted line approximates the points $(x_m, y_m)$ by $z_m = \alpha x_m + \beta$.

REFERENCES

[1] R. G. Gallager: Stochastic Processes: A Conceptual Approach. August 2001.
[2] D. Freedman: Statistical Models: Theory and Practice. Cambridge University Press, 2005.
[3] J. Walrand: Lecture Notes on Probability and Random Processes. On-line book, August 2004.
