ECE 541 Stochastic Signals and Systems Problem Set 9 Solutions


Problem Solutions: Yates and Goodman, 9.1.4, 9.2.2, 9.2.6, 9.3.2, 9.4.2, 9.4.6, 9.4.7 and 9.5.3

Problem 9.1.4 Solution
The joint PDF of X and Y is

f_{X,Y}(x,y) = { 6(y − x), 0 ≤ x ≤ y ≤ 1; 0, otherwise. }

(a) The conditional PDF of X given Y is found by dividing the joint PDF by the marginal with respect to Y. For y < 0 or y > 1, f_Y(y) = 0. For 0 ≤ y ≤ 1,

f_Y(y) = ∫_0^y 6(y − x) dx = (6xy − 3x²) |_{x=0}^{x=y} = 3y².

The complete expression for the marginal PDF of Y is

f_Y(y) = { 3y², 0 ≤ y ≤ 1; 0, otherwise. }

Thus for 0 < y ≤ 1,

f_{X|Y}(x|y) = f_{X,Y}(x,y)/f_Y(y) = { 6(y − x)/(3y²), 0 ≤ x ≤ y; 0, otherwise. }

(b) The minimum mean square estimator of X given Y = y is

X̂_M(y) = E[X | Y = y] = ∫ x f_{X|Y}(x|y) dx = ∫_0^y 6x(y − x)/(3y²) dx = (3x²y − 2x³)/(3y²) |_{x=0}^{x=y} = y/3.

(c) First we must find the marginal PDF for X. For 0 ≤ x ≤ 1,

f_X(x) = ∫ f_{X,Y}(x,y) dy = ∫_x^1 6(y − x) dy = (3y² − 6xy) |_{y=x}^{y=1} = 3(1 − 2x + x²) = 3(1 − x)².

The conditional PDF of Y given X is

f_{Y|X}(y|x) = f_{X,Y}(x,y)/f_X(x) = { 2(y − x)/(1 − 2x + x²), x ≤ y ≤ 1; 0, otherwise. }
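Before moving on to part (d), the densities and the estimate above are easy to spot-check numerically. The sketch below uses base MATLAB quadrature; the test point y0 = 0.6 is an arbitrary choice, not part of the problem.

% Spot-check of Problem 9.1.4 (a)-(c)
fXY = @(x,y) 6*(y-x);                            % joint PDF on the triangle 0 <= x <= y <= 1
integral2(fXY,0,1,@(x) x,1)                      % total probability, should be 1
integral(@(y) 3*y.^2,0,1)                        % marginal f_Y integrates to 1
integral(@(x) 3*(1-x).^2,0,1)                    % marginal f_X integrates to 1
y0 = 0.6;                                        % arbitrary test point
integral(@(x) x.*6.*(y0-x)./(3*y0^2),0,y0)       % E[X|Y=y0], should equal y0/3 = 0.2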

(d) The minimum mean square estimator of Y given X is

Ŷ_M(x) = E[Y | X = x] = ∫ y f_{Y|X}(y|x) dy = ∫_x^1 2y(y − x)/(1 − 2x + x²) dy = ((2/3)y³ − xy²) |_{y=x}^{y=1} / (1 − 2x + x²) = (2 − 3x + x³)/(3(1 − x)²).

Perhaps surprisingly, this result can be simplified to

Ŷ_M(x) = x/3 + 2/3.

Problem 9.2.2 Solution
The problem statement tells us that

f_V(v) = { 1/12, −6 ≤ v ≤ 6; 0, otherwise. }

Furthermore, we are also told that R = V + X, where X is a zero-mean Gaussian random variable with variance 3.

(a) The expected value of R is the expected value of V plus the expected value of X. We already know that X has zero expected value, and that V is uniformly distributed between −6 and 6 volts and therefore also has zero expected value. So

E[R] = E[V + X] = E[V] + E[X] = 0.

(b) Because X and V are independent random variables, the variance of R is the sum of the variance of V and the variance of X:

Var[R] = Var[V] + Var[X] = 12 + 3 = 15.

(c) Since E[R] = E[V] = 0,

Cov[V, R] = E[VR] = E[V(V + X)] = E[V²] = Var[V] = 12.

(d) The correlation coefficient of V and R is

ρ_{V,R} = Cov[V, R]/√(Var[V] Var[R]) = Var[V]/√(Var[V] Var[R]) = σ_V/σ_R.

The LMSE estimate of V given R is

V̂(R) = ρ_{V,R} (σ_V/σ_R)(R − E[R]) + E[V] = (σ_V²/σ_R²) R = (12/15) R.

Therefore a* = 12/15 = 4/5 and b* = 0.

(e) The minimum mean square error in the estimate is

e* = Var[V](1 − ρ_{V,R}²) = 12(1 − 12/15) = 12/5.
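A short Monte Carlo run reproduces a* = 4/5 and e* = 12/5. This is only a sketch; the sample size and the seed are arbitrary.

% Monte Carlo check of Problem 9.2.2 (d) and (e)
rng(0); m = 1e6;
V = -6 + 12*rand(m,1);              % uniform on [-6,6], Var[V] = 12
X = sqrt(3)*randn(m,1);             % zero-mean Gaussian, Var[X] = 3
R = V + X;
C = cov(V,R); a = C(1,2)/var(R)     % should be close to 4/5
mean((V - a*R).^2)                  % should be close to 12/5 = 2.4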

Problem 9.2.6 Solution
The linear mean square estimator of X given Y is

X̂_L(Y) = ((E[XY] − µ_X µ_Y)/Var[Y]) (Y − µ_Y) + µ_X.

To find the parameters of this estimator, we calculate the marginal PDFs

f_Y(y) = ∫_0^y 6(y − x) dx = 3y², 0 ≤ y ≤ 1,
f_X(x) = ∫_x^1 6(y − x) dy = 3(1 − 2x + x²), 0 ≤ x ≤ 1,

and zero otherwise. The moments of X and Y are

E[Y] = ∫_0^1 3y³ dy = 3/4,    E[Y²] = ∫_0^1 3y⁴ dy = 3/5,
E[X] = ∫_0^1 3x(1 − 2x + x²) dx = 1/4,    E[X²] = ∫_0^1 3x²(1 − 2x + x²) dx = 1/10.

The correlation between X and Y is

E[XY] = 6 ∫_0^1 ∫_x^1 xy(y − x) dy dx = 1/5.

Putting these pieces together, the optimal linear estimate of X given Y is

X̂_L(Y) = ((1/5 − 3/16)/(3/5 − (3/4)²)) (Y − 3/4) + 1/4 = Y/3.

Note that this agrees with the MMSE estimator X̂_M(Y) = Y/3 found in Problem 9.1.4; when the conditional expectation E[X | Y] is linear in Y, the LMSE and MMSE estimators coincide.

Problem 9.3.2 Solution
From the problem statement we know that R is an exponential random variable with expected value 1/µ. Therefore it has the probability density function

f_R(r) = { µe^{−µr}, r ≥ 0; 0, otherwise. }

It is also known that, given R = r, the number of phone calls arriving at a telephone switch, N, is a Poisson (α = rT) random variable. So we can write the conditional probability mass function of N given R as

P_{N|R}(n|r) = { (rT)^n e^{−rT}/n!, n = 0, 1, ...; 0, otherwise. }

(a) The minimum mean square error estimate of N given R is the conditional expected value of N given R = r, which is given directly in the problem statement as rT:

N̂_M(r) = E[N | R = r] = rT.
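As a quick aside, the moments used in Problem 9.2.6 above can be checked by numerical integration; the final ratio is the slope of X̂_L(Y) and should come out to 1/3. This is only a sketch using base MATLAB quadrature.

% Numerical check of the moments in Problem 9.2.6
fXY = @(x,y) 6*(y-x);                                % joint PDF on 0 <= x <= y <= 1
EX  = integral2(@(x,y) x.*fXY(x,y),0,1,@(x) x,1)     % should be 1/4
EY  = integral2(@(x,y) y.*fXY(x,y),0,1,@(x) x,1)     % should be 3/4
EY2 = integral2(@(x,y) y.^2.*fXY(x,y),0,1,@(x) x,1)  % should be 3/5
EXY = integral2(@(x,y) x.*y.*fXY(x,y),0,1,@(x) x,1)  % should be 1/5
(EXY - EX*EY)/(EY2 - EY^2)                           % slope of the LMSE estimator, 1/3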

(b) The maximum a posteriori estimate of N given R is simply the value of n that maximizes P_{N|R}(n|r). That is,

n̂_MAP(r) = arg max_{n ≥ 0} P_{N|R}(n|r) = arg max_{n ≥ 0} (rT)^n e^{−rT}/n!.

Usually, we set a derivative to zero to solve for the maximizing value. In this case, that technique doesn't work because n is discrete. Since e^{−rT} is a common factor in the maximization, we can define g(n) = (rT)^n/n! so that n̂_MAP = arg max_n g(n). We observe that

g(n) = (rT/n) g(n − 1).

This implies that for n ≤ rT, g(n) ≥ g(n − 1). Hence the maximizing value of n is the largest n such that n ≤ rT. That is, n̂_MAP = ⌊rT⌋.

(c) The maximum likelihood estimate of N given R selects the value of n that maximizes f_{R|N=n}(r), the conditional PDF of R given N. When dealing with situations in which we mix continuous and discrete random variables, it's often helpful to start from first principles. In this case,

f_{R|N}(r|n) dr = P[r < R ≤ r + dr | N = n]
              = P[r < R ≤ r + dr, N = n]/P[N = n]
              = P[N = n | R = r] P[r < R ≤ r + dr]/P[N = n].

In terms of PDFs and PMFs, we have

f_{R|N}(r|n) = P_{N|R}(n|r) f_R(r)/P_N(n).

To find the value of n that maximizes f_{R|N}(r|n), we need to find the denominator P_N(n):

P_N(n) = ∫_0^∞ P_{N|R}(n|r) f_R(r) dr
       = ∫_0^∞ [(rT)^n e^{−rT}/n!] µe^{−µr} dr
       = (µT^n/(n!(µ + T))) ∫_0^∞ r^n (µ + T) e^{−(µ+T)r} dr
       = (µT^n/(n!(µ + T))) E[X^n],

where X is an exponential random variable with expected value 1/(µ + T). There are several ways to derive the nth moment of an exponential random variable, including integration by parts. In Example 6.5, the MGF is used to show that E[X^n] = n!/(µ + T)^n. Hence, for n ≥ 0,

P_N(n) = µT^n/(µ + T)^{n+1}.
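This closed form for P_N(n) is easy to verify by numerical integration; the values µ = 2, T = 5 and n = 3 below are arbitrary test values, not part of the problem.

% Numerical check of P_N(n) = mu*T^n/(mu+T)^(n+1)
mu = 2; T = 5; n = 3;                                                  % arbitrary test values
integral(@(r) (r*T).^n.*exp(-r*T)/factorial(n).*mu.*exp(-mu*r),0,Inf)  % numerical P_N(n)
mu*T^n/(mu+T)^(n+1)                                                    % closed form, should match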

Finally, the conditional PDF of R given N is

f_{R|N}(r|n) = P_{N|R}(n|r) f_R(r)/P_N(n) = [(rT)^n e^{−rT}/n!] µe^{−µr} / [µT^n/(µ + T)^{n+1}] = (µ + T)[(µ + T)r]^n e^{−(µ+T)r}/n!.

The ML estimate of N given R is

n̂_ML(r) = arg max_n f_{R|N}(r|n) = arg max_n (µ + T)[(µ + T)r]^n e^{−(µ+T)r}/n!.

This maximization is exactly the same as in the previous part, except that rT is replaced by (µ + T)r. The maximizing value of n is n̂_ML = ⌊(µ + T)r⌋.

Problem 9.4.2 Solution
From the problem statement, we learn that the vectors X = [X1 X2 X3]' and W = [W1 W2]' have

E[X] = 0,    R_X = [1 3/4 1/2; 3/4 1 3/4; 1/2 3/4 1],
E[W] = 0,    R_W = [0.1 0; 0 0.1].

In addition,

Y = [Y1 Y2]' = AX + W = [1 1 0; 0 1 1] X + W.

(a) Since E[Y] = AE[X] = 0, we can apply Theorem 9.7(a), which states that the minimum mean square error estimate of X1 is X̂1(Y) = â'Y where â = R_Y^{-1} R_{YX1}. First we find R_Y:

R_Y = E[YY'] = E[(AX + W)(AX + W)'] = E[(AX + W)(X'A' + W')] = E[AXX'A'] + E[WX'A'] + E[AXW'] + E[WW'].

Since X and W are independent, E[WX'] = 0 and E[XW'] = 0. This implies

R_Y = A E[XX'] A' + E[WW'] = A R_X A' + R_W = [1 1 0; 0 1 1][1 3/4 1/2; 3/4 1 3/4; 1/2 3/4 1][1 0; 1 1; 0 1] + [0.1 0; 0 0.1] = [3.6 3; 3 3.6].

Once again, independence of W and X1 yields

R_{YX1} = E[YX1] = E[(AX + W)X1] = A E[XX1].

This implies

R_{YX1} = A [E[X1²]; E[X2 X1]; E[X3 X1]] = A [R_X(1,1); R_X(2,1); R_X(3,1)] = [1 1 0; 0 1 1][1; 3/4; 1/2] = [7/4; 5/4].

Putting these facts together, we find that

â = R_Y^{-1} R_{YX1} = [10/11 −25/33; −25/33 10/11][7/4; 5/4] = [85/132; −25/132].

Thus the linear MMSE estimator of X1 given Y is

X̂1(Y) = â'Y = (85/132) Y1 − (25/132) Y2 = 0.6439 Y1 − 0.1894 Y2.

(b) By Theorem 9.7(c), the mean squared error of the optimal estimator is

e*_L = Var[X1] − â'R_{YX1} = R_X(1,1) − R_{YX1}' R_Y^{-1} R_{YX1} = 1 − [7/4 5/4][10/11 −25/33; −25/33 10/11][7/4; 5/4] = 29/264 ≈ 0.1098.

In Problem 9.4.1, we solved essentially the same problem, but the observations Y were not subjected to the additive noise W. In comparing the estimators, we see that the additive noise perturbs the estimator somewhat, but not dramatically, because the correlation structure of X and the mapping A from X to Y remain unchanged. On the other hand, in the noiseless case the resulting mean square error was about half as much: 3/52 ≈ 0.0577 versus 0.1098.

(c) We can estimate random variable X1 based on the observation of random variable Y1 using Theorem 9.4. Note that Theorem 9.4 is the special case of Theorem 9.8 in which the observation is a scalar rather than a random vector. In any case, from Theorem 9.4, the optimum linear estimate is X̂1(Y1) = a*Y1 + b*, where

a* = Cov[X1, Y1]/Var[Y1],    b* = µ_{X1} − a* µ_{Y1}.

Since E[Xi] = µ_{Xi} = 0 and Y1 = X1 + X2 + W1, we see that

µ_{Y1} = E[Y1] = E[X1] + E[X2] + E[W1] = 0.

These facts, along with independence of X1 and W1, imply

Cov[X1, Y1] = E[X1 Y1] = E[X1(X1 + X2 + W1)] = R_X(1,1) + R_X(1,2) = 7/4.

In addition, using R_Y from part (a), we see that

Var[Y1] = E[Y1²] = R_Y(1,1) = 3.6.

Thus

a* = Cov[X1, Y1]/Var[Y1] = (7/4)/3.6 = 35/72,    b* = µ_{X1} − a* µ_{Y1} = 0,

and the optimum linear estimate of X1 given Y1 is

X̂1(Y1) = (35/72) Y1.

From Theorem 9.4(b), the mean square error of this estimator is

e*_L = σ²_{X1}(1 − ρ²_{X1,Y1}).

Since X1 and Y1 have zero expected value, σ²_{X1} = R_X(1,1) = 1 and σ²_{Y1} = R_Y(1,1) = 3.6. Also, since Cov[X1, Y1] = 7/4, we see that

ρ_{X1,Y1} = Cov[X1, Y1]/(σ_{X1} σ_{Y1}) = (7/4)/√3.6 ≈ 0.9223.

Thus e*_L = 1 − (7/4)²/3.6 ≈ 0.1493. As we would expect, the estimate of X1 based on just Y1 has larger mean square error than the estimate based on both Y1 and Y2.

Problem 9.4.6 Solution
For this problem, let Y = [X1 X2 ··· X_{n−1}]' and let X = X_n. Since E[Y] = 0 and E[X] = 0, Theorem 9.7(a) tells us that the minimum mean square linear estimate of X given Y is X̂_n(Y) = â'Y, where â = R_Y^{-1} R_{YX}. This implies that â is the solution to

R_Y â = R_{YX}.

Note that R_Y = E[YY'] is the (n − 1) × (n − 1) symmetric Toeplitz matrix with first row [1 c c² ··· c^{n−2}], and

R_{YX} = E[[X1; X2; ···; X_{n−1}] X_n] = [c^{n−1}; c^{n−2}; ···; c].

We see that the last column of cR_Y equals R_{YX}. Equivalently, if â = [0 ··· 0 c]', then R_Y â = R_{YX}. It follows that the optimal linear estimator of X_n given Y is

X̂_n(Y) = â'Y = cX_{n−1},

which completes the proof of the claim. The mean square error of this estimate is

e*_L = E[(X_n − cX_{n−1})²] = R_X(n, n) − cR_X(n, n − 1) − cR_X(n − 1, n) + c²R_X(n − 1, n − 1) = 1 − 2c² + c² = 1 − c².

When c is close to 1, X_{n−1} and X_n are highly correlated and the estimation error will be small.

Comment: We will see in Chapter 11 that correlation matrices with this structure arise frequently in the study of wide sense stationary random sequences. In fact, if you read ahead, you will find that the claim we just proved is essentially the same as that made in Theorem 11.1.
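Both of the last two problems come down to solving R_Y â = R_{YX} for the estimator coefficients, which makes them easy to check numerically. The sketch below reproduces the numbers of Problem 9.4.2(a) and (b) and spot-checks the claim of Problem 9.4.6; the values n = 6 and c = 0.8 in the second check are arbitrary.

% Problem 9.4.2: LMSE coefficients and error for estimating X1 from Y = AX + W
RX = [1 3/4 1/2; 3/4 1 3/4; 1/2 3/4 1];
A  = [1 1 0; 0 1 1];
RY = A*RX*A' + 0.1*eye(2);                 % should be [3.6 3; 3 3.6]
RYX1 = A*RX(:,1);                          % should be [7/4; 5/4]
ahat = RY\RYX1                             % [85/132; -25/132] = [0.6439; -0.1894]
RX(1,1) - ahat'*RYX1                       % mean square error 29/264 = 0.1098
% Problem 9.4.6: with R_X(i,j) = c^|i-j|, the estimator of Xn is c*X(n-1)
n = 6; c = 0.8;                            % arbitrary test values
R = toeplitz(c.^(0:n-1));
a = R(1:n-1,1:n-1)\R(1:n-1,n)              % should be [0 ... 0 c]'
R(n,n) - a'*R(1:n-1,n)                     % mean square error, should be 1 - c^2 = 0.36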

Problem 9.4.7 Solution

(a) In this case, we use the observation Y to estimate each X_i. Since E[X_i] = 0,

E[Y] = Σ_{j=1}^k E[X_j] √p_j S_j + E[N] = 0.

Thus, Theorem 9.7(a) tells us that the MMSE linear estimate of X_i is X̂_i(Y) = â'Y where â = R_Y^{-1} R_{YX_i}. First we note that

R_{YX_i} = E[YX_i] = E[(Σ_{j=1}^k X_j √p_j S_j + N) X_i].

Since N and X_i are independent, E[NX_i] = E[N]E[X_i] = 0. Because X_i and X_j are independent for i ≠ j, E[X_i X_j] = E[X_i]E[X_j] = 0 for i ≠ j. In addition, E[X_i²] = 1, and it follows that

R_{YX_i} = Σ_{j=1}^k E[X_j X_i] √p_j S_j + E[NX_i] = √p_i S_i.

For the same reasons,

R_Y = E[YY'] = E[(Σ_{j=1}^k √p_j X_j S_j + N)(Σ_{l=1}^k √p_l X_l S_l + N)']
    = Σ_{j=1}^k Σ_{l=1}^k √(p_j p_l) E[X_j X_l] S_j S_l' + Σ_{j=1}^k √p_j S_j E[X_j N'] + Σ_{l=1}^k √p_l E[X_l N] S_l' + E[NN'],

where the two middle sums vanish because the X_j and N are independent and zero mean. Since E[X_j X_l] = 0 for j ≠ l and E[X_j²] = 1,

R_Y = Σ_{j=1}^k p_j S_j S_j' + σ²I.

Now we use a linear algebra identity. For a matrix S with columns S_1, S_2, ..., S_k, and a diagonal matrix P = diag[p_1, p_2, ..., p_k],

Σ_{j=1}^k p_j S_j S_j' = SPS'.

Although this identity may be unfamiliar, it is handy in manipulating correlation matrices. (Also, if it is unfamiliar, you may wish to work out an example with k = 2 vectors of length 2 or 3.) Thus,

R_Y = SPS' + σ²I,

and

â = R_Y^{-1} R_{YX_i} = (SPS' + σ²I)^{-1} √p_i S_i.

Recall that if C is symmetric, then C^{-1} is also symmetric. This implies that the MMSE estimate of X_i given Y is

X̂_i(Y) = â'Y = √p_i S_i'(SPS' + σ²I)^{-1} Y.

(b) We observe that

V = (SPS' + σ²I)^{-1} Y

is a vector that does not depend on which bit X_i we want to estimate. Since X̂_i = √p_i S_i'V, we can form the vector of estimates

X̂ = [X̂_1; ···; X̂_k] = [√p_1 S_1'V; ···; √p_k S_k'V] = diag[√p_1, ···, √p_k][S_1'; ···; S_k'] V = P^{1/2} S'V = P^{1/2} S'(SPS' + σ²I)^{-1} Y.

Problem 9.5.3 Solution
The solution to this problem is almost the same as the solution to Example 9.10, except perhaps the MATLAB code is somewhat simpler. As in the example, let W^(n), X^(n), and Y^(n) denote the vectors consisting of the first n components of W, X, and Y. Just as in Examples 9.8 and 9.10, independence of X^(n) and W^(n) implies that the correlation matrix of Y^(n) is

R_{Y^(n)} = E[(X^(n) + W^(n))(X^(n) + W^(n))'] = R_{X^(n)} + R_{W^(n)}.

Note that R_{X^(n)} and R_{W^(n)} are the n × n upper-left submatrices of R_X and R_W. In addition,

R_{Y^(n)X} = E[[X1 + W1; ···; Xn + Wn] X1] = [r_0; r_1; ···; r_{n−1}].

Compared to the solution of Example 9.10, the only difference is the reversal of the vector R_{Y^(n)X}. The optimal filter based on the first n observations is â^(n) = R_{Y^(n)}^{-1} R_{Y^(n)X}, and the mean square error is

e*_L = Var[X1] − (â^(n))' R_{Y^(n)X}.

The following program, mse953.m, simply calculates the mean square error e*_L. The input is the vector r corresponding to the vector [r_0 ··· r_20]', which holds the first row of the Toeplitz correlation matrix R_X. Note that R_{X^(n)} is the Toeplitz matrix whose first row is the first n elements of r.

function e=mse953(r)
N=length(r);
e=[];
for n=1:N,
  RYX=r(1:n)';
  RY=toeplitz(r(1:n))+0.1*eye(n);
  a=RY\RYX;
  en=r(1)-(a')*RYX;
  e=[e;en];
end
plot(1:N,e);

To plot the mean square error as a function of the number of observations, n, we generate the vector r and then run mse953(r). For the requested cases (a) and (b), the necessary MATLAB commands are

(a) ra=sinc(0.1*pi*(0:20)); mse953(ra)
(b) rb=cos(0.5*pi*(0:20)); mse953(rb)

[Figure: two plots of the mean square estimation error versus n, for case (a) on the left and case (b) on the right; in both, the MSE axis runs from 0 to 0.1 and the n axis from 0 to 25.]

In comparing the results of cases (a) and (b), we see that the mean square estimation error depends strongly on the correlation structure given by r_{|i−j|}. For case (a), Y1 is a noisy observation of X1 and is highly correlated with X1. The MSE at n = 1 is just the variance of W1. Additional samples of Y_n mostly help to average the additive noise. Also, samples X_n for n ≥ 10 have very little correlation with X1, so for n ≥ 10 the samples of Y_n result in almost no improvement in the estimate of X1. In case (b), Y1 = X1 + W1, just as in case (a), is simply a noisy copy of X1, and the estimation error is due to the variance of W1. On the other hand, for case (b), X5, X9, X13, X17, and X21 are completely correlated with X1, and other samples also have significant correlation with X1. As a result, the MSE continues to go down with increasing n.
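As a small closing check on the n = 1 claim above: for a single observation, the program computes e(1) = r(1) − r(1)²/(r(1) + 0.1), and since r(1) = R_X(1,1) = 1 in both cases, this gives 1/11 ≈ 0.0909, which is approximately Var[W1] = 0.1. A two-line sketch:

% MSE based on Y1 alone, as computed by mse953 with n = 1
r1 = 1; sigma2 = 0.1;                  % r(1) = R_X(1,1) = 1, noise variance 0.1
r1 - r1^2/(r1 + sigma2)                % 1/11 = 0.0909, roughly Var[W1]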