EE385 Class Notes 11/13/01 John Stensby

Chapter 5: Moments and Conditional Statistics

Let X denote a random variable, and z = h(x) a function of x. Consider the transformation Z = h(X). We saw that we could express

E[Z] = E[h(X)] = \int_{-\infty}^{\infty} h(x) f_X(x)\, dx,    (5-1)

a method of calculating E[Z] that does not require knowledge of f_Z(z). It is possible to extend this method to transformations of two random variables. Given random variables X and Y, and function z = g(x,y), form the new random variable

Z = g(X, Y).    (5-2)

f_Z(z) denotes the density of Z. The expected value of Z is E[Z] = \int z f_Z(z)\, dz; however, this formula requires knowledge of f_Z, a density which may not be available. Instead, we can use

E[Z] = E[g(X,Y)] = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} g(x,y) f_{XY}(x,y)\, dx\, dy    (5-3)

to calculate E[Z] without having to obtain f_Z. This is a very useful result.

Covariance

The covariance C_{XY} of random variables X and Y is defined as

C_{XY} = E[(X - \eta_X)(Y - \eta_Y)] = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} (x - \eta_X)(y - \eta_Y) f_{XY}(x,y)\, dx\, dy,    (5-4)

where \eta_X = E[X] and \eta_Y = E[Y]. Note that C_{XY} can be expressed as

C_{XY} = E[(X - \eta_X)(Y - \eta_Y)] = E[XY - \eta_X Y - \eta_Y X + \eta_X \eta_Y] = E[XY] - \eta_X \eta_Y.    (5-5)

Updates at http://www.ece.uah.edu/courses/ee385/
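Equation (5-3) can be spot-checked numerically: average g over samples of (X, Y) and compare with the known E[Z]. The sketch below is our own illustration (not from the notes); it assumes X and Y independent standard normal and g(x,y) = x^2 + y^2, so that Z is chi-square with 2 degrees of freedom and E[Z] = 2 exactly.

```python
import numpy as np

# Monte Carlo check of (5-3): estimate E[Z] for Z = g(X, Y) directly from
# samples of (X, Y), with no need for the density f_Z.
# Illustrative assumption: X, Y independent N(0,1), g(x,y) = x^2 + y^2,
# so Z ~ chi-square(2) and E[Z] = 2.
rng = np.random.default_rng(0)
x = rng.standard_normal(200_000)
y = rng.standard_normal(200_000)
z_mean = np.mean(x**2 + y**2)   # sample average approximates the double integral
print(z_mean)                   # should be close to 2
```

The sample average converges to the double integral in (5-3) as the number of samples grows; no expression for f_Z is ever needed.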
Correlation Coefficient

The correlation coefficient for random variables X and Y is defined as

r_{XY} = \frac{C_{XY}}{\sigma_X \sigma_Y}.    (5-6)

r_{XY} is a measure of the statistical similarity between X and Y.

Theorem 5-1: The correlation coefficient must lie in the range -1 \le r_{XY} \le +1.

Proof: Let \alpha denote any real number. Consider the parabolic equation

g(\alpha) \equiv E[\{\alpha(X - \eta_X) + (Y - \eta_Y)\}^2] = \alpha^2 \sigma_X^2 + 2\alpha C_{XY} + \sigma_Y^2 \ge 0.    (5-7)

Note that g(\alpha) \ge 0 for all \alpha; g is a parabola that opens upward. As a first case, suppose that there exists a value \alpha_0 for which g(\alpha_0) = 0 (see Fig. 5-1). Then \alpha_0 is a repeated root of g(\alpha) = 0. In the quadratic formula used to determine the roots of (5-7), the discriminant must be zero. That is, (2C_{XY})^2 - 4\sigma_X^2\sigma_Y^2 = 0, so that

r_{XY} = C_{XY}/(\sigma_X \sigma_Y) = \pm 1.

[Figure 5-1: Case for which the discriminant is zero; g(\alpha) touches the \alpha-axis at the repeated root \alpha_0.]

Now, consider the case g(\alpha) > 0 for all \alpha; g has no real roots (see Fig. 5-2). This means that the discriminant must be negative (so the roots are complex valued). Hence, (2C_{XY})^2 - 4\sigma_X^2\sigma_Y^2 < 0, so that

[Figure 5-2: Case for which the discriminant is negative; g(\alpha) lies entirely above the \alpha-axis.]
|r_{XY}| = \frac{|C_{XY}|}{\sigma_X \sigma_Y} < 1.    (5-8)

Hence, in either case, -1 \le r_{XY} \le +1 as claimed.

Suppose an experiment yields values for X and Y. Consider that we perform the experiment many times, and plot the outcomes x and y on a two-dimensional plane. Some hypothetical results follow.

[Figure 5-3: Samples of X and Y with varying degrees of correlation. Three scatter plots: correlation coefficient r_{XY} near -1, correlation coefficient r_{XY} very small, and correlation coefficient r_{XY} near +1.]
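The scatter-plot behavior in Fig. 5-3 can be reproduced with a quick numerical experiment. The construction below is an illustrative choice of ours (not from the notes): with Y = aX + W and W independent of X, the covariance is C_{XY} = a\sigma_X^2, so r_{XY} = a\sigma_X/\sigma_Y is known exactly and the sample estimate can be checked against it.

```python
import numpy as np

# Sample estimate of r_XY = C_XY / (sigma_X * sigma_Y), as in Fig. 5-3.
# Illustrative model: Y = 2X + W with X, W independent N(0,1), so
# sigma_X = 1, sigma_Y = sqrt(5), C_XY = 2, and r_XY = 2/sqrt(5) ~ 0.894.
rng = np.random.default_rng(1)
n = 100_000
x = rng.standard_normal(n)
w = rng.standard_normal(n)
y = 2.0 * x + w
cov = np.mean((x - x.mean()) * (y - y.mean()))   # sample covariance C_XY
r = cov / (x.std() * y.std())                    # sample correlation coefficient
print(r)                                         # theory: 2/sqrt(5) ~ 0.894
```

Replacing the coefficient 2.0 with a negative value, or shrinking it toward zero while enlarging the noise, reproduces the "near -1" and "very small" panels of Fig. 5-3.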
Notes:
1. If r_{XY} = \pm 1, then there exist constants a and b such that Y = aX + b in the mean-square sense (i.e., E[\{Y - (aX + b)\}^2] = 0).
2. The addition of a constant to a random variable does not change the variance of the random variable. That is, \sigma_X^2 = VAR[X] = VAR[X + \alpha] for any \alpha.
3. Multiplication by a constant scales the variance of a random variable by the square of the constant. If VAR[X] = \sigma_X^2, then VAR[\alpha X] = \alpha^2 \sigma_X^2.
4. Adding constants to random variables X and Y does not change the covariance or correlation of these random variables. That is, X + \alpha and Y + \beta have the same covariance and correlation coefficient as X and Y.

Correlation Coefficient for Gaussian Random Variables

Let zero-mean X and Y be jointly Gaussian with joint density

f(x,y) = \frac{1}{2\pi\sigma_X\sigma_Y\sqrt{1 - r^2}} \exp\left\{ -\frac{1}{2(1 - r^2)}\left[ \frac{x^2}{\sigma_X^2} - 2r\frac{xy}{\sigma_X\sigma_Y} + \frac{y^2}{\sigma_Y^2} \right] \right\}.    (5-9)

We are interested in the correlation coefficient r_{XY}; we claim that r_{XY} = r, where r is just a parameter in the joint density (from the statements given above, r is the correlation coefficient for the nonzero-mean case as well). First, note that C_{XY} = E[XY], since the means are zero. Now, show r_{XY} = r by establishing E[XY] = r\sigma_X\sigma_Y, so that r_{XY} = C_{XY}/\sigma_X\sigma_Y = E[XY]/\sigma_X\sigma_Y = r. In the square brackets of f is an expression that is quadratic in x/\sigma_X. Complete the square for this quadratic form to obtain

\frac{1}{1 - r^2}\left[ \frac{x^2}{\sigma_X^2} - 2r\frac{xy}{\sigma_X\sigma_Y} + \frac{y^2}{\sigma_Y^2} \right] = \frac{1}{1 - r^2}\left( \frac{x}{\sigma_X} - r\frac{y}{\sigma_Y} \right)^2 + \frac{y^2}{\sigma_Y^2}.    (5-10)

Use this new quadratic form to obtain
E[XY] = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} xy\, f(x,y)\, dx\, dy

      = \int_{-\infty}^{\infty} \frac{y\, e^{-y^2/2\sigma_Y^2}}{\sigma_Y\sqrt{2\pi}} \left[ \int_{-\infty}^{\infty} \frac{x}{\sigma_X\sqrt{2\pi(1 - r^2)}} \exp\left( -\frac{(x - r\frac{\sigma_X}{\sigma_Y}y)^2}{2\sigma_X^2(1 - r^2)} \right) dx \right] dy.    (5-11)

Note that the inner integral is an expected value calculation over a normal density with mean r(\sigma_X/\sigma_Y)y; the inner integral evaluates to r(\sigma_X/\sigma_Y)y. Hence,

E[XY] = r\frac{\sigma_X}{\sigma_Y} \int_{-\infty}^{\infty} \frac{y^2\, e^{-y^2/2\sigma_Y^2}}{\sigma_Y\sqrt{2\pi}}\, dy = r\frac{\sigma_X}{\sigma_Y}\sigma_Y^2 = r\sigma_X\sigma_Y,    (5-12)

as desired. From this, we conclude that r_{XY} = r.

Uncorrelatedness and Orthogonality

Two random variables are uncorrelated if their covariance is zero. That is, they are uncorrelated if

C_{XY} = r_{XY} = 0.    (5-13)

Since C_{XY} = E[XY] - E[X]E[Y], Equation (5-13) is equivalent to the requirement that E[XY] = E[X]E[Y]. Two random variables are called orthogonal if

E[XY] = 0.    (5-14)
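A quick numerical illustration helps separate these two definitions. The pair below is our own example (not from the notes): X standard normal and Y = X^2 - 1 are both zero mean, and E[XY] = E[X^3] - E[X] = 0, so X and Y are simultaneously orthogonal, per (5-14), and uncorrelated, per (5-13), even though Y is completely determined by X.

```python
import numpy as np

# Illustrative pair (our example): X ~ N(0,1), Y = X^2 - 1.
# E[X] = E[Y] = 0 and E[XY] = E[X^3] - E[X] = 0, so X and Y are both
# orthogonal (5-14) and uncorrelated (5-13), yet clearly dependent.
rng = np.random.default_rng(2)
x = rng.standard_normal(500_000)
y = x**2 - 1.0
exy = np.mean(x * y)                 # estimate of E[XY], near 0
cxy = exy - x.mean() * y.mean()      # estimate of C_XY = E[XY] - E[X]E[Y]
print(exy, cxy)
```

Because Y is a deterministic function of X, this pair is certainly not independent, which previews the point made in the note following Theorem 5-2.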
Theorem 5-2: If random variables X and Y are independent, then they are uncorrelated (independence \Rightarrow uncorrelated).

Proof: Let X and Y be independent. Then

E[XY] = \int\int xy\, f_{XY}(x,y)\, dx\, dy = \int\int xy\, f_X(x) f_Y(y)\, dx\, dy = E[X]\, E[Y].    (5-15)

Therefore, X and Y are uncorrelated.

Note: The converse is not true in general. If X and Y are uncorrelated, then they are not necessarily independent. This general rule has an exception for Gaussian random variables, a special case.

Theorem 5-3: For Gaussian random variables, uncorrelatedness is equivalent to independence (independence \Leftrightarrow uncorrelatedness for Gaussian random variables).

Proof: We have only to show that uncorrelatedness \Rightarrow independence. But this is easy. Let the correlation coefficient r = 0 (so that the two random variables are uncorrelated) in the joint Gaussian density. Note that the joint density factors into a product of marginal densities.

Joint Moments

Joint moments of X and Y can be computed. These are defined as

m_{kr} = E[X^k Y^r] = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} x^k y^r f_{XY}(x,y)\, dx\, dy.    (5-16)

Joint central moments are defined as

\mu_{kr} = E[(X - \eta_X)^k (Y - \eta_Y)^r] = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} (x - \eta_X)^k (y - \eta_Y)^r f_{XY}(x,y)\, dx\, dy.    (5-17)

Conditional Distributions/Densities

Let M denote an event with P(M) \ne 0, and let X and Y be random variables. Recall that
F_Y(y \mid M) = P[Y \le y \mid M] = \frac{P[Y \le y, M]}{P[M]}.    (5-18)

Now, event M can be defined in terms of the random variable X.

Example (5-1): Define M = [X \le x] and write

F_Y(y \mid X \le x) = \frac{P[X \le x, Y \le y]}{P[X \le x]} = \frac{F_{XY}(x,y)}{F_X(x)}    (5-19)

f_Y(y \mid X \le x) = \frac{\partial F_{XY}(x,y)/\partial y}{F_X(x)}.    (5-20)

Example (5-2): Define M = [x_1 < X \le x_2] and write

F_Y(y \mid x_1 < X \le x_2) = \frac{P[x_1 < X \le x_2, Y \le y]}{P[x_1 < X \le x_2]} = \frac{F_{XY}(x_2, y) - F_{XY}(x_1, y)}{F_X(x_2) - F_X(x_1)}.    (5-21)

Example (5-3): Define M = [X = x], where f_X(x) \ne 0. The quantity P[Y \le y, M]/P[M] can be indeterminate (i.e., 0/0) in this case (certainly, this is true for continuous X), so that we must use

F_Y(y \mid X = x) = \lim_{\Delta x \to 0^+} F_Y(y \mid x - \Delta x < X \le x).    (5-22)

From the previous example, this result can be written as

F_Y(y \mid X = x) = \lim_{\Delta x \to 0^+} \frac{F_{XY}(x,y) - F_{XY}(x - \Delta x, y)}{F_X(x) - F_X(x - \Delta x)} = \lim_{\Delta x \to 0^+} \frac{[F_{XY}(x,y) - F_{XY}(x - \Delta x, y)]/\Delta x}{[F_X(x) - F_X(x - \Delta x)]/\Delta x}

                   = \frac{\partial F_{XY}(x,y)/\partial x}{dF_X(x)/dx}.    (5-23)
From this last result, we conclude that the conditional density can be expressed as

f_Y(y \mid X = x) = \frac{\partial}{\partial y} F_Y(y \mid X = x) = \frac{\partial^2 F_{XY}(x,y)/\partial x\, \partial y}{dF_X(x)/dx},    (5-24)

which yields

f_Y(y \mid X = x) = \frac{f_{XY}(x,y)}{f_X(x)}.    (5-25)

Use the abbreviated notation f_Y(y \mid x) \equiv f_Y(y \mid X = x), Equation (5-25) and symmetry to write

f_{XY}(x,y) = f_Y(y \mid x) f_X(x) = f_X(x \mid y) f_Y(y).    (5-26)

Use this form of the joint density with the formula before last to write

f_Y(y \mid x) = \frac{f_X(x \mid y) f_Y(y)}{f_X(x)},    (5-27)

a result that is called Bayes' theorem for densities.

Conditional Expectations

Let M denote an event, g(x) a function of x, and X a random variable. Then, the conditional expectation of g(X) given M is defined as

E[g(X) \mid M] = \int_{-\infty}^{\infty} g(x) f_X(x \mid M)\, dx.    (5-28)
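Bayes' theorem for densities can be checked by hand on a joint density simple enough to factor both ways. The example below is our own (not from the notes): X uniform on (0,1) and, given X = x, Y uniform on (0, x). Then f_{XY}(x,y) = 1/x on 0 < y < x < 1, the marginal is f_Y(y) = -ln(y), and the conditional f_X(x | y) obtained from Bayes' rule must match the hand-derived (1/x)/(-ln y).

```python
import math

# Spot check of Bayes' theorem for densities on a hand-factorable joint
# (our illustrative example): X ~ Uniform(0,1), Y|X=x ~ Uniform(0,x).
def f_y_given_x(y, x):   # conditional density of Y given X = x
    return 1.0 / x if 0.0 < y < x else 0.0

def f_x(x):              # marginal density of X
    return 1.0 if 0.0 < x < 1.0 else 0.0

def f_y(y):              # marginal of Y: integral of 1/x over x in (y, 1)
    return -math.log(y) if 0.0 < y < 1.0 else 0.0

x, y = 0.8, 0.3
lhs = f_y_given_x(y, x) * f_x(x) / f_y(y)   # Bayes' rule: f_X(x | y)
rhs = (1.0 / x) / (-math.log(y))            # hand-derived f_X(x | y)

# Also confirm the marginal f_Y(y) by midpoint-rule integration of the joint.
dx = 1e-4
marg = sum(f_y_given_x(y, (k + 0.5) * dx) * f_x((k + 0.5) * dx) * dx
           for k in range(10_000))
print(lhs, rhs, marg, f_y(y))
```

The numerically integrated marginal agrees with -ln(0.3), confirming that the densities used on both sides of Bayes' rule are consistent.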
For example, let X and Y denote random variables, and write the conditional mean of X given Y = y as

\eta_{X|Y} \equiv E[X \mid Y = y] \equiv E[X \mid y] = \int_{-\infty}^{\infty} x f_X(x \mid y)\, dx.    (5-29)

Higher-order conditional moments can be defined in a similar manner. For example, the conditional variance is written as

\sigma_{X|Y}^2 \equiv E[(X - \eta_{X|Y})^2 \mid Y = y] \equiv E[(X - \eta_{X|Y})^2 \mid y] = \int_{-\infty}^{\infty} (x - \eta_{X|Y})^2 f_X(x \mid y)\, dx.    (5-30)

Remember that \eta_{X|Y} and \sigma_{X|Y}^2 are functions of algebraic variable y, in general.

Example (5-4): Let X and Y be zero-mean, jointly Gaussian random variables with

f_{XY}(x,y) = \frac{1}{2\pi\sigma_X\sigma_Y\sqrt{1 - r^2}} \exp\left\{ -\frac{1}{2(1 - r^2)}\left[ \frac{x^2}{\sigma_X^2} - 2r\frac{xy}{\sigma_X\sigma_Y} + \frac{y^2}{\sigma_Y^2} \right] \right\}.    (5-31)

Find f_X(x \mid y), \eta_{X|Y} and \sigma_{X|Y}^2. We will accomplish this by factoring f_{XY} into the product f_X(x \mid y) f_Y(y). By completing the square on the quadratic, we can write

\frac{1}{1 - r^2}\left[ \frac{x^2}{\sigma_X^2} - 2r\frac{xy}{\sigma_X\sigma_Y} + \frac{y^2}{\sigma_Y^2} \right] = \frac{1}{1 - r^2}\left( \frac{x}{\sigma_X} - r\frac{y}{\sigma_Y} \right)^2 + \frac{y^2}{\sigma_Y^2},    (5-32)

so that
f_{XY}(x,y) = \underbrace{\frac{1}{\sigma_X\sqrt{2\pi(1 - r^2)}} \exp\left( -\frac{(x - r\frac{\sigma_X}{\sigma_Y}y)^2}{2\sigma_X^2(1 - r^2)} \right)}_{f_X(x|y)} \; \underbrace{\frac{1}{\sigma_Y\sqrt{2\pi}} \exp\left( -\frac{y^2}{2\sigma_Y^2} \right)}_{f_Y(y)}.    (5-33)

From this factorization, we observe that

f_X(x \mid y) = \frac{1}{\sigma_X\sqrt{2\pi(1 - r^2)}} \exp\left( -\frac{(x - r\frac{\sigma_X}{\sigma_Y}y)^2}{2\sigma_X^2(1 - r^2)} \right).    (5-34)

Note that this conditional density is Gaussian! This conclusion leads to

\eta_{X|Y} = r\frac{\sigma_X}{\sigma_Y}\, y, \qquad \sigma_{X|Y}^2 = \sigma_X^2(1 - r^2)    (5-35)

as the conditional mean and variance, respectively.

The variance \sigma_X^2 of a random variable X is a measure of uncertainty in the value of X. If \sigma_X^2 is small, it is highly likely to find X near its mean. The conditional variance \sigma_{X|Y}^2 is a measure of uncertainty in the value of X given that Y = y. From (5-35), note that \sigma_{X|Y}^2 \to 0 as r \to \pm 1. As perfect correlation is approached, it becomes more likely to find X near its conditional mean \eta_{X|Y}.

Example (5-5): Generalize the previous example to the non-zero mean case. Consider X and Y the same as above except for E[X] = \eta_X and E[Y] = \eta_Y. Now, define zero-mean Gaussian variables X_d and Y_d so that X = X_d + \eta_X, Y = Y_d + \eta_Y and
f_{XY}(x,y) = f_{X_d Y_d}(x - \eta_X,\, y - \eta_Y)

            = \underbrace{\frac{1}{\sigma_X\sqrt{2\pi(1 - r^2)}} \exp\left( -\frac{[x - \eta_X - r\frac{\sigma_X}{\sigma_Y}(y - \eta_Y)]^2}{2\sigma_X^2(1 - r^2)} \right)}_{f_X(x|y)} \; \underbrace{\frac{1}{\sigma_Y\sqrt{2\pi}} \exp\left( -\frac{(y - \eta_Y)^2}{2\sigma_Y^2} \right)}_{f_Y(y)}.    (5-36)

By Bayes' rule for density functions, it is easily seen that

f_X(x \mid y) = \frac{1}{\sigma_X\sqrt{2\pi(1 - r^2)}} \exp\left( -\frac{[x - \eta_X - r\frac{\sigma_X}{\sigma_Y}(y - \eta_Y)]^2}{2\sigma_X^2(1 - r^2)} \right).    (5-37)

Hence, the conditional mean and variance are

\eta_{X|Y} = \eta_X + r\frac{\sigma_X}{\sigma_Y}(y - \eta_Y), \qquad \sigma_{X|Y}^2 = \sigma_X^2(1 - r^2),    (5-38)

respectively, for the case where X and Y are themselves nonzero mean. Note that (5-38) follows directly from (5-35) since

\eta_{X|Y} = E[X \mid Y = y] = E[X_d + \eta_X \mid Y_d + \eta_Y = y] = E[X_d \mid Y_d = y - \eta_Y] + \eta_X

           = r\frac{\sigma_X}{\sigma_Y}(y - \eta_Y) + \eta_X.    (5-39)
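The conditional mean and variance in (5-38) can be verified empirically: simulate jointly Gaussian pairs, keep only the samples whose Y value falls in a thin slice around a chosen y, and compare the conditional sample statistics of X with the formulas. All numerical settings below are illustrative choices of ours.

```python
import numpy as np

# Empirical check of (5-38).  Illustrative parameters:
# eta_X = 1, eta_Y = -2, sigma_X = 2, sigma_Y = 1.5, r = 0.7, slice at y0 = -1.
rng = np.random.default_rng(3)
eta_x, eta_y, sx, sy, r = 1.0, -2.0, 2.0, 1.5, 0.7
n = 2_000_000
u = rng.standard_normal(n)
v = rng.standard_normal(n)
ydat = eta_y + sy * u
xdat = eta_x + sx * (r * u + np.sqrt(1 - r**2) * v)   # gives correlation r with Y

y0 = -1.0
sel = np.abs(ydat - y0) < 0.02          # thin slice approximating Y = y0
cond_mean = xdat[sel].mean()
cond_var = xdat[sel].var()
theory_mean = eta_x + r * (sx / sy) * (y0 - eta_y)    # = 1.933...
theory_var = sx**2 * (1 - r**2)                       # = 2.04
print(cond_mean, theory_mean)
print(cond_var, theory_var)
```

Note that the conditional sample variance matches \sigma_X^2(1 - r^2) regardless of which slice y0 is chosen, while the conditional mean moves linearly with y0, exactly as (5-38) predicts.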
Conditional Expected Value as a Transformation of a Random Variable

Let X and Y denote random variables. The conditional mean of random variable Y given that X = x is an "ordinary" function \varphi(x) of x. That is,

\varphi(x) = E[Y \mid X = x] = E[Y \mid x] = \int_{-\infty}^{\infty} y f_Y(y \mid x)\, dy.    (5-40)

In general, function \varphi(x) can be plotted, integrated, differentiated, etc.; it is an "ordinary" function of x. For example, as we have just seen, if X and Y are jointly Gaussian, we know that

\varphi(x) = E[Y \mid X = x] = \eta_Y + r\frac{\sigma_Y}{\sigma_X}(x - \eta_X),    (5-41)

a simple linear function of x.

Use \varphi(x) to transform random variable X. Now, \varphi(X) = E[Y \mid X] is a random variable. Be very careful with the notation: random variable E[Y \mid X] is different from function E[Y \mid X = x] \equiv E[Y \mid x] (note that E[Y \mid X = x] and E[Y \mid x] are used interchangeably). Find the expected value E[\varphi(X)] = E[E[Y \mid X]] of random variable \varphi(X). In the usual way, we start this task by writing

E[E[Y \mid X]] = \int_{-\infty}^{\infty} E[Y \mid x] f_X(x)\, dx = \int_{-\infty}^{\infty} \left[ \int_{-\infty}^{\infty} y f_Y(y \mid x)\, dy \right] f_X(x)\, dx.    (5-42)

Now, since f_{XY}(x,y) = f_Y(y \mid x) f_X(x), we have

E[E[Y \mid X]] = \int\int y f_Y(y \mid x) f_X(x)\, dx\, dy = \int\int y f_{XY}(x,y)\, dx\, dy = \int y f_Y(y)\, dy.    (5-43)

From this, we conclude that
E[Y] = E[E[Y \mid X]].    (5-44)

The inner conditional expectation is conditioned on X; the outer expectation is over X. To emphasize this fact, the notation E_X[E_Y[Y \mid X]] \equiv E[E[Y \mid X]] is used sometimes in the literature.

Example (5-6): Two fair dice are tossed until the combination 1 and 1 ("snake eyes") appears. Determine the average (i.e., expected) number of tosses required to hit snake eyes. To solve this problem, define random variables

1) N = number of tosses to hit snake eyes for the first time

2) H = 1 if snake eyes is hit on the first roll
     = 0 if snake eyes is not hit on the first roll

Note that H takes on only two values with P[H = 1] = 1/36 and P[H = 0] = 35/36. Now, we can compute the average E[N] = E[E[N \mid H]], where the inner expectation is conditioned on H, and the outer expectation is an average over H. We write

E[N] = E[E[N \mid H]] = E[N \mid H = 1]\, P[H = 1] + E[N \mid H = 0]\, P[H = 0].

Now, if H = 0, then snake eyes was not hit on the first toss, and the game starts over (at the second toss) with an average of E[N] additional tosses still required to hit snake eyes. Hence, E[N \mid H = 0] = 1 + E[N]. On the other hand, if H = 1, snake eyes was hit on the first roll, so E[N \mid H = 1] = 1. These two observations produce

E[N] = E[N \mid H = 1]\, P[H = 1] + E[N \mid H = 0]\, P[H = 0] = 1 \cdot \frac{1}{36} + (1 + E[N]) \frac{35}{36} = \frac{35}{36} E[N] + 1,
and the conclusion E[N] = 36.

Generalizations

This basic concept can be generalized. Again, X and Y denote random variables. And, g(x,y) denotes a function of algebraic variables x and y. The conditional mean

\varphi(x) = E[g(X,Y) \mid X = x] = E[g(x,Y) \mid X = x] = \int_{-\infty}^{\infty} g(x,y) f_Y(y \mid x)\, dy    (5-45)

is an "ordinary" function of real value x. Now, \varphi(X) = E[g(X,Y) \mid X] is a transformation of random variable X (again, be careful: E[g(X,Y) \mid X] is a random variable and E[g(X,Y) \mid X = x] = E[g(x,Y) \mid x] = \varphi(x) is a function of x). We are interested in the expected value E[\varphi(X)] = E[E[g(X,Y) \mid X]], so we write

E[\varphi(X)] = E[\, E[g(X,Y) \mid X]\, ] = \int_{-\infty}^{\infty} f_X(x) \left[ \int_{-\infty}^{\infty} g(x,y) f_Y(y \mid x)\, dy \right] dx = \int\int g(x,y) f_Y(y \mid x) f_X(x)\, dy\, dx

              = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} g(x,y) f_{XY}(x,y)\, dy\, dx = E[g(X,Y)],    (5-46)

where we have used f_{XY}(x,y) = f_Y(y \mid x) f_X(x), Bayes' law of densities. Hence, we conclude that

E[g(X,Y)] = E[E[g(X,Y) \mid X]] = E_X[E_Y[g(X,Y) \mid X]].    (5-47)

In this last equality, the inner conditional expectation is used to transform Y; the outer expectation is over X.

Example (5-7): Let X and Y be jointly Gaussian with E[X] = E[Y] = 0, Var[X] = \sigma_X^2, Var[Y] = \sigma_Y^2 and correlation coefficient r. Find the conditional second moment E[Y^2 \mid X = x] = E[Y^2 \mid x]. First, note that
Var[Y \mid x] = E[Y^2 \mid x] - (E[Y \mid x])^2.    (5-48)

Using the conditional mean and variance given by (5-35), we write

E[Y^2 \mid x] = Var[Y \mid x] + (E[Y \mid x])^2 = \sigma_Y^2(1 - r^2) + \left( r\frac{\sigma_Y}{\sigma_X} x \right)^2.    (5-49)

Example (5-8): Let X and Y be jointly Gaussian with E[X] = E[Y] = 0, Var[X] = \sigma_X^2, Var[Y] = \sigma_Y^2 and correlation coefficient r. Find

E[XY] = E_X[\varphi(X)],    (5-50)

where

\varphi(x) = E[XY \mid X = x] = x\, E[Y \mid X = x] = r\frac{\sigma_Y}{\sigma_X} x^2.    (5-51)

To accomplish this, substitute (5-51) into (5-50) to obtain

E[XY] = E_X[\varphi(X)] = r\frac{\sigma_Y}{\sigma_X} E[X^2] = r\frac{\sigma_Y}{\sigma_X} \sigma_X^2 = r\sigma_X\sigma_Y.    (5-52)

Application of Conditional Expectation: Bayesian Estimation

Let \theta denote an unknown DC voltage (for example, the output of a thermocouple, strain gauge, etc.). We are trying to measure \theta. Unfortunately, the measurement is obscured by additive noise n(t). At time t = t_1, we take a single sample of \theta plus noise; this sample is called z = \theta + n(t_1). We model the noise sample n(t_1) as a random variable with known density f_n(n) (we
[Figure 5-4: Noisy measurement of a DC voltage. The sum \theta + n(t) is sampled at t = t_1 to produce z = \theta + n(t_1).]

have abused the symbol n by using it simultaneously to denote a random quantity and an algebraic variable. Such abuses are common in the literature). We model unknown \theta as a random variable with density f_\theta(\theta). Density f_\theta(\theta) is called the a-priori density of \theta, and it is known. In most cases, random variables \theta and n(t_1) are independent, but this is not an absolute requirement (the independence assumption simplifies the analysis). Figure 5-4 depicts a block diagram that illustrates the generation of voltage-sample z.

From context in the discussion given below (and in the literature), the reader should be able to discern the current usage of the symbol z. He/she should be able to tell whether z denotes a random variable or a realization of a random variable (a particular sample outcome). Here (as is often the case in the literature), there is no need to use Z to denote the random variable and z to denote a particular value (sample outcome or realization) of the random variable.

We desire to use the measurement z to estimate voltage \theta. We need to develop an estimator that will take our measurement sample value z and give us an estimate \hat{\theta}(z) of the actual value of \theta. Of course, there is some difference between the estimate \hat{\theta} and the true value of \theta; that is, there is an error voltage \tilde{\theta}(z) \equiv \hat{\theta}(z) - \theta. Finally, making errors costs us. C(\tilde{\theta}(z)) denotes the cost incurred by using measurement z to estimate voltage \theta; C is a known cost function. The values of z and C(\tilde{\theta}(z)) change from one sample to the next; they can be interpreted as random variables as described above. Hence, it makes no sense to develop an estimator \hat{\theta} that minimizes C(\tilde{\theta}(z)). But, it does make sense to choose/design/develop \hat{\theta} with the goal of
minimizing E[C(\tilde{\theta}(z))] = E[C(\hat{\theta}(z) - \theta)], the expected or average cost associated with the estimation process. It is important to note that we are performing an ensemble average over all possible z and \theta (random variables that we average over when computing E[C(\hat{\theta}(z) - \theta)]). The estimator, denoted here as \hat{\theta}_b, that minimizes this average cost is called the Bayesian estimator. That is, Bayesian estimator \hat{\theta}_b satisfies

E[C(\hat{\theta}_b(z) - \theta)] \le E[C(\hat{\theta}(z) - \theta)] \quad \text{for all estimators } \hat{\theta}.    (5-53)

(\hat{\theta}_b is the "best" estimator. On the average, you "pay more" if you use any other estimator.)

Important Special Case: Mean Square Cost Function C(\tilde{\theta}) = \tilde{\theta}^2

Let's use the squared-error cost function C(\tilde{\theta}) = \tilde{\theta}^2. Then, when estimator \hat{\theta} is used, the average cost per decision is

E[\tilde{\theta}^2] = \int\int (\theta - \hat{\theta}(z))^2 f(\theta, z)\, d\theta\, dz = \int \left[ \int (\theta - \hat{\theta}(z))^2 f(\theta \mid z)\, d\theta \right] f_Z(z)\, dz.    (5-54)

For the outer integral of the last double integral, the integrand is a non-negative function of z. Hence, average cost E[\tilde{\theta}^2] will be minimized if, for every value of z, we pick \hat{\theta}(z) to minimize the non-negative inner integral

\int (\theta - \hat{\theta}(z))^2 f(\theta \mid z)\, d\theta.    (5-55)

With respect to \hat{\theta}, differentiate this last integral, set the result to zero and get

\int (\theta - \hat{\theta}(z)) f(\theta \mid z)\, d\theta = 0.    (5-56)
Finally, solve this last result for the Bayesian estimator

\hat{\theta}_b(z) = \int \theta f(\theta \mid z)\, d\theta = E[\theta \mid z].    (5-57)

That is, for the mean square cost function, the Bayesian estimator is the mean of \theta conditioned on the data z. Sometimes, we call (5-57) the conditional mean estimator.

As outlined above, we make a measurement and get a specific numerical value for z (i.e., we may interpret numerical z as a specific realization of a random variable). This measured value can be used in (5-57) to obtain a numerical estimate of \theta. On the other hand, suppose that we are interested in the average performance of our estimator (averaged over all possible measurements and all possible values of \theta). Then, as discussed below, we treat z as a random variable and average \tilde{\theta}(z)^2 = \{\hat{\theta}_b(z) - \theta\}^2 over all possible measurements (values of z) and all possible values of \theta; that is, we compute the variance of the estimation error. In doing this, we treat z as a random variable. However, we use the same symbol z regardless of the interpretation and use of (5-57). From context, we must determine if z is being used to denote a random variable or a specific measurement (that is, a realization of a random variable).

Alternative Expression for \hat{\theta}_b

The conditional mean estimator can be expressed in a more convenient fashion. First, use Bayes' rule for densities (here, we interpret z as a random variable)

f(\theta \mid z) = \frac{f(z \mid \theta) f_\theta(\theta)}{f_z(z)}    (5-58)

in the estimator formula (5-57) to obtain

\hat{\theta}_b(z) = \int \theta \frac{f(z \mid \theta) f_\theta(\theta)}{f_z(z)}\, d\theta = \frac{\int \theta f(z \mid \theta) f_\theta(\theta)\, d\theta}{f_z(z)} = \frac{\int \theta f(z \mid \theta) f_\theta(\theta)\, d\theta}{\int f(z \mid \theta) f_\theta(\theta)\, d\theta},    (5-59)
a formulation that is used in applications.

Mean and Variance of the Estimation Error

For the conditional mean estimator, the estimation error is

\tilde{\theta} = \theta - \hat{\theta}_b = \theta - E[\theta \mid z].    (5-60)

The mean value of \tilde{\theta} (averaged over all \theta and all possible measurements z) is

E[\tilde{\theta}] = E[\theta - \hat{\theta}_b] = E[\theta - E[\theta \mid z]] = E[\theta] - E[E[\theta \mid z]] = E[\theta] - E[\theta] = 0.    (5-61)

Equivalently, E[\hat{\theta}_b] = E[\theta]; because of this, we say that \hat{\theta}_b is an unbiased estimator. Since E[\tilde{\theta}] = 0, the variance of the estimation error is

VAR[\tilde{\theta}] = E[\tilde{\theta}^2] = \int\int (\theta - E[\theta \mid z])^2 f(\theta, z)\, d\theta\, dz,    (5-62)

where f(\theta, z) is the joint density that describes \theta and z. We want VAR[\tilde{\theta}] < VAR[\theta]; otherwise, our estimator is of little value, since we could simply use E[\theta] to estimate \theta. In general, VAR[\tilde{\theta}] is a measure of estimator performance.

Example (5-9): Bayesian Estimator for the Single-Sample Gaussian Case

Suppose that \theta is N(\theta_0, \sigma_0^2) and n(t_1) is N(0, \sigma^2). Also, assume that \theta and n are independent. Find the conditional mean (Bayesian) estimator \hat{\theta}_b. First, when interpreted as a random variable, z = \theta + n(t_1) is Gaussian with mean \theta_0 and variance \sigma_0^2 + \sigma^2. Hence, from the conditional mean formula (5-38) for the Gaussian case, we have
\hat{\theta}_b(z) = E[\theta \mid z] = \theta_0 + r_{\theta z} \frac{\sigma_0}{\sqrt{\sigma_0^2 + \sigma^2}}(z - \theta_0),    (5-63)

where r_{\theta z} is the correlation coefficient between \theta and z. Now, we must find r_{\theta z}. Observe that

r_{\theta z} = \frac{E[(\theta - \theta_0)(z - \theta_0)]}{\sigma_0\sqrt{\sigma_0^2 + \sigma^2}} = \frac{E[(\theta - \theta_0)([\theta - \theta_0] + n(t_1))]}{\sigma_0\sqrt{\sigma_0^2 + \sigma^2}} = \frac{E[(\theta - \theta_0)^2 + (\theta - \theta_0)\, n(t_1)]}{\sigma_0\sqrt{\sigma_0^2 + \sigma^2}}

             = \frac{E[(\theta - \theta_0)^2]}{\sigma_0\sqrt{\sigma_0^2 + \sigma^2}} = \frac{\sigma_0}{\sqrt{\sigma_0^2 + \sigma^2}},    (5-64)

since \theta and n(t_1) are independent. Hence, the Bayesian estimator is

\hat{\theta}_b(z) = \theta_0 + \frac{\sigma_0^2}{\sigma_0^2 + \sigma^2}(z - \theta_0).    (5-65)

The error is \tilde{\theta} = \theta - \hat{\theta}_b, and E[\tilde{\theta}] = 0 as shown by (5-61). That is, \hat{\theta}_b is an unbiased estimator since its expected value is the mean of the quantity being estimated. The variance of \tilde{\theta} is

VAR[\tilde{\theta}] = E[(\theta - \hat{\theta}_b)^2] = E\left[ \left( (\theta - \theta_0) - \frac{\sigma_0^2}{\sigma_0^2 + \sigma^2}(z - \theta_0) \right)^2 \right]

                    = E[(\theta - \theta_0)^2] - 2\frac{\sigma_0^2}{\sigma_0^2 + \sigma^2} E[(\theta - \theta_0)(z - \theta_0)] + \left( \frac{\sigma_0^2}{\sigma_0^2 + \sigma^2} \right)^2 E[(z - \theta_0)^2].    (5-66)

Due to independence, we have
E[(\theta - \theta_0)(z - \theta_0)] = E[(\theta - \theta_0)(\theta - \theta_0 + n(t_1))] = E[(\theta - \theta_0)(\theta - \theta_0)] = \sigma_0^2    (5-67)

E[(z - \theta_0)^2] = E[(\theta - \theta_0 + n(t_1))^2] = \sigma_0^2 + \sigma^2.    (5-68)

Now, use (5-67) and (5-68) in (5-66) to obtain

VAR[\tilde{\theta}] = \sigma_0^2 - 2\frac{\sigma_0^4}{\sigma_0^2 + \sigma^2} + \left( \frac{\sigma_0^2}{\sigma_0^2 + \sigma^2} \right)^2 [\sigma_0^2 + \sigma^2] = \sigma_0^2 \frac{\sigma^2}{\sigma_0^2 + \sigma^2}.    (5-69)

As expected, the variance of error \tilde{\theta} approaches zero as the noise average power (i.e., the variance) \sigma^2 \to 0. On the other hand, as \sigma^2 \to \infty, we have VAR[\tilde{\theta}] \to \sigma_0^2 (this is the noise-dominated case). As can be seen from (5-69), for all values of \sigma^2, we have VAR[\tilde{\theta}] < VAR[\theta] = \sigma_0^2, which means that \hat{\theta}_b will always outperform the simple approach of selecting mean E[\theta] = \theta_0 as the estimate of \theta.

Example (5-10): Bayesian Estimator for the Multiple-Sample Gaussian Case

As given by (5-69), the variance (i.e., the uncertainty) of \hat{\theta}_b may be too large for some applications. We can use a sample mean (involving multiple samples) in the Bayesian estimator to lower its variance. Take multiple samples of z(t_k) = \theta + n(t_k), 1 \le k \le N (t_k, 1 \le k \le N, denote the times at which samples are taken). Assume that the t_k are far enough apart in time that n(t_k) and n(t_j) are independent for t_k \ne t_j (for example, this would be the case if the time intervals between samples are large compared to the reciprocal of the bandwidth of noise n(t)). Define the sample mean of the collected data as
\bar{z} \equiv \frac{1}{N} \sum_{k=1}^{N} z(t_k) = \theta + \bar{n},    (5-70)

where

\bar{n} \equiv \frac{1}{N} \sum_{k=1}^{N} n(t_k)    (5-71)

is the sample mean of the noise. The quantity \bar{n} is Gaussian with mean E[\bar{n}] = 0; due to independence, the variance is

VAR[\bar{n}] = \frac{1}{N^2} \sum_{k=1}^{N} VAR[n(t_k)] = \frac{\sigma^2}{N}.    (5-72)

Note that \bar{z} \equiv \theta + \bar{n} has the same form regardless of the number of samples N. Hence, based on the data \bar{z}, the Bayesian estimator for \theta has the same form regardless of the number of samples. We can adapt (5-65) and write

\hat{\theta}_b(\bar{z}) = \theta_0 + \frac{\sigma_0^2}{\sigma_0^2 + \sigma^2/N}(\bar{z} - \theta_0).    (5-73)

That is, in the Bayesian estimator formula, use sample mean \bar{z} instead of the single sample z. Adapt (5-69) to the multiple-sample case and write the variance of error \tilde{\theta} = \theta - \hat{\theta}_b as

VAR[\tilde{\theta}] = \sigma_0^2 \frac{\sigma^2/N}{\sigma_0^2 + \sigma^2/N}.    (5-74)

By making the number N of averaged samples large enough, we can average out the noise and
make (5-74) arbitrarily small.

Conditional Multidimensional Gaussian Density

Let \vec{X} be an n \times 1 Gaussian vector with E[\vec{X}] = 0 and a positive definite n \times n covariance matrix \Lambda_X. Likewise, define \vec{Y} as a zero-mean, m \times 1 Gaussian random vector with m \times m positive definite covariance matrix \Lambda_Y. Also, define the n \times m matrix \Lambda_{XY} = E[\vec{X}\vec{Y}^T]; note that \Lambda_{YX} = \Lambda_{XY}^T = E[\vec{Y}\vec{X}^T], an m \times n matrix. Find the conditional density f(\vec{x} \mid \vec{y}).

First, define the (n+m) \times 1 super vector

\vec{Z} = \begin{bmatrix} \vec{X} \\ \vec{Y} \end{bmatrix},    (5-75)

which is obtained by stacking \vec{X} on top of \vec{Y}. The (n+m) \times (n+m) covariance matrix for \vec{Z} is

\Lambda_Z = E[\vec{Z}\vec{Z}^T] = E\begin{bmatrix} \vec{X}\vec{X}^T & \vec{X}\vec{Y}^T \\ \vec{Y}\vec{X}^T & \vec{Y}\vec{Y}^T \end{bmatrix} = \begin{bmatrix} \Lambda_X & \Lambda_{XY} \\ \Lambda_{YX} & \Lambda_Y \end{bmatrix}.    (5-76)

The inverse of this matrix can be expressed as (observe that \Lambda_Z \Lambda_Z^{-1} = I)

\Lambda_Z^{-1} = \begin{bmatrix} A & B \\ B^T & C \end{bmatrix},    (5-77)

where A is n \times n, B is n \times m and C is m \times m. These intermediate block matrices are given by
A = (\Lambda_X - \Lambda_{XY}\Lambda_Y^{-1}\Lambda_{YX})^{-1} = \Lambda_X^{-1}[I + \Lambda_{XY} C \Lambda_{YX} \Lambda_X^{-1}]

B = -A\Lambda_{XY}\Lambda_Y^{-1} = -\Lambda_X^{-1}\Lambda_{XY} C    (5-78)

C = (\Lambda_Y - \Lambda_{YX}\Lambda_X^{-1}\Lambda_{XY})^{-1} = \Lambda_Y^{-1}[I + \Lambda_{YX} A \Lambda_{XY} \Lambda_Y^{-1}]

Now, the joint density is

f(\vec{x}, \vec{y}) = \frac{1}{(2\pi)^{(n+m)/2}\sqrt{|\Lambda_Z|}} \exp\left\{ -\frac{1}{2} \begin{bmatrix} \vec{x}^T & \vec{y}^T \end{bmatrix} \begin{bmatrix} A & B \\ B^T & C \end{bmatrix} \begin{bmatrix} \vec{x} \\ \vec{y} \end{bmatrix} \right\}.    (5-79)

The marginal density is

f(\vec{y}) = \frac{1}{(2\pi)^{m/2}\sqrt{|\Lambda_Y|}} \exp\left\{ -\frac{1}{2} \vec{y}^T \Lambda_Y^{-1} \vec{y} \right\}.    (5-80)

From Bayes' theorem for densities,

f(\vec{x} \mid \vec{y}) = \frac{f(\vec{x}, \vec{y})}{f(\vec{y})} = \frac{\sqrt{|\Lambda_Y|}}{(2\pi)^{n/2}\sqrt{|\Lambda_Z|}} \exp\left\{ -\frac{1}{2}\left( \begin{bmatrix} \vec{x}^T & \vec{y}^T \end{bmatrix} \begin{bmatrix} A & B \\ B^T & C \end{bmatrix} \begin{bmatrix} \vec{x} \\ \vec{y} \end{bmatrix} - \vec{y}^T \Lambda_Y^{-1} \vec{y} \right) \right\}.    (5-81)

However, straightforward but tedious matrix algebra yields

\begin{bmatrix} \vec{x}^T & \vec{y}^T \end{bmatrix} \begin{bmatrix} A & B \\ B^T & C \end{bmatrix} \begin{bmatrix} \vec{x} \\ \vec{y} \end{bmatrix} - \vec{y}^T \Lambda_Y^{-1} \vec{y} = \vec{x}^T A \vec{x} + \vec{x}^T B \vec{y} + \vec{y}^T B^T \vec{x} + \vec{y}^T (C - \Lambda_Y^{-1}) \vec{y}

    = \vec{x}^T A \vec{x} + 2\vec{x}^T B \vec{y} + \vec{y}^T (C - \Lambda_Y^{-1}) \vec{y}.    (5-82)
(Note that the scalar identity \vec{x}^T B \vec{y} = \vec{y}^T B^T \vec{x} was used in obtaining this result.) From the previous page, use the results B = -A\Lambda_{XY}\Lambda_Y^{-1} and C - \Lambda_Y^{-1} = \Lambda_Y^{-1}\Lambda_{YX} A \Lambda_{XY}\Lambda_Y^{-1} to write

\begin{bmatrix} \vec{x}^T & \vec{y}^T \end{bmatrix} \begin{bmatrix} A & B \\ B^T & C \end{bmatrix} \begin{bmatrix} \vec{x} \\ \vec{y} \end{bmatrix} - \vec{y}^T \Lambda_Y^{-1} \vec{y} = (\vec{x} - \Lambda_{XY}\Lambda_Y^{-1}\vec{y})^T A\, (\vec{x} - \Lambda_{XY}\Lambda_Y^{-1}\vec{y}).    (5-83)

To simplify the notation, define

M \equiv \Lambda_{XY}\Lambda_Y^{-1}\vec{y}  (an n \times 1 vector)

Q \equiv A^{-1} = \Lambda_X - \Lambda_{XY}\Lambda_Y^{-1}\Lambda_{YX}  (an n \times n matrix)    (5-84)

so that the quadratic form becomes

\begin{bmatrix} \vec{x}^T & \vec{y}^T \end{bmatrix} \begin{bmatrix} A & B \\ B^T & C \end{bmatrix} \begin{bmatrix} \vec{x} \\ \vec{y} \end{bmatrix} - \vec{y}^T \Lambda_Y^{-1} \vec{y} = (\vec{x} - M)^T Q^{-1} (\vec{x} - M).    (5-85)

Now, we must find the quotient |\Lambda_Z| / |\Lambda_Y|. Write

\Lambda_Z = \begin{bmatrix} \Lambda_X & \Lambda_{XY} \\ \Lambda_{YX} & \Lambda_Y \end{bmatrix} = \begin{bmatrix} \Lambda_X - \Lambda_{XY}\Lambda_Y^{-1}\Lambda_{YX} & \Lambda_{XY} \\ 0 & \Lambda_Y \end{bmatrix} \begin{bmatrix} I_n & 0 \\ \Lambda_Y^{-1}\Lambda_{YX} & I_m \end{bmatrix},    (5-86)

where I_m is the m \times m identity matrix and I_n is the n \times n identity matrix. Hence,

|\Lambda_Z| = |\Lambda_X - \Lambda_{XY}\Lambda_Y^{-1}\Lambda_{YX}| \cdot |\Lambda_Y|,    (5-87)
so that

\frac{|\Lambda_Z|}{|\Lambda_Y|} = |\Lambda_X - \Lambda_{XY}\Lambda_Y^{-1}\Lambda_{YX}| = |Q|.    (5-88)

Use Equations (5-85) and (5-88) in f(\vec{x} \mid \vec{y}) to obtain

f(\vec{x} \mid \vec{y}) = \frac{1}{(2\pi)^{n/2}\sqrt{|Q|}} \exp\left\{ -\frac{1}{2}(\vec{x} - M)^T Q^{-1} (\vec{x} - M) \right\},    (5-89)

where

M \equiv \Lambda_{XY}\Lambda_Y^{-1}\vec{y}  (an n \times 1 vector)

Q \equiv A^{-1} = \Lambda_X - \Lambda_{XY}\Lambda_Y^{-1}\Lambda_{YX}  (an n \times n matrix).    (5-90)

Vector M = E[\vec{X} \mid \vec{y}] is the conditional expectation vector. Matrix Q = E[(\vec{X} - M)(\vec{X} - M)^T \mid \vec{y}] is the conditional covariance matrix.

Generalizations to the Nonzero Mean Case

Suppose E[\vec{X}] = M_X and E[\vec{Y}] = M_Y. Then

f(\vec{x} \mid \vec{y}) = \frac{1}{(2\pi)^{n/2}\sqrt{|Q|}} \exp\left\{ -\frac{1}{2}(\vec{x} - M)^T Q^{-1} (\vec{x} - M) \right\},    (5-91)

where

M \equiv E[\vec{X} \mid \vec{y}] = M_X + \Lambda_{XY}\Lambda_Y^{-1}(\vec{y} - M_Y)  (an n \times 1 vector)

Q \equiv E[(\vec{X} - M)(\vec{X} - M)^T \mid \vec{y}] = \Lambda_X - \Lambda_{XY}\Lambda_Y^{-1}\Lambda_{YX}  (an n \times n matrix).    (5-92)
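The nonzero-mean conditional formulas translate directly into a few lines of linear algebra. The sketch below is our own implementation of the conditional mean vector M and conditional covariance Q for a partitioned Gaussian; as a sanity check, with n = m = 1 the result must reduce to the scalar conditional mean and variance of (5-38).

```python
import numpy as np

# Conditional mean M and covariance Q for a partitioned Gaussian:
#   M = M_X + L_XY L_Y^{-1} (y - M_Y),   Q = L_X - L_XY L_Y^{-1} L_YX.
# (Our implementation of the formulas in the nonzero-mean case.)
def conditional_gaussian(m_x, m_y, L_x, L_xy, L_y, y):
    L_y_inv = np.linalg.inv(L_y)
    M = m_x + L_xy @ L_y_inv @ (y - m_y)
    Q = L_x - L_xy @ L_y_inv @ L_xy.T
    return M, Q

# Sanity check against the scalar case (5-38): eta_X = 1, eta_Y = -2,
# sigma_X = 2, sigma_Y = 1.5, r = 0.7, conditioning on y = -1.
sx, sy, r = 2.0, 1.5, 0.7
M, Q = conditional_gaussian(
    np.array([1.0]), np.array([-2.0]),
    np.array([[sx**2]]), np.array([[r * sx * sy]]), np.array([[sy**2]]),
    np.array([-1.0]))
print(M[0])     # eta_X + r*(sx/sy)*(y - eta_Y) = 1 + 0.7*(2/1.5)*1 ~ 1.933
print(Q[0, 0])  # sx^2 * (1 - r^2) = 4 * 0.51 = 2.04
```

The same function handles any block sizes n and m; only the shapes of the covariance blocks change.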