Probability and Random Processes


1 Probability and Random Processes

Christian Schlegel, co-author: Ali M. Bassam
Ultra Maritime Digital Communications Center, website: umdcc.ca

"So far as mathematics do not tend to make men more sober and rational thinkers, wiser and better men, they are only to be considered an amusement, which ought not to take us off from serious business." -- Bayes, 1760

2 The Normal (Gaussian) Distribution

Let us revisit the binomial distribution for $p = q = 0.5$. Also, let $n = 2\nu$. Then we have for the binomial term
$$a_k = \binom{2\nu}{\nu + k}\,2^{-2\nu}$$
$a_k$ is the $k$-th term from the center of the symmetric binomial distribution and $a_0$ is the central term. Expanding we obtain
$$a_k = \frac{(2\nu)!}{(\nu+k)!\,(\nu-k)!}\,2^{-2\nu} = \underbrace{\frac{(2\nu)!}{\nu!\,\nu!}\,2^{-2\nu}}_{a_0}\,\cdot\,\frac{\nu(\nu-1)\cdots(\nu-k+1)}{(\nu+1)(\nu+2)\cdots(\nu+k)}$$
For large $\nu$ we can simplify as follows:
$$a_k \approx a_0 \prod_{j=1}^{k}\frac{1 - j/\nu}{1 + j/\nu} \approx a_0\,\frac{\prod_{j=1}^{k} e^{-j/\nu}}{\prod_{j=1}^{k} e^{j/\nu}} = a_0 \exp\left(-\frac{2}{\nu}\sum_{j=1}^{k} j\right) \approx a_0 \exp\left(-\frac{k^2}{\nu}\right)$$
where we used the limiting expression for the exponential function, i.e., $1 \pm j/\nu \approx e^{\pm j/\nu}$ for large $\nu$.

We now also need a limit expression for the term $a_0$, which is afforded us by Stirling's formula:
$$n! \approx \sqrt{2\pi}\;n^{n+1/2}\,e^{-n}$$
Applying Stirling to the central term we obtain $a_0 \approx \dfrac{1}{\sqrt{\pi\nu}}$, and
$$a_k \approx \frac{1}{\sqrt{\pi\nu}}\exp\left(-\frac{k^2}{\nu}\right)$$

3 The Normal (Gaussian) Distribution

Normalizing, we can relate the binomial coefficients to a real-valued function in $x$ by letting $x = k/\sqrt{\nu/2}$, and we obtain
$$a_k\,\sqrt{\frac{\nu}{2}} \approx N(x), \qquad\text{where}\qquad N(x) = \frac{1}{\sqrt{2\pi}}\,e^{-x^2/2}$$
The probability density function (PDF) $N(x)$ is called the Gaussian or normal probability distribution function. We have our first limit theorem, which is a version of the Central Limit Theorem:
$$a_k\,\sqrt{\frac{\nu}{2}} \;\xrightarrow{\nu\to\infty}\; N(x)$$

[Figure: example for small $n$ illustrating the close fit between the scaled binomial terms $a_k\sqrt{\nu/2}$ and $N(x)$.]
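The limit can be checked numerically. The following short Python/NumPy sketch (not part of the original notes; the value of $\nu$ is chosen only for illustration) compares the scaled central binomial terms $a_k\sqrt{\nu/2}$ against $N(x)$ at $x = k/\sqrt{\nu/2}$:

```python
import numpy as np
from scipy.stats import binom, norm

nu = 50                      # n = 2*nu coin flips with p = 0.5
n = 2 * nu
k = np.arange(-20, 21)       # offsets from the central term

# a_k = C(2*nu, nu + k) * 2^(-2*nu), the k-th term from the center
a_k = binom.pmf(nu + k, n, 0.5)

# normalization x = k / sqrt(nu/2); the claim is a_k * sqrt(nu/2) -> N(x)
x = k / np.sqrt(nu / 2)
approx = norm.pdf(x)         # N(x) = exp(-x^2/2) / sqrt(2*pi)

print(np.max(np.abs(a_k * np.sqrt(nu / 2) - approx)))   # small for large nu
```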

4 Random Walks

Let the random variable $S_n$ be defined as the sum
$$S_n = \sum_{i=1}^{n} X_i, \qquad X_i = \begin{cases} +1, & p_{X_i}(1) = 0.5 \\ -1, & p_{X_i}(-1) = 0.5 \end{cases}$$
The partial sum $S_n$ is a discrete, non-stationary random process.

[Figure: examples of random walks; three sample processes of $S_n$ versus $n$.]

Applications: A random process can model the behavior of a phase-locked loop outside the pull-in range. This model describes the random motion of particles observed under thermal agitation, called Brownian motion. The output of an integrator driven by zero-mean random noise is described by a random walk.
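Sample paths of this random walk are easy to generate; the following sketch (illustrative only, not from the notes) draws the $\pm 1$ steps, accumulates the partial sums $S_n$, and exhibits the non-stationarity through the linearly growing variance:

```python
import numpy as np

rng = np.random.default_rng(0)
n_steps, n_paths = 1000, 5000

# X_i = +1 or -1 with probability 0.5 each
steps = rng.choice([-1, 1], size=(n_paths, n_steps))
S = np.cumsum(steps, axis=1)       # each row is one sample path S_1, ..., S_n

# non-stationarity: the variance of S_n grows linearly in n
for n in (10, 100, 1000):
    print(n, S[:, n - 1].var())    # approximately equal to n
```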

5 The Laws of Large Numbers

Weak Law of Large Numbers: Let $S_n = \sum_{i=1}^{n} X_i$, where the $X_i$ are independently and identically distributed random variables and $E[X_i] = m_X$ and $\mathrm{var}(X_i) = \sigma_X^2$ are the mean and variance of $X_i$. Then
$$\Pr\left(\left|\frac{S_n}{n} - m_X\right| < \epsilon\right) \;\xrightarrow{n\to\infty}\; 1$$
where $\epsilon > 0$ is arbitrarily small.

Proof: We need Chebyshev's inequality. Given $X$ with mean $m_X$ and finite variance $\sigma_X^2$, we find that
$$\sigma_X^2 = \int_{-\infty}^{\infty}(x - m_X)^2 f_X(x)\,dx \ge \int_{|x - m_X|\ge\epsilon}(x - m_X)^2 f_X(x)\,dx \ge \epsilon^2\,\Pr\left(|X - m_X| \ge \epsilon\right)$$
and hence $\Pr(|X - m_X| \ge \epsilon) \le \sigma_X^2/\epsilon^2$. Applying this to $S_n/n$ we find
$$E\left[\frac{S_n}{n}\right] = \frac{1}{n}\sum_{i=1}^{n} E[X_i] = m_X$$
$$\mathrm{var}\left(\frac{S_n}{n}\right) = \frac{1}{n^2}\sum_{i=1}^{n}\sum_{j=1}^{n}E[X_i X_j] - m_X^2 = \frac{1}{n}E[X^2] + \frac{n-1}{n}m_X^2 - m_X^2 = \frac{1}{n}\left(E[X^2] - m_X^2\right) = \frac{\sigma_X^2}{n}$$
Applying these terms in the Chebyshev inequality we obtain
$$\Pr\left(\left|\frac{S_n}{n} - m_X\right| \ge \epsilon\right) \le \frac{\sigma_X^2}{n\epsilon^2} \;\xrightarrow{n\to\infty}\; 0 \qquad\text{Q.E.D.}$$
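A quick empirical check (an illustrative sketch, not part of the notes): the fraction of sample averages $S_n/n$ that deviate from $m_X$ by more than $\epsilon$ shrinks with $n$, and the Chebyshev bound $\sigma_X^2/(n\epsilon^2)$ holds throughout. Here the $X_i$ are taken Uniform(0,1) purely for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
m_X, sigma2, eps = 0.5, 1.0 / 12, 0.05   # Uniform(0,1): mean 1/2, variance 1/12

for n in (10, 100, 1000):
    X = rng.uniform(0, 1, size=(20000, n))
    dev = np.abs(X.mean(axis=1) - m_X)
    # empirical Pr(|S_n/n - m_X| >= eps) versus the Chebyshev bound
    print(n, (dev >= eps).mean(), "<=", sigma2 / (n * eps**2))
```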

6 Strong Law of Large Numbers

Define the event $A_k$ as $|S_k/k - m_X| < \epsilon$, that is, the $k$-th partial sum $S_k/k$ is within $\epsilon$ of the mean $m_X$.

[Figure: the band $(m_X - \epsilon,\, m_X + \epsilon)$ around the mean; beyond some index $r$ all partial sums $S_k/k$ remain inside the band.]

We are looking at the simultaneous occurrence of $A_{r+1}, \ldots, A_{r+\nu}$, for $\nu \to \infty$. The strong law of large numbers says the following:

Strong Law of Large Numbers: Given the conditions as for the weak law of large numbers, there exists an $r < \infty$ such that
$$\Pr\left(\bigcap_{i=1}^{\infty} A_{r+i}\right) > 1 - \rho$$
where $\rho > 0$ can be made as small as desired. The average $S_n/n$ tends to the mean $m_X$ with probability 1:
$$\frac{S_n}{n} \;\xrightarrow{\text{with probability }1}\; m_X$$

7 Strong Law of Large Numbers: Proof

The proof of this statement has several steps and involves the union bound and Chernoff's inequality. First we apply the union bound:
$$\Pr\left(\overline{\bigcap_{i=1}^{\infty} A_{r+i}}\right) = \Pr\left(\bigcup_{i=1}^{\infty}\bar A_{r+i}\right) \le \sum_{i=1}^{\infty}\Pr\left(\bar A_{r+i}\right)$$
In order to proceed we need a bound for
$$\Pr\left(|X - m_X| \ge \epsilon\right) = \int_{|x - m_X|\ge\epsilon} f_X(x)\,dx = \int_{-\infty}^{\infty} I(x)\,f_X(x)\,dx$$
where $I(x)$ is the indicator function of the event $\{|x - m_X| \ge \epsilon\}$.

[Figure: the indicator function $I(x)$, equal to 1 outside $(m_X - \epsilon,\, m_X + \epsilon)$, together with the exponential overbounds $\exp(\lambda(x - m_X - \epsilon))$ and $\exp(-\lambda(x - m_X + \epsilon))$.]

The indicator function itself is not the big invention; however, we are free to apply any suitable upper bound $g(x) \ge I(x)$ on the indicator function and write
$$\Pr\left(|X - m_X| \ge \epsilon\right) \le \int_{-\infty}^{\infty} g(x)\,f_X(x)\,dx$$
The key is that evaluating this integral may be much simpler than the exact evaluation.

8 Strong Law of Large Numbers: Proof

We can overbound using the Chernoff bound, which uses the exponential function as an overbound of $I(x)$:
$$I(x) \le \exp\left(\lambda(x - m_X - \epsilon)\right) + \exp\left(-\lambda(x - m_X + \epsilon)\right)$$
where $\lambda \ge 0$ is the Chernoff factor, which can be optimized to tighten the bound (see the previous page). Concentrating on the upper half of the indicator function, we write
$$\Pr\left(X - m_X \ge \epsilon\right) \le \int_{-\infty}^{\infty}\exp\left(\lambda(x - m_X - \epsilon)\right)f_X(x)\,dx = E\left[\exp\left(\lambda(X - m_X - \epsilon)\right)\right]$$
The parameter $\lambda$ is arbitrary and can be chosen to minimize the right-hand side and tighten the bound accordingly:
$$0 = \frac{d}{d\lambda}E\left[\exp\left(\lambda(X - m_X - \epsilon)\right)\right] = E\left[\frac{d}{d\lambda}\exp\left(\lambda(X - m_X - \epsilon)\right)\right] = E\left[(X - m_X - \epsilon)\exp\left(\lambda(X - m_X - \epsilon)\right)\right]$$
which leads to the implicit equation for the minimizing $\lambda_0$, given by
$$\frac{E\left[(X - m_X)\exp\left(\lambda_0(X - m_X)\right)\right]}{E\left[\exp\left(\lambda_0(X - m_X)\right)\right]} = \epsilon$$
From this we can now find $\lambda_0$ and obtain
$$\Pr\left(X - m_X \ge \epsilon\right) \le E\left[\exp\left(\lambda_0(X - m_X - \epsilon)\right)\right] = \beta(\epsilon)$$

9 Strong Law of Large Numbers: Proof

In the next step we apply the Chernoff bound to $S_n/n$, whose mean is also $m_X$:
$$\Pr\left(\frac{S_n}{n} - m_X \ge \epsilon\right) \le E\left[\exp\left(\lambda\left(\frac{S_n}{n} - m_X - \epsilon\right)\right)\right] \;\overset{\lambda = n\lambda_0}{=}\; \prod_{i=1}^{n} E\left[\exp\left(\lambda_0(X_i - m_X - \epsilon)\right)\right] = \beta^n(\epsilon)$$
We now go back to the union bound term we wish to compute:
$$\Pr\left(\bigcup_{i=1}^{\nu}\bar A_{r+i}\right) \le \sum_{i=1}^{\nu}\beta^{r+i}(\epsilon) = \beta^r(\epsilon)\sum_{i=1}^{\nu}\beta^{i}(\epsilon) \le \beta^r(\epsilon)\,\frac{\beta(\epsilon)}{1 - \beta(\epsilon)} \;\xrightarrow{r\to\infty}\; 0, \qquad\text{iff } \beta(\epsilon) < 1$$
This will prove the strong law of large numbers, as long as we can show that there exists a $\lambda_0$ such that $\beta(\epsilon) < 1$ (strictly less). To show this fact, consider $f(\lambda) \triangleq E\left[\exp\left(\lambda(X - m_X - \epsilon)\right)\right]$ and
$$\frac{d}{d\lambda}f(\lambda)\Big|_{\lambda=0} = E\left[X - m_X - \epsilon\right] = -\epsilon$$
Hence, $f(\lambda)$ has negative slope at $\lambda = 0$, and furthermore $f(0) = 1$. Therefore, there must exist a $\lambda_0 > 0$ such that $f(\lambda_0) < 1$, since $f(\lambda)$ is continuous in $\lambda$.

[Figure: $f(\lambda)$ with $f(0) = 1$, negative slope at $\lambda = 0$, and $\beta(\epsilon) = f(\lambda_0) < 1$ at the minimizing $\lambda_0$.]

10 Transformation of Random Variables

Consider a random variable $X$ that is subject to a transformation, for example $y = g(x) = x^2$. Clearly, $Y$ is also a random variable, but has a different CDF and PDF. We proceed to compute its CDF as follows:
$$\Pr\left(y < Y \le y + \Delta y\right) = F_Y(y + \Delta y) - F_Y(y) \approx f_Y(y)\,\Delta y = \Pr\left(x : y < g(x) \le y + \Delta y\right)$$
For $y = x^2$ there exist two solution branches, $x_1 = -\sqrt{y}$ and $x_2 = +\sqrt{y}$, and
$$\Pr(Y \le y) = \Pr(x_1 \le X \le x_2)$$
$$F_Y(y) = F_X(x_2) - F_X(x_1) = F_X(\sqrt{y}) - F_X(-\sqrt{y})$$
$$f_Y(y) = \frac{dF_Y(y)}{dy} = f_X(\sqrt{y})\,\frac{1}{2\sqrt{y}} + f_X(-\sqrt{y})\,\frac{1}{2\sqrt{y}}$$

11 Transformation of Random Variables: General Form

If the function $y = g(x)$ can be broken up into $k$ invertible sections, that is, the inverse function $x = g^{-1}(y)$ has $k$ solutions $x_1, \ldots, x_k$, then
$$f_Y(y) = \sum_{i=1}^{k}\frac{f_X(x_i)}{|g'(x_i)|}, \qquad x_i = g^{-1}(y)$$

Example 1: Let $f_X(x) = \frac{1}{\sqrt{2\pi}}\exp(-x^2/2)$, and $Y = X^2$. We apply the formula from above and obtain
$$f_Y(y) = f_X(\sqrt{y})\,\frac{1}{2\sqrt{y}} + f_X(-\sqrt{y})\,\frac{1}{2\sqrt{y}} = \frac{\exp(-y/2)}{2\sqrt{2\pi y}} + \frac{\exp(-y/2)}{2\sqrt{2\pi y}} = \frac{1}{\sqrt{2\pi y}}\exp(-y/2)$$
This PDF is known as $\chi^2$ with one degree of freedom.

Example 2: Let
$$f_X(x) = \begin{cases} 1/2 & \text{if } |x| \le 1 \\ 0 & \text{elsewhere} \end{cases}$$
and, again, $Y = X^2$. We compute:
$$f_Y(y) = f_X(\sqrt{y})\,\frac{1}{2\sqrt{y}} + f_X(-\sqrt{y})\,\frac{1}{2\sqrt{y}} = \begin{cases} \dfrac{1}{2\sqrt{y}} & 0 \le y \le 1 \\ 0 & \text{elsewhere} \end{cases}$$
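The $\chi^2$ result of Example 1 can be verified by simulation; the sketch below (illustrative, not from the notes) histograms $Y = X^2$ for standard Gaussian $X$ and compares against $f_Y(y) = e^{-y/2}/\sqrt{2\pi y}$:

```python
import numpy as np

rng = np.random.default_rng(2)
X = rng.standard_normal(1_000_000)
Y = X**2

# empirical density of Y on a grid (avoid the singular region near y = 0)
hist, edges = np.histogram(Y, bins=200, range=(0.1, 6), density=True)
y = 0.5 * (edges[:-1] + edges[1:])

f_Y = np.exp(-y / 2) / np.sqrt(2 * np.pi * y)   # chi-square, 1 degree of freedom
print(np.max(np.abs(hist - f_Y)))               # agrees closely
```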

12 Sum of Random Variables

Let $X$ and $Y$ be two independent RVs, and consider $Z = X + Y$.

[Figure: the region $x + y \le z$ in the $(x, y)$-plane, bounded by the line $y = z - x$.]

We proceed via the cumulative distribution function of $Z$ as follows:
$$\Pr(Z \le z) = \Pr(X + Y \le z) = \lim_{\Delta x\to 0}\sum_i \Pr\left(x_i \le X \le x_i + \Delta x\right)\Pr\left(Y \le z - x_i\right) = \int_{-\infty}^{\infty} f_X(x)\,F_Y(z - x)\,dx$$
Taking the derivative on both sides w.r.t. $z$ we obtain the probability density function:
$$f_Z(z) = \int_{-\infty}^{\infty} f_X(x)\,f_Y(z - x)\,dx$$

Convolution Theorem: The sum of two independent random variables $Z = X + Y$ has PDF
$$f_Z(z) = f_X \ast f_Y = \int_{-\infty}^{\infty} f_X(x)\,f_Y(z - x)\,dx \qquad\Longleftrightarrow\qquad \Phi_Z(\omega) = \Phi_X(\omega)\,\Phi_Y(\omega)$$
where $\Phi$ denotes the characteristic function.
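As a concrete check of the convolution theorem (an illustrative sketch, not in the notes): the sum of two independent Uniform(0,1) variables has the triangular density $f_Z(z) = z$ for $0 \le z \le 1$ and $f_Z(z) = 2 - z$ for $1 \le z \le 2$, which is exactly the convolution of the two unit rectangles.

```python
import numpy as np

rng = np.random.default_rng(3)
Z = rng.uniform(0, 1, 1_000_000) + rng.uniform(0, 1, 1_000_000)

hist, edges = np.histogram(Z, bins=100, range=(0, 2), density=True)
z = 0.5 * (edges[:-1] + edges[1:])
f_Z = np.where(z <= 1, z, 2 - z)     # convolution of two unit rectangles
print(np.max(np.abs(hist - f_Z)))    # close to zero
```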

13 Transformation of Multiple Random Variables

In general, the transformation may be between multiple RVs; for example, let $y_1 = g_1(x_1, x_2)$ and $y_2 = g_2(x_1, x_2)$, where $X_1, X_2$ are the two original random variables and $Y_1, Y_2$ are the two dependent random variables.

[Figure: an infinitesimal rectangle of sides $\Delta x_1, \Delta x_2$ in the $(x_1, x_2)$-plane maps into a parallelogram spanned by the vectors $z$ and $w$ in the $(y_1, y_2)$-plane.]

We proceed by matching areas and therefore probability mass, that is, the events $A_x$ and $A_y$ carry the same probability. $A_x$ has area (probability mass) given by $\Delta x_1\,\Delta x_2$. $A_y$ is a parallelogram with area
$$A_y = |z|\,|w|\sin\alpha = |z_1 w_2 - z_2 w_1|$$
The two tangential vectors bounding the parallelogram in the $(y_1, y_2)$-space are given as
$$z = \left[\frac{\partial g_1}{\partial x_1},\,\frac{\partial g_2}{\partial x_1}\right]^T\Delta x_1, \qquad w = \left[\frac{\partial g_1}{\partial x_2},\,\frac{\partial g_2}{\partial x_2}\right]^T\Delta x_2$$
The transformed area $A_y$ is related to the original area $A_x$ by the Jacobian:
$$A_y = \left|\frac{\partial g_1}{\partial x_1}\frac{\partial g_2}{\partial x_2} - \frac{\partial g_1}{\partial x_2}\frac{\partial g_2}{\partial x_1}\right|\Delta x_1\,\Delta x_2 = |J(x_1, x_2)|\;\Delta x_1\,\Delta x_2$$

14 Transformation of Multiple Random Variables

The amount of probability that comes to lie in the infinitesimal areas in both the $(x_1, x_2)$-space as well as the $(y_1, y_2)$-space can be computed from the respective two-dimensional density functions as
$$\Pr\left(x_1 < X_1 \le x_1 + \Delta x_1,\; x_2 < X_2 \le x_2 + \Delta x_2\right) = f_{X_1,X_2}(x_1, x_2)\,\Delta x_1\,\Delta x_2$$
and
$$\Pr\left(y_1 < Y_1 \le y_1 + \Delta y_1,\; y_2 < Y_2 \le y_2 + \Delta y_2\right) = f_{Y_1,Y_2}(y_1, y_2)\,\Delta y_1\,\Delta y_2$$

[Figure: as before, the rectangle $A_x$ in the $(x_1, x_2)$-plane and its image parallelogram $A_y$ in the $(y_1, y_2)$-plane.]

The probability mass in area $A_x$ transforms into all of area $A_y$, hence
$$f_{X_1,X_2}(x_1, x_2)\,\Delta x_1\,\Delta x_2 = f_{Y_1,Y_2}(y_1, y_2)\,\Delta y_1\,\Delta y_2 = f_{Y_1,Y_2}(y_1, y_2)\,|J(x_1, x_2)|\,\Delta x_1\,\Delta x_2$$
As a consequence, the probability densities are related as
$$f_{Y_1,Y_2}(y_1, y_2) = \sum_{i=1}^{k}\frac{f_{X_1,X_2}\!\left(x_1^{(i)}, x_2^{(i)}\right)}{\left|J\!\left(x_1^{(i)}, x_2^{(i)}\right)\right|}$$
where we assumed that there are $k$ solutions to $y_1 = g_1(x_1, x_2)$, $y_2 = g_2(x_1, x_2)$.

15 Polar Transformation of Random Variables

As an example let us study the ubiquitous polar coordinate transformation, where $y_1$ and $y_2$ are given by
$$y_1(x_1, x_2) = \sqrt{x_1^2 + x_2^2} = g_1(x_1, x_2) = r, \qquad y_2(x_1, x_2) = \tan^{-1}\frac{x_2}{x_1} = g_2(x_1, x_2) = \phi$$
We calculate the Jacobian of this transformation as follows:
$$J(x_1, x_2) = \frac{\partial g_1}{\partial x_1}\frac{\partial g_2}{\partial x_2} - \frac{\partial g_1}{\partial x_2}\frac{\partial g_2}{\partial x_1} = \frac{x_1}{\sqrt{x_1^2 + x_2^2}}\cdot\frac{1/x_1}{1 + (x_2/x_1)^2} + \frac{x_2}{\sqrt{x_1^2 + x_2^2}}\cdot\frac{x_2/x_1^2}{1 + (x_2/x_1)^2} = \frac{1}{\sqrt{x_1^2 + x_2^2}} = \frac{1}{r}$$
Consequently
$$f_{Y_1,Y_2}(y_1, y_2) = f_{X_1,X_2}(x_1, x_2)\,\sqrt{x_1^2 + x_2^2}$$
and we find the distribution in terms of polar coordinates as
$$f_{R,\Phi}(r, \phi) = f_{X_1,X_2}(r\cos\phi,\, r\sin\phi)\; r$$

Example 3: Let $f_{X_1,X_2}(x_1, x_2)$ be a product of Gaussian PDFs:
$$f_{X_1,X_2}(x_1, x_2) = \frac{1}{2\pi\sigma^2}\exp\left(-\frac{x_1^2 + x_2^2}{2\sigma^2}\right) \quad\Longrightarrow\quad f_{R,\Phi}(r, \phi) = \frac{1}{2\pi}\cdot\frac{r}{\sigma^2}\exp\left(-\frac{r^2}{2\sigma^2}\right) = f_\Phi(\phi)\,f_R(r)$$
and phase and amplitude are independently distributed with respective PDFs:
$$f_R(r) = \begin{cases} \dfrac{r}{\sigma^2}\exp\left(-\dfrac{r^2}{2\sigma^2}\right) & r \ge 0 \\ 0 & \text{otherwise} \end{cases} \qquad f_\Phi(\phi) = \begin{cases} \dfrac{1}{2\pi} & 0 \le \phi \le 2\pi \\ 0 & \text{otherwise} \end{cases}$$
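Example 3 is easy to confirm numerically. The sketch below (illustrative only) transforms pairs of independent zero-mean Gaussians to polar coordinates and checks simple moments of the Rayleigh amplitude and the uniform phase:

```python
import numpy as np

rng = np.random.default_rng(4)
sigma = 1.5
x1 = sigma * rng.standard_normal(1_000_000)
x2 = sigma * rng.standard_normal(1_000_000)

r = np.hypot(x1, x2)                          # amplitude
phi = np.mod(np.arctan2(x2, x1), 2 * np.pi)   # phase folded into [0, 2*pi)

# Rayleigh check: E[R^2] = 2*sigma^2; uniform-phase check: var = (2*pi)^2 / 12
print(np.mean(r**2), 2 * sigma**2)
print(np.var(phi), (2 * np.pi)**2 / 12)
```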

16 Random Processes: Definition

A random process is simply a random variable that is indexed by a time variable, either discrete or continuous.

[Figure: statistics of N = 500 Wiener process traces sliced at a fixed time t; the histogram of the slice is approximately Gaussian, $x \sim N(0, \cdot)$.]

A complete characterization of a random process would involve giving the joint PDF at all possible sample points:
$$f_{X[t_1],\ldots,X[t_k]}(x_1, \ldots, x_k) \qquad\text{for all sample points } t_1, \ldots, t_k \text{ and all } k$$

17 A Simple Example: Random Sine Wave

Consider the following random process
$$X(t) = A\cos(2\pi f t + \Theta)$$
where $A$ is an unknown random amplitude and $\Theta$ is an unknown random phase with uniform distribution in $[0, 2\pi]$. After sampling the random process $X(t)$ we obtain the discrete random process
$$X[i] = A\cos(\Omega i + \Theta) = X_1 X_2$$
where $\Omega = 2\pi f t_0$ ($t_0$ is the sampling interval), $X_1 = A$, and $X_2 = \cos(\Omega i + \Theta)$. The density of $X_2$, the cosine of a uniformly distributed phase, has to be found via transformation:
$$f_{X_2}(x) = \sum_{\theta:\,\cos\theta = x}\frac{f_\Theta(\theta)}{|\sin\theta|} = \frac{2\cdot 1/(2\pi)}{\sin(\cos^{-1}x)} = \frac{1}{\pi\sqrt{1 - x^2}}, \qquad |x| < 1$$
We find the PDF of $X[i]$ via the transformation $y_1(x_1, x_2) = x_1 x_2$ and $y_2(x_1, x_2) = x_2$, whose Jacobian is $|J| = |x_2|$:
$$f_{Y_1,Y_2}(y_1, y_2) = \frac{f_{X_1}(y_1/y_2)\,f_{X_2}(y_2)}{|y_2|} \quad\Longrightarrow\quad f_{Y_1}(y_1) = \int f_{X_1}\!\left(\frac{y_1}{y_2}\right)\frac{1}{\pi\sqrt{1 - y_2^2}}\,\frac{dy_2}{|y_2|}$$
Special case: if the amplitude is fixed at $A = 1$, the integral collapses and
$$f_{Y_1}(y) = \frac{1}{\pi\sqrt{1 - y^2}}, \qquad |y| < 1$$

18 Statistical Descriptions of Random Processes

Since it is in general exceedingly difficult to completely characterize a random process by the complete joint PDF of all possible sample points, one often resorts to partial statistical descriptions of the process.

The mean of a random process is defined as $\mu_X(t) = E[X(t)]$, or $\mu_X[k] = E[X[k]]$.

The autocorrelation of a random process is defined as $R_{XX}(t_1, t_2) = E[X(t_1)X(t_2)]$, or $R_{XX}[l, m] = E[X[l]X[m]]$.

The autocovariance of a random process can be defined in terms of the mean and autocorrelation as
$$C_{XX}(t_1, t_2) = E\left[(X(t_1) - \mu_X(t_1))(X(t_2) - \mu_X(t_2))\right] = R_{XX}(t_1, t_2) - \mu_X(t_1)\mu_X(t_2)$$

As an example, consider the random sine wave from before, where we obtain
$$\mu_X(t) = E[A]\,E[\cos(2\pi f t + \Theta)] = 0$$
The autocorrelation is given as
$$R_{XX}(t_1, t_2) = E\left[A\cos(2\pi f t_1 + \Theta)\;A\cos(2\pi f t_2 + \Theta)\right] = \frac{E[A^2]}{2}\,E\left[\cos(2\pi f(t_1 + t_2) + 2\Theta) + \cos(2\pi f(t_1 - t_2))\right] = \frac{E[A^2]}{2}\cos(2\pi f\tau), \qquad \tau = t_1 - t_2$$

19 Stationarity of Random Processes

In the previous example, we note that neither $\mu_X(t)$ nor $R_{XX}(t_1, t_2)$ depend on the specific time samples $t, t_1, t_2$. This property is called wide-sense stationarity.

A random process is wide-sense stationary (WSS) if and only if $\mu_X(t) = \mu_X$, independent of $t$, and $R_{XX}(t_1, t_2) = R_X(t_1 - t_2)$, dependent only on the difference $\tau = t_1 - t_2$.

A random process is strict-sense stationary (SSS) if and only if the complete joint PDF of the process is invariant under a time translation $t_0$:
$$f_{X[t_1 + t_0],\ldots,X[t_k + t_0]}(x_1, \ldots, x_k) = f_{X[t_1],\ldots,X[t_k]}(x_1, \ldots, x_k)$$

Basic properties:
1. Strict-sense stationarity implies wide-sense stationarity.
2. The autocorrelation function is even, that is, $R_X(\tau) = R_X(-\tau)$.
3. The autocorrelation achieves its maximum at $\tau = 0$, i.e., $|R_X(\tau)| \le R_X(0)$.

Finding $\mu_X(t)$ and $R_{XX}(t_1, t_2)$ is often the only hope one has in characterizing random processes.

20 Power Spectral Density of a Stationary Random Process

The power spectral density (PSD) of a wide-sense stationary process is an extremely important and powerful characterization and analysis tool. It is simply defined as the Fourier transform of the autocorrelation function.

The PSD of a wide-sense stationary (WSS) random process is given as
$$S_X(f) = \int_{-\infty}^{\infty} R_X(\tau)\,e^{-j2\pi f\tau}\,d\tau \quad\text{for a continuous process, and}\quad S_X(\Omega) = \sum_{k=-\infty}^{\infty} R_X[k]\,e^{-jk\Omega} \quad\text{for a discrete process.}$$

Basic properties:
1. $S_X(f)$ is real, non-negative, and even.
2. The Fourier transform property allows the autocorrelation to be recaptured from the PSD:
$$R_X(\tau) = \int_{-\infty}^{\infty} S_X(f)\,e^{j2\pi f\tau}\,df \quad\text{for a continuous process, and}\quad R_X[k] = \frac{1}{2\pi}\int_{-\pi}^{\pi} S_X(\Omega)\,e^{jk\Omega}\,d\Omega \quad\text{for a discrete process.}$$
3. $R_X(0) = E[X^2(t)] = \int_{-\infty}^{\infty} S_X(f)\,df$ is the average power of the process $X(t)$.
4. The integral $\int_{f_1}^{f_2} S_X(f)\,df$ gives the average power of the process in the frequency band $[f_1, f_2]$.

21 Stationary Random Process Example: White Noise

Thermal noise can be modeled as a Gaussian random process. We approximate the noise waveform by a sequence of random pulses $w(t)$ of duration $T_s$,
$$n(t) = \sum_{i=-\infty}^{\infty} n_i\,w(t - iT_s - \delta)$$
where the weighting coefficients $n_i$ are independent Gaussian random variables and $T_s$ is the discrete time increment; $\delta$ is a random delay, uniformly distributed in $[0, T_s)$.

[Figure: a sample staircase noise waveform $n(t)$ versus $t$.]

We compute its autocorrelation function as
$$R(\tau) = E\left[n(t)\,n(t + \tau)\right] = E\left[\sum_i\sum_j n_i n_j\,w(t - iT_s - \delta)\,w(t + \tau - jT_s - \delta)\right] = \sigma^2\,E_\delta\left[\sum_i w(t - iT_s - \delta)\,w(t + \tau - iT_s - \delta)\right] = \frac{\sigma^2}{T_s}\left(1 - \frac{|\tau|}{T_s}\right), \quad |\tau| \le T_s$$
and $R(\tau) = 0$ for $|\tau| > T_s$, where $\sigma^2 = E[n_i^2]$ and the pulses $w(t)$ are taken to have unit energy.

22 White Noise (continued)

$R(\tau)$ is a triangular function as shown below.

[Figure: the triangular autocorrelation $R(\tau)$, of height $\sigma^2/T_s$ and support $[-T_s, T_s]$, and its Fourier transform $N(f)$, which has nulls at multiples of $1/T_s$.]

As $T_s \to 0$, in the limit $N(f) \to \sigma^2$. In the limit, therefore, white noise has an even distribution of power over all frequencies. As $n(t) \to n_w(t)$, the correlation function $R(\tau)$ degenerates into an impulse of width zero and infinite height as $T_s \to 0$, that is, $R(\tau) \to \sigma^2\delta(\tau)$, where $\delta(\tau)$ is known as Dirac's impulse function. We typically only need the sifting property of $\delta(t)$:
$$\int_{-\infty}^{\infty}\delta(t - \alpha)\,f(t)\,dt = f(\alpha)$$
where $f(t)$ is an arbitrary function which is continuous at $t = \alpha$.

White noise $n_w(t)$
1. is a stationary process with $E[n_w(t)] = 0$ and autocorrelation $R(\tau) = \sigma^2\delta(\tau)$;
2. has infinite power, $E[n_w^2(t)] = \infty$, and can thus only be used in a system when filtered;
3. has a Gaussian distribution, and is therefore not only WSS but also strict-sense stationary;
4. is an excellent description of thermal noise and other widespread noise sources.

23 Filtering of Random Processes

Let $h(t)$ be the impulse response of a linear time-invariant system (see lecture notes ECED 35), that is, an input signal $x(t)$ undergoes the convolution $y(t) = \int x(\alpha)\,h(t - \alpha)\,d\alpha$. Now if $X(t)$ is a random process, the output
$$Y(t) = \int_{-\infty}^{\infty} X(\alpha)\,h(t - \alpha)\,d\alpha$$
is also a random process. Its mean is
$$m_Y(t) = E\left[\int_{-\infty}^{\infty} X(\alpha)\,h(t - \alpha)\,d\alpha\right] = \int_{-\infty}^{\infty} m_X(\alpha)\,h(t - \alpha)\,d\alpha$$
Likewise, its autocorrelation function is
$$R_{YY}(t_1, t_2) = E\left[\int X(\alpha)\,h(t_1 - \alpha)\,d\alpha\int X(\beta)\,h(t_2 - \beta)\,d\beta\right] = \iint E[X(\alpha)X(\beta)]\,h(t_1 - \alpha)\,h(t_2 - \beta)\,d\alpha\,d\beta = \iint R_{XX}(\alpha, \beta)\,h(t_1 - \alpha)\,h(t_2 - \beta)\,d\alpha\,d\beta$$

If the input process $X(t)$ to a filter $h(t)$ is stationary, the output process $Y(t) = \int X(\alpha)\,h(t - \alpha)\,d\alpha$ is also stationary, with
$$m_Y = m_X\int_{-\infty}^{\infty} h(\alpha)\,d\alpha, \qquad R_Y(\tau) = \iint R_X(\tau + \alpha - \beta)\,h(\alpha)\,h(\beta)\,d\alpha\,d\beta$$

24 The Power Spectrum of a Stationary Random Process

For a wide-sense stationary random process we can define the power spectrum. For a filtered stationary process $Y(t)$ we obtain
$$R_Y(\tau) = \iint R_X(\tau + \alpha - \beta)\,h(\alpha)\,h(\beta)\,d\alpha\,d\beta = \iiint S_X(f)\,e^{j2\pi f(\tau + \alpha - \beta)}\,df\,h(\alpha)\,h(\beta)\,d\alpha\,d\beta = \int S_X(f)\,e^{j2\pi f\tau}\underbrace{\int h(\alpha)\,e^{j2\pi f\alpha}\,d\alpha}_{H^*(f)}\;\underbrace{\int h(\beta)\,e^{-j2\pi f\beta}\,d\beta}_{H(f)}\,df = \int S_X(f)\,|H(f)|^2\,e^{j2\pi f\tau}\,df$$
Consequently, the output power spectral density of a filtered stationary process with power spectral density $S_X(f)$ is given as
$$S_Y(f) = S_X(f)\,|H(f)|^2$$
This illustrates the meaning of the term power spectral density, since the power of the output signal $y(t)$ is given by
$$R_Y(0) = \int_{-\infty}^{\infty} S_X(f)\,|H(f)|^2\,df = E[Y^2(t)]$$
and so $S_X(f)$ describes the mean power as a function of the frequency of the signal $X(t)$.
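The relation $S_Y(f) = S_X(f)|H(f)|^2$ can be checked with a simple discrete-time experiment (an illustrative sketch, not part of the notes; the FIR filter is an arbitrary choice): white noise with flat two-sided PSD $\sigma^2$ is filtered, and the output PSD estimated with Welch's method is compared to $\sigma^2|H(f)|^2$.

```python
import numpy as np
from scipy import signal

rng = np.random.default_rng(5)
sigma2 = 1.0
x = np.sqrt(sigma2) * rng.standard_normal(1_000_000)   # white noise, S_X(f) = sigma2

h = signal.firwin(64, 0.25)                   # some low-pass FIR filter (illustrative)
y = signal.lfilter(h, 1.0, x)                 # filtered process Y

# two-sided PSD estimate of Y and the filter response on the same frequency grid
f, S_Y = signal.welch(y, fs=1.0, nperseg=4096, return_onesided=False)
_, H = signal.freqz(h, worN=f, fs=1.0)

mask = np.abs(H)**2 > 0.1                     # compare inside the passband
print(np.max(np.abs(S_Y[mask] / (sigma2 * np.abs(H[mask])**2) - 1)))   # small estimation error
```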

25 White Gaussian Noise Through a Brickwall Filter

White noise $n_w(t)$ is a theoretical concept with no real-world analog. In order to obtain useful models, we pass white noise with power spectral density $N_0/2$ through a filter with characteristic
$$H(f) = \begin{cases} 1 & |f| \le W \\ 0 & \text{otherwise} \end{cases}$$
and we then generate the filtered noise $n(t) = n_w(t) \ast h(t)$. We compute the autocorrelation function as
$$R_N(\tau) = \int_{-\infty}^{\infty} S_{n_w}(f)\,|H(f)|^2\,e^{j2\pi f\tau}\,df = \frac{N_0}{2}\int_{-W}^{W}\cos(2\pi f\tau)\,df = W N_0\,\frac{\sin(2\pi W\tau)}{2\pi W\tau}$$
We compute the power of the process as $R_N(0) = W N_0$.

Now let us sample the noise at times which are multiples of $1/(2W)$, that is,
$$n_i = n(t_i), \qquad t_i = \frac{i}{2W}$$
The sequence $\{n_i\}$ of samples is a discrete random process consisting of the sequence of random variables $n_i$. The correlation of these random variables can be computed as follows:
$$E[n_i n_j] = R_N(t_j - t_i) = R_N\!\left(\frac{j - i}{2W}\right) = W N_0\,\frac{\sin(\pi(j - i))}{\pi(j - i)} = \begin{cases} W N_0 & i = j \\ 0 & i \ne j \end{cases}$$

26 Ergodicity of Random Processes

Ergodicity is a very important concept which connects experimental observation with the stationary properties of a random process. First, for a process to be ergodic in a parameter, it has to be stationary in that parameter. Consider for example the mean of an ergodic process $X(t)$:
$$\mu_X(t) = E[X(t)] = \mu_X, \qquad\text{if the process is stationary in the mean.}$$
If we observe a given sample process $x(t)$, we can alternately compute the sample mean
$$\langle x(t)\rangle_T = \frac{1}{T}\int_{-T/2}^{T/2} x(t)\,dt$$
A random process $X(t)$ is ergodic in the mean if and only if $\langle x(t)\rangle_T \to \mu_X$, or, more formally, if
$$\lim_{T\to\infty} E\left[\left(\langle x(t)\rangle_T - \mu_X\right)^2\right] = 0 \qquad\Longleftrightarrow\qquad \underset{T\to\infty}{\mathrm{l.i.m.}}\;\frac{1}{T}\int_{-T/2}^{T/2} x(t)\,dt = \mu_X$$
where l.i.m. stands for limit in the mean square.

Let us consider the random variable $\langle x(t)\rangle_T$:
$$E\left[\langle x(t)\rangle_T^2\right] = \frac{1}{T^2}\int_{-T/2}^{T/2}\int_{-T/2}^{T/2} E[x(t_1)x(t_2)]\,dt_1\,dt_2$$
$$\mathrm{var}\left(\langle x(t)\rangle_T\right) = \frac{1}{T^2}\int_{-T/2}^{T/2}\int_{-T/2}^{T/2} C_{XX}(t_1 - t_2)\,dt_1\,dt_2 = \frac{1}{T}\int_{-T}^{T}\left(1 - \frac{|\nu|}{T}\right)C_{XX}(\nu)\,d\nu \le \frac{1}{T}\int_{-T}^{T}\left|C_{XX}(\nu)\right|\,d\nu$$
with the substitution $t_1 - t_2 = \nu$.

27 Ergodicity of Random Processes (continued)

From above we note that a stationary random process can be ergodic in the mean only if its autocovariance function averages out to zero. This means that the process has limited memory, or that the memory disappears as $T$ becomes large.

Ergodicity is an important concept. It means that we can replace statistical ensemble averages with averages over a single observed process. Ergodicity has to be established for each parameter separately.

Ergodicity of the autocorrelation: Let the sample autocorrelation function be given by
$$\langle R_{XX}(\alpha)\rangle_T = \frac{1}{T}\int_{-T/2}^{T/2} x(t)\,x(t + \alpha)\,dt = Y(\alpha)$$
$\langle R_{XX}(\alpha)\rangle_T$ is a random variable, whose expectation is given by
$$E\left[\langle R_{XX}(\alpha)\rangle_T\right] = \frac{1}{T}\int_{-T/2}^{T/2} E[x(t)\,x(t + \alpha)]\,dt = \frac{1}{T}\int_{-T/2}^{T/2} R_{XX}(\alpha)\,dt = R_{XX}(\alpha)$$
To show convergence in the mean-square sense we need the variance of $Y(\alpha)$ to vanish:
$$\mathrm{var}(Y(\alpha)) = \frac{1}{T^2}\int_{-T/2}^{T/2}\int_{-T/2}^{T/2}\Big(E\big[\underbrace{x(t_1)x(t_1 + \alpha)}_{z(t_1)}\;\underbrace{x(t_2)x(t_2 + \alpha)}_{z(t_2)}\big] - R_{XX}^2(\alpha)\Big)\,dt_1\,dt_2 = \frac{1}{T^2}\iint C_{ZZ}(t_1, t_2)\,dt_1\,dt_2 \approx \frac{1}{T}\int_{-T}^{T} C_{ZZ}(\tau)\,d\tau \;\to\; 0$$
In order for a random process to be ergodic in the autocorrelation function, this fourth-order statistic, the covariance $C_{ZZ}$ of the product process $z(t)$, must average out to zero.

28 Brownian Motion

Let us revisit the discrete random process $S_n = \sum_{i=1}^{n} X_i$ and compute
$$\mu_S = 0, \qquad R_{SS}[l, m] = \min(l, m)$$
This is shown as follows, assuming $l \le m$:
$$R_{SS}[l, m] = E[S_l S_m] = E\left[S_l\left(S_l + \sum_{i=l+1}^{m} X_i\right)\right] = E[S_l^2] + E[S_l]\,E\left[\sum_{i=l+1}^{m} X_i\right] = E[S_l^2] = l$$
From this we can define the classical Wiener process as
$$X(t) = \begin{cases} 0, & t = 0 \\ d\,S_n, & (n-1)T < t \le nT \end{cases}$$
$$E[X(t)] = 0, \qquad E[X^2(t)] = \frac{t}{T}\,d^2 = \alpha t \quad\left(= n\,d^2 \text{ at } t = nT\right)$$
If we let $T \to 0$, $d \to 0$ with $d^2/T = \alpha$ held constant, we obtain the Wiener process $w(t)$:
$$f_W(w; t) = \frac{1}{\sqrt{2\pi\alpha t}}\exp\left(-\frac{w^2}{2\alpha t}\right)$$
The Wiener process is a non-stationary random process with
1. $E[w(t)] = 0$, $E[w^2(t)] = \alpha t$
2. $R_{WW}(t_1, t_2) = \alpha\min(t_1, t_2)$
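A discrete simulation of the Wiener process (an illustrative sketch, not from the notes): scaled random-walk increments with $d^2/T = \alpha$ produce sample paths whose variance grows as $\alpha t$ and whose correlation follows $\alpha\min(t_1, t_2)$.

```python
import numpy as np

rng = np.random.default_rng(6)
alpha, T = 2.0, 5e-3                 # step interval T, step size d with d^2 / T = alpha
d = np.sqrt(alpha * T)
n_steps, n_paths = 400, 10000

steps = d * rng.choice([-1.0, 1.0], size=(n_paths, n_steps))
W = np.cumsum(steps, axis=1)         # sample paths of X(t) = d * S_n, with t = n*T

t1, t2 = 0.5, 1.5                    # check E[w^2(t)] = alpha*t and R(t1,t2) = alpha*min(t1,t2)
i1, i2 = int(t1 / T) - 1, int(t2 / T) - 1
print(np.var(W[:, i2]), alpha * t2)
print(np.mean(W[:, i1] * W[:, i2]), alpha * min(t1, t2))
```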

29 Bayesian Estimation

Contrary to classical estimation, which uses no prior information about the parameters to be estimated, Bayesian estimation makes use of probabilistic prior information. This enables Bayesian estimation to significantly improve the estimation accuracy on average.

[Figure: block diagrams. Classical: a parameter $\theta$ produces the observation $x$ ($x$ is a statistic), which is fed to an estimator $g(x)$ producing $\hat\theta$. Bayesian: in addition, a PDF on $\theta$ is available to the estimator.]

In the Bayesian approach, the parameter $\theta$ is considered to be a random variable whose PDF is known, or approximately known. We hence work with the joint PDF
$$f_{X,\Theta}(x, \theta) = f_{\Theta|X}(\theta|x)\,f_X(x)$$

Minimum mean-square error (MMSE) estimation: What was not possible in classical estimation theory is now feasible: minimize
$$E\left[(g(X) - \theta)^2\right] = \iint (g(x) - \theta)^2 f_{X,\Theta}(x, \theta)\,dx\,d\theta = \int f_X(x)\left[\int (g(x) - \theta)^2 f_{\Theta|X}(\theta|x)\,d\theta\right]dx$$
We can now minimize the inner integral above w.r.t. $g$ for each $x$, and thus minimize the entire expression and therefore the squared error.

30 MMSE Estimator

We proceed with the minimization as follows:
$$\int (g(x) - \theta)^2 f_{\Theta|X}(\theta|x)\,d\theta = g^2(x) - 2g(x)\underbrace{\int\theta\,f_{\Theta|X}(\theta|x)\,d\theta}_{\text{conditional estimator: } E[\theta|x]} + \int\theta^2 f_{\Theta|X}(\theta|x)\,d\theta = \left(g(x) - E[\theta|x]\right)^2 + \mathrm{var}(\theta|x)$$
The conditional variance $\mathrm{var}(\theta|x)$ represents an irreducible error, but the other term is minimized by setting

Conditional expectation: $g_{\mathrm{MMSE}}(x) = E[\theta|x]$

that is, the conditional expectation minimizes the mean-squared error of a Bayesian estimator. However, the conditional expectation is often difficult to find, and in practice another estimator, the maximum-likelihood estimator, is often easier to find.

Maximum-likelihood estimator: $g_{\mathrm{ML}}(x) = \arg\max_\theta f_{\Theta|X}(\theta|x)$

[Figure: a skewed conditional density $f_{\Theta|X}(\theta|x)$ with the conditional-expectation estimate and the maximum-likelihood estimate marked.]

For symmetrical conditional density functions the two estimators coincide, but in general $g_{\mathrm{ML}}(x) \ne g_{\mathrm{MMSE}}(x)$.

31 MMSE Estimator: Example

Example: Consider the familiar example of observing an unknown amplitude $a$ in Gaussian noise, i.e.,
$$x_i = a + n_i; \qquad 1 \le i \le n$$
In this case the PDF of the observation vector $x = (x_1, \ldots, x_n)$ is now interpreted as a conditional probability density function, given by
$$f_{X|A}(x|a) = \frac{1}{(2\pi\sigma^2)^{n/2}}\exp\left(-\frac{1}{2\sigma^2}\sum_{i=1}^{n}(x_i - a)^2\right)$$
The critical assumption in Bayesian estimation is now the PDF of $a$. Let us assume here that $a$ is distributed according to an a-priori Gaussian PDF following
$$f_A(a) = \frac{1}{\sqrt{2\pi\sigma_A^2}}\exp\left(-\frac{(a - \mu_A)^2}{2\sigma_A^2}\right)$$
where $\mu_A$ is, of course, the mean of the distribution and $\sigma_A^2$ is its variance. We now need to compute the conditional expectation
$$\hat A = E[A|x]$$
We deviate briefly and discuss the form of jointly distributed Gaussian random variables (GRVs). If two GRVs $X$ and $Y$ are uncorrelated, then their joint PDF is simply the product
$$f_{X,Y}(x, y) = \frac{1}{2\pi\sigma_x\sigma_y}\exp\left(-\frac{(x - \mu_x)^2}{2\sigma_x^2} - \frac{(y - \mu_y)^2}{2\sigma_y^2}\right)$$
However, if they are correlated, the product is no longer separable; a correlation coefficient
$$\rho = \frac{E\left[(X - \mu_x)(Y - \mu_y)\right]}{\sigma_x\sigma_y}$$
connects the two, and
$$f_{X,Y}(x, y) = \frac{1}{2\pi\sigma_x\sigma_y\sqrt{1 - \rho^2}}\exp\left(-\frac{1}{2(1 - \rho^2)}\left[\frac{(x - \mu_x)^2}{\sigma_x^2} - 2\rho\,\frac{(x - \mu_x)(y - \mu_y)}{\sigma_x\sigma_y} + \frac{(y - \mu_y)^2}{\sigma_y^2}\right]\right)$$

32 Multivariate Gaussian Random Variables

A multivariate Gaussian random variable distribution is one where $n$ RVs are all jointly Gaussian distributed, as
$$f_X(x) = \frac{1}{(2\pi)^{n/2}\,|C_{xx}|^{1/2}}\exp\left(-\frac{1}{2}(x - \mu_x)^T C_{xx}^{-1}(x - \mu_x)\right)$$
In this equation, $x = (x_1, \ldots, x_n)$ is the vector of $n$ GRVs, and $C_{xx}$ is the pairwise covariance matrix
$$C_{xx} = \begin{bmatrix} C_{11} & C_{12} & \cdots & C_{1n} \\ C_{21} & C_{22} & \cdots & C_{2n} \\ \vdots & & \ddots & \vdots \\ C_{n1} & C_{n2} & \cdots & C_{nn} \end{bmatrix}$$
and $C_{ij} = E\left[(X_i - \mu_{x_i})(X_j - \mu_{x_j})\right]$ is the covariance between $x_i$ and $x_j$.

The following extremely important formula applies to vectors of jointly Gaussian random variables. Separate the vector $x = [y, z]$ into two partial vectors, with rearrangements if required. The conditional expectation of one part, given knowledge of the other, is given by

Conditional expectation of Gaussian random variables:
$$E[Y|z] = E[Y] + C_{yz}C_{zz}^{-1}\left(z - E[z]\right), \qquad C_{y|z} = C_{yy} - C_{yz}C_{zz}^{-1}C_{zy}$$
where $C_{y|z}$ is the conditional covariance matrix
$$C_{y|z} = E\left[\left(Y - \mu_{y|z}\right)\left(Y - \mu_{y|z}\right)^T\,\middle|\,z\right] = E\left[YY^T\,\middle|\,z\right] - \mu_{y|z}\,\mu_{y|z}^T$$

Note: The conditional expectation in the case of jointly Gaussian random vectors is a linear transformation of the statistic.

33 MMSE Estimator: Example Continued

We now let our vector of joint GRVs be $x = [A, x_1, \ldots, x_n]$ and apply the conditional expectation formula from above. With $\mathbf{1}$ denoting the all-ones vector, we obtain
$$C_{zz} = \sigma_A^2\,\mathbf{1}\mathbf{1}^T + \sigma^2 I, \qquad C_{yy} = \sigma_A^2, \qquad C_{yz} = \sigma_A^2\,\mathbf{1}^T$$
keeping in mind that the first part $y$ is only the scalar RV $A$. A straightforward application of the conditional expectation formula for GRVs now gives the optimal estimator as
$$\hat A = E[A|x] = \mu_A + \sigma_A^2\,\mathbf{1}^T\left(\sigma_A^2\,\mathbf{1}\mathbf{1}^T + \sigma^2 I\right)^{-1}\left(x - \mu_A\mathbf{1}\right)$$
Using the Woodbury identity, a version of the matrix inversion lemma, we can avoid the matrix inverse above.

Woodbury's identity:
$$\left(I + c\,\mathbf{1}\mathbf{1}^T\right)^{-1} = I - \frac{c\,\mathbf{1}\mathbf{1}^T}{1 + nc}$$
We obtain
$$\hat A = E[A|x] = \mu_A + \frac{\sigma_A^2}{\sigma^2}\,\mathbf{1}^T\left(I - \frac{(\sigma_A^2/\sigma^2)\,\mathbf{1}\mathbf{1}^T}{1 + n\sigma_A^2/\sigma^2}\right)\left(x - \mu_A\mathbf{1}\right)$$
which can be manipulated into
$$\hat A = \underbrace{\alpha\,\bar x}_{\text{data part}} + \underbrace{(1 - \alpha)\,\mu_A}_{\text{prior knowledge}}; \qquad \alpha = \frac{\sigma_A^2}{\sigma_A^2 + \sigma^2/n}$$
where $\bar x = \frac{1}{n}\sum_i x_i$ is the sample mean. The estimator thus naturally weighs the information coming from the observation, and what is known about the parameter to be estimated, according to the reliabilities of both.
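The shrinkage form $\hat A = \alpha\bar x + (1-\alpha)\mu_A$ is easy to exercise numerically. The sketch below (illustrative, with made-up parameter values) draws $A$ from its prior, generates noisy observations, and compares the Bayesian estimator with the sample mean:

```python
import numpy as np

rng = np.random.default_rng(7)
mu_A, sigma_A, sigma = 1.0, 0.5, 2.0     # prior mean/std of A, observation noise std
n, trials = 10, 50000

A = mu_A + sigma_A * rng.standard_normal(trials)             # A drawn from its prior
x = A[:, None] + sigma * rng.standard_normal((trials, n))    # x_i = A + n_i

alpha = sigma_A**2 / (sigma_A**2 + sigma**2 / n)
xbar = x.mean(axis=1)
A_bayes = alpha * xbar + (1 - alpha) * mu_A                  # Bayesian MMSE estimator
A_class = xbar                                               # classical estimator (sample mean)

print("Bayesian MSE  :", np.mean((A_bayes - A)**2))   # approx. alpha * sigma^2 / n
print("classical MSE :", np.mean((A_class - A)**2))   # approx. sigma^2 / n
```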

34 MMSE Estimator: Example Continued

We can also quite easily compute the mean squared error of our estimator, for a given value of $A$, as
$$E\left[(\hat A - A)^2\right] = E\left[\left(\alpha\frac{1}{n}\sum_{i=1}^{n}(A + n_i) + (1 - \alpha)\mu_A - A\right)^2\right] = E\left[\left((1 - \alpha)(\mu_A - A) + \alpha\frac{1}{n}\sum_{i=1}^{n} n_i\right)^2\right] = (1 - \alpha)^2(A - \mu_A)^2 + \alpha^2\frac{\sigma^2}{n}$$
Therefore, the Bayesian estimator can give a substantially lower error than the unbiased classical estimator, whose minimum mean-square error equals $\sigma^2/n$.

[Figure: mean-square error (MSE) as a function of the actual parameter $A$. Classical: $\sigma^2/n$, constant. Bayesian: $\alpha^2\sigma^2/n$ at $A = \mu_A$, growing quadratically away from $\mu_A$.]

$$\alpha = \frac{\sigma_A^2}{\sigma_A^2 + \sigma^2/n} < 1; \qquad \sigma_A^2 \to 0 \Rightarrow \alpha \to 0 \text{ (if } A \text{ is known well)}; \qquad n \to \infty \Rightarrow \alpha \to 1 \text{ (if we trust the measurements).}$$

The figure illustrates the error as a function of the actual parameter $A$; if $A$ happens to be too far away from its expectation $\mu_A$, the Bayesian estimator can actually generate an error which is worse than that of the classical estimator, but the average and the minimum error can both be substantially smaller.

35 Linear Estimator

A linear estimator for a single parameter $\theta$ is one that is a linear combination of the observed data, i.e.,
$$\hat\theta = \sum_{i=1}^{n} a_i x_i + b$$
where the coefficients $a_i$ and $b$ are constants to be optimized for minimum error in the estimate. We first find $b$ by differentiation:
$$\frac{\partial}{\partial b}E\left[(\theta - \hat\theta)^2\right] = -2\,E\left[\theta - \sum_{i=1}^{n} a_i x_i - b\right] = 0 \quad\Longrightarrow\quad b = \mu_\theta - \sum_{i=1}^{n} a_i\mu_{x_i}$$
This takes care of the bias, that is, $E[\hat\theta] = \mu_\theta$. The coefficients $a_i$ are now also found via partial differentiation:
$$\frac{\partial}{\partial a_i}E\left[(\theta - \hat\theta)^2\right] = -2\,E\Big[\underbrace{\Big(\theta - \sum_{j=1}^{n} a_j x_j - b\Big)}_{\text{error}}\;\underbrace{x_i}_{\text{data}}\Big] = 0$$
This leads to the famous orthogonality principle:

Orthogonality Principle: In an optimal linear estimator the data is orthogonal to the estimation error:
$$E\left[\Big(\theta - \sum_{j=1}^{n} a_j x_j - b\Big)\,x_i\right] = 0, \qquad 1 \le i \le n$$

36 Linear Prediction

Note that the parameter $\theta$ is arbitrary; it does not necessarily have to be a causal parameter of the random process $x$, it simply has to be probabilistically correlated with $x$.

Linear prediction: we choose $\theta = x_{n+1}$, the next RV in the random process $x = (x_1, x_2, x_3, \ldots)$, that is,
$$\hat x_{n+1} = \sum_{j=1}^{N} a_j\,x_{n+1-j}$$
where $N$ is the predictor order. By a direct application of the orthogonality principle we find the optimal coefficients as
$$a = R^{-1} r$$
where $r = [R_{n+1,n}, \ldots, R_{n+1,n-N+1}]^T$ and $R$ is an $N \times N$ matrix of similar values $R_{i,j}$, where $R_{i,j} = E[X_i X_j]$ is the correlation between the $i$-th and the $j$-th element in the sequence. These are the Wiener-Hopf equations. For a stationary process $x$ we have $R_{i,j} = R_{j-i} = R_{i-j}$, and these equations become

Wiener-Hopf equations for stationary processes:
$$\begin{bmatrix} R_1 \\ R_2 \\ \vdots \\ R_N \end{bmatrix} = \begin{bmatrix} R_0 & R_1 & \cdots & R_{N-1} \\ R_1 & R_0 & \cdots & R_{N-2} \\ \vdots & & \ddots & \vdots \\ R_{N-1} & R_{N-2} & \cdots & R_0 \end{bmatrix}\begin{bmatrix} a_1 \\ a_2 \\ \vdots \\ a_N \end{bmatrix} \qquad\Longleftrightarrow\qquad r = R\,a$$
These equations can be solved very efficiently, due to the Toeplitz structure of $R$, by the Levinson-Durbin algorithm.
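A minimal numerical sketch of the Wiener-Hopf solution (illustrative only, assuming the autocorrelation sequence $R_k$ is known): the Toeplitz system $Ra = r$ is solved with SciPy's Toeplitz solver, a Levinson-type $O(N^2)$ routine.

```python
import numpy as np
from scipy.linalg import solve_toeplitz

# assumed example: a process with autocorrelation R_k = rho^|k| (AR(1)-like)
rho, N = 0.9, 5
R_full = rho ** np.arange(N + 1)        # R_0, R_1, ..., R_N

c = R_full[:N]                          # first column of the Toeplitz matrix R (R_0 ... R_{N-1})
r = R_full[1:]                          # right-hand side (R_1 ... R_N)

a = solve_toeplitz(c, r)                # predictor coefficients a_1 ... a_N
mse = R_full[0] - a @ r                 # sigma_x^2 - r^T R^{-1} r (see next page)
print(a)                                # for R_k = rho^|k| this is [rho, 0, ..., 0]
print(mse)                              # equals 1 - rho^2
```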

37 Linear Prediction: Mean-Square Error

To compute the mean-square error (MSE) of the predictor we calculate
$$E\left[(x_{n+1} - \hat x_{n+1})^2\right] = E[x_{n+1}^2] - 2E[x_{n+1}\hat x_{n+1}] + E[\hat x_{n+1}^2] = \sigma_x^2 - E[x_{n+1}\hat x_{n+1}] = \sigma_x^2 - \sum_{j=1}^{N} a_j\,E[x_{n+1}x_{n+1-j}] = \sigma_x^2 - \sum_{j=1}^{N} a_j R_j = \sigma_x^2 - a^T r = \sigma_x^2 - r^T R^{-1} r$$
Since $0 \le r^T R^{-1} r \le \sigma_x^2$, the power of the prediction error depends on the correlation between successive samples in $x$. The example predictor below tracks a fading process with Doppler frequency $f_d = 5$ Hz (Matlab code: LinearPredictor.m).

[Figure: process amplitude versus time in seconds; the predicted traces for two predictor orders, N = 5 and a lower order, are overlaid on the fading process.]

38 Linear Estimators: The Linear MMSE

If we need to estimate $m$ parameters $\theta_j$ from the same data, we proceed as
$$\hat\theta_j = \sum_{i=1}^{n} a_{ji}\,x_i + b_j; \qquad 1 \le j \le m$$
which we can succinctly collect into a single matrix equation
$$\hat\theta = A x + b$$
where $A$ is now an $m \times n$ matrix of coefficients. Proceeding analogously to the single-parameter case we find
$$b = E[\theta] - A\,\mu_x$$
and from there we find
$$\frac{\partial}{\partial a_j}E\left[\left(\theta_j - \mu_{\theta_j} - a_j^T(x - \mu_x)\right)^2\right] = -2\,E\left[\left(\theta_j - \mu_{\theta_j} - a_j^T(x - \mu_x)\right)(x - \mu_x)^T\right] = 0$$
$$E\left[(\theta_j - \mu_{\theta_j})(x - \mu_x)^T\right] = E\left[a_j^T(x - \mu_x)(x - \mu_x)^T\right]$$
from which we compute the optimal coefficients to estimate $\theta_j$ as
$$a_j^T = C_{\theta_j x}\,C_{xx}^{-1}$$
and the correlation matrices are defined implicitly above. Putting all the pieces together, we obtain for the optimal linear estimator:

Linear minimum mean-square error estimator:
$$\hat\theta = A x + b = E[\theta] + C_{\theta x}C_{xx}^{-1}\left(x - \mu_x\right)$$

Note: The linear LMMSE estimator has exactly the same mathematical form as the optimal estimator for jointly Gaussian RVs.

39 Linear Estimators: The Mean-Square Error

The mean-square error (MSE) of the linear estimator $\hat\theta = Ax + b$ can be computed quite easily as follows:
$$\mathrm{MSE}_\Theta = E\left[(\Theta - \hat\Theta)(\Theta - \hat\Theta)^T\right] = E\left[\left(\Theta - \mu_\Theta - A(X - \mu_X)\right)\left(\Theta - \mu_\Theta - A(X - \mu_X)\right)^T\right] = C_{\theta\theta} + A C_{xx} A^T - A C_{x\theta} - C_{\theta x} A^T$$
Noting that $C_{xx} = E\left[(X - \mu_X)(X - \mu_X)^T\right]$, and that $C_{x\theta}$ and $C_{\theta x}$ are defined analogously, we use
$$A = C_{\theta x}C_{xx}^{-1}$$
to obtain
$$\mathrm{MSE}_\Theta = C_{\theta\theta} - C_{\theta x}C_{xx}^{-1}C_{x\theta}$$
Note that $\mathrm{MSE}_\Theta$ measures not only the expected squared error of all $\hat\theta_j$, but also the correlation of errors $E\left[(\theta_j - \hat\theta_j)(\theta_{j'} - \hat\theta_{j'})\right]$.

Putting all the pieces together, we obtain for the optimal linear estimator:

Linear minimum mean-square error estimator:
$$\hat\theta = A x + b = E[\theta] + C_{\theta x}C_{xx}^{-1}\left(x - \mu_x\right) \qquad\text{with error}\qquad \mathrm{MSE}_\Theta = C_{\theta\theta} - C_{\theta x}C_{xx}^{-1}C_{x\theta}$$

40 Example: Channel Estimation

Consider the case of an unknown discrete channel as shown here:

[Figure: the input sequence $u_i$ passes through a tapped delay line with taps $h_0, h_1, \ldots, h_P$; noise $w_i$ is added to produce the observation $x_i$.]

The unknown parameters are the tap values $h_p$ of the channel, while the observed output data is given by
$$x = \begin{bmatrix} u_0 & & & \\ u_1 & u_0 & & \\ u_2 & u_1 & u_0 & \\ \vdots & & & \ddots \\ u_{n-1} & u_{n-2} & \cdots & u_{n-1-P} \end{bmatrix}\begin{bmatrix} h_0 \\ h_1 \\ h_2 \\ \vdots \\ h_P \end{bmatrix} + w = U h + w$$
The data covariance matrix required in the Bayesian approach is given by
$$C_{xx} = E\left[(X - \mu_x)(X - \mu_x)^T\right] = E\left[(Uh - U\mu_h)(Uh - U\mu_h)^T\right] + E[ww^T] = U C_{hh} U^T + C_{ww}$$
The optimal Bayesian estimator for this problem is given by
$$\hat h = C_{hh}U^T\left(U C_{hh} U^T + C_{ww}\right)^{-1}\left(x - U\mu_h\right) + \mu_h$$
If the channel gains (taps) are uncorrelated with zero mean, $C_{hh} = I$, $\mu_h = 0$, then
$$\hat h = U^T\left(U U^T + C_{ww}\right)^{-1} x$$
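A small numerical sketch of the channel estimator (illustrative; the channel, training sequence, and noise level are made up): build the convolution matrix U from a known training sequence, then apply the Bayesian formula with $C_{hh} = I$, $\mu_h = 0$, and $C_{ww} = \sigma^2 I$.

```python
import numpy as np
from scipy.linalg import toeplitz

rng = np.random.default_rng(8)
P, n, sigma2 = 4, 100, 0.1                    # P+1 channel taps, n observations

h = rng.standard_normal(P + 1)                # unknown channel taps
u = rng.choice([-1.0, 1.0], size=n)           # known training sequence

# convolution matrix: row i is [u_i, u_{i-1}, ..., u_{i-P}] (zeros before the start)
U = toeplitz(u, np.r_[u[0], np.zeros(P)])

x = U @ h + np.sqrt(sigma2) * rng.standard_normal(n)   # observations x = U h + w

# Bayesian estimator with C_hh = I, mu_h = 0, C_ww = sigma2 * I
h_hat = U.T @ np.linalg.solve(U @ U.T + sigma2 * np.eye(n), x)
print(np.max(np.abs(h_hat - h)))              # small for a long training sequence
```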

41 Linear Models

The previous example is an instance of a linear model (not to be confused with a linear estimator). In a linear model, the vector of observed values, the statistic, is given by the linear relation from the previous page, $x = U\theta + w$. If no statistical information about the parameters $\theta$ is known, we apply classical estimation and compute the CRLB as follows:
$$\frac{\partial}{\partial\theta}\ln p(x;\theta) = \frac{\partial}{\partial\theta}\left[-\frac{1}{2\sigma^2}(x - U\theta)^T(x - U\theta)\right] = \frac{1}{\sigma^2}\left(U^T x - U^T U\theta\right) = \frac{U^T U}{\sigma^2}\left[\left(U^T U\right)^{-1}U^T x - \theta\right]$$
The last equation is in the canonical CRLB form, which allows us to immediately extract both the minimum-variance unbiased estimator as well as its error:

Minimum variance estimator:
$$\hat\theta = \left(U^T U\right)^{-1}U^T x \qquad\text{with error covariance}\qquad E\left[(\hat\theta - \theta)(\hat\theta - \theta)^T\right] = \sigma^2\left(U^T U\right)^{-1}$$
Note that in this derivation we have assumed that the noise $w$ is white with variance $\sigma^2$ per coordinate. The Bayesian estimator for the equivalent problem, also assuming uncorrelated parameters $C_{\theta\theta} = I$, is
$$\hat\theta = U^T\left(U U^T + \sigma^2 I\right)^{-1} x$$
which, with the help of the matrix inversion lemma, can be brought into the form
$$\hat\theta = \left(U^T U + \sigma^2 I\right)^{-1}U^T x$$

42 An Alternate Approach and Its Geometric View

The method of least squares (LS) computes the solution $\theta$ with the smallest squared distance to the received measurements, i.e.,
$$\hat\theta = \arg\min_\theta\;\|x - U\theta\|^2$$
It has exactly the same solution as the MVUE from the classical consideration, namely
$$\hat\theta = \left(U^T U\right)^{-1}U^T x \quad\Longrightarrow\quad U^T U\hat\theta = U^T x \quad\Longrightarrow\quad U^T\left(U\hat\theta - x\right) = 0$$
The first product, $U\hat\theta$, generates a vector which is a linear combination of the columns $\{u_k\}$ of $U$; this vector lives in the span of $U$, as shown below.

[Figure: the observation $x$, its orthogonal projection $Px = U\hat\theta$ onto the plane $\mathrm{span}\{u_k\}$, and the orthogonal error vector.]

It is easy to see that $U\hat\theta$ is orthogonal to the error $U\hat\theta - x$ (second expansion above), and therefore
$$U\hat\theta = U\left(U^T U\right)^{-1}U^T x$$
is the operation of orthogonally projecting $x$ onto the span of $U$. The matrix
$$P = U\left(U^T U\right)^{-1}U^T = P^2$$
is the orthogonal projection matrix onto $U$.
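A quick numeric illustration of the projection view (a sketch only, with random data): the fitted vector $U\hat\theta = Px$ is the orthogonal projection of $x$ onto the span of the columns of $U$, the residual is orthogonal to that span, and $P$ is idempotent.

```python
import numpy as np

rng = np.random.default_rng(9)
n, p = 20, 3
U = rng.standard_normal((n, p))
x = rng.standard_normal(n)

theta_hat = np.linalg.solve(U.T @ U, U.T @ x)     # least-squares solution
P = U @ np.linalg.solve(U.T @ U, U.T)             # projection matrix onto span(U)

print(np.allclose(P @ P, P))                      # idempotent: P^2 = P
print(np.allclose(U.T @ (U @ theta_hat - x), 0))  # residual orthogonal to the columns of U
print(np.allclose(P @ x, U @ theta_hat))          # P x equals the fitted vector
```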

43 Linear Channel Model: Revisited

Revisiting the linear channel model in the classical approach, we know that the CRLB error bound can be achieved in this case, and the error is given by
$$C_{\hat h\hat h} = \sigma^2\left(U^T U\right)^{-1}$$
If we use the unit coordinate vector $e_i = (0, \ldots, 0, 1, 0, \ldots, 0)$ with a single 1 at the $i$-th position, then the variance of the estimate of the $i$-th channel gain is given as
$$\mathrm{var}(\hat h_i) = e_i^T C_{\hat h\hat h}\,e_i = e_i^T D^T D\,e_i$$
where we have used a symmetric decomposition, such as the Cholesky decomposition, to obtain $C_{\hat h\hat h} = D^T D$. This can always be done with a (semi-)definite matrix such as $C_{\hat h\hat h}$. We now use a form of the famous Cauchy-Schwarz inequality,
$$\left(\sum_i x_i y_i\right)^2 \le \left(\sum_i x_i^2\right)\left(\sum_i y_i^2\right)$$
with $x = D e_i$ and $y = D^{-T} e_i$, to obtain
$$1 = \left(e_i^T e_i\right)^2 = \left(e_i^T D^T D^{-T} e_i\right)^2 \le \left(e_i^T D^T D\,e_i\right)\left(e_i^T D^{-1}D^{-T} e_i\right) = \left(e_i^T C_{\hat h\hat h}\,e_i\right)\left(e_i^T C_{\hat h\hat h}^{-1}\,e_i\right)$$
and therefore
$$\mathrm{var}(\hat h_i) = e_i^T C_{\hat h\hat h}\,e_i \ge \frac{1}{e_i^T C_{\hat h\hat h}^{-1}\,e_i} = \frac{\sigma^2}{\left[U^T U\right]_{ii}}$$
that is, the tap-estimate variance is never smaller than $\sigma^2$ divided by the input energy associated with that tap, and in general it is strictly larger. This effect is known as noise enhancement of the pseudo-inverse.

44 Wiener Filtering

In the cases considered so far (with the exception of the predictor), the parameters to be estimated were fixed and affected the random process from outside. However, any random quantity which is correlated with the observation can serve as a parameter, and therefore let us consider the case where these parameters evolve continuously:
$$x = \theta + w = s + w$$
We now have as many parameters as measurements, and successful estimation is only possible if there is a strong correlation among the components of $s$. Using the optimal linear estimator we obtain the Wiener estimator
$$\hat s = C_{ss}\left(C_{ss} + C_{ww}\right)^{-1} x$$

Filter interpretation: The equation above is basically a matrix-vector multiplication, i.e.,
$$\hat s = C_{ss}\left(C_{ss} + C_{ww}\right)^{-1} x = F x$$
As the process evolves, we compute
$$\hat s_n = \sum_{k=0}^{n} f^{(n)}_{n-k}\,x_k = r_{ss}^T\left(C_{ss} + C_{ww}\right)^{-1} x$$
where $r_{ss}^T$ is the last row of $C_{ss}$. This equation can be brought into the form of the Wiener-Hopf equations:
$$\left(C_{ss} + C_{ww}\right)f^{(n)} = r_{ss}$$
For stationary processes the correlation values must decay as $n \to \infty$, which leads to stationary coefficients $f$ that no longer evolve with $n$. The equation above for $\hat s_n$ then describes a time-invariant discrete filter, the Wiener filter.

45 Sequential Linear Minimum Mean-Square Error Estimator

The LMMSE is an important step towards the modern recursive estimation and tracking algorithms such as the Kalman and the recursive least-squares estimators. Starting point is the minimum mean-square error estimator for the linear system $x = U\theta + w$. We are interested in updating the estimate after one new observation, i.e.,
$$\begin{bmatrix} x \\ x[n] \end{bmatrix} = \begin{bmatrix} U \\ u^T[n] \end{bmatrix}\theta + w$$
The orthogonality principle states that the data is orthogonal to the estimation error:
$$E\left[\left(\theta_i - \hat\theta_i\right)x\right] = 0$$
We now propose the following form for the updated estimator at time $n$:
$$\hat\theta_i[n] = \hat\theta_i[n-1] + c[n]$$
The orthogonality principle leads to the following conditions:
(a) $E\left[c[n]\,x[i]\right] = 0$, for $i \in [0, n-1]$
(b) $E\left[\left(\theta_i - \hat\theta_i[n-1] - c[n]\right)x[n]\right] = 0$

One-step prediction: We start with a one-step predictor for $x[n]$, denoted by $\hat x[n]$. From before, we know that the conditional expectation is the best estimator:
$$\hat x[n] = E\left[x[n]\,\middle|\,x\right] \quad\Longrightarrow\quad c[n] = k_i[n]\left(x[n] - \hat x[n]\right)$$
Furthermore, since $\hat x[n]$ and $x$ are jointly Gaussian:
$$\hat x[n] = E[x[n]] + C_{x[n]x}C_{xx}^{-1}\left(x - \mu_x\right) = u^T[n]\mu_\theta + u^T[n]C_{\theta\theta}U^T C_{xx}^{-1}\left(x - U\mu_\theta\right) = u^T[n]\,\hat\theta[n-1]$$

46 Sequential Linear Minimum Mean-Square Error Estimator

Condition (b) is now used to determine the correction term $c[n]$, which is a multiple of the prediction error. The multiplication factor $k_i[n]$ is the Kalman gain and needs to be evaluated. From condition (b) we proceed as
$$E\left[\left(\theta_i - \hat\theta_i[n-1] - k_i[n]\left(x[n] - \hat x[n]\right)\right)\left(x[n] - \hat x[n]\right)\right] = 0$$
Using $x[n] - \hat x[n] = u^T[n]\left(\theta - \hat\theta[n-1]\right) + w[n]$, this gives
$$k_i[n] = \frac{E\left[\left(\theta_i - \hat\theta_i[n-1]\right)\left(\theta - \hat\theta[n-1]\right)^T\right]u[n]}{u^T[n]\,M[n-1]\,u[n] + \sigma^2[n]}$$
If we combine all the individual Kalman gains into a gain vector $k[n]$ we obtain
$$k[n] = \begin{bmatrix} k_0[n] \\ k_1[n] \\ \vdots \\ k_P[n] \end{bmatrix} = \frac{M[n-1]\,u[n]}{u^T[n]\,M[n-1]\,u[n] + \sigma^2[n]}$$
The matrix $M[n]$ is the error covariance matrix at time $n$ and is needed in the algorithm. Both $M[n]$ and $k[n]$ are computed recursively in the algorithm, which is summarized here.

The sequential linear minimum mean-square error estimator update equations are given by

Gain update: $\quad k[n] = \dfrac{M[n-1]\,u[n]}{u^T[n]\,M[n-1]\,u[n] + \sigma^2[n]}$

Error variance update: $\quad M[n] = \left(I - k[n]\,u^T[n]\right)M[n-1]$

Estimator update: $\quad \hat\theta[n] = \hat\theta[n-1] + k[n]\left(x[n] - u^T[n]\hat\theta[n-1]\right)$

The complexity of this one-step update is dominated by the matrix multiplications in the gain and error variance update steps. Both of these involve $P \times P$ matrices with a complexity of $O(P^2)$ per update, as opposed to the $O(n^3)$ cost of a direct batch solution.

47 The Kalman Filter: The State-Space Model

The primary difference between the Kalman filter and the sequential MMSE estimator is that the linear system model is no longer driven by a fixed parameter vector, but by the update equation
$$x[n] = u^T[n]\,s[n] + w[n]$$
that is, the formerly fixed parameter $\theta$ is now also evolving with time and is denoted by $s[n]$. The key to the Kalman filter is the model for $s[n]$, the so-called state-space model. Its accuracy determines how well the filter works, and its general form is given by

The Kalman state-space model is defined as
$$s[n] = A\,s[n-1] + B\,u[n]$$
where $A$ is a $P \times P$ model evolution description, $u[n]$ is white innovation noise of unit variance, and $B$ is a noise impact matrix.

Example: Vehicle Tracking. We wish to track a vehicle in the $x$-$y$ plane, where $r_x[n], r_y[n]$ are the $x$-$y$ coordinates at time $n$, and $v_x[n], v_y[n]$ are the speeds in the $x$-$y$ directions. Our system state is defined as the four parameters $s^T[n] = [r_x[n], r_y[n], v_x[n], v_y[n]]$, and the state-space model is given by
$$s[n] = \begin{bmatrix} r_x[n] \\ r_y[n] \\ v_x[n] \\ v_y[n] \end{bmatrix} = \begin{bmatrix} 1 & 0 & \delta t & 0 \\ 0 & 1 & 0 & \delta t \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}\begin{bmatrix} r_x[n-1] \\ r_y[n-1] \\ v_x[n-1] \\ v_y[n-1] \end{bmatrix} + \begin{bmatrix} 0 \\ 0 \\ u_x[n] \\ u_y[n] \end{bmatrix}$$
Here, the state-space model captures the motion of the vehicle via position and speed; uncertainties, i.e., accelerations, are added to the model via the innovations in the speed components. Note that uncertainties in the measurements of the positions, for example, are modeled in the linear system equation at the top of the page.

48 The Kalman Filter: The Algorithm

The actual algorithm development for the Kalman filter is identical to that of the sequential LMMSE. We need the first-order measurement prediction
$$\hat x[n] = C_{x[n]x}C_{xx}^{-1} x = u^T[n]\,A\,C_{s[n-1]x}C_{xx}^{-1} x = u^T[n]\,\hat s[n|n-1]$$
which requires the one-step prediction $\hat s[n|n-1]$ of the system state. Similar to the parameter update of the sequential LMMSE, the system state update is given as a correction of the state prediction at time $n$:
$$\hat s[n|n] = \hat s[n|n-1] + k[n]\left(x[n] - u^T[n]\hat s[n|n-1]\right)$$
The remainder of the development is identical to the one discussed before, and we obtain the complete Kalman filter equations:

State prediction: $\quad \hat s[n|n-1] = A\,\hat s[n-1|n-1]$

MSE prediction: $\quad M[n|n-1] = A\,M[n-1|n-1]\,A^T + BB^T$

Gain update: $\quad k[n] = \dfrac{M[n|n-1]\,u[n]}{u^T[n]\,M[n|n-1]\,u[n] + \sigma^2[n]}$

Error variance update: $\quad M[n|n] = \left(I - k[n]\,u^T[n]\right)M[n|n-1]$

Estimator update: $\quad \hat s[n|n] = \hat s[n|n-1] + k[n]\left(x[n] - u^T[n]\hat s[n|n-1]\right)$

The only functional difference between the Kalman filter and the sequential LMMSE is that in the former, both the new state and the estimated output are predicted, and therefore there are also an update and a prediction equation for the error covariance.
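The vehicle-tracking example can be turned into a short simulation (an illustrative sketch with made-up noise levels; it assumes scalar position-only measurements, so that u[n] selects one coordinate per step, alternating between the x and y positions):

```python
import numpy as np

rng = np.random.default_rng(10)
dt, steps = 0.1, 200
sigma_w, sigma_u = 0.5, 0.2          # measurement noise std, innovation (acceleration) std

# state s = [r_x, r_y, v_x, v_y]; constant-velocity model from the notes
A = np.array([[1, 0, dt, 0],
              [0, 1, 0, dt],
              [0, 0, 1, 0],
              [0, 0, 0, 1]])
B = np.vstack([np.zeros((2, 2)), sigma_u * np.eye(2)])   # innovations enter the speeds

s = np.array([0.0, 0.0, 1.0, 0.5])   # true initial state
s_hat = np.zeros(4)                  # filter state estimate
M = 10.0 * np.eye(4)                 # initial error covariance

for n in range(steps):
    # true state evolution and a scalar position measurement (alternating x / y)
    s = A @ s + B @ rng.standard_normal(2)
    u = np.array([1.0, 0, 0, 0]) if n % 2 == 0 else np.array([0, 1.0, 0, 0])
    x = u @ s + sigma_w * rng.standard_normal()

    # Kalman recursion: predict, gain, covariance update, state update
    s_pred = A @ s_hat
    M_pred = A @ M @ A.T + B @ B.T
    k = M_pred @ u / (u @ M_pred @ u + sigma_w**2)
    M = (np.eye(4) - np.outer(k, u)) @ M_pred
    s_hat = s_pred + k * (x - u @ s_pred)

print("true position     :", s[:2])
print("estimated position:", s_hat[:2])
```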


Prof. Dr.-Ing. Armin Dekorsy Department of Communications Engineering. Stochastic Processes and Linear Algebra Recap Slides

Prof. Dr.-Ing. Armin Dekorsy Department of Communications Engineering. Stochastic Processes and Linear Algebra Recap Slides Prof. Dr.-Ing. Armin Dekorsy Department of Communications Engineering Stochastic Processes and Linear Algebra Recap Slides Stochastic processes and variables XX tt 0 = XX xx nn (tt) xx 2 (tt) XX tt XX

More information

TSKS01 Digital Communication Lecture 1

TSKS01 Digital Communication Lecture 1 TSKS01 Digital Communication Lecture 1 Introduction, Repetition, and Noise Modeling Emil Björnson Department of Electrical Engineering (ISY) Division of Communication Systems Emil Björnson Course Director

More information

5 Kalman filters. 5.1 Scalar Kalman filter. Unit delay Signal model. System model

5 Kalman filters. 5.1 Scalar Kalman filter. Unit delay Signal model. System model 5 Kalman filters 5.1 Scalar Kalman filter 5.1.1 Signal model System model {Y (n)} is an unobservable sequence which is described by the following state or system equation: Y (n) = h(n)y (n 1) + Z(n), n

More information

Chp 4. Expectation and Variance

Chp 4. Expectation and Variance Chp 4. Expectation and Variance 1 Expectation In this chapter, we will introduce two objectives to directly reflect the properties of a random variable or vector, which are the Expectation and Variance.

More information

Review of Probability Theory

Review of Probability Theory Review of Probability Theory Arian Maleki and Tom Do Stanford University Probability theory is the study of uncertainty Through this class, we will be relying on concepts from probability theory for deriving

More information

ELEG 5633 Detection and Estimation Minimum Variance Unbiased Estimators (MVUE)

ELEG 5633 Detection and Estimation Minimum Variance Unbiased Estimators (MVUE) 1 ELEG 5633 Detection and Estimation Minimum Variance Unbiased Estimators (MVUE) Jingxian Wu Department of Electrical Engineering University of Arkansas Outline Minimum Variance Unbiased Estimators (MVUE)

More information

ECE 650 Lecture #10 (was Part 1 & 2) D. van Alphen. D. van Alphen 1

ECE 650 Lecture #10 (was Part 1 & 2) D. van Alphen. D. van Alphen 1 ECE 650 Lecture #10 (was Part 1 & 2) D. van Alphen D. van Alphen 1 Lecture 10 Overview Part 1 Review of Lecture 9 Continuing: Systems with Random Inputs More about Poisson RV s Intro. to Poisson Processes

More information

Probability and Statistics for Final Year Engineering Students

Probability and Statistics for Final Year Engineering Students Probability and Statistics for Final Year Engineering Students By Yoni Nazarathy, Last Updated: May 24, 2011. Lecture 6p: Spectral Density, Passing Random Processes through LTI Systems, Filtering Terms

More information

Lecture Notes 4 Vector Detection and Estimation. Vector Detection Reconstruction Problem Detection for Vector AGN Channel

Lecture Notes 4 Vector Detection and Estimation. Vector Detection Reconstruction Problem Detection for Vector AGN Channel Lecture Notes 4 Vector Detection and Estimation Vector Detection Reconstruction Problem Detection for Vector AGN Channel Vector Linear Estimation Linear Innovation Sequence Kalman Filter EE 278B: Random

More information

26. Filtering. ECE 830, Spring 2014

26. Filtering. ECE 830, Spring 2014 26. Filtering ECE 830, Spring 2014 1 / 26 Wiener Filtering Wiener filtering is the application of LMMSE estimation to recovery of a signal in additive noise under wide sense sationarity assumptions. Problem

More information

Lecture 2: Repetition of probability theory and statistics

Lecture 2: Repetition of probability theory and statistics Algorithms for Uncertainty Quantification SS8, IN2345 Tobias Neckel Scientific Computing in Computer Science TUM Lecture 2: Repetition of probability theory and statistics Concept of Building Block: Prerequisites:

More information

ECE353: Probability and Random Processes. Lecture 18 - Stochastic Processes

ECE353: Probability and Random Processes. Lecture 18 - Stochastic Processes ECE353: Probability and Random Processes Lecture 18 - Stochastic Processes Xiao Fu School of Electrical Engineering and Computer Science Oregon State University E-mail: xiao.fu@oregonstate.edu From RV

More information

Stochastic Processes. A stochastic process is a function of two variables:

Stochastic Processes. A stochastic process is a function of two variables: Stochastic Processes Stochastic: from Greek stochastikos, proceeding by guesswork, literally, skillful in aiming. A stochastic process is simply a collection of random variables labelled by some parameter:

More information

ECE 541 Stochastic Signals and Systems Problem Set 11 Solution

ECE 541 Stochastic Signals and Systems Problem Set 11 Solution ECE 54 Stochastic Signals and Systems Problem Set Solution Problem Solutions : Yates and Goodman,..4..7.3.3.4.3.8.3 and.8.0 Problem..4 Solution Since E[Y (t] R Y (0, we use Theorem.(a to evaluate R Y (τ

More information

Fourier Analysis Linear transformations and lters. 3. Fourier Analysis. Alex Sheremet. April 11, 2007

Fourier Analysis Linear transformations and lters. 3. Fourier Analysis. Alex Sheremet. April 11, 2007 Stochastic processes review 3. Data Analysis Techniques in Oceanography OCP668 April, 27 Stochastic processes review Denition Fixed ζ = ζ : Function X (t) = X (t, ζ). Fixed t = t: Random Variable X (ζ)

More information

Random Processes Why we Care

Random Processes Why we Care Random Processes Why we Care I Random processes describe signals that change randomly over time. I Compare: deterministic signals can be described by a mathematical expression that describes the signal

More information

Continuous Random Variables

Continuous Random Variables 1 / 24 Continuous Random Variables Saravanan Vijayakumaran sarva@ee.iitb.ac.in Department of Electrical Engineering Indian Institute of Technology Bombay February 27, 2013 2 / 24 Continuous Random Variables

More information

State-space Model. Eduardo Rossi University of Pavia. November Rossi State-space Model Financial Econometrics / 49

State-space Model. Eduardo Rossi University of Pavia. November Rossi State-space Model Financial Econometrics / 49 State-space Model Eduardo Rossi University of Pavia November 2013 Rossi State-space Model Financial Econometrics - 2013 1 / 49 Outline 1 Introduction 2 The Kalman filter 3 Forecast errors 4 State smoothing

More information

Chapter 2 Random Processes

Chapter 2 Random Processes Chapter 2 Random Processes 21 Introduction We saw in Section 111 on page 10 that many systems are best studied using the concept of random variables where the outcome of a random experiment was associated

More information

ECE531 Lecture 12: Linear Estimation and Causal Wiener-Kolmogorov Filtering

ECE531 Lecture 12: Linear Estimation and Causal Wiener-Kolmogorov Filtering ECE531 Lecture 12: Linear Estimation and Causal Wiener-Kolmogorov Filtering D. Richard Brown III Worcester Polytechnic Institute 16-Apr-2009 Worcester Polytechnic Institute D. Richard Brown III 16-Apr-2009

More information

Joint Probability Distributions and Random Samples (Devore Chapter Five)

Joint Probability Distributions and Random Samples (Devore Chapter Five) Joint Probability Distributions and Random Samples (Devore Chapter Five) 1016-345-01: Probability and Statistics for Engineers Spring 2013 Contents 1 Joint Probability Distributions 2 1.1 Two Discrete

More information

Quick Tour of Basic Probability Theory and Linear Algebra

Quick Tour of Basic Probability Theory and Linear Algebra Quick Tour of and Linear Algebra Quick Tour of and Linear Algebra CS224w: Social and Information Network Analysis Fall 2011 Quick Tour of and Linear Algebra Quick Tour of and Linear Algebra Outline Definitions

More information

EIE6207: Maximum-Likelihood and Bayesian Estimation

EIE6207: Maximum-Likelihood and Bayesian Estimation EIE6207: Maximum-Likelihood and Bayesian Estimation Man-Wai MAK Dept. of Electronic and Information Engineering, The Hong Kong Polytechnic University enmwmak@polyu.edu.hk http://www.eie.polyu.edu.hk/ mwmak

More information

GATE EE Topic wise Questions SIGNALS & SYSTEMS

GATE EE Topic wise Questions SIGNALS & SYSTEMS www.gatehelp.com GATE EE Topic wise Questions YEAR 010 ONE MARK Question. 1 For the system /( s + 1), the approximate time taken for a step response to reach 98% of the final value is (A) 1 s (B) s (C)

More information

UCSD ECE153 Handout #34 Prof. Young-Han Kim Tuesday, May 27, Solutions to Homework Set #6 (Prepared by TA Fatemeh Arbabjolfaei)

UCSD ECE153 Handout #34 Prof. Young-Han Kim Tuesday, May 27, Solutions to Homework Set #6 (Prepared by TA Fatemeh Arbabjolfaei) UCSD ECE53 Handout #34 Prof Young-Han Kim Tuesday, May 7, 04 Solutions to Homework Set #6 (Prepared by TA Fatemeh Arbabjolfaei) Linear estimator Consider a channel with the observation Y XZ, where the

More information

Northwestern University Department of Electrical Engineering and Computer Science

Northwestern University Department of Electrical Engineering and Computer Science Northwestern University Department of Electrical Engineering and Computer Science EECS 454: Modeling and Analysis of Communication Networks Spring 2008 Probability Review As discussed in Lecture 1, probability

More information

Multivariate Distributions

Multivariate Distributions IEOR E4602: Quantitative Risk Management Spring 2016 c 2016 by Martin Haugh Multivariate Distributions We will study multivariate distributions in these notes, focusing 1 in particular on multivariate

More information

PROBABILITY AND RANDOM PROCESSESS

PROBABILITY AND RANDOM PROCESSESS PROBABILITY AND RANDOM PROCESSESS SOLUTIONS TO UNIVERSITY QUESTION PAPER YEAR : JUNE 2014 CODE NO : 6074 /M PREPARED BY: D.B.V.RAVISANKAR ASSOCIATE PROFESSOR IT DEPARTMENT MVSR ENGINEERING COLLEGE, NADERGUL

More information

A6523 Signal Modeling, Statistical Inference and Data Mining in Astrophysics Spring 2011

A6523 Signal Modeling, Statistical Inference and Data Mining in Astrophysics Spring 2011 A6523 Signal Modeling, Statistical Inference and Data Mining in Astrophysics Spring 2011 Reading Chapter 5 (continued) Lecture 8 Key points in probability CLT CLT examples Prior vs Likelihood Box & Tiao

More information

EE4601 Communication Systems

EE4601 Communication Systems EE4601 Communication Systems Week 2 Review of Probability, Important Distributions 0 c 2011, Georgia Institute of Technology (lect2 1) Conditional Probability Consider a sample space that consists of two

More information

4 Derivations of the Discrete-Time Kalman Filter

4 Derivations of the Discrete-Time Kalman Filter Technion Israel Institute of Technology, Department of Electrical Engineering Estimation and Identification in Dynamical Systems (048825) Lecture Notes, Fall 2009, Prof N Shimkin 4 Derivations of the Discrete-Time

More information

ECE Homework Set 3

ECE Homework Set 3 ECE 450 1 Homework Set 3 0. Consider the random variables X and Y, whose values are a function of the number showing when a single die is tossed, as show below: Exp. Outcome 1 3 4 5 6 X 3 3 4 4 Y 0 1 3

More information

5. Random Vectors. probabilities. characteristic function. cross correlation, cross covariance. Gaussian random vectors. functions of random vectors

5. Random Vectors. probabilities. characteristic function. cross correlation, cross covariance. Gaussian random vectors. functions of random vectors EE401 (Semester 1) 5. Random Vectors Jitkomut Songsiri probabilities characteristic function cross correlation, cross covariance Gaussian random vectors functions of random vectors 5-1 Random vectors we

More information

Algorithms for Uncertainty Quantification

Algorithms for Uncertainty Quantification Algorithms for Uncertainty Quantification Tobias Neckel, Ionuț-Gabriel Farcaș Lehrstuhl Informatik V Summer Semester 2017 Lecture 2: Repetition of probability theory and statistics Example: coin flip Example

More information

Probability Theory and Statistics. Peter Jochumzen

Probability Theory and Statistics. Peter Jochumzen Probability Theory and Statistics Peter Jochumzen April 18, 2016 Contents 1 Probability Theory And Statistics 3 1.1 Experiment, Outcome and Event................................ 3 1.2 Probability............................................

More information

ECE 541 Stochastic Signals and Systems Problem Set 9 Solutions

ECE 541 Stochastic Signals and Systems Problem Set 9 Solutions ECE 541 Stochastic Signals and Systems Problem Set 9 Solutions Problem Solutions : Yates and Goodman, 9.5.3 9.1.4 9.2.2 9.2.6 9.3.2 9.4.2 9.4.6 9.4.7 and Problem 9.1.4 Solution The joint PDF of X and Y

More information

Signals and Spectra (1A) Young Won Lim 11/26/12

Signals and Spectra (1A) Young Won Lim 11/26/12 Signals and Spectra (A) Copyright (c) 202 Young W. Lim. Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version.2 or any later

More information

Problems on Discrete & Continuous R.Vs

Problems on Discrete & Continuous R.Vs 013 SUBJECT NAME SUBJECT CODE MATERIAL NAME MATERIAL CODE : Probability & Random Process : MA 61 : University Questions : SKMA1004 Name of the Student: Branch: Unit I (Random Variables) Problems on Discrete

More information