Dr.-Ing. Sudchai Boonto
Department of Control System and Instrumentation Engineering
King Mongkut's University of Technology Thonburi, Thailand
Random process

Consider a random experiment specified by the outcomes $\rho$ from some sample space $S$, by the events defined on $S$, and by the probabilities on these events. Suppose that to every outcome $\rho \in S$ we assign a function of time according to some rule: $X(t, \rho)$, $t \in I$. The graph of the function $X(t, \rho)$ versus $t$, for fixed $\rho$, is called a realization, sample path, or sample function of the random process. For a fixed $t_i$ from the index set $I$, $X(t_i, \rho)$ is a random variable. The indexed family of random variables $\{X(t, \rho),\ t \in I\}$ is called a random process or stochastic process.
Random process example

[Figure: three noise generators driven by the outcomes $\rho_1$, $\rho_2$, $\rho_3$ produce the sample paths $x(t, \rho_1)$, $x(t, \rho_2)$, $x(t, \rho_3)$; sampling each path at the fixed times $t_1$ and $t_2$ yields the random variables $x(t_1, \rho_i)$ and $x(t_2, \rho_i)$.]
Random process: continuous vs. discrete

A random process is said to be discrete-time if the index set $I$ is a countable set (i.e., the set of integers or the set of nonnegative integers). We will usually use $k$ to denote the time index and $X(k)$ to denote the random process. A continuous-time stochastic process is one in which $I$ is continuous, i.e., the real line or the nonnegative real line.
Random process: random sinusoids

Let $\rho$ be selected at random from the interval $[-1, 1]$. Define the continuous-time random process $X(t, \rho)$ by
$$X(t, \rho) = \rho \sin(2\pi t), \quad -\infty < t < \infty.$$
The realizations of this random process are sinusoids with amplitude $\rho$, as shown in the figure.

[Figure: sample realizations for $\rho = 0.9$, $\rho = 0.4$, and $\rho = 0.2$ plotted against $t$.]
Random process: random sinusoids

Let $\rho$ be selected at random from the interval $[-\pi, \pi]$, and let
$$Y(t, \rho) = \sin(2\pi t + \rho).$$
The realizations of $Y(t, \rho)$ are time-shifted versions of $\sin 2\pi t$, as shown in the figure.

[Figure: sample realizations for $\rho = 0$ and $\rho = \pi/2$ plotted against $t$.]
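As a numerical aside, both random sinusoid processes are easy to simulate. The following is a minimal sketch (the seed, grid, and variable names are my own choices, not from the slides):

```python
# Simulate realizations of the two random sinusoid processes above.
import numpy as np

rng = np.random.default_rng(0)
t = np.linspace(0.0, 2.0, 200)                 # time grid covering two periods

# X(t, rho) = rho * sin(2*pi*t), with rho uniform on [-1, 1]
rho_amp = rng.uniform(-1.0, 1.0, size=3)
X = rho_amp[:, None] * np.sin(2 * np.pi * t)[None, :]   # one realization per row

# Y(t, rho) = sin(2*pi*t + rho), with rho uniform on [-pi, pi]
rho_phase = rng.uniform(-np.pi, np.pi, size=3)
Y = np.sin(2 * np.pi * t[None, :] + rho_phase[:, None])

print("amplitudes:", rho_amp)   # each row of X is a scaled sinusoid
print("phases:", rho_phase)     # each row of Y is a time-shifted sinusoid
```

Each row of `X` and `Y` is one sample path; fixing a column index instead gives samples of the random variables $X(t_i, \rho)$ and $Y(t_i, \rho)$.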
Distribution and density functions

Consider the time sequence $\{x(k, \rho)\}_{k=0}^{N-1}$. The $k$-th sample $x(k, \rho_j)$ of each run is a random variable. The first-order distribution function is defined as
$$F_{x(k)}(\alpha) = P[x(k) \le \alpha].$$
Assuming that this function is differentiable, its first-order density function is defined as
$$f_{x(k)}(\alpha) = \frac{d F_{x(k)}(\alpha)}{d\alpha}.$$
Distribution and density functions

The joint distribution function is
$$F_{x(k),x(l)}(\alpha_1, \alpha_2) = P[x(k) \le \alpha_1,\ x(l) \le \alpha_2],$$
and the joint density function is
$$f_{x(k),x(l)}(\alpha_1, \alpha_2) = \frac{\partial^2 F_{x(k),x(l)}(\alpha_1, \alpha_2)}{\partial \alpha_1\, \partial \alpha_2}.$$
Expectations of random signals

Denoting the random signal by the time sequence $x(k)$, the mean is also a time sequence and is given by
$$\mu_x(k) = E[x(k)] = \int_{-\infty}^{\infty} \alpha f_{x(k)}(\alpha)\, d\alpha.$$
The auto-correlation function of a random process $x(k)$ is defined as the joint moment of $x(k)$ and $x(l)$:
$$R_x(k, l) = E[x(k)x(l)] = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} \alpha_1 \alpha_2 f_{x(k),x(l)}(\alpha_1, \alpha_2)\, d\alpha_1\, d\alpha_2.$$
In general, the auto-correlation is a function of both $k$ and $l$. The auto-covariance function of a random process $x(k)$ is defined as the covariance of $x(k)$ and $x(l)$:
$$C_x(k, l) = E\left[(x(k) - \mu_x(k))(x(l) - \mu_x(l))\right] = R_x(k, l) - \mu_x(k)\mu_x(l).$$
Cross-correlation function

For $k, l = 1, 2$, the auto-covariance values can be arranged as the matrix
$$C_x = E\begin{bmatrix} (x(1) - \mu_x(1))^2 & (x(1) - \mu_x(1))(x(2) - \mu_x(2)) \\ (x(2) - \mu_x(2))(x(1) - \mu_x(1)) & (x(2) - \mu_x(2))^2 \end{bmatrix},$$
where the off-diagonal entries are the cross terms. In particular,
$$C_x(k, k) = \mathrm{var}[x(k)].$$
The correlation coefficient of $x(k)$ and $x(l)$ is defined as
$$\rho_x(k, l) = \frac{C_x(k, l)}{\sqrt{C_x(k, k)}\sqrt{C_x(l, l)}}.$$
Note that $C_x(k, k)$ and $C_x(l, l)$ are scalars. The random signals $x(k)$ and $y(k)$ are uncorrelated if $C_{xy}(k, l) = 0$ for all $k, l$, and orthogonal if $R_{xy}(k, l) = 0$ for all $k, l$, where $C_{xy}(k, l) = E[(x(k) - \mu_x(k))(y(l) - \mu_y(l))]$ and $R_{xy}(k, l) = E[x(k)y(l)]$ are the cross-covariance and cross-correlation functions.
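These ensemble quantities can be estimated by averaging over many independent realizations. A minimal sketch follows; the test process (an invented random-amplitude cosine) and all names are my own:

```python
# Estimate mu_x(k), C_x(k, l), and rho_x(k, l) from an ensemble of realizations.
import numpy as np

rng = np.random.default_rng(1)
M, N = 100_000, 8                       # number of realizations, samples each
k_axis = np.arange(N)

# Test process: x(k) = A*cos(pi*k/4 + 0.5) with random amplitude A ~ N(1, 0.5^2).
A = rng.normal(1.0, 0.5, size=(M, 1))
x = A * np.cos(np.pi * k_axis / 4 + 0.5)

mu = x.mean(axis=0)                     # ensemble mean, one value per k
xc = x - mu                             # centered realizations
C = (xc.T @ xc) / M                     # C[k, l] ~ E[(x(k)-mu(k))(x(l)-mu(l))]
d = np.sqrt(np.diag(C))
rho = C / np.outer(d, d)                # correlation coefficients rho_x(k, l)

# All entries come out as +-1: a single random variable A drives every sample,
# so any two samples are perfectly (anti-)correlated.
print(np.round(rho[0], 3))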
Random signals: example 1

Let $x(k) = A \cos 2\pi k$, where $A$ is some random variable. The mean of $x(k)$ is
$$\mu_x(k) = E[A \cos 2\pi k] = E[A] \cos 2\pi k.$$
The auto-correlation is
$$R_x(k, l) = E[(A \cos 2\pi k)(A \cos 2\pi l)] = E[A^2] \cos 2\pi k \cos 2\pi l.$$
The auto-covariance is then
$$C_x(k, l) = R_x(k, l) - \mu_x(k)\mu_x(l) = \left(E[A^2] - E[A]^2\right) \cos 2\pi k \cos 2\pi l = \mathrm{var}[A] \cos 2\pi k \cos 2\pi l.$$
Random signals: example 2

Let $x(k) = \cos(\omega k + \Theta)$, where $\Theta$ is uniformly distributed in the interval $(-\pi, \pi)$. The mean of $x(k)$ is
$$\mu_x(k) = E[\cos(\omega k + \Theta)] = \frac{1}{2\pi} \int_{-\pi}^{\pi} \cos(\omega k + \theta)\, d\theta = 0.$$
The auto-correlation and auto-covariance are then
$$C_x(k, l) = R_x(k, l) = E[\cos(\omega k + \Theta)\cos(\omega l + \Theta)] = \frac{1}{2\pi} \int_{-\pi}^{\pi} \frac{1}{2}\left\{\cos(\omega(k - l)) + \cos(\omega(k + l) + 2\theta)\right\} d\theta = \frac{1}{2}\cos(\omega(k - l)),$$
where we used $\cos a \cos b = \frac{1}{2}\cos(a + b) + \frac{1}{2}\cos(a - b)$.
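A quick Monte Carlo check of this result is sketched below (the value of $\omega$, the ensemble size, and the lag pairs are my own choices): $R_x(k, l)$ should depend only on $k - l$ and equal $\frac{1}{2}\cos(\omega(k - l))$.

```python
# Monte Carlo check that E[cos(w k + Theta) cos(w l + Theta)] = 0.5*cos(w (k-l)).
import numpy as np

rng = np.random.default_rng(2)
w, M = 0.7, 1_000_000
theta = rng.uniform(-np.pi, np.pi, size=M)      # Theta uniform on (-pi, pi)

def R_hat(k, l):
    return np.mean(np.cos(w * k + theta) * np.cos(w * l + theta))

for k, l in [(3, 1), (7, 5), (10, 8)]:          # all pairs share the lag k - l = 2
    print(k, l, round(R_hat(k, l), 4), round(0.5 * np.cos(w * (k - l)), 4))
```

All three estimates agree with each other and with the closed form, illustrating the dependence on the lag alone.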
Gaussian random signals

A discrete-time random signal $x(k)$ is a Gaussian random signal if every collection of a finite number of samples of this random signal is jointly Gaussian. The probability density function of the samples $x(k)$, $k = 0, 1, 2, \ldots, N-1$, of a Gaussian random signal is given by
$$f_{x(0),x(1),\ldots,x(N-1)}(\alpha_0, \alpha_1, \ldots, \alpha_{N-1}) = \frac{1}{(2\pi)^{N/2} \det(C_x)^{1/2}} \exp\left(-\frac{1}{2}(\alpha - \mu_x)^T C_x^{-1} (\alpha - \mu_x)\right),$$
where $\alpha = \begin{bmatrix} \alpha_0 & \alpha_1 & \cdots & \alpha_{N-1} \end{bmatrix}^T$,
$$\mu_x = \begin{bmatrix} E[x(0)] & E[x(1)] & \cdots & E[x(N-1)] \end{bmatrix}^T,$$
Gaussian random signals

and
$$C_x = \begin{bmatrix} C_x(0,0) & C_x(0,1) & \cdots & C_x(0,N-1) \\ C_x(1,0) & C_x(1,1) & \cdots & C_x(1,N-1) \\ \vdots & \vdots & \ddots & \vdots \\ C_x(N-1,0) & C_x(N-1,1) & \cdots & C_x(N-1,N-1) \end{bmatrix}.$$
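Because a Gaussian random signal is fully specified by $\mu_x$ and $C_x$, realizations can be drawn directly from them. A minimal sketch, where the particular mean and covariance are invented for illustration:

```python
# Draw realizations of a Gaussian random signal from its mean vector mu_x
# and covariance matrix C_x, then verify the sample covariance.
import numpy as np

rng = np.random.default_rng(3)
N = 4
mu_x = np.zeros(N)
k = np.arange(N)
C_x = 0.8 ** np.abs(k[:, None] - k[None, :])   # invented: C_x(k, l) = 0.8^|k-l|

x = rng.multivariate_normal(mu_x, C_x, size=50_000)   # one realization per row
print(np.round(np.cov(x, rowvar=False), 2))           # ~ C_x
```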
IID random signals

IID stands for independent, identically distributed. An IID random signal $x(k)$ is a sequence of independent, identically distributed random variables with common distribution function $F_x(\alpha)$, mean $\mu$, and variance $\sigma^2$:
$$F_{x(0),x(1),\ldots,x(N-1)}(\alpha_0, \alpha_1, \ldots, \alpha_{N-1}) = F_x(\alpha_0) F_x(\alpha_1) \cdots F_x(\alpha_{N-1}).$$
The mean of an IID process is obtained from
$$\mu_x(k) = E[x(k)] = \mu, \quad \forall k.$$
Thus the mean is constant.
IID random signals

The auto-covariance function is obtained as follows. For $k \ne l$,
$$C_x(k, l) = E[(x(k) - \mu)(x(l) - \mu)] = E[x(k) - \mu]\, E[x(l) - \mu] = 0,$$
since $x(k)$ and $x(l)$ are independent random variables. For $k = l$,
$$C_x(k, k) = E[(x(k) - \mu)^2] = \sigma^2.$$
In compact form,
$$C_x(k, l) = \sigma^2 \delta_{k,l},$$
where $\delta_{k,l} = 1$ if $k = l$ and $0$ otherwise.
IID random signals

The auto-correlation function of the IID process is obtained from
$$R_x(k, l) = C_x(k, l) + \mu^2.$$
The second term is $\mu^2$ because the mean of an IID signal is constant.
Stationary random signals

A discrete-time random signal $x(k)$ is stationary if the joint probability distribution function of any finite number of samples does not depend on the placement of the time origin, that is,
$$F_{x(k_0),x(k_1),\ldots,x(k_{N-1})}(\alpha_0, \alpha_1, \ldots, \alpha_{N-1}) = F_{x(k_0+\tau),x(k_1+\tau),\ldots,x(k_{N-1}+\tau)}(\alpha_0, \alpha_1, \ldots, \alpha_{N-1}), \quad \forall \tau \in \mathbb{Z}.$$
The first-order probability distribution function of a stationary random process must be independent of time, i.e.,
$$F_{x(k)}(\alpha) = F_{x(k+\tau)}(\alpha) = F_x(\alpha), \quad \forall k, \tau.$$
Stationary random signals

This implies that the mean and variance of $x(k)$ are constant and independent of time:
$$\mu_x(k) = E[x(k)] = \mu_x, \quad \forall k,$$
$$\mathrm{var}[x(k)] = E[(x(k) - \mu_x)^2] = \sigma_x^2, \quad \forall k.$$
The second-order probability distribution function of a stationary random process can depend only on the time difference between the samples and not on the particular time of the samples:
$$F_{x(k),x(l)}(\alpha_1, \alpha_2) = F_{x(0),x(k-l)}(\alpha_1, \alpha_2), \quad \forall k, l.$$
Stationary random signals

This implies that the auto-correlation and the auto-covariance of $x(k)$ can depend only on $k - l$:
$$R_x(k, l) = R_x(k - l), \quad \forall k, l,$$
$$C_x(k, l) = C_x(k - l), \quad \forall k, l.$$
For an IID random signal, the joint distribution function for the samples at any $N$ time instants $0, \ldots, N-1$ is
$$F_{x(0),x(1),\ldots,x(N-1)}(\alpha_0, \alpha_1, \ldots, \alpha_{N-1}) = F_x(\alpha_0) F_x(\alpha_1) \cdots F_x(\alpha_{N-1}) = F_{x(0-\tau),x(1-\tau),\ldots,x(N-1-\tau)}(\alpha_0, \alpha_1, \ldots, \alpha_{N-1}).$$
Thus the IID process is also stationary.
Wide-sense stationary random signals

In many situations we cannot determine whether a random process is stationary, but we can determine whether the mean is a constant,
$$\mu_x(k) = m, \quad \forall k,$$
and whether the auto-covariance (or, equivalently, the auto-correlation) is a function of $k - l$ only:
$$C_x(k, l) = C_x(k - l), \quad \forall k, l.$$
A random process that satisfies both conditions above is called a wide-sense stationary (WSS) process.
Wide-sense stationary random signals

A random signal $x(k)$ is wide-sense stationary (WSS) if the following three conditions are satisfied:
i. its mean is constant: $\mu_x(k) = E[x(k)] = \mu_x$;
ii. its auto-correlation function $R_x(k, l)$ depends only on the lag $k - l$;
iii. its variance is finite: $\mathrm{var}[x(k)] = E[(x(k) - \mu_x)^2] < \infty$.
Wide-sense stationary random signals

Let $x(k)$ consist of two interleaved sequences of independent random variables. For $k$ even, $x(k)$ assumes the values $\pm 1$ with probability $1/2$ each; for $k$ odd, $x(k)$ assumes the values $1/3$ and $-3$ with probabilities $9/10$ and $1/10$, respectively. $x(k)$ is not stationary since its PDF varies with $k$. The mean is
$$\mu_x(k) = \sum_\alpha \alpha P[x(k) = \alpha] = \begin{cases} (1)\frac{1}{2} + (-1)\frac{1}{2} = 0, & k \text{ even}, \\ \frac{1}{3}\cdot\frac{9}{10} + (-3)\cdot\frac{1}{10} = 0, & k \text{ odd}, \end{cases}$$
so $\mu_x(k) = 0$ for all $k$, and the covariance function is
$$C_x(k, l) = \begin{cases} E[x(k)]\,E[x(l)] = 0, & k \ne l, \\ E[x(k)^2] = 1, & k = l. \end{cases}$$
$x(k)$ is therefore wide-sense stationary.
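A minimal Monte Carlo sketch of this example (sample sizes and names are my own) confirms that both interleaved sequences share the mean 0 and variance 1 computed above:

```python
# Check that the even-time and odd-time distributions have mean 0 and variance 1.
import numpy as np

rng = np.random.default_rng(4)
M = 1_000_000
x_even = rng.choice([1.0, -1.0], p=[0.5, 0.5], size=M)        # x(k) for k even
x_odd = rng.choice([1.0 / 3.0, -3.0], p=[0.9, 0.1], size=M)   # x(k) for k odd

for name, x in [("even k", x_even), ("odd k", x_odd)]:
    print(name, "mean ~", round(x.mean(), 3), "variance ~", round(x.var(), 3))
```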
Wide-sense stationary random signals: example

White Gaussian noise (WGN): since $x(k) \sim \mathcal{N}(0, \sigma^2)$ for all $k$, we have
$$\mu_x(k) = 0, \qquad \sigma_{x(k)}^2 = \sigma^2.$$
Recalling that the auto-covariance of $x(k)$ with itself is just the variance, we have
$$C_x(k, l) = \begin{cases} 0, & k \ne l, \\ \sigma^2, & k = l, \end{cases}$$
or, in short, $C_x(k, l) = \sigma^2 \delta(k - l)$ for all $k, l$. In summary, for a WGN random process we have $\mu_x(k) = 0$ for all $k$ and $C_x(k, l) = \sigma^2 \delta(k - l)$.
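The diagonal covariance structure of WGN is easy to see numerically. A minimal sketch (parameters are my choices):

```python
# Ensemble estimate of the WGN autocovariance; the result should be ~ sigma^2 * I.
import numpy as np

rng = np.random.default_rng(5)
sigma, M, N = 2.0, 200_000, 5
x = rng.normal(0.0, sigma, size=(M, N))    # M realizations of length N

C_hat = (x.T @ x) / M                      # mean is zero, so C_x = R_x here
print(np.round(C_hat, 2))                  # ~ sigma^2 on the diagonal, ~0 off it
```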
Wide-sense stationary random signals: auto-correlation function

The auto-correlation function $R_x(\tau)$ of a WSS random signal $x(k)$ is symmetric in its argument $\tau$, that is,
$$R_x(\tau) = R_x(-\tau),$$
since
$$R_x(\tau) = E[x(k)x(k - \tau)] = E[x(k - \tau)x(k)] = R_x(-\tau).$$
The auto-correlation function $R_x(\tau)$ of a WSS random signal $x(k)$ satisfies, for $\tau = 0$,
$$R_x(0) = E[x(k)x(k)] \ge 0.$$
Wide-sense stationary random signals: maximum of the auto-correlation function

The maximum of the auto-correlation function $R_x(\tau)$ of a WSS random signal $x(k)$ occurs at $\tau = 0$:
$$R_x(0) \ge |R_x(\tau)|, \quad \forall \tau.$$
By the Cauchy-Schwarz inequality,
$$E[x(k)y(k)]^2 \le E[x(k)^2]\, E[y(k)^2].$$
Applying this to $x(k + \tau)$ and $x(k)$, we obtain
$$R_x(\tau)^2 = E[x(k)x(k + \tau)]^2 \le E[x^2(k)]\, E[x^2(k + \tau)] = R_x(0)^2.$$
Thus $|R_x(\tau)| \le R_x(0)$.
Ergodicity and time averages of random signals

Ergodicity of a random signal states that, for a stationary IID random signal $x(k)$ with mean $E[x(k)] = \mu_x$, the time average converges with probability one to the mean value $\mu_x$, provided that the number of observations $N$ goes to infinity. This is denoted by
$$P\left[\lim_{N \to \infty} \frac{1}{N} \sum_{k=0}^{N-1} x(k) = \mu_x\right] = 1.$$
The ergodic theorem states under what conditions statistical quantities characterizing a stationary random signal, such as its covariance function, can be derived with probability one from a single realization of that random signal.
Ergodicity and time averages of random signals

Let $\{x(k)\}_{k=0}^{N-1}$ and $\{y(k)\}_{k=0}^{N-1}$ be two realizations of the stationary random signals $x(k)$ and $y(k)$, respectively. Then, under the ergodicity argument, we obtain relationships of the following kind:
$$P\left[\lim_{N \to \infty} \frac{1}{N} \sum_{k=0}^{N-1} x(k) = E[x(k)]\right] = 1, \qquad P\left[\lim_{N \to \infty} \frac{1}{N} \sum_{k=0}^{N-1} y(k) = E[y(k)]\right] = 1.$$
If $E[x(k)]$ and $E[y(k)]$ are denoted by $\mu_x$ and $\mu_y$, respectively, then
$$P\left[\lim_{N \to \infty} \frac{1}{N} \sum_{k=0}^{N-1} (x(k) - \mu_x)(x(k - \tau) - \mu_x) = C_x(\tau)\right] = 1,$$
$$P\left[\lim_{N \to \infty} \frac{1}{N} \sum_{k=0}^{N-1} (x(k) - \mu_x)(y(k - \tau) - \mu_y) = C_{xy}(\tau)\right] = 1.$$
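The practical consequence is that one long realization suffices. A minimal sketch (the IID Gaussian test signal and its parameters are my own choices):

```python
# Time averages of a single long IID realization vs. the known ensemble values.
import numpy as np

rng = np.random.default_rng(6)
mu_x, sigma, N = 1.5, 1.0, 1_000_000
x = rng.normal(mu_x, sigma, size=N)        # one long realization

mean_hat = x.mean()                        # ~ mu_x
xc = x - mean_hat
tau = 3
C_hat = np.mean(xc[tau:] * xc[:-tau])      # ~ C_x(3) = 0 for an IID signal
C0_hat = np.mean(xc * xc)                  # ~ C_x(0) = sigma^2

print(round(mean_hat, 3), round(C0_hat, 3), round(C_hat, 4))
```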
Power spectra

The Fourier transform of a random signal is itself a random signal. To get a deterministic notion of the frequency content of a random signal, the power-spectral density function, or power spectrum, is used. The spectrum of a signal can be thought of as the distribution of the signal's energy over the whole frequency band. Signal spectra are defined for WSS time sequences.
Power spectra: definition

Let $x(k)$ and $y(k)$ be two zero-mean WSS sequences with sampling time $T$. The (power) spectrum of $x(k)$ is
$$\Phi_x(\omega) = \sum_{\tau=-\infty}^{\infty} R_x(\tau) e^{-j\omega\tau T},$$
and the cross-spectrum between $x(k)$ and $y(k)$ is
$$\Phi_{xy}(\omega) = \sum_{\tau=-\infty}^{\infty} R_{xy}(\tau) e^{-j\omega\tau T}.$$
The inverse DTFT applied to the spectrum yields
$$R_x(\tau) = \frac{T}{2\pi} \int_{-\pi/T}^{\pi/T} \Phi_x(\omega) e^{j\omega\tau T}\, d\omega.$$
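This transform pair can be checked numerically. The sketch below assumes an invented covariance sequence $R_x(\tau) = 0.6^{|\tau|}$, truncates the infinite sum at $|\tau| \le 200$, and approximates the inverse integral by a Riemann sum; all of these choices are mine:

```python
# Numerically verify that the spectrum and the inverse DTFT form a transform pair.
import numpy as np

T = 0.5                                        # sampling time
taus = np.arange(-200, 201)                    # truncation of the infinite sum
R = 0.6 ** np.abs(taus)                        # invented covariance sequence

omega = np.linspace(-np.pi / T, np.pi / T, 2000, endpoint=False)
dw = omega[1] - omega[0]
# Phi_x(w) = sum over tau of R_x(tau) * exp(-j w tau T); real since R is even
Phi = (R[None, :] * np.exp(-1j * np.outer(omega, taus) * T)).sum(axis=1).real

# R_x(tau) = (T / (2 pi)) * integral of Phi_x(w) * exp(j w tau T) dw
for tau in [0, 1, 4]:
    R_back = T / (2 * np.pi) * np.sum(Phi * np.exp(1j * omega * tau * T)) * dw
    print(tau, round(R_back.real, 4), 0.6 ** tau)
```

The recovered values match $0.6^\tau$, confirming the pair under the chosen truncation.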
Power spectra: properties

The power spectrum $\Phi_x(\omega)$ is real-valued and symmetric with respect to $\omega$, that is, $\Phi_x(-\omega) = \Phi_x(\omega)$. Let a WSS random signal $x(k) \in \mathbb{R}$ with sampling time $T$ and power spectrum $\Phi_x(\omega)$ be given. Then
$$E[x(k)^2] = \frac{T}{2\pi} \int_{-\pi/T}^{\pi/T} \Phi_x(\omega)\, d\omega.$$
This property shows that the total energy of the signal $x(k)$, given by $E[x(k)^2]$, is distributed over the frequency band $-\pi/T \le \omega \le \pi/T$.
Power spectra: example

Let $x(k)$ be a white noise sequence with $R_x(\tau) = \sigma^2 \delta(\tau)$. Then
$$\Phi_x(\omega) = \sum_{\tau=-\infty}^{\infty} R_x(\tau) e^{-j\omega\tau T} = \sum_{\tau=-\infty}^{\infty} \sigma^2 \delta(\tau) e^{-j\omega\tau T} = \sigma^2.$$
We need only consider frequencies in the range $-\pi/T \le \omega \le \pi/T$.

[Figure: $\Phi_x(\omega)$ is flat at the level $\sigma^2$ over $-\pi/T \le \omega \le \pi/T$.]
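The flat spectrum can also be estimated from data. A minimal sketch assuming SciPy is available and using Welch's method (the parameters are my choices; note that with `scaling="density"` and $f_s = 1/T$ the two-sided level is $\sigma^2 T$, which for $T = 1$ is simply $\sigma^2$):

```python
# Estimate the power spectrum of white noise; it should be roughly flat at sigma^2.
import numpy as np
from scipy.signal import welch

rng = np.random.default_rng(7)
sigma, N, T = 1.5, 200_000, 1.0
x = rng.normal(0.0, sigma, size=N)

f, Pxx = welch(x, fs=1.0 / T, nperseg=1024, return_onesided=False, scaling="density")
print(sigma**2, round(Pxx.mean(), 3))      # flat level ~ sigma^2
```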
Power spectra: filtering WSS random signals

Let $u(k)$ be WSS and the input to a BIBO-stable LTI system with transfer function
$$G(q) = \sum_{k=0}^{\infty} g(k) q^{-k},$$
such that $y(k) = G(q)u(k)$. Then:
i. $y(k)$ is WSS;
ii. $\Phi_{yu}(\omega) = G(e^{j\omega T}) \Phi_u(\omega)$;
iii. $\Phi_y(\omega) = |G(e^{j\omega T})|^2 \Phi_u(\omega)$.
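Property iii can be checked empirically by filtering white noise through an invented first-order filter. A minimal sketch assuming SciPy is available ($T = 1$, so $\Phi_u(\omega) = \sigma^2$ on the two-sided density scale):

```python
# Compare the Welch estimate of Phi_y with |G(e^{jw})|^2 * Phi_u for white input.
import numpy as np
from scipy.signal import lfilter, welch, freqz

rng = np.random.default_rng(8)
sigma, N = 1.0, 500_000
u = rng.normal(0.0, sigma, size=N)          # white input, Phi_u(w) = sigma^2

b, a = [1.0], [1.0, -0.9]                   # invented G(q) = 1/(1 - 0.9 q^-1), stable
y = lfilter(b, a, u)

f, Pyy = welch(y, fs=1.0, nperseg=4096, return_onesided=False)
w, G = freqz(b, a, worN=2 * np.pi * f)      # evaluate G at the same frequencies
Phi_theory = np.abs(G) ** 2 * sigma**2

rel_err = np.abs(Pyy - Phi_theory) / Phi_theory
print("median relative error:", round(np.median(rel_err), 3))   # typically < 0.1
```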
Reference

1. Michel Verhaegen and Vincent Verdult, Filtering and System Identification: A Least Squares Approach, Cambridge University Press, 2007.
2. Herbert Werner, Lecture notes on Control Systems Theory and Design, Hamburg University of Technology.
3. David T. Westwick and Robert E. Kearney, Identification of Nonlinear Physiological Systems, IEEE Press, 2003.
4. Alberto Leon-Garcia, Probability and Random Processes for Electrical Engineering, 2nd edition, Addison-Wesley, 1994.