Time Series Analysis and Signal Modeling


Andreas Jakobsson
Lund University
Version 4

This is an early version of a set of notes for the course on Time Series Analysis offered at Lund University. Any and all comments and suggestions are most welcome and appreciated.

Andreas Jakobsson
aj@maths.lth.se


Contents

1 Signal Modeling
  1.1 Introduction

2 Stochastic Processes
  2.1 Introduction
  2.2 Stochastic vectors
    2.2.1 Properties and peculiarities
    2.2.2 Normal distributed vectors
    2.2.3 Conditional expectations
    2.2.4 Linear projections
  2.3 Stochastic processes
    2.3.1 Properties and peculiarities
    2.3.2 The power spectral density
    2.3.3 Filtering of a stochastic process
    2.3.4 The moving average process
    2.3.5 The autoregressive process
    2.3.6 The Levinson-Durbin algorithm
    2.3.7 ARMA, ARIMA, and SARIMA processes
  2.4 Estimating the power spectral density

3 Identification and Modeling
  3.1 Introduction
  3.2 Finding an appropriate model structure
    3.2.1 The partial auto-correlation function
    3.2.2 Data with trends
  3.3 Estimating the unknown parameters
    3.3.1 Least squares estimation
    3.3.2 Maximum likelihood estimation
    3.3.3 The Cramér-Rao lower bound
  3.4 Estimating the model order

4 Prediction of Stochastic Processes
  4.1 Optimal linear prediction
  4.2 Prediction of ARMA processes
    4.2.1 Implementation using Matlab
  4.3 Prediction of ARMAX processes

5 Multivariate Processes
  5.1 Introduction
  5.2 Identification and Estimation
    5.2.1 Maximum likelihood estimation

A Complements
  A.1.2 Complex-valued Normal distributed vectors
  A.1.3 Wishart distribution

B Matlab functions

Glossary and Notation

Glossary and Abbreviations

1-D, 2-D, etc.   One-dimensional, two-dimensional, etc.
ACF              Auto-covariance function
AR               Autoregressive
ARIMA            Integrated ARMA
ARMA             Autoregressive moving average
ARMAX            ARMA with exogenous input
ARX              AR with exogenous input
BLUE             Best linear unbiased estimate
CRLB             Cramér-Rao lower bound
dB               Decibel
DFT              Discrete Fourier transform
FIM              Fisher information matrix
FFT              Fast Fourier transform
LS               Least-squares
MA               Moving average
MIMO             Multiple-input multiple-output
MISO             Multiple-input single-output
ML               Maximum likelihood
MSE              Mean squared error
NLS              Non-linear least-squares
PACF             Partial auto-correlation function
PDF              Probability density function
PSD              Power spectral density
SARIMA           Seasonal ARIMA
SARIMAX          SARIMA with exogenous input
SARMA            Seasonal ARMA
SARMAX           SARMA with exogenous input
SISO             Single-input single-output
SNR              Signal-to-noise ratio
SVD              Singular value decomposition
VARMA            Vector ARMA

Notational Conventions

a, b, ...         boldface lower case letters are used for vectors
A, B, Σ, ...      boldface upper case letters are used for matrices
A, a, α, ...      non-bold letters are generally used to denote scalars
a^T, A^T, ...     (·)^T denotes the matrix or vector transpose
a^*, A^*, ...     (·)^* denotes the Hermitian (conjugate) transpose
Â, â, α̂, ...      (·)̂ is used to denote an estimate
C^{n×m}           the complex n×m-dimensional space
C^n               the complex n-dimensional plane
CN(m, R)          the complex-valued Normal distribution
N(m, R)           the real-valued Normal distribution
arg max f(x)      the argument that maximizes f(x)
arg min f(x)      the argument that minimizes f(x)
E{·}              the expectation operator
exp(·)            the exponential function; exp(a) = e^a
i or j            the imaginary unit, √-1, unless otherwise specified
I                 the identity matrix (of unspecified dimension)
I_{m,n}           the m×n identity matrix
I_θ               the Fisher information matrix
Im(·)             the imaginary part of
log(·)            the natural logarithm
R^{n×m}           the real n×m-dimensional space
R^n               the real n-dimensional plane (R is used for n = 1)
Re(·)             the real part of
tr(·)             the trace of a matrix
≜                 defined as
|                 conditioned on; e.g., a|b means a conditioned on b
det(·)            matrix determinant
‖·‖               the L2-norm (Euclidean norm)
⌊x⌋               the integer part of x

Chapter 1

Signal Modeling

1.1 Introduction

Figure 1.1: A female voice uttering: "Why were you away a year, Roy?" [amplitude versus time in seconds]

Example 1.1. Figure 1.1 shows the sampled speech signal of a female speaker uttering the phrase "Why were you away a year, Roy?". The signal is sampled at f_s = 8000 Hz, which is a common sampling frequency for speech signals.

Example 1.2. Figure 1.2 shows the measured temperature in the Swedish city Svedala. The temperature data is sampled every hour during a period in April and May 1994.

Figure 1.2: Temperature measurements in the Swedish city Svedala. The temperature data is sampled every hour during a period in April and May 1994. [temperature versus time in days]

Chapter 2

Stochastic Processes

A thing of beauty is a joy for ever:
Its loveliness increases; it will never
Pass into nothingness; but still will keep
A bower quiet for us, and a sleep
Full of sweet dreams, and health, and quiet breathing.

John Keats

2.1 Introduction

We first examine vectors of stochastic variables, and then extend the discussion to also deal with stochastic processes.

2.2 Stochastic vectors

2.2.1 Properties and peculiarities

Let x denote a vector containing p stochastic variables, such that

    x = [ x_1 ... x_p ]^T    (2.1)

where (·)^T and x_l denote the transpose and the l:th element of the vector x, respectively. Denote the mean of the vector

    m_x = E{x} = [ E{x_1} ... E{x_p} ]^T,    (2.2)

where E{·} denotes the statistical expectation, defined as

    E{ g(x) } = ∫ g(x) f(x) dx,    (2.3)

with f(x), depending on the dimensionality of x in (2.2) and (2.3), denoting the probability density function (PDF) of the stochastic vector or variable, respectively. Furthermore, denote the covariance matrix of x and y by

    R_{x,y} = C{x, y} = E{ [x - m_x] [y - m_y]^* },    (2.4)

where (·)^* denotes the conjugate transpose, and where the q-dimensional vectors y and m_y are defined similarly to x. Thus, R_{x,y} is a (p × q)-dimensional matrix with elements

    R_{x,y} = [ C{x_1, y_1}  ...  C{x_1, y_q}
                    ...      ...      ...
                C{x_p, y_1}  ...  C{x_p, y_q} ]    (2.5)

In the particular case when y = x, this yields the (p × p)-dimensional (auto)covariance matrix R_{x,x}, which will then be a positive semi-definite Hermitian matrix, i.e.,

The (auto)covariance matrix R_{x,x} will be:

(i) Positive semi-definite, here denoted R_{x,x} ≥ 0, implying that w^* R_{x,x} w ≥ 0 for all vectors w. This also implies that the eigenvalues of R_{x,x} are real-valued and non-negative.

(ii) Hermitian, i.e., the matrix will satisfy R_{x,x} = R_{x,x}^*. If x is a real-valued vector, this implies that R_{x,x} = R_{x,x}^T; such matrices are termed symmetric.

When there is no risk for confusion, we will, for notational convenience, often denote R_{x,x} simply by R_x. Although we have here, in the interest of generality, used the general definitions that also allow the stochastic vector to be complex-valued, we will throughout these notes generally assume that the examined data is real-valued, and only treat the particularities of complex-valued measurements in separate, optional, sections. When working with covariances, the following lemma is often useful:

Lemma 2.1. Let x, u and y, v denote p- and q-dimensional stochastic vectors, respectively, and let A and B be (n × p)- and (m × q)-dimensional deterministic matrices, respectively. Then, it holds that

    C{ A(x + u), B(y + v) } = A C{x, y} B^* + A C{x, v} B^* + A C{u, y} B^* + A C{u, v} B^*    (2.6)

which implies that

    V{ Ax + a } = C{ Ax + a, Ax + a }    (2.7)
                = A V{x} A^*    (2.8)

    C{ x + u, y } = C{x, y} + C{u, y}    (2.9)

where a is a deterministic vector of appropriate dimension.
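As a quick numerical sanity check of (2.7)-(2.8), the following minimal Matlab sketch draws a large number of realizations of a real-valued stochastic vector and compares the sample covariance of Ax + a with A V{x} A^T; all numerical values below are arbitrary, illustrative choices and are not taken from the notes.

% Monte Carlo illustration of V{Ax + a} = A V{x} A^T, cf. (2.7)-(2.8).
% All matrices below are arbitrary examples (real-valued case).
p  = 3;  n = 2;  K = 1e5;                  % dimensions and number of realizations
A  = [1 0.5 -1; 0 2 1];                    % (n x p)-dimensional deterministic matrix
a  = [1; -2];                              % deterministic n-dimensional vector
Rx = [2 0.3 0; 0.3 1 0.5; 0 0.5 1.5];      % a positive definite covariance matrix
x  = chol(Rx,'lower')*randn(p,K);          % zero-mean realizations with V{x} = Rx
y  = A*x + repmat(a,1,K);                  % the transformed vectors Ax + a
disp(cov(y.'))                             % sample covariance of Ax + a
disp(A*Rx*A.')                             % A V{x} A^T; the two should agree closely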

2.2.2 Normal distributed vectors

Throughout these notes, we often consider p-dimensional Normal distributed vectors, i.e., x ~ N(m_x, R_x). By this notation, we imply that x follows a real-valued multivariate Normal distribution with a PDF of the form

    f_x(x) = (2π)^{-p/2} det(R_x)^{-1/2} exp{ -(1/2) [x - m_x]^T R_x^{-1} [x - m_x] }    (2.10)

where det(R_x) denotes the determinant of R_x. In the one-dimensional case, i.e., when p = 1, (2.10) naturally simplifies to the well-known form

    f_x(x) = (2πσ_x²)^{-1/2} exp{ -[x - m_x]² / (2σ_x²) },    (2.11)

where m_x and σ_x² denote the mean and the variance of the stochastic variable x, respectively. In the particular case when the vector contains equally distributed and independent stochastic variables, i.e., it is an independent identically distributed (iid) white vector,

    R_x = σ² I,    (2.12)

where I is a (p × p)-dimensional identity matrix. The distribution for a complex-valued Normal distributed vector, x ~ CN(m_x, R_x), is given in Appendix A.1.2. Hereafter, we will by these notations indicate if x is real- or complex-valued.

Theorem 2.1. Let x be Normal distributed with x ~ N(m_x, Σ); then

    y = Cx    (2.13)

is distributed according to y ~ N(C m_x, C Σ C^T) for non-singular C. The proof can be found, for instance, in [1].

2.2.3 Conditional expectations

We proceed to examine conditional expectations, using the following definitions:

Definition 2.1. The conditional distribution of the random variable y, given that x = x_0, for some value x_0, is defined as

    f_{y|x=x_0}(y) = f_{x,y}(x_0, y) / f_x(x_0) = f_{x,y}(x_0, y) / ∫ f_{x,y}(x_0, y) dy    (2.14)

Furthermore, the conditional expectation is defined as

    E{ y | x = x_0 } = ∫ y f_{y|x=x_0}(y) dy    (2.15)

where, from here on, the integration, unless specified, is from -∞ to ∞.

In the interest of notational convenience, we will most commonly omit that x = x_0, simply writing the conditional expectation as E{y | x}. Note that, from the above definitions, it is clear that if the stochastic variables x and y are independent, i.e., if

    f_{x,y}(x, y) = f_x(x) f_y(y),    (2.16)

this implies that f_{y|x}(y) = f_y(y), and that

    E{ y | x } = E{ y }    (2.17)

Using Definition 2.1, one can easily show several useful results

    E{ g(x) y | x } = g(x) E{ y | x }    (2.18)
    E{ cu + dv | x } = c E{ u | x } + d E{ v | x }    (2.19)
    E{ y } = E_x{ E_y{ y | x } }    (2.20)
    E{ g(x) y } = E_x{ g(x) E_y{ y | x } }    (2.21)

where c and d are some deterministic constants, g(x) is a function of x, and where in (2.20) and (2.21), in the interest of clarity, we have indicated the variable over which the expectation is taken as a subscript. The nested expectation in (2.20), commonly referred to as an iterated expectation, or as the tower rule, may, at first sight, seem confusing, and may therefore deserve some further comments. Note that

    E_x{ E_y{ y | x } } = ∫ E_y{ y | x } f_x(x) dx    (2.22)
                        = ∫ [ ∫ y f_{y|x}(y) dy ] f_x(x) dx    (2.23)
                        = ∫ ∫ y f_{y,x}(x, y) dy dx    (2.24)
                        = ∫ y [ ∫ f_{y,x}(x, y) dx ] dy    (2.25)
                        = ∫ y f_y(y) dy    (2.26)

which proves (2.20). Here, we have in obtaining (2.24) made use of (2.14), and in obtaining (2.26) the fact that the inner integral over x will integrate out the dependence on x from f_{y,x}(x, y). The result in (2.21) is shown similarly. We proceed to define the conditional covariance:

Definition 2.2. The conditional covariance is defined as

    C{ y, z | x } = E{ [y - m_{y|x}] [z - m_{z|x}]^* | x }    (2.27)

where m_{y|x} = E{y | x} and m_{z|x} = E{z | x}. Setting z = y immediately yields a similar expression for V{y | x}.

Equipped with this definition and the results above, we are then ready to state the variance separation theorem.

Theorem 2.2. The variance separation theorem states that

    V{ y } = E{ V[y | x] } + V{ E[y | x] }    (2.28)
    C{ y, z } = E{ C[y, z | x] } + C{ E[y | x], E[z | x] }    (2.29)

where the expectations and (co)variances are taken with respect to the appropriate variables.

2.2.4 Linear projections

We will now examine how to exploit the above defined conditional expectations to formulate the optimal linear projection of y given knowledge of x. To do so, we begin by defining what we mean by a linear projection:

Definition 2.3. The linear projection of y onto the space spanned by x, here denoted R(x), is defined as

    E{ y | x } = a + Bx    (2.30)

where a ∈ R(x) and B is a deterministic matrix of appropriate dimension.

Expressed differently, we decompose the (stochastic) vector y into two components, one that can be written as a linear combination of x, and one that cannot. This can geometrically be viewed as depicted in the figure below, where the vector y is decomposed into the vector E{y | x}, which lies in the range space of x, and a vector e = y - E{y | x}, which is orthogonal to the range space of x.

[Figure: the vector y decomposed into its projection p = E{y | x}, lying in R(x), and the orthogonal error e = y - E{y | x}.]

From the above figure, we can immediately conclude that the difference, here termed the error, e, is orthogonal to any vector in R(x), i.e.,

    C{ y - E{y | x}, x } = 0    (2.31)

Alternatively, one may show this using the variance separation theorem. Using

(2.30) and (2.18), note that

    C{ y - E{y | x}, x } = C{ y - a - Bx, x }    (2.32)
                         = C{ y, x } - C{ a + Bx, x }    (2.33)
                         = C{ y, x } - B V{x}    (2.34)

However, using Theorem 2.2,

    C{ y, x } = E{ C[y, x | x] } + C{ E[y | x], E[x | x] }    (2.35)
              = E{ C[y, x | x] } + C{ a + Bx, x }    (2.36)
              = E{ C[y, x | x] } + B V{x}    (2.37)
              = B V{x}    (2.38)

where we in the last step have used Definition 2.2. Inserting (2.38) in (2.34) immediately yields (2.31). It is worth stressing that the geometrical interpretation offers a valuable insight, allowing us to directly conclude (2.31) without having to go through the steps of the detailed proof.

Theorem 2.3. Let z denote the concatenated vector

    z = [ x^T  y^T ]^T    (2.39)

having mean E{z} = [ m_x^T  m_y^T ]^T and covariance matrix (cf. (2.5))

    R_{z,z} = [ R_{x,x}  R_{x,y}
                R_{y,x}  R_{y,y} ]    (2.40)

Then, the linear projection in (2.30) can be expressed as

    E{ y | x } = m_y + R_{y,x} R_{x,x}^{-1} (x - m_x)    (2.41)

with the difference e = y - E{y | x} having variance

    V{ e } = R_{y,y} - R_{y,x} R_{x,x}^{-1} R_{x,y} = E{ V{y | x} }    (2.42)

and is the optimal linear projection, i.e., the projection that yields the minimum variance among all linear projections. Furthermore, if x and y are Normal distributed, then e and x are independent; otherwise, they are uncorrelated.

Here, the last step of (2.42) follows from Theorem 2.2. Furthermore, recall that stochastic variables are said to be uncorrelated if

    E{ xy } = E{ x } E{ y },    (2.43)

which is a notably weaker requirement than independence. A direct result of Theorem 2.3 is that if we wish to use the vector x to form an optimal linear prediction of y, this is obtained as E{y | x}. We will return to this notion in further detail in Chapter 4.
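To make Theorem 2.3 concrete, the following minimal Matlab sketch generates jointly Normal x and y, forms the linear projection (2.41), and verifies numerically that the resulting error is (essentially) uncorrelated with x and has the variance in (2.42). The joint covariance used below is an arbitrary, illustrative choice, not a value from the notes.

% Illustration of the optimal linear projection in Theorem 2.3 (real-valued case).
mx  = [1; 0];  my = 2;                  % means of x (2-dimensional) and y (scalar)
Rxx = [1 0.4; 0.4 2];                   % covariance of x
Ryx = [0.8 0.5];                        % cross-covariance C{y,x} (1 x 2)
Ryy = 1.5;                              % variance of y
Rz  = [Rxx Ryx.'; Ryx Ryy];             % joint covariance of z = [x; y]
K   = 1e5;                              % number of realizations
z   = chol(Rz,'lower')*randn(3,K) + repmat([mx; my],1,K);
x   = z(1:2,:);   y = z(3,:);
xc  = x - repmat(mx,1,K);               % x - m_x for each realization
yhat = my + Ryx*(Rxx\xc);               % E{y|x} = m_y + R_yx R_xx^{-1}(x - m_x), cf. (2.41)
e    = y - yhat;                        % projection error
disp(e*xc.'/K)                          % sample C{e,x}; close to zero, cf. (2.31)
disp([var(e)  Ryy - Ryx*(Rxx\Ryx.')])   % sample versus theoretical error variance, cf. (2.42)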

2.3 Stochastic processes

We will now proceed to extend the earlier discussion to also treat stochastic processes. In doing so, we will initially consider only one-dimensional stochastic processes, and then, in Chapter 5, extend on this discussion to also allow for multi-dimensional stochastic processes.

2.3.1 Properties and peculiarities

In these notes, we will restrict our attention to wide-sense stationary (WSS) processes:

A stochastic process is wide-sense stationary (WSS) if

(i) The mean of the process is constant.

(ii) The auto-covariance C{y_s, y_t} only depends on the difference (s - t), and not on the actual values of s and t.

(iii) The variance of the process is finite, i.e., E{ |y_t|² } < ∞.

For a WSS process y_t, we define the auto-covariance, cross-covariance, auto-correlation, and cross-correlation functions as:

Definition 2.4. The auto-covariance function (ACF) for y_t is defined as

    r_y(k) = C{ y_t, y_{t-k} } = E{ [y_t - m_y] [y_{t-k} - m_y]^* }    (2.44)
           = E{ y_t y_{t-k}^* } - m_y m_y^*    (2.45)

Similarly, we define the cross-covariance of the WSS processes x_t and y_t as

    r_{x,y}(k) = C{ x_t, y_{t-k} } = E{ [x_t - m_x] [y_{t-k} - m_y]^* },    (2.46)

where m_x and m_y denote the means of the respective processes.

Definition 2.5. The auto-correlation function of y_t is defined as

    ρ_y(k) = r_y(k) / r_y(0)    (2.47)

and will therefore be bounded such that |ρ_y(k)| ≤ 1, with equality for k = 0, as well as, possibly, for k = ±l, with l > 0, if the signal is periodic with period l (see, e.g., Example 2.2). Similarly, we define the cross-correlation of the processes x_t and y_t as

    ρ_{x,y}(k) = r_{x,y}(k) / √( r_x(0) r_y(0) ),    (2.48)

which will be bounded as |ρ_{x,y}(k)| ≤ 1.

Example 2.1. Consider a complex-valued sinusoidal signal with frequency ω_0,

    x_t = A e^{i t ω_0 + i φ},    (2.49)

where φ is a uniformly distributed random variable on [-π, π]. Then,

    m_x = E{ A e^{i t ω_0 + i φ} } = (1/2π) ∫_{-π}^{π} A e^{i t ω_0 + i φ} dφ = 0    (2.50)

and

    r_x(k) = E{ A² e^{i t ω_0 + i φ} e^{-i (t-k) ω_0 - i φ} } = A² e^{i ω_0 k}.    (2.51)

Thus, the ACF of a complex-valued sinusoid is also a sinusoid, both having the same frequency.

Example 2.2. Consider instead a real-valued sinusoidal signal

    x_t = A cos(t ω_0 + φ),    (2.52)

where φ is a uniformly distributed random variable on [-π, π]. Using Euler's formula, we rewrite x_t as

    x_t = (A/2) ( e^{i t ω_0 + i φ} + e^{-i t ω_0 - i φ} ),    (2.53)

which, using steps similar to the ones in Example 2.1, yields

    r_x(k) = (A²/2) cos(ω_0 k)    (2.54)

Comparing the steps needed here and in Example 2.1 clearly illustrates that working with complex-valued signals is often simpler than working with their real-valued counterparts.

Figure 2.1: (a) An example of a real-valued periodic signal; the signal is a voiced speech signal extracted from the utterance in Example 1.1, together with (b) the estimated correlation function of the signal.

Example 2.3. Figure 2.1(a) shows an example of a real-valued periodic signal. This signal is a voiced speech signal extracted from the utterance in Example 1.1, and is sampled at f_s = 8000 Hz. Clearly, the signal exhibits strong periodicities,

and one can therefore conclude from Example 2.2 that the covariance function of this signal should contain the same periodicities as the actual signal. Figure 2.1(b) illustrates this, as well as the fact that ρ_y(k) is symmetric, and bounded as |ρ_y(k)| ≤ 1, with equality for k = 0.

The definition of WSS processes implies some very useful properties of the ACF, namely:

The auto-covariance function (ACF) of a WSS process satisfies:

(i) The ACF is conjugate symmetric, i.e., r_y(k) = r_y^*(-k).

(ii) The variance is always non-negative, i.e., r_y(0) = E{ |y_t - m_y|² } ≥ 0.

(iii) The ACF takes its largest value at lag 0, i.e., r_y(0) ≥ |r_y(k)|, for all k.

These properties are easily verified for the above examples. In these notes, we will typically assume that we have observed a single realization of the process, containing, say, N samples, numbered from t = 1, ..., N. Generally, the true r_y(k) is also unknown, and we will therefore need to estimate r_y(k) as accurately as possible from this one (vector) observation. When no assumptions are made on the measurements, there are two standard ways to estimate the ACF, namely, the unbiased ACF estimate

    r̂_y(k) = 1/(N-k) Σ_{t=k+1}^{N} ( y_t - m̂_y )( y_{t-k} - m̂_y )^*,    (2.55)

and the biased ACF estimate

    r̂_y(k) = (1/N) Σ_{t=k+1}^{N} ( y_t - m̂_y )( y_{t-k} - m̂_y )^*,    (2.56)

for 0 ≤ k ≤ N-1, where the mean of the process is estimated as

    m̂_y = (1/N) Σ_{t=1}^{N} y_t.    (2.57)

These covariance estimates deserve some further commenting; firstly, note that the sums in (2.55) and (2.56) start at t = k+1. This is due to the first available sample being y_1; terms with t < k+1 in the sums would use measurements y_l, for l < 1, which are not available. Secondly, note that the

estimates only differ in the normalization constant before the sum. As

    E{ r̂_y(k) } = 1/(N-l) Σ_{t=k+1}^{N} E{ ( y_t - m̂_y )( y_{t-k} - m̂_y )^* }    (2.58)
                = 1/(N-l) Σ_{t=k+1}^{N} r_y(k)    (2.59)
                = (N-k)/(N-l) r_y(k),    (2.60)

for l = k or l = 0, corresponding to (2.55) and (2.56), respectively, the estimate in (2.55) will clearly result in an unbiased estimate of r_y(k), i.e.,

    E{ r̂_y(k) } = r_y(k),    (2.61)

whereas (2.56) will result in an estimate that is only asymptotically unbiased, i.e., it is unbiased only as N → ∞. Similar to the above discussion, we should estimate the cross-covariance between the stationary processes x_t and y_t as

    r̂_{x,y}(k) = (1/N) Σ_{t=k+1}^{N} ( x_t - m̂_x )( y_{t-k} - m̂_y )^*,    (2.62)

where the means of the processes have been estimated similar to (2.57). In both the estimation of r̂_y(k) and r̂_{x,y}(k), it is important to be aware of the difficulty of estimating these covariances accurately for higher order lags. Due to finite sample effects, both these estimates can exhibit correlation with themselves, making it appear as if there were a correlation among higher lags that is not there. Often, this correlation appears as a pattern at larger lags that is also seen in the lower lags. Therefore, as a practical rule of thumb, one should at most calculate these covariances for lags up to N/4. Obviously, this rule also holds for the corresponding correlation functions.

A convenient way to represent a set of measurements is in vector form. To allow for further flexibility, we will here divide the measurement into a collection of subvectors y_t, each containing L ≤ N samples of y_t, i.e.,

    y_t = [ y_t ... y_{t+L-1} ]^T,    (2.63)

for t = 1, ..., M, where M = N - L + 1 denotes the number of available subvectors y_t. Often, we will also only use a single vector containing all the samples, i.e., L = N, and will then, for simplicity, commonly just write y in place of y_1. Following the discussion in Section 2.2.1, we can form the

covariance matrix of y_t as

    R_y = E{ y_t y_t^* }    (2.64)
        = [ r_y(0)      r_y^*(1)    ...  r_y^*(L-1)
            r_y(1)      r_y(0)      ...  r_y^*(L-2)
              ...         ...       ...     ...
            r_y(L-1)    r_y(L-2)    ...  r_y(0)    ]    (2.65)

Similarly to the covariance matrix for a general stochastic vector, R_y will be a positive semi-definite Hermitian matrix. However, as can be seen from (2.65), the matrix will also have a Toeplitz structure, i.e., the matrix will have the same element along each of the diagonals.

Figure 2.2: The estimated correlation function for the white noise signal in Example 2.4. Figure (b) is a magnified version of parts of the figure in (a). The dashed lines correspond to ±2/√N.

Example 2.4. Let e_t be a zero-mean white (real-valued) Gaussian process with variance σ_e². Then,

    r_e(k) = σ_e² δ_K(k),    (2.66)

where δ_K(k) is the Kronecker delta function,

    δ_K(k) = { 1,  k = 0
               0,  k ≠ 0    (2.67)

and the covariance matrix of the L-dimensional subvectors e_t, formed similar to (2.63), is given by

    R_e = σ_e² I    (2.68)

where I is the L × L identity matrix. Figure 2.2 illustrates the estimated correlation function for N samples of a white noise process. Looking at Figure 2.2(a), showing ρ̂_e(k), it is clear that ρ̂_e(k) is a symmetric function taking its maximal value ρ̂_e(k) = 1,

for k = 0. An even more interesting thing to note is that, counter to what could be expected from (2.66),

    ρ̂_e(k) ≠ 0,  for k ≠ 0    (2.69)

where ρ̂_e(k) is estimated using (2.56). This is due to the limited number of available samples; given N samples, one is simply not able to estimate r_e(k), and thus ρ_e(k), with better accuracy than this. Being able to determine the variance of these estimates is most useful when trying to determine if a process is white or not, and we will in Chapter 3 make good use of the following result:

Theorem 2.4. Let e_t, for t = 1, ..., N, be a realization of a zero-mean white process with variance σ_e². If ρ̂_e(k) is estimated according to Definition 2.5, i.e.,

    ρ̂_e(k) = r̂_e(k) / r̂_e(0),    (2.70)

where r̂_e(k) is estimated using (2.56), then asymptotically

    E{ ρ̂_e(k) } = 0    (2.71)
    V{ ρ̂_e(k) } = 1/N    (2.72)

for k ≠ 0. Furthermore, ρ̂_e(k) is asymptotically Normal distributed.

An important consequence of Theorem 2.4 is that the 95% (approximative) confidence interval of ρ̂_e(k), for k ≠ 0, is ±1.96/√N, i.e., with 95% confidence,

    ρ̂_e(k) ∈ [ -2/√N, 2/√N ],  for k ≠ 0    (2.73)

This means that all values of the estimate |ρ̂_e(k)| < 2/√N should be treated as zero, i.e., we are unable to tell the difference between values within ±2/√N, and should therefore treat all of them as being zero. As seen in Figure 2.2(b), illustrating the correlation estimate with the corresponding confidence intervals, the estimate can thus be seen to satisfy (2.66). We will exploit this result frequently in what follows.

Example 2.5. The covariance matrix for the process in Example 2.1 is

    R_x = A² a_L(ω_0) a_L^*(ω_0)    (2.74)

where a_L(ω) is a so-called Fourier vector,

    a_L(ω) = [ 1  e^{iω}  ...  e^{iω(L-1)} ]^T    (2.75)

Thus, R_x is a rank-one positive semi-definite Toeplitz matrix.

Similar to the above discussion for r_y(k), we will generally need to estimate R_y from the available measurement. The definition of R_y in (2.65) immediately suggests the forming of R̂_y as the Toeplitz matrix formed from the estimated

r̂_y(k), obtained using (2.56). An alternative estimate can be obtained by instead forming the outer-product estimate

    R̂_y = (1/M) Σ_{t=1}^{M} y_t y_t^*,    (2.76)

which, due to finite sample effects, typically will not exhibit a Toeplitz structure. Imposing the Toeplitz structure, as is done if using r̂_y(k), often yields undesirable effects (more on this later on), and one therefore often prefers using (2.76) instead. As all Toeplitz matrices will be persymmetric¹ (although it should be noted that the opposite is not true), i.e.,

    A = J^T A^T J    (2.77)

for some matrix A, where J is the L × L exchange (or reversal) matrix formed as

    J = [          1
               ⋰
          1          ]    (2.78)

where all the empty values of the matrix are zero, one further alternative is to instead impose a persymmetric structure on R̂_y, forming the so-called forward-backward covariance matrix estimate

    R̂_y^{fb} = (1/2) ( R̂_y + J R̂_y^T J ),    (2.79)

where R̂_y is formed using (2.76). Often, R̂_y^{fb} yields estimates that are superior to R̂_y, and if not otherwise specified, this should be our choice for estimating R_y. The estimates in (2.76) and (2.79) can be computed using the Matlab function covm provided in Appendix B.

2.3.2 The power spectral density

An often convenient way to characterize a stochastic process is via its power spectral density (PSD), defined as the discrete-time Fourier transform (DFT) of the ACF, i.e.,

    φ_y(ω) = Σ_{k=-∞}^{∞} r_y(k) e^{-iωk}    (2.80)

The inverse transform recovers r_y(k),

    r_y(k) = (1/2π) ∫_{-π}^{π} φ_y(ω) e^{iωk} dω    (2.81)

¹ It is worth noting that the inverse of a (per)symmetric matrix will be (per)symmetric. Generally, the inverse of a Toeplitz matrix is not Toeplitz, but as all Toeplitz matrices are persymmetric, the inverse of a Toeplitz matrix will be persymmetric. Furthermore, the inverse of a symmetric Toeplitz matrix will be centrosymmetric, i.e., it is both symmetric and persymmetric. If A ∈ C^{L×L} is such that A = J^T A^* J, we instead say that A is a perhermitian matrix; in this case, the inverse of a Hermitian Toeplitz matrix will instead be centrohermitian.

from which we note that

    r_y(0) = (1/2π) ∫_{-π}^{π} φ_y(ω) dω.    (2.82)

Since r_y(0) = E{ |y_t|² } measures the power of y_t, the equality in (2.82) shows that φ_y(ω) is indeed correctly named a power spectral density as it represents the distribution of the signal power over frequencies. Under weak assumptions², it can be shown that (2.80) is equivalent to

    φ_y(ω) = lim_{N→∞} E{ (1/N) | Σ_{t=1}^{N} y_t e^{-iωt} |² }    (2.83)

Using the DFT,

    Y_N(ω) = Σ_{t=1}^{N} y_t e^{-iωt},    (2.84)

the PSD in (2.83) can be expressed as

    φ_y(ω) = lim_{N→∞} E{ (1/N) |Y_N(ω)|² },    (2.85)

which also suggests the most natural way to estimate the PSD, i.e., as the magnitude square of the DFT of the data vector, i.e.,

    φ̂_y(ω) = (1/N) |Y_N(ω)|² = (1/N) | Σ_{t=1}^{N} y_t e^{-iωt} |²    (2.86)

This estimator, termed the periodogram, was introduced in 1898 by Sir Arthur Schuster³, who derived it to determine hidden periodicities (non-obvious periodic signals) in time series [2; 3]. As an alternative, one could use the definition in (2.80) to instead form the estimate of the PSD as

    φ̂_y(ω) = Σ_{k=-(N-1)}^{N-1} r̂_y(k) e^{-iωk},    (2.87)

where r̂_y(k) is the biased ACF estimate defined in (2.56). The resulting estimate is commonly referred to as the correlogram. As shown in, e.g., [4], the estimate in (2.86) will coincide with the estimate in (2.87) as long as the latter is formed using the biased ACF estimate. This is most convenient as it is often simpler to use (2.87) when analyzing the performance of the estimate, whereas it is computationally simpler to use (2.86) when actually computing the estimate.

² The ACF needs to decay sufficiently rapidly, i.e., lim_{N→∞} (1/N) Σ_{k=-N}^{N} |k| |r_y(k)| = 0.

³ Schuster applied the periodogram to find hidden periodicities in the monthly sunspot numbers for the years 1749 to 1894, yielding the classical estimate of about 11 years for the sunspot cycle.
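To make the equivalence of the two estimators concrete, the following minimal Matlab sketch (using only built-in functions; the data is simply a white noise realization and the sample size is an arbitrary choice) computes the periodogram (2.86) via the FFT and the correlogram (2.87) from the biased ACF estimate (2.56), evaluated on the same frequency grid; the two agree to numerical precision.

% Periodogram (2.86) versus correlogram (2.87) formed from the biased ACF (2.56).
N = 512;                                 % number of samples (illustrative choice)
y = randn(N,1);  y = y - mean(y);        % white realization with the mean removed
Pper = abs(fft(y)).^2 / N;               % periodogram at omega_m = 2*pi*(m-1)/N
r = zeros(N,1);                          % biased ACF estimate for lags 0,...,N-1
for k = 0:N-1
    r(k+1) = sum( y(k+1:N).*y(1:N-k) )/N;
end
w = 2*pi*(0:N-1).'/N;                    % the same frequency grid as the FFT
Pcorr = zeros(N,1);
for m = 1:N                              % correlogram, using r(-k) = r(k) for real data
    Pcorr(m) = r(1) + 2*real( exp(-1i*w(m)*(1:N-1))*r(2:end) );
end
max(abs(Pper - Pcorr))                   % should be (numerically) zero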

Figure 2.3: The periodogram estimate of the white noise signal in Example 2.6.

Since φ_y(ω) is a power density, it is natural to assume that it should be real-valued and non-negative. This is indeed the case, which can readily be seen from (2.83). Hence,

    φ_y(ω) ≥ 0,  for all ω.    (2.88)

Further, the power spectral density is periodic, such that

    φ_y(ω) = φ_y(ω + 2πk),    (2.89)

for any integer k. In the particular case when the process is real-valued, the PSD is symmetric, so that φ_y(ω) = φ_y(-ω). Otherwise, if it is complex-valued, the PSD is non-symmetric.

Example 2.6. The white process in Example 2.4 has the PSD

    φ_e(ω) = σ_e²,    (2.90)

which, as expected, is real-valued and positive. Figure 2.3 illustrates the periodogram estimate of a realization of this process, with σ_e² = 1 and the realization consisting of N samples. Figure 2.3(a) shows the (regular) periodogram estimate, whereas Figure 2.3(b) instead plots the estimate in decibel (dB). The second plot is obtained as

    φ̂_e^{dB}(ω) = 10 log_10( φ̂_e(ω) ),    (2.91)

where φ̂_e^{dB}(ω) is the periodogram estimate expressed in dB, whereas φ̂_e(ω) is the periodogram estimate expressed in the regular (linear) domain. Plotting the signal in dB allows us to more easily see the full range of values, as even the relatively small values are visible, whereas if expressed in the regular domain, it would be hard to see these. Here, since the spectrum will be symmetric (as e_t is real-valued), the figures

only show the positive frequencies. As is clear from the figures, the periodogram estimate seems to be unbiased, having a mean of about 0 dB (i.e., 1), but exhibits a very large variance. We will return to this aspect and discuss it in further detail in Section 2.4.

Example 2.7. Let y_t = x_t + e_t, for t = 1, ..., N, where e_t is assumed to be a zero-mean white noise, with variance σ_e², independent of x_t, and

    x_t = Σ_{l=1}^{n} A_l e^{i ω_l t + i ϕ_l}    (2.92)

Then,

    R_y = Σ_{l=1}^{n} |A_l|² a_L(ω_l) a_L^*(ω_l) + σ_e² I    (2.93)

and

    φ_y(ω) = Σ_{l=1}^{n} |A_l|² δ_D(ω - ω_l) + σ_e²    (2.94)

where δ_D(ω) is the Dirac delta, satisfying

    f(a) = ∫ f(x) δ_D(x - a) dx.    (2.95)

In the particular case when x_t is a sum of real-valued sinusoids, it is clear that the spectrum will be symmetric, whereas it will otherwise not be.

Figure 2.4: The periodogram estimate of the voiced speech signal in Example 2.8. The fundamental frequency of the signal is about 240 Hz.

Example 2.8. Figure 2.4 illustrates the periodogram estimate of the voiced speech signal in Example 2.3. As is typical for voiced speech, the signal can be seen to contain several spectral peaks at frequencies being an integer multiple

of the first peak frequency, the so-called fundamental frequency. One common model for such signals is (see also, e.g., [5])

    y_t = Σ_{k=1}^{d} α_k sin(ω_k t + φ_k) + e_t,    (2.96)

where α_k, ω_k, and φ_k are the amplitude, frequency, and phase of the k:th sinusoidal component, with e_t denoting some additive noise, and the frequencies ω_k = kω_0, with ω_0 being the fundamental frequency. In this example, the fundamental frequency of the signal is about 240 Hz.

2.3.3 Filtering of a stochastic process

We are herein particularly interested in the filtering of stochastic processes through an asymptotically stable linear system. Let

    H(z) = Σ_{k=-∞}^{∞} h_k z^{-k}    (2.97)

denote an asymptotically stable linear time-invariant system, where z^{-1} denotes the unit delay operator, defined as

    z^{-1} y_t = y_{t-1},    (2.98)

and assume that the process y_t is formed as the output of this system, i.e.,

    y_t = Σ_{k=-∞}^{∞} h_k x_{t-k},    (2.99)

where x_t is the input to the system. Then,

    m_y = E{ Σ_k h_k x_{t-k} } = Σ_k h_k E{ x_{t-k} }    (2.100)
        = m_x Σ_k h_k = m_x H(0)    (2.101)

with

    H(ω) = Σ_{k=-∞}^{∞} h_k e^{-iωk}    (2.102)

The mean of the output process is thus the mean of the input process scaled with the gain of the filter. Comparing (2.97) and (2.102), it is clear that we are here using the fact that z = e^{iω}. For this reason, the notation H(e^{iω}) is often also used for H(ω). We proceed to examine the covariance of the output

process, noting that the covariance of the output process can be expressed as

    r_y(t+k, t) = E{ y_{t+k} y_t^* } = E{ y_{t+k} Σ_{l=-∞}^{∞} x_l^* h_{t-l}^* }    (2.103)
                = Σ_{l=-∞}^{∞} h_{t-l}^* E{ y_{t+k} x_l^* }    (2.104)
                = Σ_{l=-∞}^{∞} h_{t-l}^* r_{y,x}(t+k, l)    (2.105)

where we with the notation r_y(t+k, t) and r_{y,x}(t+k, l) indicate that the auto-covariance and the cross-covariance may, possibly, not be WSS, and can therefore not be written as a function of only the time difference k. Expanding the cross-covariance in (2.105) as

    r_{y,x}(t+k, t) = E{ y_{t+k} x_t^* } = E{ Σ_{l=-∞}^{∞} h_l x_{t+k-l} x_t^* }    (2.106)
                    = Σ_{l=-∞}^{∞} h_l E{ x_{t+k-l} x_t^* }    (2.107)
                    = Σ_{l=-∞}^{∞} h_l r_x(k-l)    (2.108)

indicates that the cross-covariance, and, as a result of (2.105), the auto-covariance of the output process, only depend on the time difference k, thus indicating that y_t will also be WSS. Changing the summation index in (2.105) by setting m = t - l yields

    r_y(k) = Σ_{m=-∞}^{∞} h_m^* r_{y,x}(m + k),    (2.109)

which using (2.108) yields

    r_y(k) = Σ_{m=-∞}^{∞} Σ_{l=-∞}^{∞} h_m^* h_l r_x(m + k - l)    (2.110)
           = r_x(k) * h_k * h_{-k}^*    (2.111)

where * denotes the convolution operator, or, in the frequency domain,

    φ_y(ω) = |H(ω)|² φ_x(ω)    (2.112)

It is worth noting that this also implies that

    r_y(0) = Σ_{m=-∞}^{∞} Σ_{l=-∞}^{∞} h_m^* h_l r_x(m - l),    (2.113)

which, for finite length filters, say, of length n, implies that

    r_y(0) = σ_y² = h^* R_x h,    (2.114)

where

    h = [ h_0 ... h_{n-1} ]^T    (2.115)

and R_x is defined as in (2.65). We proceed to define:

Definition 2.6. The cross spectral density of the two stationary processes x_t and y_t is defined as the DFT of the cross-covariance function, i.e.,

    φ_{x,y}(ω) = Σ_{k=-∞}^{∞} r_{x,y}(k) e^{-iωk}    (2.116)

where r_{x,y}(k) is defined as in (2.46). In general, φ_{x,y}(ω) is complex-valued.

Definition 2.7. The (complex) coherence spectrum of the two stationary processes x_t and y_t is defined as

    C_{x,y}(ω) = φ_{x,y}(ω) / √( φ_x(ω) φ_y(ω) )    (2.117)

The coherence spectrum is generally complex-valued and is bounded as

    |C_{x,y}(ω)| ≤ 1,    (2.118)

with equality, for all ω, if and only if x_t and y_t are related as in (2.99). From (2.106)-(2.108), it is clear that the cross spectrum of the input and output is related via the so-called Wiener-Hopf equation

    φ_{x,y}(ω) = H(ω) φ_x(ω)    (2.119)

We have thus concluded the following:

When filtering the WSS process x_t through the stable linear time-invariant system h_k, the output y_t will satisfy:

    φ_y(ω) = |H(ω)|² φ_x(ω)
    φ_{x,y}(ω) = H(ω) φ_x(ω)
    m_y = m_x H(0)
    r_y(0) = h^* R_x h

with H(ω) and h defined in (2.102) and (2.115), respectively.

We will make good use of these important relations in the following.
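The boxed relations are straightforward to verify numerically. The following minimal Matlab sketch (with an arbitrary, illustrative FIR filter and a white input; none of the values are taken from the notes) compares the sample variance of the filter output with h^T R_x h, and a lag-one covariance estimate with the corresponding theoretical value.

% Numerical check of the filtering relations for a white WSS input.
% The FIR filter below is an arbitrary, illustrative choice.
h  = [1 0.5 -0.3 0.2].';                  % impulse response, length n = 4
N  = 1e5;  sx2 = 2;                       % sample size and input variance
x  = sqrt(sx2)*randn(N,1);                % white input, phi_x(w) = sx2
y  = filter(h, 1, x);                     % y_t = sum_k h_k x_{t-k}, cf. (2.99)
Rx = sx2*eye(length(h));                  % input covariance matrix, cf. (2.65)
[var(y)   h.'*Rx*h]                       % sample r_y(0) versus h'*R_x*h
ry1_hat    = sum( (y(2:N)-mean(y)).*(y(1:N-1)-mean(y)) )/N;   % biased estimate of r_y(1)
ry1_theory = sx2*sum( h(1:end-1).*h(2:end) );                 % sx2 * sum_m h_m h_{m+1}
[ry1_hat   ry1_theory]                    % the two should be close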

2.3.4 The moving average process

We proceed to define the first of two basic forms of linear filters, namely:

Definition 2.8. The process y_t is called a moving average (MA) process if

    y_t = e_t + c_1 e_{t-1} + ... + c_q e_{t-q} = C(z) e_t,    (2.120)

where C(z) is a monic polynomial of order q, i.e.,

    C(z) = 1 + c_1 z^{-1} + ... + c_q z^{-q},    (2.121)

where c_q ≠ 0, and e_t is a zero-mean white noise process with variance σ_e². The process is always stable, and is invertible if and only if all the zeros of the generating polynomial C(z) are strictly within the unit circle.

Figure 2.5: Generation of an MA(q)-process.

Figure 2.5 illustrates the generation of an MA(q)-process. As seen in (2.120), the generating polynomial, C(z),

    C(z) = 1 + c_1 z^{-1} + ... + c_q z^{-q} = Σ_{k=0}^{q} c_k z^{-k},    (2.122)

where c_0 = 1, allows us to express the MA(q) process as

    y_t = C(z) e_t,    (2.123)

which suggests that the transfer function of the corresponding (linear) filter is C(z). If the zeros of the polynomial C(z) are inside the unit circle, the polynomial is invertible, allowing one to form the inverse filter, i.e., one may form the (driving) noise process e_t as

    e_t = C^{-1}(z) y_t = D(z) y_t = Σ_{k=0}^{∞} d_k y_{t-k},    (2.124)

where D(z) is the inverse filter generating the noise process. It is worth noting that D(z) will generally have an infinite impulse response (IIR). From Definition 2.8, one can easily conclude that:

An MA(q)-process will satisfy

    m_y = E{ C(z) e_t } = 0    (2.125)

    r_y(k) = σ_e² Σ_{l=0}^{q-|k|} c_l c_{l+|k|}^*,  if |k| ≤ q,    and    r_y(k) = 0,  if |k| > q    (2.126)

    φ_y(ω) = σ_e² |C(e^{iω})|²    (2.127)

where C(e^{iω}) indicates that the polynomial has been evaluated at frequency ω, i.e., z = e^{iω}.

Example 2.9. Consider the (real-valued) MA(1)-process y_t = e_t + c_1 e_{t-1}, i.e., the process having the generating polynomial

    C(z) = 1 + c_1 z^{-1}.    (2.128)

Thus, if |c_1| < 1, the process is invertible. The ACF of y_t is (cf. (2.126))

    r_y(0) = σ_e² (1 + c_1²)    (2.129)
    r_y(1) = σ_e² c_1    (2.130)
    r_y(k) = 0, for k ≥ 2    (2.131)

To easily verify the above, as well as other similar cases, it is helpful to write out the covariances explicitly, i.e., for instance for r_y(1) (cf. (2.44))

    r_y(1) = E{ [e_t + c_1 e_{t-1}] [e_{t-1} + c_1 e_{t-2}] }    (2.132)
           = E{ e_t e_{t-1} + c_1 e_t e_{t-2} + c_1 e_{t-1} e_{t-1} + c_1² e_{t-1} e_{t-2} }    (2.133)
           = c_1 E{ e_{t-1} e_{t-1} } = c_1 σ_e².    (2.134)

Similarly, the PSD of y_t is

    φ_y(ω) = σ_e² |1 + c_1 e^{-iω}|² = σ_e² ( 1 + c_1² + 2 c_1 cos(ω) ),    (2.135)

for ω = 2πf, with -0.5 ≤ f ≤ 0.5.

It is worth stressing that for an MA(q)-process, r_y(k) = 0 for |k| > q. This insight allows for a way to identify if a measurement can be well modeled as an MA-process; if the estimated ACF is zero for lags higher than l, it may be reasonable to model the measurement as a realization of an MA(l)-process. We will return to this discussion further in Chapter 3.

Example 2.10. Consider the MA(4)-process formed using

    C(z) = 1 + 0.8 z^{-1} + 0.5 z^{-2} + 0.2 z^{-3} + 0.6 z^{-4}    (2.136)

Figure 2.6 illustrates a realization of this process together with the estimated correlation function, spectral density, and the roots of the C(z)-polynomial.
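A realization of this kind is easily generated; the following minimal Matlab sketch (an illustration only, not the code used to produce Figure 2.6, and with an arbitrarily chosen sample size) simulates the MA(4)-process in (2.136) and forms the estimated correlation function, which should be small for lags above q = 4.

% Simulation of the MA(4)-process in (2.136) and its estimated correlation function.
C  = [1 0.8 0.5 0.2 0.6];                % C(z) = 1 + 0.8 z^-1 + ... + 0.6 z^-4
N  = 500;  se2 = 1;                      % illustrative sample size and noise variance
e  = sqrt(se2)*randn(N,1);               % driving white noise
y  = filter(C, 1, e);                    % y_t = C(z) e_t, cf. (2.120)
y  = y - mean(y);
maxlag = 20;
rhat = zeros(maxlag+1,1);                % biased ACF estimate (2.56) for lags 0..20
for k = 0:maxlag
    rhat(k+1) = sum( y(k+1:N).*y(1:N-k) )/N;
end
rho = rhat/rhat(1);                      % estimated correlation function
[(0:maxlag).'  rho]                      % lags above q = 4 should be small, roughly
                                         % within the bounds given by Theorem 2.5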

Figure 2.6: The figure illustrates the MA(4)-process discussed in Example 2.10, with (a) showing a realization of the process, (b) the estimated correlation function, (c) the estimated power spectral density, and (d) the roots of the C(z)-polynomial.

These figures deserve some further comments. First, it is worth noting in Figure 2.6(b) that the estimated covariance function is not zero for lags higher than 4, as would be expected from (2.126). Similar to the discussion following Example 2.4, this is due to the difficulty of estimating r_y(k) accurately given a finite amount of data. This can be seen better in Figure 2.7, which shows a closer look at the correlation function in Figure 2.6(b), together with the corresponding confidence intervals as given by Theorem 2.5 (below). Secondly, in Figure 2.6(c), the periodogram estimate is plotted together with the true PSD. As can be seen in the figure, the spectrum contains two nulls, i.e., two frequencies for which the PSD has low power. This can also be seen from the location of the roots of the C(z)-polynomial. These roots are shown in Figure 2.6(d), and are (approximately) z_1 = -0.76 + 0.64i, z_2 = -0.76 - 0.64i, z_3 = 0.36 + 0.69i, and z_4 = 0.36 - 0.69i. If expressed using z = e^{iω}, with ω = 2πf, this corresponds to the frequencies f_1 = 0.39, f_2 = -0.39, f_3 = 0.17, and f_4 = -0.17. These frequencies can also be viewed as the angles of the

vectors pointing to the corresponding roots. Thus, the angle of the vector for the root corresponding to f_1, which is marked with an arrow in the figure, will be ω_1 = 2π · 0.39. An important insight is that the spectrum is nothing but |C(e^{iω})|² evaluated along the unit circle, which implies that the spectrum will have dips at the frequencies that correspond to the angles of the roots z_l, for l = 1, ..., 4. Moreover, the closer the actual root is to the unit circle, the deeper the null, with the spectrum being zero if the root is on the unit circle. Examining the root corresponding to f_1, we note that this root is closer to the unit circle as compared to z_3, and will thus exhibit a deeper null in the resulting spectrum as compared to the one at frequency f_3, just as we see in Figure 2.6(c).

Figure 2.7: A closer look at the estimated correlation function for the MA(4)-process in Example 2.10. This is a magnified version of Figure 2.6(b). The dashed lines correspond to the confidence interval given by Theorem 2.5.

It is often helpful to compute the roots of the generating polynomial as well as the angles of these roots. Using Matlab, this is easily done using the following lines of code:

C = [1 0.8 0.5 0.2 0.6];
f = angle( roots(C) )/pi/2

In Matlab, all polynomial coefficient vectors are indexed as starting at the z^0 coefficient. Thus, the first line will be interpreted by Matlab as forming the polynomial C(z) in (2.136). The second line will compute the roots of C(z), followed by finding the argument of the roots, and normalizing these arguments by 2π.

As noted in the above discussion, we also need to formulate a generalization of Theorem 2.4 for MA-processes. It can be shown that:

Theorem 2.5. Let y_t, for t = 1, ..., N, be a realization of an MA(q)-process.

If ρ̂_y(k) is estimated according to Definition 2.5, then asymptotically

    E{ ρ̂_y(k) } = 0    (2.137)
    V{ ρ̂_y(k) } = (1/N) ( 1 + 2( ρ̂_y²(1) + ... + ρ̂_y²(q) ) )    (2.138)

for k = q+1, q+2, .... Furthermore, ρ̂_y(k), for k > q, is asymptotically Normal distributed.

2.3.5 The autoregressive process

We proceed to define the second basic linear process, namely:

Definition 2.9. The process y_t is called an autoregressive (AR) process if

    A(z) y_t = y_t + a_1 y_{t-1} + ... + a_p y_{t-p} = e_t,    (2.139)

where A(z) is a monic polynomial of order p, i.e.,

    A(z) = 1 + a_1 z^{-1} + ... + a_p z^{-p},    (2.140)

where a_p ≠ 0, and e_t is a zero-mean white noise process with variance σ_e², being uncorrelated with y_{t-l}, for l > 0. The process is stationary (and thus an AR-process) if and only if all the zeros of the generating polynomial A(z) are strictly within the unit circle. An AR-process is always invertible.

Figure 2.8: Generation of an AR(p)-process.

Figure 2.8 illustrates the generation of an AR(p)-process. The mean of an AR-process is easily found by taking the expectation on both sides of (2.139), i.e.,

    E{ y_t + a_1 y_{t-1} + ... + a_p y_{t-p} } = E{ e_t } = 0.    (2.141)

Thus, m_y (1 + a_1 + ... + a_p) = m_y A(1) = 0, which implies that m_y = 0 as all the zeros of A(z) are strictly within the unit circle, implying that A(1) ≠ 0.
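Such a process is readily simulated by all-pole filtering of white noise. The following minimal Matlab sketch (using an arbitrary, stable example polynomial, not one taken from the notes) generates an AR(2) realization, checks that the zeros of A(z) lie strictly inside the unit circle, and verifies that the sample mean is close to zero.

% Generating an AR(p)-process by all-pole filtering of white noise, A(z) y_t = e_t.
% The polynomial below is an arbitrary, stable example.
A  = [1 -0.5 0.3];                       % A(z) = 1 - 0.5 z^-1 + 0.3 z^-2
abs(roots(A))                            % all moduli < 1, so the process is stationary
N  = 1e4;  se2 = 1;
e  = sqrt(se2)*randn(N,1);               % driving white noise
y  = filter(1, A, e);                    % y_t formed from A(z) y_t = e_t, cf. Definition 2.9
mean(y)                                  % close to zero, consistent with m_y = 0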

From (2.44) and (2.139), as well as m_y = 0, one may also find the covariance function of the process by post-multiplying the process with y_{t-k}^* and taking the expectation, i.e.,

    E{ e_t y_{t-k}^* } = E{ y_t y_{t-k}^* + a_1 y_{t-1} y_{t-k}^* + ... + a_p y_{t-p} y_{t-k}^* }    (2.142)
                       = r_y(k) + a_1 r_y(k-1) + ... + a_p r_y(k-p)    (2.143)

Since e_t is uncorrelated with y_{t-l}, for l > 0, E{ e_t y_{t-k}^* } = σ_e² δ_K(k), with δ_K(k) defined as in (2.67), implying that

    r_y(k) + a_1 r_y(k-1) + ... + a_p r_y(k-p) = σ_e² δ_K(k),    (2.144)

which is known as the Yule-Walker equations. Expressed in matrix form for k = 0, ..., n, (2.144) implies

    [ r_y(0)    r_y(-1)   ...  r_y(-n)
      r_y(1)    r_y(0)    ...  r_y(-n+1)
        ...       ...     ...    ...
      r_y(n)    r_y(n-1)  ...  r_y(0)   ] [ 1, a_1, ..., a_n ]^T = [ σ_e², 0, ..., 0 ]^T    (2.145)

Introducing

    θ = [ a_1 ... a_p ]^T    (2.146)

and, by using all but the first row of (2.145),

    [ r_y(0)    r_y(-1)   ...  r_y(-n+1)
        ...       ...     ...    ...
      r_y(n-1)  r_y(n-2)  ...  r_y(0)   ] [ a_1, ..., a_n ]^T = - [ r_y(1), ..., r_y(n) ]^T    (2.147)

or, with obvious definitions,

    r_n + R_n θ = 0,    (2.148)

implying that

    θ̂ = - R_n^{-1} r_n,    (2.149)

which directly yields an estimate of the AR coefficients. We will here refer to this as the Yule-Walker estimate of the AR-coefficients (see also Example 3.5). It is worth noting that R_n is a Toeplitz matrix, a fact that we will make good use of in the following.

Example 2.11. Consider the real-valued AR(1)-process formed using

    y_t + a_1 y_{t-1} = e_t.    (2.150)

Clearly, if |a_1| > 1, then y_t = e_t - a_1 y_{t-1} will grow exponentially as t grows, and y_t will therefore not be a stationary process, confirming that the roots of the A(z)-polynomial need to be strictly inside the unit circle for the process to be an AR-process. Using (2.144) implies that

    r_y(0) + a_1 r_y(1) = σ_e²    (2.151)
    r_y(1) + a_1 r_y(0) = 0    (2.152)

where we have exploited that r_y(k) = r_y^*(-k). Clearly, this allows us to estimate a_1 as a function of r_y(0) and r_y(1), i.e.,

    a_1 = - r_y(1) / r_y(0)    (2.153)
    σ_e² = r_y(0) + a_1 r_y(1) = ( r_y²(0) - r_y²(1) ) / r_y(0)    (2.154)

Alternatively, we may assume we know a_1 and instead solve for r_y(k), i.e.,

    r_y(0) = σ_e² / (1 - a_1²)    (2.155)
    r_y(1) = - a_1 r_y(0) = - a_1 σ_e² / (1 - a_1²).    (2.156)

As r_y(k) + a_1 r_y(k-1) = 0, we may extend this to

    r_y(k) = (-a_1)^{|k|} σ_e² / (1 - a_1²),    (2.157)

where we have again exploited the symmetry of r_y(k). The power spectrum of y_t can be found easily by expressing the process as formed by filtering a white noise with variance σ_e² through the first-order all-pole filter

    H(z) = 1 / ( 1 + a_1 z^{-1} ),    (2.158)

which, using (2.112), yields

    φ_y(ω) = σ_e² / ( [1 + a_1 e^{-iω}] [1 + a_1 e^{iω}] ) = σ_e² / ( 1 + a_1² + 2 a_1 cos ω )    (2.159)

From the expression of φ_y(ω), it is worth noting that the power in y_t will be concentrated at low frequencies if a_1 < 0, and the process is therefore referred to as a low-pass process, with the power being more concentrated close to ω = 0 if a_1 is closer to -1 (recall that |a_1| < 1 to ensure stability), whereas if a_1 > 0, the power will instead be concentrated at high frequencies, and the process is then called a high-pass process.

Generalizing the formulation of φ_y(ω) in the above example to an AR(p)-process, we find that:

An AR(p)-process will satisfy

    m_y = 0    (2.160)
    r_y(k) = σ_e² δ_K(k) - Σ_{l=1}^{p} a_l r_y(k-l)    (2.161)
    φ_y(ω) = σ_e² / |A(e^{iω})|²    (2.162)

with A(e^{iω}) indicating that the polynomial has been evaluated at frequency ω, i.e., z = e^{iω}.

Figure 2.9: The figure illustrates the AR(4)-process discussed in Example 2.12, with (a) showing a realization of the process, (b) the estimated correlation function, (c) the estimated power spectral density, and (d) the roots of the A(z)-polynomial.

Example 2.12. Consider the AR(4)-process formed using

    A(z) = 1 + 0.4 z^{-1} + 0.4 z^{-2} + 0.7 z^{-3} + 0.6 z^{-4}    (2.163)

Figure 2.9 illustrates a realization of this process together with the estimated

correlation function, spectral density, and the roots of the A(z)-polynomial.

Figure 2.10: (a) The roots of the estimated A(z)-polynomial, as well as (b) the estimated power spectral density for the signal in Example 2.3.

2.3.6 The Levinson-Durbin algorithm

In this section, we will discuss a computationally efficient method for computing the Yule-Walker estimate of the AR-coefficients, as given in (2.149). The presentation here follows the one in [4]. The computation of (2.149), as stated, is computationally expensive, requiring about O(n³) operations, meaning that the cost can be written as c_1 n³ + c_2 n² + c_3 n + c_4, for some constants c_l, for l = 1, ..., 4, i.e., the operation has a complexity of order n³. Fortunately, this complexity can be drastically reduced by exploiting the Toeplitz structure of the covariance matrix. Recall the Yule-Walker equations in (2.145),

    [ r_y(0)    r_y(-1)   ...  r_y(-n)
      r_y(1)    r_y(0)    ...  r_y(-n+1)
        ...       ...     ...    ...
      r_y(n)    r_y(n-1)  ...  r_y(0)   ] [ 1, a_1, ..., a_n ]^T = [ σ_n², 0, ..., 0 ]^T    (2.164)

or, using matrix notation,

    R_{n+1} [ 1, θ_n^T ]^T = [ σ_n², 0^T ]^T    (2.165)

where 0 denotes a column vector with zero elements of appropriate dimension, and where we now use the notation σ_n² and θ_n in place of σ_e² and θ, respectively, to stress the order n of the nested structure. Using this structure, we may form

the vector

    R_{n+2} [ 1, θ_n^T, 0 ]^T = [ σ_n², 0^T, α_n ]^T    (2.166)

where θ̃_n and r̃_n indicate that the corresponding vectors have been ordered in the opposite direction, i.e., (cf. (2.147)-(2.148))

    r̃_n = [ r_y(n) ... r_y(1) ]^T    (2.167)

and where

    α_n = r_y(n+1) + r̃_n^T θ_n    (2.168)

is obtained from the bottom row. Thus, if α_n could be nulled, (2.166) would be the counterpart of (2.165), with n increased by one. To achieve this, we introduce the reflection coefficient k_{n+1}, defined as

    k_{n+1} = - α_n / σ_n²    (2.169)

and form

    R_{n+2} ( [ 1, θ_n^T, 0 ]^T + k_{n+1} [ 0, θ̃_n^T, 1 ]^T ) = [ σ_n² + k_{n+1} α_n, 0^T, α_n + k_{n+1} σ_n² ]^T = [ σ_n² + k_{n+1} α_n, 0^T, 0 ]^T    (2.170)

where we have made use of the fact that, for any Hermitian Toeplitz matrix R,

    y = Rx  implies  ỹ = R x̃    (2.171)

where, as before, x̃ indicates that the vector x has been ordered in the opposite direction. The expression in (2.170) has the same form as (2.165), with n increased by one, i.e.,

    R_{n+2} [ 1, θ_{n+1}^T ]^T = [ σ_{n+1}², 0^T ]^T    (2.172)

This suggests that we may compute an order-recursive estimate of θ as

    θ_{n+1} = [ θ_n^T, 0 ]^T + k_{n+1} [ θ̃_n^T, 1 ]^T    (2.173)
    σ_{n+1}² = σ_n² ( 1 - |k_{n+1}|² )    (2.174)

The initialization is straightforward, and the algorithm can be summarized as:

The Levinson-Durbin algorithm

Initialization:

    θ_1 = - r_y(1) / r_y(0) = k_1
    σ_1² = r_y(0) - |r_y(1)|² / r_y(0)

Then, for iteration n = 1, ..., n_max,

    k_{n+1} = - ( r_y(n+1) + r̃_n^T θ_n ) / σ_n²
    σ_{n+1}² = σ_n² ( 1 - |k_{n+1}|² )
    θ_{n+1} = [ θ_n^T, 0 ]^T + k_{n+1} [ θ̃_n^T, 1 ]^T

As can be seen from the table, the Levinson-Durbin algorithm will reduce the complexity of computing θ to O(n²) operations, which is a substantial computational reduction, particularly important for larger values of n. One should note that an estimate of r_y(k) is needed prior to computing the Levinson-Durbin estimate. As this is also a relatively computationally expensive estimate, an algorithm that could estimate θ directly from the measurements y_t, without the need of first computing r_y(k), would clearly be preferable. Such algorithms exist and work exceedingly well (see, e.g., Example 3.5). The most well-known of these are the so-called Burg algorithm [6] and the modified covariance method [7], where the latter is generally perceived to be the method of choice for estimating θ both efficiently and accurately. If using Matlab, these estimates can be found by using the functions arburg and armcov. The interested reader is referred to [4; 7] for a further discussion of these algorithms. It is also worth stressing that the Levinson-Durbin algorithm will produce an exact solution of (2.149). If one allows for an approximate solution, one can achieve further substantial computational reductions using, for instance, the preconditioned conjugate gradient algorithm, which only requires O(2n log(2n)) operations (see, e.g., [8]).

2.3.7 ARMA, ARIMA, and SARIMA processes

We now proceed to combine the two basic processes to form an ARMA process:

Definition 2.10. The process y_t is called an autoregressive moving average (ARMA) process if

    A(z) y_t = C(z) e_t,    (2.175)

where A(z) and C(z) are monic polynomials of order p and q, respectively, i.e.,

    A(z) = 1 + a_1 z^{-1} + ... + a_p z^{-p}    (2.176)
    C(z) = 1 + c_1 z^{-1} + ... + c_q z^{-q}    (2.177)
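Before continuing with the ARMA model, it is worth illustrating the Yule-Walker estimate (2.149) and the Levinson-Durbin recursion of Section 2.3.6 in code. The following minimal Matlab sketch (real-valued data, an arbitrary AR(2) test polynomial, and a plain loop implementation; in practice, functions such as arburg or armcov are typically preferred, as noted above) estimates the AR coefficients from simulated data.

% A minimal Levinson-Durbin implementation of the Yule-Walker estimate (2.149),
% sketched for real-valued data.  The AR(2) test polynomial is an arbitrary example.
N  = 1000;  A0 = [1 -0.5 0.3];           % sample size and true (stable) A(z)
y  = filter(1, A0, randn(N,1));          % simulated AR data (illustrative only)
p  = 2;                                  % model order
r  = zeros(p+1,1);
for k = 0:p
    r(k+1) = sum( y(k+1:N).*y(1:N-k) )/N;         % biased ACF estimate (2.56)
end
theta = -r(2)/r(1);  sig2 = r(1) - r(2)^2/r(1);   % initialization (order 1)
for n = 1:p-1                                     % recursion up to order p
    alpha = r(n+2) + r(n+1:-1:2).'*theta;         % alpha_n = r_y(n+1) + rtilde_n' theta_n
    k     = -alpha/sig2;                          % reflection coefficient k_{n+1}
    theta = [theta; 0] + k*[flipud(theta); 1];    % order update of the AR coefficients
    sig2  = sig2*(1 - k^2);                       % noise variance update
end
[1 theta.'; A0]                          % estimated [1 a_1 ... a_p] versus the true A(z)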


More information

Part III Example Sheet 1 - Solutions YC/Lent 2015 Comments and corrections should be ed to

Part III Example Sheet 1 - Solutions YC/Lent 2015 Comments and corrections should be  ed to TIME SERIES Part III Example Sheet 1 - Solutions YC/Lent 2015 Comments and corrections should be emailed to Y.Chen@statslab.cam.ac.uk. 1. Let {X t } be a weakly stationary process with mean zero and let

More information

DS-GA 1002 Lecture notes 10 November 23, Linear models

DS-GA 1002 Lecture notes 10 November 23, Linear models DS-GA 2 Lecture notes November 23, 2 Linear functions Linear models A linear model encodes the assumption that two quantities are linearly related. Mathematically, this is characterized using linear functions.

More information

TAKEHOME FINAL EXAM e iω e 2iω e iω e 2iω

TAKEHOME FINAL EXAM e iω e 2iω e iω e 2iω ECO 513 Spring 2015 TAKEHOME FINAL EXAM (1) Suppose the univariate stochastic process y is ARMA(2,2) of the following form: y t = 1.6974y t 1.9604y t 2 + ε t 1.6628ε t 1 +.9216ε t 2, (1) where ε is i.i.d.

More information

Elements of Multivariate Time Series Analysis

Elements of Multivariate Time Series Analysis Gregory C. Reinsel Elements of Multivariate Time Series Analysis Second Edition With 14 Figures Springer Contents Preface to the Second Edition Preface to the First Edition vii ix 1. Vector Time Series

More information

Lecture 1: Fundamental concepts in Time Series Analysis (part 2)

Lecture 1: Fundamental concepts in Time Series Analysis (part 2) Lecture 1: Fundamental concepts in Time Series Analysis (part 2) Florian Pelgrin University of Lausanne, École des HEC Department of mathematics (IMEA-Nice) Sept. 2011 - Jan. 2012 Florian Pelgrin (HEC)

More information

3F1 Random Processes Examples Paper (for all 6 lectures)

3F1 Random Processes Examples Paper (for all 6 lectures) 3F Random Processes Examples Paper (for all 6 lectures). Three factories make the same electrical component. Factory A supplies half of the total number of components to the central depot, while factories

More information

Notes on Random Processes

Notes on Random Processes otes on Random Processes Brian Borchers and Rick Aster October 27, 2008 A Brief Review of Probability In this section of the course, we will work with random variables which are denoted by capital letters,

More information

A time series is called strictly stationary if the joint distribution of every collection (Y t

A time series is called strictly stationary if the joint distribution of every collection (Y t 5 Time series A time series is a set of observations recorded over time. You can think for example at the GDP of a country over the years (or quarters) or the hourly measurements of temperature over a

More information

ECE531 Lecture 12: Linear Estimation and Causal Wiener-Kolmogorov Filtering

ECE531 Lecture 12: Linear Estimation and Causal Wiener-Kolmogorov Filtering ECE531 Lecture 12: Linear Estimation and Causal Wiener-Kolmogorov Filtering D. Richard Brown III Worcester Polytechnic Institute 16-Apr-2009 Worcester Polytechnic Institute D. Richard Brown III 16-Apr-2009

More information

Prof. Dr.-Ing. Armin Dekorsy Department of Communications Engineering. Stochastic Processes and Linear Algebra Recap Slides

Prof. Dr.-Ing. Armin Dekorsy Department of Communications Engineering. Stochastic Processes and Linear Algebra Recap Slides Prof. Dr.-Ing. Armin Dekorsy Department of Communications Engineering Stochastic Processes and Linear Algebra Recap Slides Stochastic processes and variables XX tt 0 = XX xx nn (tt) xx 2 (tt) XX tt XX

More information

Chapter 4: Models for Stationary Time Series

Chapter 4: Models for Stationary Time Series Chapter 4: Models for Stationary Time Series Now we will introduce some useful parametric models for time series that are stationary processes. We begin by defining the General Linear Process. Let {Y t

More information

The Hilbert Space of Random Variables

The Hilbert Space of Random Variables The Hilbert Space of Random Variables Electrical Engineering 126 (UC Berkeley) Spring 2018 1 Outline Fix a probability space and consider the set H := {X : X is a real-valued random variable with E[X 2

More information

5: MULTIVARATE STATIONARY PROCESSES

5: MULTIVARATE STATIONARY PROCESSES 5: MULTIVARATE STATIONARY PROCESSES 1 1 Some Preliminary Definitions and Concepts Random Vector: A vector X = (X 1,..., X n ) whose components are scalarvalued random variables on the same probability

More information

Machine Learning. A Bayesian and Optimization Perspective. Academic Press, Sergios Theodoridis 1. of Athens, Athens, Greece.

Machine Learning. A Bayesian and Optimization Perspective. Academic Press, Sergios Theodoridis 1. of Athens, Athens, Greece. Machine Learning A Bayesian and Optimization Perspective Academic Press, 2015 Sergios Theodoridis 1 1 Dept. of Informatics and Telecommunications, National and Kapodistrian University of Athens, Athens,

More information

Reliability and Risk Analysis. Time Series, Types of Trend Functions and Estimates of Trends

Reliability and Risk Analysis. Time Series, Types of Trend Functions and Estimates of Trends Reliability and Risk Analysis Stochastic process The sequence of random variables {Y t, t = 0, ±1, ±2 } is called the stochastic process The mean function of a stochastic process {Y t} is the function

More information

Adaptive Filtering. Squares. Alexander D. Poularikas. Fundamentals of. Least Mean. with MATLABR. University of Alabama, Huntsville, AL.

Adaptive Filtering. Squares. Alexander D. Poularikas. Fundamentals of. Least Mean. with MATLABR. University of Alabama, Huntsville, AL. Adaptive Filtering Fundamentals of Least Mean Squares with MATLABR Alexander D. Poularikas University of Alabama, Huntsville, AL CRC Press Taylor & Francis Croup Boca Raton London New York CRC Press is

More information

X random; interested in impact of X on Y. Time series analogue of regression.

X random; interested in impact of X on Y. Time series analogue of regression. Multiple time series Given: two series Y and X. Relationship between series? Possible approaches: X deterministic: regress Y on X via generalized least squares: arima.mle in SPlus or arima in R. We have

More information

LECTURE NOTES IN AUDIO ANALYSIS: PITCH ESTIMATION FOR DUMMIES

LECTURE NOTES IN AUDIO ANALYSIS: PITCH ESTIMATION FOR DUMMIES LECTURE NOTES IN AUDIO ANALYSIS: PITCH ESTIMATION FOR DUMMIES Abstract March, 3 Mads Græsbøll Christensen Audio Analysis Lab, AD:MT Aalborg University This document contains a brief introduction to pitch

More information

Ross Bettinger, Analytical Consultant, Seattle, WA

Ross Bettinger, Analytical Consultant, Seattle, WA ABSTRACT DYNAMIC REGRESSION IN ARIMA MODELING Ross Bettinger, Analytical Consultant, Seattle, WA Box-Jenkins time series models that contain exogenous predictor variables are called dynamic regression

More information

Time Series Analysis. Solutions to problems in Chapter 5 IMM

Time Series Analysis. Solutions to problems in Chapter 5 IMM Time Series Analysis Solutions to problems in Chapter 5 IMM Solution 5.1 Question 1. [ ] V [X t ] = V [ǫ t + c(ǫ t 1 + ǫ t + )] = 1 + c 1 σǫ = The variance of {X t } is not limited and therefore {X t }

More information

Some Time-Series Models

Some Time-Series Models Some Time-Series Models Outline 1. Stochastic processes and their properties 2. Stationary processes 3. Some properties of the autocorrelation function 4. Some useful models Purely random processes, random

More information

Statistical Signal Processing Detection, Estimation, and Time Series Analysis

Statistical Signal Processing Detection, Estimation, and Time Series Analysis Statistical Signal Processing Detection, Estimation, and Time Series Analysis Louis L. Scharf University of Colorado at Boulder with Cedric Demeure collaborating on Chapters 10 and 11 A TT ADDISON-WESLEY

More information

On Moving Average Parameter Estimation

On Moving Average Parameter Estimation On Moving Average Parameter Estimation Niclas Sandgren and Petre Stoica Contact information: niclas.sandgren@it.uu.se, tel: +46 8 473392 Abstract Estimation of the autoregressive moving average (ARMA)

More information

Time Series Examples Sheet

Time Series Examples Sheet Lent Term 2001 Richard Weber Time Series Examples Sheet This is the examples sheet for the M. Phil. course in Time Series. A copy can be found at: http://www.statslab.cam.ac.uk/~rrw1/timeseries/ Throughout,

More information

E 4101/5101 Lecture 6: Spectral analysis

E 4101/5101 Lecture 6: Spectral analysis E 4101/5101 Lecture 6: Spectral analysis Ragnar Nymoen 3 March 2011 References to this lecture Hamilton Ch 6 Lecture note (on web page) For stationary variables/processes there is a close correspondence

More information

Lesson 1. Optimal signalbehandling LTH. September Statistical Digital Signal Processing and Modeling, Hayes, M:

Lesson 1. Optimal signalbehandling LTH. September Statistical Digital Signal Processing and Modeling, Hayes, M: Lesson 1 Optimal Signal Processing Optimal signalbehandling LTH September 2013 Statistical Digital Signal Processing and Modeling, Hayes, M: John Wiley & Sons, 1996. ISBN 0471594318 Nedelko Grbic Mtrl

More information

STAT Financial Time Series

STAT Financial Time Series STAT 6104 - Financial Time Series Chapter 4 - Estimation in the time Domain Chun Yip Yau (CUHK) STAT 6104:Financial Time Series 1 / 46 Agenda 1 Introduction 2 Moment Estimates 3 Autoregressive Models (AR

More information

Next tool is Partial ACF; mathematical tools first. The Multivariate Normal Distribution. e z2 /2. f Z (z) = 1 2π. e z2 i /2

Next tool is Partial ACF; mathematical tools first. The Multivariate Normal Distribution. e z2 /2. f Z (z) = 1 2π. e z2 i /2 Next tool is Partial ACF; mathematical tools first. The Multivariate Normal Distribution Defn: Z R 1 N(0,1) iff f Z (z) = 1 2π e z2 /2 Defn: Z R p MV N p (0, I) if and only if Z = (Z 1,..., Z p ) (a column

More information

1 Linear Difference Equations

1 Linear Difference Equations ARMA Handout Jialin Yu 1 Linear Difference Equations First order systems Let {ε t } t=1 denote an input sequence and {y t} t=1 sequence generated by denote an output y t = φy t 1 + ε t t = 1, 2,... with

More information

Difference equations. Definitions: A difference equation takes the general form. x t f x t 1,,x t m.

Difference equations. Definitions: A difference equation takes the general form. x t f x t 1,,x t m. Difference equations Definitions: A difference equation takes the general form x t fx t 1,x t 2, defining the current value of a variable x as a function of previously generated values. A finite order

More information

On Input Design for System Identification

On Input Design for System Identification On Input Design for System Identification Input Design Using Markov Chains CHIARA BRIGHENTI Masters Degree Project Stockholm, Sweden March 2009 XR-EE-RT 2009:002 Abstract When system identification methods

More information

Stochastic Processes. A stochastic process is a function of two variables:

Stochastic Processes. A stochastic process is a function of two variables: Stochastic Processes Stochastic: from Greek stochastikos, proceeding by guesswork, literally, skillful in aiming. A stochastic process is simply a collection of random variables labelled by some parameter:

More information

Parametric Signal Modeling and Linear Prediction Theory 1. Discrete-time Stochastic Processes (cont d)

Parametric Signal Modeling and Linear Prediction Theory 1. Discrete-time Stochastic Processes (cont d) Parametric Signal Modeling and Linear Prediction Theory 1. Discrete-time Stochastic Processes (cont d) Electrical & Computer Engineering North Carolina State University Acknowledgment: ECE792-41 slides

More information

Adaptive Systems Homework Assignment 1

Adaptive Systems Homework Assignment 1 Signal Processing and Speech Communication Lab. Graz University of Technology Adaptive Systems Homework Assignment 1 Name(s) Matr.No(s). The analytical part of your homework (your calculation sheets) as

More information

Econ 623 Econometrics II Topic 2: Stationary Time Series

Econ 623 Econometrics II Topic 2: Stationary Time Series 1 Introduction Econ 623 Econometrics II Topic 2: Stationary Time Series In the regression model we can model the error term as an autoregression AR(1) process. That is, we can use the past value of the

More information

Part III Spectrum Estimation

Part III Spectrum Estimation ECE79-4 Part III Part III Spectrum Estimation 3. Parametric Methods for Spectral Estimation Electrical & Computer Engineering North Carolina State University Acnowledgment: ECE79-4 slides were adapted

More information

Linear Stochastic Models. Special Types of Random Processes: AR, MA, and ARMA. Digital Signal Processing

Linear Stochastic Models. Special Types of Random Processes: AR, MA, and ARMA. Digital Signal Processing Linear Stochastic Models Special Types of Random Processes: AR, MA, and ARMA Digital Signal Processing Department of Electrical and Electronic Engineering, Imperial College d.mandic@imperial.ac.uk c Danilo

More information

Multivariate ARMA Processes

Multivariate ARMA Processes LECTURE 8 Multivariate ARMA Processes A vector y(t) of n elements is said to follow an n-variate ARMA process of orders p and q if it satisfies the equation (1) A 0 y(t) + A 1 y(t 1) + + A p y(t p) = M

More information

Subspace Identification

Subspace Identification Chapter 10 Subspace Identification Given observations of m 1 input signals, and p 1 signals resulting from those when fed into a dynamical system under study, can we estimate the internal dynamics regulating

More information

CONTENTS NOTATIONAL CONVENTIONS GLOSSARY OF KEY SYMBOLS 1 INTRODUCTION 1

CONTENTS NOTATIONAL CONVENTIONS GLOSSARY OF KEY SYMBOLS 1 INTRODUCTION 1 DIGITAL SPECTRAL ANALYSIS WITH APPLICATIONS S.LAWRENCE MARPLE, JR. SUMMARY This new book provides a broad perspective of spectral estimation techniques and their implementation. It concerned with spectral

More information

LECTURE 10 LINEAR PROCESSES II: SPECTRAL DENSITY, LAG OPERATOR, ARMA. In this lecture, we continue to discuss covariance stationary processes.

LECTURE 10 LINEAR PROCESSES II: SPECTRAL DENSITY, LAG OPERATOR, ARMA. In this lecture, we continue to discuss covariance stationary processes. MAY, 0 LECTURE 0 LINEAR PROCESSES II: SPECTRAL DENSITY, LAG OPERATOR, ARMA In this lecture, we continue to discuss covariance stationary processes. Spectral density Gourieroux and Monfort 990), Ch. 5;

More information

EC402: Serial Correlation. Danny Quah Economics Department, LSE Lent 2015

EC402: Serial Correlation. Danny Quah Economics Department, LSE Lent 2015 EC402: Serial Correlation Danny Quah Economics Department, LSE Lent 2015 OUTLINE 1. Stationarity 1.1 Covariance stationarity 1.2 Explicit Models. Special cases: ARMA processes 2. Some complex numbers.

More information

MGR-815. Notes for the MGR-815 course. 12 June School of Superior Technology. Professor Zbigniew Dziong

MGR-815. Notes for the MGR-815 course. 12 June School of Superior Technology. Professor Zbigniew Dziong Modeling, Estimation and Control, for Telecommunication Networks Notes for the MGR-815 course 12 June 2010 School of Superior Technology Professor Zbigniew Dziong 1 Table of Contents Preface 5 1. Example

More information

Linear models. Chapter Overview. Linear process: A process {X n } is a linear process if it has the representation.

Linear models. Chapter Overview. Linear process: A process {X n } is a linear process if it has the representation. Chapter 2 Linear models 2.1 Overview Linear process: A process {X n } is a linear process if it has the representation X n = b j ɛ n j j=0 for all n, where ɛ n N(0, σ 2 ) (Gaussian distributed with zero

More information

Chapter 9: Forecasting

Chapter 9: Forecasting Chapter 9: Forecasting One of the critical goals of time series analysis is to forecast (predict) the values of the time series at times in the future. When forecasting, we ideally should evaluate the

More information

(a)

(a) Chapter 8 Subspace Methods 8. Introduction Principal Component Analysis (PCA) is applied to the analysis of time series data. In this context we discuss measures of complexity and subspace methods for

More information

ECE 541 Stochastic Signals and Systems Problem Set 9 Solutions

ECE 541 Stochastic Signals and Systems Problem Set 9 Solutions ECE 541 Stochastic Signals and Systems Problem Set 9 Solutions Problem Solutions : Yates and Goodman, 9.5.3 9.1.4 9.2.2 9.2.6 9.3.2 9.4.2 9.4.6 9.4.7 and Problem 9.1.4 Solution The joint PDF of X and Y

More information

Statistics of stochastic processes

Statistics of stochastic processes Introduction Statistics of stochastic processes Generally statistics is performed on observations y 1,..., y n assumed to be realizations of independent random variables Y 1,..., Y n. 14 settembre 2014

More information

New Introduction to Multiple Time Series Analysis

New Introduction to Multiple Time Series Analysis Helmut Lütkepohl New Introduction to Multiple Time Series Analysis With 49 Figures and 36 Tables Springer Contents 1 Introduction 1 1.1 Objectives of Analyzing Multiple Time Series 1 1.2 Some Basics 2

More information

Estimation Theory Fredrik Rusek. Chapters

Estimation Theory Fredrik Rusek. Chapters Estimation Theory Fredrik Rusek Chapters 3.5-3.10 Recap We deal with unbiased estimators of deterministic parameters Performance of an estimator is measured by the variance of the estimate (due to the

More information

Univariate Time Series Analysis; ARIMA Models

Univariate Time Series Analysis; ARIMA Models Econometrics 2 Fall 24 Univariate Time Series Analysis; ARIMA Models Heino Bohn Nielsen of4 Outline of the Lecture () Introduction to univariate time series analysis. (2) Stationarity. (3) Characterizing

More information

Time Series Models and Inference. James L. Powell Department of Economics University of California, Berkeley

Time Series Models and Inference. James L. Powell Department of Economics University of California, Berkeley Time Series Models and Inference James L. Powell Department of Economics University of California, Berkeley Overview In contrast to the classical linear regression model, in which the components of the

More information

2 Statistical Estimation: Basic Concepts

2 Statistical Estimation: Basic Concepts Technion Israel Institute of Technology, Department of Electrical Engineering Estimation and Identification in Dynamical Systems (048825) Lecture Notes, Fall 2009, Prof. N. Shimkin 2 Statistical Estimation:

More information

Multivariate Time Series: VAR(p) Processes and Models

Multivariate Time Series: VAR(p) Processes and Models Multivariate Time Series: VAR(p) Processes and Models A VAR(p) model, for p > 0 is X t = φ 0 + Φ 1 X t 1 + + Φ p X t p + A t, where X t, φ 0, and X t i are k-vectors, Φ 1,..., Φ p are k k matrices, with

More information

Empirical Market Microstructure Analysis (EMMA)

Empirical Market Microstructure Analysis (EMMA) Empirical Market Microstructure Analysis (EMMA) Lecture 3: Statistical Building Blocks and Econometric Basics Prof. Dr. Michael Stein michael.stein@vwl.uni-freiburg.de Albert-Ludwigs-University of Freiburg

More information

Classical Decomposition Model Revisited: I

Classical Decomposition Model Revisited: I Classical Decomposition Model Revisited: I recall classical decomposition model for time series Y t, namely, Y t = m t + s t + W t, where m t is trend; s t is periodic with known period s (i.e., s t s

More information

where r n = dn+1 x(t)

where r n = dn+1 x(t) Random Variables Overview Probability Random variables Transforms of pdfs Moments and cumulants Useful distributions Random vectors Linear transformations of random vectors The multivariate normal distribution

More information

Definition of a Stochastic Process

Definition of a Stochastic Process Definition of a Stochastic Process Balu Santhanam Dept. of E.C.E., University of New Mexico Fax: 505 277 8298 bsanthan@unm.edu August 26, 2018 Balu Santhanam (UNM) August 26, 2018 1 / 20 Overview 1 Stochastic

More information

Practical Spectral Estimation

Practical Spectral Estimation Digital Signal Processing/F.G. Meyer Lecture 4 Copyright 2015 François G. Meyer. All Rights Reserved. Practical Spectral Estimation 1 Introduction The goal of spectral estimation is to estimate how the

More information

Stochastic Processes: I. consider bowl of worms model for oscilloscope experiment:

Stochastic Processes: I. consider bowl of worms model for oscilloscope experiment: Stochastic Processes: I consider bowl of worms model for oscilloscope experiment: SAPAscope 2.0 / 0 1 RESET SAPA2e 22, 23 II 1 stochastic process is: Stochastic Processes: II informally: bowl + drawing

More information

7. MULTIVARATE STATIONARY PROCESSES

7. MULTIVARATE STATIONARY PROCESSES 7. MULTIVARATE STATIONARY PROCESSES 1 1 Some Preliminary Definitions and Concepts Random Vector: A vector X = (X 1,..., X n ) whose components are scalar-valued random variables on the same probability

More information

Statistical and Adaptive Signal Processing

Statistical and Adaptive Signal Processing r Statistical and Adaptive Signal Processing Spectral Estimation, Signal Modeling, Adaptive Filtering and Array Processing Dimitris G. Manolakis Massachusetts Institute of Technology Lincoln Laboratory

More information

Gaussian processes. Basic Properties VAG002-

Gaussian processes. Basic Properties VAG002- Gaussian processes The class of Gaussian processes is one of the most widely used families of stochastic processes for modeling dependent data observed over time, or space, or time and space. The popularity

More information

DOA Estimation using MUSIC and Root MUSIC Methods

DOA Estimation using MUSIC and Root MUSIC Methods DOA Estimation using MUSIC and Root MUSIC Methods EE602 Statistical signal Processing 4/13/2009 Presented By: Chhavipreet Singh(Y515) Siddharth Sahoo(Y5827447) 2 Table of Contents 1 Introduction... 3 2

More information

Lecture Notes 1: Vector spaces

Lecture Notes 1: Vector spaces Optimization-based data analysis Fall 2017 Lecture Notes 1: Vector spaces In this chapter we review certain basic concepts of linear algebra, highlighting their application to signal processing. 1 Vector

More information

Akaike criterion: Kullback-Leibler discrepancy

Akaike criterion: Kullback-Leibler discrepancy Model choice. Akaike s criterion Akaike criterion: Kullback-Leibler discrepancy Given a family of probability densities {f ( ; ψ), ψ Ψ}, Kullback-Leibler s index of f ( ; ψ) relative to f ( ; θ) is (ψ

More information

EE731 Lecture Notes: Matrix Computations for Signal Processing

EE731 Lecture Notes: Matrix Computations for Signal Processing EE731 Lecture Notes: Matrix Computations for Signal Processing James P. Reilly c Department of Electrical and Computer Engineering McMaster University September 22, 2005 0 Preface This collection of ten

More information

ARIMA Modelling and Forecasting

ARIMA Modelling and Forecasting ARIMA Modelling and Forecasting Economic time series often appear nonstationary, because of trends, seasonal patterns, cycles, etc. However, the differences may appear stationary. Δx t x t x t 1 (first

More information

Spectral Analysis. Jesús Fernández-Villaverde University of Pennsylvania

Spectral Analysis. Jesús Fernández-Villaverde University of Pennsylvania Spectral Analysis Jesús Fernández-Villaverde University of Pennsylvania 1 Why Spectral Analysis? We want to develop a theory to obtain the business cycle properties of the data. Burns and Mitchell (1946).

More information

A6523 Modeling, Inference, and Mining Jim Cordes, Cornell University

A6523 Modeling, Inference, and Mining Jim Cordes, Cornell University A6523 Modeling, Inference, and Mining Jim Cordes, Cornell University Lecture 19 Modeling Topics plan: Modeling (linear/non- linear least squares) Bayesian inference Bayesian approaches to spectral esbmabon;

More information

Autoregressive Moving Average (ARMA) Models and their Practical Applications

Autoregressive Moving Average (ARMA) Models and their Practical Applications Autoregressive Moving Average (ARMA) Models and their Practical Applications Massimo Guidolin February 2018 1 Essential Concepts in Time Series Analysis 1.1 Time Series and Their Properties Time series:

More information