Exact Maximum Likelihood Estimation for Non-Gaussian Non-invertible Moving Averages
Nan-Jung Hsu, Institute of Statistics, National Tsing-Hua University
F. Jay Breidt, Department of Statistics, Colorado State University

January 3, 2005

Abstract

A procedure for solving the exact maximum likelihood estimation (MLE) problem is proposed for non-invertible non-Gaussian MA processes. By augmenting certain latent variables, the exact likelihood of all relevant innovations can be expressed explicitly through a set of recursions (Breidt and Hsu, 2005). The exact MLE is then solved numerically by the EM algorithm. Two alternative estimators are also proposed, corresponding to different treatments of the latent variables. We show that these three estimators and the approximate MLE proposed by Lii and Rosenblatt (1992) are asymptotically equivalent. In simulations, the MLE solved by EM performs better than the other estimators, in terms of smaller root mean square errors, for small samples.

Keywords: EM algorithm, Monte Carlo, non-invertible, non-minimum phase, non-Gaussian

1 Introduction

Consider a qth-order moving average (MA(q)) process

X_t = θ(B) Z_t,  (1)

where {Z_t} is an independent and identically distributed (iid) sequence of random variables with zero mean and finite variance, B is the backshift operator (B^k Y_t = Y_{t-k} for k = 0, ±1, ±2, ...), θ(z) = 1 + θ_1 z + ... + θ_q z^q, and θ_q ≠ 0. The moving average polynomial θ(z) is said to be invertible if all the roots of θ(z) = 0 are outside the unit circle in the complex plane, and non-invertible (or non-minimum phase) otherwise (Brockwell and Davis, 1991, Theorem 3.1.2).
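As a concrete check of the definition above, the invertibility of an MA(q) polynomial can be classified numerically from the roots of θ(z). The following is a minimal illustrative sketch; the function name and the use of NumPy are ours, not from the paper:

```python
import numpy as np

def classify_ma(theta):
    """Classify the MA polynomial theta(z) = 1 + theta_1 z + ... + theta_q z^q.

    Returns 'invertible' if every root of theta(z) = 0 lies outside the
    unit circle, and 'non-invertible' otherwise.
    """
    # np.roots expects coefficients ordered from the highest power down to
    # the constant term, so reverse theta and append the leading 1.
    roots = np.roots(np.r_[np.asarray(theta, dtype=float)[::-1], 1.0])
    return "invertible" if np.all(np.abs(roots) > 1.0) else "non-invertible"

# theta(z) = 1 + 0.5 z has its single root at z = -2 (outside the unit circle).
print(classify_ma([0.5]))                             # invertible
# theta(z) = (1 + z/0.9)(1 + z/1.1) has a root at z = -0.9 (inside).
print(classify_ma([1/0.9 + 1/1.1, 1/(0.9 * 1.1)]))    # non-invertible
```

A Gaussian likelihood cannot distinguish the two cases above from their mirror-root counterparts, which is exactly the identifiability problem discussed next.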
Invertibility is a standard assumption in the analysis of moving average time series models, because without this assumption the model is not identifiable using estimation methods based on second-order moments of the process. Such methods include Gaussian likelihood, least squares, and various spectral-based methods (see, for example, Brockwell and Davis, 1991). But in the non-Gaussian case, invertible and non-invertible moving averages are distinguishable on the basis of higher-order cumulants or likelihood functions. The invertibility assumption in the non-Gaussian case is entirely artificial, and removing this assumption leads to a broad class of useful models. Indeed, non-invertible moving averages, and the broader class of non-minimum phase autoregressive moving average (ARMA) models, are important tools in a number of applications, including seismic and other deconvolution problems (Wiggins, 1978; Ooe and Ulrych, 1979; Blass and Halsey, 1981; Donoho, 1981; Godfrey and Rocca, 1981; Hsueh and Mendel, 1985; Scargle, 1981), design of communication systems (Benveniste, Goursat, and Ruget, 1980), processing of blurry images (Donoho, 1981; Chien, Yang, and Chi, 1997), and modeling of vocal tract filters (Rabiner and Schafer, 1978; Chien, Yang, and Chi, 1997).

Estimation methods for general moving average processes include cumulant-based estimators using cumulants of order greater than two (Wiggins, 1978; Donoho, 1981; Lii and Rosenblatt, 1982; Giannakis and Swami, 1990; Chi and Kung, 1995; Chien, Yang, and Chi, 1997); quasi-likelihood methods, which lead to least absolute deviation-type estimators (Huang and Pawitan, 2000; Breidt, Davis, and Trindade, 2001); and approximate maximum likelihood estimation (Lii and Rosenblatt, 1992). The approach of Huang and Pawitan (2000) is a semi-parametric method which uses the Laplacian likelihood and does not require a distributional assumption on the innovations. However, their likelihood is actually a conditional likelihood, conditional on certain initial variables
set to zero. This estimator is proved to be consistent only for heavy-tailed errors. The alternative approach of Lii and Rosenblatt (1992) uses an appropriate truncation in the representation of the innovations in terms of the observations, and approximates the likelihood function based on these truncated innovations. This approximate MLE is shown to be asymptotically equivalent to the exact MLE under mild conditions.

In this work, we incorporate q latent variables into the likelihood function and solve the exact MLE of the model parameters by the EM algorithm. In addition, we propose two alternative estimators corresponding to different treatments of the latent variables. One is the conditional MLE, in which all of the latent variables are set to zero. The other is the joint MLE, in which the latent variables and the model parameters are estimated simultaneously by maximizing the joint likelihood. We show that these two alternative estimators have the same asymptotic distribution as the MLE.

The rest of the paper is organized as follows. In Section 2, we first review the recursions given by Breidt and Hsu (2005) for computing residuals from a realization of a general moving average process. The likelihood function is then written in terms of these residuals and the specified distribution of the innovations. In Section 3, the procedure for solving the exact MLE by the EM algorithm and the two alternative estimators are introduced. In Section 4, numerical simulations are conducted to evaluate the finite-sample performance of the different estimators for a non-invertible MA(2) with non-Gaussian noise under various parameter settings; the exact MLE performs better than the other estimators, in terms of smaller root mean square errors, in all cases. A brief discussion follows in Section 5. Proofs
are given in the Appendix.

2 MA processes and the likelihoods

Rewrite (1) as

X_t = θ(B) Z_t = θ♯(B) θ♭(B) Z_t,  (2)

with

θ♯(z) = 1 + θ♯_1 z + ... + θ♯_s z^s ≠ 0 for |z| > 1,
θ♭(z) = 1 + θ♭_1 z + ... + θ♭_r z^r ≠ 0 for |z| ≤ 1,

where r + s = q, θ♯_s ≠ 0, and θ♭_r ≠ 0. We further assume that any unit roots of θ♯(z) = 0 are not repeated. The moving average polynomial θ(z) is invertible if s = 0, and non-invertible (non-minimum phase) otherwise. We refer to the case in which r = 0 and s > 0 as purely non-invertible.

Following Breidt and Hsu (2005), define

W_t = θ♭(B) Z_t,  (3)

so that

Z_t = W_t - (θ♭(B) - 1) Z_t.  (4)

Then

X_t = θ♯(B) W_t = (1 + θ♯_1 B + ... + θ♯_s B^s) W_t = θ♯_s θ̃(B) W_{t-s},  (5)

where

θ̃(z) = 1 + (θ♯_{s-1}/θ♯_s) z + ... + (θ♯_1/θ♯_s) z^{s-1} + (1/θ♯_s) z^s.

Thus,

W_{t-s} = X_t/θ♯_s - (θ̃(B) - 1) W_{t-s}.  (6)

By incorporating the latent variables Z_r = (Z_{-q+1}, ..., Z_{-q+r})' and W_s = (W_{n-s+1}, ..., W_n)' with the observations X_n = (X_1, ..., X_n)', the joint distribution of (X_n, Z_r, W_s) can be represented by the joint distribution of (Z_{-q+1}, ..., Z_0, Z_1, ..., Z_n) via the following two
linear transformations, obtained directly from (4) and (6). The first transformation maps (Z_{-q+1}, ..., Z_n) to (Z_r, W_{-s+1}, ..., W_n) through W_t = Z_t + θ♭_1 Z_{t-1} + ... + θ♭_r Z_{t-r}; its matrix is triangular with unit diagonal. The second maps (Z_r, W_{-s+1}, ..., W_n) to (Z_r, X_1, ..., X_n, W_{n-s+1}, ..., W_n) through X_t = W_t + θ♯_1 W_{t-1} + ... + θ♯_s W_{t-s}; its matrix is triangular, with leading entry θ♯_s in each of the n rows corresponding to X_1, ..., X_n and unit diagonal elsewhere. Consequently, the joint distribution of (X_n, Z_r, W_s) satisfies

p(x_n, z_r, w_s; θ, σ) = |θ♯_s|^{-n} ∏_{t=-q+1}^{n} f_σ(z_t),  (7)

where f_σ(z) = f(z/σ)/σ is the probability density function of Z_t with scale parameter σ, and θ = (θ♯, θ♭) with θ♯ = (θ♯_1, θ♯_2, ..., θ♯_s)' and θ♭ = (θ♭_1, θ♭_2, ..., θ♭_r)'. Note that the residuals {z_t : t = -q+1, ..., n} solved from (4) and (6) are functions of θ, X_n, Z_r, and W_s, and are denoted z_t(θ, x_n, z_r, w_s) for completeness in what follows.

Since the latent variables Z_r and W_s are unobserved in practice, we propose three estimators of the parameters obtained by maximizing the joint distribution (7) under different treatments of the latent variables. The first is the exact MLE solved by the EM algorithm, in which the latent variables are treated as missing data. The second is the conditional MLE, in which all of the latent variables are set to zero. The third is the joint MLE, in which the latent variables and the model parameters are estimated simultaneously. The details are described in the next section.

3 Estimation methods

In the following, three estimators of ψ = (θ, σ) are introduced.
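For the r = s = 1 case, the recursions (4) and (6) and the joint density (7) can be put in code directly. The sketch below is illustrative only (the function names are ours), and it uses a standard Laplace base density f(z) = exp(-|z|)/2 as one concrete choice of f:

```python
import numpy as np

def residuals_ma2(x, th_sharp, th_flat, z_init, w_end):
    """Innovations for the r = s = 1 model
    X_t = (1 + th_sharp*B)(1 + th_flat*B) Z_t,
    recovered from x = (X_1, ..., X_n) and the two latent initial values
    z_init = Z_{-1} and w_end = W_n, via recursions (6) and (4)."""
    n = len(x)
    # Backward recursion (6): W_{t-1} = (X_t - W_t) / th_sharp, from W_n.
    w = np.empty(n + 1)                  # w[t] holds W_t for t = 0..n
    w[n] = w_end
    for t in range(n, 0, -1):
        w[t - 1] = (x[t - 1] - w[t]) / th_sharp
    # Forward recursion (4): Z_t = W_t - th_flat * Z_{t-1}, from Z_{-1}.
    z = np.empty(n + 2)                  # z[k] holds Z_{k-1} for k = 0..n+1
    z[0] = z_init
    for t in range(n + 1):
        z[t + 1] = w[t] - th_flat * z[t]
    return z                             # (Z_{-1}, Z_0, ..., Z_n)

def log_likelihood(x, th_sharp, th_flat, sigma, z_init, w_end):
    """Joint log-density (7): -n*log|th_sharp| + sum_t log f_sigma(z_t),
    with the standard Laplace base density f(z) = exp(-|z|)/2."""
    z = residuals_ma2(x, th_sharp, th_flat, z_init, w_end)
    log_f = -np.abs(z) / sigma - np.log(2.0 * sigma)
    return -len(x) * np.log(abs(th_sharp)) + log_f.sum()
```

A quick correctness check on the recursions: simulating from the model and feeding back the true latent values (Z_{-1}, W_n) recovers the innovations exactly.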
First, the exact MLE is considered. The latent variables (Z_r, W_s) are unobserved and are treated as missing data in the joint distribution (7). The MLE of the parameters is then solved iteratively by the EM algorithm, given by the following two steps:

E-step: Compute the conditional expectation of the log-likelihood, which satisfies

Q(ψ | ψ_old) = -n log|θ♯_s| + Σ_{t=-q+1}^{n} E_old[ log f_σ(z_t(θ, X_n, Z_r, W_s)) | X_n ],

where the expectation E_old is taken over the latent variables (Z_r, W_s) with respect to their conditional distribution given X_n at the current parameter estimates ψ_old.

M-step: Solve ψ_new = argmax_ψ Q(ψ | ψ_old).

These two steps are repeated until convergence is achieved. In general, the expectation in the E-step does not have a closed form. Therefore, we adopt a Monte Carlo method to compute the expectation numerically; the same idea is used in Breidt and Hsu (2005) for obtaining the best mean square predictor for non-Gaussian MA processes. More precisely, each term of the expectation in the E-step can be rewritten as

Q_t(ψ | ψ_old) ≡ E_old[ log f_σ(z_t(θ, x_n, Z_r, W_s)) | X_n = x_n ]
= ∫ log f_σ(z_t(θ, x_n, z_r, w_s)) p(x_n, z_r, w_s; ψ_old) dz_r dw_s / ∫ p(x_n, z_r, w_s; ψ_old) dz_r dw_s
= ∫ log f_σ(z_t(θ, x_n, z_r, w_s)) [ ∏_{k=-q+1}^{n} f_{σ_old}(z_k(θ_old, x_n, z_r, w_s)) ] dz_r dw_s / ∫ ∏_{k=-q+1}^{n} f_{σ_old}(z_k(θ_old, x_n, z_r, w_s)) dz_r dw_s.  (8)

These q-dimensional integrals can be evaluated numerically via importance sampling as follows. Let h(z_r, w_s) be a q-dimensional joint density, called the importance sampler, satisfying supp(h) ⊇ supp( ∫ p(x_n, z_r, w_s; ψ_old) dx_n ), where supp(h) is the support of h. For any (z_r^{(i)}, w_s^{(i)}) ∈ supp(h), define the importance weight

A(x_n, z_r^{(i)}, w_s^{(i)}; ψ_old) = ∏_{k=-q+1}^{n} f_{σ_old}(z_k(θ_old, x_n, z_r^{(i)}, w_s^{(i)})) / h(z_r^{(i)}, w_s^{(i)}).

Then (8) can be written as

Q_t(ψ | ψ_old) = ∫ log f_σ(z_t(θ, x_n, z_r, w_s)) A(x_n, z_r, w_s; ψ_old) h(z_r, w_s) dz_r dw_s / ∫ A(x_n, z_r, w_s; ψ_old) h(z_r, w_s) dz_r dw_s
= E_h[ log f_σ(z_t(θ, x_n, Z_r, W_s)) A(x_n, Z_r, W_s; ψ_old) ] / E_h[ A(x_n, Z_r, W_s; ψ_old) ],  (9)
where the expectation in (9) is taken with respect to h(z_r, w_s). The ratio in (9) can be approximated by the Monte Carlo (MC) method as

Q̂_t(ψ | ψ_old) = Σ_{i=1}^{M} log f_σ(z_t(θ, x_n, z_r^{(i)}, w_s^{(i)})) A(x_n, z_r^{(i)}, w_s^{(i)}; ψ_old) / Σ_{i=1}^{M} A(x_n, z_r^{(i)}, w_s^{(i)}; ψ_old),  (10)

where M is the number of draws in the importance sampling and {(z_r^{(i)}, w_s^{(i)}) : i = 1, 2, ..., M} are q-dimensional random vectors drawn from the importance sampler h(z_r, w_s). Consequently, the MC approximation of Q(ψ | ψ_old) is

Q̂(ψ | ψ_old) = -n log|θ♯_s| + Σ_{t=-q+1}^{n} Q̂_t(ψ | ψ_old),

which is what is actually used in the implementation of the EM algorithm to solve the MLE.

The second estimator is the conditional MLE, in which the latent variables are set to zero; namely,

ψ̂_C = argmax_ψ { -n log|θ♯_s| + Σ_{t=-q+1}^{n} log f_σ(z_t(θ, x_n, 0_r, 0_s)) }.  (11)

The third estimator is the joint MLE, in which the latent variables and the model parameters are estimated simultaneously:

(ψ̂_J, ẑ_r, ŵ_s) = argmax_{ψ, z_r, w_s} { -n log|θ♯_s| + Σ_{t=-q+1}^{n} log f_σ(z_t(θ, x_n, z_r, w_s)) }.  (12)

Lii and Rosenblatt (1992) derived the asymptotic distribution of the quasi-estimator of the parameters that maximizes the pseudo-likelihood defined in (8) of their paper. This quasi-estimator would be the MLE if all of {X_t : t ∈ Z} were observed. However, the pseudo-likelihood depends on unobserved {X_t} and cannot be computed in practice. They therefore further showed that the approximate MLE, which maximizes a truncated version of the pseudo-likelihood, is asymptotically equivalent to the quasi-estimator. In this work, we further show that the conditional MLE ψ̂_C and the joint MLE ψ̂_J defined in (11) and (12) are also asymptotically equivalent to the quasi-estimator. The proof is given in the Appendix.

4 Simulation

To investigate the performance of the proposed estimators in finite samples, the following simulation study is conducted. We consider two kinds of innovation distributions, the Laplace distribution and Student's t distribution with four degrees of freedom, which are the same settings as in
Lii and Rosenblatt (1992). The models considered in our simulation study include an invertible MA(1) (s = 0, r = 1), a non-invertible MA(1) (s = 1, r = 0), and a non-purely non-invertible MA(2) (s = r = 1). For each MA process,
two parameter settings are considered, one closer to the unit-root case and one farther from it. The detailed settings are given in Table 1.

Table 1: Model settings for simulations.

  Model                   Parameter Values                       Innovation Distribution
  invertible MA(1)        θ♭_1 = 0.9 or θ♭_1 = 0.8               Laplace
  non-invertible MA(1)    θ♯_1 = 1/0.9 or θ♯_1 = 1/0.8           Laplace
  non-invertible MA(2)    θ♯_1 = 1/0.9 and θ♭_1 = 1/1.1          Laplace or t(4)
                          θ♯_1 = 1/0.8 and θ♭_1 = 1/2            Laplace or t(4)

For each process, 200 samples of size n = 50 and of size n = 100 are generated, and four estimators are computed for each sample: the MLE by EM, the conditional MLE, the joint MLE, and the approximate MLE proposed by Lii and Rosenblatt (abbreviated L&R MLE). As in the simulations of Lii and Rosenblatt (1992), we use q_trunc = 10 as the truncation number for computing the approximate MLE for both sample sizes, which leads to an effective data length n - 2q_trunc of 30 for n = 50 and of 80 for n = 100. The performance of the four estimators under the various settings is summarized in Tables 2-
5 in terms of biases, standard errors, and root mean square errors. The standard error based on the asymptotic distribution (ASD) for each process is also given at the bottom of each table for reference. Note that, in Tables 4 and 5, the estimation results contain two parts, (a) and (b): in part (a) the parameters are estimated under the true distribution t(4) with σ unknown, while in part (b) the parameters are estimated under the Laplace distribution with σ unknown, which leads to least absolute deviation estimation.

***Add results for MA(1) here***

Generally speaking, for the MA(2) processes, the MLE solved by EM outperforms the other estimators in terms of smaller root mean square errors when n = 50, especially for the scale parameter. The second-best estimator is the joint MLE in most of the cases with n = 50. Of the remaining two estimators, the L&R MLE is better at estimating the MA coefficients, and the conditional MLE is better at estimating the scale parameter. Note that, compared with the other estimators, the L&R MLE produces very poor estimates of the scale parameter for small samples, probably because of the small effective data length resulting from the truncation in the objective function; this improves as the sample size increases. Roughly speaking, the four estimators are fairly competitive when n = 100, which confirms their asymptotic equivalence. Across parameter settings, the efficiency gain of the MLE solved by EM relative to the L&R MLE is larger when the roots of the MA polynomials are closer to the unit circle. To sum up, the MLE solved by EM has the best performance among the four estimators considered for small samples. From the computational point of view, however, the joint MLE is a good alternative for small samples, especially when the roots of the MA polynomials are closer to the unit circle.
Table 2: Biases, standard errors, and root mean square errors of various likelihood-based estimators for a non-invertible MA(2) satisfying X_t = (1 + (1/0.9)B)(1 + (1/1.1)B)Z_t, where {Z_t} are iid Laplace with σ = 1; 200 replicates. Columns: r_1 = 0.9, r_2 = 1.1, and σ = 1, for n = 50 and for n = 100. Rows: bias, sd, and rmse of the L&R MLE, MLE by EM, joint MLE, and conditional MLE, with the asymptotic standard deviation (ASD) in the last row. (Table entries omitted.)

Table 3: Biases, standard errors, and root mean square errors of various likelihood-based estimators for a non-invertible MA(2) satisfying X_t = (1 + (1/0.8)B)(1 + (1/2)B)Z_t, where {Z_t} are iid Laplace with σ = 1; 200 replicates. Same layout as Table 2, with columns r_1 = 0.8, r_2 = 2, and σ = 1. (Table entries omitted.)
Table 4: Biases, standard errors, and root mean square errors of various likelihood-based estimators for a non-invertible MA(2) satisfying X_t = (1 + (1/0.9)B)(1 + (1/1.1)B)Z_t, where {Z_t} are iid t(4) with σ = 1; 200 replicates. Part (a): parameters estimated under f = t(4); part (b): parameters estimated under f = Laplace. Columns: r_1 = 0.9, r_2 = 1.1, and σ = 1, for n = 50 and for n = 100. Rows: bias, sd, and rmse of the L&R MLE, MLE by EM, joint MLE, and conditional MLE, with the ASD in the last row of part (a). (Table entries omitted.)
Table 5: Biases, standard errors, and root mean square errors of various likelihood-based estimators for a non-invertible MA(2) satisfying X_t = (1 + (1/0.8)B)(1 + (1/2)B)Z_t, where {Z_t} are iid t(4) with σ = 1; 200 replicates. Part (a): parameters estimated under f = t(4); part (b): parameters estimated under f = Laplace. Same layout as Table 4, with columns r_1 = 0.8, r_2 = 2, and σ = 1. (Table entries omitted.)
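The summary quantities reported in Tables 2-5 (bias, standard error, and root mean square error over replicated estimates) can be computed as in the following sketch; the helper name is ours:

```python
import numpy as np

def summarize(estimates, true_value):
    """Bias, standard deviation, and root mean square error of replicated
    parameter estimates: the three quantities reported in Tables 2-5."""
    est = np.asarray(estimates, dtype=float)
    bias = est.mean() - true_value
    sd = est.std(ddof=1)                          # sample standard deviation
    rmse = np.sqrt(np.mean((est - true_value) ** 2))
    return bias, sd, rmse

# e.g. three replicated estimates of a parameter whose true value is 1.0:
bias, sd, rmse = summarize([1.0, 1.2, 0.8], 1.0)
```

With 200 replicates per setting, each table cell is one such summary for one estimator, one parameter, and one sample size.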
5 Discussion

We have proposed an algorithm to compute the exact MLE for invertible or non-invertible moving averages with non-Gaussian noise. Instead of truncating the unobserved {X_t} in the residual representation to approximate the likelihood (Lii and Rosenblatt, 1992), we augment suitable latent variables to construct a joint likelihood, and we then solve the exact MLE by the EM algorithm. In addition, two alternative estimators are suggested, corresponding to different treatments of the latent variables. Both alternatives are much easier to implement, and both are asymptotically equivalent to the MLE. Simulation results show that the exact MLE solved by the EM algorithm performs better than the other estimators, in terms of smaller root mean square errors, in small samples of non-invertible non-Gaussian MA processes.

6 Appendix

Lii and Rosenblatt (1992) first derive the asymptotic distribution of the quasi-estimator of the parameters, which maximizes the pseudo-likelihood defined in (8) of Lii and Rosenblatt (1992). Since the pseudo-likelihood depends on unobserved {X_t} and cannot be computed in practice, they further show that the approximate MLE, which maximizes the truncated version of the pseudo-likelihood defined in their (9), is asymptotically equivalent to the quasi-estimator. In the following, we show that the conditional MLE and the joint MLE defined in (11) and (12) are also asymptotically equivalent to the quasi-estimator.

For notational simplicity, we give the proof only for a non-invertible MA(2) with r = s = 1, satisfying θ♯(B) = 1 + θ♯B and θ♭(B) = 1 + θ♭B with |θ♯| > 1 and |θ♭| < 1. The parameters in this particular case are ψ = (θ♯, θ♭, σ).

First, some assumptions and notation are given. The assumptions on the density of the innovations {Z_t}, with f_σ(z) = f(z/σ)/σ, are:

A1: f(x) > 0 for all x.
A2: f ∈ C².
A3: f′ ∈ L¹ with ∫ f′(x) dx = 0.
A4: ∫ x f′(x) dx = -1.
A5: ∫ f″(x) dx = 0.
A6: ∫ x f″(x) dx = 0.
A7: ∫ x² f″(x) dx = 2.
A8: ∫ (1 + x²) [f′(x)]² / f(x) dx < ∞.
A9: |u(z + h) - u(z)| ≤ A (1 + |z|^k)(|h| + |h|^l) for all z, h, with positive constants k, l, A, for u =
f′/f and u = (f′/f)′. Before the proof, we first give two lemmas, stated without proof.
Lemma 1 (Exercise 8.8 in Chapter 1 of Durrett, 2005). Assume {X_t ; t = 1, 2, ...} are iid non-zero random variables. Then Σ_{t≥1} X_t z^t converges almost surely for |z| < 1 if and only if E log⁺|X_t| < ∞, where log⁺x = max{log x, 0}.

Lemma 2 (Proposition 3.1.1 in Brockwell and Davis, 1996). Assume {X_t} is any sequence of random variables such that sup_t E|X_t| < ∞. If Σ_j |ψ_j| < ∞, then Σ_j ψ_j X_{t-j} converges almost surely.

Proof. Define W_t = θ♭(B)Z_t and V_t = θ♯(B)Z_t; consequently, X_t = θ♯(B)W_t = θ♭(B)V_t. Given the initial values (w_n, z_{-1}), the backward and forward recursions (6) and (4) give

w_{n-j} = Σ_{l=1}^{j} (-1)^{l-1} (θ♯)^{-l} X_{n-j+l} + (-1)^j (θ♯)^{-j} w_n,  j = 1, 2, ..., n,

z_k = Σ_{i=0}^{k} (-θ♭)^i w_{k-i} + (-θ♭)^{k+1} z_{-1}
   = Σ_{i=0}^{k} Σ_{l=1}^{n-k+i} (-θ♭)^i (-1)^{l-1} (θ♯)^{-l} X_{k-i+l}
     + (-1)^{n-k} (θ♯)^{-(n-k)} [1 - (θ♭/θ♯)^{k+1}] / (1 - θ♭/θ♯) · w_n + (-θ♭)^{k+1} z_{-1},  (13)

for k = 0, 1, ..., n. Since the residuals {z_t} are functions of (θ, X_n, w_n, z_{-1}), they are written z_t(θ, x_n, z_{-1}, w_n) for completeness, and abbreviated z_t(θ) ≡ z_t(θ, x_n, z_{-1}, w_n) and ẑ_t(θ) ≡ z_t(θ, x_n, ẑ_{-1}, ŵ_n) for simplicity, where (z_{-1}, w_n) are the actual initial variables associated with X_n and (ẑ_{-1}, ŵ_n) is some estimator of (z_{-1}, w_n). For instance, (ẑ_{-1}, ŵ_n) = (0, 0) in the conditional MLE case, while (ẑ_{-1}, ŵ_n) is a function of X_n in the joint MLE case.

Similar to Lii and Rosenblatt (1992), let Q_ε = {ψ ∈ R^{q+1} : ‖ψ - ψ_0‖ ≤ ε} be a neighborhood of ψ_0 = (θ_0, σ_0), where ψ_0 is the true parameter vector and ‖·‖ is the max norm on R^{q+1}. Then, for small ε > 0, there exists d ∈ (0, 1) such that

max{1/|θ♯|, |θ♭|} < d  (14)

on Q_ε, which is obtained directly from (3.2) in Lii and Rosenblatt (1992). Moreover, according to (3.3) in Lii and Rosenblatt (1992), we have sup_{Q_ε} |z_t(θ) - z_t(θ_0)| ≤ C_1 ε^{1/2} Σ_{j≥0} d^j |X_{t-j}| for some C_1 > 0. From (13) and (14), we have

ẑ_t(θ) - z_t(θ) = (-1)^{n-t} (θ♯)^{-(n-t)} [1 - (θ♭/θ♯)^{t+1}] / (1 - θ♭/θ♯) · (ŵ_n - w_n) + (-θ♭)^{t+1} (ẑ_{-1} - z_{-1}),
13 and sup ẑ t θ z t θ 2 d 2 dn t ŵ n w n + d t+ ẑ z 5 Also, the first derivatives of ẑ t θ z t θ satisfy θ ẑ t θ z t θ = n t+ [ t + θ t + t θ t+ ] θ 2 ŵ n w n + t + θ t ẑ z, ẑ t θ z t θ n t+2 = θ [ t ] t+ + t + θ t θ n t+ θ 2 n t θ θ2 t+ θ / ŵ n w n, and therefore, sup θ ẑ tθ z t θ 2n t + dn t+ + 2td n+t+ d 2 2 ŵ n w n + t + d t ẑ z 6 Similarly, the second derivative of ẑ t θ z t θ satisfies sup 2 ẑ t θ z t θ θ 4 ŵ n w n { d 2 4 n t + 2d n t+2 + [2t + n t]d n+t} 7 In the following proof, we also need the derivatives of z t θ which satisfy z t θ θ = θ [θ B] W t = z t θ = θ B BW 2 t = θ B θ B W t = θ B Z t = θz j t j, j=0 [ [θ B] V t = θ θ B Z t = B ] B Z t = B j B j j+ Z t = Z t+j, j=0 j=0 θ B θ B V t = θ B θ B Z t 2 = θb j j k+ B k Z t, j=0 k=0 2 z t θ = θ θ B Z t = 3
in which all of the partial derivatives with respect to θ have the form

Σ_j γ_j Z_{t-j},  (18)

for some {γ_j} that decay exponentially.

Lii and Rosenblatt (1992) derive the asymptotic distribution of the maximizer of the pseudo-likelihood

l_n(ψ) = -n log(σ|θ♯|) + Σ_{t=-q+1}^{n} log f(z_t(θ)/σ),

in which the residuals are functions of (X_n, Z_r, W_s) and can be represented as infinite series in {X_t}. This maximizer is ideal but cannot be computed in practice, since only finitely many of the {X_t} are observable. They further show that the maximizer of a computable truncated version of l_n(ψ), denoted l_q(ψ), is asymptotically equivalent to the quasi-MLE. In this paper, we consider the objective function

l̂_n(ψ) = -n log(σ|θ♯|) + Σ_{t=-q+1}^{n} log f(ẑ_t(θ)/σ),

where ẑ_t depends on the initial values (ẑ_{-1}, ŵ_n). When (ẑ_{-1}, ŵ_n) = (0, 0), the corresponding estimator is the conditional MLE defined in (11). On the other hand, the joint MLE defined in (12) is also a special case, maximizing l̂_n(ψ) with (ẑ_{-1}, ŵ_n) functions of X_n. Our goal is to show that the conditional MLE, the joint MLE, and the quasi-MLE are asymptotically equivalent.

Following the proof given by Lii and Rosenblatt (1992), we first show that

sup_{Q_ε} n^{-1} |l_n(ψ) - l̂_n(ψ)| → 0, a.s., as n → ∞,  (19)

and then show that

n^{-1/2} Σ_{t=-q+1}^{n} [ ∂/∂ψ log f_σ(ẑ_t(θ)) - ∂/∂ψ log f_σ(z_t(θ)) ] |_{ψ=ψ_0} → 0  (20)

in probability, where ψ = (θ♯, θ♭, σ) and ψ_0 = (θ♯_0, θ♭_0, σ_0), and that

n^{-1} | B̂(ψ) - B(ψ_0) | → 0,  (21)

where

B(ψ) = [ B_θθ(ψ)  B_θσ(ψ) ; B_θσ(ψ)′  B_σσ(ψ) ] = Σ_{t=-q+1}^{n} ∂²/∂ψ∂ψ′ log f_σ(z_t(θ)),

and B̂(ψ) is defined similarly, but with ẑ_t(θ) in place of z_t(θ).
To show (19), write

n^{-1} |l_n(ψ) - l̂_n(ψ)| ≤ n^{-1} Σ_{t=-q+1}^{n} | log f(z_t(θ)/σ) - log f(ẑ_t(θ)/σ) |
= n^{-1} Σ_t (1/σ) |z_t(θ) - ẑ_t(θ)| |(f′/f)(z̃_t(θ)/σ)|
≤ n^{-1} Σ_t (1/σ) |z_t(θ) - ẑ_t(θ)| |(f′/f)(z_t(θ_0)/σ_0)|
  + n^{-1} Σ_t (1/σ) |z_t(θ) - ẑ_t(θ)| |(f′/f)(z̃_t(θ)/σ) - (f′/f)(z_t(θ_0)/σ_0)|
≡ A_1 + A_2,

where a first-order Taylor expansion is used with z̃_t(θ) = z_t(θ) + u_t(θ)(ẑ_t(θ) - z_t(θ)) for some u_t(θ) ∈ [0, 1]; i.e., z̃_t(θ) is between z_t(θ) and ẑ_t(θ). According to (15), we have

sup_{Q_ε} A_1 ≤ n^{-1} Σ_t (σ_0 - ε)^{-1} |(f′/f)(z_t(θ_0)/σ_0)| { (2/(1-d²)) d^{n-t} |ŵ_n - w_n| + d^{t+1} |ẑ_{-1} - z_{-1}| }
= ( 2 |ŵ_n - w_n| / ((1-d²)(σ_0-ε)) ) n^{-1} Σ_t d^{n-t} |(f′/f)(z_t(θ_0)/σ_0)|
  + ( d |ẑ_{-1} - z_{-1}| / (σ_0-ε) ) n^{-1} Σ_t d^t |(f′/f)(z_t(θ_0)/σ_0)|.  (22)

Since {(f′/f)(z_t(θ_0)/σ_0) : t = 1, ..., n} are iid random variables with

E [ (f′/f)(z_t(θ_0)/σ_0) ]² = ∫ [f′(z)/f(z)]² f(z) dz = ∫ [f′(z)]²/f(z) dz < ∞

by Assumption A8, Lemma 2 implies that both series in (22) converge almost surely; therefore A_1 → 0 almost surely. On the other hand,

z̃_t(θ) - z_t(θ_0) = z_t(θ) - z_t(θ_0) + u_t(θ)(ẑ_t(θ) - z_t(θ))

implies

|z̃_t(θ) - z_t(θ_0)| ≤ |z_t(θ) - z_t(θ_0)| + |ẑ_t(θ) - z_t(θ)|,
and from Assumption A9 with u = f′/f, A_2 satisfies

sup_{Q_ε} A_2 ≤ (A/(n(σ_0-ε)²)) Σ_t ( 1 + |z_t(θ_0)|^k/(σ_0-ε)^k ) sup |z_t(θ) - ẑ_t(θ)| sup |z_t(θ) - z_t(θ_0)|
+ (A/(n(σ_0-ε)²)) Σ_t ( 1 + |z_t(θ_0)|^k/(σ_0-ε)^k ) sup |z_t(θ) - ẑ_t(θ)|²
+ (A/(n(σ_0-ε)^{l+1})) Σ_t ( 1 + |z_t(θ_0)|^k/(σ_0-ε)^k ) sup |z_t(θ) - ẑ_t(θ)| ( 2^l sup |z_t(θ) - z_t(θ_0)|^l + 2^l sup |ẑ_t(θ) - z_t(θ)|^l )
≡ A_21 + A_22 + A_23,

where A_21, A_22, and A_23 are further treated below, and each sup is taken over Q_ε. According to (3.3) in Lii and Rosenblatt (1992),

A_21 ≤ ( C_1 ε^{1/2} A / (n(σ_0-ε)²) ) Σ_t ( 1 + |z_t(θ_0)|^k/(σ_0-ε)^k ) ( Σ_j d^j |X_{t-j}| ) { (2/(1-d²)) d^{n-t} |ŵ_n - w_n| + d^{t+1} |ẑ_{-1} - z_{-1}| }
= ( 2 C_1 ε^{1/2} A |ŵ_n - w_n| / ((1-d²)(σ_0-ε)²) ) n^{-1} Σ_t d^{n-t} ( 1 + |z_t(θ_0)|^k/(σ_0-ε)^k ) Σ_j d^j |X_{t-j}|
+ ( C_1 ε^{1/2} A d |ẑ_{-1} - z_{-1}| / (σ_0-ε)² ) n^{-1} Σ_t d^t ( 1 + |z_t(θ_0)|^k/(σ_0-ε)^k ) Σ_j d^j |X_{t-j}|,

where the summations in the last expression can be shown to converge almost surely as follows. By the Cauchy-Schwarz inequality,

Σ_t d^t ( 1 + |z_t(θ_0)|^k/(σ_0-ε)^k ) Σ_j d^j |X_{t-j}|
≤ ( Σ_t d^t ( 1 + |z_t(θ_0)|^k/(σ_0-ε)^k )² )^{1/2} ( Σ_t d^t ( Σ_j d^j |X_{t-j}| )² )^{1/2},  (23)
where the first series in (23) converges almost surely by Lemma 1, since

E log⁺ ( 1 + |z_t(θ_0)|^k/(σ_0-ε)^k )² = 2 E log⁺ ( 1 + |z_t(θ_0)|^k/(σ_0-ε)^k )
≤ 2 E [ log⁺( 1 + |z_t(θ_0)|^k/(σ_0-ε)^k ) 1{|z_t(θ_0)| ≤ σ_0-ε} ] + 2 E [ log⁺( 1 + |z_t(θ_0)|^k/(σ_0-ε)^k ) 1{|z_t(θ_0)| > σ_0-ε} ]
≤ 2 log 2 · P( |z_t(θ_0)| ≤ σ_0-ε ) + 2 E [ log( 2 |z_t(θ_0)|^k/(σ_0-ε)^k ) 1{|z_t(θ_0)| > σ_0-ε} ]
≤ 2 log 2 + 2k E | log( |z_t(θ_0)|/(σ_0-ε) ) | < ∞,

and the second series in (23) also converges almost surely by Lemma 2, with

E ( Σ_j d^j |X_{t-j}| )² ≤ E X_t² ( Σ_j d^j )² < ∞.

Therefore, A_21 converges almost surely. Similarly, for A_22 we have

A_22 ≤ (A/(n(σ_0-ε)²)) Σ_t ( 1 + |z_t(θ_0)|^k/(σ_0-ε)^k ) { (2/(1-d²)) d^{n-t} |ŵ_n - w_n| + d^{t+1} |ẑ_{-1} - z_{-1}| }²
≤ ( 4A |ŵ_n - w_n|² / ((1-d²)²(σ_0-ε)²) ) n^{-1} Σ_t d^{2(n-t)} ( 1 + |z_t(θ_0)|^k/(σ_0-ε)^k )
+ ( A d² |ẑ_{-1} - z_{-1}|² / (σ_0-ε)² ) n^{-1} Σ_t d^{2t} ( 1 + |z_t(θ_0)|^k/(σ_0-ε)^k )
+ ( 4A d |ŵ_n - w_n| |ẑ_{-1} - z_{-1}| / ((1-d²)(σ_0-ε)²) ) n^{-1} Σ_t d^{n+1} ( 1 + |z_t(θ_0)|^k/(σ_0-ε)^k ).

Similar to the derivation for A_21, the three series in the last expression all converge almost surely by Lemma 1, and consequently A_22 converges almost surely. For A_23, we have

A_23 ≤ ( 2^l C_1^l ε^{l/2} A / (n(σ_0-ε)^{l+1}) ) Σ_t ( 1 + |z_t(θ_0)|^k/(σ_0-ε)^k ) ( Σ_j d^j |X_{t-j}| )^l { (2/(1-d²)) d^{n-t} |ŵ_n - w_n| + d^{t+1} |ẑ_{-1} - z_{-1}| }
+ ( 2^l A / (n(σ_0-ε)^{l+1}) ) Σ_t ( 1 + |z_t(θ_0)|^k/(σ_0-ε)^k ) { (2/(1-d²)) d^{n-t} |ŵ_n - w_n| + d^{t+1} |ẑ_{-1} - z_{-1}| }^{l+1},

where the second series converges in an obvious manner and, by Lemma 2, the first series converges almost surely if

E ( Σ_j d^j |X_{t-j}| )^{2l} < ∞.  (24)
Consequently, A_23 converges almost surely. The proof of (19) is complete, since

sup_{Q_ε} n^{-1} |l_n(ψ) - l̂_n(ψ)| ≤ sup_{Q_ε} (A_1 + A_2) → 0, as n → ∞.

We next show (20). First, consider the partial derivative with respect to θ:

| ∂/∂θ log f_σ(ẑ_t(θ)) - ∂/∂θ log f_σ(z_t(θ)) |
= (1/σ) | (f′/f)(ẑ_t(θ)/σ) ∂ẑ_t(θ)/∂θ - (f′/f)(z_t(θ)/σ) ∂z_t(θ)/∂θ |
≤ (1/σ) |(f′/f)(z_t(θ)/σ)| ‖ ∂(ẑ_t(θ) - z_t(θ))/∂θ ‖
+ (1/σ) ‖ ∂z_t(θ)/∂θ ‖ |(f′/f)(ẑ_t(θ)/σ) - (f′/f)(z_t(θ)/σ)|
+ (1/σ) ‖ ∂(ẑ_t(θ) - z_t(θ))/∂θ ‖ |(f′/f)(ẑ_t(θ)/σ) - (f′/f)(z_t(θ)/σ)|
≡ B_1t + B_2t + B_3t.  (25)

Accordingly, the θ-derivative part of (20) can be bounded by the sum of three series corresponding to the decomposition (25). By (16), the first term of (20) subject to B_1t satisfies

n^{-1/2} Σ_t B_1t |_{ψ=ψ_0} ≤ n^{-1/2} (1/σ_0) Σ_t |(f′/f)(z_t(θ_0)/σ_0)| { ( 2(n-t+1) d^{n-t+1} + 2t d^{n+t+1} ) / (1-d²)² |ŵ_n - w_n| + (t+1) d^t |ẑ_{-1} - z_{-1}| },

in which, by Lemma 2, all of the series converge almost surely, since Σ_t t d^t < ∞ and E|(f′/f)(z_t(θ_0)/σ_0)| < ∞; therefore n^{-1/2} Σ_t B_1t |_{ψ=ψ_0} converges to zero almost surely. According to (18), the second term of (20) subject to B_2t satisfies

n^{-1/2} Σ_t B_2t |_{ψ=ψ_0} = n^{-1/2} (1/σ_0) Σ_t | (f′/f)(ẑ_t(θ_0)/σ_0) - (f′/f)(z_t(θ_0)/σ_0) | ‖ ∂z_t(θ)/∂θ |_{θ=θ_0} ‖
≤ n^{-1/2} (A/σ_0) Σ_t ( 1 + |z_t(θ_0)/σ_0|^k ) ( |ẑ_t(θ_0) - z_t(θ_0)|/σ_0 + |ẑ_t(θ_0) - z_t(θ_0)|^l/σ_0^l ) | Σ_j γ_j Z_{t-j} |.

Following similar arguments to those used for A_21, n^{-1/2} Σ_t B_2t |_{ψ=ψ_0} and n^{-1/2} Σ_t B_3t |_{ψ=ψ_0} both converge to zero almost surely by Lemma 2, since

E ( Σ_j γ_j Z_{t-j} )² ≤ E Z_t² ( Σ_j |γ_j| )² < ∞,
where the {γ_j} defined in (18) decay exponentially. Finally, the third term of (20) subject to B_3t satisfies

n^{-1/2} Σ_t B_3t |_{ψ=ψ_0}
= n^{-1/2} (1/σ_0) Σ_t | (f′/f)(ẑ_t(θ_0)/σ_0) - (f′/f)(z_t(θ_0)/σ_0) | ‖ ∂(ẑ_t(θ) - z_t(θ))/∂θ |_{θ=θ_0} ‖
≤ n^{-1/2} (A/σ_0) Σ_t ( 1 + |z_t(θ_0)/σ_0|^k ) ( |ẑ_t(θ_0) - z_t(θ_0)|/σ_0 + |ẑ_t(θ_0) - z_t(θ_0)|^l/σ_0^l )
  × { ( 2(n-t+1) d^{n-t+1} + 2t d^{n+t+1} ) / (1-d²)² |ŵ_n - w_n| + (t+1) d^t |ẑ_{-1} - z_{-1}| }
→ 0, as n → ∞,

under arguments similar to those used for A_22.

To complete (20), we then consider the partial derivative with respect to σ:

| ∂/∂σ log f_σ(ẑ_t(θ)) - ∂/∂σ log f_σ(z_t(θ)) |
= (1/σ²) | ẑ_t(θ) (f′/f)(ẑ_t(θ)/σ) - z_t(θ) (f′/f)(z_t(θ)/σ) |
≤ (1/σ²) |(f′/f)(z_t(θ)/σ)| |ẑ_t(θ) - z_t(θ)|
+ (|z_t(θ)|/σ²) |(f′/f)(ẑ_t(θ)/σ) - (f′/f)(z_t(θ)/σ)|
+ (1/σ²) |ẑ_t(θ) - z_t(θ)| |(f′/f)(ẑ_t(θ)/σ) - (f′/f)(z_t(θ)/σ)|
≡ B_4t + B_5t + B_6t.  (26)

Accordingly, the σ-derivative part of (20) can be bounded by the sum of three series corresponding to the decomposition (26). Similar to the convergence derivation for A_1, Σ_t B_4t and Σ_t B_6t at ψ = ψ_0 both converge almost surely. Also, since

z_t(θ_0) = π(B) X_t = Σ_j π_j X_{t-j},

where π(B) = [θ♭(B)]^{-1} [θ♯_s B^s θ̃(B)]^{-1} with Σ_j |π_j| < ∞, the upper bound for Σ_t B_5t at ψ = ψ_0 converges almost surely under arguments similar to those for B_2t. Therefore, (20) holds.

It remains to show (21). First, write

n^{-1} ( B̂(ψ) - B(ψ_0) ) = n^{-1} ( B(ψ) - B(ψ_0) ) + n^{-1} ( B̂(ψ) - B(ψ) ).

Lii and Rosenblatt (1992) have shown that n^{-1} ( B(ψ) - B(ψ_0) ) → 0 in probability as ψ → ψ_0. In the following, we only need to show that n^{-1} ( B̂(ψ) - B(ψ) ) → 0 in probability.
Since

∂²/∂θ∂θ′ log f_σ(z_t(θ)) = (1/σ) (f′/f)(z_t(θ)/σ) ∂²z_t(θ)/∂θ∂θ′ + (1/σ²) (f′/f)′(z_t(θ)/σ) (∂z_t(θ)/∂θ)(∂z_t(θ)/∂θ)′,

we have

‖ B̂_θθ(ψ) - B_θθ(ψ) ‖ ≤ Σ_t ‖ ∂²/∂θ∂θ′ log f_σ(ẑ_t(θ)) - ∂²/∂θ∂θ′ log f_σ(z_t(θ)) ‖
≤ (1/σ) Σ_t |(f′/f)(ẑ_t(θ)/σ)| ‖ ∂²(ẑ_t(θ) - z_t(θ))/∂θ∂θ′ ‖
+ (1/σ) Σ_t |(f′/f)(ẑ_t(θ)/σ) - (f′/f)(z_t(θ)/σ)| ‖ ∂²z_t(θ)/∂θ∂θ′ ‖
+ (1/σ²) Σ_t |(f′/f)′(ẑ_t(θ)/σ)| ‖ (∂ẑ_t/∂θ)(∂ẑ_t/∂θ)′ - (∂z_t/∂θ)(∂z_t/∂θ)′ ‖
+ (1/σ²) Σ_t |(f′/f)′(ẑ_t(θ)/σ) - (f′/f)′(z_t(θ)/σ)| ‖ (∂z_t/∂θ)(∂z_t/∂θ)′ ‖,  (27)

where the difference of outer products in the third term can be further bounded via

‖ (∂ẑ_t/∂θ)(∂ẑ_t/∂θ)′ - (∂z_t/∂θ)(∂z_t/∂θ)′ ‖ ≤ ‖ ∂(ẑ_t - z_t)/∂θ ‖² + 2 ‖ ∂z_t/∂θ ‖ ‖ ∂(ẑ_t - z_t)/∂θ ‖.

Using techniques similar to those in the previous proof, it can be shown that each series in the decomposition (27) converges almost surely. Similarly, since

∂²/∂σ² log f_σ(z_t(θ)) = 1/σ² + (2 z_t(θ)/σ³) (f′/f)(z_t(θ)/σ) + (z_t²(θ)/σ⁴) (f′/f)′(z_t(θ)/σ),

we have

| B̂_σσ(ψ) - B_σσ(ψ) | ≤ (1/σ⁴) Σ_t | ẑ_t²(θ) (f′/f)′(ẑ_t(θ)/σ) - z_t²(θ) (f′/f)′(z_t(θ)/σ) |
+ (2/σ³) Σ_t | ẑ_t(θ) (f′/f)(ẑ_t(θ)/σ) - z_t(θ) (f′/f)(z_t(θ)/σ) |.  (28)

Ignoring the scalar 1/σ⁴, the first term on the right-hand side of (28) can be bounded by

Σ_t | (f′/f)′(ẑ_t(θ)/σ) - (f′/f)′(z_t(θ)/σ) | z_t²(θ) + Σ_t | (f′/f)′(ẑ_t(θ)/σ) | | ẑ_t²(θ) - z_t²(θ) |,

with | ẑ_t²(θ) - z_t²(θ) | ≤ | ẑ_t(θ) - z_t(θ) | ( | ẑ_t(θ) - z_t(θ) | + 2 |z_t(θ)| ).
Analogous to the previous proof, it can be shown that the series associated with each term in the above decomposition converges with probability one. The second term on the right-hand side of (28) is, up to a scalar, identical to the quantity

Σ_t | ∂/∂σ log f_σ(ẑ_t(θ)) - ∂/∂σ log f_σ(z_t(θ)) |,

which also converges with probability one using the same decomposition given in (26). Finally, we consider B_θσ. Since

∂²/∂θ∂σ log f_σ(z_t(θ)) = -(z_t(θ)/σ³) (f′/f)′(z_t(θ)/σ) ∂z_t(θ)/∂θ - (1/σ²) (f′/f)(z_t(θ)/σ) ∂z_t(θ)/∂θ,

we have

‖ B̂_θσ(ψ) - B_θσ(ψ) ‖ ≤ Σ_t ‖ ∂²/∂θ∂σ log f_σ(ẑ_t(θ)) - ∂²/∂θ∂σ log f_σ(z_t(θ)) ‖
≤ (1/σ³) Σ_t | ẑ_t(θ) (f′/f)′(ẑ_t(θ)/σ) | ‖ ∂(ẑ_t - z_t)/∂θ ‖
+ (1/σ³) Σ_t | ẑ_t(θ) (f′/f)′(ẑ_t(θ)/σ) - z_t(θ) (f′/f)′(z_t(θ)/σ) | ‖ ∂z_t/∂θ ‖
+ (1/σ²) Σ_t | (f′/f)(ẑ_t(θ)/σ) | ‖ ∂(ẑ_t - z_t)/∂θ ‖
+ (1/σ²) Σ_t | (f′/f)(ẑ_t(θ)/σ) - (f′/f)(z_t(θ)/σ) | ‖ ∂z_t/∂θ ‖.  (29)

Similarly, it can be shown that the series associated with each term in the decomposition (29) converges with probability one. Consequently, B̂(ψ) - B(ψ) converges with probability one, and n^{-1} ( B̂(ψ) - B(ψ) ) → 0.

From (19), as n → ∞, there exists a consistent sequence of estimators {ψ̂_n ∈ Q_ε} satisfying the proposed likelihood equations. By Taylor expansion, we have

0 = n^{-1/2} [ ∂ l̂_n(ψ)/∂ψ ] |_{ψ=ψ̂_n}
= n^{-1/2} Σ_t [ ∂/∂ψ log f_σ(ẑ_t(θ)) ] |_{ψ=ψ_0} + n^{-1} B̂(ψ*) n^{1/2} ( ψ̂_n - ψ_0 )
= n^{-1/2} Σ_t [ ∂/∂ψ log f_σ(z_t(θ)) ] |_{ψ=ψ_0} + n^{-1/2} Σ_t [ ∂/∂ψ log f_σ(ẑ_t(θ)) - ∂/∂ψ log f_σ(z_t(θ)) ] |_{ψ=ψ_0}
+ n^{-1} B(ψ_0) n^{1/2} ( ψ̂_n - ψ_0 ) + n^{-1} ( B̂(ψ*) - B(ψ_0) ) n^{1/2} ( ψ̂_n - ψ_0 ),

where ψ* is between ψ̂_n and ψ_0. According to (20) and (21),

0 = n^{-1/2} Σ_t [ ∂/∂ψ log f_σ(z_t(θ)) ] |_{ψ=ψ_0} + n^{-1} B(ψ_0) n^{1/2} ( ψ̂_n - ψ_0 ) + o_p(1),

which implies

n^{1/2} ( ψ̂_n - ψ_0 ) = -[ n^{-1} B(ψ_0) ]^{-1} n^{-1/2} Σ_t [ ∂/∂ψ log f_σ(z_t(θ)) ] |_{ψ=ψ_0} + o_p(1) → N(0, Σ),

where Σ is defined in (7) and (8) of Lii and Rosenblatt (1992).

References

Benveniste, A., Goursat, M., and Ruget, G. (1980). Robust identification of a nonminimum phase system: blind adjustment of a linear equalizer in data communications. IEEE Transactions on Automatic Control AC-25.
Blass, W.E. and Halsey, G.W. (1981). Deconvolution of Absorption Spectra. Academic Press, New York.
Breidt, F.J., Davis, R.A., and Trindade, A.A. (2001). Least absolute deviation estimation for all-pass time series models. Annals of Statistics 29.
Breidt, F.J. and Hsu, N.-J. (2005). Best mean square prediction for moving averages. Statistica Sinica 15, 427-446.
Brockwell, P.J. and Davis, R.A. (1991). Time Series: Theory and Methods, 2nd ed. Springer-Verlag, New York.
Brockwell, P.J. and Davis, R.A. (1996). Introduction to Time Series and Forecasting. Springer-Verlag, New York.
Chi, C.-Y. and Kung, J.-Y. (1995). A new identification algorithm for allpass systems by higher-order statistics. Signal Processing 41.
Chien, H.-M., Yang, H.-L., and Chi, C.-Y. (1997). Parametric cumulant-based phase estimation of 1-d and 2-d nonminimum phase systems by allpass filtering. IEEE Transactions on Signal Processing 45.
Donoho, D. (1981). On minimum entropy deconvolution. In Applied Time Series Analysis II (D.F. Findley, ed.), Academic Press, New York.
Durrett, R. (2005). Probability: Theory and Examples, 3rd ed.
Evans, M. and Swartz, T. (2000). Approximating Integrals via Monte Carlo and Deterministic Methods. Oxford University Press, Oxford.
Gelman, A., Carlin, J.B., Stern, H.S., and Rubin, D.B. (1995). Bayesian Data Analysis. Chapman & Hall, London.
Giannakis, G.B. and Swami, A. (1990). On estimating noncausal nonminimum phase ARMA models of non-Gaussian processes. IEEE Transactions on Acoustics, Speech, and Signal Processing 38.
Godfrey, R. and Rocca, F. (1981). Zero memory non-linear deconvolution. Geophysical Prospecting 29.
Hildebrand, F.B. (1968). Finite-Difference Equations and Simulations. Prentice-Hall, Englewood Cliffs, NJ.
Hsueh, A.-C. and Mendel, J.M. (1985). Minimum-variance and maximum-likelihood deconvolution for noncausal channel models. IEEE Transactions on Geoscience and Remote Sensing 23.
Huang, J. and Pawitan, Y. (2000). Quasi-likelihood estimation of non-invertible moving average processes. Scandinavian Journal of Statistics 27.
Lii, K.-S. and Rosenblatt, M. (1982). Deconvolution and estimation of transfer function phase and coefficients for non-Gaussian linear processes. Annals of Statistics 10.
Lii, K.-S. and Rosenblatt, M. (1992). An approximate maximum likelihood estimation for non-Gaussian non-minimum phase moving average processes. Journal of Multivariate Analysis 43.
Lii, K.-S. and Rosenblatt, M. (1996). Maximum likelihood estimation for non-Gaussian nonminimum phase ARMA sequences. Statistica Sinica 6, 1-22.
Ooe, M. and Ulrych, T.J. (1979). Minimum entropy deconvolution with an exponential transformation. Geophysical Prospecting 27.
Rabiner, L.R. and Schafer, R.W. (1978). Digital Processing of Speech Signals. Prentice-Hall, Englewood Cliffs, NJ.
Rosenblatt, M. (2000). Gaussian and Non-Gaussian Linear Time Series and Random Fields. Springer-Verlag, New York.
Scargle, J.D. (1981). Phase-sensitive deconvolution to model random processes, with special reference to astronomical data. In Applied Time Series Analysis II (D.F. Findley, ed.), Academic Press, New York.
Tierney, L. (1994). Markov chains for exploring posterior distributions (with discussion). Annals of Statistics 22.
Wiggins, R.A. (1978). Minimum entropy deconvolution. Geoexploration 16.
Published version: Hsu, N.-J. and Breidt, F. J. (2009). Exact maximum likelihood estimation for non-Gaussian moving averages. Statistica Sinica 19, 545-560.