POLYNOMIAL FILTERING FOR LINEAR DISCRETE TIME NON-GAUSSIAN SYSTEMS


SIAM J. Control and Optimization, Vol. 34, No. 5, pp. 1666-1690, September 1996. © 1996 Society for Industrial and Applied Mathematics.

POLYNOMIAL FILTERING FOR LINEAR DISCRETE TIME NON-GAUSSIAN SYSTEMS

FRANCESCO CARRAVETTA, ALFREDO GERMANI, AND MASSIMO RAIMONDI

Abstract. In this work we propose a new filtering approach for linear discrete time non-Gaussian systems that generalizes a previous result concerning quadratic filtering [A. De Santis, A. Germani, and M. Raimondi, IEEE Trans. Automat. Control, 40 (1995), pp. 1274-1278]. A recursive νth-order polynomial estimate of finite memory is achieved by defining a suitable extended state which allows one to solve the filtering problem via the classical Kalman linear scheme. The resulting estimate is the mean square optimal one among those estimators that take into account ν-polynomials of the last observations. Numerical simulations show the effectiveness of the proposed method.

Key words. nonlinear filtering, polynomial estimates, recursive estimates, non-Gaussian systems

AMS subject classifications. 93E10, 93E11

1. Introduction. In this paper the state estimation problem for linear non-Gaussian systems is considered. In many important technical areas the widely used Gaussian assumption cannot be accepted as a realistic statistical description of the random quantities involved. As shown in various papers (see for instance [1, 2]), increasing attention has been paid in control engineering to non-Gaussian systems, and the importance of parameter and state estimation problems is plainly evidenced. In these cases the conditional expectation, which gives the optimal minimum variance estimate, cannot generally be computed, so that it is necessary to look for suboptimal estimates that are easier to achieve, such as the optimal linear one.

In recent years, the signal filtering and detection problems in the presence of non-Gaussian noise have been widely investigated with different signal models and statistical settings. Non-Gaussian problems often arise in digital communications when the noise interference includes components that are essentially non-Gaussian (a common situation below 100 MHz [6]). Neglecting these components is a major source of error in communication system design. In [3, 4] the existence of stable filters for a class of nonlinear stochastic systems is studied, where the nonlinearity is defined not by its deterministic structure but by its statistical properties. In [5] the Bayesian approach to nonlinear parameter estimation is considered, and the cost of computing the posterior density description is investigated when the Bayes formula is recursively applied. In telecommunication systems the detection problem in the presence of non-Gaussian noises is extensively addressed in [6]-[12], while in [13] a general abstract setting is considered for high-order statistical processing (Volterra filters). A first attempt at the definition of a polynomial filter, which in some sense generalizes the Kalman approach, is described in [14], where, in particular, an instantaneous polynomial function of the innovation process constitutes the forcing term for the linear dynamics of the filter. The computation of the polynomial coefficients, which generalizes the Kalman gain to the non-Gaussian case, remains the main problem. In [15] linear recursive estimation is dealt with for stochastic signals having multiplicative noise, and in [16] for linear discrete time systems with stochastic parameters. In [17] an asymptotic minimum variance algorithm is described for parameter estimation in non-Gaussian moving average (MA) and autoregressive moving average (ARMA) processes, using sample high-order statistics.
Received by the editors July 28, 1993; accepted for publication (in revised form) May 31, 1995. This work was partially supported by MURST.

Dipartimento di Informatica e Sistemistica, Università di Roma "La Sapienza", Via Eudossiana 18, 00184 Roma, Italy.

Dipartimento di Ingegneria Elettrica, Università dell'Aquila, 67100 Monteluco (L'Aquila), Italy, and Istituto di Analisi dei Sistemi ed Informatica del CNR, Viale Manzoni 30, 00185 Roma, Italy.

Dipartimento di Matematica "G. Castelnuovo", Università di Roma "La Sapienza", Piazzale Aldo Moro 2, 00185 Roma, Italy.

The same problem is studied in [18] by using a fixed set of output cumulants. In [19], on the basis of the knowledge of the output process together with its Kronecker square products, a filter that is linear with respect to such an information process is defined. In this paper we consider the more general polynomial case, where past values of the output process are also considered.

The paper is organized as follows: in §2 we recall some definitions and properties of estimation theory in a geometric framework; moreover, some results on the Kronecker algebra are given. In §3 the non-Gaussian filtering problem is formulated with reference to a linear discrete time system; the augmented state and the corresponding dynamical model generating process are defined. In §4 some theoretical results useful for the practical implementation of the proposed algorithm are reported. Finally, in §5 some numerical examples of application are presented, showing the high performance of the proposed filter with respect to the Kalman one. The paper ends with a concluding remark in §6.

2. Preliminaries.

2.1. Estimates as projections. In this section we consider the mean square optimal (and suboptimal) estimate of a partially observed random variable as a projection onto a suitable $L^2$-subspace. Let $(\Omega, \mathcal{F}, P)$ be a probability space. For any given sub-σ-algebra $\mathcal{G}$ of $\mathcal{F}$, let us denote by $L^2(\mathcal{G}, n)$ the Hilbert space of the $n$-dimensional, $\mathcal{G}$-measurable random variables with finite second moment:

$L^2(\mathcal{G}, n) = \left\{ X : \Omega \to \mathbb{R}^n,\ \mathcal{G}\text{-measurable},\ \int_\Omega \|X(\omega)\|^2\, dP(\omega) < +\infty \right\},$

where $\|\cdot\|$ is the Euclidean norm in $\mathbb{R}^n$. Moreover, when $\mathcal{G}$ is the σ-algebra generated by a random variable $Y : \Omega \to \mathbb{R}^m$, that is, $\mathcal{G} = \sigma(Y)$, we will use the notation $L^2(Y, n)$ to indicate $L^2(\sigma(Y), n)$. Finally, if $M$ is a closed subspace of $L^2(\mathcal{F}, n)$, we will use the symbol $\Pi(X/M)$ to indicate the orthogonal projection of $X \in L^2(\mathcal{F}, n)$ onto $M$.

As is well known, the optimal minimum variance estimate of a random variable $X \in L^2(\mathcal{F}, n)$ with respect to a random variable $Y$, that is, $\Pi(X/L^2(Y, n))$, is given by the conditional expectation (CE) $E(X/Y)$. If $X$ and $Y$ are jointly Gaussian, then the CE is the following affine transformation of $Y$:

(2.1.1)    $E(X/Y) = E(X) + E(X\tilde{Y}^T)\,[E(\tilde{Y}\tilde{Y}^T)]^{-1}\,\tilde{Y},$

where $\tilde{Y} = Y - E(Y)$. Moreover, defining $\bar{Y} = \begin{bmatrix} 1 \\ Y \end{bmatrix}$, (2.1.1) can also be interpreted as the projection onto the subspace

$L(\bar{Y}, n) = \{ Z : \Omega \to \mathbb{R}^n \ / \ \exists\, A \in \mathbb{R}^{n \times (m+1)} \text{ such that } Z = A\bar{Y} \} \subset L^2(\bar{Y}, n) = L^2(Y, n).$

Unfortunately, in the non-Gaussian case no simple characterization of the CE can be achieved. Consequently it is worthwhile to consider suboptimal estimates which have a simpler mathematical structure that allows the treatment of real data. The simplest suboptimal estimate is the optimal affine one, that is, $\Pi(X/L(\bar{Y}, n))$, which is again given by the right-hand side (RHS) of (2.1.1). In the following discussion such an estimate will be denoted by $\hat{X}$ and shortly called the optimal linear one.

Intermediate estimates between the optimal linear one and the CE can be considered by projecting onto subspaces larger than $L(\bar{Y}, n)$, such as subspaces of polynomial transformations of $Y$. In order to proceed this way, we need to state some results on the Kronecker products [20], which constitute a powerful tool in treating vector polynomials.

2.2. The Kronecker algebra.

DEFINITION 2.2.1. Let $M$ and $N$ be matrices of dimensions $r \times s$ and $p \times q$, respectively. Then the Kronecker product $M \otimes N$ is defined as the $(r \cdot p) \times (s \cdot q)$ matrix

$M \otimes N = \begin{bmatrix} m_{11} N & \cdots & m_{1s} N \\ \vdots & & \vdots \\ m_{r1} N & \cdots & m_{rs} N \end{bmatrix},$

where the $m_{ij}$ are the entries of $M$. Of course this kind of product is not commutative.

DEFINITION 2.2.2. Let $M$ be the $r \times s$ matrix

(2.2.1)    $M = [\, m^1 \ \ m^2 \ \cdots \ m^s \,],$

where $m^i$ denotes the $i$th column of $M$. Then the stack of $M$ is the $r \cdot s$ vector

(2.2.2)    $\mathrm{st}(M) = \begin{bmatrix} m^1 \\ m^2 \\ \vdots \\ m^s \end{bmatrix}.$

Observe that a vector as in (2.2.2) can be brought back to the matrix $M$ of (2.2.1) by the inverse operation of the stack, denoted by $\mathrm{st}^{-1}$. We refer to [20, Chap. 12] for the main properties of the Kronecker product and stack operation. It is easy to verify that for $u \in \mathbb{R}^r$, $v \in \mathbb{R}^s$, the $i$th entry of $u \otimes v$ is given by

(2.2.3)    $(u \otimes v)_i = u_l v_m, \qquad l = \left[\frac{i-1}{s}\right] + 1, \quad m = (i-1)_s + 1,$

where $[\,\cdot\,]$ and $(\cdot)_s$ denote the integer part and the $s$-modulo, respectively. Moreover, the Kronecker power of $M$ is defined as

$M^{[0]} = 1 \in \mathbb{R}, \qquad M^{[l]} = M \otimes M^{[l-1]}, \quad l \geq 1.$

Even if the Kronecker product is not commutative in general, the following result holds [24].

THEOREM 2.2.3. For any given pair of matrices $A \in \mathbb{R}^{r \times s}$, $B \in \mathbb{R}^{n \times m}$, we have

(2.2.4)    $B \otimes A = C^T_{r,n}\,(A \otimes B)\, C_{s,m},$

where $C_{r,n}$, $C_{s,m}$ are suitable 0-1 matrices. It is possible to show that $C_{u,v}$ is the $(u \cdot v) \times (u \cdot v)$ matrix whose $(h, l)$ entry is given by

(2.2.5)    $\{C_{u,v}\}_{h,l} = \begin{cases} 1 & \text{if } l = (h-1)_v\, u + \left[\frac{h-1}{v}\right] + 1; \\ 0 & \text{otherwise}. \end{cases}$

Observe that $C_{1,1} = 1$; hence in the vector case, when $a \in \mathbb{R}^r$ and $b \in \mathbb{R}^n$, (2.2.4) becomes

(2.2.6)    $b \otimes a = C^T_{r,n}\,(a \otimes b).$
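The constructions above are directly machine-checkable. The following Python/NumPy sketch is ours (the paper itself reports a Mathematica implementation): it encodes the stack operation (2.2.2), the entry formula (2.2.3), and the commutation matrix (2.2.5), and verifies the identity (2.2.6) on random vectors. All function names are our own.

```python
# Sketch of the Kronecker-algebra tools of Section 2.2 (function names are ours).
import numpy as np

def stack(M):
    """st(M) of (2.2.2): the columns of M stacked into a single vector."""
    return M.reshape(-1, order='F')

def unstack(v, rows):
    """st^{-1}: rebuild an r x s matrix from its stack."""
    return v.reshape(rows, -1, order='F')

def commutation(u, v):
    """C_{u,v} of (2.2.5): the 0-1 matrix with b (x) a = C_{u,v}^T (a (x) b)."""
    C = np.zeros((u * v, u * v))
    for h in range(1, u * v + 1):
        l = ((h - 1) % v) * u + (h - 1) // v + 1
        C[h - 1, l - 1] = 1.0
    return C

rng = np.random.default_rng(0)
a, b = rng.normal(size=3), rng.normal(size=2)            # a in R^3, b in R^2
assert np.allclose(np.kron(b, a),
                   commutation(3, 2).T @ np.kron(a, b))  # identity (2.2.6)

# Entry formula (2.2.3): (u (x) v)_i = u_l v_m
u_, v_ = rng.normal(size=4), rng.normal(size=3)
r, s = 4, 3
for i in range(1, r * s + 1):
    l, m = (i - 1) // s + 1, (i - 1) % s + 1
    assert np.isclose(np.kron(u_, v_)[i - 1], u_[l - 1] * v_[m - 1])

M = rng.normal(size=(2, 3))
assert np.allclose(unstack(stack(M), 2), M)
```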

Moreover, in the vector case the commutation matrices also satisfy the following recursive formula.

LEMMA 2.2.4. For any $a, b \in \mathbb{R}^n$ and for any $l = 1, 2, \ldots$, let $G_l = C^T_{n,n^l}$, so that

(2.2.7)    $b^{[l]} \otimes a = G_l\,(a \otimes b^{[l]}).$

Then the sequence $\{G_l\}$ satisfies the following equations:

(2.2.8)    $G_1 = C^T_{n,n},$

(2.2.9)    $G_l = (I_1 \otimes G_{l-1})(G_1 \otimes I_{l-1}), \qquad l > 1,$

where $I_r$ is the identity matrix in $\mathbb{R}^{n^r \times n^r}$.

Proof. Equation (2.2.6) assures the existence of the $G_l$'s and implies (2.2.8). Moreover, using the associative property of the Kronecker product and recalling the identity

$(A \otimes C)(B \otimes D) = (AB) \otimes (CD)$

with $A = I_1$, we have

$b^{[l]} \otimes a = b \otimes (b^{[l-1]} \otimes a) = b \otimes \big( G_{l-1}(a \otimes b^{[l-1]}) \big) = (I_1 \otimes G_{l-1})(b \otimes a \otimes b^{[l-1]}) = (I_1 \otimes G_{l-1})\big( (G_1(a \otimes b)) \otimes b^{[l-1]} \big) = (I_1 \otimes G_{l-1})(G_1 \otimes I_{l-1})(a \otimes b^{[l]}).$

Then equation (2.2.9) follows immediately by using (2.2.7). □

We can also find a binomial formula for the Kronecker power which generalizes the classical Newton one.

THEOREM 2.2.5. For any integer $h \geq 0$ the matrix coefficients of the binomial power formula

(2.2.10)    $(a + b)^{[h]} = \sum_{k=0}^{h} M^h_k\,(a^{[k]} \otimes b^{[h-k]})$

constitute a set of matrices $\{M^h_0, \ldots, M^h_h\}$ such that

(2.2.11)    $M^h_h = M^h_0 = I_h,$

(2.2.12)    $M^h_j = (M^{h-1}_j \otimes I_1) + (M^{h-1}_{j-1} \otimes I_1)(I_{j-1} \otimes G_{h-j}), \qquad 1 \leq j \leq h-1,$

where $G_l$ and $I_l$ are as in Lemma 2.2.4.

Proof. Equation (2.2.11) is obviously true for any $h$. We will prove (2.2.12) by induction for $h \geq 2$. For $h = 2$ it results in

(2.2.13)    $(a + b)^{[2]} = a^{[2]} + a \otimes b + b \otimes a + b^{[2]} = a^{[2]} + (I_2 + G_1)(a \otimes b) + b^{[2]},$

where (2.2.7) has been used. Moreover, using (2.2.12) we obtain

$M^2_1 = (M^1_1 \otimes I_1) + (M^1_0 \otimes I_1)(I_0 \otimes G_1) = I_2 + I_2 G_1 = I_2 + G_1,$

so that the matrix coefficient of $a \otimes b$ in (2.2.10) (which is equal to $I_2 + G_1$ by (2.2.13)) agrees with the matrix $M^2_1$ computed by using (2.2.12).
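Both recursions can be checked numerically. The sketch below (ours) builds $G_l$ from (2.2.8)-(2.2.9) and $M^h_j$ from (2.2.11)-(2.2.12), and verifies (2.2.7) and the binomial formula (2.2.10) for $n = 2$, $h = 3$; here `commutation` is the matrix of (2.2.5).

```python
# Verification sketch for Lemma 2.2.4 and Theorem 2.2.5 (our code).
import numpy as np
from functools import reduce

def commutation(u, v):
    C = np.zeros((u * v, u * v))
    for h in range(1, u * v + 1):
        C[h - 1, ((h - 1) % v) * u + (h - 1) // v] = 1.0
    return C

def kpow(x, k):
    """Kronecker power x^{[k]}, with x^{[0]} = 1."""
    return reduce(np.kron, [x] * k, np.ones(1))

def G(n, l):
    """G_l of (2.2.7), built by the recursion (2.2.8)-(2.2.9)."""
    if l == 1:
        return commutation(n, n).T
    return np.kron(np.eye(n), G(n, l - 1)) @ np.kron(G(n, 1), np.eye(n ** (l - 1)))

def M(n, h, j):
    """Binomial matrices M_j^h of (2.2.10)-(2.2.12)."""
    if j in (0, h):
        return np.eye(n ** h)
    return np.kron(M(n, h - 1, j), np.eye(n)) + \
           np.kron(M(n, h - 1, j - 1), np.eye(n)) @ \
           np.kron(np.eye(n ** (j - 1)), G(n, h - j))

n, h = 2, 3
rng = np.random.default_rng(1)
a, b = rng.normal(size=n), rng.normal(size=n)
# (2.2.7) for l = 2: b^{[2]} (x) a = G_2 (a (x) b^{[2]})
assert np.allclose(np.kron(kpow(b, 2), a), G(n, 2) @ np.kron(a, kpow(b, 2)))
# (2.2.10): (a + b)^{[h]} = sum_k M_k^h (a^{[k]} (x) b^{[h-k]})
lhs = kpow(a + b, h)
rhs = sum(M(n, h, k) @ np.kron(kpow(a, k), kpow(b, h - k)) for k in range(h + 1))
assert np.allclose(lhs, rhs)
```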

Now suppose that (2.2.12) is true for $h \geq 2$; we will prove that it is true for $h + 1$. We have

$(a+b)^{[h+1]} = (a+b)^{[h]} \otimes (a+b) = \left( \sum_{k=0}^{h} M^h_k\,(a^{[k]} \otimes b^{[h-k]}) \right) \otimes (a+b)$

$= \sum_{k=0}^{h} \Big( (M^h_k \otimes I_1)(a^{[k]} \otimes b^{[h-k]} \otimes a) + (M^h_k \otimes I_1)(a^{[k]} \otimes b^{[h+1-k]}) \Big)$

$= \sum_{k=0}^{h} \Big( (M^h_k \otimes I_1)\big( a^{[k]} \otimes (G_{h-k}(a \otimes b^{[h-k]})) \big) + (M^h_k \otimes I_1)(a^{[k]} \otimes b^{[h+1-k]}) \Big)$

$= \sum_{k=0}^{h} (M^h_k \otimes I_1)(I_k \otimes G_{h-k})(a^{[k+1]} \otimes b^{[h-k]}) + \sum_{k=0}^{h} (M^h_k \otimes I_1)(a^{[k]} \otimes b^{[h+1-k]}).$

Hence, taking into account (2.2.10), we have

$M^{h+1}_j = (M^h_j \otimes I_1) + (M^h_{j-1} \otimes I_1)(I_{j-1} \otimes G_{h+1-j}), \qquad 1 \leq j \leq h. \qquad \square$

2.3. Polynomial estimates. Let $X \in L^2(\mathcal{F}, n)$, $Y \in L^2(\mathcal{F}, m)$ be random variables and, moreover, suppose that for some integer $i$, $\int_\Omega \|Y\|^{2i}\, dP < +\infty$. Then we can define the $i$th-order polynomial estimate of $X$ as $\Pi(X/L(Y^i, n))$, where $Y^i \in L^2(\mathcal{F}, 1 + m + \cdots + m^i)$ is given by

$Y^i = \begin{bmatrix} 1 \\ Y \\ \vdots \\ Y^{[i]} \end{bmatrix}.$

Note that $L(\bar{Y}, n) = L(Y^1, n) \subset \cdots \subset L(Y^{i-1}, n) \subset L(Y^i, n)$, so that a polynomial estimate improves (in terms of error variance) the performance of the linear one. Observe, moreover, that the previous estimate has the form

(2.3.1)    $\sum_{l=0}^{i} c_l\, Y^{[l]}, \qquad c_l \in \mathbb{R}^{n \times m^l},$

which justifies the term polynomial used in this paper. If $\int_\Omega \|Y\|^{2i}\, dP < +\infty$ for all $i \in \mathbb{N}$, let $H$ be defined as the $L^2$-closure of $\bigcup_{i=0}^{+\infty} L(Y^i, n)$. Then the CE can be decomposed as

(2.3.2)    $E(X/Y) = \Pi(X/H) + \Pi(X/H^{\perp}),$

where the first term of the RHS of (2.3.2) is the $L^2$-limit of a sequence of polynomials of $Y$. In particular, such a sequence can be obtained by projecting $X$ onto the subspaces $L(Y^i, n)$, so that the difficulty in computing the CE is moved to the second term of the RHS of (2.3.2). In any case we can compute the coefficients in (2.3.1) of any finite-rank polynomial approximation of the term $\Pi(X/H)$ by using the linear estimate formula given by the RHS of (2.1.1).
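In practice, then, the $i$th-order polynomial estimate is computed by applying the linear formula to the aggregate vector $Y^i$. The toy Monte Carlo sketch below is ours, with arbitrary distributions chosen only to be non-Gaussian; it illustrates how the sample mean square error decreases as the degree $i$ grows, in accordance with the inclusions $L(Y^1, n) \subset L(Y^2, n) \subset \cdots$.

```python
# Polynomial estimates as linear estimates of the aggregate Y^i (Section 2.3).
import numpy as np

rng = np.random.default_rng(2)
N = 200_000
y = rng.exponential(1.0, N) - 1.0         # scalar non-Gaussian observation
x = y ** 2 + 0.1 * rng.normal(size=N)     # variable to be estimated

for i in (1, 2, 3):
    Yi = np.vstack([y ** l for l in range(i + 1)])    # rows: 1, Y, ..., Y^{[i]}
    # RHS of (2.1.1) applied to Y^i, with moments replaced by sample averages:
    c = (x @ Yi.T) @ np.linalg.inv(Yi @ Yi.T)         # projection coefficients
    mse = np.mean((x - c @ Yi) ** 2)
    print(f"degree {i}: sample MSE = {mse:.4f}")      # decreases as i grows
```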

3. Problem formulation.

3.1. The system to be filtered. Let us consider the filtering problem for the following class of linear discrete time systems:

(3.1.1)    $x(k+1) = Ax(k) + FN(k), \qquad x(0) = \bar{x},$

(3.1.2)    $y(k) = Cx(k) + GN(k),$

where $x(k) \in \mathbb{R}^n$, $y(k) \in \mathbb{R}^m$, $N(k) \in \mathbb{R}^u$, $A \in \mathbb{R}^{n \times n}$, $C \in \mathbb{R}^{m \times n}$, $F \in \mathbb{R}^{n \times u}$, $G \in \mathbb{R}^{m \times u}$. The random variable $\bar{x}$ (the initial condition) and the random sequence $\{N(k)\}$ satisfy the following conditions for $k \geq 0$:

(i) $E\{\bar{x}\} = 0$, $E\{N(k)\} = 0$;

(ii) there exists an integer $\nu \geq 1$ such that for any given multi-index $i_1, \ldots, i_L \in \{1, \ldots, u\}$, $j_1, \ldots, j_L \in \{1, \ldots, n\}$, $1 \leq L \leq 2\nu$, we have

(3.1.3)    $\mathcal{F}(i_1, \ldots, i_L) = E\{ N_{i_1}(k) N_{i_2}(k) \cdots N_{i_L}(k) \} < \infty,$

(3.1.4)    $\mathcal{X}(j_1, \ldots, j_L) = E\{ \bar{x}_{j_1} \bar{x}_{j_2} \cdots \bar{x}_{j_L} \} < \infty;$

(iii) the sequence $\{N(k)\}$ forms with $\bar{x}$ a family of independent random variables.

3.2. Recursive estimates. It is well known that the optimal mean square estimate of the state $x(k)$ of the linear system (3.1.1), (3.1.2) with respect to the observations up to time $k$ is given by the conditional expectation

(3.2.1)    $\hat{x}(k) = E(x(k)/\mathcal{F}^y_k),$

where $\mathcal{F}^y_k$ is the σ-algebra generated by $\{y(\tau),\ \tau \leq k\}$. Hence there exists a Borel function $F$ such that $\hat{x}(k) = F(y(\tau), \tau \leq k)$. As we have already seen in §2, the computation of $F$ could be very difficult and, in general, does not produce a recursive algorithm, so it does not turn out to be very useful from an application point of view. If we are interested only in an optimal linear estimate, then we can also express the above estimate in the general recursive form

(3.2.2)    $\hat{x}(k) = F(k, \hat{x}(k-1), y(k)).$

In fact, the well-known Kalman filter, which gives the optimal linear estimate of the state, is expressed as in (3.2.2) with a linear transformation $F$. More generally, we can consider the set of recursive Borel transformations of finite memory $\delta$, that is,

(3.2.3)    $\hat{x}(k) = \rho(k, \hat{x}(k-1), y(k), y(k-1), \ldots, y(k-\delta)).$

In order to realize (3.2.3) we will adopt the larger class of recursive functions

(3.2.4)    $\hat{x}(k) = T\xi(k), \qquad \xi(k) = \Phi(k, \xi(k-1), y(k), y(k-1), \ldots, y(k-\delta)),$

where $\xi(k) \in L^2(\mathcal{F}, \bar{n})$, $\bar{n} \geq n$, and $T$ is the (linear) operator that extracts the first $n$ components of $\xi(k)$. In particular, the method that will be proposed allows us to obtain an estimate of the form

(3.2.5)    $\hat{x}(k) = T\xi(k), \qquad \xi(k) = L(k)\xi(k-1) + P(y(k), y(k-1), \ldots, y(k-\delta)),$

where $L(k) \in \mathbb{R}^{\bar{n} \times \bar{n}}$ and $P$ is a polynomial transformation. One way to justify (3.2.5) is that a similar form is optimal in some interesting cases [25].

3.3. The extended system. In order to obtain a recursive estimate as in (3.2.5), as a first step we introduce the following extended vectors:

(3.3.1)    $x_e(k) = \begin{bmatrix} x(k) \\ y(k-1) \\ \vdots \\ y(k-\delta) \end{bmatrix} \in \mathbb{R}^q, \qquad y_e(k) = \begin{bmatrix} y(k) \\ y(k-1) \\ \vdots \\ y(k-\delta) \end{bmatrix} \in \mathbb{R}^p,$

with $q = n + \delta m$ and $p = (\delta + 1)m$. The model equations (3.1.1), (3.1.2) become

(3.3.2)    $x_e(k+1) = A_e x_e(k) + F_e N(k), \qquad x_e(0) = \bar{x}_e,$

(3.3.3)    $y_e(k) = C_e x_e(k) + G_e N(k),$

where

(3.3.4)    $A_e = \begin{bmatrix} A & 0 & \cdots & 0 & 0 \\ C & 0 & \cdots & 0 & 0 \\ 0 & I & \cdots & 0 & 0 \\ \vdots & & \ddots & & \vdots \\ 0 & 0 & \cdots & I & 0 \end{bmatrix}, \qquad F_e = \begin{bmatrix} F \\ G \\ 0 \\ \vdots \\ 0 \end{bmatrix}, \qquad \bar{x}_e = \begin{bmatrix} \bar{x} \\ 0 \\ \vdots \\ 0 \end{bmatrix},$

(3.3.5)    $C_e = \begin{bmatrix} C & 0 \\ 0 & I_{\delta m} \end{bmatrix}, \qquad G_e = \begin{bmatrix} G \\ 0 \\ \vdots \\ 0 \end{bmatrix}.$

Moreover, let us define the generalized νth-degree polynomial observation as the vector $Y(k) \in \mathbb{R}^\mu$, $\mu = p + p^2 + \cdots + p^\nu$:

(3.3.6)    $Y(k) = \begin{bmatrix} y_e(k) \\ y_e^{[2]}(k) \\ \vdots \\ y_e^{[\nu]}(k) \end{bmatrix}.$

Finally, let us introduce the extended state $X(k) \in \mathbb{R}^\chi$, $\chi = q + q^2 + \cdots + q^\nu$:

(3.3.7)    $X(k) = \begin{bmatrix} x_e(k) \\ x_e^{[2]}(k) \\ \vdots \\ x_e^{[\nu]}(k) \end{bmatrix}.$
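Under our reading of the block structure (3.3.4), (3.3.5), the augmented matrices can be assembled as in the following sketch (ours), where `delta` plays the role of δ and the example data are arbitrary.

```python
# Assembly of the extended system (3.3.2)-(3.3.5) from (A, C, F, G) and delta.
import numpy as np

def extend(A, C, F, G, delta):
    n, m, u = A.shape[0], C.shape[0], F.shape[1]
    q, p = n + delta * m, (delta + 1) * m
    Ae, Fe = np.zeros((q, q)), np.zeros((q, u))
    Ae[:n, :n] = A                          # x(k+1) = A x(k) + F N(k)
    Ae[n:n + m, :n] = C                     # y(k) = C x(k) + G N(k) enters x_e(k+1)
    Fe[:n], Fe[n:n + m] = F, G
    for b in range(2, delta + 1):           # shift y(k-1), ..., y(k-delta+1) down
        Ae[n + (b - 1) * m: n + b * m, n + (b - 2) * m: n + (b - 1) * m] = np.eye(m)
    Ce, Ge = np.zeros((p, q)), np.zeros((p, u))
    Ce[:m, :n], Ge[:m] = C, G               # first block of y_e(k) is y(k)
    Ce[m:, n:] = np.eye(delta * m)          # remaining blocks copy stored outputs
    return Ae, Fe, Ce, Ge

A = np.array([[0.1, 0.3], [0.12, 0.1]]); C = np.array([[0.7, 0.3]])
F = np.eye(2); G = np.array([[0.5, 0.5]])
Ae, Fe, Ce, Ge = extend(A, C, F, G, delta=2)   # here q = 4, p = 3
```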

In the following discussion we will denote by $M^i_j(l)$ the binomial matrices (2.2.11), (2.2.12), highlighting the dependence on the dimension $l$ of the vectors involved, and the symbol $I_{i,j}$ will denote the identity in $\mathbb{R}^{i^j \times i^j}$.

In order to obtain a recursive filter we need to write an evolution equation for the extended state $X(k)$ and another one that links it to $Y(k)$. For this purpose we can prove the following important result.

LEMMA 3.3.1. Let, on the same probability space, $\{z(k), k \geq 0\}$ and $\{N(k), k \geq 0\}$ be random sequences in $\mathbb{R}^\alpha$ and $\mathbb{R}^\beta$, respectively, such that for all $k$, $N(k)$ is independent of $\{z(k), z(j), N(j), j < k\}$. Moreover, let us assume

(3.3.8)    $w(k) = \hat{W} z(k) + \hat{B} N(k),$

where $w(k) \in \mathbb{R}^\gamma$ and $\hat{W}$, $\hat{B}$ are suitably dimensioned deterministic matrices. Consider the Kronecker powers of $w(k)$ and $z(k)$ up to the νth order, aggregated in the vectors

$W(k) = \begin{bmatrix} w(k) \\ w^{[2]}(k) \\ \vdots \\ w^{[\nu]}(k) \end{bmatrix}, \qquad Z(k) = \begin{bmatrix} z(k) \\ z^{[2]}(k) \\ \vdots \\ z^{[\nu]}(k) \end{bmatrix},$

and let

$O = \begin{bmatrix} \hat{W} & 0 & \cdots & 0 \\ O_{2,1} & \hat{W}^{[2]} & \cdots & 0 \\ \vdots & & \ddots & \\ O_{\nu,1} & O_{\nu,2} & \cdots & \hat{W}^{[\nu]} \end{bmatrix}, \qquad T = \begin{bmatrix} \hat{B}\, E(N(k)) \\ \hat{B}^{[2]} E(N^{[2]}(k)) \\ \vdots \\ \hat{B}^{[\nu]} E(N^{[\nu]}(k)) \end{bmatrix},$

where

(3.3.9)    $O_{i,l} = M^i_{i-l}(\gamma)\,(\hat{B}^{[i-l]} \otimes \hat{W}^{[l]})\,\big( E(N^{[i-l]}(k)) \otimes I_{\alpha,l} \big).$

Then there exists the representation

(3.3.10)    $W(k) = O\, Z(k) + T + \mathcal{N}(k),$

where

(3.3.11)    $\mathcal{N}(k) = \begin{bmatrix} h_1(k) \\ h_2(k) \\ \vdots \\ h_\nu(k) \end{bmatrix}$

and

$h_i(k) = \sum_{l=0}^{i-1} M^i_{i-l}(\gamma)\,(\hat{B}^{[i-l]} \otimes \hat{W}^{[l]})\,\big( (N^{[i-l]}(k) - E(N^{[i-l]}(k))) \otimes I_{\alpha,l} \big)\, z^{[l]}(k).$

Moreover, $\{\mathcal{N}(k)\}$ is a zero-mean white sequence such that for all $k$, $\mathcal{N}(k)$ is uncorrelated with $\{Z(j), j \leq k\}$, with covariance $S(k)$ whose $(r, s)$-block is given by

(3.3.12)    $S_{r,s} = E(h_s(k) h_r(k)^T) = \sum_{l=0}^{r-1} \sum_{m=0}^{s-1} M^r_{r-l}(\gamma)\,(\hat{B}^{[r-l]} \otimes \hat{W}^{[l]})\; \mathrm{st}^{-1}\Big( \big( (I_{\beta,s-m} \otimes C^T_{\beta^{r-l},\alpha^m}) \otimes I_{\alpha,l} \big)\, \big( (E(N^{[s+r-m-l]}(k)) - E(N^{[s-m]}(k)) \otimes E(N^{[r-l]}(k))) \otimes C_{1,\alpha^m} \otimes I_{\alpha,l} \big)\, E(z^{[l+m]}(k)) \Big)\, (\hat{B}^{[s-m]} \otimes \hat{W}^{[m]})^T\, (M^s_{s-m}(\gamma))^T,$

provided that all the moments involved exist finite.
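Before turning to the proof, the second-order block of the representation (3.3.10) can be checked numerically. In the sketch below (ours) the noise is deliberately taken with nonzero mean so that the term $O_{2,1}$ does not vanish; `M12` is $M^2_1(\gamma) = I_2 + G_1$, and the first two moments of the exponential noise are known in closed form.

```python
# Pathwise check of the i = 2 block of Lemma 3.3.1: for w = W z + B N,
# w^{[2]} = W^{[2]} z^{[2]} + O_{2,1} z + B^{[2]} E(N^{[2]}) + h_2.
import numpy as np

def commutation(u, v):
    C = np.zeros((u * v, u * v))
    for h in range(1, u * v + 1):
        C[h - 1, ((h - 1) % v) * u + (h - 1) // v] = 1.0
    return C

alpha, beta, gamma = 2, 2, 2
rng = np.random.default_rng(3)
W, B = rng.normal(size=(gamma, alpha)), rng.normal(size=(gamma, beta))
M12 = np.eye(gamma ** 2) + commutation(gamma, gamma).T   # M_1^2 = I_2 + G_1

z = rng.normal(size=alpha)
N = rng.exponential(1.0, beta)                   # iid exp(1): E(N_i) = 1
EN = np.ones(beta)
EN2 = (np.ones((beta, beta)) + np.eye(beta)).reshape(-1)  # E(N_i N_j) = 1 + delta_ij
w = W @ z + B @ N

O21 = M12 @ np.kron(B, W) @ np.kron(EN[:, None], np.eye(alpha))  # (3.3.9), i=2, l=1
h2 = np.kron(B, B) @ (np.kron(N, N) - EN2) \
   + M12 @ np.kron(B, W) @ np.kron((N - EN)[:, None], np.eye(alpha)) @ z
rhs = np.kron(W, W) @ np.kron(z, z) + O21 @ z + np.kron(B, B) @ EN2 + h2
assert np.allclose(np.kron(w, w), rhs)
```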

Proof. Taking the $i$th Kronecker power of both members of (3.3.8), we have

(3.3.13)    $w^{[i]}(k) = \big( \hat{W} z(k) + \hat{B} N(k) \big)^{[i]},$

which can be expanded by using Theorem 2.2.5, so that

(3.3.14)    $w^{[i]}(k) = (\hat{W}z(k))^{[i]} + \sum_{j=1}^{i} M^i_j(\gamma)\big( (\hat{B}N(k))^{[j]} \otimes (\hat{W}z(k))^{[i-j]} \big) = \hat{W}^{[i]} z^{[i]}(k) + \sum_{j=1}^{i} M^i_j(\gamma)\,(\hat{B}^{[j]} \otimes \hat{W}^{[i-j]})\,(N^{[j]}(k) \otimes z^{[i-j]}(k)) = \hat{W}^{[i]} z^{[i]}(k) + \sum_{l=0}^{i-1} M^i_{i-l}(\gamma)\,(\hat{B}^{[i-l]} \otimes \hat{W}^{[l]})\,(N^{[i-l]}(k) \otimes I_{\alpha,l})\, z^{[l]}(k),$

from which (3.3.10) follows. Now let us consider the above-defined augmented noise $\mathcal{N}(k)$. From the independence of $z(k)$ and $N(k)$ (and hence the independence of $N^{[i-l]}(k) - E(N^{[i-l]}(k))$ and $z^{[l]}(k)$, $l = 0, \ldots, i-1$) the zero-mean property of $\mathcal{N}(k)$ follows, as can be readily verified. To prove the whiteness property, suppose $k > j$. First of all, observe that because $N(k)$, by the hypotheses, is independent of $\{z(k), z(j), N(j), j < k\}$, it follows that $N^{[r-l]}(k) - E(N^{[r-l]}(k))$ is independent of

$z^{[l]}(k)\, z^{[m]T}(j)\, \big( (N^{[s-m]}(j) - E(N^{[s-m]}(j)))^T \otimes I_{\alpha,m} \big);$

then for the $(r, s)$-block of the covariance matrix we have

$\big( E(\mathcal{N}(k)\mathcal{N}(j)^T) \big)_{r,s} = E(h_r(k) h_s(j)^T) = \sum_{l=0}^{r-1} \sum_{m=0}^{s-1} M^r_{r-l}(\gamma)\,(\hat{B}^{[r-l]} \otimes \hat{W}^{[l]})\; E\Big( \big( (N^{[r-l]}(k) - E(N^{[r-l]}(k))) \otimes I_{\alpha,l} \big)\, z^{[l]}(k)\, z^{[m]}(j)^T\, \big( (N^{[s-m]}(j) - E(N^{[s-m]}(j)))^T \otimes I_{\alpha,m} \big) \Big)\, (\hat{B}^{[s-m]} \otimes \hat{W}^{[m]})^T (M^s_{s-m}(\gamma))^T = 0,$

because $N^{[r-l]}(k) - E(N^{[r-l]}(k))$ is a zero-mean random variable. Moreover, for $j \leq k$,

$\big( E(\mathcal{N}(k) Z(j)^T) \big)_{r,s} = E(h_r(k)\, z^{[s]}(j)^T) = \sum_{l=0}^{r-1} M^r_{r-l}(\gamma)\,(\hat{B}^{[r-l]} \otimes \hat{W}^{[l]})\; E\big( ((N^{[r-l]}(k) - E(N^{[r-l]}(k))) \otimes I_{\alpha,l})\, z^{[l]}(k)\, z^{[s]}(j)^T \big) = 0,$

which follows, as before, from the independence of the random variables involved.

In order to simplify the notation, let us introduce the following symbols for the calculation of the $(r, s)$-block of the covariance matrix:

(3.3.15)    $M_{u,v} = M^u_{u-v}(\gamma)\,(\hat{B}^{[u-v]} \otimes \hat{W}^{[v]}), \qquad N_{u,v} = N^{[u-v]}(k) - E(N^{[u-v]}(k)), \qquad z_u = z^{[u]}(k),$

where $(u, v) \in \{(r, l), (s, m)\}$. Then we have

(3.3.16)    $E(h_r(k) h_s(k)^T) = \sum_{l=0}^{r-1} \sum_{m=0}^{s-1} M_{r,l}\; E\big( (N_{r,l} \otimes I_{\alpha,l})\, z_l z_m^T\, (N_{s,m} \otimes I_{\alpha,m})^T \big)\; M_{s,m}^T.$

Let us now consider the argument of the expected value in (3.3.16):

(3.3.17)    $(N_{r,l} \otimes I_{\alpha,l})\, z_l z_m^T\, (N_{s,m} \otimes I_{\alpha,m})^T = \mathrm{st}^{-1}\Big( \mathrm{st}\big( (N_{r,l} \otimes I_{\alpha,l})\, z_l z_m^T\, (N_{s,m} \otimes I_{\alpha,m})^T \big) \Big).$

Moreover, by repeated use of Theorem 2.2.3 and of the identity $\mathrm{st}(z_l z_m^T) = z_m \otimes z_l$,

(3.3.18)    $\mathrm{st}\big( (N_{r,l} \otimes I_{\alpha,l})\, z_l z_m^T\, (N_{s,m} \otimes I_{\alpha,m})^T \big) = \big( (N_{s,m} \otimes I_{\alpha,m}) \otimes (N_{r,l} \otimes I_{\alpha,l}) \big)\, \mathrm{st}(z_l z_m^T) = \big( N_{s,m} \otimes (C^T_{\beta^{r-l},\alpha^m}(N_{r,l} \otimes I_{\alpha,m}) C_{1,\alpha^m}) \otimes I_{\alpha,l} \big)(z_m \otimes z_l) = \big( (I_{\beta,s-m} \otimes C^T_{\beta^{r-l},\alpha^m}) \otimes I_{\alpha,l} \big)\, \big( (N_{s,m} \otimes N_{r,l}) \otimes C_{1,\alpha^m} \otimes I_{\alpha,l} \big)\, z_{l+m};$

by substituting the previous expression into (3.3.17) and then into (3.3.16), and taking into account (3.3.15), we obtain formula (3.3.12). □

Now we are able to find the augmented linear stochastic system that generates the observation powers, as stated in the following theorem.

THEOREM 3.3.2. The processes $\{Y(k)\}$ and $\{X(k)\}$ defined in (3.3.6), (3.3.7) satisfy the following equations:

(3.3.19)    $X(k+1) = \mathcal{A} X(k) + U + F(k), \qquad X(0) = \bar{X},$
            $Y(k) = \mathcal{C} X(k) + V + G(k),$

where

$\mathcal{A} = \begin{bmatrix} A_e & 0 & \cdots & 0 \\ H_{2,1} & A_e^{[2]} & \cdots & 0 \\ \vdots & & \ddots & \\ H_{\nu,1} & H_{\nu,2} & \cdots & A_e^{[\nu]} \end{bmatrix}, \qquad U = \begin{bmatrix} 0 \\ F_e^{[2]} E(N^{[2]}(k)) \\ \vdots \\ F_e^{[\nu]} E(N^{[\nu]}(k)) \end{bmatrix}, \qquad \bar{X} = \begin{bmatrix} \bar{x}_e \\ \bar{x}_e^{[2]} \\ \vdots \\ \bar{x}_e^{[\nu]} \end{bmatrix},$

$\mathcal{C} = \begin{bmatrix} C_e & 0 & \cdots & 0 \\ L_{2,1} & C_e^{[2]} & \cdots & 0 \\ \vdots & & \ddots & \\ L_{\nu,1} & L_{\nu,2} & \cdots & C_e^{[\nu]} \end{bmatrix}, \qquad V = \begin{bmatrix} 0 \\ G_e^{[2]} E(N^{[2]}(k)) \\ \vdots \\ G_e^{[\nu]} E(N^{[\nu]}(k)) \end{bmatrix},$

$H_{i,l} = M^i_{i-l}(q)\,(F_e^{[i-l]} \otimes A_e^{[l]})\,\big( E(N^{[i-l]}(k)) \otimes I_{q,l} \big), \qquad L_{i,l} = M^i_{i-l}(p)\,(G_e^{[i-l]} \otimes C_e^{[l]})\,\big( E(N^{[i-l]}(k)) \otimes I_{q,l} \big),$

(3.3.20)    $F(k) = \begin{bmatrix} f_1(k) \\ f_2(k) \\ \vdots \\ f_\nu(k) \end{bmatrix}, \qquad G(k) = \begin{bmatrix} g_1(k) \\ g_2(k) \\ \vdots \\ g_\nu(k) \end{bmatrix},$

$f_i(k) = \sum_{l=0}^{i-1} M^i_{i-l}(q)\,(F_e^{[i-l]} \otimes A_e^{[l]})\,\big( (N^{[i-l]}(k) - E(N^{[i-l]}(k))) \otimes I_{q,l} \big)\, x_e^{[l]}(k),$

$g_i(k) = \sum_{l=0}^{i-1} M^i_{i-l}(p)\,(G_e^{[i-l]} \otimes C_e^{[l]})\,\big( (N^{[i-l]}(k) - E(N^{[i-l]}(k))) \otimes I_{q,l} \big)\, x_e^{[l]}(k),$

and $\{F(k)\}$, $\{G(k)\}$ are zero-mean white sequences such that

(3.3.21)    $E(F(k) G^T(j)) = 0, \qquad k \neq j.$

Moreover, defining

$P^{r,s}_{l,m}(k) = \mathrm{st}^{-1}\Big( \big( (I_{u,s-m} \otimes C^T_{u^{r-l},q^m}) \otimes I_{q,l} \big)\, \big( (E(N^{[s+r-m-l]}(k)) - E(N^{[s-m]}(k)) \otimes E(N^{[r-l]}(k))) \otimes C_{1,q^m} \otimes I_{q,l} \big)\, E(x_e^{[l+m]}(k)) \Big),$

we have, for the auto-covariances $Q(k)$, $R(k)$ of the noises $\{F(k)\}$, $\{G(k)\}$, respectively, and for the cross-covariance

$J(k) = E(F(k) G(k)^T),$

the following formulas:

(3.3.22)    $Q_{r,s}(k) = \sum_{l=0}^{r-1} \sum_{m=0}^{s-1} M^r_{r-l}(q)\,(F_e^{[r-l]} \otimes A_e^{[l]})\, P^{r,s}_{l,m}(k)\,(F_e^{[s-m]} \otimes A_e^{[m]})^T (M^s_{s-m}(q))^T,$

(3.3.23)    $R_{r,s}(k) = \sum_{l=0}^{r-1} \sum_{m=0}^{s-1} M^r_{r-l}(p)\,(G_e^{[r-l]} \otimes C_e^{[l]})\, P^{r,s}_{l,m}(k)\,(G_e^{[s-m]} \otimes C_e^{[m]})^T (M^s_{s-m}(p))^T,$

(3.3.24)    $J_{r,s}(k) = \sum_{l=0}^{r-1} \sum_{m=0}^{s-1} M^r_{r-l}(q)\,(F_e^{[r-l]} \otimes A_e^{[l]})\, P^{r,s}_{l,m}(k)\,(G_e^{[s-m]} \otimes C_e^{[m]})^T (M^s_{s-m}(p))^T,$

where $Q_{r,s}(k) = E(f_r(k) f_s(k)^T)$, $R_{r,s}(k) = E(g_r(k) g_s(k)^T)$, $J_{r,s}(k) = E(f_r(k) g_s(k)^T)$.

Proof. Equations (3.3.19) and formulas (3.3.22), (3.3.23) follow immediately by applying Lemma 3.3.1 to (3.3.2) and (3.3.3). Taking into account the structure (3.3.20) of the noises $F(k)$, $G(k)$, it follows that $\big( E(F(k)G(j)^T) \big)_{r,s}$ is the mean value of a product of terms in the form (3.3.11) (obtained by means of a suitable substitution of $\hat{B}$, $\hat{W}$, $\gamma$, $\alpha$), so that (3.3.21) is easily shown, and with some manipulations similar to (3.3.18) we obtain (3.3.24). □

Given a stochastic process $\{\xi(k), k \geq 0\}$, $\xi(k) \in \mathbb{R}^\alpha$, we say that it is an $h$th-order asymptotically stationary process if for all $i$, $1 \leq i \leq h$, there exists a constant vector $m_i \in \mathbb{R}^{\alpha^i}$ such that

$\lim_{k \to +\infty} E(\xi^{[i]}(k)) = m_i.$

For the sequences $\{F(k)\}$ and $\{G(k)\}$ in (3.3.20) we can show second-order asymptotic stationarity, provided that the original system (3.1.1), (3.1.2) is asymptotically stable, i.e., all the eigenvalues of the matrix $A$ lie in the open unit circle of the complex plane. For now, let us prove the following lemma.

LEMMA 3.3.3. Let the matrix $A$ in (3.1.1) be asymptotically stable. Then the sequence $\{x_e(k)\}$ is a 2νth-order asymptotically stationary sequence.

Proof. Let $m_i(k) = E(x_e^{[i]}(k))$, $i = 1, 2, \ldots, 2\nu$. Taking the $i$th block in the first equation of (3.3.19) we have

$x_e^{[i]}(k+1) = A_e^{[i]} x_e^{[i]}(k) + \sum_{l=1}^{i-1} H_{i,l}\, x_e^{[l]}(k) + H_{i,0} + f_i(k),$

with the $H_{i,l}$'s as in Theorem 3.3.2. Now, taking the expected values of the previous equation, we obtain

$m_i(k+1) = A_e^{[i]} m_i(k) + \sum_{l=1}^{i-1} H_{i,l}\, m_l(k) + H_{i,0},$

and by defining the vectors $m(k)$ and $U_{2\nu}$ as

$m(k) = \begin{bmatrix} m_1(k) \\ m_2(k) \\ \vdots \\ m_{2\nu}(k) \end{bmatrix}, \qquad U_{2\nu} = \begin{bmatrix} 0 \\ H_{2,0} \\ \vdots \\ H_{2\nu,0} \end{bmatrix},$

we can write the recursive equation

(3.3.25)    $m(k+1) = \mathcal{A}_{2\nu}\, m(k) + U_{2\nu},$

where $\mathcal{A}_{2\nu}$ is defined as

$\mathcal{A}_{2\nu} = \begin{bmatrix} A_e & 0 & \cdots & 0 \\ H_{2,1} & A_e^{[2]} & \cdots & 0 \\ \vdots & & \ddots & \\ H_{2\nu,1} & H_{2\nu,2} & \cdots & A_e^{[2\nu]} \end{bmatrix}.$

Equation (3.3.25) is a recursive asymptotically stable equation. Actually, the asymptotic stability of $A$ and the block-triangular structure of $A_e$ imply the asymptotic stability of $A_e$ itself and, hence, of all its Kronecker powers [20]. This in turn implies the asymptotic stability of $\mathcal{A}_{2\nu}$. The lemma is proven by observing that $U_{2\nu}$ is a constant input. □
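For a scalar system the recursion (3.3.25) can be written out explicitly. The sketch below (ours) takes $n = u = 1$, $x(k+1) = a\,x(k) + f\,N(k)$ with zero-mean noise, for which $H_{2,1} = 2af\,E(N(k)) = 0$ and $H_{2,0} = f^2 E(N^2(k))$; the fixed point of the moment recursion is then the familiar steady-state second moment $f^2 E(N^2)/(1 - a^2)$.

```python
# Scalar illustration of Lemma 3.3.3: moments obey m(k+1) = A_2 m(k) + U_2.
import numpy as np

a, f, EN2 = 0.6, 1.0, 2.5                 # |a| < 1: asymptotically stable
A2 = np.array([[a, 0.0],
               [0.0, a * a]])             # H_{2,1} = 2 a f E(N) = 0 here
U2 = np.array([0.0, f * f * EN2])         # H_{2,0} = f^2 E(N^2)
m = np.zeros(2)                           # m(0) = (E x(0), E x(0)^2) = 0
for _ in range(200):
    m = A2 @ m + U2                       # iterate (3.3.25)
m_inf = np.linalg.solve(np.eye(2) - A2, U2)
assert np.allclose(m, m_inf)
assert np.isclose(m_inf[1], f * f * EN2 / (1 - a * a))
```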

THEOREM 3.3.4. The stochastic white sequences $\{F(k)\}$ and $\{G(k)\}$ in (3.3.20) are second-order asymptotically stationary processes, provided that the matrix $A$ in (3.1.1) is asymptotically stable.

Proof. The thesis immediately follows by using Lemma 3.3.3, recalling that $\{F(k)\}$ and $\{G(k)\}$ are zero-mean sequences, and observing that their covariances, which are given by (3.3.22), (3.3.23), attain a finite limit if the first 2ν moments of $x_e(k)$ are convergent. □

Note also that, under the hypotheses of Theorem 3.3.4, the cross-covariance matrix between the augmented noises, given by (3.3.24), is convergent for $k \to +\infty$.

Equation (3.3.19) is a linear model with both deterministic and stochastic forcing terms. Note that each noise is white, but they are correlated with each other at the same instant of time. Moreover, for any $k$, $F(k)$ and $G(k)$ are uncorrelated with the initial augmented state $\bar{X}$, as easily follows by direct calculation. Then for this model it is possible to determine the optimal linear estimate of the extended state $X(k)$ with respect to the extended observations $Y(0), Y(1), \ldots, Y(k)$ by using the Kalman filter in the form which takes into account the cross-correlation between the noises [23]. We can obtain the optimal linear estimate of the original state $x(k)$ with respect to the same set of augmented observations by extracting from $\hat{X}(k)$ the first $n$ components (as can be readily verified by observing the structure of the vectors $x_e$ and $X$). Clearly this operation produces an estimate in the generalized recursive form (3.2.5). In the following we will denote this estimate by $\hat{x}^{(\nu,\delta)}(k)$.

Observe that $\hat{x}^{(\nu,\delta)}(k)$ agrees with the optimal mean square estimate in the (finite-dimensional) Hilbert space $H_{\nu,\delta}$ generated by objects of the form

$\prod_{l=1}^{s} y(i_l), \qquad 0 \leq s \leq \nu, \quad 0 \leq i \leq k, \quad i \leq i_l \leq i + \delta,$

which is a subspace of $L(Y^{k,\nu}, n)$, where

$Y^{k,\nu} = \begin{bmatrix} 1 \\ Y_k \\ \vdots \\ Y_k^{[\nu]} \end{bmatrix}, \qquad Y_k = \begin{bmatrix} y(0) \\ y(1) \\ \vdots \\ y(k) \end{bmatrix}.$

Roughly speaking, we can say that the so-defined estimate approximates the projection of $x(k)$ onto $L(Y^{k,\nu}, n)$, which is the most general mean square optimal polynomial estimate of fixed degree ν. Note that the relations $H_{\nu,\delta} \subset H_{\nu+1,\delta}$ and $H_{\nu,\delta} \subset H_{\nu,\delta+1}$ hold for all $\nu, \delta$; hence, since $\hat{x}^{(\nu,\delta)}(k) = \Pi(x(k)/H_{\nu,\delta})$, the error variance $E(\|\hat{x}^{(\nu,\delta)}(k) - x(k)\|^2)$ decreases when ν or δ increases. Moreover, because

$\hat{x}^{(\nu,\delta)}(k) = \Pi(x(k)/H_{\nu,\delta}) = \Pi\big( \Pi(x(k)/L^2(Y_k, n))/H_{\nu,\delta} \big) = \Pi\big( E(x(k)/Y_k)/H_{\nu,\delta} \big),$

the expression $E(\|\hat{x}^{(\nu,\delta)}(k) - E(x(k)/Y_k)\|^2)$ also decreases when ν or δ increases. To conclude, we can say that the polynomial filter produces an estimate of the state $x(k)$ which is the nearer to the optimal one the larger the parameters ν and δ are chosen.

4. Implementation of the filter. For computational purposes we need to establish the following result.

THEOREM 4.1. Let $z \in \mathbb{R}^n$. Then, for all $k$, the $i$th entry of $z^{[k]}$ is

(4.1)    $(z^{[k]})_i = z_{l_1} z_{l_2} \cdots z_{l_k},$

where

(4.2)    $l_j = \left( \left[ \frac{i-1}{n^{k-j}} \right] \right)_n + 1, \qquad j = 1, 2, \ldots, k-1,$

(4.3)    $l_k = (i-1)_n + 1.$

Proof. For $k = 1$ the theorem is true. Proceeding by induction, from (2.2.3) we obtain

$(z^{[k+1]})_i = (z \otimes z^{[k]})_i = z_{l_1} (z^{[k]})_{m_1}$

with

$l_1 = \left[ \frac{i-1}{n^k} \right] + 1 = \left( \left[ \frac{i-1}{n^{(k+1)-1}} \right] \right)_n + 1, \qquad m_1 = (i-1)_{n^k} + 1,$

as in (4.2) for $k+1$. Moreover, by (4.1), (4.2), and (4.3),

$(z^{[k]})_{m_1} = z_{\hat{l}_1} z_{\hat{l}_2} \cdots z_{\hat{l}_k}$

with

$\hat{l}_j = \left( \left[ \frac{m_1-1}{n^{k-j}} \right] \right)_n + 1, \qquad \hat{l}_k = (m_1 - 1)_n + 1.$

Finally, by denoting $l_j = \hat{l}_{j-1}$ for $j = 2, \ldots, k+1$, we have

$l_j = \left( \left[ \frac{i-1}{n^{(k+1)-j}} \right] \right)_n + 1,$

whereas

$l_{k+1} = (m_1 - 1)_n + 1 = (i-1)_n + 1,$

which proves the theorem. □

Note from (3.3.22), (3.3.23), (3.3.24) that we can evaluate the covariance matrices of the noises $F(k)$, $G(k)$ and their cross-covariance from the moments $E(x_e^{[h]}(k))$ and $E(N^{[h']}(k))$, where $h = 1, 2, \ldots, 2(\nu-1)$ and $h' = 1, 2, \ldots, 2\nu$. From Theorem 4.1 it follows for the $i$th entry of $E(N^{[h']}(k))$ that

$\big( E(N^{[h']}(k)) \big)_i = E\big( (N(k))_{l_1} (N(k))_{l_2} \cdots (N(k))_{l_{h'}} \big) = \mathcal{F}(l_1, l_2, \ldots, l_{h'}).$

In order to evaluate $E(x_e^{[h]}(k))$, noting that from (3.3.25) it results in

$m(k) = \mathcal{A}^k_{2(\nu-1)}\, m(0) + \left( \sum_{i=0}^{k-1} \mathcal{A}^i_{2(\nu-1)} \right) U_{2(\nu-1)},$

we need only evaluate $m(0)$.
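The index bookkeeping of Theorem 4.1 is easily implemented; in the sketch below (ours) `multiindex` returns the 1-based tuple $(l_1, \ldots, l_k)$ of (4.2), (4.3), and the assertion verifies (4.1) entrywise.

```python
# Theorem 4.1: entry i of z^{[k]} is the product z_{l_1} ... z_{l_k}.
import numpy as np
from functools import reduce

def multiindex(i, n, k):
    """(l_1, ..., l_k) for the i-th (1-based) entry of z^{[k]}, z in R^n."""
    return [((i - 1) // n ** (k - j)) % n + 1 for j in range(1, k + 1)]

n, k = 3, 3
z = np.random.default_rng(4).normal(size=n)
zk = reduce(np.kron, [z] * k)             # z^{[k]}
for i in range(1, n ** k + 1):
    ls = multiindex(i, n, k)
    assert np.isclose(zk[i - 1], np.prod([z[l - 1] for l in ls]))
```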

Taking the $h$th block $m_h(0)$ of $m(0)$, for $1 \leq h \leq 2(\nu-1)$, we have by definition that $m_h(0) = E(x_e^{[h]}(0))$. Next, for the $i$th entry of $m_h(0)$, defining the $h$-tuple $l_1, \ldots, l_h$ which corresponds to $i$ by Theorem 4.1, we have that

(4.4)    $(m_h(0))_i = E(x_e^{[h]}(0))_i = \mathcal{X}_e(l_1, \ldots, l_h),$

where

(4.5)    $\mathcal{X}_e(l_1, \ldots, l_h) = \begin{cases} \mathcal{X}(l_1, \ldots, l_h) & \text{if } 1 \leq l_i \leq n,\ i = 1, \ldots, h, \\ 0 & \text{otherwise}. \end{cases}$

Equations (3.3.19) constitute a state-space model driven by the white noise $F(k)$ and with white observation noise $G(k)$. Then we can obtain the optimal mean square linear estimate of the state $X(k)$ defined in (3.3.7) by using the following Kalman filter equations, which take into account the correlation between the noises [21], [22], [23]:

(4.6)    $\hat{X}(k) = \hat{X}(k/k-1) + K(k)\big( Y(k) - \mathcal{C}\hat{X}(k/k-1) - V \big),$

(4.7)    $Z(k) = J(k)\big( \mathcal{C} P(k/k-1)\mathcal{C}^T + R(k) \big)^{-1},$

(4.8)    $\hat{X}(k+1/k) = \big( \mathcal{A} - (\mathcal{A}K(k) + Z(k))\mathcal{C} \big)\hat{X}(k/k-1) + (\mathcal{A}K(k) + Z(k))(Y(k) - V) + U,$

(4.9)    $P(k+1/k) = \mathcal{A}P(k)\mathcal{A}^T + Q(k) - Z(k)J^T(k) - \mathcal{A}K(k)J^T(k) - J(k)K^T(k)\mathcal{A}^T,$

(4.10)    $P(k) = P(k/k-1) - K(k)\mathcal{C}P(k/k-1),$

(4.11)    $K(k) = P(k/k-1)\mathcal{C}^T\big( \mathcal{C}P(k/k-1)\mathcal{C}^T + R(k) \big)^{-1},$

where $K(k)$ is the filter gain, $P(k)$, $P(k/k-1)$ are the filtering and prediction error covariances, respectively, and the other symbols are defined as in Theorem 3.3.2. If the matrix $\mathcal{C}P(k/k-1)\mathcal{C}^T + R(k)$ is singular, we can use the Moore-Penrose pseudoinverse. The initial condition for (4.6) is

$\hat{X}(0/-1) = E(\bar{X}),$

and for (4.7) it is

$P(0/-1) = E\big( (\bar{X} - E(\bar{X}))(\bar{X} - E(\bar{X}))^T \big),$

which can be easily calculated by using (4.4), (4.5).

By noting that the optimal linear estimate of each entry of the augmented state process $X(k)$ with respect to the augmented observations $Y(k)$ agrees with its optimal polynomial estimate with respect to the original observations $y(k)$, in the sense of taking into account all the powers, up to the νth order, of $y(j)$, $j = 0, \ldots, k$, and all the cross-products

$y^{[l_1]}(i) \otimes y^{[l_2]}(i-1) \otimes \cdots \otimes y^{[l_{\delta+1}]}(i-\delta), \qquad i \leq k; \quad 0 \leq l_s \leq \nu; \quad \sum_{s=1}^{\delta+1} l_s \leq \nu,$

the method proposed yields the optimal polynomial (as specified before) estimate for the system (3.1.1), (3.1.2), and this estimate can be obtained by extracting the first $n$ entries of the estimated extended state $\hat{X}(k)$ given by the Kalman filter. Note that in this manner we have obtained a recursive form as in (3.2.5). As we have already observed, if the dynamical matrix of the system (3.1.1), (3.1.2) is asymptotically stable, the covariance matrices $Q(k)$, $R(k)$, and $J(k)$ tend to finite limits as time goes to infinity. In this case we can utilize the well-known steady-state form of the Kalman filter, so that much of the heavier calculation (such as the gain computation) can be performed before data processing.
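One sweep of the recursion (4.6)-(4.11) is summarized, under our notation, in the following sketch (ours, not the authors' code); the pseudoinverse replaces the inverse when $\mathcal{C}P(k/k-1)\mathcal{C}^T + R(k)$ is singular, as suggested above.

```python
# One step of the Kalman filter (4.6)-(4.11) with correlated noises (our sketch).
import numpy as np

def filter_step(Xpred, Ppred, Yk, A, C, U, V, Q, R, J):
    Sinv = np.linalg.pinv(C @ Ppred @ C.T + R)      # pseudoinverse, per the text
    K = Ppred @ C.T @ Sinv                          # (4.11) filter gain
    Z = J @ Sinv                                    # (4.7)
    Xhat = Xpred + K @ (Yk - C @ Xpred - V)         # (4.6) filtered extended state
    P = Ppred - K @ C @ Ppred                       # (4.10)
    AKZ = A @ K + Z
    Xpred_new = (A - AKZ @ C) @ Xpred + AKZ @ (Yk - V) + U               # (4.8)
    Ppred_new = A @ P @ A.T + Q - Z @ J.T - A @ K @ J.T - J @ K.T @ A.T  # (4.9)
    return Xhat, P, Xpred_new, Ppred_new
# The polynomial estimate x^(nu,delta)(k) is read off as the first n entries of Xhat.
```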

4.1. Reduced-order filter. A considerable reduction of the filter state-space dimension can be obtained by eliminating the redundancy contained in the vector $\hat{X}(k)$. In fact, the block entries of $\hat{X}(k)$ are the (polynomial) estimates of monomials of the form

$(x_e)_{l_1} \cdots (x_e)_{l_h}, \qquad 1 \leq l_1, \ldots, l_h \leq q, \quad 1 \leq h \leq \nu.$

These terms do not change their values under a permutation of the indices $l_1, \ldots, l_h$, so that the same value can be repeated many times. We can avoid this by using a suitable definition of the Kronecker power, instead of the classical one, which eliminates all redundancies, as suggested in [20]. This helps in reducing both memory space and computation time.

Let $X = [x_1 \cdots x_n]^T$. We will call the reduced Kronecker power of $h$th order the following vector:

$X_{[h]} = \begin{bmatrix} x_1 x_1 \cdots x_1 x_1 \\ x_1 x_1 \cdots x_1 x_2 \\ \vdots \\ x_{l_1} \cdots x_{l_{h-1}} x_{l_h} \\ \vdots \\ x_n x_n \cdots x_n \end{bmatrix}, \qquad 1 \leq l_1 \leq \cdots \leq l_h \leq n.$

Note that the entries of $X_{[h]}$ are those of $X^{[h]}$, where all the monomials $x_{i_1} \cdots x_{i_h}$ which differ from each other by a permutation of the indices $i_1, \ldots, i_h$ are considered once. Let $X^{(n)}_{[h]}$ denote the $h$th reduced Kronecker power of $X$, where we highlight the dimension $n$ of the vector $X$, and let $d(Y)$ denote the length of a vector $Y$. Then the dimension of the vector $X^{(n)}_{[h]}$ satisfies the recursion

$d(X^{(n)}_{[h]}) = \sum_{k=1}^{n} d(X^{(k)}_{[h-1]}),$

whose solution is $d(X^{(n)}_{[h]}) = \binom{n+h-1}{h}$, the number of distinct monomials of degree $h$ in $n$ variables. Let $T^{(n)}_h \in \mathbb{R}^{n^h \times d(X^{(n)}_{[h]})}$ and $\tilde{T}^{(n)}_h \in \mathbb{R}^{d(X^{(n)}_{[h]}) \times n^h}$ be the matrices such that

$X^{[h]} = T^{(n)}_h X_{[h]}, \qquad X_{[h]} = \tilde{T}^{(n)}_h X^{[h]}.$

Note that the following identities hold:

(4.1.1)    $T^{(n)}_h \tilde{T}^{(n)}_h X^{[h]} = X^{[h]}, \qquad \tilde{T}^{(n)}_h T^{(n)}_h X_{[h]} = X_{[h]}.$

In order to obtain an expression for $T^{(n)}_h$ and $\tilde{T}^{(n)}_h$, let us consider the $i$th and $i'$th entries of the vectors $X^{[h]}$, $X_{[h]}$, respectively:

$\{X^{[h]}\}_i = x_{l_1} \cdots x_{l_h}, \qquad \{X_{[h]}\}_{i'} = x_{l'_1} \cdots x_{l'_h},$

where $1 \leq l'_1 \leq \cdots \leq l'_h \leq n$. We shall indicate by $\zeta$, $\eta$ the functions such that

$\zeta(i) = (l_1, \ldots, l_h), \qquad \eta(i') = (l'_1, \ldots, l'_h).$

Of course the inverse functions $\zeta^{-1}$, $\eta^{-1}$ are well defined. Moreover, let $o(l_1, \ldots, l_h)$ be the ordering function acting on an $h$-tuple $(l_1, \ldots, l_h)$. Then the following expressions hold:

$\{T^{(n)}_h\}_{i,j} = \begin{cases} 1 & \text{if } j = \eta^{-1}(o(\zeta(i))), \\ 0 & \text{otherwise}; \end{cases} \qquad \{\tilde{T}^{(n)}_h\}_{i,j} = \begin{cases} 1 & \text{if } j = \zeta^{-1}(\eta(i)), \\ 0 & \text{otherwise}. \end{cases}$

Note that the function $\zeta$ is easily obtained by applying Theorem 4.1. Moreover, it is easy to show that

$i = \zeta^{-1}(l_1, \ldots, l_h) = \sum_{j=1}^{h-1} (l_j - 1)\, n^{h-j} + l_h.$

Now, if we define for a fixed ν

$X_r = \begin{bmatrix} X \\ X^{[2]} \\ \vdots \\ X^{[\nu]} \end{bmatrix}, \qquad X_{nr} = \begin{bmatrix} X_{[1]} \\ X_{[2]} \\ \vdots \\ X_{[\nu]} \end{bmatrix},$

then we have

$X_{nr} = \tilde{T}^{(n)} X_r, \qquad X_r = T^{(n)} X_{nr},$

where

$T^{(n)} = \begin{bmatrix} T^{(n)}_1 & & 0 \\ & T^{(n)}_2 & \\ 0 & & \ddots \\ & & & T^{(n)}_\nu \end{bmatrix}, \qquad \tilde{T}^{(n)} = \begin{bmatrix} \tilde{T}^{(n)}_1 & & 0 \\ & \tilde{T}^{(n)}_2 & \\ 0 & & \ddots \\ & & & \tilde{T}^{(n)}_\nu \end{bmatrix}.$

We are now able to write down the reduced-order filter equations. Let $X_r(k)$, $Y_r(k)$ be defined as

$X_r(k) = \begin{bmatrix} x_e(k) \\ x_{e[2]}(k) \\ \vdots \\ x_{e[\nu]}(k) \end{bmatrix}, \qquad Y_r(k) = \begin{bmatrix} y_e(k) \\ y_{e[2]}(k) \\ \vdots \\ y_{e[\nu]}(k) \end{bmatrix},$

where $x_e(k)$, $y_e(k)$ are still given by (3.3.1). Then we have for any $k$

(4.1.2)    $X(k) = T^{(q)} X_r(k), \quad X_r(k) = \tilde{T}^{(q)} X(k), \qquad Y(k) = T^{(p)} Y_r(k), \quad Y_r(k) = \tilde{T}^{(p)} Y(k),$

where $Y(k)$, $X(k)$ are given by (3.3.6), (3.3.7) and $q$ and $p$ are the same as in (3.3.1). Moreover, the same relations link the vectors $\hat{X}(k)$, $\hat{X}(k/k-1)$ in (4.6), (4.8) to their reduced counterparts $\hat{X}_r(k)$, $\hat{X}_r(k/k-1)$:

(4.1.3)    $\hat{X}(k) = T^{(q)} \hat{X}_r(k), \qquad \hat{X}_r(k) = \tilde{T}^{(q)} \hat{X}(k),$

(4.1.4)    $\hat{X}(k/k-1) = T^{(q)} \hat{X}_r(k/k-1), \qquad \hat{X}_r(k/k-1) = \tilde{T}^{(q)} \hat{X}(k/k-1).$
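The pair $T^{(n)}_h$, $\tilde{T}^{(n)}_h$ can be generated directly from the functions $\zeta$, $\eta$, $o$; the sketch below (our construction) does so by enumeration and checks the identities (4.1.1) together with the dimension $d(X^{(n)}_{[h]}) = \binom{n+h-1}{h}$.

```python
# Reduction matrices of Section 4.1 for a single power h (our construction).
import numpy as np
from itertools import product, combinations_with_replacement
from functools import reduce
from math import comb

def reduction_pair(n, h):
    full = list(product(range(1, n + 1), repeat=h))                 # zeta(i)
    red = list(combinations_with_replacement(range(1, n + 1), h))   # eta(i)
    eta_inv = {t: j for j, t in enumerate(red)}
    zeta_inv = {t: j for j, t in enumerate(full)}
    T = np.zeros((n ** h, len(red)))    # {T}_{ij} = 1 iff j = eta^{-1}(o(zeta(i)))
    Tt = np.zeros((len(red), n ** h))   # {T~}_{ij} = 1 iff j = zeta^{-1}(eta(i))
    for i, t in enumerate(full):
        T[i, eta_inv[tuple(sorted(t))]] = 1.0
    for i, t in enumerate(red):
        Tt[i, zeta_inv[t]] = 1.0
    return T, Tt

n, h = 2, 3
x = np.random.default_rng(5).normal(size=n)
xh = reduce(np.kron, [x] * h)               # X^{[h]}
T, Tt = reduction_pair(n, h)
x_red = Tt @ xh                             # X_{[h]}: one entry per ordered monomial
assert np.allclose(T @ x_red, xh)           # X^{[h]} = T X_{[h]}
assert np.allclose(Tt @ T @ x_red, x_red)   # identities (4.1.1)
assert len(x_red) == comb(n + h - 1, h)     # d(X^{(n)}_{[h]})
```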

By using (4.1.2), (4.1.3), (4.1.4) in (4.6), (4.8) and taking into account (4.1.1), we obtain

$\hat{X}_r(k) = \hat{X}_r(k/k-1) - A_1(k)\hat{X}_r(k/k-1) + B_1(k)Y_r(k) - V_1(k),$

$\hat{X}_r(k+1/k) = A_2(k)\hat{X}_r(k/k-1) + B_2(k)Y_r(k) - V_2(k) + U_1,$

where

$A_1(k) = \tilde{T}^{(q)} K(k)\,\mathcal{C}\, T^{(q)}, \qquad B_1(k) = \tilde{T}^{(q)} K(k)\, T^{(p)}, \qquad V_1(k) = \tilde{T}^{(q)} K(k)\, V,$

$A_2(k) = \tilde{T}^{(q)}\big( \mathcal{A} - (\mathcal{A}K(k) + Z(k))\mathcal{C} \big) T^{(q)}, \qquad B_2(k) = \tilde{T}^{(q)}(\mathcal{A}K(k) + Z(k))\, T^{(p)},$

$V_2(k) = \tilde{T}^{(q)}(\mathcal{A}K(k) + Z(k))\, V, \qquad U_1 = \tilde{T}^{(q)} U,$

which is the reduced-order filter.

5. Numerical results. Numerical simulations (on an IBM RISC 6000 equipped with Mathematica) have been performed for two examples in order to test the method. In both of them we consider the problems of signal and state filtering for the following linear discrete time system, where the state and output noises are non-Gaussian:

(5.1)    $x(k+1) = Ax(k) + f(k), \qquad x(0) = 0,$
         $s(k) = Cx(k),$
         $y(k) = s(k) + g(k).$

In the first example it is assumed that

$A = \begin{bmatrix} 0.1 & 0.3 \\ 0.12 & 0.1 \end{bmatrix}, \qquad C = [\,0.7 \ \ 0.3\,], \qquad f(k) = \begin{bmatrix} f_1(k) \\ f_2(k) \end{bmatrix} \in \mathbb{R}^2, \qquad g(k) \in \mathbb{R};$

$\{f(k)\}$ and $\{g(k)\}$ are independent, zero-mean random sequences on $(\Omega, \mathcal{F}, P)$ defined as

$f_1(k)(\omega) = 0.4\,\chi_{F_1}(\omega) - 0.1\,\chi_{F_2}(\omega), \qquad f_2(k)(\omega) = 0.02\,\chi_{F_3}(\omega) - 0.18\,\chi_{F_4}(\omega),$

$g(k)(\omega) = -0.28\,\chi_{G_1}(\omega) + 0.62\,\chi_{G_2}(\omega) + 1.62\,\chi_{G_3}(\omega),$

where $\chi_Q$, $Q \in \mathcal{F}$, denotes the characteristic function of $Q$ and the disjoint events $(F_1, F_2)$, $(F_3, F_4)$, and $(G_1, G_2, G_3)$ have probabilities

$P(F_1) = 0.2, \quad P(F_2) = 0.8, \quad P(F_3) = 0.9, \quad P(F_4) = 0.1,$
$P(G_1) = 0.8, \quad P(G_2) = 0.1, \quad P(G_3) = 0.1.$

The optimal linear, quadratic, and cubic algorithms without memory (δ = 0), and the quadratic one with δ = 1, have been implemented. In order to simplify the computations, we have used the steady-state Kalman filter, starting from the initial conditions $x(0) = \hat{x}(0) = 0$. The results are displayed in Figs. 5.1-5.5 for 30 iterations with reference to the signal $s(k)$. It can be seen that the quadratic filter follows the true state evolution better than the linear filter, although the quadratic one with δ = 1 does not give a meaningful improvement in this case. A further remarkable improvement is indeed obtained with the cubic filter. All the mentioned results agree with the steady-state error covariance values obtained by solving the Riccati equation for the linear, quadratic, and cubic cases with δ = 0, and for the quadratic one with δ = 1.
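For reference, the simulation of the first example system can be reproduced along the following lines (our sketch; the random seed and therefore the trajectory are of course not those of the figures).

```python
# Simulation sketch of the first example of Section 5 (system (5.1)).
import numpy as np

rng = np.random.default_rng(6)
A = np.array([[0.1, 0.3], [0.12, 0.1]])
C = np.array([0.7, 0.3])
T = 30
x, s, y = np.zeros((T + 1, 2)), np.zeros(T + 1), np.zeros(T + 1)
for k in range(T + 1):
    s[k] = C @ x[k]                                       # signal s(k) = C x(k)
    y[k] = s[k] + rng.choice([-0.28, 0.62, 1.62], p=[0.8, 0.1, 0.1])   # g(k)
    if k < T:
        f = np.array([rng.choice([0.4, -0.1], p=[0.2, 0.8]),           # f_1(k)
                      rng.choice([0.02, -0.18], p=[0.9, 0.1])])        # f_2(k)
        x[k + 1] = A @ x[k] + f
```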

Fig. 5.1. s = signal (dashed line), y = observation (solid line).

Fig. 5.2. s = signal (dashed line), ŝ_L = optimal linear estimate with δ = 0 (solid line).

In particular, the above error covariance matrices, namely $S_L$, $S_Q$, $S_C$, and $S_{\tilde{Q}}$ (which are 2×2, 6×6, 20×20, and 12×12, respectively), have the form

$S_L = \begin{bmatrix} 0.03864 & 0.00049 \\ 0.00049 & 0.00420 \end{bmatrix}, \qquad S_Q = \begin{bmatrix} 0.02773 & -0.00014 & \cdots \\ -0.00014 & 0.00401 & \\ \vdots & & \ddots \end{bmatrix},$

Fig. 5.3. s = signal (dashed line), ŝ_Q = optimal quadratic estimate with δ = 0 (solid line).

Fig. 5.4. s = signal (dashed line), ŝ_Q̃ = optimal quadratic estimate with δ = 1 (solid line).

$S_C = \begin{bmatrix} 0.00898 & -0.00085 & \cdots \\ -0.00085 & 0.003711 & \\ \vdots & & \ddots \end{bmatrix}, \qquad S_{\tilde{Q}} = \begin{bmatrix} 0.02773 & -0.00014 & \cdots \\ -0.00014 & 0.00401 & \\ \vdots & & \ddots \end{bmatrix},$

where we have reported only the 2×2 matrix blocks in the top left side of the matrices $S_Q$, $S_C$, and $S_{\tilde{Q}}$, because they contain in the main diagonal the steady-state estimation error covariance of each component of the state.

Fig. 5.5. s = signal (dashed line), ŝ_C = optimal cubic estimate with δ = 0 (solid line).

Table 5.1.

                            Linear    Quadratic   Quadratic   Cubic
                            δ = 0     δ = 0       δ = 1       δ = 0
σ²_{(x−x̂)₁}, N = 30        0.02188   0.01377     0.01373     0.00145
σ²_{(x−x̂)₂}, N = 30        0.00181   0.00156     0.00156     0.00145
σ²_{(s−ŝ)},  N = 30        0.01084   0.00662     0.00660     0.00060
σ²_{(x−x̂)₁}, N = 5000      0.03809   0.02800     0.02800     0.00913
σ²_{(x−x̂)₂}, N = 5000      0.00429   0.00409     0.00409     0.00378
σ²_{(s−ŝ)},  N = 5000      0.01947   0.01422     0.01422     0.00447

It results in $S_C(1,1) < S_Q(1,1) = S_{\tilde{Q}}(1,1) < S_L(1,1)$ and $S_C(2,2) < S_Q(2,2) = S_{\tilde{Q}}(2,2) < S_L(2,2)$. By executing the product $CSC^T$ for all the filters implemented, where $S$ is the block of interest in $S_L$, $S_Q$, $S_{\tilde{Q}}$, $S_C$, we obtain the corresponding values $v_L$, $v_Q$, $v_{\tilde{Q}}$, and $v_C$ of the steady-state signal error variances:

$v_L = 0.01952, \quad v_Q = 0.01389, \quad v_{\tilde{Q}} = 0.01389, \quad v_C = 0.00438.$

As expected, these values are close to the sampled ones obtained via numerical simulation. In Table 5.1 the sampled variances of the state and signal, obtained with N = 30 and N = 5000 trials, are reported.

In the second example we will see a case where the quadratic filter with δ = 1 improves not only on the simpler quadratic one (δ = 0) but also on the cubic one (δ = 0), showing the nonexistence of any general relation between ν and δ.

Table 5.2.

                           Linear    Quadratic   Quadratic   Cubic
                           δ = 0     δ = 0       δ = 1       δ = 0
σ²_{(x−x̂)}, N = 5000      4.41906   1.88443     1.5012      1.84051
σ²_{(s−ŝ)}, N = 5000      2.8282    1.20604     0.96077     1.17793

Let us consider a scalar system with

$A = 0.6, \qquad C = 0.8,$

and the following noises:

$f(k)(\omega) = -\chi_{F_1}(\omega) + 3\chi_{F_2}(\omega) + 9\chi_{F_3}(\omega), \qquad g(k)(\omega) = -9\chi_{G_1}(\omega) - 3\chi_{G_2}(\omega) + \chi_{G_3}(\omega),$

$P(F_1) = 15/18, \quad P(F_2) = 2/18, \quad P(F_3) = 1/18,$
$P(G_1) = 1/18, \quad P(G_2) = 2/18, \quad P(G_3) = 15/18,$

where the two systems of disjoint events $(F_1, F_2, F_3)$ and $(G_1, G_2, G_3)$ are independent. For this system the linear, quadratic, and cubic filters with δ = 0 and, moreover, the quadratic filter with δ = 1 have been implemented. Similarly to the first example, we report the steady-state error covariance matrices $S_L$, $S_Q$, $S_C$, and $S_{\tilde{Q}}$ (which are 1×1, 2×2, 3×3, and 6×6, respectively), remarking the (1,1) entry:

$S_L = [\,4.39815\,], \qquad S_Q = \begin{bmatrix} 1.7734 & \cdots \\ \vdots & \ddots \end{bmatrix}, \qquad S_C = \begin{bmatrix} 1.75661 & \cdots \\ \vdots & \ddots \end{bmatrix}, \qquad S_{\tilde{Q}} = \begin{bmatrix} 1.53214 & \cdots \\ \vdots & \ddots \end{bmatrix}.$

The corresponding signal error variances are

$v_L = 2.81482, \quad v_Q = 1.13498, \quad v_{\tilde{Q}} = 0.98057, \quad v_C = 1.12423.$

Moreover, in Table 5.2 the error variance results of a numerical simulation with N = 5000 trials are reported. Simulations of higher-order polynomial filters would require a more sophisticated numerical implementation, which is not the aim of this paper.

5.1. An example of polynomial estimate converging to the CE. It would be of real practical interest (and would also add further theoretical insight) to have some idea about the performance of the polynomial approximation, i.e., about the distance between the ideal and the approximate estimate (given a particular model). For this purpose, let us consider the following model:

(5.1.1)    $x(k) = f(k), \qquad y(k) = x(k) + g(k),$

where $\{f(k)\}$, $\{g(k)\}$ are scalar white sequences defined on some probability space $(\Omega, \mathcal{F}, P)$, independent of each other, defined as

$f(k)(\omega) = -2\chi_{F_1}(\omega) + \chi_{F_2}(\omega), \qquad g(k)(\omega) = -\chi_{G_1}(\omega) + \chi_{G_2}(\omega) + 2\chi_{G_3}(\omega),$

where $(F_1, F_2, F_3)$ and $(G_1, G_2, G_3)$ are disjoint events having the following probabilities:

$P(F_1) = 1/4, \quad P(F_2) = 1/2, \quad P(F_3) = 1/4, \qquad P(G_1) = 4/7, \quad P(G_2) = 2/7, \quad P(G_3) = 1/7.$

Because (5.1.1) is an instantaneous system, it results in

$\hat{x}(k) = E(x(k)/y(0), y(1), \ldots, y(k)) = E(x(k)/y(k)).$

Moreover, $x(k)$ and $y(k)$ assume for any $k$ only a finite number of values. Hence we have

(5.1.2)    $\hat{x}(k)(\omega) = \sum_{i=1}^{6} \left( \sum_{j=1}^{3} x_j\, P(x(k) = x_j / y(k) = y_i) \right) \chi_{\{y(k) = y_i\}}(\omega).$

Moreover, being

$y(k)(\omega) = -3\chi_{F_1 \cap G_1}(\omega) - \chi_{(F_1 \cap G_2) \cup (F_3 \cap G_1)}(\omega) + \chi_{F_3 \cap G_2}(\omega) + 2\chi_{(F_3 \cap G_3) \cup (F_2 \cap G_2)}(\omega) + 3\chi_{F_2 \cap G_3}(\omega),$

by direct calculation and taking into account (5.1.2), we obtain

$\hat{x}(k)(\omega) = -2\chi_{F_1 \cap G_1}(\omega) - (2/3)\chi_{(F_1 \cap G_2) \cup (F_3 \cap G_1)}(\omega) + (2/3)\chi_{(F_1 \cap G_3) \cup (F_2 \cap G_1)}(\omega) + (4/5)\chi_{(F_3 \cap G_3) \cup (F_2 \cap G_2)}(\omega) + \chi_{F_2 \cap G_3}(\omega),$

and using this we can calculate the error variance

$v_o = E\big( (x(k) - \hat{x}(k))^2 \big) = 0.504762.$

Denoting by $v_i$, $i = 1, \ldots, 5$, the a priori error variances of the polynomial estimates of degree $i$, obtained by applying the polynomial filter to (5.1.1), it results in

$v_1 = 0.731707, \quad v_2 = 0.615380, \quad v_3 = 0.614835, \quad v_4 = 0.567688, \quad v_5 = 0.504762.$

Observe that $v_o = v_5$. This is not surprising because, from the fact that the observation takes values in a finite set of six numbers, it follows that an at most 5th-degree polynomial is the exact interpolator of $\hat{x}(k)$ versus $y(k)$. In this example we can see how the polynomial estimates converge (as the polynomial degree increases) to the CE, even in a finite number of steps.
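Since the model (5.1.1) lives on the nine atoms $F_i \cap G_j$, the ideal error variance $v_o$ can be recomputed by direct enumeration, as in the following sketch (ours), which reproduces $v_o = 53/105 \approx 0.504762$.

```python
# Direct computation of v_o = E((x - E(x|y))^2) for the model (5.1.1).
from collections import defaultdict

pf = {-2: 1/4, 1: 1/2, 0: 1/4}     # x = f: -2 on F1, 1 on F2, 0 on F3
pg = {-1: 4/7, 1: 2/7, 2: 1/7}     # g: -1 on G1, 1 on G2, 2 on G3
atoms = [(xv, xv + gv, px * pg_) for xv, px in pf.items()
                                 for gv, pg_ in pg.items()]
num, den = defaultdict(float), defaultdict(float)
for xv, yv, p in atoms:            # conditional expectation on each {y = y_i}
    num[yv] += xv * p
    den[yv] += p
xhat = {yv: num[yv] / den[yv] for yv in den}
v_o = sum(p * (xv - xhat[yv]) ** 2 for xv, yv, p in atoms)
print(f"v_o = {v_o:.6f}")          # 0.504762, matching the text
```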

6. Concluding remark. The method proposed allows us to obtain recursively a νth-order polynomial state estimate for the stochastic linear non-Gaussian system (3.1.1), (3.1.2). For this purpose, we have defined a new linear system in which the state and the observation are obtained in two steps: first by augmenting the original ones with the past values of the observations taken over a time window of fixed length, and then by aggregating the previously augmented vectors with their powers up to the νth order. The optimal linear estimate of the extended state with respect to the extended observations agrees with the optimal polynomial estimate (of finite memory) with respect to the original observations, so that it can be obtained via the well-known Kalman filter.

It should be noted that, denoting by σ²(ν, δ) the signal error covariance (highlighting the dependence on the polynomial order ν and the memory δ), from the above-developed theory it follows that for any ν, δ,

$\sigma^2(\nu+1, \delta) \leq \sigma^2(\nu, \delta), \qquad \sigma^2(\nu, \delta+1) \leq \sigma^2(\nu, \delta).$

Moreover, σ²(ν+1, δ) and σ²(ν, δ+1) are not in any reciprocal relation, and this agrees with the results shown in the numerical simulations, where the quadratic filter with δ = 1 gives, with respect to the cubic one with δ = 0, a worse result in the first case and a better one in the second. Finally, the numerical simulations show the heavy inadequacy of optimal linear filtering in a non-Gaussian environment, together with the high performance of polynomial filters. Of course this nice behavior is at the expense of a growing computational complexity. Nevertheless, it should be stressed that this larger amount of calculation can be performed before the real-time data processing, because it mostly concerns the computation of the error covariance matrix. Moreover, a further reduction of the filter dimensions can be obtained by using the reduced-order Kronecker powers.

Acknowledgment. The authors thank Prof. S. I. Marcus, the Associate Editor, and the anonymous referees for their critical reading and useful comments.

REFERENCES

[1] E. Yaz, Relationship between several novel control schemes proposed for a class of nonlinear stochastic systems, Internat. J. Control, 45 (1987), pp. 1447-1454.
[2] E. Yaz, A control scheme for a class of discrete nonlinear stochastic systems, IEEE Trans. Automat. Control, 32 (1987), pp. 77-80.
[3] E. Yaz, Linear state estimators for nonlinear stochastic systems with noisy nonlinear observations, Internat. J. Control, 48 (1988), pp. 2465-2475.
[4] E. Yaz, On the optimal state estimation of a class of discrete-time nonlinear systems, IEEE Trans. Circuits Systems I Fund. Theory Appl., 34 (1987), pp. 1127-1129.
[5] R. Kulhavy, Differential geometry of recursive nonlinear estimation, in Proc. of IFAC, 3, Tallinn, USSR, 1990, pp. 113-118.
[6] S. S. Rappaport and L. Kurz, An optimal nonlinear detector for digital data transmission through non-Gaussian channels, IEEE Trans. Comm. Tech., COM-14 (1966), pp. 266-274.
[7] B. Picinbono and G. Vezzosi, Détection d'un signal certain dans un bruit non stationnaire et non gaussien, Ann. des Télécomm., 25 (1970), pp. 433-439.
[8] R. D. Martin and S. C. Schwartz, Robust detection of a known signal in nearly Gaussian noise, IEEE Trans. Inform. Theory, 17 (1971), pp. 50-56.
[9] J. H. Miller and J. B. Thomas, Detectors for discrete-time signals in non-Gaussian noise, IEEE Trans. Inform. Theory, 18 (1972), pp. 241-250.
[10] N. H. Lu and B. A. Eisenstein, Detection of weak signals in non-Gaussian noise, IEEE Trans. Inform. Theory, 28 (1982), pp. 84-91.
[11] S. A. Kassam, G. Moustakides, and J. G. Shin, Robust detection of known signals in asymmetric noise, IEEE Trans. Inform. Theory, 28 (1982), pp. 84-91.

[12] B. Picinbono and P. Duvaut, Optimal linear quadratic systems for detection and estimation, IEEE Trans. Inform. Theory, 34 (1988), pp. 304-311.
[13] B. Picinbono, Geometrical properties of optimal Volterra filters for signal detection, IEEE Trans. Inform. Theory, 36 (1990), pp. 1061-1068.
[14] T. Subba Rao and M. Yar, Linear and non-linear filters for linear, but not Gaussian processes, Internat. J. Control, 39 (1984), pp. 235-246.
[15] P. K. Rajasekaran, N. Satyanarayana, and M. D. Srinath, Optimum linear estimation of stochastic signals in the presence of multiplicative noise, IEEE Trans. Aerospace Electron. Systems, AES-7 (1971), pp. 462-468.
[16] W. L. De Koning, Optimal estimation of linear discrete-time systems with stochastic parameters, Automatica J. IFAC, 20 (1984), pp. 113-115.
[17] B. Friedlander and B. Porat, Asymptotically optimal estimation of MA and ARMA parameters, IEEE Trans. Automat. Control, 35 (1990), pp. 27-35.
[18] G. B. Giannakis, On the identifiability of non-Gaussian ARMA models using cumulants, IEEE Trans. Automat. Control, 35 (1990), pp. 18-26.
[19] A. De Santis, A. Germani, and M. Raimondi, Optimal quadratic filtering of linear discrete time non-Gaussian systems, IEEE Trans. Automat. Control, 40 (1995), pp. 1274-1278.
[20] R. Bellman, Introduction to Matrix Analysis, McGraw-Hill, New York, 1970.
[21] G. C. Goodwin and R. L. Payne, Dynamic System Identification: Experiment Design and Data Analysis, Math. Sci. Engrg. 136, R. Bellman, ed., Academic Press, New York, 1977, pp. 77-78.
[22] T. Kailath, An innovations approach to least-squares estimation, Part I: Linear filtering in additive white noise, IEEE Trans. Automat. Control, 13 (1968), pp. 646-655.
[23] A. V. Balakrishnan, Kalman Filtering Theory, Optimization Software, Inc., Publications Division, New York, 1984.
[24] G. S. Rogers, Matrix Derivatives, Marcel Dekker, New York, Basel, 1980.
[25] S. I. Marcus, Optimal nonlinear estimation for a class of discrete-time stochastic systems, IEEE Trans. Automat. Control, 24 (1979), pp. 297-302.