Generalized Linear Minimum Mean-Square Error Estimation


Yu Liu and X. Rong Li
Department of Electrical Engineering, University of New Orleans, New Orleans, LA 70148, U.S.A.
Email: {lyu2, xli}@uno.edu

Abstract: Linear minimum mean-square error (LMMSE) estimation plays an important role in nonlinear estimation. Generalized LMMSE (GLMMSE) estimation is proposed in this work. LMMSE estimation finds the best estimator in the set of all estimators that are linear in the data. GLMMSE estimation extends this candidate set by employing a vector-valued function of the data, and finds the best estimator among all estimators that are linear in this function rather than in the data itself. This may enhance estimation performance, since linear functions of the data may not provide adequate accuracy for a highly nonlinear problem. Theoretically, GLMMSE estimation should perform at least as well as LMMSE estimation if the moments involved can be evaluated exactly. Unfortunately, as with LMMSE estimation, those moments are difficult to evaluate analytically in general, but many numerical approximations for LMMSE estimation are also applicable to GLMMSE estimation. Computation of GLMMSE estimation based on the Gauss-Hermite quadrature is presented, and its superior performance, compared with the unscented filter and the Gaussian filter, is demonstrated by several numerical examples.

Keywords: nonlinear estimation, linear minimum mean-square error estimation, measure of nonlinearity, Gauss-Hermite quadrature

I. INTRODUCTION

In a Bayesian framework, estimation infers a random quantity x from prior information and data z (measurements). It can be classified into two categories: (a) point estimation, where only the moments (usually the first two) of x are of interest; and (b) density estimation, where the probability density or distribution function of x is of interest and needs to be obtained.
Obviously, density estimation is more demanding, both technically and computationally, than point estimation. We restrict our focus to point estimation, which is widely applied in many fields such as target tracking and system identification; a survey of point estimation methods is given in [1]. It is well known that the posterior mean is the optimal estimator in the sense of minimizing the mean-square error (MSE). This minimum mean-square error (MMSE) estimation in general requires knowledge of the entire distribution of x, which is not available for most applications other than the linear Gaussian case. However, it provides a theoretical basis for approximation techniques. Instead of looking for the best estimator among all estimators, the best one in the set of all estimators linear in the data z may be a good choice, resulting in linear minimum mean-square error (LMMSE) estimation, which makes a good compromise between simplicity and performance. The well-known Kalman filter (KF) [2], [3] is a special case of the recursive LMMSE estimator and is optimal in terms of MSE for a linear Gaussian dynamic system. LMMSE estimation plays a major role in nonlinear estimation [1]: many popular algorithms, for example the first-order extended Kalman filter (EKF) [4]-[9], the unscented filter (UF) [10]-[18], the divided difference filter [19] and the Gaussian filter [20], are based on LMMSE estimation. Further, LMMSE estimation has been applied directly to radar tracking [21], [22]. It has also been applied to state estimation for Markovian jump linear systems [23], [24], which, however, is better handled by multiple-model estimation [25]. Intuitively, if the degree of nonlinearity between x and z is low, LMMSE estimation should work well.
(Research supported in part by NASA/LEQSF(2013-15)-Phase3-06 through grant NNX13AD29A and by ONR-DEPSCoR through grant N00014-09-1-1169.)
However, if x and z are related highly nonlinearly, a linear estimator may not provide acceptable accuracy. Instead of searching for the best estimator in the set of all linear estimators, as LMMSE estimation does, we can look for the best estimator in some larger set and thereby improve estimation performance. We propose generalized LMMSE (GLMMSE) estimation based on this idea. The candidate set in GLMMSE estimation is determined by a vector-valued function g(z) of the data: we search for the best estimator within the set of all estimators that are linear in g(z), rather than in z. Clearly, this includes LMMSE estimation as the special case in which g is the identity function. Theoretically, GLMMSE estimation with a proper g performs at least as well as LMMSE estimation if the moments involved can be evaluated exactly. Unfortunately, as with LMMSE estimation, these moments are difficult to obtain in general; however, many approximation techniques developed for LMMSE estimation are also applicable to GLMMSE estimation. The design of g(z) is evidently problem dependent and up to the preference of the user, but a general guideline is that x should be less nonlinear in the converted data y = g(z) than in z. A measure of nonlinearity [26] can be employed to quantify the degree of nonlinearity and thus guide the design of g. As shown by the simulation results given later, LMMSE and GLMMSE estimation perform similarly for a weakly nonlinear system, while GLMMSE estimation significantly outperforms LMMSE estimation when the problem is highly nonlinear.

This paper is organized as follows. LMMSE estimation and recursive LMMSE estimation are briefly reviewed in Section II. GLMMSE estimation is presented in Section III and its approximation based on numerical integration is given in Section IV. Several numerical examples are provided in Section V to demonstrate the performance of our algorithm. Conclusions are drawn in Section VI.

II. REVIEW OF LMMSE ESTIMATION

Consider a parameter estimation problem in which x and z are related by a nonlinear function φ,

z = φ(x) + v   (1)

with additive noise v. The MMSE estimator, i.e., the conditional mean E[x|z], is in general infeasible to compute due to mathematical intractability. However, we can restrict our attention to the set of linear (more rigorously, affine) estimators x̂ = a + Bz, within which the best one,

x̂_LMMSE = arg min_{x̂ = a + Bz} MSE(x̂)   (2)

is the LMMSE estimator. The coefficients a and B are determined such that the MSE is minimized. This makes a good compromise between simplicity and performance for many practical problems. The LMMSE estimate can be computed from the first and second moments of (x, z):

x̂_LMMSE = x̄ + C_xz C_z^{-1} (z - z̄)   (3)

P = C_x - C_xz C_z^{-1} C_xz'   (4)

where x̄ = E[x] and z̄ = E[z] are the means of x and z, respectively, C_(·) is the covariance matrix of (·), and P is the (estimated) MSE matrix. LMMSE estimation is unbiased (i.e., E[x̂_LMMSE] = E[x]) and the estimation error x̃_LMMSE = x - x̂_LMMSE is orthogonal to the space spanned by the data (i.e., E[x̃_LMMSE z'] = 0). The estimator above is a batch algorithm, which is computationally inefficient for a dynamic problem; recursive LMMSE estimation is desirable for filtering in a discrete-time dynamic system. We focus on discrete time, since continuous-time nonlinear filtering has mainly theoretical value and, due to the widespread use of digital computers, many applications are formulated in discrete time.
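As a concrete illustration of Eqs. (3)-(4), the batch LMMSE estimator can be computed with Monte-Carlo-approximated moments. The sketch below is illustrative only: the toy model z = x^3 + v and all variable names are assumptions, not taken from the paper.

```python
import numpy as np

def lmmse_moments(xs, zs):
    """Sample version of the moments in Eq. (3): x_bar, z_bar, and the
    gain K = C_xz C_z^{-1}, with expectations replaced by sample averages."""
    x_bar, z_bar = xs.mean(axis=0), zs.mean(axis=0)
    Xc, Zc = xs - x_bar, zs - z_bar
    C_xz = Xc.T @ Zc / len(xs)      # cross-covariance of (x, z)
    C_z = Zc.T @ Zc / len(zs)       # covariance of z
    return x_bar, z_bar, C_xz @ np.linalg.inv(C_z)

# illustrative nonlinear model: scalar x, z = x^3 + v
rng = np.random.default_rng(0)
xs = rng.normal(1.0, 0.5, size=(50_000, 1))
zs = xs**3 + rng.normal(0.0, 0.1, size=(50_000, 1))

x_bar, z_bar, K = lmmse_moments(xs, zs)
x_hats = x_bar + (zs - z_bar) @ K.T   # Eq. (3) applied to every sample
```

By construction the sample estimate is unbiased and its sample MSE cannot exceed the prior variance, mirroring the properties stated above.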
Also, discrete-time nonlinear filtering is significantly easier than nonlinear filtering in continuous or mixed time. Therefore, for simplicity we focus on a general discrete-time system model with additive, mutually independent white process noise w_k and white measurement noise v_k:

x_{k+1} = f_k(x_k) + w_k   (5)

z_k = h_k(x_k) + v_k   (6)

where, without loss of generality, w_k ~ (0, Q_k) and v_k ~ (0, R_k) (mean and covariance). The corresponding recursive LMMSE estimator for the discrete-time system (5)-(6) is given by

x̂_k^LMMSE = x̂_{k|k-1} + K_k z̃_k

P_k = P_{k|k-1} - K_k C_{z̃_k} K_k'

K_k = C_{x̃_{k|k-1} z̃_k} C_{z̃_k}^{-1}

where Z^k = [z_1', z_2', ..., z_k']', ẑ_{k|k-1} = E[z_k | Z^{k-1}], x̂_{k|k-1} = E[x_k | Z^{k-1}], z̃_k = z_k - ẑ_{k|k-1}, x̃_{k|k-1} = x_k - x̂_{k|k-1}, and P_{k|k-1} and P_k are the predicted and updated MSE matrices at time k. The well-known Kalman filter (KF) is the special case of recursive LMMSE estimation in which the system (5)-(6) is linear. Recursive LMMSE estimation has been successfully applied to radar tracking [21], [22] with a linear target dynamics model and a nonlinear measurement model. In general, the moments involved in LMMSE estimation are difficult to compute exactly. Approximation techniques such as the unscented transform and numerical integration can be employed, leading to the unscented filter and the Gaussian filter, respectively.

III. GENERALIZED LMMSE ESTIMATION

LMMSE estimation offers the best estimator among all linear estimators, which should work well for systems with a moderate degree of nonlinearity. For a highly nonlinear system, the performance of a linear estimator, even the best one, may not be good enough. We can enhance the performance of LMMSE estimation by enlarging the candidate set: we find the best estimator among all estimators that are linear in a vector-valued function of the data, y = g(z), leading to GLMMSE estimation:

x̂_GLMMSE = arg min_{x̂ = a + By = a + Bg(z)} MSE(x̂)

Clearly, the estimator x̂_GLMMSE is linear in y but, in general, nonlinear in z.
Also, it includes LMMSE estimation as the special case in which g is the identity function. The candidate set is determined by g, which is evidently problem dependent and up to user preference. Taking a scalar-valued x as an example, g may be chosen to include monomials of z up to the third degree: g(z) = [z, z^2, z^3]'. Then x̂ is a third-degree polynomial in z, and x̂_GLMMSE is the best estimator within the set of all polynomial functions of z up to the third degree, which clearly includes the set of linear estimators as a subset. Theoretically, GLMMSE estimation should perform at least as well as LMMSE estimation if the set of all linear estimators is included in the candidate set and all the moments involved can be evaluated exactly. However, inaccurate approximation of these moments may result in worse performance. The general guideline for designing the function g is to achieve a lower degree of nonlinearity between x and y than between x and z. The degree of nonlinearity can be evaluated by the measure proposed in [26]. Hence, a proper choice of g benefits estimation both by enlarging the candidate set and by reducing the degree of nonlinearity.
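The effect of the polynomial choice of g above can be demonstrated on a toy model. The sketch below is illustrative, not from the paper: the model z = tanh(2x) + noise and the helper name `best_affine_mse` are assumptions. It compares the sample MSE of the best estimator linear in z (LMMSE) with the best estimator linear in g(z) = [z, z^2, z^3]' (GLMMSE), both computed by least squares over the same samples.

```python
import numpy as np

rng = np.random.default_rng(1)
N = 100_000
x = rng.normal(0.0, 1.0, N)
z = np.tanh(2.0 * x) + 0.05 * rng.normal(size=N)   # highly nonlinear phi

def best_affine_mse(features, target):
    """Sample MSE of the best estimator affine in `features`: the Monte
    Carlo version of (G)LMMSE for the candidate set defined by `features`."""
    A = np.column_stack([np.ones(len(target)),
                         features.reshape(len(target), -1)])
    coef, *_ = np.linalg.lstsq(A, target, rcond=None)
    return float(np.mean((target - A @ coef) ** 2))

mse_lmmse = best_affine_mse(z, x)                                   # g(z) = z
mse_glmmse = best_affine_mse(np.column_stack([z, z**2, z**3]), x)   # g(z) = [z, z^2, z^3]'
```

Since the GLMMSE candidate set contains the LMMSE one, the sample MSE of the polynomial estimator can never be larger, matching the theoretical claim above.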

IV. NUMERICAL APPROXIMATION FOR GLMMSE ESTIMATION

For the parameter estimation problem (1), a derivation similar to that of LMMSE estimation yields

x̂_GLMMSE = x̄ + C_xy C_y^{-1} (y - ȳ)   (7)

P = C_x - C_xy C_y^{-1} C_xy'   (8)

This is exactly the same as LMMSE estimation except that the moments of (x, z) are replaced by those of (x, y). Unfortunately, as with LMMSE estimation, these moments are difficult to obtain analytically in general. Approximation techniques available for LMMSE estimation, for example the unscented transform and numerical integration, are also options for GLMMSE estimation. We present our algorithm based on the Gauss-Hermite quadrature (GHQ) due to its superior performance, as shown by our simulation results later. For a scalar-valued x, the GHQ approximates an integral by a weighted sum:

∫_{-∞}^{+∞} f(x) e^{-x^2} dx ≈ Σ_{i=1}^{n} ω_i f(x_i)

where n is the number of sample points x_i, which are the roots of the Hermite polynomial

H_n(x) = (-1)^n e^{x^2} (d^n / dx^n) e^{-x^2}

and the corresponding weights are

ω_i = 2^{n-1} n! √π / [n H_{n-1}(x_i)]^2

The generalization of GHQ to the vector case with a general Gaussian density follows from the changes of variables x = C^{1/2} r + x̄ and r = √2 s:

∫ f(x) N(x; x̄, C) dx = ∫ f(C^{1/2} r + x̄) N(r; 0, I) dr
  = (2π)^{-n_x/2} ∫ f(C^{1/2} r + x̄) e^{-r'r/2} dr
  = π^{-n_x/2} ∫ f((2C)^{1/2} s + x̄) e^{-s's} ds

where n_x is the dimension of x. Therefore, the GHQ can be applied to each dimension of s separately. Let η = [x', v']' and approximate the distribution of η by a Gaussian density

η ~ N(η; η̄, P_η), η̄ = [x̄', 0']', P_η = diag(P_x, R)

where x̄ and P_x are the assumed prior mean and covariance matrix of x, respectively. Then the moments involved in Eqs. (7) and (8) are computed by

ȳ = ∫ y(η) N(η; η̄, P_η) dη = ∫ g(φ(x) + v) N(η; η̄, P_η) dη

C_y = ∫ (y(η) - ȳ)(y(η) - ȳ)' N(η; η̄, P_η) dη

C_xy = ∫ (x - x̄)(y(η) - ȳ)' N(η; η̄, P_η) dη

where φ(x) is given in Eq. (1), and all the integrals are evaluated by GHQ.
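A minimal sketch of the tensor-product GHQ rule above, using NumPy's `hermgauss` for the one-dimensional nodes and weights and the change of variables x = x̄ + √2 C^{1/2} s; the function name and test values are illustrative assumptions.

```python
import numpy as np
from itertools import product
from numpy.polynomial.hermite import hermgauss

def gauss_expect(f, mean, cov, n=5):
    """Approximate E[f(x)], x ~ N(mean, cov), by an n-point Gauss-Hermite
    rule per dimension of s, after substituting x = mean + sqrt(2) L s,
    where L is a square root of cov (here: its Cholesky factor)."""
    d = len(mean)
    nodes, weights = hermgauss(n)     # 1-D rule for the weight exp(-s^2)
    L = np.linalg.cholesky(cov)
    total = 0.0
    for idx in product(range(n), repeat=d):
        s = nodes[list(idx)]
        w = np.prod(weights[list(idx)])
        total += w * f(mean + np.sqrt(2.0) * (L @ s))
    return total / np.pi ** (d / 2)   # normalization of exp(-s's)

# sanity values: a 2-D Gaussian with known moments
m = np.array([1.0, -2.0])
C = np.array([[2.0, 0.3],
              [0.3, 1.0]])
```

With n = 5 points per dimension the rule integrates polynomials of degree up to 9 in each variable exactly, so low-order Gaussian moments are reproduced to machine precision.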
The recursive GLMMSE estimator for the dynamic system (5)-(6) is obtained similarly:

x̂_k^GLMMSE = x̂_{k|k-1} + K_k ỹ_k

P_k = P_{k|k-1} - K_k C_{ỹ_k} K_k'

K_k = C_{x̃_{k|k-1} ỹ_k} C_{ỹ_k}^{-1}

where Y^k = [y_1', y_2', ..., y_k']', y_k = g(z_k), ŷ_{k|k-1} = E[y_k | Y^{k-1}], and ỹ_k = y_k - ŷ_{k|k-1}. Following the same idea as the Gaussian filter [20], [27], which successively approximates the needed probability densities by moment-matched Gaussian densities and evaluates the integrals by GHQ, we obtain the recursive GLMMSE estimator. One cycle of it is given below.

1) State prediction. Approximate the posterior probability density function (PDF) of x_{k-1} by a moment-matched Gaussian density:

p(x_{k-1} | Y^{k-1}) ≈ N(x_{k-1}; x̂_{k-1}, P_{k-1})

Then

x̂_{k|k-1} = ∫ f_{k-1}(x) N(x; x̂_{k-1}, P_{k-1}) dx

P_{k|k-1} = ∫ (f_{k-1}(x) - x̂_{k|k-1})(f_{k-1}(x) - x̂_{k|k-1})' N(x; x̂_{k-1}, P_{k-1}) dx + Q_{k-1}

2) Measurement prediction. Let η_k = [x_k', v_k']' and approximate its density by a moment-matched Gaussian:

p(η_k | Z^{k-1}) ≈ N(η_k; η̄_k, P_{η_k}), η̄_k = [x̂_{k|k-1}', 0']', P_{η_k} = diag(P_{k|k-1}, R_k)

Let y_k(η_k) = g(h_k(x_k) + v_k). Then

ŷ_{k|k-1} = ∫ y_k(η_k) N(η_k; η̄_k, P_{η_k}) dη_k

3) Filter gain

K_k = C_xy S_k^{-1}

where

S_k = ∫ (y_k(η_k) - ŷ_{k|k-1})(y_k(η_k) - ŷ_{k|k-1})' N(η_k; η̄_k, P_{η_k}) dη_k

C_xy = ∫ (x_k - x̂_{k|k-1})(y_k(η_k) - ŷ_{k|k-1})' N(η_k; η̄_k, P_{η_k}) dη_k

All the integrals in the above equations are evaluated by GHQ.

4) Update

x̂_k = x̂_{k|k-1} + K_k [g(z_k) - ŷ_{k|k-1}]

P_k = P_{k|k-1} - K_k S_k K_k'

Instead of a single Gaussian density, the probability density can be approximated by a Gaussian mixture to improve approximation accuracy, as the Gaussian sum filter or the mixed Gaussian filter does [20]. GHQ suffers from the curse of dimensionality and is computationally inefficient for high-dimensional x or η; many methods are available to mitigate this effect to some extent (see [28]-[40]). GHQ is chosen here due to its superior performance, as demonstrated by the simulation results in the following section; other approximation techniques may be preferred for a specific problem.

V. ILLUSTRATIVE EXAMPLES

Our algorithm is evaluated by several numerical examples and compared with the unscented filter (UF) and the Gaussian filter (GF). The first two moments involved in GLMMSE estimation are approximated by the unscented transform and by GHQ; the resulting algorithms are denoted GL-UF and GL-GF, respectively. The unscented transform uses 2n_η + 1 σ-points and GHQ uses 5 quadrature points for each dimension of x (or η). The posterior Cramer-Rao lower bound (CRLB) [41] for the estimation error is provided as a performance reference.

A. Performance Measures

The following performance measures are computed based on I Monte Carlo (MC) runs, where I is the total number of runs. Each measure reveals a different aspect of an estimator's performance.

1) Root mean-square error (RMSE):

RMSE = sqrt( (1/I) Σ_{i=1}^{I} ||x̃^[i]||^2 )

where x̃^[i] = x^[i] - x̂^[i] is the estimation error in the ith MC run. This measure evaluates an estimator's (average) accuracy.
2) Filter credibility: Besides estimation accuracy, we also evaluate the credibility of each estimator, that is, how close the estimator's self-assessment is to its true performance, by evaluating the accuracy of the estimated MSE. The non-credibility index (NCI)

γ = (10/I) Σ_{i=1}^{I} |log_10 ρ^[i]|

and the inclination index (II)

ξ = (10/I) Σ_{i=1}^{I} log_10 ρ^[i]

were proposed for this purpose in [42], [43], where

ρ^[i] = [ (x̃^[i])' (P̂^[i])^{-1} x̃^[i] ] / [ (x̃^[i])' P̄^{-1} x̃^[i] ],   P̄ = (1/I) Σ_{i=1}^{I} x̃^[i] (x̃^[i])'

[i] indexes the MC run, P̂^[i] is the filter-estimated MSE matrix on the ith run, P̄ can be understood as the (approximate) real MSE based on the simulated estimation errors, and ρ^[i] is the ratio of the squared estimation error normalized by the estimated MSE matrix to that normalized by the real MSE matrix. Hence, an estimator is more credible if the NCI is smaller, meaning that the estimated MSE is closer to the true MSE. The inclination (pessimistic or optimistic) of an estimator is indicated by II: if II is significantly larger than zero (γ ≈ ξ), the estimator tends to be optimistic, meaning that the estimated MSE is smaller than the truth; if II is significantly smaller than zero (γ ≈ -ξ), the estimator tends to be pessimistic.

3) Measure of nonlinearity: The degree of nonlinearity between x and the data is also of interest, since it is closely related to the estimation performance. A normalized measure of nonlinearity for Eq. (1) was defined in [26] as

M = min_{L ∈ 𝕃} sqrt( E[ ||L(x) - φ(x)||^2 ] / tr(C_φ) )   (9)

where C_φ is the covariance matrix of φ(x) and 𝕃 is the set of all linear functions

L(x) = a + Bx   (10)

Note that we are interested in the nonlinearity between x and z, and hence the additive noise is not considered in the measure. For the dynamic system (5)-(6), stacking Eqs. (5) and (6) together yields

u_k = φ_k(x_k) + ε_k   (11)

where u_k = [x_{k+1}', z_k']', ε_k = [w_k', v_k']', and φ_k(x_k) = [f_k(x_k)', h_k(x_k)']'. Equation (11) is of the same form as Eq. (1), so the measure of nonlinearity can be computed similarly.
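The Monte Carlo measures above can be sketched as follows. RMSE is elementary, so the snippet covers NCI/II and a sample-based estimate of M in which the minimizing affine L(x) = a + Bx is found by least squares. The (10/I)·log10 convention for NCI/II and all function names are assumptions of this sketch.

```python
import numpy as np

def nci_ii(errors, P_hats):
    """NCI (gamma) and II (xi) from per-run errors x_tilde and filter-estimated
    MSE matrices, with P_bar the sample average of the error outer products."""
    I = len(errors)
    P_bar = sum(np.outer(e, e) for e in errors) / I
    P_bar_inv = np.linalg.inv(P_bar)
    rho = np.array([(e @ np.linalg.inv(Ph) @ e) / (e @ P_bar_inv @ e)
                    for e, Ph in zip(errors, P_hats)])
    logs = np.log10(rho)
    return 10.0 * np.mean(np.abs(logs)), 10.0 * np.mean(logs)

def nonlinearity_measure(xs, phi):
    """Sample-based M of Eq. (9): residual of the best affine fit L(x) = a + Bx
    to phi(x), normalized by tr(C_phi). Zero for affine phi, at most one."""
    Phi = np.array([phi(x) for x in xs])
    A = np.column_stack([np.ones(len(xs)), xs])
    coef, *_ = np.linalg.lstsq(A, Phi, rcond=None)
    resid = Phi - A @ coef
    tr_C_phi = np.sum((Phi - Phi.mean(axis=0)) ** 2) / len(xs)
    return float(np.sqrt(np.mean(np.sum(resid ** 2, axis=1)) / tr_C_phi))
```

For example, if every filter-estimated MSE equals the true P̄, then ρ = 1 on every run and NCI = II = 0; if each estimate is half of P̄ (an optimistic filter), ρ = 2 and II = NCI = 10 log10 2 ≈ 3.01.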
Note that M depends on time k in this case. Intuitively, M quantifies the portion (or percentage) of the nonlinear part

of φ that cannot be accounted for by any linear function. As shown in [26], it has a standard range of [0, 1]. Clearly, M = 0 implies that φ is linear almost everywhere, and M = 1 implies that the best linear fit is L̂ = 0, meaning roughly that φ contains no linear component at all. The computation of M requires knowledge of the PDF p(x). If p(x) is difficult to evaluate analytically due to the nonlinearity or the dynamics of the system, a sample representation can be obtained by the Markov chain Monte Carlo method based on the prior distribution: random samples are drawn from the initial distribution and propagated to time k, resulting in a sample version of p(x). In our simulation, the prior distribution of x was assumed known and 10^6 particles were drawn from it to approximate M.

B. Parameter Estimation

We consider a 2-dimensional parameter estimation problem with the model

z = φ(x) + v   (12)

where

z = [z_1, z_2]', x = [x_1, x_2]', φ(x) = [ sqrt(x_1^2 + x_2^2), arctan(x_2 / x_1) ]'

The function g(z) is chosen to be

y = g(z) = [ z', z_1 cos(z_2), z_1 sin(z_2) ]'

that is, the original measurement in polar coordinates is augmented by its transformed version in Cartesian coordinates, which results in the equivalent model

y = g(φ(x) + v)   (13)

Clearly, the candidate set determined by g includes the set of linear estimators as a subset. As shown later, this augmented model indeed reduces the degree of nonlinearity and hence benefits estimation. The prior distribution of x was assumed known to each estimator to be Gaussian, x ~ N(x̄, P). Different values of the prior mean,

x̄_1 = [1, 1]', x̄_2 = 0.1 x̄_1, x̄_3 = 0.01 x̄_1

and of the prior covariance,

P_1 = [5, 0.1; 0.1, 5], P_2 = 0.1 P_1, P_3 = 0.01 P_1

were simulated to compare their impact on the model's degree of nonlinearity and on the estimation performance of each estimator. We also vary the covariance matrix R of the measurement noise,

R_1 = [1, 0.1; 0.1, 1], R_2 = 0.1 R_1, R_3 = 0.01 R_1

to evaluate its effect on each estimator.

Scenario 1: We simulated three cases corresponding to x̄ = x̄_i, i = 1, 2, 3, respectively, with P = P_2 and R = R_2 fixed.
Therefore, the effects of different prior means x̄ on the estimators' performance can be compared. The simulation results are given in Table I, and the measure of nonlinearity M for Eqs. (12) and (13) with the different x̄ is given in Table II. Clearly, the transform g(z) reduces the model's degree of nonlinearity in all three cases, since the transformed measurements in Cartesian coordinates are less nonlinear in x. Further, it is observed that the degree of nonlinearity increases as x̄ decreases. All the estimators have similar performance in case 1 (i.e., x̄ = x̄_1) due to the low degree of nonlinearity; GL-UF performs slightly worse than UF due to the approximation of the moments involved. GL-GF outperforms the other filters in terms of all measures in cases 2 and 3, where the degrees of nonlinearity are significantly larger than in case 1. Note that GL-UF has RMSE comparable to GL-GF; however, its credibility is the worst, as indicated by NCI and II. By comparing the moments computed in GL-UF and GL-GF, we find that C_y is not well approximated by the unscented transform, since x has a smaller dimension than y = g(z), resulting in an insufficient number of σ-points. Further, every estimator becomes less credible as the degree of nonlinearity increases, as expected. Clearly, GL-GF has performance superior to the other estimators, especially when the system is highly nonlinear, but at the cost of higher computational complexity, as shown in Table VI. From the inclination index (II), we observe that all the estimators tend to be optimistic, that is, the estimated MSE is smaller than the actual MSE.

TABLE I. SIMULATION RESULTS OF SCENARIO 1 WITH DIFFERENT PRIOR MEAN x̄.
            UF       GL-UF     GF       GL-GF    CRLB
x̄_1  RMSE   .6977    .6979     .6978    .6977    .492
     NCI    .1221    .1327     .1243    .1244
     II     .086     .129      .0852    .0855
x̄_2  RMSE   .685     .35       .681     .3476    .1163
     NCI    1.7342   1.9881    1.4334   .4849
     II     1.488    1.929     1.4334   .4849
x̄_3  RMSE   1.27     .3284     .948     .3174    .0974
     NCI    3.5242   18.3172   1.4295   .7691
     II     3.2586   18.264    1.213    .515

TABLE II. MEASURE OF NONLINEARITY OF EQS. (12) AND (13) IN SCENARIO 1.

            Case x̄_1   Case x̄_2   Case x̄_3
Eq. (12)    4.85%      51.51%     74.75%
Eq. (13)    4.3%       34.53%     65.12%

Scenario 2: Three cases corresponding to P = P_i, i = 1, 2, 3, respectively, were simulated with x̄ = x̄_2 and R = R_2 fixed, so the impact of different prior covariances P on the estimators' performance can be compared. The simulation results are given in Table III, and the measures of nonlinearity M for Eqs. (12) and (13) with the different P are given in Table IV. Similar to Scenario 1, the transform g(z) reduces the model's degree of

nonlinearity. Further, it is observed that the degree of nonlinearity decreases as P decreases: a larger P leads to a more spread-out distribution of x, which results in a more nonlinear model. Observations similar to those in Scenario 1 can be made about the estimators' performance: when the system has a low degree of nonlinearity, all the estimators have comparable performance, while GL-GF outperforms the other estimators in the highly nonlinear cases.

TABLE III. SIMULATION RESULTS OF SCENARIO 2 WITH DIFFERENT PRIOR COVARIANCE MATRIX P.

            UF       GL-UF    GF       GL-GF    CRLB
P_1  RMSE   2.9745   .4813    2.633    .4774    .1313
     NCI    2.6427   5.118    .7885    .4327
     II     2.4362   5.118    .6723    .2421
P_2  RMSE   .685     .35      .681     .3476    .1163
     NCI    1.7342   1.9881   1.4334   .4849
     II     1.488    1.929    1.4334   .4849
P_3  RMSE   .2317    .2315    .2317    .2281    .1497
     NCI    .3459    .4968    .3343    .2936
     II     .3283    .4968    .283     .2666

TABLE IV. MEASURE OF NONLINEARITY OF EQS. (12) AND (13) IN SCENARIO 2.

            Case P_1   Case P_2   Case P_3
Eq. (12)    79.88%     51.51%     13.16%
Eq. (13)    44.87%     34.53%     8.77%

Scenario 3: Three cases corresponding to R = R_i, i = 1, 2, 3, respectively, were simulated with x̄ = x̄_2 and P = P_2 fixed, so the impact of different measurement noise covariances R on the estimators' performance can be compared. The simulation results are given in Table V. Since M does not account for the measurement noise, the degree of nonlinearity remains the same for different R. A large R (i.e., case 1, R = R_1) results in similar RMSE for all estimators, while GL-UF and GL-GF improve the estimation accuracy much more than the other two estimators as R decreases, as shown in the cases with R_2 and R_3. However, unlike in Scenarios 1 and 2, the change of R does not have a consistent impact on the estimators' credibility, since the degree of nonlinearity is unchanged.

TABLE V. SIMULATION RESULTS OF SCENARIO 3 WITH DIFFERENT COVARIANCE R OF MEASUREMENT NOISE.
            UF       GL-UF    GF       GL-GF    CRLB
R_1  RMSE   .7693    .7495    .7614    .7119    .317
     NCI    .6643    1.3564   .3131    .1798
     II     .621     1.255    .2512    .089
R_2  RMSE   .685     .35      .681     .3476    .1163
     NCI    1.7342   1.9881   1.4334   .4849
     II     1.488    1.929    1.4334   .4849
R_3  RMSE   .5167    .1137    .5211    .1138    .0416
     NCI    1.8611   1.7278   1.3385   .714
     II     1.81     1.5584   1.316    .5

The average computational costs (normalized by that of UF) are given in Table VI. Clearly, GF and GL-GF are computationally more demanding than UF and GL-UF due to the costly GHQ. However, the cost increase from LMMSE estimation to GLMMSE estimation with the same approximation technique is quite insignificant.

TABLE VI. AVERAGE COMPUTATIONAL COST.

UF     GL-UF   GF      GL-GF
1      1.4     42.64   44.13

C. Filtering for Dynamic Systems

Our algorithm was also applied to nonlinear filtering for dynamic systems. Let the system state be x_k = [x_k, y_k]' and consider the following two cases:

Case 1: x_{k+1} = [ x_k + .5, y_k + .1 ]' + w_k

Case 2: x_{k+1} = [ 2x_k^2 - 1, 2y_k^2 - 1 ]' + w_k

with the common measurement model

z_k = [z_k^1, z_k^2]' = [ sqrt(x_k^2 + y_k^2), arctan(y_k / x_k) ]' + v_k

where Q_k = diag(1, 1) × 10^{-6}, R_k = diag(1, 1) × 10^{-3}, and the initial state has prior

x̄_0 = [.1, .2]', P_0 = [.5, .1; .1, .5]

The function g(z) in GLMMSE estimation is chosen to be

g(z_k) = [ z_k', z_k^1 cos(z_k^2), z_k^1 sin(z_k^2) ]'   (14)

We consider only the quadrature-based algorithms (i.e., GF and GL-GF), because they performed better than the UT-based algorithms in the previous simulations. All the performance measures are time dependent in this scenario. The normalized measure of nonlinearity for the systems in the two cases is given in Fig. 1. Clearly, Case 1 has much weaker nonlinearity than Case 2, as expected, due to the linear dynamics of x_k in Case 1. The filtering performance for the two cases is given in Figs. 2 and 3, respectively. GF and GL-GF perform similarly in Case 1 and almost reach the posterior Cramer-Rao lower bound (CRLB) in terms of estimation accuracy, which makes sense given the weak nonlinearity. Besides, both algorithms are credible, as indicated by their NCI.
However, GL-GF significantly outperforms GF in terms of RMSE in Case 2, where the system is highly nonlinear. Further, the gap between the posterior CRLB and the actual filtering accuracy becomes evident. The non-credibility of both algorithms increases in this case, and both tend to be optimistic; however, GL-GF is still much less non-credible and less optimistic than GF. The simulation results demonstrate that our algorithm with the choice of g in Eq. (14) benefits estimation for a highly nonlinear system.
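The generalized update at the heart of GLMMSE estimation is x̂ = x̄ + P_xg P_gg^{-1}(g(z) − ḡ), i.e., an LMMSE update applied to g(z) instead of z. The sketch below approximates the required moments by plain Monte Carlo for brevity, rather than the unscented transform or Gauss-Hermite quadrature used in the paper; the toy scenario and all names are ours:

```python
import numpy as np

def g(z):
    # Eq. (14): augment the polar measurement with its Cartesian conversion.
    r, theta = z
    return np.array([r, theta, r * np.cos(theta), r * np.sin(theta)])

def glmmse_update(x_samples, z_samples, z_obs):
    # GLMMSE estimate x_hat = x_bar + P_xg P_gg^{-1} (g(z_obs) - g_bar),
    # with all moments approximated from joint samples of (x, z).
    gz = np.array([g(z) for z in z_samples])
    x_bar, g_bar = x_samples.mean(axis=0), gz.mean(axis=0)
    dx, dg = x_samples - x_bar, gz - g_bar
    P_xg = dx.T @ dg / len(gz)          # cross-covariance Cov(x, g(z))
    P_gg = dg.T @ dg / len(gz)          # covariance of g(z)
    return x_bar + P_xg @ np.linalg.pinv(P_gg) @ (g(z_obs) - g_bar)

# Toy check on the range/bearing model of Section C (our numbers).
rng = np.random.default_rng(0)
h = lambda x: np.array([np.hypot(x[0], x[1]), np.arctan2(x[1], x[0])])
x_true = np.array([1.05, 0.95])
x_samples = rng.normal([1.0, 1.0], 0.1, size=(4000, 2))   # prior draws
z_samples = np.array([h(x) for x in x_samples]) + rng.normal(0.0, 0.01, (4000, 2))
x_hat = glmmse_update(x_samples, z_samples, h(x_true) + rng.normal(0.0, 0.01, 2))
```

Replacing the Monte Carlo moment approximation with Gauss-Hermite quadrature under a Gaussian prior yields the quadrature-based variant evaluated above, up to the quadrature approximation.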

Fig. 2. Simulation results for Case 1. The RMSE, NCI and II are shown in the left, middle and right figures, respectively. Due to the weak nonlinearity of the system in this case, both algorithms perform similarly.

Fig. 3. Simulation results for Case 2. The RMSE, NCI and II are shown in the left, middle and right figures, respectively. The system in this case has a much stronger nonlinearity than in Case 1. Clearly, GL-GF outperforms GF in terms of estimation accuracy and credibility.

Fig. 1. Normalized measure of nonlinearity for the system in Cases 1 and 2. The linear dynamics of x_k in Case 1 results in a much weaker nonlinearity than in Case 2.

VI. CONCLUSIONS

A generalized LMMSE (GLMMSE) estimator has been proposed in this work. It extends the candidate set of estimators in LMMSE estimation by employing a vector-valued function g of the data: the best estimator in terms of the mean-square error is sought within the set of estimators that are linear in g(z). Clearly, this candidate set is determined by g. Theoretically, GLMMSE estimation should perform at least as well as LMMSE estimation if the candidate set of GLMMSE estimation includes the set of linear estimators and the moments involved can be calculated exactly. Unfortunately, as in LMMSE estimation, analytical evaluation of the moments is infeasible for most nonlinear problems, so approximation is necessary. Caution should be taken: an inaccurate approximation may make GLMMSE estimation perform worse than LMMSE estimation in practice.
Numerical approximation of GLMMSE estimation and recursive GLMMSE estimation based on the Gauss-Hermite quadrature have been presented for parameter estimation and for filtering of dynamic systems, respectively. The design of g(z) is evidently problem dependent. A guideline is that g(z) should be designed to reduce the degree of nonlinearity of the transformed system and consequently benefit estimation. Numerical examples have been given to demonstrate the performance of our algorithm in comparison with the unscented filter and the Gaussian filter.