Maximum Likelihood Ensemble Filter Applied to Multisensor Systems

Arif R. Albayrak a, Milija Zupanski b and Dusanka Zupanski c
a, b, c Colorado State University (CIRA), 137 Campus Delivery Fort Collins, CO 823, USA

ABSTRACT

The Maximum Likelihood Ensemble Filter (MLEF) is an alternative deterministic ensemble-based filter technique that optimizes a non-linear cost function along with a Maximum Likelihood approach. In addition to the common use of ensembles for calculating error covariance, the ensembles in MLEF are exploited to efficiently calculate Hessian preconditioning and the gradient of the cost function. This study is divided into two segments. The first part presents a one-sensor approach, where MLEF is compared to the Extended Kalman Filter and the Ensemble Kalman Filter using the Lorenz 63 system. The second part develops a multi-sensor system. Here we study a moving particle on an orbit obtained from the same Lorenz system. We analyze the information content of MLEF's ensemble subspace for each sensor and consider the effects of different numbers of ensemble members on the fusion process. In practice, when using ensemble-based filtering techniques, a large ensemble size is required to obtain the best results. In this study we show that MLEF can obtain similar results using a smaller ensemble size by utilizing an information matrix, where essential characteristics are captured. This is a vital consideration when working with multi-sensor data fusion systems.

Keywords: Ensemble filter, maximum likelihood, information content, data fusion

1. INTRODUCTION

Data assimilation in atmospheric applications has been based on the Kalman Filtering theory introduced by Kalman and Bucy [1] in 1961. Since then, data assimilation methodologies used in real applications can be seen as efforts to approximate the Kalman filter/smoother theoretical framework.
The lack of knowledge of statistical properties of models and observations, together with the computational burden associated with the high dimensionality of realistic atmospheric data assimilation problems, has made approximations necessary. A novel approach based on the use of ensemble forecasting in nonlinear Kalman Filtering has been pursued in recent years by different studies (Evensen [2] 1994; Houtekamer and Mitchell [3] 1998; Whitaker and Hamill [4] 2002; Ott et al. [5] 2004).

Zupanski [6] introduced the MLEF approach as an alternative deterministic ensemble-based filter technique. A similar algorithm called the Ensemble Transform Kalman Filter (ETKF) was introduced by Bishop et al. [7] However, MLEF differs from this algorithm by acting on state space instead of sample space. In short, MLEF optimizes a non-linear cost function along with a Maximum Likelihood approach. As in variational and ensemble data assimilation methods, in MLEF the cost function is derived using a Gaussian probability density function framework. Furthermore, like other ensemble data assimilation algorithms, MLEF produces an estimate of the analysis uncertainty. In addition to the common use of ensembles for calculating error covariance, the ensembles in MLEF are exploited to efficiently calculate Hessian preconditioning and the gradient of the cost function.

In order to understand the behavior of the MLEF for multi-sensor problems, we concentrated on two different types of experiments. In the first experiment we established the credibility of the MLEF method for highly nonlinear state estimation using a one-sensor approach. The MLEF, EKF and Ensemble Kalman Filter algorithms were applied to data assimilation with the three-dimensional Lorenz equations. We chose the Lorenz attractor as the simulation model because of its highly non-linear, chaotic behavior.
Our goal in choosing such a system was to emphasize the power of ensemble approaches compared to the classical Kalman Filter techniques. After reviewing the performance of the MLEF algorithm, we extended our study of MLEF to a second experiment to analyze multisensor systems with information fusion. Here we study a moving particle on an orbit that is obtained from the same Lorenz system. We analyze the information content of the MLEF algorithm and introduce an information fusion notion to combine the state and covariance estimates of different sensors.

The rest of this study is organized as follows. In section 2 the algorithms and models used throughout the paper are summarized. Section 3 presents the results for the two experiments. Finally, section 4 summarizes the findings and offers concluding remarks.

Further author information: (Send correspondence to Arif Albayrak) Arif Albayrak: E-mail: albayrak@cira.colostate.edu, Telephone: 1 97 491 86

2. METHODS

This section describes the methods considered during our study: the Maximum Likelihood Ensemble Filter, the Ensemble Kalman Filter, the Kalman Filter and the Extended Kalman Filter. Because KF and EKF are widely used approaches, we only cite references for the review of their methodology. In the case of the Ensemble Kalman Filter, however, we offer a summary of the equations used by this approach.

2.1. The Maximum Likelihood Ensemble Filter

The Maximum Likelihood Ensemble Filter (MLEF; Zupanski and Zupanski [8] and Zupanski [6]) is an ensemble system for data assimilation and forecasting based on control theory. The MLEF seeks a posterior mode by employing an unconstrained iterative minimization, such as the nonlinear conjugate-gradient and quasi-Newton methods. The MLEF performs minimization in a low-dimensional, ensemble-spanned subspace of the state space. Although the MLEF may be used as a full-rank algorithm, in realistic high-dimensional problems the maximum number of degrees of freedom is drastically reduced due to computational limits. This reduction of degrees of freedom allows a superior explicit Hessian preconditioning, otherwise not feasible.
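To illustrate why the ensemble-spanned subspace makes explicit Hessian preconditioning affordable, the following minimal sketch builds the small N_E x N_E matrix Z^T Z from the column definition in Eq. (7), evaluates the information measures of Eq. (8), and forms the square-root analysis covariance of Eq. (6). It is only a sketch under stated assumptions: the function names, the toy dimensions, the random perturbations and the identity observation operator are hypothetical choices for illustration and are not the configuration used in this study. For a linear observation operator, I + Z^T Z is the Hessian of the cost function in ensemble space, which is why only an N_E x N_E eigenproblem is needed.

```python
import numpy as np


def ensemble_information(sqrt_pf, obs_operator, x, obs_cov):
    """Return (Z, d_s, h) for forecast perturbations stored as columns of sqrt_pf."""
    # R = L L^T; applying L^{-1} whitens observation-space vectors, and
    # Z^T Z is the same for any square root of R.
    chol = np.linalg.cholesky(obs_cov)
    hx = obs_operator(x)
    # Columns z_i = R^{-1/2} [H(x + p_i^f) - H(x)], as in Eq. (7)
    diffs = np.column_stack([obs_operator(x + p) - hx for p in sqrt_pf.T])
    z = np.linalg.solve(chol, diffs)
    # Eigenvalues of the N_E x N_E matrix Z^T Z (written lambda_i^2 in Eq. (8))
    lam2 = np.clip(np.linalg.eigvalsh(z.T @ z), 0.0, None)
    d_s = np.sum(lam2 / (1.0 + lam2))   # degrees of freedom for signal
    h = 0.5 * np.sum(np.log1p(lam2))    # entropy reduction
    return z, d_s, h


def analysis_sqrt(sqrt_pf, z):
    """Square-root analysis covariance P_f^{1/2} U (I + Lambda^2)^{-1/2} U^T, Eq. (6)."""
    lam2, u = np.linalg.eigh(z.T @ z)
    lam2 = np.clip(lam2, 0.0, None)
    # u * d scales the columns of u, i.e. u @ diag(d)
    return sqrt_pf @ (u * (1.0 + lam2) ** -0.5) @ u.T


# Toy example (assumed values): 3 state variables, 2 ensemble members,
# identity observation operator, diagonal observation error covariance.
rng = np.random.default_rng(0)
sqrt_pf = rng.normal(size=(3, 2))
obs_cov = 4.0 * np.eye(3)
z, d_s, h = ensemble_information(sqrt_pf, lambda s: s, np.zeros(3), obs_cov)
sqrt_pa = analysis_sqrt(sqrt_pf, z)
```

In this linear toy case, `sqrt_pa @ sqrt_pa.T` coincides with the Kalman filter analysis covariance built from `sqrt_pf @ sqrt_pf.T`, and `d_s` cannot exceed the number of ensemble members.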
As is true for other ensemble data assimilation algorithms, the ensembles in the MLEF are used to calculate the flow-dependent error covariance. Assuming a Gaussian PDF, the cost function is defined in the familiar form (1),

$$J(x) = \frac{1}{2}(x - x_f)^T P_f^{-1}(x - x_f) + \frac{1}{2}\left(y - H(x)\right)^T R^{-1}\left(y - H(x)\right), \tag{1}$$

where $R$ is the observation error covariance, $y$ is the observation vector, $H$ is a nonlinear observation operator, $x$ is the state vector, $x_f$ is the forecast state, and $P_f$ is the forecast error covariance. In practice, the inverse of the matrix $P_f$ is never calculated; instead, a square root is defined from ensemble forecasts, with the i-th column vector $p_i^f$ defined as in Eqs. (2) and (3),

$$P_f^{1/2} = \left[\, p_1^f \;\; p_2^f \;\; \cdots \;\; p_{N_E}^f \,\right], \tag{2}$$

$$p_i^f = M(x_a + p_i^a) - M(x_a), \tag{3}$$

where $M$ denotes the prediction model, $p_i^a$ and $p_i^f$ ($i = 1 \ldots N_E$) are the analysis and forecast ensemble perturbations, respectively, and $N_E$ is the dimension of the ensemble subspace. The uncertainties of the forecast are calculated from Equation (2). The uncertainties of the analysis are obtained from the square-root analysis error covariance given in Equations (4) through (6),

$$P_a^{1/2} = \left[\, p_1^a \;\; p_2^a \;\; \cdots \;\; p_{N_E}^a \,\right], \tag{4}$$

$$P_a^{1/2} = P_f^{1/2}\left(I_{ens} + Z^T Z\right)^{-1/2}, \tag{5}$$

$$P_a^{1/2} = P_f^{1/2}\, U \left(I_{ens} + \Lambda^2\right)^{-1/2} U^T, \tag{6}$$

where the columns of $U$ are the eigenvectors of $Z^T Z$ and the diagonal matrix $\Lambda^2$ holds its eigenvalues $\lambda_i^2$. The column vectors of $Z$ are given by (7),

$$z_i(x) = R^{-1/2} H(x + p_i^f) - R^{-1/2} H(x). \tag{7}$$

It is important to note that the column vectors $z_i$ are defined using information from both data (e.g. $H$, $R$) and state (e.g. $x$, $p_i^f$). This proves to be important for the information content calculations. As shown in Zupanski et al. [9], the matrix $Z^T Z$ and its eigenvalues are related to Shannon information theory (Shannon and Weaver 1949) and can be used to calculate the degrees of freedom for signal $d_s$ and the entropy reduction $h$ (e.g., Rodgers 2000),

$$d_s = \sum_i \frac{\lambda_i^2}{1 + \lambda_i^2} \quad \text{and} \quad h = \frac{1}{2} \sum_i \ln\left(1 + \lambda_i^2\right). \tag{8}$$

2.2. The Kalman Filter and The Ensemble Kalman Filter

During our study, in addition to the MLEF algorithm we consider the Extended Kalman Filter and Ensemble Kalman Filter algorithms for the purpose of comparison. The Kalman filter (KF) is an estimation technique used for linear models [11]. The theory supporting KF can be found in reference 11. To solve weakly non-linear applications, the KF algorithm can be modified using a Taylor series expansion around the current estimate, thus obtaining the Extended Kalman Filter algorithm (EKF). An explanation of EKF can be found in references 11 and 10. The Ensemble Kalman Filter is the first example of ensemble filtering techniques considered for data assimilation purposes. Below, a short summary of the Ensemble Kalman Filter is provided in Eqs. (9) through (15). Consider a non-linear discrete-time system given by Eqs. (9) and (10),

$$x(k+1) = f\left(x(k), u(k), w(k)\right), \tag{9}$$

$$y(k) = C(k)\,x(k) + v(k), \tag{10}$$

where $x$ denotes the state vector, $u$ the model input, $v$ the measurement error, $w$ the model error, and $y$ the observation vector. The errors $w$ and $v$ are assumed to be independent Gaussian white noise processes with covariances $\Sigma_{state}$ and $\Sigma_{observation}$, respectively. The Ensemble Kalman Filter equations are then

$$\xi_i(k+1, k) = f\left(\xi_i(k, k), u(k), w_i(k, k)\right), \tag{11}$$

$$\hat{x}(k+1, k) = \frac{1}{q} \sum_{i=1}^{q} \xi_i(k+1, k), \tag{12}$$

$$\left[L_c(k+1, k)\right]_{1:n,\,i} = \frac{1}{\sqrt{q}} \left(\xi_i(k+1, k) - \hat{x}(k+1, k)\right), \tag{13}$$

$$K_c(k+1) = L_c(k+1, k) L_c(k+1, k)^T C(k+1)^T \left[ C(k+1) L_c(k+1, k) L_c(k+1, k)^T C(k+1)^T + \Sigma_{observation}(k+1) \right]^{-1}, \tag{14}$$

$$\xi_i(k+1, k+1) = \xi_i(k+1, k) + K_c(k+1)\left( y(k+1) - C(k+1)\,\xi_i(k+1, k) - v_i(k+1) \right), \tag{15}$$

where $\xi_i$ is an ensemble of state vectors and $q$ is the number of ensemble members.

2.3. The Lorenz Model

To provide an example of the performance of the MLEF approach, we compare it to the Extended Kalman Filter and the Ensemble Kalman Filter using the three-dimensional Lorenz equations (1963) [12]. These equations were originally designed as an approximation to the convective motion of a two-dimensional fluid cell that is heated from below and cooled from above. The corresponding equations are given in Eqs. (16), (17), and (18),

$$\frac{dX}{dt} = -\alpha (X - Y), \tag{16}$$

$$\frac{dY}{dt} = rX - Y - XZ, \tag{17}$$

$$\frac{dZ}{dt} = -bZ + XY. \tag{18}$$
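The Lorenz equations (16)-(18) can be integrated numerically in a few lines. The sketch below uses a classical fourth-order Runge-Kutta step in plain Python. The parameter defaults (α = 10, r = 28, b = 8/3, the classical choice from Lorenz 1963), the step size `dt`, the number of steps and the initial state are assumptions made for illustration and should not be read as the exact settings of this study; the function names are likewise hypothetical.

```python
def lorenz_rhs(state, alpha=10.0, r=28.0, b=8.0 / 3.0):
    """Right-hand side of Eqs. (16)-(18); the default parameters are assumed values."""
    x, y, z = state
    return (-alpha * (x - y), r * x - y - x * z, -b * z + x * y)


def rk4_step(state, dt):
    """Advance the state by one classical fourth-order Runge-Kutta step."""
    def shifted(base, slope, scale):
        # base + scale * slope, component by component
        return tuple(s + scale * k for s, k in zip(base, slope))

    k1 = lorenz_rhs(state)
    k2 = lorenz_rhs(shifted(state, k1, 0.5 * dt))
    k3 = lorenz_rhs(shifted(state, k2, 0.5 * dt))
    k4 = lorenz_rhs(shifted(state, k3, dt))
    return tuple(
        s + dt / 6.0 * (a + 2.0 * b + 2.0 * c + d)
        for s, a, b, c, d in zip(state, k1, k2, k3, k4)
    )


def integrate(state, dt, n_steps):
    """Return the trajectory [state_0, ..., state_n] as a list of (X, Y, Z) tuples."""
    trajectory = [tuple(state)]
    for _ in range(n_steps):
        trajectory.append(rk4_step(trajectory[-1], dt))
    return trajectory


# Assumed initial state, step size and run length, chosen only for illustration.
truth = integrate((1.0, 1.0, 1.0), dt=0.01, n_steps=1000)
```

A trajectory produced this way can serve as the "truth" from which noisy observations are drawn; a quick sanity check is that the origin is an equilibrium of the system, so a step started there leaves the state unchanged.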
Figure 1. 3-D Lorenz Attractor. (Panels: Truth Solution for x, Truth Solution for y, Truth Solution for z, and the 3-D orbit with axes x, y and z.)

Here α, b and r are the parameters of the system. The Lorenz system can be written in discrete form as in Equation (19),

$$\begin{bmatrix} X_{t+1} \\ Y_{t+1} \\ Z_{t+1} \end{bmatrix} = \begin{bmatrix} 1 - \alpha\tau & \alpha\tau & 0 \\ r\tau & 1 - \tau & -X_t\tau \\ Y_t\tau & 0 & 1 - b\tau \end{bmatrix} \begin{bmatrix} X_t \\ Y_t \\ Z_t \end{bmatrix}, \tag{19}$$

where τ is the time step. The system was integrated using the 4th-order Runge-Kutta method with time step .1. In the simulations, we generated the true state by running the model for non-dimensional time units (i.e., steps) with α = , b = 8/3, and r = 28, which resulted in a butterfly-like attractor (Lorenz 1963). A full oscillation around one of the butterfly wings corresponds to roughly one time unit. Infrequent observations were created every . time units by adding Gaussian noise with mean and variance 4 to each coordinate of the true state x. Hence the observation operator H is linear and equal to the identity matrix, and the observation error covariance matrix R is a diagonal matrix. See Fig. 1 for an illustration of the Lorenz system.

2.4. Multi-Sensor Data Fusion

In this study we introduce a multi-sensor data fusion application to show the effectiveness of the MLEF method. The consideration of multiple sensors is a complex problem which depends on the types of sensors and their attributes. Because the focus of this study is to show the applicability of MLEF to such problems, we have simplified our application. We considered a two-sensor system in which each sensor is separately designed with an MLEF tracker for the estimation of the state and covariance. It is assumed that while both sensors are of the same type, their observations are collected with different noise components. During this study, state estimates obtained from different sensors are combined in the sense of information fusion. This approach was introduced by Chang et al. [13] and uses Eqs. (20) and (21) [14],

$$\hat{x}_c = P_c \left( P_i^{-1}\hat{x}_i + P_j^{-1}\hat{x}_j - P_{pi}^{-1}\hat{x}_{pi} - P_{pj}^{-1}\hat{x}_{pj} + P_p^{-1}\hat{x}_p \right), \tag{20}$$

$$P_c = \left[ P_i^{-1} + P_j^{-1} - P_{pi}^{-1} - P_{pj}^{-1} + P_p^{-1} \right]^{-1}, \tag{21}$$

where $\hat{x}_c$ and $P_c$ are the composite (fused) state estimate and its covariance matrix, $\hat{x}_i$, $\hat{x}_j$ and $P_i$, $P_j$ are the current sensor state estimates and covariance matrices, $\hat{x}_{pi}$, $\hat{x}_{pj}$ are the previous sensor state estimates, $P_{pi}$, $P_{pj}$ are the previous sensor covariance matrices, and $\hat{x}_p$ and $P_p$ are the previous composite state estimate and its covariance matrix. These composite formulations are based on the information filter concept that is also introduced in Ref. 14.
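The fusion rule of Eqs. (20) and (21) translates almost directly into code. The sketch below is a minimal illustration, assuming all estimates are NumPy arrays of matching dimension and all covariance matrices are invertible; the function name `fuse_tracks` and the numerical values in the example are hypothetical and are not taken from the experiments of this study.

```python
import numpy as np


def fuse_tracks(x_i, p_i, x_j, p_j, x_pi, p_pi, x_pj, p_pj, x_p, p_p):
    """Information-form fusion of two sensor tracks, following Eqs. (20)-(21)."""
    inv = np.linalg.inv
    # Eq. (21): add the current sensor information, remove the previous sensor
    # information, and add back the previous composite information.
    info_c = inv(p_i) + inv(p_j) - inv(p_pi) - inv(p_pj) + inv(p_p)
    p_c = inv(info_c)
    # Eq. (20): the same combination applied to the information-weighted states.
    x_c = p_c @ (inv(p_i) @ x_i + inv(p_j) @ x_j
                 - inv(p_pi) @ x_pi - inv(p_pj) @ x_pj + inv(p_p) @ x_p)
    return x_c, p_c


# Assumed toy values: both sensors start from the same previous composite
# estimate (zero state, covariance 4 I) and then refine it independently.
eye = np.eye(2)
x_p, p_p = np.zeros(2), 4.0 * eye
x_pi, p_pi = x_p, p_p
x_pj, p_pj = x_p, p_p
x_i, p_i = np.array([1.0, 2.0]), 1.0 * eye
x_j, p_j = np.array([3.0, 2.0]), 2.0 * eye

x_c, p_c = fuse_tracks(x_i, p_i, x_j, p_j, x_pi, p_pi, x_pj, p_pj, x_p, p_p)
```

Subtracting the previous sensor information avoids counting the shared prior twice. In this toy case the fused covariance is smaller than either sensor covariance, which is the behavior expected of the fused state.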
Figure 2. RMS for estimation results (x axis * ). (Panel title: RMS MonteCarlo Runs; legend: KF, Kal Ens, MLEF; y axis: Error.)

3. RESULTS

The results obtained for the two experiments performed in this study are provided in this section. First, the outcome of the performance comparison of MLEF versus the Ensemble Kalman Filter and the Extended Kalman Filter is offered for a one-sensor case. The simulation was designed using a total of 3 observations created from original ground truth values distorted with normally distributed random noise with a standard deviation of .1. We initialized the state vector and observation covariance to an average background covariance and state vector from the first n observations, where n was chosen randomly between 2 and . In order to compare the results effectively, we applied these parameter values to each algorithm under consideration. The construction of ensemble filters requires perturbations in order to create the covariance structure; therefore, additional calculations were required for MLEF and the Ensemble Kalman Filter. A total of ensemble perturbations were obtained by multiplying the square-root observation covariance matrix with a randomly chosen 3 by x, y, z values. Note that the x, y, z values can only be in the interval and 1. Case observations were created irregularly from the first 6 time steps. After that we continue to run our simulation without observations. The results for each algorithm were compared for Monte Carlo runs.

As a first step we analyze the RMS errors of the estimated states from all the techniques. We focused on the interval that includes observations; the results are shown in Fig. 2. In this figure we can observe that MLEF gave lower RMS error values compared to the Ensemble Kalman Filter and EKF. Next we analyzed the behavior of the filters in the absence of observations. The results of the simulation for that interval are given in Fig. 3.
Here we see that the MLEF and Ensemble Kalman Filter techniques offer similar performance when observations are not present. We believe that the optimization and preconditioning algorithms built within MLEF are the reasons for the better estimation results. From these results we can also conclude that MLEF is able to obtain RMS values similar to those obtained by the Ensemble Kalman Filter using a low number of ensembles.

Figure 3. RMS for prediction results (x axis * ). (Panel title: RMS MonteCarlo Runs; legend: KF, Kal Ens, MLEF; y axis: Error.)

Second, we analyze the outcome of our second experiment, which is the result of the application of MLEF to a multi-sensor case. To apply the information fusion concept, we considered only the x and y coordinates of the Lorenz attractor. The results provided are the average of Monte Carlo runs. In Fig. 4 the location of the observations and the ground truth values can be seen. In Fig. 5, an example of an instant observation during the run with the MLEF estimates of sensor-1 and sensor-2 is shown. In addition, we included a σ uncertainty ellipse around the estimated state variable for each sensor. In this figure it can be observed that the fused state is well located: its ellipse includes the estimated values, the observations, and the original ground truth value.

Figure 4. 2D Lorenz attractor with observation points. (Panel title: 2D Lorenz Attractor; legend: Attractor Orbit, Ground Truth, Observations; axes x and y.)

Figure 5. Uncertainty ellipses. Green ellipse sensor-1, blue ellipse sensor-2 and red ellipse fused state. (Panel title: 2D Lorenz Attractor; legend: Attractor Orbit, Ground Truth, Observations; axes x and y.)

Furthermore, we analyzed the RMS errors of each sensor and of the fused state. As we did earlier, we look at Fig. 6 in two separate intervals. In the first interval, where observations exist, we see that the RMS errors for each case are very close to each other. On the other hand, in the no-observation region, after the 6th step we begin to see a smaller RMS error for the fused states.

Figure 6. Left: RMS errors for MLEF sensor-1,2 and fused state. Right: Run average (x axis * ). (Legend: MLEF sensor 1, MLEF sensor 2, Fused; axes: Steps and RMS error.)

4. DISCUSSION AND CONCLUSION

In this study, the Maximum Likelihood Ensemble Filter approach is presented in applications to the three-dimensional Lorenz attractor [12]. The filter combines the maximum likelihood approach with the ensemble Kalman filter methodology to create a new ensemble data assimilation algorithm. Thus, MLEF becomes a powerful estimation and prediction algorithm in which optimization, preconditioning and ensemble ideas are all combined. In this paper we extend the MLEF application area in order to show the effectiveness of the MLEF algorithm for multi-sensor applications.
During our study we performed two different experiments. First, the MLEF algorithm was compared to the Extended Kalman Filter and the Ensemble Kalman Filter. From the results obtained from the simulation, we concluded that MLEF offers better performance than the Extended Kalman Filter at all times. When compared against the Ensemble Kalman Filter while observations exist, the MLEF algorithm obtained better results, and when the observations were missing the MLEF algorithm was as good as the Ensemble Kalman Filter. We believe that the optimization and preconditioning algorithms built within MLEF are the reasons for the better estimation results. From these results we can also conclude that MLEF is able to obtain RMS values similar to those obtained by the Ensemble Kalman Filter using a lower number of ensembles.

In our second experiment, MLEF is considered for a multi-sensor problem where an information fusion process is also included. During this experiment it was shown that fused states have reduced uncertainty areas, as expected. In addition, while observations were included, the RMS error of the fused states was similar to that of the sensor with the minimum RMS. However, in the case of missing observations, the RMS of the fused states was considerably smaller than the RMS of the sensors.

We believe that the major contribution of this study is the use of MLEF for a multi-sensor application with the incorporation of an information fusion concept, for the first time. Although the simulation used is a basic one, the MLEF results were very encouraging, setting the foundation for future uses in more complex systems. There are some issues that need further attention but have not been addressed because of the limited scope of this study. These are the definition of the number of ensembles, the initiation of ensembles in a non-random manner, and further analysis of information content in subspaces.

REFERENCES

1. R. Kalman and R. Bucy, New results in linear prediction and filtering theory, Trans. ASME J.
Basic Eng. 83-D, pp. 95-108, 1961.
2. G. Evensen, Sequential data assimilation with a nonlinear quasi-geostrophic model using Monte Carlo methods to forecast error statistics, J. Geophys. Res. 99, pp. 10143-10162, 1994.
3. P. Houtekamer and H. Mitchell, Data assimilation using Kalman filter technique, Mon. Wea. Rev. 126, pp. 796-811, 1998.
4. J. Whitaker and T. Hamill, Ensemble data assimilation without perturbed observations, Mon. Wea. Rev. 130, pp. 1913-1924, 2002.
5. E. Ott, B. R. Hunt, I. Szunyogh, A. V. Zimin, E. J. Kostelich, M. Corazza, E. Kalnay, D. J. Patil, and J. A. Yorke, A local ensemble Kalman filter for atmospheric data assimilation, Tellus 56A, pp. 415-428, 2004.
6. M. Zupanski, Maximum likelihood ensemble filter: Theoretical aspects, Mon. Wea. Rev. 133, pp. 1710-1726, 2005.
7. C. Bishop, B. Etherton, and S. Majumdar, Adaptive sampling with the ensemble transform Kalman filter. Part 1: Theoretical aspects, Mon. Wea. Rev. 129, pp. 420-436, 2001.
8. D. Zupanski and M. Zupanski, Model error estimation employing an ensemble data assimilation approach, Mon. Wea. Rev. 134, pp. 1337-1354, 2006.
9. D. Zupanski, A. Hou, S. Zhang, M. Zupanski, C. Kummerow, and W. E. Oliu, Information theory and ensemble data assimilation, Submitted to Q. J. Roy. Meteorol. Soc., 2007.
10. M. Verlaan and A. Heemink, Non-linearity in data assimilation applications: A practical method for analysis, Mon. Wea. Rev., 1999.
11. A. Gelb, Applied Optimal Estimation, The MIT Press, Cambridge, 1974.
12. E. N. Lorenz, Deterministic non-periodic flow, J. Atmos. Sci. 20, pp. 130-141, 1963.
13. K. Chang, R. Saha, and Y. Bar-Shalom, On optimal track-to-track fusion, IEEE Transactions on Aerospace and Electronic Systems 33, pp. 1271-1276, October 1997.
14. S. Blackman and R. Popoli, Design and Analysis of Modern Tracking Systems, Artech House, Boston, 1999.