Improved Kalman Filter Initialisation using Neurofuzzy Estimation


J. M. Roberts, D. J. Mills, D. Charnley and C. J. Harris

Introduction

It is traditional to initialise Kalman filters and extended Kalman filters with estimates of the states calculated directly from the observed (raw) noisy inputs [7, 9], but unfortunately their performance is extremely sensitive to state initialisation accuracy (Figure 1). Good initial state estimates ensure fast convergence, whereas poor estimates may give rise to slow convergence or even filter divergence. Divergence is generally due to excessive observation noise and leads to error magnitudes that quickly become unbounded [1, 5]. When a filter diverges it must be re-initialised, but because the observations are extremely poor, the re-initialised states will themselves be poor estimates. This paper proposes that if neurofuzzy estimators produce more accurate state estimates than those calculated from the observed noisy inputs (using the known state model), then neurofuzzy estimates can be used to initialise the states of Kalman and extended Kalman filters. Filters whose states have been initialised with neurofuzzy estimates should give improved performance, by way of faster convergence, both when the filter is first initialised and when it is re-started after divergence.

Figure 1: More accurate initial state estimates result in shorter convergence times.

Neurofuzzy estimators

A neurofuzzy estimator combines the positive attributes of a neural network and a fuzzy network. Neural networks have become increasingly popular, largely because of the universal approximation theorem, which states that any continuous function can be approximated arbitrarily well by such a network. The problem comes when an attempt is made to understand the information stored in a trained network: extracting the information is difficult, and consequently the network is termed opaque. By contrast, the information stored by a fuzzy network is transparent, as it is easily expressed in terms of linguistic rules. However, a drawback of fuzzy networks appears when an attempt is made to adapt their parameters, since the learning rules are based mainly on heuristics about which little can be proved. Neural networks have better-established learning rules with provable behavioural characteristics. The problems with both approaches have long been recognised and have recently motivated the development of the neurofuzzy estimator [3]. A neurofuzzy estimator embodies the well-established modelling and learning capabilities of a neural network within the transparent framework of a fuzzy system for representing linguistic rules.

ASMOD image-plane velocity estimation

The ability of the neurofuzzy estimator ASMOD (Adaptive Spline Modelling of Observation Data) [8] to improve initialisation accuracy, and hence reduce filter convergence times, is shown here by example for the estimation of the image-plane velocity of a tracked feature that moves with constant velocity in the real world. Constant-velocity motion (with respect to the real world) was simulated using

$$y(t) = \frac{f h}{\lambda + v_z t} \qquad (1)$$

where $f$ was the focal length, $h$ was the height of the feature above the optical axis, $\lambda$ was the initial distance of the feature from the focal point $O$, and $v_z$ was the velocity of the feature along the $z$-axis (Figure 2); $h$, $\lambda$ and $v_z$ were all randomly generated from fixed intervals. Training and test data were artificially generated using MATLAB.
Figure 2: Imaging geometry used to generate simulated training and test data.

Output values, $u(t)$, were generated using

$$u(t) = \dot{y}(t) = \frac{y(t) - y(t-1)}{\Delta t} \qquad (2)$$

where $\Delta t$ was the sampling interval. Zero-mean Gaussian noise with standard deviation $\sigma$ was added to the input $y(t)$ to simulate real noisy observations.

Equation (2) shows that a minimum of two inputs, $y(t)$ and $y(t-1)$, are required to estimate the output $\dot{y}(t)$. Training and test sets of randomly generated data pairs were produced (875 pairs for training), where a training or test pair is defined as the inputs $y(t)$ and $y(t-1)$ together with the output $\dot{y}(t)$. The value of $\sigma$ used corresponds to a noise standard deviation of a few pixels in the image. Figure 3 shows scatter plots of the two inputs for both the training and test sets. It is clear from these plots that most input data were clustered around the $y(t) = y(t-1)$ line. This distribution of data results from the physical constraints of the system being modelled, as described by the dynamics of the motion; the empty, non-populated regions of the input space are therefore physically unrealisable.
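For illustration only, here is a minimal Python sketch of this data generation (the paper used MATLAB; the focal length, sampling interval and parameter ranges below are assumptions, since the printed values are illegible in this transcription):

```python
import numpy as np

rng = np.random.default_rng(0)

def make_pairs(n_pairs, sigma, f=1.0, dt=0.04):
    """Generate ((y(t), y(t-1)), ydot(t)) pairs from the perspective
    model of equation (1), y(t) = f*h / (lambda + v_z*t).
    f, dt and all parameter ranges are illustrative assumptions."""
    X, u = [], []
    for _ in range(n_pairs):
        h = rng.uniform(0.5, 1.5)     # feature height (assumed range)
        lam = rng.uniform(1.0, 3.0)   # initial depth (assumed range)
        vz = rng.uniform(0.0, 1.0)    # recession velocity (assumed range)
        t = rng.uniform(dt, 5.0)      # sample instant on the trajectory
        y_now = f * h / (lam + vz * t)
        y_prev = f * h / (lam + vz * (t - dt))
        u.append((y_now - y_prev) / dt)               # equation (2), noise-free target
        X.append([y_now + rng.normal(0.0, sigma),     # noisy inputs
                  y_prev + rng.normal(0.0, sigma)])
    return np.asarray(X), np.asarray(u)

X_train, u_train = make_pairs(875, sigma=0.01)
X_test, u_test = make_pairs(500, sigma=0.01)  # test-set size assumed
```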

Figure 3: Training and test sets (two input case).

A large proportion of the training data was found to be clustered around the centre of the input-output space. This is clearly shown in Figure 4a, which plots the output against one of the inputs: the majority of the data were densely packed about the $u(t) = 0$ line. This distribution of training data in the input-output space was due to the imaging model, which performs a perspective transform and so produces input values $y(t)$ that are inversely proportional to the distance of a feature from the camera. Because uncorrelated training data produce faster learning and, more importantly, a model that is not artificially biased towards regions of the input-output space where a large amount of training data exists, it was necessary to filter the training data. Filtering involves scaling the input data so that it lies in a fixed interval, randomising its order, and applying a Euclidean norm (with a threshold) as a distance measure to remove redundant data that add little knowledge about the process being modelled but would otherwise bias the learning. Figure 4b shows the effect of filtering the training data. Note that the overall shape of the plots remained the same and that the more widely spaced data were not affected by filtering.

Figure 4: Unfiltered and filtered training sets (two input case).

Rotating the input data so that the major axis of the data distribution (the line $y(t) = y(t-1) = \cdots = y(t-N+1)$ in this example) is parallel to an input axis results in a more globally populated input space, and may reduce the number of dimensions required to model the data. Figure 5a shows the two-input training data used previously and Figure 5b shows the result of input data rotation. Rotation was performed using Principal Components Analysis (PCA). A rotation matrix $U$ was calculated from the eigenvectors of the input correlation matrix

$$R = E[\tilde{\mathbf{y}}\,\tilde{\mathbf{y}}^T] \qquad (3)$$

where $\tilde{\mathbf{y}}$ was the input vector $\mathbf{y}$ minus its mean (i.e. $\tilde{\mathbf{y}}$ was zero mean). The corresponding rotated input space was given by

$$\mathbf{a} = U^T \tilde{\mathbf{y}} \qquad (4)$$

Figure 5: Filtered and rotated-filtered training sets (two input case).

The rotated input data shown in Figure 5b were scaled and filtered as before, resulting in a more evenly populated input-space distribution. ASMOD could then be trained on the rotated data. Input dimensions were not discarded on the basis of their having small eigenvalues; rather, it was left to the ASMOD algorithm to choose which dimensions were redundant.
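The preprocessing just described can be sketched as follows (continuing from the generation sketch above; the scaling interval, distance threshold and greedy filtering strategy are assumptions, not the authors' implementation):

```python
import numpy as np

def filter_redundant(X, u, threshold=0.02, seed=0):
    """Shuffle the pairs, then greedily drop any pair whose input lies
    within Euclidean distance `threshold` of an already-kept input."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(len(X))
    X, u = X[order], u[order]
    kept = []
    for i, x in enumerate(X):
        if all(np.linalg.norm(x - X[j]) >= threshold for j in kept):
            kept.append(i)
    return X[kept], u[kept]

# Scale the inputs to a fixed interval (here [0, 1]) before filtering.
X = (X_train - X_train.min(axis=0)) / (X_train.max(axis=0) - X_train.min(axis=0))
Xf, uf = filter_redundant(X, u_train)

# PCA rotation, equations (3) and (4).
Y = Xf - Xf.mean(axis=0)    # zero-mean inputs (y tilde)
R = (Y.T @ Y) / len(Y)      # correlation matrix R = E[y y^T], eq. (3)
_, U = np.linalg.eigh(R)    # columns of U are the eigenvectors
A = Y @ U                   # every row is a = U^T y, eq. (4)
```

The rotated data `A` would then be rescaled and redundancy-filtered again before training, as described in the text.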

ASMOD was trained using the filtered and rotated data described above, and to measure its performance on the test set two performance metrics were evaluated.

Mean square output error (MSE):

$$\mathrm{MSE} = \frac{1}{T}\sum_{t=1}^{T}\big(\hat{u}(t) - u(t)\big)^2 \qquad (5)$$

where $\hat{u}(t)$ was the estimated output of either the ASMOD or the raw estimator ($\hat{u}_a(t)$ and $\hat{u}_r(t)$ respectively), $u(t)$ was the true output, and $T$ was the number of test pairs in the test set.

Relative performance ($R_a$):

$$R_a(\%) = \frac{100}{T}\sum_{t=1}^{T} b(t), \qquad b(t) = \begin{cases} 1 & \text{if } \big(\hat{u}_a(t) - u(t)\big)^2 < \big(\hat{u}_r(t) - u(t)\big)^2 \\ 0 & \text{otherwise} \end{cases} \qquad (6)$$

The MSE shows the average accuracy of the estimators (ASMOD and raw) over the entire test set. The relative performance measure $R_a$ shows, as a percentage of the total number of data pairs tested, how often ASMOD was more accurate than the raw estimate calculated using equation (2). Figures 6 and 7 show how the performance varied with the magnitude of the input noise. Note that to obtain each point on these plots, ASMOD was re-trained and re-tested using training and test data generated with the corresponding value of $\sigma$.

It is possible to calculate the theoretical MSE of the raw estimator: by equation (2) the raw estimate differs from the true output by $(n(t) - n(t-1))/\Delta t$, where $n(t)$ is the independent zero-mean observation noise of standard deviation $\sigma$, so

$$\mathrm{MSE}_r = \frac{2\sigma^2}{\Delta t^2} \qquad (7)$$

The raw-estimator curve in Figure 6 was plotted using equation (7), but the data points shown were found from the test set using equation (5). The results in Figures 6 and 7 show that ASMOD performed very well compared with raw estimation. It can be seen from equation (7), and from the curve in Figure 6, that raw estimation is very poor when the input noise is large, its MSE growing in proportion to $\sigma^2$; by comparison, ASMOD always outperformed the raw estimator with respect to MSE, and its MSE grew markedly more slowly with $\sigma$. Figure 7 confirms that ASMOD was more accurate than raw estimation 85% of the time when the input noise was very high. These results imply that ASMOD not only successfully learnt the basic input-output function, but also learnt something about the noise characteristics of the system being modelled; ASMOD can therefore be thought of as a filter in this case.

Figure 6: Output mean squared error (MSE) versus the standard deviation of the noise on the input.

Figure 7: Relative performance of ASMOD versus the standard deviation of the noise on the input.
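Both metrics, together with a Monte-Carlo check of the theoretical raw-estimator MSE of equation (7), can be written down directly (a sketch; the values of $\sigma$ and $\Delta t$ are illustrative):

```python
import numpy as np

def mse(u_hat, u):
    """Equation (5): mean squared output error over the test set."""
    return np.mean((u_hat - u) ** 2)

def relative_performance(u_a, u_r, u):
    """Equation (6): percentage of test pairs on which the ASMOD
    estimate u_a is closer to the truth than the raw estimate u_r."""
    return 100.0 * np.mean((u_a - u) ** 2 < (u_r - u) ** 2)

# Monte-Carlo check of equation (7): MSE_r = 2*sigma^2 / dt^2.
rng = np.random.default_rng(1)
sigma, dt, n = 0.01, 0.04, 200_000
n_now, n_prev = rng.normal(0, sigma, n), rng.normal(0, sigma, n)
raw_error = (n_now - n_prev) / dt       # error of the raw finite difference
print(np.mean(raw_error ** 2), 2 * sigma**2 / dt**2)  # approximately equal
```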
Performance of a neurofuzzy initialised EKF

This section shows that a neurofuzzy initialised extended Kalman filter (NF-EKF) is superior to an EKF initialised using raw observations when applied to the estimation of the recession rate $\rho(t) = \dot{y}(t)/y(t)$ of a tracked feature. The EKF estimated both $y$ and $\dot{y}$ while the feature moved with constant velocity in the real world (known as the F case). The EKF's state $\mathbf{x}_F$ was initialised using

$$\mathbf{x}_F(0) = \begin{bmatrix} y(0) \\ \dfrac{y(0) - y(-1)}{\Delta t} \end{bmatrix} \qquad (8)$$

It is proposed that neurofuzzy estimation (ASMOD) can provide more accurate initial estimates than those given by equation (8), i.e.

$$\mathbf{x}_F(0) = \begin{bmatrix} y_a(0) \\ \dot{y}_a(0) \end{bmatrix} \qquad (9)$$

where $y_a$ is the ASMOD estimate of $y$ and $\dot{y}_a$ is the ASMOD estimate of $\dot{y}$. This estimation problem therefore required an ASMOD with $N$ inputs, $y(t), y(t-1), \ldots, y(t-N+1)$, and two outputs, $y_a(t)$ and $\dot{y}_a(t)$, which can be thought of as two separate ASMODs, one for each output. The overall system diagram of the NF-EKF is shown in Figure 8, where each of the two ASMODs in fact consists of a bank of individual ASMODs (one for the two-input case, two for the three-input case, and so on), the appropriate one being selected while $t < N$.

Figure 8: System diagram of a NF-EKF.

The performance of the constant-velocity (F) EKF was then compared with that of a similar EKF initialised using neurofuzzy estimates (the NF-EKF). Two thousand noisy test trajectories were randomly generated using MATLAB, and a number of performance measures were compared.

Relative performance ($R_n$):

$$R_n(\%) = \frac{100}{S}\sum_{i=1}^{S} c(i), \qquad c(i) = \begin{cases} 1 & \text{if } \mathrm{MSE}_n(i) < \mathrm{MSE}_k(i) \\ 0 & \text{otherwise} \end{cases}, \qquad R_k(\%) = 100 - R_n \qquad (10)$$

where $S$ was the total number of trajectories tested, $R_n$ and $R_k$ were the relative performance of the NF-EKF and the EKF respectively, and $\mathrm{MSE}_n(i)$ and $\mathrm{MSE}_k(i)$ were the MSEs of the $i$th trajectory using the NF-EKF and the EKF respectively.

Convergence: defined as occurring when the estimated output remains consistently within 5% of the true output for $5\Delta t$ seconds. The number of converged estimates is

$$\text{No. of converged estimates } (\%) = \frac{100}{S}\sum_{i=1}^{S} d(i), \qquad d(i) = \begin{cases} 1 & \text{if } t_c(i) < T(i) \\ 0 & \text{otherwise} \end{cases} \qquad (11)$$

where $t_c(i)$ is the time to convergence of the $i$th trajectory and $T(i)$ is the total length (in time) of the $i$th trajectory. This measure reveals how often a filter converges, as a percentage of the total number of trials. Finally,

$$\text{Mean convergence time} = \frac{1}{S_c}\sum_{i=1}^{S} e(i), \qquad e(i) = \begin{cases} t_c(i)/T(i) & \text{if } t_c(i) < T(i) \\ 0 & \text{otherwise} \end{cases} \qquad (12)$$

where $S_c$ is the number of converged trajectories. This measure can be thought of as the mean normalised time to convergence of a filter, taken over the trials in which convergence occurred.
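To make the comparison concrete, here is a hedged sketch in which a linear constant-velocity Kalman filter stands in for the paper's EKF, initialised either from the first two raw observations (equation (8)) or from an artificially accurate estimate standing in for the ASMOD estimate of equation (9); the trajectory parameters, noise levels and covariances are all assumptions:

```python
import numpy as np

def run_kf(z, x0, dt=0.04, sigma=0.01, q=1e-4):
    """Constant-velocity Kalman filter over observations z of y(t).
    State x = [y, ydot]; returns the filtered states at every step."""
    F = np.array([[1.0, dt], [0.0, 1.0]])  # constant-velocity model
    H = np.array([[1.0, 0.0]])             # only y is observed
    Q = q * np.eye(2)                      # assumed process noise
    Rm = np.array([[sigma ** 2]])          # measurement noise
    x, P, out = np.asarray(x0, float), np.eye(2), []
    for zk in z:
        x, P = F @ x, F @ P @ F.T + Q                  # predict
        K = P @ H.T @ np.linalg.inv(H @ P @ H.T + Rm)  # Kalman gain
        x = x + K @ (np.array([zk]) - H @ x)           # update
        P = (np.eye(2) - K @ H) @ P
        out.append(x.copy())
    return np.array(out)

# One noisy trajectory of y(t) = f*h / (lam + vz*t) (assumed values).
rng = np.random.default_rng(2)
dt, sigma = 0.04, 0.01
t = np.arange(100) * dt
y_true = 1.0 / (2.0 + 0.5 * t)   # f*h = 1, lam = 2, vz = 0.5
z = y_true + rng.normal(0, sigma, t.size)

x0_raw = [z[1], (z[1] - z[0]) / dt]                 # equation (8): raw init
x0_nf = [y_true[1], (y_true[1] - y_true[0]) / dt]   # equation (9) stand-in

for x0 in (x0_raw, x0_nf):
    est = run_kf(z[2:], x0)
    print(np.mean((est[:, 0] - y_true[2:]) ** 2))
# Repeating this over many random trajectories and counting the MSE "wins"
# of each filter gives the relative performance R_n of equation (10).
```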

The results from the trajectory runs are listed in Table 1 and show that the NF-EKF converged significantly faster than the EKF, and hence produced more accurate estimates most of the time. Figure 9 shows the results from a single trajectory run: the more accurate initial estimate, and the consequent shorter convergence time, of the NF-EKF are clearly visible. The results also imply that improved state initialisation can prevent filter divergence in some cases, as shown graphically in Figure 10.

Performance measure              EKF    NF-EKF
Relative performance (%)         .      78.
No. of converged estimates (%)   .      78.3
Mean convergence time            .7     .8

Table 1: EKF and NF-EKF performance comparison.

Figure 9: Shorter convergence time produced by the NF-EKF.

Figure 10: The NF-EKF prevents repeated filter divergence.

Conclusions

It has been shown that an EKF may be improved by initialising its state vector with estimates obtained from a finite-history neurofuzzy estimator (ASMOD). The ASMOD estimator uses a simple input-output mapping and can be thought of as an intelligent look-up table. Temporal information can only be utilised through large numbers of inputs (e.g. to utilise the information contained in ten observations requires a neurofuzzy estimator with ten inputs). Very large numbers of inputs are not practical, and hence finite-history neurofuzzy estimators cannot make use of all the information contained in a long time series. The Kalman filter, by contrast, is a recursive algorithm: by combining measurement information and model predictions recursively over time, it is able to give accurate state estimates. A recursive neurofuzzy estimator that exploits the positive properties of both the finite-history neurofuzzy estimator and the Kalman filter is under development. In this estimator, model predictions are made using a finite-history neurofuzzy estimator (in the same way as the ASMOD algorithm was used in this work), but measurement information is combined with the neurofuzzy model predictions recursively, in a manner similar to that of the Kalman filter.

Acknowledgements

The authors acknowledge the financial support of Lucas Aerospace, British Aerospace and the EPSRC.

References

[1] Abdelnour G., Chand S., Chiu S. and Kido T., 1993, On-Line Detection and Correction of Kalman Filter Divergence by Fuzzy Logic, Proc. of the American Control Conf., San Francisco, CA, 835-839.

[2] An P. E., Aslam-Mir S., Brown M. and Harris C. J., 199-, Aspects of the CMAC and its Application to High Dimensional Aerospace Modelling Problems, Proc. of the IEE Int. Conf. on Control, Coventry, UK, -7.

[3] Brown M. and Harris C.
J., 1994, Neurofuzzy Adaptive Modelling and Control, Prentice Hall, Hemel Hempstead.

[4] Doyle R. S. and Harris C. J., 199-, Multi-Sensor Data Fusion for Obstacle Tracking using Neurofuzzy Estimation Algorithms, SPIE Optical Engineering in Aerospace Sensing, Orlando, Florida, 33.

[5] Fitzgerald R. J., 1971, Divergence of the Kalman Filter, IEEE Trans. on Automatic Control, AC-16, 6, 736-747.

[6] Haykin S., 1994, Neural Networks: A Comprehensive Foundation, IEEE Press.

[7] Iu S. and Wohn K., 199-, Recovery of 3D Motion of a Single Particle, Pattern Recognition, 3, -5.

[8] Kavli T., 1993, ASMOD: An Algorithm for Adaptive Spline Modelling of Observation Data, Int. Journal of Control, 58, 97-98.

[9] LaScala B. and Bitmead B., 199-, Design of an Extended Kalman Filter Frequency Tracker, submitted to IEEE Trans. on Signal Processing.

[10] Mirmehdi M. and Ellis T. J., 1993, Parallel Approach to Tracking Edge Segments in Dynamic Scenes, Image and Vision Computing, 35-8.

[11] Moore C. G., Harris C. J. and Rogers E., 1993, Utilising Fuzzy Models in the Design of Estimators and Predictors: An Agile Target Tracking Example, Second IEEE Int. Conf. on Fuzzy Systems, San Francisco, CA, 79-8.