LECTURE NOTES
Kalman Filtering with Newton's Method
JEFFREY HUMPHERYS and JEREMY WEST
Digital Object Identifier 10.1109/MCS.2010.938485

The Kalman filter is arguably one of the most notable algorithms of the 20th century [1]. In this article, we derive the Kalman filter using Newton's method for root finding. We show that the one-step Kalman filter is given by a single iteration of Newton's method on the gradient of a quadratic objective function, with a judiciously chosen initial guess. This derivation is different from those found in standard texts [2]-[6], since it provides a more general framework for recursive state estimation. Although not presented here, this approach can also be used to derive the extended Kalman filter for nonlinear systems.

BACKGROUND

Linear Estimation
Suppose that data are generated by the linear model

$b = Ax + e,$   (1)

where $A$ is a known $m \times n$ matrix of rank $n$, $e$ is an $m$-dimensional random variable with zero mean and known positive-definite covariance $Q = E[ee^T]$, denoted $Q > 0$, and $b \in \mathbb{R}^m$ represents known, but inexact, measurements with errors given by $e$. The vector $x \in \mathbb{R}^n$ is the set of parameters to be estimated. The estimator $\hat{x}$ is linear if $\hat{x} = Kb$ for some $n \times m$ matrix $K$. We say that the estimator $\hat{x}$ is unbiased if $E[\hat{x}] = x$. Since $E[\hat{x}] = E[Kb] = E[K(Ax + e)] = KAx + KE[e] = KAx$, it follows that the linear estimator $\hat{x} = Kb$ is unbiased if and only if $KA = I$. When $m > n$, many choices of $K$ satisfy this condition. We want to find the mean-squared-error-minimizing linear unbiased estimator, or, in other words, the minimizer of the quantity $E[\|\hat{x} - x\|^2]$ over all matrices $K$ that satisfy the constraint $KA = I$. In "What Is the Best Linear Unbiased Estimator?", we prove the Gauss-Markov theorem, which states that the optimal $K$ is given by $K = (A^T Q^{-1} A)^{-1} A^T Q^{-1}$, and thus the corresponding estimator is given by

$\hat{x} = (A^T Q^{-1} A)^{-1} A^T Q^{-1} b,$   (2)

whose covariance is given by

$P = E[(\hat{x} - x)(\hat{x} - x)^T] = (A^T Q^{-1} A)^{-1}.$   (3)

In addition, the Gauss-Markov theorem states that every other linear unbiased estimator of $x$ has larger covariance than (3). Thus we call (2) the best linear unbiased estimator of (1).

Weighted Least Squares
It is not a coincidence that the estimator in (2) has the same form as the solution of the weighted least squares problem

$\hat{x} = \arg\min_{x \in \mathbb{R}^n} \tfrac{1}{2}\|b - Ax\|^2_W,$   (4)

where $W > 0$ is a problem-specific weighting matrix and $\|\cdot\|_W$ denotes the weighted 2-norm defined by $\|x\|^2_W \triangleq x^T W x$. The factor $1/2$ is not necessary but makes calculations involving the derivative simpler; scaling the objective function by a constant does not affect the minimizer. Therefore, consider the positive-definite quadratic objective function

$J(x) = \tfrac{1}{2}\|b - Ax\|^2_W = \tfrac{1}{2} x^T A^T W A x - x^T A^T W b + \tfrac{1}{2} b^T W b.$   (5)

Its minimizer $\hat{x}$ satisfies $\nabla J(\hat{x}) = 0$, which yields the normal equations

$A^T W A \hat{x} = A^T W b.$   (6)

Since $A$ is assumed to have full column rank, the weighted Gramian $A^T W A$ is nonsingular, and thus (6) has the solution

$\hat{x} = (A^T W A)^{-1} A^T W b.$   (7)

Hence, comparing (2) with (7), we see that the best linear unbiased estimator of (1) is found by solving the weighted least squares problem (4) with weight $W = Q^{-1} > 0$. Also, from (3), the inverse weighted Gramian $P = (A^T Q^{-1} A)^{-1}$ is the covariance of this estimator.
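To make the connection between (2) and (7) concrete, the following is a minimal numerical sketch, assuming NumPy; the dimensions, random data, and variable names are illustrative and not taken from the article. It forms the weighted least squares estimator with weight $W = Q^{-1}$ by solving the normal equations (6), and computes the covariance (3).

```python
# Minimal sketch of (2)/(7): the best linear unbiased estimator as a weighted
# least squares solution with W = Q^{-1}. Data and names are illustrative.
import numpy as np

rng = np.random.default_rng(0)
m, n = 8, 3
A = rng.standard_normal((m, n))            # known m x n model matrix, rank n
x_true = rng.standard_normal(n)            # parameters to be estimated
L = rng.standard_normal((m, m))
Q = L @ L.T + m * np.eye(m)                # known positive-definite noise covariance
e = rng.multivariate_normal(np.zeros(m), Q)
b = A @ x_true + e                         # noisy measurements, model (1)

W = np.linalg.inv(Q)                       # weight W = Q^{-1}
# Normal equations (6): solve (A^T W A) x_hat = A^T W b rather than forming an inverse.
x_hat = np.linalg.solve(A.T @ W @ A, A.T @ W @ b)
P = np.linalg.inv(A.T @ W @ A)             # estimator covariance, equation (3)

print("estimate:", x_hat)
print("covariance diagonal:", np.diag(P))
```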

What Is the Best Linear Unbiased Estimator?
Suppose that data are generated by the linear model $b = Ax + e$, where $A$ is a known $m \times n$ matrix of rank $n$, and $e$ is an $m$-dimensional random variable with zero mean and covariance $Q > 0$. Recall that a linear estimator $\hat{x} = Kb$ of $x$ is unbiased if and only if $KA = I$. If $n < m$, then there are many possible choices of $K$. We want the matrix $K$ that minimizes the mean-squared error $E[\|\hat{x} - x\|^2]$. Note, however, that

$\|\hat{x} - x\|^2 = \|Kb - x\|^2 = \|KAx + Ke - x\|^2 = \|Ke\|^2 = e^T K^T K e.$

Moreover, since $e^T K^T K e$ is a scalar, we have $e^T K^T K e = \mathrm{tr}(e^T K^T K e) = \mathrm{tr}(K e e^T K^T)$. It follows, by the linearity of expectation, that

$E[\|\hat{x} - x\|^2] = \mathrm{tr}(K E[ee^T] K^T) = \mathrm{tr}(KQK^T).$

Thus the mean-squared-error-minimizing linear unbiased estimator is found by minimizing $\mathrm{tr}(KQK^T)$ subject to the constraint $KA = I$, where $K \in \mathbb{R}^{n \times m}$. This is a convex optimization problem, that is, the objective is a convex function and the feasible set is a convex set. Hence, to find the unique minimizer, it suffices to find the unique critical point of the Lagrangian

$L(K, \lambda) = \mathrm{tr}(KQK^T) - \mathrm{tr}(\lambda^T (KA - I)),$

where the matrix Lagrange multiplier $\lambda \in \mathbb{R}^{n \times n}$ corresponds to the $n^2$ constraints $KA = I$. Thus the minimum occurs when

$0 = \nabla_K L(K, \lambda) = \frac{\partial}{\partial K}\,\mathrm{tr}(KQK^T - \lambda^T(KA - I)) = KQ^T + KQ - \lambda A^T = 2KQ - \lambda A^T,$

that is, when $K = \tfrac{1}{2}\lambda A^T Q^{-1}$. Since $KA = I$, we have $\lambda = 2(A^T Q^{-1} A)^{-1}$. Therefore, $K = (A^T Q^{-1} A)^{-1} A^T Q^{-1}$, and the optimal estimator is given by $\hat{x} = (A^T Q^{-1} A)^{-1} A^T Q^{-1} b$. To compute the covariance of the estimator, we expand $\hat{x}$ as

$\hat{x} = (A^T Q^{-1} A)^{-1} A^T Q^{-1}(Ax + e) = x + (A^T Q^{-1} A)^{-1} A^T Q^{-1} e.$

Thus the covariance is given by

$P = E[(\hat{x} - x)(\hat{x} - x)^T] = (A^T Q^{-1} A)^{-1} A^T Q^{-1} E[ee^T] Q^{-1} A (A^T Q^{-1} A)^{-1} = (A^T Q^{-1} A)^{-1}.$

We now show that every linear unbiased estimator other than $\hat{x}$ produces a larger covariance. If $\hat{x}_L = Lb$ is a linear unbiased estimator of the linear model, then, since $LA = I = KA$, there exists a matrix $D$ such that $DA = 0$ and $L = K + D$. The covariance of $\hat{x}_L$ is given by

$E[(\hat{x}_L - x)(\hat{x}_L - x)^T] = E[(K + D)ee^T(K^T + D^T)] = (K + D)Q(K^T + D^T) = KQK^T + DQD^T + KQD^T + (KQD^T)^T.$

Note, however, that $KQD^T = (A^T Q^{-1} A)^{-1} A^T Q^{-1} Q D^T = (A^T Q^{-1} A)^{-1} (DA)^T = 0$. Thus

$E[(\hat{x}_L - x)(\hat{x}_L - x)^T] = KQK^T + DQD^T \ge KQK^T.$

Therefore, $\hat{x}$ is the best linear unbiased estimator of $x$.

Newton's Method
Let $f : \mathbb{R}^n \to \mathbb{R}^n$ be a smooth function. Assume that $f(\hat{x}) = 0$ and that the Jacobian matrix $Df(x)$ is nonsingular in a neighborhood of $\hat{x}$. If the initial guess $x_0$ is sufficiently close to $\hat{x}$, then Newton's method, which is given by

$x_{k+1} = x_k - Df(x_k)^{-1} f(x_k),$   (8)

produces a sequence $\{x_k\}_{k=0}^{\infty}$ that converges quadratically to $\hat{x}$, that is, there exists a constant $c > 0$ such that, for every positive integer $k$, $\|x_{k+1} - \hat{x}\| \le c\,\|x_k - \hat{x}\|^2$.

Newton's method is widely used in optimization problems since the local extrema of an objective function $J(x)$ can be obtained by finding roots of its gradient $f(x) \triangleq \nabla J(x)$. Hence, from (8) we have

$x_{k+1} = x_k - D^2 J(x_k)^{-1} \nabla J(x_k),$   (9)

where $D^2 J(x_k)$ denotes the Hessian of $J$ evaluated at $x_k$. This iterative process converges, likewise at a quadratic rate, to the isolated local minimizer $\hat{x}$ of $J$ whenever the starting point $x_0$ is sufficiently close to $\hat{x}$ and the Hessian $D^2 J(x)$ is positive definite in a neighborhood of $\hat{x}$.

In practice, we do not invert the Hessian $D^2 J(x_k)$ when computing (9). Indeed, by a factor of roughly two, we can more efficiently compute the Newton update by solving the linear system $D^2 J(x_k)\,y_k = -\nabla J(x_k)$, for example by Gaussian elimination, and then setting $x_{k+1} = x_k + y_k$; see [7] for details.
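As a concrete illustration of (9) and the practical note above, the following is a small sketch, assuming NumPy; the quartic test objective, tolerance, and iteration cap are illustrative choices. Each step is computed by solving the linear system $D^2 J(x_k)\,y_k = -\nabla J(x_k)$ rather than by inverting the Hessian.

```python
# Sketch of the Newton iteration (9) for minimizing a smooth objective J,
# solving D2J(x_k) y_k = -grad J(x_k) instead of inverting the Hessian.
# The test objective J(x) = 0.25*(x^T x)^2 + 0.5*x^T x is an illustrative choice.
import numpy as np

def grad_J(x):
    # gradient of J(x) = 0.25*(x^T x)^2 + 0.5*x^T x
    return (x @ x) * x + x

def hess_J(x):
    # Hessian of the same objective: (x^T x) I + 2 x x^T + I, positive definite everywhere
    n = x.size
    return (x @ x) * np.eye(n) + 2.0 * np.outer(x, x) + np.eye(n)

x = np.array([2.0, -1.0, 0.5])          # starting point
for _ in range(20):
    g = grad_J(x)
    if np.linalg.norm(g) < 1e-12:        # stop when the gradient (the root of f = grad J) is tiny
        break
    y = np.linalg.solve(hess_J(x), -g)   # Newton step from the linear system
    x = x + y

print("minimizer estimate:", x)          # converges quadratically to the origin here
```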

To find the minimizer of a positive-definite quadratic form (5), it is not necessary to use an iterative scheme, such as Newton's method, since the gradient is affine in $x$ and the unique minimizer can be found by solving the normal equations (6) directly. However, positive-definite quadratic forms have the special property that Newton's method (9) converges in a single step for each starting point $x \in \mathbb{R}^n$, that is, for all $x \in \mathbb{R}^n$, the minimizer $\hat{x}$ satisfies

$\hat{x} = x - (D^2 J(x))^{-1} \nabla J(x).$   (10)

This observation provides a key insight used to derive the Kalman filter below. Note also that the Hessian of the quadratic form (5) is $D^2 J(x) = A^T W A$, which is constant in $x$. Thus, throughout, we denote the Hessian as $D^2 J$, dropping the explicit dependence on $x$.

Recursive Least Squares
We now consider the least squares problem (4) in the case of recursive implementation. At each time $k$, we seek the best linear unbiased estimator $\hat{x}_k$ of $x$ for the linear model

$\mathbf{b}_k = \mathbf{A}_k x + \mathbf{e}_k,$   (11)

where

$\mathbf{b}_k = \begin{bmatrix} b_1 \\ \vdots \\ b_k \end{bmatrix}, \quad \mathbf{A}_k = \begin{bmatrix} A_1 \\ \vdots \\ A_k \end{bmatrix}, \quad \text{and} \quad \mathbf{e}_k = \begin{bmatrix} v_1 \\ \vdots \\ v_k \end{bmatrix}.$

We assume, for each $j = 1, \ldots, k$, that the noise terms $v_j$ have zero mean and are uncorrelated, with $E[v_j v_j^T] = R_j > 0$. We also assume that $A_1$ has full column rank, and thus each $\mathbf{A}_k$ has full column rank. The estimate is found by solving the weighted least squares problem

$\hat{x}_k = \arg\min_{x} \tfrac{1}{2}\|\mathbf{b}_k - \mathbf{A}_k x\|^2_{\mathbf{W}_k},$   (12)

where the weight is given by the inverse covariance $\mathbf{W}_k = E[\mathbf{e}_k \mathbf{e}_k^T]^{-1} = \mathrm{diag}(R_1, \ldots, R_k)^{-1}$. As time marches forward, the number of rows of the linear system increases, thus altering the least squares solution. We now show that it is possible to use the least squares solution $\hat{x}_{k-1}$ at time $k-1$ to efficiently compute the least squares solution $\hat{x}_k$ at time $k$. This process is the recursive least squares algorithm. We begin by rewriting (12) as

$J_k(x) = \tfrac{1}{2}\sum_{i=1}^{k} \|b_i - A_i x\|^2_{R_i^{-1}}.$   (13)

The positive-definite quadratic form (13) can be expressed recursively as

$J_k(x) = J_{k-1}(x) + \tfrac{1}{2}\|b_k - A_k x\|^2_{R_k^{-1}}.$

The gradient and Hessian of $J_k$ are given by

$\nabla J_k(x) = \nabla J_{k-1}(x) + A_k^T R_k^{-1}(A_k x - b_k)$

and

$D^2 J_k = D^2 J_{k-1} + A_k^T R_k^{-1} A_k,$   (14)

respectively. Since $A_1$ has full column rank, we see that $D^2 J_1 > 0$. From (14), it follows that $D^2 J_k > 0$ for every positive integer $k$, and hence from (10) the minimizer of (13) becomes

$\hat{x}_k = x - (D^2 J_k)^{-1}\bigl(\nabla J_{k-1}(x) + A_k^T R_k^{-1}(A_k x - b_k)\bigr),$   (15)

where the starting point $x \in \mathbb{R}^n$ can be arbitrarily chosen. Since $\nabla J_{k-1}(\hat{x}_{k-1}) = 0$, we set $x = \hat{x}_{k-1}$ in (15). Thus (15) becomes

$\hat{x}_k = \hat{x}_{k-1} - K_k A_k^T R_k^{-1}(A_k \hat{x}_{k-1} - b_k),$

where $K_k \triangleq (D^2 J_k)^{-1}$ is the inverse of the Hessian of $J_k$ and, from (3), represents the covariance of the estimate. Observing from (14) that $K_k^{-1} = K_{k-1}^{-1} + A_k^T R_k^{-1} A_k$ and using Lemma 1 in "Inversion Lemmata," we have

$K_k = (K_{k-1}^{-1} + A_k^T R_k^{-1} A_k)^{-1} = K_{k-1} - K_{k-1} A_k^T (R_k + A_k K_{k-1} A_k^T)^{-1} A_k K_{k-1}.$

Thus, to summarize, the recursive least squares method is

$K_k = K_{k-1} - K_{k-1} A_k^T (R_k + A_k K_{k-1} A_k^T)^{-1} A_k K_{k-1},$
$\hat{x}_k = \hat{x}_{k-1} - K_k A_k^T R_k^{-1}(A_k \hat{x}_{k-1} - b_k).$

At each time $k$, this algorithm gives the best linear unbiased estimator of (11), along with its covariance.
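The two-line recursion above can be implemented directly. The following sketch, assuming NumPy, updates the covariance $K_k$ and the estimate $\hat{x}_k$ as new blocks of rows arrive and compares the result with the true parameters; the simulated data, dimensions, and initialization from a first batch block are illustrative choices.

```python
# Sketch of the recursive least squares update summarized above, assuming NumPy.
# Simulated data, dimensions, and the initial batch step are illustrative choices.
import numpy as np

rng = np.random.default_rng(1)
n, p, steps = 3, 2, 50                     # p rows arrive at each step
x_true = rng.standard_normal(n)

A1 = rng.standard_normal((p + n, n))       # first block chosen tall so it has full column rank
R1 = np.eye(p + n)
b1 = A1 @ x_true + rng.multivariate_normal(np.zeros(p + n), R1)

K = np.linalg.inv(A1.T @ np.linalg.inv(R1) @ A1)   # K_1 = (A_1^T R_1^{-1} A_1)^{-1}
x_hat = K @ A1.T @ np.linalg.inv(R1) @ b1          # batch solution (7) for the first block

for _ in range(steps):
    Ak = rng.standard_normal((p, n))
    Rk = 0.1 * np.eye(p)
    bk = Ak @ x_true + rng.multivariate_normal(np.zeros(p), Rk)
    # Covariance update: K_k = K_{k-1} - K_{k-1} A_k^T (R_k + A_k K_{k-1} A_k^T)^{-1} A_k K_{k-1}
    S = Rk + Ak @ K @ Ak.T                 # only a small p x p system must be solved
    K = K - K @ Ak.T @ np.linalg.solve(S, Ak @ K)
    # Estimate update: x_hat_k = x_hat_{k-1} - K_k A_k^T R_k^{-1} (A_k x_hat_{k-1} - b_k)
    x_hat = x_hat - K @ Ak.T @ np.linalg.solve(Rk, Ak @ x_hat - bk)

print("recursive estimate:", x_hat)
print("true parameters:   ", x_true)
```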

The recursive least squares algorithm allows for efficient updating of the weighted least squares solution. Indeed, if $p$ rows are added during the $k$th step, then the expression $R_k + A_k K_{k-1} A_k^T$ is a $p \times p$ matrix, which is small in size compared to the weighted Gramian $\mathbf{A}_k^T \mathbf{W}_k \mathbf{A}_k$, which is needed, say, if we compute the solution directly using (7).

STATE-SPACE MODELS AND STATE ESTIMATION
Consider the stochastic discrete-time linear dynamic system

$x_{k+1} = F_k x_k + G_k u_k + w_k,$   (16)
$y_k = H_k x_k + v_k,$   (17)

where the variables represent the state $x_k \in \mathbb{R}^n$, the input $u_k \in \mathbb{R}^p$, and the output $y_k \in \mathbb{R}^q$. The terms $w_k$ and $v_k$ are noise processes, which are assumed to have mean zero and be mutually uncorrelated, with known covariances $Q_k > 0$ and $R_k > 0$, respectively. We assume that the system has the stochastic initial state $x_0 = \mu_0 + w_{-1}$, where $\mu_0$ is the mean and $w_{-1}$ is a zero-mean random variable with covariance $E[w_{-1} w_{-1}^T] = Q_{-1} > 0$.

State Estimation
We formulate the state estimation problem, given $m$ known observations $y_1, \ldots, y_m$ and $k$ known inputs $u_0, \ldots, u_{k-1}$, where $k \ge m$. To find the best linear unbiased estimator of the states $x_1, \ldots, x_k$, we begin by writing (16)-(17) as the large linear estimation problem

$\mu_0 = x_0 - w_{-1},$
$G_0 u_0 = x_1 - F_0 x_0 - w_0,$
$y_1 = H_1 x_1 + v_1,$
$\;\;\vdots$
$G_{m-1} u_{m-1} = x_m - F_{m-1} x_{m-1} - w_{m-1},$
$y_m = H_m x_m + v_m,$
$G_m u_m = x_{m+1} - F_m x_m - w_m,$
$\;\;\vdots$
$G_{k-1} u_{k-1} = x_k - F_{k-1} x_{k-1} - w_{k-1}.$

Note that the known measurements, namely, the inputs and outputs, are on the left, whereas the unknown states are on the right, together with the noise terms. The parameters to be estimated in this linear estimation problem are the states $x_0, x_1, \ldots, x_k$, which we write as $z_k = [x_0^T \; \cdots \; x_k^T]^T \in \mathbb{R}^{(k+1)n}$. Then the linear system takes the form

$b_{k|m} = A_{k|m} z_k + e_{k|m},$

where

$e_{k|m} = [\,w_{-1}^T \;\; w_0^T \;\; v_1^T \;\; \cdots \;\; w_{m-1}^T \;\; v_m^T \;\; w_m^T \;\; \cdots \;\; w_{k-1}^T\,]^T$

is a zero-mean random variable whose inverse covariance is the positive-definite block-diagonal matrix

$W_{k|m} = \mathrm{diag}(Q_{-1}, Q_0, R_1, \ldots, Q_{m-1}, R_m, Q_m, \ldots, Q_{k-1})^{-1}.$

Inversion Lemmata
We provide some technical lemmata that are used throughout the article. The first lemma is the Sherman-Morrison-Woodbury formula, which shows how to invert an additively updated matrix when we know the inverse of the original matrix. The second inversion lemma tells us how to invert block matrices. These statements can be verified by direct computation; see also [8, pp. 18-19].

LEMMA 1 (SHERMAN-MORRISON-WOODBURY)
Let $A$ and $C$ be square matrices and $B$ and $D$ be given so that the sum $A + BCD$ is nonsingular. If $A$, $C$, and $C^{-1} + DA^{-1}B$ are also nonsingular, then

$(A + BCD)^{-1} = A^{-1} - A^{-1} B (C^{-1} + D A^{-1} B)^{-1} D A^{-1}.$

LEMMA 2 (SCHUR)
Let $M$ be a square matrix with block form

$M = \begin{bmatrix} A & B \\ C & D \end{bmatrix}.$

If $A$, $D$, $A - BD^{-1}C$, and $D - CA^{-1}B$ are nonsingular, then

$M^{-1} = \begin{bmatrix} (A - BD^{-1}C)^{-1} & -A^{-1}B(D - CA^{-1}B)^{-1} \\ -D^{-1}C(A - BD^{-1}C)^{-1} & (D - CA^{-1}B)^{-1} \end{bmatrix}.$
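Both lemmata are easy to sanity-check numerically. The sketch below, assuming NumPy, verifies the Sherman-Morrison-Woodbury identity of Lemma 1 and the bottom-right block of the block inverse in Lemma 2, which is the block used for $P_k$ later in the derivation; the random symmetric positive-definite test matrices are illustrative choices that guarantee all of the required inverses exist.

```python
# Numerical sanity check of Lemma 1 (Sherman-Morrison-Woodbury) and of the
# bottom-right block of Lemma 2 (Schur), assuming NumPy. The random symmetric
# positive-definite matrices are illustrative choices.
import numpy as np

rng = np.random.default_rng(2)

def spd(n):
    X = rng.standard_normal((n, n))
    return X @ X.T + n * np.eye(n)         # symmetric positive definite by construction

# Lemma 1: (A + B C D)^{-1} = A^{-1} - A^{-1} B (C^{-1} + D A^{-1} B)^{-1} D A^{-1}
A, C = spd(4), spd(2)
B = rng.standard_normal((4, 2))
D = B.T
lhs = np.linalg.inv(A + B @ C @ D)
rhs = np.linalg.inv(A) - np.linalg.inv(A) @ B @ np.linalg.inv(
    np.linalg.inv(C) + D @ np.linalg.inv(A) @ B) @ D @ np.linalg.inv(A)
print("Lemma 1 max error:", np.abs(lhs - rhs).max())

# Lemma 2: the bottom-right block of M^{-1} equals (D - C A^{-1} B)^{-1}
M = spd(5)                                 # all Schur complements of an SPD matrix exist
A2, B2 = M[:3, :3], M[:3, 3:]
C2, D2 = M[3:, :3], M[3:, 3:]
bottom_right = np.linalg.inv(M)[3:, 3:]
schur = np.linalg.inv(D2 - C2 @ np.linalg.inv(A2) @ B2)
print("Lemma 2 max error:", np.abs(bottom_right - schur).max())
```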

We observe that each column of $A_{k|m}$ is a pivot column. Hence, it follows that $A_{k|m}$ has full column rank, and thus the weighted Gramian $A_{k|m}^T W_{k|m} A_{k|m}$ is nonsingular. Thus, the best linear unbiased estimator $\hat{z}_{k|m}$ of $z_k$ is given by the weighted least squares solution

$\hat{z}_{k|m} = (A_{k|m}^T W_{k|m} A_{k|m})^{-1} A_{k|m}^T W_{k|m} b_{k|m}.$

Since the state estimation problem is cast as a linear model, we can equivalently solve the estimation problem by minimizing the positive-definite objective function

$J_{k|m}(z_k) = \tfrac{1}{2}\|A_{k|m} z_k - b_{k|m}\|^2_{W_{k|m}}.$   (18)

We can find its minimizer $\hat{z}_{k|m}$ by performing a single iteration of Newton's method, given by

$\hat{z}_{k|m} = z - (A_{k|m}^T W_{k|m} A_{k|m})^{-1} A_{k|m}^T W_{k|m}(A_{k|m} z - b_{k|m}),$   (19)

where $z \in \mathbb{R}^{(k+1)n}$ is arbitrary. In the following, we set $m = k$ and choose a canonical $z$ that simplifies (19), thus leaving us with the Kalman filter.

Kalman Derivation with Newton's Method
Consider the state estimation problem in the case $m = k$, that is, the number of observations equals the number of inputs. It is of particular interest in applications to determine the current state $x_k$ given the observations $y_1, \ldots, y_k$ and inputs $u_0, \ldots, u_{k-1}$. The Kalman filter gives the best linear unbiased estimator $\hat{x}_{k|k}$ of $x_k$ in terms of the previous state estimator $\hat{x}_{k-1|k-1}$ and the latest data $u_{k-1}$ and $y_k$ up to that point in time. We begin by rewriting (18) as

$J_{k|m}(z_k) = \tfrac{1}{2}\|x_0 - \mu_0\|^2_{Q_{-1}^{-1}} + \tfrac{1}{2}\sum_{i=1}^{m}\|y_i - H_i x_i\|^2_{R_i^{-1}} + \tfrac{1}{2}\sum_{i=1}^{k}\|x_i - F_{i-1} x_{i-1} - G_{i-1} u_{i-1}\|^2_{Q_{i-1}^{-1}}.$   (20)

For convenience, we denote $\hat{z}_{k|k}$ as $\hat{z}_k$, $\hat{x}_{k|k}$ as $\hat{x}_k$, and $J_{k|k}$ as $J_k$. For $m = k$, (20) can be expressed recursively as

$J_k(z_k) = J_{k-1}(z_{k-1}) + \tfrac{1}{2}\|y_k - H_k x_k\|^2_{R_k^{-1}} + \tfrac{1}{2}\|x_k - \overline{F}_{k-1} z_{k-1} - G_{k-1} u_{k-1}\|^2_{Q_{k-1}^{-1}},$

where $\overline{F}_{k-1} = [\,0 \; \cdots \; 0 \; F_{k-1}\,]$. The gradient and Hessian of $J_k$ are given by

$\nabla J_k(z_k) = \begin{bmatrix} \nabla J_{k-1}(z_{k-1}) + \overline{F}_{k-1}^T Q_{k-1}^{-1}(\overline{F}_{k-1} z_{k-1} - x_k + G_{k-1} u_{k-1}) \\[2pt] -Q_{k-1}^{-1}(\overline{F}_{k-1} z_{k-1} - x_k + G_{k-1} u_{k-1}) + H_k^T R_k^{-1}(H_k x_k - y_k) \end{bmatrix}$   (21)

and

$D^2 J_k = \begin{bmatrix} D^2 J_{k-1} + \overline{F}_{k-1}^T Q_{k-1}^{-1} \overline{F}_{k-1} & -\overline{F}_{k-1}^T Q_{k-1}^{-1} \\[2pt] -Q_{k-1}^{-1} \overline{F}_{k-1} & Q_{k-1}^{-1} + H_k^T R_k^{-1} H_k \end{bmatrix},$   (22)

respectively. Since $D^2 J_0 = Q_{-1}^{-1} > 0$, it follows inductively that $D^2 J_k > 0$ for every positive integer $k$. The proof follows from the observation that

$z_k^T D^2 J_k z_k = z_{k-1}^T D^2 J_{k-1} z_{k-1} + \|\overline{F}_{k-1} z_{k-1} - x_k\|^2_{Q_{k-1}^{-1}} + \|H_k x_k\|^2_{R_k^{-1}},$

and thus the right-hand side is nonnegative, being zero only if $z_{k-1} = 0$ and $x_k = 0$, or, equivalently, $z_k = 0$. From one iteration of Newton's method, we have that

$\hat{z}_k = z_k - (D^2 J_k)^{-1} \nabla J_k(z_k)$   (23)

for all $z_k \in \mathbb{R}^{(k+1)n}$. Since $\nabla J_{k-1}(\hat{z}_{k-1}) = 0$ and $\overline{F}_{k-1} \hat{z}_{k-1} = F_{k-1} \hat{x}_{k-1}$, we set

$z_k = \begin{bmatrix} \hat{z}_{k-1} \\ F_{k-1} \hat{x}_{k-1} + G_{k-1} u_{k-1} \end{bmatrix}.$

Thus,

$\nabla J_k(z_k) = \begin{bmatrix} 0 \\ H_k^T R_k^{-1}\bigl[H_k(F_{k-1} \hat{x}_{k-1} + G_{k-1} u_{k-1}) - y_k\bigr] \end{bmatrix},$

and the bottom row of (23) becomes

$\hat{x}_k = F_{k-1} \hat{x}_{k-1} + G_{k-1} u_{k-1} - P_k H_k^T R_k^{-1}\bigl[H_k(F_{k-1} \hat{x}_{k-1} + G_{k-1} u_{k-1}) - y_k\bigr],$

where $P_k$ is the bottom-right block of the inverse Hessian $(D^2 J_k)^{-1}$, which, from Lemma 2 in "Inversion Lemmata," is given by

$P_k = \bigl(Q_{k-1}^{-1} + H_k^T R_k^{-1} H_k - Q_{k-1}^{-1} \overline{F}_{k-1}\,(D^2 J_{k-1} + \overline{F}_{k-1}^T Q_{k-1}^{-1} \overline{F}_{k-1})^{-1}\,\overline{F}_{k-1}^T Q_{k-1}^{-1}\bigr)^{-1}$
$\phantom{P_k} = \bigl[(Q_{k-1} + \overline{F}_{k-1} (D^2 J_{k-1})^{-1} \overline{F}_{k-1}^T)^{-1} + H_k^T R_k^{-1} H_k\bigr]^{-1}$
$\phantom{P_k} = \bigl[(Q_{k-1} + F_{k-1} P_{k-1} F_{k-1}^T)^{-1} + H_k^T R_k^{-1} H_k\bigr]^{-1}.$
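The update for $\hat{x}_k$ together with the recursion for $P_k$ just derived (and summarized in the next paragraph) is already a complete algorithm. The following sketch, assuming NumPy, runs this one-step form on data simulated from (16)-(17); the constant-velocity tracking model, noise levels, and horizon are illustrative choices, not taken from the article.

```python
# Sketch of the one-step Kalman filter derived above, assuming NumPy.
# The constant-velocity model, noise levels, and horizon are illustrative choices.
import numpy as np

rng = np.random.default_rng(3)
dt, steps = 0.1, 100
F = np.array([[1.0, dt], [0.0, 1.0]])      # F_{k-1}: constant-velocity dynamics
G = np.array([[0.5 * dt**2], [dt]])        # G_{k-1}: acceleration input
H = np.array([[1.0, 0.0]])                 # H_k: only the position is measured
Q = 1e-3 * np.eye(2)                       # process noise covariance Q_{k-1}
R = 1e-2 * np.eye(1)                       # measurement noise covariance R_k

mu0 = np.array([0.0, 1.0])
P = 0.1 * np.eye(2)                        # P_0 = Q_{-1}
x_hat = mu0.copy()                         # x_hat_0 = mu_0
x = mu0 + rng.multivariate_normal(np.zeros(2), P)

for k in range(steps):
    u = np.array([np.sin(0.1 * k)])        # known input u_{k-1}
    # simulate the true system (16)-(17)
    x = F @ x + G @ u + rng.multivariate_normal(np.zeros(2), Q)
    y = H @ x + rng.multivariate_normal(np.zeros(1), R)
    # covariance update: P_k = [(Q + F P F^T)^{-1} + H^T R^{-1} H]^{-1}
    P = np.linalg.inv(np.linalg.inv(Q + F @ P @ F.T) + H.T @ np.linalg.solve(R, H))
    # state update: x_hat_k = F x_hat + G u - P_k H^T R^{-1} [H (F x_hat + G u) - y]
    x_pred = F @ x_hat + G @ u
    x_hat = x_pred - P @ H.T @ np.linalg.solve(R, H @ x_pred - y)

print("final true state:    ", x)
print("final state estimate:", x_hat)
```

Note that applying Lemma 1 to the covariance update above recovers the more familiar predict-then-correct form $P_k = M_k - M_k H_k^T (R_k + H_k M_k H_k^T)^{-1} H_k M_k$, where $M_k = Q_{k-1} + F_{k-1} P_{k-1} F_{k-1}^T$ is the predicted covariance.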

Note that $P_k$ is the covariance of the estimator $\hat{x}_k$. To summarize, we have the recursive estimator for $x_k$ given by

$P_k = \bigl[(Q_{k-1} + F_{k-1} P_{k-1} F_{k-1}^T)^{-1} + H_k^T R_k^{-1} H_k\bigr]^{-1},$
$\hat{x}_k = F_{k-1} \hat{x}_{k-1} + G_{k-1} u_{k-1} - P_k H_k^T R_k^{-1}\bigl[H_k(F_{k-1} \hat{x}_{k-1} + G_{k-1} u_{k-1}) - y_k\bigr],$

where $\hat{x}_0 = \mu_0$ and $P_0 = Q_{-1}$. This recursive estimator is the one-step Kalman filter.

CONCLUSIONS
There are a few key observations in the Newton-method derivation of the Kalman filter. The first is that the state estimation problem for the linear system (16)-(17) is a potentially large linear least squares estimation problem. The second observation is that the least squares estimation problem can be solved by minimizing a positive-definite quadratic functional. The third point is that a positive-definite quadratic functional can be minimized with a single iteration of Newton's method for any starting point. The final observation is that, by judiciously choosing the starting point to be the solution of the previous state estimate, several terms cancel and the Newton update simplifies to the one-step Kalman filter. We remark that this approach to state estimation can be generalized to include predictive estimates of future states and smoothed estimates of previous states. We can also use Newton's method to derive the extended Kalman filter.

ACKNOWLEDGMENTS
The authors thank Randall W. Beard, Magnus Egerstedt, C. Shane Reese, and H. Dennis Tolley for useful conversations. This work was supported in part by the National Science Foundation, Division of Mathematical Sciences, grant numbers CAREER 0847074 and CSUMS 063938.

AUTHOR INFORMATION
Jeffrey Humpherys received the Ph.D. in mathematics from Indiana University in 2002 and was an Arnold Ross Assistant Professor at The Ohio State University from 2002 to 2005. Since 2005, he has been at Brigham Young University, where he is currently an associate professor of mathematics. His interests are in applied mathematical analysis and numerical computation.

Jeremy West received B.S. degrees in mathematics and computer science and the M.S. in mathematics in 2009, all from Brigham Young University. He is currently a Ph.D. student in mathematics at the University of Michigan. His interests are in discrete mathematics.

REFERENCES
[1] J. L. Casti, Five More Golden Rules: Knots, Codes, Chaos, and Other Great Theories of 20th-Century Mathematics. New York: Wiley, 2000.
[2] M. S. Grewal and A. P. Andrews, Kalman Filtering: Theory and Practice Using MATLAB, 2nd ed. New York: Wiley, 2001.
[3] D. Simon, Optimal State Estimation: Kalman, H-Infinity, and Nonlinear Approaches. New York: Wiley-Interscience, 2006.
[4] T. Kailath, A. Sayed, and B. Hassibi, Linear Estimation. Englewood Cliffs, NJ: Prentice-Hall, 2000.
[5] R. Brown and P. Hwang, Introduction to Random Signals and Applied Kalman Filtering. New York: Wiley, 1997.
[6] A. Gelb, Applied Optimal Estimation. Cambridge, MA: MIT Press, 1974.
[7] C. T. Kelley, Solving Nonlinear Equations with Newton's Method. Philadelphia, PA: SIAM, 2003.
[8] R. A. Horn and C. R. Johnson, Matrix Analysis. Cambridge, U.K.: Cambridge Univ. Press, 1990.