Introduction to nonlinear LS estimation

R. I. Hartley and A. Zisserman: Multiple View Geometry in Computer Vision. Cambridge University Press, 2nd ed., 2004. After Chapter 5 and Appendix 6. We will use $\mathbf{x}'$ instead of $\mathbf{y}$, and $\mathbf{x}$ as the measured coordinates, like in the book. $n_1 = n$: nonrobust estimation (all $n$ measurements are treated as inliers).

The $n$ measurement vectors taken together give the vector $X \in \mathbb{R}^N$. The unknown parameter vector is $P \in \mathbb{R}^M$,
$$X = f(P) + \epsilon,$$
and $\|\epsilon\|$ has to be minimized, with $M \le N$. The estimated measurements $\hat{X}$ lie on a manifold $S_M$ inside $\mathbb{R}^N$. The estimated parameters $\hat{P} \in \mathbb{R}^M$ satisfy the model $f(\hat{P}) = \hat{X}$, with $f: \mathbb{R}^M \to \mathbb{R}^N$. In the absence of noise, $P, X$ are equal to $P_o, X_o$, the true values. The estimation is equivalent to going from $h(P, X) \neq 0$ to $h(\hat{P}, \hat{X}) = 0$.
The mapping $f(\hat{P})$ is not one-to-one; this ambiguity has to be eliminated. In the parameter space it has only rank $d < M$, where $d$ is the number of essential parameters. The vector $(X - \hat{X})$ has nonzero elements only in an $(N - d)$-dimensional space, normal to the manifold. The vector $(\bar{X} - \hat{X})$, in the tangent space, has nonzero elements only in $d$ dimensions; this is the estimation error. The covariance of $\hat{X}$ along the $d$ tangent directions cannot be smaller than the initial covariance of the measurements $X$ along these directions.
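As a concrete illustration of this decomposition, here is a minimal numpy sketch (not from the book): a toy model `f` without redundant parameters (so $d = M$), where the tangent space of $S_M$ at $\hat{X}$ is the column space of the Jacobian, and the residual is split into its tangent and normal parts. The model and the numbers are invented for illustration only.

```python
import numpy as np

# Toy model f: R^2 -> R^3 (invented for illustration; here d = M = 2).
def f(P):
    return np.array([P[0], P[1], P[0] * P[1]])

def jacobian(P):
    return np.array([[1.0, 0.0],
                     [0.0, 1.0],
                     [P[1], P[0]]])

P_hat = np.array([1.0, 2.0])
X_hat = f(P_hat)
J = jacobian(P_hat)

# Orthogonal projector onto the tangent space (column space of J).
Pt = J @ np.linalg.inv(J.T @ J) @ J.T

X = X_hat + np.array([0.1, -0.05, 0.2])   # a noisy measurement vector
residual = X - X_hat
print("tangent part:", Pt @ residual)           # estimation-error directions
print("normal  part:", residual - Pt @ residual)  # lives in the (N - d) space
```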
Homography between two 2D images. The homogeneous coordinates are $\mathbf{x}_i = [x_i \; y_i \; 1]^\top$ and $\mathbf{x}'_i = [x_{hi} \; y_{hi} \; w_{hi}]^\top$. The $3 \times 3$ matrix $H$ has to be found such that $\hat{\mathbf{x}}'_i = \hat{H}\hat{\mathbf{x}}_i$, $i = 1, \ldots, n$. The measurement space is $\mathbb{R}^N = \mathbb{R}^{4n}$ (four coordinates per point pair). The parameter space is $\mathbb{R}^M = \mathbb{R}^9$. The constraint $\|\mathbf{h}\| = 1$ eliminates the ninth parameter, so the number of essential parameters is eight. In the parameter space, $\hat{P}$ lies on an eight-dimensional manifold, the unit sphere in $\mathbb{R}^9$. The null space is perpendicular to the unit sphere and changes for every $\hat{P}$. The 2D analogue is a circle and a line, with $d = 1 < M = 2$.
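The algebraic (DLT) solution referred to throughout these notes can be sketched in a few lines of numpy. This is the standard construction, with $\|\mathbf{h}\| = 1$ enforced by taking the right singular vector of the smallest singular value; points are passed inhomogeneously (so $w_{hi} = 1$ is assumed), and the coordinate normalization the book recommends before the SVD is omitted for brevity.

```python
import numpy as np

def dlt_homography(x, xp):
    """Algebraic (DLT) estimate of H from n >= 4 correspondences.

    x, xp: (n, 2) arrays of matching inhomogeneous points in the two images.
    Returns the 3x3 H with ||h|| = 1, where xp ~ H x (up to scale).
    """
    n = x.shape[0]
    A = np.zeros((2 * n, 9))
    for i in range(n):
        X = np.array([x[i, 0], x[i, 1], 1.0])
        u, v = xp[i]
        # Two rows per correspondence, from x' x (H x) = 0 with w' = 1.
        A[2 * i, 3:6] = -X
        A[2 * i, 6:9] = v * X
        A[2 * i + 1, 0:3] = X
        A[2 * i + 1, 6:9] = -u * X
    # Linear TLS: h is the right singular vector of the smallest
    # singular value; this enforces the constraint ||h|| = 1.
    _, _, Vt = np.linalg.svd(A)
    return Vt[-1].reshape(3, 3)
```

With `H = dlt_homography(x, xp)`, each mapped point `H @ [x_i, y_i, 1]` matches the corresponding `xp[i]` up to scale.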
General view. Covariance $\Sigma_X$. Taking into account also the estimated measurements, the two vectors in $P$ are
- $\mathbf{a}$, the parameters, of dimension $M$;
- $\mathbf{b}$, the estimated measurements, of dimension $N$;
$$P = \begin{pmatrix} \mathbf{a} \\ \mathbf{b} \end{pmatrix}.$$
Be aware that $P$ now stands for parameters + estimated measurements. Do not confuse it with the previous notation $f(P)$. At each iteration $\hat{X}$ satisfies the model $f(\hat{P})$. The Jacobian has the block structure
$$J = \frac{\partial \hat{X}}{\partial P} = [A \;\; B], \qquad A = \frac{\partial \hat{X}}{\partial \mathbf{a}}, \qquad B = \frac{\partial \hat{X}}{\partial \mathbf{b}},$$
where
- $A$ is an $N \times M$ matrix;
- $B$ is an $N \times N$ matrix.
The quantity to be minimized is $\|X - f(P)\|^2_{\Sigma_X}$. The following $(M+N) \times (M+N)$ matrix will be needed:
$$J^\top \Sigma_X^{-1} J = \begin{bmatrix} A^\top \Sigma_X^{-1} A & A^\top \Sigma_X^{-1} B \\ B^\top \Sigma_X^{-1} A & B^\top \Sigma_X^{-1} B \end{bmatrix}.$$
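A minimal sketch of assembling this block matrix with numpy; `A`, `B`, and `Sigma_X` are random placeholders with the stated shapes (the real blocks would come from the Jacobian of the specific model).

```python
import numpy as np

M, N = 9, 12                              # illustrative sizes only
rng = np.random.default_rng(0)
A = rng.standard_normal((N, M))           # A = d(X_hat)/d(a), N x M
B = rng.standard_normal((N, N))           # B = d(X_hat)/d(b), N x N
Sigma_X = np.eye(N)                       # measurement covariance (placeholder)
W = np.linalg.inv(Sigma_X)

# Block normal matrix J^T Sigma_X^{-1} J for P = [a; b], (M+N) x (M+N).
JtWJ = np.block([[A.T @ W @ A, A.T @ W @ B],
                 [B.T @ W @ A, B.T @ W @ B]])
```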
Nonlinear least squares estimators. Always an iterative solution; the approximation depends on the method.

Gauss-Newton method. The objective function is taken as locally quasi-quadratic in the parameters. The first iteration: assume $\hat{P}_0$ was solved with the algebraic distance,
$$X = f(\hat{P}_0) + \epsilon_0,$$
with $\epsilon_0$ being the residual, and $\hat{X}_0 = f(\hat{P}_0)$ was found by projection onto the manifold $S_M$. The first-order expansion for the next iteration is
$$f(P_1) \approx f(\hat{P}_0) + J_0 (P_1 - \hat{P}_0), \qquad \delta_1 = P_1 - \hat{P}_0,$$
with the Jacobian $J = \partial f / \partial P$ evaluated at $\hat{P}_0$. Then
$$X - f(P_1) \approx X - f(\hat{P}_0) - J_0 \delta_1 = \epsilon_0 - J_0 \delta_1.$$
In $\|\epsilon_0 - J_0 \delta_1\|^2_{\Sigma_X}$, set the gradient in $\delta_1$ equal to zero:
$$J_0^\top \Sigma_X^{-1} J_0 \, \delta_1 = J_0^\top \Sigma_X^{-1} \epsilon_0,$$
from where $\hat{P}_1 = \hat{P}_0 + \delta_1$.
The $(t+1)$-th iteration is executed in a similar way; just replace $0$ with $t$ and $1$ with $(t+1)$. The minimization over the scalar $e^2_{res}$,
$$e^2_{res} = \|X - f(P)\|^2_{\Sigma_X} \approx \|\epsilon_t - J_t \delta_{t+1}\|^2_{\Sigma_X} = (\epsilon_t - J_t \delta_{t+1})^\top \Sigma_X^{-1} (\epsilon_t - J_t \delta_{t+1}) = \epsilon_t^\top \Sigma_X^{-1} \epsilon_t - 2\,\epsilon_t^\top \Sigma_X^{-1} J_t \delta_{t+1} + \delta_{t+1}^\top J_t^\top \Sigma_X^{-1} J_t \delta_{t+1},$$
has a Hessian matrix approximately equal to $J_t^\top \Sigma_X^{-1} J_t$ (up to a factor of two), a symmetric, positive semidefinite matrix. Setting the gradient in $\delta_{t+1}$ equal to zero,
$$J_t^\top \Sigma_X^{-1} J_t \, \delta_{t+1} = J_t^\top \Sigma_X^{-1} \epsilon_t,$$
gives $\delta_{t+1}$, and
$$\hat{P}_{t+1} = \hat{P}_t + \delta_{t+1}$$
for the next iteration. This approach is called the Gauss-Newton method. It has an iterative solution and in general converges to a local minimum. It is strongly dependent on the initial estimate of the parameters.
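Collecting the steps above, a minimal Gauss-Newton loop might look as follows; `f`, `jac`, and the convergence test on $\|\delta\|$ are generic placeholders, and $J^\top \Sigma_X^{-1} J$ is assumed nonsingular (otherwise a least squares solve would be used).

```python
import numpy as np

def gauss_newton(f, jac, X, Sigma_X, P0, n_iter=20, tol=1e-10):
    """Minimal Gauss-Newton sketch for min ||X - f(P)||^2_{Sigma_X}.

    f: model R^M -> R^N; jac: Jacobian df/dP at P (N x M);
    P0: initial estimate, e.g. from the algebraic (linear TLS) solution.
    """
    W = np.linalg.inv(Sigma_X)
    P = P0.astype(float).copy()
    for _ in range(n_iter):
        eps = X - f(P)                     # residual at the current estimate
        J = jac(P)
        # Normal equations: J^T W J delta = J^T W eps
        delta = np.linalg.solve(J.T @ W @ J, J.T @ W @ eps)
        P = P + delta
        if np.linalg.norm(delta) < tol:    # stop when the update is negligible
            break
    return P
```

The toy `f` and `jacobian` from the earlier sketch could be passed in directly, with `P0` taken from the algebraic solution.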
Gradient descent method.
$$e^2_{res} = \|X - f(P)\|^2_{\Sigma_X} = \epsilon^\top \Sigma_X^{-1} \epsilon, \qquad \epsilon = X - f(P).$$
The steepest descent method updates the parameters in the downhill direction, using the negative of the gradient of the objective function. The gradient is
$$\frac{\partial e^2_{res}}{\partial P} = -2 \left[ \frac{\partial f(\hat{P}_t)}{\partial P} \right]^\top \Sigma_X^{-1} \epsilon_t = -2 J_t^\top \Sigma_X^{-1} \epsilon_t.$$
The length of the step $\gamma_t$ is found by line search so that $\|X - f(P_{t+1})\|^2_{\Sigma_X}$ is quasi-minimum for iteration $(t+1)$:
$$\delta_{t+1} = \gamma_t J_t^\top \Sigma_X^{-1} \epsilon_t, \qquad \hat{P}_{t+1} = \hat{P}_t + \delta_{t+1}.$$
Similar methods, like conjugate gradient, also exist. The initial estimate is very important. It converges slowly to a local minimum.
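The corresponding steepest-descent update, sketched below with a simple backtracking rule standing in for the line search (the notes do not prescribe a specific line-search scheme; backtracking is one common choice).

```python
import numpy as np

def gradient_descent(f, jac, X, Sigma_X, P0, n_iter=100):
    """Minimal steepest-descent sketch with backtracking line search."""
    W = np.linalg.inv(Sigma_X)
    cost = lambda P: (X - f(P)) @ W @ (X - f(P))
    P = P0.astype(float).copy()
    for _ in range(n_iter):
        eps = X - f(P)
        step = jac(P).T @ W @ eps          # downhill direction: -1/2 gradient
        gamma = 1.0
        # Backtracking: shrink gamma until the cost actually decreases.
        while cost(P + gamma * step) > cost(P) and gamma > 1e-12:
            gamma *= 0.5
        P = P + gamma * step
    return P
```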
Different distances. The sum of squared Mahalanobis distances is the cost function. The weights come from the full-rank $N \times N$ covariance matrix of the measurements, $\Sigma_X$:
$$e^2_{res} = \|X - \hat{X}\|^2_{\Sigma_X} = (X - \hat{X})^\top \Sigma_X^{-1} (X - \hat{X}) = \sum_{i=1}^{n} (\mathbf{x}_i - \hat{\mathbf{x}}_i)^\top \Sigma_{\mathbf{x}_i}^{-1} (\mathbf{x}_i - \hat{\mathbf{x}}_i), \qquad f(\hat{P}) = \hat{X}.$$
In the homography case, $\hat{\mathbf{x}}'_i = \hat{H} \hat{\mathbf{x}}_i$ is satisfied. This is a geometric distance $d_{geom}$, the reprojection error, with $M + N$ unknowns: the parameters and the estimated measurements. It requires a nonlinear estimation. We saw before the algebraic distance $d_{alg}$, with $M$ unknowns, the parameters. It is solved by linear TLS; the estimated measurements are only nuisance parameters. It can serve as the initial solution for the geometric distance.
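Evaluating the geometric cost is straightforward once the estimated measurements are available; a minimal sketch, assuming the per-point covariances $\Sigma_{\mathbf{x}_i}$ are the diagonal blocks of $\Sigma_X$ as in the sum above.

```python
import numpy as np

def reprojection_cost(x_list, x_hat_list, Sigma_list):
    """Sum of squared Mahalanobis distances between the measured points
    x_i and the estimated points x_hat_i (the e^2_res above)."""
    e2 = 0.0
    for x, x_hat, Sigma in zip(x_list, x_hat_list, Sigma_list):
        r = x - x_hat
        e2 += r @ np.linalg.inv(Sigma) @ r
    return e2
```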