Essential Matrix Estimation via Newton-type Methods


Uwe Helmke¹, Knut Hüper², Pei Yean Lee³, John Moore³

Abstract — In this paper camera parameters are assumed to be known, and a novel approach for essential matrix estimation is presented. We estimate the essential matrix from point correspondences between a stereo image pair. The technical approach we take is a generalization of the classical Newton method. It is well known that the set of essential matrices forms a smooth manifold, and it is natural to minimise a suitable cost function whose global minimum is the essential matrix we are looking for. In this paper we present several algorithms generalising the classical Newton method, in the sense that (i) one of our methods can be considered as an intrinsic Newton method operating on the Riemannian manifold consisting of all essential matrices, while (ii) the other two methods approximate the first one but are more efficient from a numerical point of view. Understanding the algorithms requires a careful analysis of the underlying geometry of the problem.

Keywords — Essential matrix, stereo vision, motion and structure recovery, 2-jet approximation, Newton method, quadratic convergence.

I. Introduction

Epipolar geometry has important applications in 3D computer vision. These include determining the 3D motion and estimating the structure of an object moving relative to a camera, estimating the relative orientation and location of two cameras observing the same 3D objects, and camera calibration. Robust and accurate computation of the fundamental or essential matrix is crucial in all of these applications.

In this paper we assume the camera parameters are known and present a new approach for essential matrix estimation. The new algorithm can be viewed as a Gauss-on-manifold method and is an approximation to a more sophisticated, locally quadratically convergent Gauss-Newton-on-manifold method. Each algorithmic step of our algorithms consists of a two-stage process. The first stage computes the solution of a least-squares problem in Euclidean space, giving an iterate outside the manifold, which is considered as an embedded submanifold. The result of this optimisation procedure is then projected back onto the manifold. Understanding the algorithms requires a careful analysis of the underlying geometry of the problem. We (i) compute the fixed points of our algorithms, (ii) discuss the differentiability properties of our methods, and (iii) present an analysis establishing local quadratic convergence in all three cases.

II. Epipolar Geometry and the Essential Manifold

A. Epipolar geometry

Two images of the same scene are related by epipolar geometry, as illustrated in Figure 1. The images can be taken by two cameras, or by one mobile camera at two different positions centered at $C$ and $C'$. Given an object point $M$ and its two-dimensional projections $\hat m$ and $\hat m'$ on the two image planes, the three points define an epipolar plane $\Pi$, which intersects the image planes $I$ and $I'$ in the epipolar lines $l_{\hat m}$ and $l_{\hat m'}$. The images of the camera centers $C, C'$ are captured on the image planes at the epipoles $e, e'$, respectively.
¹ Department of Mathematics, University of Würzburg, D-97074 Würzburg, Germany. Email: helmke@mathematik.uni-wuerzburg.de
² National ICT Australia Ltd, Locked Bag 8001, Canberra ACT 2601, Australia. Email: knut.hueper@nicta.com.au
³ Department of Systems Engineering, RSISE, Australian National University, Canberra ACT 0200, Australia, and National ICT Australia Ltd, Locked Bag 8001, Canberra ACT 2601, Australia. Email: {peiyean,john.moore}@syseng.anu.edu.au

Fig. 1. The epipolar geometry. [Figure omitted: object point $M$, camera centers $C, C'$, image planes $I, I'$, image points $m, m'$, epipolar lines $l, l'$, epipoles $e, e'$, and relative pose $(R, T)$.]

Algebraically, epipolar geometry is represented by a fundamental matrix $F$, via

$$m'^\top F m = 0, \qquad (1)$$

where $m = [u\ v\ 1]^\top$ and $m' = [u'\ v'\ 1]^\top$. The fundamental matrix is a $(3\times 3)$-matrix of rank two. It is used to establish correspondences between two uncalibrated images. Here the image points $m$ and $m'$ are described in pixel image coordinates, and for now we assume that the image data are noise free. When the camera calibration matrices $K$ and $K'$ are known, the image pair $m, m'$ can be expressed in normalized image coordinates as $\bar m = K^{-1}m$, $\bar m' = K'^{-1}m'$, and (1) can be reformulated in terms of an essential matrix $E$ as

$$\bar m'^\top E\, \bar m = 0, \qquad (2)$$

where

$$E = K'^\top F K. \qquad (3)$$

From (1) and (2) it is clear that the essential matrix $E$ plays the role of the fundamental matrix $F$ when the known image points are expressed in normalized coordinates.

B. Essential manifold

To set up a notational framework for our results, we review in this section basic facts on the geometry of the essential matrix; see [1], [3], [4]. For simplicity of the subsequent analysis, however, we present these results using the terminology of Lie groups and Lie algebras. Let $O_3$ denote the Lie group of real $(3\times 3)$-orthogonal matrices (determinant equal to plus or minus one), let $SO_3$ denote the Lie group of real $(3\times 3)$-orthogonal matrices with determinant equal to one, and let $\mathfrak{so}_3$ denote the corresponding Lie algebra, i.e., the set of real $(3\times 3)$-skew-symmetric matrices

$$\mathfrak{so}_3 := \{ X \in \mathbb{R}^{3\times 3} \mid X^\top = -X \}.$$

Let $\mathbb{RP}^2$ denote the real projective plane, i.e. the set of lines through the origin in $\mathbb{R}^3$. Recall that an essential matrix is a real $(3\times 3)$-matrix in factored form

$$E = \Omega\,\Theta, \qquad (4)$$

with $\Omega \in \mathfrak{so}_3$ and $\Theta \in SO_3$.
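To make these objects concrete, here is a minimal numpy sketch (ours, not the authors') that builds an essential matrix $E = \Omega\Theta$ from a hypothetical relative pose and verifies the epipolar constraint (2) on a synthetic, noise-free correspondence. All numeric values are illustrative assumptions, and the coordinate-transfer convention $X' = \Theta X + \omega$ is one common choice compatible with $m'^\top E m = 0$.

```python
import numpy as np
from scipy.linalg import expm

def hat(w):
    """Skew-symmetric matrix with hat(w) @ v == np.cross(w, v)."""
    return np.array([[0.0, -w[2], w[1]],
                     [w[2], 0.0, -w[0]],
                     [-w[1], w[0], 0.0]])

Theta = expm(hat(np.array([0.05, -0.10, 0.02])))  # rotation between cameras
omega = np.array([1.0, 0.2, -0.3])                # translation between cameras
E = hat(omega) @ Theta                            # E = Omega * Theta, Eq. (4)

# Project a 3D point into both (normalized) cameras.
X = np.array([0.3, -0.2, 4.0])
m = X / X[2]                 # left normalized image point
X2 = Theta @ X + omega       # same point in the right camera frame
m2 = X2 / X2[2]              # right normalized image point
print(m2 @ E @ m)            # ~1e-16: epipolar constraint of Eq. (2)
```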

Choosing a basis of $\mathfrak{so}_3$ as

$$\Omega_1 := \begin{bmatrix} 0&0&0\\ 0&0&-1\\ 0&1&0 \end{bmatrix},\qquad \Omega_2 := \begin{bmatrix} 0&0&1\\ 0&0&0\\ -1&0&0 \end{bmatrix},\qquad \Omega_3 := \begin{bmatrix} 0&-1&0\\ 1&0&0\\ 0&0&0 \end{bmatrix}, \qquad (5)$$

we can write

$$\Omega = \sum_{i=1}^{3}\omega_i\Omega_i \neq 0, \qquad \Theta \in SO_3. \qquad (6)$$

Here $\Theta$ is the rotation matrix describing the rotation between the two cameras and $\omega = [\omega_1\ \omega_2\ \omega_3]^\top$ is the translation vector between the two cameras. Once $E$ is known, the factorization (4) can be made explicit, as shown below.

It is well known [1], [4] that essential matrices are characterized by the property that they have exactly one positive singular value, of multiplicity two; consequently $E$ must have rank 2. In particular, normalized essential matrices are those of Frobenius norm equal to $\sqrt 2$, which are therefore characterized by having the set of singular values $\{1, 1, 0\}$. The normalized essential manifold is defined as

$$\mathcal E := \big\{ \Omega\Theta \ \big|\ \Omega\in\mathfrak{so}_3,\ \Theta\in SO_3,\ \|\Omega\|^2 = \operatorname{tr}(\Omega\Omega^\top) = 2 \big\}. \qquad (7)$$

This is the basic nonlinear constraint set on which the proposed algorithms are defined.
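A short numeric check of this characterization, under the same illustrative conventions as the previous sketch: $E = \Omega\Theta$ has singular values $\{s, s, 0\}$ with $s = \|\omega\|$, and rescaling to Frobenius norm $\sqrt 2$ yields $\{1, 1, 0\}$.

```python
import numpy as np
from scipy.linalg import expm

def hat(w):
    return np.array([[0.0, -w[2], w[1]],
                     [w[2], 0.0, -w[0]],
                     [-w[1], w[0], 0.0]])

omega = np.array([1.0, 0.2, -0.3])
E = hat(omega) @ expm(hat(np.array([0.05, -0.10, 0.02])))

s = np.linalg.svd(E, compute_uv=False)
print(s, np.linalg.norm(omega))    # two equal singular values, both ||omega||

E_norm = np.sqrt(2.0) * E / np.linalg.norm(E, 'fro')   # Frobenius norm sqrt(2)
print(np.linalg.svd(E_norm, compute_uv=False))         # ~[1, 1, 0]
```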

C. Characterization of the normalized essential manifold

First, we show that a non-zero $(3\times 3)$-matrix $E$ is essential if and only if

$$E = U\Sigma V^\top, \qquad (8)$$

where

$$\Sigma = \begin{bmatrix} s&0&0\\ 0&s&0\\ 0&0&0 \end{bmatrix},\qquad s > 0,\qquad U, V \in SO_3. \qquad (9)$$

Note that $E$ is a normalized essential matrix when $s = 1$. Assume $E = \Omega\Theta$ with $\Omega$ and $\Theta$ as in (6). The equality $E = \Omega\Theta$ implies $EE^\top = \Omega\Theta\Theta^\top\Omega^\top = -\Omega^2$, with corresponding set of eigenvalues $\lambda(EE^\top) = \{s^2, s^2, 0\}$, where $s^2 := \sum_{i=1}^3 \omega_i^2$. The set of singular values of $E$ is then $\sigma(E) = \{s, s, 0\}$. For the converse, consider

$$\Psi := \begin{bmatrix} 0&-s&0\\ s&0&0\\ 0&0&0 \end{bmatrix} \in \mathfrak{so}_3 \qquad\text{and}\qquad \Gamma := \begin{bmatrix} 0&-1&0\\ 1&0&0\\ 0&0&1 \end{bmatrix} \in SO_3.$$

One has $\Psi = \Gamma\Sigma$, hence, for the singular value decomposition of $E$,

$$E = U\Sigma V^\top = U\Gamma^\top\Psi V^\top = \underbrace{\big(U\Gamma^\top\Psi\,\Gamma U^\top\big)}_{\Omega}\,\underbrace{\big(U\Gamma^\top V^\top\big)}_{\Theta} = \Omega\,\Theta, \qquad (10)$$

as required.

The next result characterizes the normalized essential manifold $\mathcal E$ as the smooth manifold of $(3\times 3)$-matrices with fixed set of singular values $\{1, 1, 0\}$; see [2] for details on the geometry of such manifolds.

Proposition II.1: Let

$$E_0 := \begin{bmatrix} I_2 & 0\\ 0 & 0 \end{bmatrix}, \qquad (11)$$

$$\mathcal E := \big\{ UE_0V^\top \ \big|\ U, V \in SO_3 \big\}. \qquad (12)$$

Then $\mathcal E$ is a smooth five-dimensional manifold diffeomorphic to $\mathbb{RP}^2\times SO_3$.

The orthogonal matrices appearing in the above SVD of a given essential matrix are not uniquely determined. However, the possible choices are easily described, leading to an explicit description of factorizations.

D. Tangent space of the essential manifold

We now consider the tangent spaces of $\mathcal E$.

Theorem II.1: The tangent space $T_E\mathcal E$ at the normalized essential matrix $E = UE_0V^\top$ is

$$T_E\mathcal E = \big\{ U(\Omega E_0 - E_0\Psi)V^\top \ \big|\ \Omega, \Psi \in \mathfrak{so}_3 \big\}
= \left\{ U\begin{bmatrix} 0 & \omega_{12}-\psi_{12} & -\psi_{13}\\ -(\omega_{12}-\psi_{12}) & 0 & -\psi_{23}\\ -\omega_{13} & -\omega_{23} & 0 \end{bmatrix} V^\top \ \middle|\ \omega_{ij}, \psi_{ij}\in\mathbb{R} \right\}, \qquad (13)$$

with the usual notation $\Omega = (\omega_{ij})$ and $\Psi = (\psi_{ij})$.

Proof: For any $E = UE_0V^\top \in \mathcal E$ let $\alpha_E : SO_3\times SO_3 \to \mathcal E$ be the smooth map defined by $\alpha_E(\hat U, \hat V) = \hat U E \hat V^\top$. The tangent space $T_E\mathcal E$ is the image of the linear map

$$D\,\alpha_E(I_3, I_3) : \mathfrak{so}_3\times\mathfrak{so}_3 \to \mathbb{R}^{3\times 3}, \qquad (\tilde\Omega, \tilde\Psi)\mapsto \tilde\Omega E - E\tilde\Psi, \qquad (14)$$

i.e., the image of the derivative of $\alpha_E$ evaluated at the identity $(I_3, I_3)\in SO_3\times SO_3$. Setting $\Omega := U^\top\tilde\Omega U$ and $\Psi := V^\top\tilde\Psi V$, the result follows; see [2], pp. 89ff, for details.

Corollary II.1: The kernel of the mapping $D\,\alpha_{E_0}(I_3, I_3) : \mathfrak{so}_3\times\mathfrak{so}_3\to\mathbb{R}^{3\times 3}$ is the set of matrix pairs $(\Omega, \Psi)\in\mathfrak{so}_3\times\mathfrak{so}_3$ with

$$\Omega = \Psi = \begin{bmatrix} 0&-x&0\\ x&0&0\\ 0&0&0 \end{bmatrix},\qquad x\in\mathbb{R}. \qquad (15)$$

Corollary II.2: The affine tangent space $T_E^{\mathrm{aff}}\mathcal E$ at the normalized essential matrix $E = UE_0V^\top$ is

$$T_E^{\mathrm{aff}}\mathcal E = \left\{ U\begin{bmatrix} 1 & -\sqrt2\,x_3 & -x_5/\sqrt2\\ \sqrt2\,x_3 & 1 & x_4/\sqrt2\\ -x_2/\sqrt2 & x_1/\sqrt2 & 0 \end{bmatrix} V^\top \ \middle|\ x_1,\dots,x_5\in\mathbb{R} \right\}. \qquad (16)$$
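A quick finite-difference sanity check of the linear map (14), with hypothetical skew-symmetric inputs: the derivative of the curve $t\mapsto \mathrm e^{t\Omega}E_0\,\mathrm e^{-t\Psi}$ at $t = 0$ matches $\Omega E_0 - E_0\Psi$.

```python
import numpy as np
from scipy.linalg import expm

def skew(a, b, c):
    """hat of the vector (a, b, c)."""
    return np.array([[0.0, -c, b], [c, 0.0, -a], [-b, a, 0.0]])

E0 = np.diag([1.0, 1.0, 0.0])
Om, Ps = skew(0.3, -0.1, 0.7), skew(-0.2, 0.5, 0.1)

t = 1e-6
fd = (expm(t * Om) @ E0 @ expm(t * Ps).T - E0) / t   # curve through alpha_{E0}
print(np.max(np.abs(fd - (Om @ E0 - E0 @ Ps))))      # ~1e-6, Eq. (14)
```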

E. Parameterisation of the essential manifold

Computations on a manifold are often conveniently carried out in terms of a local parameterisation. For our later convergence analysis we therefore need a local parameterisation of the essential manifold.

Lemma II.1: Let $N(0)\subset\mathbb{R}^5$ denote a sufficiently small open neighborhood of the origin in $\mathbb{R}^5$. Let $U, V\in SO_3$ be arbitrary, let $x = [x_1,\dots,x_5]^\top\in\mathbb{R}^5$, and let $E_0$ be defined as in (11). Consider the mappings

$$\Omega_1 : \mathbb{R}^5\to\mathfrak{so}_3,\qquad [x_1,\dots,x_5]^\top\mapsto \frac{1}{\sqrt2}\begin{bmatrix} 0 & -x_3 & x_2\\ x_3 & 0 & -x_1\\ -x_2 & x_1 & 0 \end{bmatrix} \qquad (17)$$

and

$$\Omega_2 : \mathbb{R}^5\to\mathfrak{so}_3,\qquad [x_1,\dots,x_5]^\top\mapsto \frac{1}{\sqrt2}\begin{bmatrix} 0 & x_3 & x_5\\ -x_3 & 0 & -x_4\\ -x_5 & x_4 & 0 \end{bmatrix}. \qquad (18)$$

Consider also

$$\mu : N(0)\to\mathcal E,\qquad x\mapsto U\,\mathrm e^{\Omega_1(x)}E_0\,\mathrm e^{-\Omega_2(x)}\,V^\top. \qquad (19)$$

Then the mapping $\mu$ is a diffeomorphism of $N(0)$ onto the image $\mu(N(0))$.

Proof: Smoothness of $\mu$ is obvious. We show that $\mu$ is an immersion at $0$. For this it is sufficient to show that the derivative $D\,\mu(0) : \mathbb{R}^5\to T_{\mu(0)}\mathcal E$ is injective. For arbitrary $h = [h_1,\dots,h_5]^\top\in\mathbb{R}^5$ we get

$$D\,\mu(0)\,h = U\big(\Omega_1(h)E_0 - E_0\Omega_2(h)\big)V^\top = \frac{1}{\sqrt2}\,U\begin{bmatrix} 0 & -2h_3 & -h_5\\ 2h_3 & 0 & h_4\\ -h_2 & h_1 & 0 \end{bmatrix}V^\top, \qquad (20)$$

which implies injectivity in an obvious manner. The result follows.

Remark II.1: In this paper we consider the essential manifold as an orbit of the group $SO_3\times SO_3$ acting on $E_0$ by equivalence. Via the differential of this group action, the usual canonical Riemannian metric on $SO_3\times SO_3$ induces a Riemannian metric on the essential manifold, called the normal Riemannian metric on $\mathcal E$; see e.g. [2] for details about this construction. Moreover, by exploiting Corollary II.1 one can show that under this group action geodesics on $SO_3\times SO_3$, namely one-parameter subgroups, are mapped to geodesics on $\mathcal E$. We refer to [6], Theorem 5.9, for a proof of this fact in a more general context. It turns out that the curves on $\mathcal E$ we use in the sequel, i.e.

$$\gamma : t\mapsto U\,\mathrm e^{t\Omega_1(x)}E_0\,\mathrm e^{-t\Omega_2(x)}\,V^\top,\qquad x\in\mathbb{R}^5, \qquad (21)$$

are indeed geodesics on $\mathcal E$ with respect to the normal Riemannian metric. In addition, the inverse $\mu^{-1}$ defines a so-called normal Riemannian coordinate chart. Such a chart has the remarkable feature that the Riemannian metric expressed in this chart, evaluated at zero, is represented by the identity.
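Here is a small sketch of the parameterisation $\mu$ of Eq. (19), with $\Omega_1, \Omega_2$ as in (17)-(18); the sign conventions follow our reconstruction above, and the random test values are illustrative. The check confirms that $\mu(x)$ stays on $\mathcal E$, i.e. keeps singular values $\{1, 1, 0\}$.

```python
import numpy as np
from scipy.linalg import expm

E0 = np.diag([1.0, 1.0, 0.0])
SQ2 = np.sqrt(2.0)

def Omega1(x):   # Eq. (17)
    return np.array([[0.0, -x[2], x[1]],
                     [x[2], 0.0, -x[0]],
                     [-x[1], x[0], 0.0]]) / SQ2

def Omega2(x):   # Eq. (18)
    return np.array([[0.0, x[2], x[4]],
                     [-x[2], 0.0, -x[3]],
                     [-x[4], x[3], 0.0]]) / SQ2

def mu(x, U, V):
    """U exp(Omega1(x)) E0 exp(-Omega2(x)) V^T, Eq. (19)."""
    return U @ expm(Omega1(x)) @ E0 @ expm(-Omega2(x)) @ V.T

rng = np.random.default_rng(1)
U, _ = np.linalg.qr(rng.standard_normal((3, 3)))
U *= np.linalg.det(U)   # det is +-1; flip so U is in SO(3)
V, _ = np.linalg.qr(rng.standard_normal((3, 3)))
V *= np.linalg.det(V)

x = 0.1 * rng.standard_normal(5)
print(np.linalg.svd(mu(x, U, V), compute_uv=False))   # ~[1, 1, 0]
```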

F. Cost function

Let $M^{(i)} := m^{(i)}\,m'^{(i)\top}$, where $m^{(i)}, m'^{(i)}\in\mathbb{R}^3$ correspond to the normalized $i$-th image point pair in the left and in the right camera, respectively, for which the correspondence is assumed to be known. Consider the smooth function

$$f : \mathcal E\to\mathbb{R},\qquad f(E) = \frac12\sum_{i=1}^n\big(m'^{(i)\top}E\,m^{(i)}\big)^2 = \frac12\sum_{i=1}^n\operatorname{tr}^2\!\big(M^{(i)}E\big). \qquad (22)$$

This cost function attains the value zero if and only if there is an essential matrix fulfilling the epipolar constraint for every image point pair. That is, in the noise-free case the global minimum value is zero. In the noisy case the value zero will in general not be attained. It nevertheless makes sense to search for minima of this cost function even in the presence of noise: the minima can then be interpreted as least-squares approximations of the true essential matrix, ignoring for the moment any statistical interpretations or refinements.

G. 2-jet of the cost function f on E in terms of the local parameterisation

The 2-jet, or second-order Taylor polynomial, of $f$ around the point $E = UE_0V^\top\in\mathcal E$, expressed in local parameters using the smooth parameterisation $\mu$ as in (19), is defined as

$$j_0^{(2)}(f\circ\mu) : N(0)\to\mathbb{R},\qquad x\mapsto f(\mu(tx))\big|_{t=0} + \frac{\mathrm d}{\mathrm dt}\Big|_{t=0}f(\mu(tx)) + \frac12\,\frac{\mathrm d^2}{\mathrm dt^2}\Big|_{t=0}f(\mu(tx)). \qquad (23)$$

That is,

$$j_0^{(2)}(f\circ\mu)(x) = \frac12\sum_{i=1}^n\operatorname{tr}^2\big(M^{(i)}E\big) + \sum_{i=1}^n\operatorname{tr}\big(M^{(i)}E\big)\operatorname{tr}\Big(M^{(i)}U\big(\Omega_1(x)E_0 - E_0\Omega_2(x)\big)V^\top\Big)$$
$$+\ \frac12\sum_{i=1}^n\operatorname{tr}^2\Big(M^{(i)}U\big(\Omega_1(x)E_0 - E_0\Omega_2(x)\big)V^\top\Big) + \frac12\sum_{i=1}^n\operatorname{tr}\big(M^{(i)}E\big)\operatorname{tr}\Big(M^{(i)}U\big(\Omega_1^2(x)E_0 + E_0\Omega_2^2(x) - 2\,\Omega_1(x)E_0\Omega_2(x)\big)V^\top\Big). \qquad (24)$$

As expected, the 2-jet contains three terms:

(i) A constant term,
$$(f\circ\mu)(tx)\big|_{t=0} = \frac12\sum_{i=1}^n\operatorname{tr}^2\big(M^{(i)}E\big) = \mathrm{const.} \qquad (25)$$

(ii) A linear term,
$$\frac{\mathrm d}{\mathrm dt}\Big|_{t=0}f(\mu(tx)) = \sum_{i=1}^n\operatorname{tr}\big(M^{(i)}E\big)\operatorname{tr}\Big(M^{(i)}U\big(\Omega_1(x)E_0 - E_0\Omega_2(x)\big)V^\top\Big) = \big(\nabla(f\circ\mu)(0)\big)^\top x = \Big\langle\operatorname{grad}f(UE_0V^\top),\ U\big(\Omega_1(x)E_0 - E_0\Omega_2(x)\big)V^\top\Big\rangle_{\mathrm{n.Rm}}, \qquad (26)$$
which can be interpreted either (I) as the transposed Euclidean gradient of $f\circ\mu : \mathbb{R}^5\to\mathbb{R}$ evaluated at zero, acting on $x\in\mathbb{R}^5$, or (II) as the Riemannian gradient of $f : \mathcal E\to\mathbb{R}$ evaluated at $UE_0V^\top\in\mathcal E$, paired by the normal Riemannian metric with the tangent element $U(\Omega_1(x)E_0 - E_0\Omega_2(x))V^\top\in T_{UE_0V^\top}\mathcal E$.

(iii) A term quadratic in $x$. The quadratic term actually consists of a sum of two terms. The first one, namely the term following the factor $1/2$ in the second line of (24),
$$\sum_{i=1}^n\operatorname{tr}^2\Big(M^{(i)}U\big(\Omega_1(x)E_0 - E_0\Omega_2(x)\big)V^\top\Big) = x^\top\hat H_{f\circ\mu}(0)\,x, \qquad (27)$$
is a quadratic form on $\mathbb{R}^5$ whose matrix $\hat H_{f\circ\mu}(0)$ is positive (semi)definite for all $U, V\in SO_3$. We can interpret this summand as a positive (semi)definite part of
$$H_{f\circ\mu}(0) = \hat H_{f\circ\mu}(0) + \tilde H_{f\circ\mu}(0), \qquad (28)$$
with $H_{f\circ\mu}(0)$ the Hessian matrix of $f\circ\mu : \mathbb{R}^5\to\mathbb{R}$ evaluated at zero. For the quadratic term in (23) there is then a further interpretation:
$$\frac{\mathrm d^2}{\mathrm dt^2}\Big|_{t=0}f(\mu(tx)) = \mathcal H_f(\gamma(t))\big(\dot\gamma(t), \dot\gamma(t)\big)\Big|_{t=0}, \qquad (29)$$
i.e., $\mathcal H_f(\gamma(t))$ is the Hessian operator of $f : \mathcal E\to\mathbb{R}$ represented along geodesics $\gamma : \mathbb{R}\to\mathcal E$, $\gamma(t) = \mu(tx)$, $\gamma(0) = UE_0V^\top$.
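For concreteness, a sketch (ours) of the cost (22) and of the linear term (26) of the 2-jet, i.e. the Euclidean gradient of $f\circ\mu$ at $0$, under the sign conventions reconstructed above; the correspondences are synthetic stand-ins.

```python
import numpy as np

E0 = np.diag([1.0, 1.0, 0.0])
SQ2 = np.sqrt(2.0)

def Omega1(x):
    return np.array([[0.0, -x[2], x[1]], [x[2], 0.0, -x[0]], [-x[1], x[0], 0.0]]) / SQ2

def Omega2(x):
    return np.array([[0.0, x[2], x[4]], [-x[2], 0.0, -x[3]], [-x[4], x[3], 0.0]]) / SQ2

def residuals(E, ms, m2s):
    """r_i = m'_i^T E m_i = tr(M^(i) E)."""
    return np.einsum('ij,jk,ik->i', m2s, E, ms)

def cost(E, ms, m2s):
    r = residuals(E, ms, m2s)          # f(E) = 1/2 sum_i r_i^2, Eq. (22)
    return 0.5 * float(r @ r)

def grad_fmu0(U, V, ms, m2s):
    """Gradient of f o mu at 0, from the linear term of the 2-jet, Eq. (26)."""
    r = residuals(U @ E0 @ V.T, ms, m2s)
    g = np.zeros(5)
    for j, e in enumerate(np.eye(5)):
        xi = U @ (Omega1(e) @ E0 - E0 @ Omega2(e)) @ V.T   # tangent direction
        g[j] = float(r @ residuals(xi, ms, m2s))
    return g

rng = np.random.default_rng(2)
U, _ = np.linalg.qr(rng.standard_normal((3, 3))); U *= np.linalg.det(U)
V, _ = np.linalg.qr(rng.standard_normal((3, 3))); V *= np.linalg.det(V)
ms = np.hstack([rng.standard_normal((8, 2)), np.ones((8, 1))])
m2s = np.hstack([rng.standard_normal((8, 2)), np.ones((8, 1))])
print(cost(U @ E0 @ V.T, ms, m2s), grad_fmu0(U, V, ms, m2s))
```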

H. Projecting onto the manifold E

As will be explained in more detail below, our algorithms are iterative in nature. Each algorithmic step consists of two partial steps: the first is an optimisation procedure defined on an appropriate affine tangent space of $\mathcal E$, the second a nonlinear projection back onto the manifold.

H.1 Approximating arbitrary matrices by essential ones

We need an explicit description of the best approximant in $\mathcal E$ of an arbitrary matrix $X\in\mathbb{R}^{3\times 3}$.

Theorem II.2: Let $X\in\mathbb{R}^{3\times 3}$ have the ordered singular value decomposition $X = U\Sigma V^\top$, i.e., $\Sigma = \operatorname{diag}(\sigma_1, \sigma_2, \sigma_3)$, $\sigma_1\geq\sigma_2\geq\sigma_3\geq 0$, and $U, V\in O_3$. If $\sigma_3$ is simple, the unique best approximant $\hat X\in\mathcal E$ with respect to the Frobenius norm, i.e.,

$$\hat X := \arg\min_{Z\in\mathcal E}\|X - Z\|, \qquad (30)$$

is given by

$$\hat X = UE_0V^\top. \qquad (31)$$

Proof: The proof proceeds along the same lines as for the corresponding Eckart-Young-Mirsky theorem for symmetric matrices; see [2]. The rather straightforward details are omitted here.

In the sequel we will use the notation $\pi_{\mathrm{SVD}}(X) = \hat X$.

H.2 Projecting back by means of the parameterisation µ

Let $X = U\big(E_0 + \Omega_1(x)E_0 - E_0\Omega_2(x)\big)V^\top$ be an arbitrary element of $T_{UE_0V^\top}^{\mathrm{aff}}\mathcal E$. We can simply project $X$ back onto the manifold in the following way:

$$X\mapsto U\,\mathrm e^{\Omega_1(x)}E_0\,\mathrm e^{-\Omega_2(x)}\,V^\top. \qquad (32)$$

It is straightforward to verify that this mapping is a projection; moreover, it is smooth. Let $\Omega_1$ and $\Omega_2$ be defined as in (17) and (18), respectively, and let $T^{\mathrm{aff}}\mathcal E$ be the affine tangent bundle of $\mathcal E$. For fixed $x\in\mathbb{R}^5$ consider the smooth mapping

$$\pi_\mu : T^{\mathrm{aff}}\mathcal E\to\mathcal E,\qquad U\big(E_0 + \Omega_1(x)E_0 - E_0\Omega_2(x)\big)V^\top\mapsto U\,\mathrm e^{\Omega_1(x)}E_0\,\mathrm e^{-\Omega_2(x)}\,V^\top. \qquad (33)$$

Obviously, for fixed $x$ the mapping $\pi_\mu$ maps straight lines in $T_{UE_0V^\top}^{\mathrm{aff}}\mathcal E$ through $UE_0V^\top$, such as

$$l_{UE_0V^\top} : t\mapsto U\big(E_0 + t\,\Omega_1(x)E_0 - t\,E_0\Omega_2(x)\big)V^\top, \qquad (34)$$

to smooth curves $\pi_\mu\big(l_{UE_0V^\top}(t)\big)\in\mathcal E$. As mentioned above, the resulting curves on $\mathcal E$, namely

$$\pi_\mu\big(l_{UE_0V^\top}(t)\big) = U\,\mathrm e^{t\Omega_1(x)}E_0\,\mathrm e^{-t\Omega_2(x)}\,V^\top = U\,\mathrm e^{\Omega_1(tx)}E_0\,\mathrm e^{-\Omega_2(tx)}\,V^\top, \qquad (35)$$

are geodesics on $\mathcal E$ with respect to the so-called normal Riemannian metric on $\mathcal E$. One can therefore think of the projection $\pi_\mu$ defined by (33) as a Riemannian one. Moreover, the parameterisation $\mu$ given by (19) defines a so-called Riemannian normal coordinate chart $\mu^{-1}$, sending a suitably chosen open neighborhood of $E\in\mathcal E$ diffeomorphically onto an open neighborhood of the origin of $\mathbb{R}^5$.
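A minimal sketch of the best essential approximant of Theorem II.2, i.e. the projection $\pi_{\mathrm{SVD}}$ of Eq. (31): replace the ordered singular values of $X$ by $\{1, 1, 0\}$. Note that numpy's SVD returns $U, V$ in $O_3$, which is exactly what the theorem assumes.

```python
import numpy as np

E0 = np.diag([1.0, 1.0, 0.0])

def pi_svd(X):
    """Project X in R^{3x3} onto the normalized essential manifold, Eq. (31)."""
    U, s, Vt = np.linalg.svd(X)    # ordered: s[0] >= s[1] >= s[2]
    return U @ E0 @ Vt

X = np.random.default_rng(3).standard_normal((3, 3))
Xh = pi_svd(X)
print(np.linalg.svd(Xh, compute_uv=False))   # [1, 1, 0]: Xh lies on E
```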

H.3 Cayley-like projection

As a further alternative one might approximate the matrix exponential of skew-symmetric matrices by its first-order diagonal Padé approximant, more commonly called the Cayley transformation:

$$\mathrm{cay} : \mathfrak{so}_3\to SO_3,\qquad \Omega\mapsto\Big(I + \tfrac12\Omega\Big)\Big(I - \tfrac12\Omega\Big)^{-1}. \qquad (36)$$

The Cayley mapping on $\mathfrak{so}_3$ is well known to be a local diffeomorphism around $0\in\mathfrak{so}_3$. Moreover, it approximates the exponential mapping $\exp : \mathfrak{so}_3\to SO_3$, $\Omega\mapsto\mathrm e^\Omega$, up to second order. We therefore consider in the sequel the smooth projection mapping

$$\pi_{\mathrm{cay}} : T^{\mathrm{aff}}\mathcal E\to\mathcal E,\qquad U\big(E_0 + \Omega_1E_0 - E_0\Omega_2\big)V^\top\mapsto U\,\mathrm{cay}(\Omega_1)\,E_0\,\mathrm{cay}(-\Omega_2)\,V^\top. \qquad (37)$$

III. Algorithm

A. Objective Function

The cost function we consider first in this paper is analysed to some extent in [5]. We recall here the critical points of $f : \mathcal E\to\mathbb{R}$ defined by (22).

Lemma III.1: Let

$$f : \mathcal E\to\mathbb{R},\qquad f(E) = \frac12\sum_{i=1}^n\operatorname{tr}^2\big(M^{(i)}E\big). \qquad (38)$$

The element $E = UE_0V^\top\in\mathcal E$ is a critical point of $f$ if and only if, for all $\Psi_1, \Psi_2\in\mathfrak{so}_3$,

$$\sum_{i=1}^n\operatorname{tr}\big(M^{(i)}E\big)\operatorname{tr}\Big(M^{(i)}U\big(\Psi_1E_0 - E_0\Psi_2\big)V^\top\Big) = 0. \qquad (39)$$

B. Algorithm

We consider the algorithm as the self-map

$$s = \pi_2\circ\pi_1 : \mathcal E\to\mathcal E \qquad (40)$$

consisting of an optimisation step followed by a projection. Here

$$\pi_1 : \mathcal E\to\mathbb{R}^{3\times 3},\qquad E = UE_0V^\top\mapsto U\big(E_0 + \Omega_1(x)E_0 - E_0\Omega_2(x)\big)V^\top, \qquad (41)$$

where $x\in\mathbb{R}^5$, as a function of $E$, solves the problem of minimising the 2-jet of the objective function $f$, i.e.,

$$x = \arg\min_{y\in N(0)}\ j_0^{(2)}(f\circ\mu)(y), \qquad (42)$$

with $\mu(0) = E$. The second mapping $\pi_2$ denotes a projection

$$\pi_2 : \mathbb{R}^{3\times 3}\to\mathcal E,\qquad X\mapsto\operatorname{proj}(X), \qquad (43)$$

where $\operatorname{proj}$ is chosen to be one of the projections discussed in Section II-H. Therefore one algorithmic step of $s$ consists of two partial steps: $\pi_1$ sends a point $E$ on the essential manifold $\mathcal E$ to an element of the affine tangent space $T_E^{\mathrm{aff}}\mathcal E$, and $\pi_2$ projects that element back onto $\mathcal E$.
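Since the paper gives no code, here is a compact end-to-end sketch of one possible implementation of $s = \pi_2\circ\pi_1$ on synthetic noise-free data, using the Cayley projection $\pi_{\mathrm{cay}}$ of Eq. (37) as $\pi_2$. A finite-difference 2-jet stands in for the closed-form gradient and Hessian of Eq. (24); all names (`newton_step`, `cay`, ...) and numeric values are our own illustrative choices, not the authors'.

```python
import numpy as np
from scipy.linalg import expm

E0 = np.diag([1.0, 1.0, 0.0])
SQ2 = np.sqrt(2.0)

def hat(w):
    return np.array([[0.0, -w[2], w[1]], [w[2], 0.0, -w[0]], [-w[1], w[0], 0.0]])

def Omega1(x):   # Eq. (17)
    return hat(x[:3]) / SQ2

def Omega2(x):   # Eq. (18), sign convention as reconstructed above
    return np.array([[0.0, x[2], x[4]], [-x[2], 0.0, -x[3]], [-x[4], x[3], 0.0]]) / SQ2

def cay(Om):     # Cayley transform, Eq. (36)
    I = np.eye(3)
    return (I + Om / 2) @ np.linalg.inv(I - Om / 2)

def cost(U, V, ms, m2s):   # f of Eq. (22) at E = U E0 V^T
    r = np.einsum('ij,jk,ik->i', m2s, U @ E0 @ V.T, ms)
    return 0.5 * float(r @ r)

def newton_step(U, V, ms, m2s, eps=1e-5):
    """x of Eq. (42): minimizer of a finite-difference 2-jet of f o mu at 0."""
    def fmu(x):
        E = U @ expm(Omega1(x)) @ E0 @ expm(-Omega2(x)) @ V.T
        r = np.einsum('ij,jk,ik->i', m2s, E, ms)
        return 0.5 * float(r @ r)
    I5, f0 = np.eye(5), fmu(np.zeros(5))
    g = np.array([(fmu(eps * e) - fmu(-eps * e)) / (2 * eps) for e in I5])
    H = np.array([[(fmu(eps * (a + b)) - fmu(eps * a) - fmu(eps * b) + f0) / eps**2
                   for b in I5] for a in I5])
    return np.linalg.solve(H, -g)       # assumes H > 0 near the minimum

# Synthetic ground truth: unit translation, so E* is normalized ({1, 1, 0}).
rng = np.random.default_rng(4)
Theta = expm(hat(np.array([0.1, -0.2, 0.05])))
omega = np.array([1.0, 0.5, -0.2]); omega /= np.linalg.norm(omega)
X = rng.uniform(-1.0, 1.0, (30, 3)) + np.array([0.0, 0.0, 5.0])
ms = X / X[:, 2:3]
X2 = X @ Theta.T + omega                # X' = Theta X + omega
m2s = X2 / X2[:, 2:3]

# Initialize near the truth (a stand-in for the 8-point initialisation).
U, _, Vt = np.linalg.svd(hat(omega) @ Theta)
V = Vt.T
U = U @ expm(hat(np.array([0.05, -0.08, 0.03])))
V = V @ expm(hat(np.array([0.04, 0.02, -0.06])))

for k in range(5):                      # iterate s = pi_cay o pi_1
    x = newton_step(U, V, ms, m2s)
    U, V = U @ cay(Omega1(x)), V @ cay(Omega2(x))
    print(k, cost(U, V, ms, m2s))       # should drop rapidly toward zero
```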

C. Fixed Points of the Algorithm

The following theorem holds.

Theorem III.1: Let $\pi_2$ be either $\pi_{\mathrm{cay}}$ or $\pi_{\mathrm{SVD}}$, but not $\pi_\mu$. The only fixed points of the corresponding algorithm $s = \pi_2\circ\pi_1$ are those elements of $\mathcal E$ which are minima of the objective function $f : \mathcal E\to\mathbb{R}$.

Unfortunately, the situation is more involved if the projection we use is $\pi_\mu$. Consider the following example. For arbitrary $U, V\in SO_3$ let $E = UE_0V^\top$ and suppose

$$x_{\mathrm{opt}} = \big[\,0\ \ \sqrt2\,\pi\ \ 0\ \ 0\ \ \sqrt2\,\pi\,\big]^\top, \qquad (44)$$

so that

$$\Omega_1(x_{\mathrm{opt}}) = \Omega_2(x_{\mathrm{opt}}) = \begin{bmatrix} 0&0&\pi\\ 0&0&0\\ -\pi&0&0 \end{bmatrix} \quad\text{and}\quad \mathrm e^{\Omega_1(x_{\mathrm{opt}})} = \mathrm e^{\Omega_2(x_{\mathrm{opt}})} = \begin{bmatrix} -1&0&0\\ 0&1&0\\ 0&0&-1 \end{bmatrix}. \qquad (45)$$

Then

$$\pi_1(E) = U\begin{bmatrix} 1&0&-\pi\\ 0&1&0\\ -\pi&0&0 \end{bmatrix}V^\top\notin\mathcal E, \qquad (46)$$

but

$$\pi_\mu\big(\pi_1(E)\big) = UE_0V^\top = E. \qquad (47)$$

One can easily find other examples; the reason such examples can be constructed is that the injectivity radius of the exponential map $\exp : \mathfrak{so}_3\to SO_3$ is finite, namely equal to $\pi$.
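A numeric check of the $\pi_\mu$ fixed-point example above, under the sign conventions reconstructed earlier: for $x_{\mathrm{opt}} = [0,\ \sqrt2\pi,\ 0,\ 0,\ \sqrt2\pi]^\top$ the exponentials are nontrivial, yet $\pi_\mu(\pi_1(E))$ returns $E$ itself.

```python
import numpy as np
from scipy.linalg import expm

E0 = np.diag([1.0, 1.0, 0.0])
SQ2 = np.sqrt(2.0)

def Omega1(x):
    return np.array([[0.0, -x[2], x[1]], [x[2], 0.0, -x[0]], [-x[1], x[0], 0.0]]) / SQ2

def Omega2(x):
    return np.array([[0.0, x[2], x[4]], [-x[2], 0.0, -x[3]], [-x[4], x[3], 0.0]]) / SQ2

x_opt = np.array([0.0, SQ2 * np.pi, 0.0, 0.0, SQ2 * np.pi])
print(np.round(expm(Omega1(x_opt)), 6))   # diag(-1, 1, -1), not the identity

E_proj = expm(Omega1(x_opt)) @ E0 @ expm(-Omega2(x_opt))
print(np.round(E_proj, 6))                # equals E0: E is a fixed point of s
```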

IV. Smoothness Properties of the Algorithm

A. Smoothness of the optimisation step π1

This is obvious under the assumption that the Hessian of $f\circ\mu$ is everywhere invertible. Indeed, under this assumption the linear system to be solved in each optimisation step has a unique solution.

B. Smoothness of the projections π_SVD, π_µ and π_cay

Theorem IV.1: The projections $\pi_\mu$ and $\pi_{\mathrm{cay}}$ are smooth mappings.

Proof: This is obvious.

Theorem IV.2: Let

$$\mathcal U := \{ X\in\mathbb{R}^{3\times 3}\mid\text{the smallest singular value of }X\text{ is simple}\}. \qquad (48)$$

Then $\mathcal U\subset\mathbb{R}^{3\times 3}$ is an open subset and the projection

$$\pi_{\mathrm{SVD}} : \mathcal U\to\mathcal E,\qquad X\mapsto\hat X$$

is smooth.

Proof: Consider

$$M := \left\{ (U, \Sigma, V)\in O_3\times\mathbb{R}^{3\times 3}\times O_3 \ \middle|\ \Sigma = \begin{bmatrix} \alpha&\beta&0\\ \beta&\gamma&0\\ 0&0&\delta \end{bmatrix},\ |\delta| < \frac{\alpha + \gamma - \sqrt{(\alpha-\gamma)^2 + 4\beta^2}}{2} \right\}.$$

Note that

$$\sigma_{\min}\!\left(\begin{bmatrix}\alpha&\beta\\ \beta&\gamma\end{bmatrix}\right) = \frac12\Big(\alpha + \gamma - \sqrt{(\alpha-\gamma)^2 + 4\beta^2}\Big). \qquad (49)$$

Thus the condition on $\delta$ implies that $\delta$ is the eigenvalue of $\Sigma$ with smallest absolute value. Let

$$M_0 := \left\{ (U, E_0, V)\in O_3\times\mathbb{R}^{3\times 3}\times O_3 \ \middle|\ E_0 = \begin{bmatrix} I_2&0\\ 0&0 \end{bmatrix} \right\}$$

and

$$\Gamma := \left\{ S = \begin{bmatrix} R&0\\ 0&1 \end{bmatrix}\in O_3 \ \middle|\ R\in O_2 \right\}.$$

Then

$$\sigma : \Gamma\times M\to M,\qquad \big(S, (U, \Sigma, V)\big)\mapsto\big(US,\ S^\top\Sigma S,\ VS\big)$$

defines a smooth, proper Lie group action with smooth orbit space $M/\Gamma$, and the quotient map

$$P : M\to\mathcal U,\qquad (U, \Sigma, V)\mapsto U\Sigma V^\top$$

is a principal fibre bundle with structure group $\Gamma$. Obviously, $\sigma$ leaves $M_0$ invariant and therefore restricts to a smooth quotient map

$$P' : M_0\to\mathcal E,\qquad (U, E_0, V)\mapsto UE_0V^\top.$$

Moreover, the projection map

$$F : M\to M_0,\qquad F(U, \Sigma, V) = (U, E_0, V)$$

is smooth, and the diagram

$$\begin{array}{ccc} M & \xrightarrow{\ F\ } & M_0\\ \big\downarrow{\scriptstyle P} & & \big\downarrow{\scriptstyle P'}\\ \mathcal U & \xrightarrow{\ \pi_{\mathrm{SVD}}\ } & \mathcal E \end{array} \qquad (50)$$

is commutative. By standard arguments this implies that $\pi_{\mathrm{SVD}}$ is smooth.

V. Convergence Analysis of the Algorithm

Let $E^*$ denote a fixed point of $s = \pi_2\circ\pi_1$, i.e. $E^*$ is a minimum of the function $f$. We will compute the first derivative of $s$ at this fixed point. By the chain rule and the fact that $\pi_1(E^*) = \pi_2(E^*) = E^*$, we have for all tangent elements $\xi\in T_{E^*}\mathcal E$

$$D\,s(E^*)\,\xi = D\,\pi_2(E^*)\circ D\,\pi_1(E^*)\,\xi. \qquad (51)$$

Considering $s$ expressed in local coordinates amounts to studying the self-map

$$\mu_*^{-1}\circ s\circ\mu_* : \mathbb{R}^5\to\mathbb{R}^5. \qquad (52)$$

Therefore, rewriting (51) in terms of the parameterisation

$$\mu_* : \mathbb{R}^5\supset N(0)\to\mathcal E,\qquad y\mapsto U^*\,\mathrm e^{\Omega_1(y)}E_0\,\mathrm e^{-\Omega_2(y)}\,V^{*\top}, \qquad (53)$$

with

$$\mu_*(0) = E^* = U^*E_0V^{*\top} \qquad (54)$$

and $\Omega_1$ and $\Omega_2$ as in (17) and (18), respectively, we get

$$D\big(\mu_*^{-1}\circ s\circ\mu_*\big)(0)\,h = D\,\mu_*^{-1}(E^*)\circ D\,s(E^*)\circ D\,\mu_*(0)\,h = \big(D\,\mu_*(0)\big)^{-1}\circ D\,\pi_2(E^*)\circ D\,\pi_1(E^*)\circ D\,\mu_*(0)\,h. \qquad (55)$$

Now

$$\pi_1\circ\mu_* : N(0)\to\mathbb{R}^{3\times 3},\qquad y\mapsto U^*\,\mathrm e^{\Omega_1(y)}\Big(E_0 + \Omega_1\big(x_{\mathrm{opt}}(y)\big)E_0 - E_0\Omega_2\big(x_{\mathrm{opt}}(y)\big)\Big)\mathrm e^{-\Omega_2(y)}\,V^{*\top}, \qquad (56)$$

where

$$x_{\mathrm{opt}} : N(0)\to\mathbb{R}^5,\qquad y\mapsto\arg\min_{z\in\mathbb{R}^5}\ j_0^{(2)}\varphi(z, y) \qquad (57)$$

and

$$\varphi : \tilde N(0)\times N(0)\to\mathbb{R},\qquad \varphi(x, y) = f\Big(U^*\,\mathrm e^{\Omega_1(y)}\mathrm e^{\Omega_1(x)}E_0\,\mathrm e^{-\Omega_2(x)}\mathrm e^{-\Omega_2(y)}\,V^{*\top}\Big). \qquad (58)$$

Here $\tilde N(0)\subset N(0)$ is a suitably chosen open neighborhood of zero, and the 2-jet of $\varphi$ in (57) is understood to be taken with respect to the first argument of $\varphi$. Exploiting the linearity of the mappings $\Omega_1$ and $\Omega_2$, and using the well-known formula for differentiating the matrix exponential, we compute the first derivative of $\pi_1$ in local coordinates as

$$D(\pi_1\circ\mu_*)(0)\,h = U^*\big(\Omega_1(h)E_0 - E_0\Omega_2(h)\big)V^{*\top} + U^*\Big(\Omega_1\big(D\,x_{\mathrm{opt}}(0)h\big)E_0 - E_0\Omega_2\big(D\,x_{\mathrm{opt}}(0)h\big)\Big)V^{*\top} = U^*\Big(\Omega_1\big(h + D\,x_{\mathrm{opt}}(0)h\big)E_0 - E_0\Omega_2\big(h + D\,x_{\mathrm{opt}}(0)h\big)\Big)V^{*\top}. \qquad (59)$$

We need an expression for $D\,x_{\mathrm{opt}}(0)h$. Define

$$\psi : \tilde N(0)\times N(0)\times\mathbb{R}^5\to\mathbb{R},\qquad \psi(x, y, k) = D_1\varphi(x, y)\,k, \qquad (60)$$

i.e., for all $k\in\mathbb{R}^5$ the following holds true:

$$\psi\big(x_{\mathrm{opt}}(y), y, k\big) = 0, \qquad (61)$$

and similarly

$$\psi\big(x_{\mathrm{opt}}(0), 0, k\big) = \psi(0, 0, k) = 0. \qquad (62)$$

Taking the derivative of (61) with respect to the variable $y$, evaluated at zero and acting on an arbitrary $h\in\mathbb{R}^5$, gives

$$D_1\psi\big(x_{\mathrm{opt}}(y), y, k\big)\Big|_{y=0}\, D\,x_{\mathrm{opt}}(0)\,h + D_2\psi\big(x_{\mathrm{opt}}(y), y, k\big)\Big|_{y=0}\, h = 0. \qquad (63)$$

The linear system (63) has a unique solution in terms of $D\,x_{\mathrm{opt}}(0)h$ because the linear mapping

$$D_1\psi\big(x_{\mathrm{opt}}(y), y, k\big)\Big|_{y=0} \qquad (64)$$

is invertible. The reason for this is simply that the expression (64) equals the Hessian of $f\circ\mu_*$ evaluated at the point $0$, which is invertible by assumption. Therefore, application of the Implicit Function Theorem to $\psi(x, y, k)$ implies not only smoothness of $x_{\mathrm{opt}}$, but also yields the explicit expression

$$D\,x_{\mathrm{opt}}(0)\,h = -\big(D_1\psi(0, 0, k)\big)^{-1}D_2\psi(0, 0, k)\,h = -h. \qquad (65)$$

Plugging (65) into (59) gives the result

$$D(\pi_1\circ\mu_*)(0)\,h = 0. \qquad (66)$$

We can therefore state the main mathematical result of this paper.

Theorem V.1: If the algorithm $s$ converges to the fixed point $E^*$, then it converges locally quadratically fast to $E^*$.

Proof: Plugging (66) into (55) shows that for all $h\in\mathbb{R}^5$

$$D\big(\mu_*^{-1}\circ s\circ\mu_*\big)(0)\,h = \big(D\,\mu_*(0)\big)^{-1}\circ D\,\pi_2(E^*)\circ D\,\pi_1(E^*)\circ D\,\mu_*(0)\,h = 0, \qquad (67)$$

irrespective of which projection $\pi_2$ we use. Let $(E^{(k)})$ denote the sequence of essential matrices generated by the algorithm, and let $x^{(k)} = \mu_*^{-1}(E^{(k)})$ denote the corresponding elements of $\mathbb{R}^5$. For sufficiently large $j$ we may assume that for all $k\geq j$ the iterates $x^{(k)}$ stay in a sufficiently small neighborhood of the origin in $\mathbb{R}^5$. Vanishing of the first derivative then implies local quadratic convergence by the Taylor-type argument

$$\big\|\big(\mu_*^{-1}\circ s\circ\mu_*\big)(x^{(k)})\big\| \leq \sup_{y\in N(0)}\big\|D^2\big(\mu_*^{-1}\circ s\circ\mu_*\big)(y)\big\|\cdot\big\|x^{(k)}\big\|^2. \qquad (68)$$

A. Discussion

A few remarks are in order here. Our proof of quadratic convergence was essentially independent of the chosen cost function. A more detailed exploitation of this fact is under consideration and will be published elsewhere.

If $\pi_\mu$ is used for the second algorithmic step $\pi_2$, then one can show that the overall algorithm is nothing other than a Riemannian manifold version of Newton's method, the Riemannian metric being the so-called normal one. Despite the well-known fact that, under mild assumptions, the Riemannian manifold version of Newton's method is locally quadratically convergent (see [7], Theorem 3.4, p. 57), our results are apparently more than just an application of this nice result.

We would like to mention that the latter version of our algorithm is also different from the approach taken in [5]. The Riemannian metric those authors use is different, and therefore their geodesics are not in accordance with ours. Whereas in [5] the local structure of the essential manifold as a product of Stiefel manifolds is exploited, we prefer to think of this manifold as an orbit of $SO_3\times SO_3$ acting on $\mathbb{R}^{3\times 3}$ by equivalence, i.e., as the manifold of all $(3\times 3)$-matrices having the set of singular values equal to $\{1, 1, 0\}$. Some features of these different approaches are summarised as follows.

A.1 Manifold structure

Ma et al. [5]: The essential manifold $\mathcal E$ is locally diffeomorphic to the product of two Stiefel manifolds,

$$\mathcal E \underset{\mathrm{local}}{\cong} S^2\times SO_3. \qquad (69)$$

Our approach: We exploit the global diffeomorphism of $\mathcal E$ to the set of matrices having singular values $\{1, 1, 0\}$,

$$\mathcal E \cong SO_3\cdot\begin{bmatrix} 1&0&0\\ 0&1&0\\ 0&0&0 \end{bmatrix}\cdot SO_3. \qquad (70)$$

A.2 Geodesics emanating from $E = \Omega\Theta = UE_0V^\top\in\mathcal E$

Ma et al.:

$$t\mapsto\Big(\mathrm e^{\hat\Delta t}\,\Omega\,\mathrm e^{-\hat\Delta t},\ \Theta\,\mathrm e^{\Gamma t}\Big), \qquad (71)$$

where $\hat\Delta, \Gamma\in\mathfrak{so}_3$ and $[\hat\Delta, [\hat\Delta, \Omega]] = -\tfrac12\|\hat\Delta\|^2\,\Omega$.

Our approach:

$$t\mapsto U\,\mathrm e^{\Delta t}E_0\,\mathrm e^{-\Gamma t}\,V^\top, \qquad (72)$$

where

$$\Delta = \frac1{\sqrt2}\begin{bmatrix} 0&-x_3&x_2\\ x_3&0&-x_1\\ -x_2&x_1&0 \end{bmatrix},\qquad \Gamma = \frac1{\sqrt2}\begin{bmatrix} 0&x_3&x_5\\ -x_3&0&-x_4\\ -x_5&x_4&0 \end{bmatrix},\qquad x_1,\dots,x_5\in\mathbb{R}.$$

A.3 Riemannian metric $g : T_E\mathcal E\times T_E\mathcal E\to\mathbb{R}$

Ma et al.: The Euclidean metric induced by the canonical submanifold structure of each factor,

$$S^2\subset\mathbb{R}^3 \qquad\text{and}\qquad SO_3\subset\mathbb{R}^{3\times 3}, \qquad (73)$$

or equivalently, the normal metric induced by the similarity group action on the first factor and right translation on the second factor,

$$SO_3\times S^2\to S^2,\qquad (U, \Omega)\mapsto U\Omega U^\top, \qquad (74)$$

$$SO_3\times SO_3\to SO_3,\qquad (V, \Theta)\mapsto\Theta V^\top. \qquad (75)$$

Explicitly, for two elements $\xi_1, \xi_2\in T_{(\Omega,\Theta)}\mathcal E$ of the tangent space with $\xi_i = \big([\hat\Delta_i, \Omega],\ \Theta\Gamma_i\big)$,

$$g\Big(\big([\hat\Delta_1, \Omega], \Theta\Gamma_1\big),\ \big([\hat\Delta_2, \Omega], \Theta\Gamma_2\big)\Big) = \operatorname{tr}\big(\hat\Delta_1^\top\hat\Delta_2\big) + \operatorname{tr}\big(\Gamma_1^\top\Gamma_2\big), \qquad (76)$$

with $\hat\Delta_i, \Gamma_i\in\mathfrak{so}_3$ and $[\hat\Delta_i, [\hat\Delta_i, \Omega]] = -\tfrac12\|\hat\Delta_i\|^2\,\Omega$ for $i = 1, 2$.

Our approach: The normal metric induced by the equivalence group action

$$\big(SO_3\times SO_3\big)\times\mathbb{R}^{3\times 3}\to\mathbb{R}^{3\times 3},\qquad \big((U, V), E\big)\mapsto UEV^\top. \qquad (77)$$

Explicitly, for two elements $\xi_1, \xi_2\in T_{UE_0V^\top}\mathcal E$ of the tangent space with $\xi_i = U\big(\Delta_iE_0 - E_0\Gamma_i\big)V^\top$,

$$g\Big(U\big(\Delta_1E_0 - E_0\Gamma_1\big)V^\top,\ U\big(\Delta_2E_0 - E_0\Gamma_2\big)V^\top\Big) = \operatorname{tr}\big(\Delta_1^\top\Delta_2\big) + \operatorname{tr}\big(\Gamma_1^\top\Gamma_2\big), \qquad (78)$$

where for $i = 1, 2$

$$\Delta_i = \frac1{\sqrt2}\begin{bmatrix} 0&-x_3^{(i)}&x_2^{(i)}\\ x_3^{(i)}&0&-x_1^{(i)}\\ -x_2^{(i)}&x_1^{(i)}&0 \end{bmatrix},\qquad \Gamma_i = \frac1{\sqrt2}\begin{bmatrix} 0&x_3^{(i)}&x_5^{(i)}\\ -x_3^{(i)}&0&-x_4^{(i)}\\ -x_5^{(i)}&x_4^{(i)}&0 \end{bmatrix},\qquad x_1^{(i)},\dots,x_5^{(i)}\in\mathbb{R}.$$

In fact, the tangent map of $\mu$ defined by (19) maps frames $\{e_1,\dots,e_5\}$ of $\mathbb{R}^5$, orthonormal with respect to the Euclidean metric, into frames of $T_E\mathcal E$, orthonormal with respect to the normal Riemannian metric:

$$e_i\mapsto D\,\mu(0)\,e_i = U\underbrace{\big(\Omega_1(e_i)E_0 - E_0\Omega_2(e_i)\big)}_{=:\,\xi_i}V^\top, \qquad (79)$$

with

$$\big\langle U\xi_iV^\top,\ U\xi_jV^\top\big\rangle_{\mathrm{n.Rm}} = \operatorname{tr}\big(\Omega_1(e_i)^\top\Omega_1(e_j)\big) + \operatorname{tr}\big(\Omega_2(e_i)^\top\Omega_2(e_j)\big) = e_i^\top e_j = \delta_{ij}. \qquad (80)$$

One might argue that the Riemannian metric we use is induced by restriction of another Riemannian metric defined on the embedding space $\mathbb{R}^{3\times 3}$. This is actually not the case; moreover, one can show that such a metric on $\mathbb{R}^{3\times 3}$ does not exist.

VI. Implementation of Algorithms

Start with an initial estimate $E = UE_0V^\top$ of the essential matrix, obtained from the standard 8-point algorithm.

Step 1. Carry out the optimisation step $\pi_1$: compute the gradient $\nabla(f\circ\mu)(0)$ and the Hessian $H_{f\circ\mu}(0)$. If $H_{f\circ\mu}(0) > 0$, compute the Newton step

$$x_{\mathrm{opt}} = -H_{f\circ\mu}^{-1}(0)\,\nabla(f\circ\mu)(0);$$

otherwise compute the Gauss step

$$x_{\mathrm{opt}} = -\hat H_{f\circ\mu}^{-1}(0)\,\nabla(f\circ\mu)(0).$$

Step 2. Carry out the projection step $\pi_2$. There are three alternative projections.

$\pi_{\mathrm{SVD}}$: Let $x_{\mathrm{opt}} = [x_1\ x_2\ \cdots\ x_5]^\top$; form the optimal affine tangent vector $\xi_{\mathrm{opt}}\in T_E^{\mathrm{aff}}\mathcal E$,

$$\xi_{\mathrm{opt}} = U\begin{bmatrix} 1 & -\sqrt2\,x_3 & -x_5/\sqrt2\\ \sqrt2\,x_3 & 1 & x_4/\sqrt2\\ -x_2/\sqrt2 & x_1/\sqrt2 & 0 \end{bmatrix}V^\top = \hat U\begin{bmatrix} \sigma_1&0&0\\ 0&\sigma_2&0\\ 0&0&\sigma_3 \end{bmatrix}\hat V^\top,$$

and compute the projected estimate of the essential matrix $\hat E = \hat UE_0\hat V^\top$.

$\pi_\mu$: $\hat U = U\,\mathrm e^{\Omega_1(x_{\mathrm{opt}})}$, $\hat V = V\,\mathrm e^{\Omega_2(x_{\mathrm{opt}})}$, $\hat E = \hat UE_0\hat V^\top$.

$\pi_{\mathrm{cay}}$: $\hat U = U\,\mathrm{cay}\big(\Omega_1(x_{\mathrm{opt}})\big)$, $\hat V = V\,\mathrm{cay}\big(\Omega_2(x_{\mathrm{opt}})\big)$, $\hat E = \hat UE_0\hat V^\top$.

Step 3. Set $E = \hat E$, $U = \hat U$, $V = \hat V$, and go back to Step 1 if $\|\nabla(f\circ\mu)(0)\| > \epsilon$, a prescribed accuracy.

VII. Acknowledgment

The authors thank Jochen Trumpf, National ICT Australia, Canberra, for many fruitful discussions. The first two authors were partially supported by DAAD PPP Australia, Germany, under grant D/043869. The last two authors were partially supported by the Australian-German Grant under 4900-17. National ICT Australia is funded by the Australian Department of Communications, Information Technology and the Arts and the Australian Research Council through Backing Australia's Ability and the ICT Centre of Excellence Program.

References

[1] R. Hartley and A. Zisserman. Multiple View Geometry. Cambridge University Press, Cambridge, 2000.
[2] U. Helmke and J.B. Moore. Optimization and Dynamical Systems. CCES. Springer, London, 1994.
[3] K. Kanatani. Statistical Optimization for Geometric Computation: Theory and Practice. Elsevier, Amsterdam, 1996.
[4] Q.-T. Luong and O.D. Faugeras. The fundamental matrix: Theory, algorithms and stability analysis. Int. J. of Computer Vision, 17(1):43-75, 1996.
[5] Y. Ma, J. Košecká, and S. Sastry. Optimization criteria and geometric algorithms for motion and structure estimation. Int. J. of Computer Vision, 44(3):219-249, 2001.
[6] R. Mahony. Optimization algorithms on homogeneous spaces. PhD thesis, Australian National University, Canberra, March 1994.
[7] S.T. Smith. Geometric optimization methods for adaptive filtering. PhD thesis, Harvard University, Cambridge, May 1993.