Multiple View Geometry in Computer Vision

Prasanna Sahoo, Department of Mathematics, University of Louisville

Scene Planes & Homographies

Lecture 19, March 24, 2005

In our last lecture, we examined various methods of triangulation, namely the homogeneous method (DLT), the inhomogeneous method, and a method based on geometric error.

In this lecture, we examine the projective geometry of two cameras and a world plane. A plane induces a homography between two views.

Suppose we have some points $\mathbf{X}$ on a world plane. Let the images of these points be $\mathbf{x}$ in the first view and $\mathbf{x}'$ in the second view. We saw during the geometric derivation of the fundamental matrix that there is a planar homography $H$ between the points $\mathbf{x}$ and $\mathbf{x}'$.

There are two relations between the two views. First, a point in one view determines a line in the other view, which is the image of the ray through that point.

Second, a point in one view determines a point in the other view, which is the image of the intersection of the ray with a plane.

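Homographies given the plane

Result 1. Given the projection matrices for the two views $P = [I \mid \mathbf{0}]$ and $P' = [A \mid \mathbf{a}]$, and a plane defined by $\boldsymbol{\pi}^T \mathbf{X} = 0$ with $\boldsymbol{\pi} = (\mathbf{v}^T, 1)^T$, the homography induced by the plane is $\mathbf{x}' = H\mathbf{x}$ with $H = A - \mathbf{a}\mathbf{v}^T$.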

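Proof: Let $P = [I \mid \mathbf{0}]$ and $P' = [A \mid \mathbf{a}]$ be the camera matrices. Let $\mathbf{X}$ be a 3D point on the plane $\boldsymbol{\pi}$, and let $\mathbf{x} = P\mathbf{X} = [I \mid \mathbf{0}]\mathbf{X}$. Then any point $\mathbf{X} = (\mathbf{x}^T, \rho)^T$ on the ray projects to $\mathbf{x}$, where $\rho$ parametrizes the point on the ray. Since $\mathbf{X}$ is on the plane $\boldsymbol{\pi}$, we have $\boldsymbol{\pi}^T \mathbf{X} = 0$. Therefore

$$\boldsymbol{\pi}^T \mathbf{X} = (\mathbf{v}^T, 1) \begin{pmatrix} \mathbf{x} \\ \rho \end{pmatrix} = \mathbf{v}^T \mathbf{x} + \rho = 0.$$

Hence $\rho = -\mathbf{v}^T \mathbf{x}$, and therefore $\mathbf{X} = (\mathbf{x}^T, \rho)^T = (\mathbf{x}^T, -\mathbf{v}^T \mathbf{x})^T$. The 3D point $\mathbf{X}$ projects into the second view as

$$\mathbf{x}' = P'\mathbf{X} = [A \mid \mathbf{a}] \begin{pmatrix} \mathbf{x} \\ -\mathbf{v}^T \mathbf{x} \end{pmatrix} = A\mathbf{x} - \mathbf{a}\mathbf{v}^T \mathbf{x} = (A - \mathbf{a}\mathbf{v}^T)\mathbf{x}.$$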

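Hence $H = A - \mathbf{a}\mathbf{v}^T$ is the homography induced by the plane $\boldsymbol{\pi}$.

Example 1. A calibrated stereo rig. Suppose the camera matrices of a stereo rig with world origin at the first camera are $P_E = K[I \mid \mathbf{0}]$ and $P'_E = K'[R \mid \mathbf{t}]$. Let $\boldsymbol{\pi}_E$ be the world plane with coordinates $\boldsymbol{\pi}_E = (\mathbf{n}^T, d)^T$. Suppose $\mathbf{X} = (\tilde{\mathbf{X}}^T, x_4)^T$. Find the homography induced by the plane $\boldsymbol{\pi}_E$.

Answer: From Result 1, with $\mathbf{v} = \mathbf{n}/d$, the homography for the cameras $P = [I \mid \mathbf{0}]$ and $P' = [R \mid \mathbf{t}]$ is $H = R - \mathbf{t}\mathbf{n}^T/d$. Applying the transformations $K$ and $K'$ to the images $\mathbf{x}$ and $\mathbf{x}'$, we obtain the cameras $P_E = K[I \mid \mathbf{0}]$ and $P'_E = K'[R \mid \mathbf{t}]$.

Therefore $\mathbf{x}' = H\mathbf{x} \Rightarrow K'\mathbf{x}' = K'H\mathbf{x} \Rightarrow K'\mathbf{x}' = (K'HK^{-1})K\mathbf{x}$. Hence the homography induced by the plane $\boldsymbol{\pi}_E$ is $H_E = K'HK^{-1} = K'(R - \mathbf{t}\mathbf{n}^T/d)K^{-1}$.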
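The formula of Example 1 can be checked numerically. The following is a minimal NumPy sketch, not part of the lecture: the function name `plane_homography` and all numeric values (calibration matrices, pose, plane) are made-up assumptions chosen only for illustration. It projects a point of the plane into both views and confirms that the homography maps one image to the other.

```python
import numpy as np

def plane_homography(K1, K2, R, t, n, d):
    """Return H_E = K2 (R - t n^T / d) K1^{-1} for the plane n.X + d = 0."""
    H = R - np.outer(t, n) / d          # Result 1 with A = R, a = t, v = n / d
    return K2 @ H @ np.linalg.inv(K1)   # account for the calibration matrices

# Illustrative (made-up) calibration matrices K and K'.
K1 = np.array([[800.0, 0.0, 320.0], [0.0, 800.0, 240.0], [0.0, 0.0, 1.0]])
K2 = np.array([[750.0, 0.0, 300.0], [0.0, 750.0, 250.0], [0.0, 0.0, 1.0]])

# Illustrative pose: small rotation about the y-axis plus a translation.
theta = 0.1
R = np.array([[np.cos(theta), 0.0, np.sin(theta)],
              [0.0, 1.0, 0.0],
              [-np.sin(theta), 0.0, np.cos(theta)]])
t = np.array([0.5, 0.0, 0.1])

# Plane Z = 4 written as n.X + d = 0.
n = np.array([0.0, 0.0, 1.0])
d = -4.0

X = np.array([0.3, -0.2, 4.0])          # inhomogeneous point on the plane
x1 = K1 @ X                             # image in the first view
x2 = K2 @ (R @ X + t)                   # image in the second view

H_E = plane_homography(K1, K2, R, t, n, d)
x2_from_H = H_E @ x1                    # should agree with x2 up to scale
residual = np.linalg.norm(np.cross(x2, x2_from_H))
print(residual)
```

The cross product is used for the comparison because homogeneous points are only defined up to scale.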

Homography compatible with epipolar geometry

Suppose four points $\mathbf{X}_i$ are chosen on a scene plane. The correspondence $\mathbf{x}_i \leftrightarrow \mathbf{x}'_i$ of their images between two views defines a homography $H$, so $\mathbf{x}'_i = H\mathbf{x}_i$. These image correspondences $\mathbf{x}_i \leftrightarrow \mathbf{x}'_i$ also satisfy the epipolar constraint $\mathbf{x}_i'^T F \mathbf{x}_i = 0$. Hence

$$\mathbf{x}_i'^T F \mathbf{x}_i = (H\mathbf{x}_i)^T F \mathbf{x}_i = \mathbf{x}_i^T H^T F \mathbf{x}_i = 0.$$

This homography is said to be consistent with $F$.

Now suppose we choose four arbitrary points in the first view and four arbitrary points in the second view. Then a homography $\hat{H}$ can be computed. However, the correspondence $\mathbf{x}_i \leftrightarrow \mathbf{x}'_i = \hat{H}\mathbf{x}_i$ may not satisfy the epipolar constraint. In that case there is no scene plane containing the four corresponding scene points $\mathbf{X}_i$.

The epipole is mapped by the homography as $\mathbf{e}' = H\mathbf{e}$, since the epipoles are the images of the point where the baseline intersects the scene plane $\boldsymbol{\pi}$.

The epipolar lines are mapped by the homography as $\mathbf{l}'_e = H^{-T}\mathbf{l}_e$.

Any point $\mathbf{x}$ mapped by the homography lies on its corresponding epipolar line $\mathbf{l}'_e$, so $\mathbf{l}'_e = \mathbf{x}' \times (H\mathbf{x})$.
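These three properties can be verified on a synthetic example. The sketch below is illustrative only: the cameras $P = [I \mid \mathbf{0}]$, $P' = [A \mid \mathbf{a}]$, the plane vector `v`, and the helper names `skew` and `parallel` are assumptions, and the numbers are arbitrary. It uses $F = [\mathbf{a}]_\times A$ for the fundamental matrix of this camera pair.

```python
import numpy as np

def skew(u):
    """Matrix [u]_x such that skew(u) @ w equals np.cross(u, w)."""
    return np.array([[0.0, -u[2], u[1]],
                     [u[2], 0.0, -u[0]],
                     [-u[1], u[0], 0.0]])

def parallel(u, w, tol=1e-9):
    """True if two homogeneous 3-vectors are equal up to scale."""
    return np.linalg.norm(np.cross(u, w)) <= tol * np.linalg.norm(u) * np.linalg.norm(w)

# Arbitrary (made-up) second camera [A | a] and plane pi = (v, 1).
A = np.array([[1.0, 0.1, 0.2], [-0.1, 1.0, 0.3], [0.0, 0.05, 1.0]])
a = np.array([0.5, -0.2, 0.1])
v = np.array([0.1, 0.2, -0.25])

F = skew(a) @ A                 # fundamental matrix of the pair
H = A - np.outer(a, v)          # homography induced by the plane (Result 1)

e2 = a                          # epipole in the second view
e1 = np.linalg.solve(A, a)      # epipole in the first view, up to scale

x = np.array([0.4, -0.3, 1.0])  # a point in the first view
l1 = np.cross(e1, x)            # its epipolar line in the first view
l2 = F @ x                      # corresponding epipolar line in the second view

# Image x' of a 3D point on the ray of x that is NOT on the plane.
rho = 0.7
x2 = A @ x + rho * a

epipole_ok = parallel(H @ e1, e2)                # e' = H e
line_ok = parallel(np.linalg.inv(H).T @ l1, l2)  # l'_e = H^{-T} l_e
point_ok = parallel(np.cross(x2, H @ x), l2)     # l'_e = x' x (H x)
print(epipole_ok, line_ok, point_ok)
```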

A homography $H$ is compatible with a fundamental matrix $F$ if and only if the matrix $H^T F$ is skew-symmetric, that is,

$$H^T F + F^T H = 0. \qquad (*)$$

The compatibility constraint $(*)$ is an implicit equation in $H$ and $F$.

The following result from Chapter 8 concerns the fundamental matrix.

Result 2. The fundamental matrix $F$ corresponding to a pair of camera matrices $P = [I \mid \mathbf{0}]$ and $P' = [A \mid \mathbf{a}]$ is given by $F = [\mathbf{a}]_\times A$. The converse is also true.

Now we develop an explicit expression for the homography $H$ induced by a scene plane for a given $F$.

Result 3. Given the fundamental matrix $F$ between two views, the three-parameter family of homographies induced by a scene plane $\boldsymbol{\pi} = (\mathbf{v}^T, 1)^T$ is $H = A - \mathbf{e}'\mathbf{v}^T$, where $[\mathbf{e}']_\times A = F$ is any decomposition of the fundamental matrix.

Proof: Let $F$ be the fundamental matrix for the two views. Since $F = [\mathbf{e}']_\times A$, by Result 2 the camera matrices may be taken as $P = [I \mid \mathbf{0}]$ and $P' = [A \mid \mathbf{e}']$. By Result 1, the homography $H$ induced by $\boldsymbol{\pi}$ is $H = A - \mathbf{e}'\mathbf{v}^T$.

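Remark: Since $[\mathbf{e}']_\times^T = -[\mathbf{e}']_\times$ and $[\mathbf{e}']_\times \mathbf{e}' = \mathbf{0}$, we have

$$F^T H = ([\mathbf{e}']_\times A)^T (A - \mathbf{e}'\mathbf{v}^T) = -A^T [\mathbf{e}']_\times (A - \mathbf{e}'\mathbf{v}^T) = -A^T [\mathbf{e}']_\times A + A^T [\mathbf{e}']_\times \mathbf{e}'\mathbf{v}^T = -A^T [\mathbf{e}']_\times A$$

and

$$H^T F = (F^T H)^T = A^T [\mathbf{e}']_\times A = -F^T H.$$

Hence $H^T F + F^T H = 0$, and the homography $H = A - \mathbf{e}'\mathbf{v}^T$ is compatible with the fundamental matrix $F$.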
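The compatibility constraint is easy to test numerically. Below is a small NumPy sketch under assumed, arbitrary values for $A$, $\mathbf{e}'$, and $\mathbf{v}$; the helper `is_compatible` and its tolerance are illustrative choices, not part of the lecture. It checks that a plane-induced homography passes the test while an unrelated matrix (here the identity) does not.

```python
import numpy as np

def skew(u):
    """Matrix [u]_x such that skew(u) @ w equals np.cross(u, w)."""
    return np.array([[0.0, -u[2], u[1]],
                     [u[2], 0.0, -u[0]],
                     [-u[1], u[0], 0.0]])

def is_compatible(H, F, tol=1e-9):
    """Check H^T F + F^T H = 0, i.e. that H^T F is skew-symmetric."""
    S = H.T @ F
    return np.linalg.norm(S + S.T) <= tol * np.linalg.norm(H) * np.linalg.norm(F)

# Arbitrary (made-up) decomposition F = [e']_x A and plane vector v.
A = np.array([[1.0, 0.1, 0.2], [-0.1, 1.0, 0.3], [0.0, 0.05, 1.0]])
e2 = np.array([0.5, -0.2, 0.1])
v = np.array([0.1, 0.2, -0.25])

F = skew(e2) @ A
H_plane = A - np.outer(e2, v)   # member of the family in Result 3
H_other = np.eye(3)             # a homography unrelated to F

print(is_compatible(H_plane, F))   # expected to hold
print(is_compatible(H_other, F))   # expected to fail for these values
```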

Corollary 1. A transformation $H$ is the homography between two images induced by some world plane if and only if the fundamental matrix $F$ for the two images has a decomposition $F = [\mathbf{e}']_\times H$.

Result 4. Given the cameras in the canonical form $P = [I \mid \mathbf{0}]$ and $P' = [A \mid \mathbf{a}]$, the plane $\boldsymbol{\pi}$ that induces a given homography $H$ between the views has coordinates $\boldsymbol{\pi} = (\mathbf{v}^T, 1)^T$, where $\mathbf{v}$ may be obtained linearly by solving the equations $\lambda H = A - \mathbf{a}\mathbf{v}^T$, which are linear in the entries of $\mathbf{v}$ and $\lambda$.

Remark. The equations $\lambda H = A - \mathbf{a}\mathbf{v}^T$ will have an exact solution $\mathbf{v}$ if $H$ satisfies the compatibility constraint $H^T F + F^T H = 0$ with $F$.

If $H$ is computed numerically from noisy data, then the equations $\lambda H = A - \mathbf{a}\mathbf{v}^T$ may not yield an exact solution.
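One way to solve the equations of Result 4 is to stack the nine entries of $\lambda H + \mathbf{a}\mathbf{v}^T = A$ into a linear system in the four unknowns $(\mathbf{v}, \lambda)$ and solve it in the least-squares sense, so that noisy input still gives an answer together with a residual. The sketch below takes this approach; the function name `plane_from_homography`, the use of least squares, and the test values are assumptions made for illustration.

```python
import numpy as np

def plane_from_homography(H, A, a):
    """Solve lambda * H = A - a v^T for (v, lambda) by linear least squares."""
    M = np.zeros((9, 4))
    rhs = np.zeros(9)
    for j in range(3):
        for k in range(3):
            r = 3 * j + k
            M[r, k] = a[j]        # coefficient of v_k in entry (j, k)
            M[r, 3] = H[j, k]     # coefficient of lambda in entry (j, k)
            rhs[r] = A[j, k]
    sol, *_ = np.linalg.lstsq(M, rhs, rcond=None)
    residual = np.linalg.norm(M @ sol - rhs)
    return sol[:3], sol[3], residual

# Arbitrary (made-up) camera and plane; H is known only up to scale.
A = np.array([[1.0, 0.1, 0.2], [-0.1, 1.0, 0.3], [0.0, 0.05, 1.0]])
a = np.array([0.5, -0.2, 0.1])
v_true = np.array([0.1, 0.2, -0.25])
H = 2.5 * (A - np.outer(a, v_true))

v, lam, residual = plane_from_homography(H, A, a)
print(v, lam, residual)
```

For a compatible $H$ the residual is numerically zero; a clearly nonzero residual indicates that $H$ does not satisfy the compatibility constraint exactly.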

Plane induced homographies

So far we have seen how to compute the induced homography $H$ if the coordinates of the scene plane $\boldsymbol{\pi}$ are given. The scene plane $\boldsymbol{\pi}$ can also be specified by three points, or by a line and a point.

Computing H from three given points on π

Suppose we are given the images $\mathbf{x}_i$ in the first view and the corresponding images $\mathbf{x}'_i$ in the second view of three scene points $\mathbf{X}_i$, together with the fundamental matrix $F$ for the two views.

There are two ways of computing $H$ from three given points on $\boldsymbol{\pi}$: an implicit method and an explicit method.

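Implicit Method

First, the homography $H$ may be determined from the four correspondences $\mathbf{x}'_i = H\mathbf{x}_i$ for $i = 1, 2, 3$, and $\mathbf{e}' = H\mathbf{e}$.

If we let $\mathbf{x}_i = (x_i, y_i, w_i)^T$ and $\mathbf{x}'_i = (x'_i, y'_i, w'_i)^T$, the equation $\mathbf{x}'_i \times H\mathbf{x}_i = \mathbf{0}$ yields

$$\begin{bmatrix} \mathbf{0}^T & -w'_i \mathbf{x}_i^T & y'_i \mathbf{x}_i^T \\ w'_i \mathbf{x}_i^T & \mathbf{0}^T & -x'_i \mathbf{x}_i^T \end{bmatrix} \mathbf{h} = \mathbf{0}, \qquad \mathbf{h} = (h_{11}, h_{12}, h_{13}, h_{21}, h_{22}, h_{23}, h_{31}, h_{32}, h_{33})^T,$$

which can be written as $A_i \mathbf{h} = \mathbf{0}$.

Algorithm
(i) For each correspondence $\mathbf{x}_i \leftrightarrow \mathbf{x}'_i$ compute $A_i$. Only the first two rows are needed.
(ii) Assemble the four $2 \times 9$ matrices $A_i$ into a single $8 \times 9$ matrix $A$.
(iii) Obtain the SVD of $A$. The solution for $\mathbf{h}$ is the last column of $V$.
(iv) Determine $H$ from $\mathbf{h}$.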
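The four steps of the algorithm can be sketched in NumPy as follows. This is a minimal illustration without data normalization; the function names `dlt_rows` and `homography_dlt` and the test data are assumptions. In the setting of this lecture the fourth correspondence would be $\mathbf{e} \leftrightarrow \mathbf{e}'$, whereas the example below simply uses four generic points mapped by a known homography.

```python
import numpy as np

def dlt_rows(x, xp):
    """Two rows of A_i from the equation x' x (H x) = 0."""
    xq, yq, wq = xp
    zero = np.zeros(3)
    return np.array([np.concatenate([zero, -wq * x, yq * x]),
                     np.concatenate([wq * x, zero, -xq * x])])

def homography_dlt(pts, pts_p):
    """Estimate H (up to scale) from four or more correspondences."""
    A = np.vstack([dlt_rows(x, xp) for x, xp in zip(pts, pts_p)])  # steps (i), (ii)
    _, _, Vt = np.linalg.svd(A)                                    # step (iii)
    h = Vt[-1]                  # last row of V^T = last column of V
    return h.reshape(3, 3)      # step (iv): h is stored row by row

# Made-up ground truth and four points, no three of them collinear.
H_true = np.array([[1.2, 0.1, 0.3], [-0.2, 0.9, 0.1], [0.05, 0.02, 1.0]])
pts = [np.array(p) for p in ([0.0, 0.0, 1.0], [1.0, 0.0, 1.0],
                             [0.0, 1.0, 1.0], [1.0, 1.0, 1.0])]
pts_p = [H_true @ x for x in pts]

H_est = homography_dlt(pts, pts_p)
H_est = H_est / H_est[2, 2]     # fix the arbitrary scale for comparison
print(H_est)
```

The two rows used here become degenerate when $w'_i = 0$, so a correspondence with a point at infinity in the second view (for example an epipole at infinity) would need a different pair of rows.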

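Explicit Method

Second, the positions of the three points $\mathbf{X}_i$ are recovered in a projective reconstruction, the plane $\boldsymbol{\pi}$ is determined by solving the equation

$$\begin{bmatrix} \mathbf{X}_1^T \\ \mathbf{X}_2^T \\ \mathbf{X}_3^T \end{bmatrix} \boldsymbol{\pi} = \mathbf{0},$$

and the homography $H$ is computed from the coordinates of the plane $\boldsymbol{\pi}$ using Result 1.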
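Assuming the three points have already been reconstructed, the remaining two steps of the explicit method can be sketched as below. The null vector of the $3 \times 4$ matrix is taken from the SVD, which is one common choice rather than the only one; the function names and numeric values are illustrative assumptions.

```python
import numpy as np

def plane_through_points(X1, X2, X3):
    """Null vector pi of the 3x4 matrix with rows X_i^T (homogeneous points)."""
    M = np.vstack([X1, X2, X3])
    _, _, Vt = np.linalg.svd(M)
    return Vt[-1]               # right singular vector of the zero singular value

def homography_from_plane(A, a, plane):
    """Result 1: H = A - a v^T, after scaling the plane to (v^T, 1)^T."""
    v = plane[:3] / plane[3]    # requires a nonzero last coordinate
    return A - np.outer(a, v)

# Made-up camera [A | a] and plane; the 3D points are placed on the plane.
A = np.array([[1.0, 0.1, 0.2], [-0.1, 1.0, 0.3], [0.0, 0.05, 1.0]])
a = np.array([0.5, -0.2, 0.1])
v_true = np.array([0.1, 0.2, -0.25])
rays = [np.array(p) for p in ([0.0, 0.0, 1.0], [1.0, 0.0, 1.0], [0.0, 1.0, 1.0])]
points = [np.append(x, -v_true @ x) for x in rays]   # X = (x^T, -v^T x)^T

plane = plane_through_points(*points)
H = homography_from_plane(A, a, plane)
print(plane / plane[3])
```

The division by the last plane coordinate fails for a plane through the first camera centre, which cannot be written in the form $(\mathbf{v}^T, 1)^T$.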

Recall the following result from Chapter 8.

Result 5. The camera matrices corresponding to a fundamental matrix $F$ may be chosen as $P = [I \mid \mathbf{0}]$ and $P' = [[\mathbf{e}']_\times F \mid \mathbf{e}']$.

The following result can be found in A3.4 on page 555 of the textbook (first edition).

Result 6. $[\mathbf{e}']_\times [\mathbf{e}']_\times F = F$ (up to scale).

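Result 7. Given $F$ and the three image point correspondences $\mathbf{x}_i \leftrightarrow \mathbf{x}'_i$, the homography induced by the plane of the 3D points is $H = A - \mathbf{e}'(M^{-1}\mathbf{b})^T$, where $A = [\mathbf{e}']_\times F$, $\mathbf{b}$ is a 3-vector with components

$$b_i = \frac{(\mathbf{x}'_i \times (A\mathbf{x}_i))^T (\mathbf{x}'_i \times \mathbf{e}')}{\|\mathbf{x}'_i \times \mathbf{e}'\|^2},$$

and $M$ is a $3 \times 3$ matrix with rows $\mathbf{x}_i^T$.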

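Proof: Let $F$ be the fundamental matrix for the two views, and let $\mathbf{x}_i \leftrightarrow \mathbf{x}'_i$ be the three point correspondences. By Result 5, $A = [\mathbf{e}']_\times F$, and hence, by Result 6, $F = [\mathbf{e}']_\times A$ up to scale. By Result 3, $H = A - \mathbf{e}'\mathbf{v}^T$, where $(\mathbf{v}^T, 1)^T = \boldsymbol{\pi}$. Since $\mathbf{x}'_i = H\mathbf{x}_i$, we have $\mathbf{x}'_i = A\mathbf{x}_i - \mathbf{e}'(\mathbf{v}^T \mathbf{x}_i)$.

Each correspondence $\mathbf{x}_i \leftrightarrow \mathbf{x}'_i$ generates a linear constraint on $\mathbf{v}$. The vectors $\mathbf{x}'_i$ and $A\mathbf{x}_i - \mathbf{e}'(\mathbf{v}^T \mathbf{x}_i)$ are parallel. Hence

$$\mathbf{x}'_i \times (A\mathbf{x}_i - \mathbf{e}'(\mathbf{v}^T \mathbf{x}_i)) = \mathbf{0}.$$

This simplifies to

$$(\mathbf{x}'_i \times A\mathbf{x}_i) - (\mathbf{x}'_i \times \mathbf{e}')(\mathbf{v}^T \mathbf{x}_i) = \mathbf{0}.$$

Taking the scalar product with $\mathbf{x}'_i \times \mathbf{e}'$, we have

$$(\mathbf{x}'_i \times A\mathbf{x}_i)^T (\mathbf{x}'_i \times \mathbf{e}') - (\mathbf{x}'_i \times \mathbf{e}')^T (\mathbf{x}'_i \times \mathbf{e}')(\mathbf{v}^T \mathbf{x}_i) = 0.$$

Hence

$$\mathbf{x}_i^T \mathbf{v} = \mathbf{v}^T \mathbf{x}_i = \frac{(\mathbf{x}'_i \times A\mathbf{x}_i)^T (\mathbf{x}'_i \times \mathbf{e}')}{\|\mathbf{x}'_i \times \mathbf{e}'\|^2} = b_i \ \text{(say)}.$$

Each correspondence $\mathbf{x}_i \leftrightarrow \mathbf{x}'_i$ generates an equation $\mathbf{x}_i^T \mathbf{v} = b_i$. Collecting the three equations, we have $M\mathbf{v} = \mathbf{b}$, where $M$ is the matrix such that $M^T = [\mathbf{x}_1, \mathbf{x}_2, \mathbf{x}_3]$. From $M\mathbf{v} = \mathbf{b}$ we get $\mathbf{v} = M^{-1}\mathbf{b}$. Therefore $H = A - \mathbf{e}'(M^{-1}\mathbf{b})^T$.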

Remark. The equation $M\mathbf{v} = \mathbf{b}$ cannot be solved for $\mathbf{v}$ if $M$ is singular, that is, if $\det(M) = 0$. The determinant of $M$ is zero if the three image points $\mathbf{x}_i$ in the first view are collinear. Geometrically, three collinear image points arise from collinear world points, or from coplanar world points where the plane contains the first camera center.
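Result 7 and the degeneracy noted in the remark can be sketched in NumPy as follows. The function name `homography_from_three_points`, the determinant threshold, and the synthetic data are illustrative assumptions; the second epipole $\mathbf{e}'$ is passed in directly rather than estimated from $F$.

```python
import numpy as np

def skew(u):
    """Matrix [u]_x such that skew(u) @ w equals np.cross(u, w)."""
    return np.array([[0.0, -u[2], u[1]],
                     [u[2], 0.0, -u[0]],
                     [-u[1], u[0], 0.0]])

def homography_from_three_points(F, e2, xs, xps):
    """H = A - e' (M^{-1} b)^T with A = [e']_x F (Result 7)."""
    A = skew(e2) @ F
    M = np.vstack(xs)                        # rows are x_i^T
    if abs(np.linalg.det(M)) < 1e-12:        # arbitrary illustrative threshold
        raise ValueError("the three points in the first view are collinear")
    b = np.empty(3)
    for i, (x, xp) in enumerate(zip(xs, xps)):
        c = np.cross(xp, e2)                 # x'_i x e'
        b[i] = np.cross(xp, A @ x) @ c / (c @ c)
    v = np.linalg.solve(M, b)                # solve M v = b
    return A - np.outer(e2, v)

# Synthetic test: cameras [I | 0], [A0 | a] and a plane (v0, 1), all made up.
A0 = np.array([[1.0, 0.1, 0.2], [-0.1, 1.0, 0.3], [0.0, 0.05, 1.0]])
a = np.array([0.5, -0.2, 0.1])
v0 = np.array([0.1, 0.2, -0.25])
F = skew(a) @ A0
H_true = A0 - np.outer(a, v0)

xs = [np.array(p) for p in ([0.0, 0.0, 1.0], [1.0, 0.0, 1.0], [0.0, 1.0, 1.0])]
xps = [H_true @ x for x in xs]
H = homography_from_three_points(F, a, xs, xps)

# A fourth point of the plane should be mapped consistently (up to scale).
x4 = np.array([0.7, -0.4, 1.0])
err = np.linalg.norm(np.cross(H @ x4, H_true @ x4))
print(err)
```

The recovered $H$ generally differs from `H_true` by an overall scale factor, which is why the comparison uses a cross product of the mapped points instead of comparing matrix entries.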