
early span, with 12 coefficients, all other tensors $\langle \psi_1, \psi_2, \psi_i \rangle$ (running over all views $\psi_i$). Each additional view $\psi_i$ contributes linearly 12 parameters, and its tensor with $\psi_1, \psi_2$ can be determined linearly using 6 matching points. More details on the material presented in this section can be found in [14].

6 Summary

This paper has presented results on the goal of capturing the inter-relationship, geometrically and photometrically, across multiple perspective views. The main analysis vehicle is the "trilinear tensor", which captures in a very simple and straightforward manner the basic structures associated with this problem of research. We have not described in detail the particular applications to reconstruction, recognition and animation, but these can be found in some of the references provided in the text.

References

[1] P.A. Beardsley, A. Zisserman, and D.W. Murray. Navigation using affine structure from motion. In Proceedings of the European Conference on Computer Vision, pages 85-96, Stockholm, Sweden, May 1994.
[2] S. Carlsson. Duality of reconstruction and positioning from projective views. In Proceedings of the Workshop on Scene Representations, Cambridge, MA, June 1995.
[3] O.D. Faugeras. Stratification of three-dimensional vision: projective, affine and metric representations. Journal of the Optical Society of America, 12(3):465-484, 1995.
[4] O.D. Faugeras and B. Mourrain. On the geometry and algebra of the point and line correspondences between N images. In Proceedings of the International Conference on Computer Vision, Cambridge, MA, June 1995.
[5] R. Hartley. Lines and points in three views - a unified approach. In Proceedings Image Understanding Workshop, Monterey, CA, November 1994.
[6] R. Hartley. Projective reconstruction and invariants from multiple images. IEEE Transactions on Pattern Analysis and Machine Intelligence, 16(10):1036-1040, 1994.
[7] A. Heyden. Reconstruction from image sequences by means of relative depths. In Proceedings of the International Conference on Computer Vision, pages 1058-1063, Cambridge, MA, June 1995.
[8] B.K.P. Horn and B.G. Schunck. Determining optical flow. Artificial Intelligence, 17:185-203, 1981.
[9] B.K.P. Horn and E.J. Weldon. Direct methods for recovering motion. International Journal of Computer Vision, 2:51-76, 1988.
[10] H.C. Longuet-Higgins and K. Prazdny. The interpretation of a moving retinal image. Proceedings of the Royal Society of London B, 208:385-397, 1980.
[11] Q.T. Luong and T. Vieville. Canonic representations for the geometries of multiple projective views. In Proceedings of the European Conference on Computer Vision, pages 589-599, Stockholm, Sweden, May 1994. Springer Verlag, LNCS 800.
[12] A. Shashua. Projective structure from uncalibrated images: structure from motion and recognition. IEEE Transactions on Pattern Analysis and Machine Intelligence, 16(8):778-790, 1994.
[13] A. Shashua. Algebraic functions for recognition. IEEE Transactions on Pattern Analysis and Machine Intelligence, 17(8):779-789, 1995.
[14] A. Shashua and S. Avidan. The rank 4 constraint in multiple ($\geq 3$) view geometry. Technical report, Technion, CS Dept., October 1995.
[15] A. Shashua and K.J. Hanna. The tensor brightness constraints: Direct estimation of motion revisited. Technical report, Technion, CS Dept., October 1995.
[16] A. Shashua and N. Navab. Relative affine structure: Theory and application to 3D reconstruction from perspective views. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 483-489, Seattle, Washington, 1994.
[17] A. Shashua and M. Werman. On the trilinear tensor of three perspective views and its underlying geometry. In Proceedings of the International Conference on Computer Vision, June 1995.
[18] M.E. Spetsakis and J. Aloimonos. A unified theory of structure from motion. In Proceedings Image Understanding Workshop, 1990.
[19] B. Triggs. Matching constraints and the joint image. In Proceedings of the International Conference on Computer Vision, pages 338-343, Cambridge, MA, June 1995.
[20] S. Ullman and R. Basri. Recognition by linear combination of models. IEEE Transactions on Pattern Analysis and Machine Intelligence, PAMI-13:992-1006, 1991. Also in M.I.T. AI Memo 1052, 1989.
[21] D. Weinshall, M. Werman, and A. Shashua. Shape tensors for efficient and learnable indexing. In Proceedings of the Workshop on Scene Representations, Cambridge, MA, June 1995.
[22] M. Werman and A. Shashua. Elimination: An approach to the study of 3D-from-2D. In Proceedings of the International Conference on Computer Vision, June 1995.

appear in the constraint equation we have a parametric constraint, which means that every pixel with non-vanishing spatial gradient contributes an equation with a fixed number of unknowns. Moreover, the constraint equation is linear in the elements of $\alpha_i^{jk}$. Therefore, in principle, we can obtain a linear least-squares solution for the elements of $\alpha_i^{jk}$ just by measuring spatial and temporal derivatives. More details can be found in [15].

5 Rank 4: Properties of the Tensor Manifold

The ultimate goal is to find the means for combining together the contribution of any number of views, not only three. To this end we suggest to start with investigating the space of all trilinear tensors and look for rank deficiencies in that space. Any finding of that sort is extremely useful because it readily allows a statistical way of putting together many views, simply by means of factorization. The main result is that trilinear tensors across $m > 3$ views are embedded in a low dimensional linear subspace.

Consider the following arrangement: we are given views $\psi_1, \psi_2, \ldots, \psi_{m+2}$, $m \geq 1$. For each (ordered) triplet of views there exists a unique trilinear tensor. Rather than considering all triplets of views, we consider the $m$ triplets that contain $\psi_1, \psi_2$, i.e., the triplets $\langle \psi_1, \psi_2, \psi_i \rangle$, $i = 3, \ldots, m+2$. Consider each of the tensors as a vector of 27 components (arrange the components arbitrarily, but stick with this arrangement for all tensors) and concatenate all these vectors as columns of a $27 \times m$ matrix. The question is: what is the rank of this matrix when $m \geq 27$? Clearly, if the rank is smaller than 27 we obtain a line of attack on the task of putting together many views.

The motivation for considering this arrangement is that a view adds only 12 parameters (up to scale). It may be the case that the redundancy of representing an additional view with 27 numbers (a column vector in the $27 \times m$ matrix), instead of 12, comes to bear only at a non-linear level, in which case it will not affect the rank of the system above. Therefore, a rank deficiency implies an important property of a collection of tensors.
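The rank question posed above can be probed numerically. The following NumPy sketch is only an illustration under stated assumptions: it takes the tensor definition of equation (4), $\alpha_i^{jk} = v'^k b_i^j - v''^j a_i^k$, keeps a hypothetical camera $[A, v']$ fixed for $\psi_2$, draws random hypothetical cameras $[B, v'']$ for the remaining views, and stacks the resulting tensors as columns of a $27 \times m$ matrix. The function name `tensor`, the random data and the storage order `T[i, j, k]` are not from the paper. Since the tensor is linear in the 12 entries of $[B, v'']$, the printed rank is not expected to exceed 12.

```python
import numpy as np

def tensor(A, v1, B, v2):
    """T[i, j, k] = v'^k b_i^j - v''^j a_i^k, with a_i^k stored as A[k, i]."""
    return np.einsum('k,ji->ijk', v1, B) - np.einsum('j,ki->ijk', v2, A)

rng = np.random.default_rng(1)

# Hypothetical fixed camera [A, v'] relating views psi_1 and psi_2.
A = rng.normal(size=(3, 3))
v1 = rng.normal(size=3)

# One random camera [B, v''] per additional view; each tensor becomes a column.
m = 40
columns = [
    tensor(A, v1, rng.normal(size=(3, 3)), rng.normal(size=3)).ravel()
    for _ in range(m)
]
M = np.stack(columns, axis=1)          # shape (27, m)

rank = np.linalg.matrix_rank(M)
print("matrix shape:", M.shape, "numerical rank:", rank)
```

The same fixed flattening (`ravel`) is applied to every tensor, which mirrors the requirement that the arrangement of the 27 components be chosen once and kept for all columns.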
We can prove the following result: All trilinear tensors live in a manifold of $P^{26}$. The space of all trilinear tensors with two of the views fixed is a 12-dimensional linear subspace of $R^{27}$. Therefore, the rank is 12, thus each additional view adds, linearly, only 12 parameters as expected. An immediate consequence of this result is: A linear combination of tensors $\langle \psi_1, \psi_2, \psi_i \rangle$ and $\langle \psi_1, \psi_2, \psi_j \rangle$ produces an admissible tensor $\langle \psi_1, \psi_2, \psi \rangle$, for some view $\psi$.

The corollary is not as obvious as it may seem. All tensors live in a non-linear manifold because there are algebraic dependencies among the tensor elements. Thus, the line passing through two arbitrary points on that manifold does not necessarily live inside the manifold. The fact that it does for the selection of points described in the corollary is, therefore, not obvious. One application of the corollary, for instance, is view synthesis and animation.

A similar result applies to the space of all collineations (homography matrices) between two fixed views. Given some plane in space projecting onto views $\psi$ and $\psi'$, the corresponding image points are mapped to each other by a collineation (homography matrix), $Ap \cong p'$ for all matching pairs $p, p'$. Since the homography matrix $A$ depends on the orientation and location of the planar object, we obtain a family of homography matrices when we consider all possible planes. Consider homography matrices $A_1, A_2, \ldots, A_k$, each as a column vector in a $9 \times k$ matrix. We ask again, what is the rank of the system? It would be convenient if it were 4, because each additional homography matrix represents a plane, and a plane is determined by 4 parameters. We can prove the following result: The space of all homography matrices between two fixed views is embedded in a 4-dimensional linear subspace of $R^9$.

We can combine the rank 4 result with the result described previously, that a tensor can be contracted into three homography matrices, and obtain a "rank 4" result on the space of tensors, as follows.
We recall from Section 3 that the tensor $\alpha_i^{jk}$ can be contracted into three homography matrices, associated with three distinct planes, between $\psi$ and $\psi'$. Hence, consider the same situation as before where we have views $\psi_1, \psi_2, \ldots, \psi_{m+2}$ and consider the tensors of the triplets $\langle \psi_1, \psi_2, \psi_i \rangle$, $i = 3, \ldots, m+2$. But now, instead of arranging each tensor as a 27-component column vector, we arrange it in a $9 \times 3$ block, where each column is the homography $\alpha_i^{jk}$, $j = 1, 2, 3$. We obtain a $9 \times 3m$ matrix. Its rank must be 4. Consequently: A tensor of views $\langle \psi_1, \psi_2, \psi_3 \rangle$ and a "third" of the tensor $\langle \psi_1, \psi_2, \psi_4 \rangle$ linearly span, with 12 coefficients, all tensors $\langle \psi_1, \psi_2, \psi_i \rangle$ (over all views $\psi_i$). Each such tensor can be recovered using 6 matching points with $\psi_1$ and $\psi_2$.

We can improve on this and get an even tighter result on the minimal information required to linearly span the family of tensors $\langle \psi_1, \psi_2, \psi_i \rangle$ (for all views $\psi_i$). Assume we have the fundamental matrix $F$ and epipole $v'$. It is known from the work of [11] that the matrix $[v']_\times F$, where $[v']_\times$ is the skew-symmetric matrix associated with vector products, is a homography matrix (linearly independent from the three homography matrices provided by the tensor). We have therefore the following result: The tensor of views $\langle \psi_1, \psi_2, \psi_3 \rangle$ and the epipolar constraint (matrix $F$ and epipole $v'$) together lin-

Since every corresponding triplet $p, p', p''$ contributes four linearly independent equations, seven corresponding points across the three views uniquely determine (up to scale) the tensor $\alpha_i^{jk}$. More details and applications can be found in [13].

It readily follows from the fact that the tensor vanishes with the contraction of two covariant vectors and one contravariant vector that a matching triplet of a point and two lines provides one linear equation for the tensor elements. Likewise, two contractions with covariant vectors leave us with a covariant vector, thus three matching lines provide two linear equations for the tensor elements. Finally, because a point is defined by the intersection of two lines, three matching points provide four linear equations for the tensor elements (as explicitly specified above).

3 Contractions, Collineations, Fundamental Matrix

Consider the contraction $e_j \alpha_i^{jk}$ where $e = (1, 0, 0)^\top$. The result is a $3 \times 3$ matrix, denoted by $E_1$. Similarly, let $E_2 = \alpha_i^{2k}$, and $E_3 = \alpha_i^{3k}$. We obtain a remarkable and simple result: The three matrices $E_1, E_2, E_3$ are three homography matrices $E_j : \psi \mapsto \psi'$ of three distinct and intrinsic planes. In other words, the three matrices are collineations of the 2D projective plane mapping points in $\psi$ to points in $\psi'$ induced by three distinct planes, respectively. The orientation and location of each plane are determined by the motion parameters $B, v''$. More details and proofs can be found in [17].

Since 2D collineations are the building block for projective reconstruction, this result is very important. For example, the "fundamental" matrix $F$ between $\psi$ and $\psi'$ can be linearly determined from the tensor by

$$E_j^\top F + F^\top E_j = 0,$$

which yields 18 linear equations of rank 8 for $F$. Similarly, cross products between columns of two collineations provide epipolar lines which can be used to recover the epipole $v'$. We will also use this property of the tensor to concatenate together multiple ($> 3$) views, as described in the sequel.

4 Tensor Brightness Constraint

Consider all lines $s'$ in the 2D plane that are coincident with $p'$, i.e., $s'_k p'^k = 0$.
Note that the rows of the matrix $s^l_k$ are covariant vectors (they represent lines in the 2D plane), thus $s'$ is spanned by the rows of $s^l_k$, and in turn equation (2) still holds:

$$\rho\, s'_k v'^k + p^i s'_k a^k_i = 0, \tag{6}$$

for all covariant vectors $s'$ spanned by the two covariant vectors $(-1, 0, x')$ and $(0, -1, y')$ (representing vertical and horizontal lines, respectively). In particular, consider the linear combination with coefficients $I_x, I_y$, which are the components of the gradient vector $\nabla I$ measured at point $(x, y)$ in the first view. Thus,

$$s' = \begin{pmatrix} -I_x \\ -I_y \\ x' I_x + y' I_y \end{pmatrix}.$$

Our next step is to remove the contribution of $x', y'$ from $s'$ (this is the only place in equation (6) where correspondence is required), and for that purpose we will use the "constant brightness equation" due to [8]:

$$I_x u + I_y v + I'_t = 0,$$

where $u = x - x'$, $v = y - y'$ and $I'_t$ is the discrete temporal derivative at $(x, y)$, i.e., $I_2(x, y) - I_1(x, y)$ where $I_2$ and $I_1$ are the image intensity values of the second and first image, respectively. After substituting the constant brightness equation in $s'$ we obtain:

$$s' = \begin{pmatrix} -I_x \\ -I_y \\ I'_t + x I_x + y I_y \end{pmatrix}. \tag{7}$$

Thus equation (6) with the covariant vector $s'$ given by (7) is a brightness constraint equation that relates camera motion $A, v'$, object shape $\rho$ and brightness information in the form of spatial and temporal derivatives. The situation so far is not very different from that of Horn & Weldon [9], with the difference that here we are using an uncalibrated (projective) model, rather than the Longuet-Higgins & Prazdny [10] model of small motion, and we use a different style of notation.

The next step is identical in concept to the way the trilinear tensor was derived above. We eliminate the contribution of object shape $\rho$ by using the third view $\psi''$. The constant brightness equation between the first and third view becomes $I_x (x - x'') + I_y (y - y'') + I''_t = 0$, and by going through exactly the same steps as before we obtain:

$$\rho\, s''_j v''^j + p^i s''_j b^j_i = 0, \tag{8}$$

and

$$s'' = \begin{pmatrix} -I_x \\ -I_y \\ I''_t + x I_x + y I_y \end{pmatrix}. \tag{9}$$

We eliminate $\rho$ and obtain a new equation, the "Tensor Brightness Constraint":

$$s'_k s''_j p^i \alpha_i^{jk} = 0. \tag{10}$$

The new constraint relates the observer motion and certain measurable quantities (products of elements of $s'$ and $s''$) of image spatial and temporal derivatives. Because the structure of the world does not

tensor of 27 numbers (which happen to be the same numbers of [18]). These equations were called "trilinearities". Thus 7 matching points across three views are sufficient for linearly generating the intrinsic geometry of three views. In that work the equation relating the tensor and the (projective) motion parameters was first derived. Hartley [5] rederived the tensor equation but with a different indexing scheme (adopted later in this paper), and Shashua & Werman [17] have shown that certain rearrangements of the tensor elements (equivalent to certain contractions of the tensor) produce collineations of the 2D plane, i.e., projective transformations due to intrinsic planes in space, a result which is important for unfolding the reconstruction problem in a very simple manner and for the question of multiple ($> 3$) views, as described in the sequel.

The trilinearities were rederived by Faugeras & Mourrain [4] using exterior algebra. That had two distinct advantages: first, a geometric interpretation was given to the trilinear equations; second, the method of derivation was simple and general, which later led to further work by Weinshall, Werman & Shashua [21] and Carlsson [2] on "dual" tensors. Similarly, the trilinearities were rederived by Triggs [19], using Penrose tensorial notation, and Heyden [7], which together with [4, 22] established the existence of quadlinear forms (with a total of 81 coefficients) across four views, with the negative result that further views would not add any new constraints.

2 The Trilinear Tensor

Consider two perspective views $\psi, \psi'$ of a 3D scene. Let $P$ be a point in 3D projective space projecting onto matching points $p \in \psi$, $p' \in \psi'$ in the 2D projective plane. The relationship between the 3D and 2D spaces is represented by the $3 \times 4$ matrices $[I, 0]$, $[A, v']$, i.e.,

$$p = [I, 0]P, \qquad p' \cong [A, v']P. \tag{1}$$

We may adopt the convention that $p = (x, y, 1)^\top$, $p' = (x', y', 1)^\top$, and therefore $P = (x, y, 1, \rho)$. The coordinates $(x, y)$, $(x', y')$ are matching points (with respect to some arbitrary image origin, say the geometric center of each image plane).
The vector $v'$ is the translational component of camera motion and is the view of the center of projection of the first camera in view $\psi'$. The matrix $A$ is a 2D projective transformation (collineation, homography matrix) from $\psi$ to $\psi'$ induced by some plane in space (the plane $\rho = 0$). In a calibrated camera setting the plane $\rho = 0$ is the plane at infinity, $A$ is the rotational component of camera motion, and $\rho = 1/z$ where $z$ is the depth of the point $P$ in the first camera coordinate frame. For more details on the representation and methods for projective reconstruction see [3, 6, 12, 16, 11, 1].

Let $s^l_k$ be the matrix

$$s = \begin{pmatrix} -1 & 0 & x' \\ 0 & -1 & y' \end{pmatrix}.$$

It can be verified by inspection that eqn. (1) can be represented by the following two equations (the standard method for removing the scale in that equation):

$$\rho\, s^l_k v'^k + p^i s^l_k a^k_i = 0, \tag{2}$$

with the standard summation convention that an index that appears as a subscript and superscript is summed over (known as a contraction). Superscripts denote contravariant indices (representing points in the 2D plane, like $v'$) and subscripts denote covariant indices (representing lines in the 2D plane, like the rows of $A$). Thus, $a^k_i$ is the element of the $k$'th row and $i$'th column of $A$, and $v'^k$ is the $k$'th element of $v'$. Note that we have two equations because $l = 1, 2$ is a free index.

Similarly, the camera transformation between views $\psi$ and $\psi''$ is

$$p'' \cong [B, v'']P.$$

Likewise, let $r^m_j$ be the matrix

$$r = \begin{pmatrix} -1 & 0 & x'' \\ 0 & -1 & y'' \end{pmatrix},$$

and likewise,

$$\rho\, r^m_j v''^j + p^i r^m_j b^j_i = 0. \tag{3}$$

Note that $k$ and $j$ are dummy indices (they are summed over) in equations (2) and (3), respectively. We used different dummy indices because now we are about to eliminate $\rho$ and combine the two equations together. Likewise, $l, m$ are free indices, therefore in the combination they must be separate indices. We eliminate $\rho$ and obtain a new equation:

$$(s^l_k v'^k)(p^i r^m_j b^j_i) - (r^m_j v''^j)(p^i s^l_k a^k_i) = 0,$$

and after grouping the common terms:

$$s^l_k r^m_j p^i (v'^k b^j_i - v''^j a^k_i) = 0,$$

and the term in parentheses is the trilinear tensor:

$$\alpha_i^{jk} = v'^k b^j_i - v''^j a^k_i, \qquad i, j, k = 1, 2, 3. \tag{4}$$

The tensorial equations (the trilinearities) are:

$$s^l_k r^m_j p^i \alpha_i^{jk} = 0. \tag{5}$$

Hence, we have four trilinear equations (note that $l, m = 1, 2$). In more explicit form, with the index order of equation (4) (first upper index $j$ paired with $\psi''$, second upper index $k$ paired with $\psi'$), these functions (referred to as "trilinearities") are:

$$x' \alpha_i^{13} p^i - x' x'' \alpha_i^{33} p^i + x'' \alpha_i^{31} p^i - \alpha_i^{11} p^i = 0,$$
$$x' \alpha_i^{23} p^i - x' y'' \alpha_i^{33} p^i + y'' \alpha_i^{31} p^i - \alpha_i^{21} p^i = 0,$$
$$y' \alpha_i^{13} p^i - y' x'' \alpha_i^{33} p^i + x'' \alpha_i^{32} p^i - \alpha_i^{12} p^i = 0,$$
$$y' \alpha_i^{23} p^i - y' y'' \alpha_i^{33} p^i + y'' \alpha_i^{32} p^i - \alpha_i^{22} p^i = 0.$$
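As a sanity check on equations (1) to (5), the following NumPy sketch builds the tensor of equation (4) from hypothetical camera matrices and evaluates the four trilinearities of equation (5) for one projected point. The function names, the test data and the storage convention `T[i, j, k]` are assumptions made for illustration; they are not taken from the paper. Under these assumptions the residuals should vanish up to floating-point round-off.

```python
import numpy as np

def trilinear_tensor(A, v1, B, v2):
    """T[i, j, k] = v'^k b_i^j - v''^j a_i^k, with a_i^k stored as A[k, i]."""
    return np.einsum('k,ji->ijk', v1, B) - np.einsum('j,ki->ijk', v2, A)

def trilinearities(T, p, p1, p2):
    """Return the 2x2 array s^l_k r^m_j p^i T[i, j, k] for l, m = 1, 2."""
    s = np.array([[-1.0, 0.0, p1[0]], [0.0, -1.0, p1[1]]])  # from p'  = (x', y', 1)
    r = np.array([[-1.0, 0.0, p2[0]], [0.0, -1.0, p2[1]]])  # from p'' = (x'', y'', 1)
    return np.einsum('lk,mj,i,ijk->lm', s, r, p, T)

# Hypothetical cameras [A, v'] and [B, v'']; near-identity matrices keep the
# third homogeneous coordinate of the projections away from zero.
rng = np.random.default_rng(0)
A = np.eye(3) + 0.1 * rng.normal(size=(3, 3))
B = np.eye(3) + 0.1 * rng.normal(size=(3, 3))
v1 = np.array([0.2, -0.1, 0.05])
v2 = np.array([-0.3, 0.1, 0.1])
T = trilinear_tensor(A, v1, B, v2)

# A 3D point P = (x, y, 1, rho) and its projections, normalized so that the
# third coordinate equals 1 as in the convention of Section 2.
p = np.array([0.3, -0.2, 1.0])
rho = 0.5
q1 = A @ p + rho * v1
q2 = B @ p + rho * v2
p1, p2 = q1 / q1[2], q2 / q2[2]

residual = trilinearities(T, p, p1, p2)
print("largest |trilinearity|:", np.abs(residual).max())
```

Writing the contraction with `einsum` keeps the index pattern of equation (5) visible in the code, which makes it easy to see which upper index of the tensor is paired with which view.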

Multiple-view Geometry and Photometry

Amnon Shashua
Technion, Israel Institute of Technology
Department of Computer Science
Haifa 32000, Israel
e-mail: shashua@cs.technion.ac.il

Abstract

The issue of how to represent and manipulate the information arising from a multitude of perspective pictures of a 3D scene is a recent and growing topic of interest. In this paper we present a summary account of up-to-date research, my own and with colleagues, on this topic, including: the trilinear constraints and associated tensor; the properties of the trilinear tensor and their relevance to camera geometry and invariance; rank deficiencies and N > 4 multiple-view geometry; and the photometric duality in the form of the "Tensor Brightness Constraint".

1 Introduction

The algebraic and geometric relations across multiple perspective views are a recent and growing topic of interest, relevant to a number of topics including (i) issues of 3D reconstruction from 2D data, (ii) representations of visual scenes from video data, (iii) image synthesis and animation, and (iv) visual recognition and indexing. Typical to these topics is the question about the limitations and possibilities of going from two-dimensional (2D) measurements of point matches (correspondences) across two or more views to properties of the three-dimensional (3D) object or scene.

Since the relationship between the 3D world and the 2D image space combines together 3D shape parameters, camera viewing parameters and 2D image measurements, the question of limitations and possibilities, in its widest scope, is about (i) 2D constraints across multiple views (matching constraints), and (ii) characterizations of the space of all images of a particular object (indexing functions). In other words, one seeks to best represent, in terms of efficiency, compactness, flexibility and scope of use, two kinds of manifolds: (i) the manifold of image and viewing parameters (invariance to shape), and (ii) the manifold of image and object parameters (invariance to viewing parameters).
A further distinction of this line of research is that one generally prefers to find linear functions that describe the inherently non-linear relationship between 3D objects and their 2D views. One can argue whether for a particular application, say landscape reconstruction from aerial photographs, it really matters if the final computational method is linear or non-linear (especially when a good initial guess can be obtained for the numerical solution), but in general, seeking ways to embed the non-linearities into a space where the manifold in question becomes a linear subspace constitutes an important step forward in understanding and manipulating visual information.

In the photometric domain, an issue of interest is how to combine brightness information from images, in the form of spatial and temporal derivatives, with geometric constraints arising from the fact that all images are coming from the same static 3D scene. This issue is relevant to the problem of recovering correspondences from a sequence of images, and to the problem of recovering scene structure and observer motion parameters directly from brightness measurements. An important step in achieving these goals is being able to recast the general motion estimation problem into a parametric framework, i.e., one in which each pixel contributes measurements to a fixed number of unknowns. The latter implies a process of elimination (of either structure or motion parameters), which brings us back to the geometric goals of embedding the non-linear view-manifold in spaces where it becomes a linear subspace, as we shall see in the sequel.

Finally, the meeting point between geometry and photometry comes from considering lines in addition to points. A line can be interpreted as coming from a 3D line (a geometric entity), or as an uncertainty caused by insufficient brightness measurements (the so-called "aperture" problem), a photometric entity. We will show in this paper that common to all these goals is the so-called "trilinear tensor".

1.1 Background on the Trilinear Tensor

The first work to combine points and lines into one framework is due to Aloimonos & Spetsakis [18].
They have shown that three views admit a set of 27 numbers arranged in three matrices. Matching points were shown to produce three linear equations for the 27 numbers (hence 9 points are required), and matching lines produce two equations (hence 13 lines are required). They have also addressed the connection between lines and uncertainty by showing that the aperture problem which is inherent in two views does not exist with three views. This line of research continued with Shashua [13], who, in attempting to generalize the "linear combination" result of Ullman & Basri [20], showed that the point geometry of three views actually produces 4 linearly independent equations, each having a trilinear form in image coordinates, for a
