Monocular SLAM Using a Rao-Blackwellised Particle Filter with Exhaustive Pose Space Search

2007 IEEE International Conference on Robotics and Automation
Roma, Italy, 10-14 April 2007

Masahiro Tomono

Abstract

This paper presents a method of 3-D SLAM using a single camera. We utilize a Rao-Blackwellised particle filter (RBPF) to deal with a large number of landmarks. A difficulty in monocular SLAM is robustness to outliers and noise, which may cause false estimates, especially under short baseline conditions. We propose an exhaustive pose-space search that finds all the plausible hypotheses efficiently using epipolar geometry. The obtained pose hypotheses are refined by the RBPF. Simulations and experiments show that the proposed method successfully performed 3-D SLAM with a small number of particles.

Index Terms: Object modeling, 3-D maps, 3-D reconstruction, Structure from motion, Dense reconstruction

I. INTRODUCTION

3-D Simultaneous Localization and Mapping (SLAM) is a challenge in mobile robotics. 6-DOF localization in a 3-D map is crucial in order for a robot to navigate in a complex environment and to perform a complicated task such as object carrying. Vision-based SLAM is a promising approach to this problem. In particular, monocular SLAM is attractive because its hardware configuration is simple. Furthermore, monocular SLAM can reconstruct distant objects in large environments since its baseline distance is variable. We consider in this paper a system which utilizes a single camera only. Motion sensors such as a gyro are not necessary but can be used to enhance accuracy and efficiency.

Monocular SLAM estimates motions and landmark locations in 3-D space using features extracted from images captured by a moving camera. Since a single image has no depth information, the system must reconstruct the depth of each feature from two or more images simultaneously with reconstructing the motion. This is a well-known problem referred to as Structure-from-Motion (SFM) in the computer vision community. This problem is especially crucial at the initialization phase, where the system has no 3-D reference points (landmarks) yet.
In SFM, the stability of the system heavily depends on outliers and noise in the feature positions in the images. Even small noise will affect the estimates significantly when the motion is small. Thus, robustness against outliers and noise is crucial. This paper presents a monocular SLAM scheme focusing on this problem. To increase robustness, the system searches all the motion hypotheses exhaustively. If the extracted features are noisy, many hypotheses can be generated. When the motion is small, it is difficult to determine which hypothesis is correct. Thus, we find all the plausible hypotheses. A key point is an efficient search by the reduction of the search space dimension from 5-D to 3-D using epipolar geometry. Another key point is that we employ a multiple hypothesis tracking scheme, in which the system tracks all the plausible hypotheses using the Rao-Blackwellised particle filter (RBPF) [13], [11]. The RBPF filters out false hypotheses and finds the correct one based on successive measurements. The RBPF is also suitable for vision-based SLAM since it can handle a large number of landmarks.

M. Tomono is with the Department of System Robotics, Toyo University, Kawagoe, SA, Japan tomono@eng.toyo.ac.jp

II. RELATED WORK

A monocular SLAM system was first developed by Davison [1]. His system employs the Extended Kalman Filter (EKF) and particle filters for landmark initialization. Eade et al. developed a monocular SLAM system using an RBPF for scalable SLAM [3]. Elinas et al. proposed σSLAM using binocular stereo and an RBPF with SIFT features to build indoor maps robustly [4]. These systems do not use motion sensors.

Monocular SLAM is regarded as a kind of bearing-only SLAM. In bearing-only SLAM, the landmark location is estimated using EKF with observations from two or more robot poses. When the distance between the robot poses is short, the Gaussianity of the obtained estimation is too poor to employ EKF. Several approaches to this problem have been proposed, including the multiple hypothesis filter [9], federated information sharing [16], and the inverse depth scheme [3], [12].
Monocular SLAM is also related to Structure-from-Motion (SFM), which has been studied in the computer vision community. SFM reconstructs motions and object shapes simultaneously based on epipolar geometry with an optimization scheme [8]. A number of methods have been developed, including the eight-point method [7], the factorization method [18], the trifocal tensor [6], bundle adjustment, and so on. Nistér developed visual odometry based on the SFM scheme [14]. Most of these systems assume that feature correspondences are given by a feature tracker, and employ a robust estimation technique such as RANSAC [5] in order to eliminate outliers.

SFM has the same structure as bearing-only SLAM. A difference between them is that bearing-only SLAM is an estimation problem with a motion model, for which motion sensors such as odometry and gyro are used in many cases. On the other hand, most SFM systems have no motion models. Our method is based on the SLAM scheme with a motion model, which predicts the motion using monocular images, not motion sensors.

1-4244-0602-1/07/$20.00 2007 IEEE.

Some systems in SFM need no feature correspondences. Dellaert et al. proposed an SFM method without correspondences based on the Expectation-Maximization scheme [2]. Makadia et al. proposed an SFM method without correspondences using a Radon transform [10]. The latter is based on a kind of voting scheme, and our approach is conceptually similar. The difference is that our approach searches the pose space directly by reducing the dimension based on the fact that the translation is not independent of the rotation under epipolar geometry.

III. BASIC FRAMEWORK

A. SLAM using a Rao-Blackwellised Particle Filter

The SLAM considered here estimates the joint probability density $p(x_{1:t}, m \mid z_{1:t}, u_{1:t}, c_{1:t})$ of robot poses $x_{1:t}$ and map $m$ [17]. Here, $z_{1:t}$ is the features observed from time step 1 to $t$, and $u_{1:t}$ is a sequence of motion commands. The map $m$ is a set of landmarks $m_i$, and $c_{1:t}$ is correspondences between landmarks and observed features. In this paper, as in other vision-based SLAM, a feature is a 2-D point extracted from a captured image, and a landmark is a 3-D point which corresponds to a feature. For simplicity, we equate robot pose with camera pose.

The RBPF-based SLAM factors $p(x_{1:t}, m \mid z_{1:t}, u_{1:t}, c_{1:t})$ as follows by exploiting the conditional independence between robot poses and landmark locations [13], [11].

$$p(x_{1:t}, m \mid z_{1:t}, u_{1:t}, c_{1:t}) = p(x_{1:t} \mid z_{1:t}, u_{1:t}, c_{1:t}) \prod_{i}^{n} p(m_i \mid x_{1:t}, z_{1:t}, u_{1:t}, c_{1:t}) \tag{1}$$

The joint distribution is decomposed into low-dimensional probabilities, which are much more tractable than the original one. The probability density of robot poses $p(x_{1:t} \mid z_{1:t}, u_{1:t}, c_{1:t})$ is represented using a particle filter. The probability density of a landmark location $p(m_i \mid x_{1:t}, z_{1:t}, u_{1:t}, c_{1:t})$ is represented with a Gaussian distribution which can be computed using an EKF. In implementation, the i-th particle $\nu_t^i$ at time $t$ is represented in the following fashion.

$$\nu_t^i = \langle x_t^i, (\mu_{1,t}^i, \Sigma_{1,t}^i), \ldots, (\mu_{N,t}^i, \Sigma_{N,t}^i) \rangle$$

$x_t^i$ is the robot pose estimate, and $\mu_{j,t}^i$, $\Sigma_{j,t}^i$ are the j-th landmark location estimate and its covariance matrix.
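As a rough illustration of the particle representation above, the following Python sketch stores one pose and a set of per-landmark Gaussians in each particle. The class names `Landmark` and `Particle`, the helper `make_particles`, and the 6-vector pose layout are assumptions made for this sketch; the paper does not specify its data structures.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Sequence

import numpy as np


@dataclass
class Landmark:
    """Gaussian estimate of one 3-D landmark (mu, Sigma)."""
    mean: np.ndarray  # shape (3,)
    cov: np.ndarray   # shape (3, 3)


@dataclass
class Particle:
    """One particle: a pose hypothesis plus its own landmark map."""
    pose: np.ndarray  # assumed layout: (x, y, z, roll, pitch, yaw)
    landmarks: Dict[int, Landmark] = field(default_factory=dict)
    weight: float = 1.0


def make_particles(pose_hypotheses: Sequence[Sequence[float]]) -> List[Particle]:
    """Create one uniformly weighted particle per pose hypothesis."""
    n = len(pose_hypotheses)
    return [Particle(pose=np.asarray(p, dtype=float), weight=1.0 / n)
            for p in pose_hypotheses]
```

Because each particle owns its landmark dictionary, landmark estimates are conditioned on that particle's pose history, which is the factorization expressed by Eq. (1).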
$p(x_{1:t} \mid z_{1:t}, u_{1:t}, c_{1:t})$ is estimated using a particle filter based on a motion model and a measurement model. Intuitively, the probability density of $x_{1:t+1}$ is predicted based on the motion model and the probability density of $x_{1:t}$, and then the importance weight of each particle is calculated using the likelihood of the observed features based on the measurement model. By resampling particles according to the importance weights, the probability density of $x_{1:t+1}$ is obtained. The details of this procedure in our system are presented in Section V.

B. Our Approach to Monocular SLAM

As mentioned in Section I, outliers in feature correspondences and/or noise in the feature positions cause false estimates, especially when the robot motion is small. This is crucial at the initialization phase, where the system has no 3-D landmarks yet. The essential point here is that many subsets of features could each generate a different hypothesis of the motion when there are outliers and/or noise. (Note that any subset of features would generate the same estimate if there were neither outliers nor noise.) RANSAC is a useful scheme to find a good hypothesis, but unfortunately the hypothesis having the best score is not necessarily the correct estimate. Fig. 1 shows examples of hypotheses generated by SFM with RANSAC. (b) is the correct estimate and (c) is the false one, but the score of (b) is smaller than that of (c).

To cope with this problem, our system searches all the plausible hypotheses over the pose space, and filters out false hypotheses using an RBPF in order to find the correct one. In this process, we utilize the fact that the translation is determined linearly based on epipolar geometry when a rotation angle is given. This enables us to search the pose (rotation and translation) space exhaustively only by traversing the rotation space. Furthermore, this implies that the robot pose has virtually 3 DOFs, for rotation only, and that the number of particles could be reduced. More concretely, we discretize the rotation space, and find the most plausible translation for each discretized rotation angle.
Given two point correspondences and a rotation angle, the translation is exactly determined up to scale based on epipolar geometry, as mentioned later. To find the most plausible translation from a set of point correspondences, we employ a voting scheme. For each pair of feature correspondences, we calculate the translation and vote into the corresponding bin in the translation space. Then, the bin with the highest score is selected as the most plausible translation at the rotation angle. By repeating this process for all the discretized rotation angles, we have the score distribution over the rotation space. Now, we choose the rotation angles having a high score as good hypotheses. Outliers can be eliminated through the voting process.

Our approach can find all the feasible hypotheses over the pose space exhaustively. Since the RBPF can filter out false hypotheses efficiently, the key point is whether the obtained hypotheses include the true one or not. RANSAC can generate feasible hypotheses, but it is not realistic to examine all the hypotheses over the pose space exhaustively since RANSAC searches hypotheses over the correspondence space. The exhaustiveness over the pose space is the major advantage of our approach.

Another advantage is that our approach is quite suitable for a motion sensor such as a gyro. Odometry is not applicable to robots that move with 6 DOF in 3-D space. Although a gyro measures merely rotation angles, it will be sufficient for our scheme. The measurements from a gyro can narrow the search region in the rotation space significantly, which will increase the accuracy and efficiency of our approach.
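The voting search described in this section can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the author's implementation: features are given as normalized homogeneous rays, the translation for a pair is taken as the cross product of the two epipolar-plane normals (the relation given formally in Section IV-B), and the function names, bin counts, and hemisphere folding are choices made for this sketch.

```python
import numpy as np


def vote_translation(q1, q2, R, n_theta=18, n_phi=72):
    """Vote for the translation direction under one candidate rotation R.

    q1, q2: (N, 3) arrays of matched rays (normalized homogeneous points).
    Returns (tau_hat, score); tau_hat is None if no valid pair exists.
    """
    normals = np.cross(q1, q2 @ R.T)             # n_i = q1_i x (R q2_i)
    i, j = np.triu_indices(len(normals), k=1)    # all pairs with i < j
    taus = np.cross(normals[i], normals[j])      # tau ~ n_i x n_j
    norms = np.linalg.norm(taus, axis=1)
    keep = norms > 1e-12                         # drop degenerate pairs
    if not np.any(keep):
        return None, 0
    taus = taus[keep] / norms[keep][:, None]
    taus[taus[:, 2] < 0] *= -1.0                 # tau and -tau vote together

    # Two polar angles index the voting table (bins are not equal-area).
    theta = np.arccos(np.clip(taus[:, 2], -1.0, 1.0))   # in [0, pi/2]
    phi = np.arctan2(taus[:, 1], taus[:, 0])            # in [-pi, pi]
    ti = np.minimum((theta / (np.pi / 2) * n_theta).astype(int), n_theta - 1)
    pj = np.minimum(((phi + np.pi) / (2 * np.pi) * n_phi).astype(int), n_phi - 1)

    table = np.zeros((n_theta, n_phi), dtype=int)
    np.add.at(table, (ti, pj), 1)
    bi, bj = np.unravel_index(np.argmax(table), table.shape)
    tau_hat = taus[(ti == bi) & (pj == bj)].mean(axis=0)
    return tau_hat / np.linalg.norm(tau_hat), int(table[bi, bj])


def search_rotations(q1, q2, candidates):
    """Score every candidate rotation; return (R, tau, score), best first."""
    results = []
    for R in candidates:
        tau, score = vote_translation(q1, q2, R)
        results.append((R, tau, score))
    return sorted(results, key=lambda item: item[2], reverse=True)
```

With noise-free matches and the true rotation, every pair votes for the same direction, so the peak bin collects all N(N-1)/2 votes; at a wrong rotation the votes generally scatter and the peak is lower. The sign of the translation is not resolved by this sketch.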

Fig. 1. Examples of reconstruction hypotheses: (a) scene; (b) correct hypothesis (score = 40); (c) false hypothesis (score = 42).

Fig. 2. Epipolar geometry.

IV. MOTION ESTIMATION BY EXHAUSTIVE SEARCH

A. Scoring Function over Pose Space

Let $I_1$ and $I_2$ be images captured from a moving camera, and $Q_1$ and $Q_2$ be the feature sets extracted from $I_1$ and $I_2$ respectively. The problem considered here is to estimate the motion $r = \langle \psi, \tau \rangle$ from $I_1$ to $I_2$ given $Q_1$ and $Q_2$. Here, $\psi$ is rotation angles (roll, pitch, yaw), and $\tau$ is a translation vector. Note that we assume the camera intrinsic parameters are known.

We propose a method that searches the pose space exhaustively. First, we define the scoring function $G(r)$ for motion $r$.

$$G(r) = \sum_{q_{1i} \in Q_1} \sum_{q_{2i} \in Q_2} g(q_{1i}, q_{2i}) \, D(r, q_{1i}, q_{2i}) \tag{2}$$

$g(q_{1i}, q_{2i})$ is the matching score of image feature points $q_{1i}$ and $q_{2i}$. $D(r, q_{1i}, q_{2i})$ represents the score related to errors in the epipolar constraint, to be mentioned later. By calculating $G(r)$ for each $r$, we have a score distribution over the pose space. The poses having a high score in this distribution are regarded as good hypotheses. However, it is not realistic to search directly all the points $r$ in the pose space since the dimension of $r$ is essentially five (the scale cannot be obtained from images only). Makadia et al. proposed a method of reducing computational complexity using spherical harmonic analysis [10]. We propose a method of calculating Eq. (2) more directly in the next subsection.

In the general framework, $q_{1i}$ and $q_{2i}$ cover all the points in the images, and no explicit correspondences between them are necessary. In this paper, however, for simple implementation, we assume explicit one-to-one correspondences between $Q_1$ and $Q_2$ obtained using a feature tracker such as the KLT tracker [15]. Thus, $g(q_{1i}, q_{2i})$ is defined as follows. This restriction will be removed in the near future.

$$g(q_{1i}, q_{2i}) = \begin{cases} 1, & q_{1i} \text{ and } q_{2i} \text{ are matched} \\ 0, & \text{otherwise} \end{cases} \tag{3}$$

B. Translation Estimation by Epipolar Geometry

Eq. (2) can be calculated efficiently by traversing the rotation space only.
The basic idea is to calculate the translation from two point correspondences using epipolar geometry given a rotation angle. Let $q_{1i}$ and $q_{2i}$ be feature points in images $I_1$ and $I_2$ respectively, as shown in Fig. 2. It is assumed that $q_{1i}$ and $q_{2i}$ are matched by a feature tracker. Then, the well-known epipolar constraint holds as follows.

$$(q_{1i} \times R q_{2i})^T \tau = 0 \tag{4}$$

Here, $R$ and $\tau$ are the rotation matrix and the translation vector of $r$ respectively. $q_{1i} \times R q_{2i}$ is the normal vector of the epipolar plane. We denote it by $n_i$. If the rotation matrix $R$ is constant, Eq. (4) will be a linear equation with respect to $\tau$. Given two point correspondences, we can easily obtain $\tau$ by computing the cross product of the normal vectors $n_i$ and $n_j$ of the two epipolar planes, which are determined by the point correspondences $(q_{1i}, q_{2i})$ and $(q_{1j}, q_{2j})$ $(i \neq j)$.

$$\tau = n_i \times n_j \tag{5}$$

We assume $\|\tau\| = 1$ since the real scale cannot be obtained from images. We calculate $D(r, q_{1i}, q_{2i})$ in Eq. (2) as follows.

$$D(r, q_{1i}, q_{2i}) = \sum_{q_{1j} \in Q_1} \sum_{q_{2j} \in Q_2} g(q_{1j}, q_{2j}) \, D_0(r, q_{1i}, q_{2i}) \, D_0(r, q_{1j}, q_{2j})$$

$$D_0(r, q_1, q_2) = e^{-\alpha \left| (q_1 \times R q_2)^T \tau \right|^2} \tag{6}$$

$D_0(r, q_1, q_2)$ represents the score related to errors in the epipolar constraint. $\alpha$ is a given constant.

C. Voting into Translation Space

We compute the scoring function $G(r)$ using a voting scheme.

(1) Discretization of the rotation space
We define a region which will cover all the possible rotation angles between $I_1$ and $I_2$, and discretize the region. We denote a discretized angle by $\psi_n$. This region is expected to be small in the case of monocular SLAM, which is usually a sequential process.

(2) Discretization of the translation space
We create a voting table by discretizing the translation space. Since $\tau$ is a unit vector, $\tau$ is represented by two angles in a polar coordinate system.

(3) Estimation of translation for a rotation angle
Given a discretized rotation angle $\psi_n$, we calculate the translation vector $\tau$ using Eq. (5) for each pair of feature points in $Q_1 \times Q_2$.
In this paper, we approximate $D_0$ in Eq. (6) simply as a delta function, and vote into the bin corresponding to $\tau$ in the translation voting table. Then, we find the bin $\tau_m$ having the maximal score. Now, we define $G(\langle \psi_n, \tau_m \rangle)$ as the maximal score.

(4) Estimation of rotation angle
By repeating step (3) for all the discretized rotation angles, we have the score distribution over the rotation space. Note that this is an approximation of $G(r)$.

(5) Selection of pose hypotheses
We employ as a pose hypothesis each $r$ at which $G(r)$ exceeds a given threshold $h_1$.

The hypotheses obtained at step (5) have insufficient accuracy because of the discretization of the pose space. Thus, we refine each hypothesis using a non-linear optimization method that minimizes the reprojection errors, which is a well-known technique in computer vision.

The computational complexity of this procedure is $O(KN^2)$ when we assume the feature correspondence is one-to-one as in Eq. (3). $K$ is the number of discretized angles in the rotation space, and $N$ is the number of feature points.

D. Elimination of Outliers

The voting process eliminates outliers in the motion estimation. If feature correspondences include outliers, the votes calculated from the outliers will be distributed randomly over the translation space. Thus, outliers will not affect the score distribution as long as the outlier rate is not significantly large (see Section VI-A). Once a pose $r$ is obtained, we can eliminate outliers with respect to $r$ using epipolar geometry. If $q_1$ and/or $q_2$ are outliers with respect to $r$, $|(q_1 \times R q_2)^T \tau|$ will be large. Thus, we eliminate the features which make the value larger than a given threshold.

V. SLAM FORMALIZATION

A. Motion Model

The motion model $p(x_t \mid x_{t-1}, u_t)$ is the probability density that the robot moves from $x_{t-1}$ to $x_t$ given motion command $u_t$. Without motion sensors, we define the motion model using a Gaussian mixture which consists of pose hypotheses estimated by the abovementioned method. Each pose hypothesis is represented by $N(x_t^i, \Sigma_{x_t^i})$. $x_t^i$ is calculated as $x_t^i = r_t^i + x_{t-1}$, where $r_t^i$ is the i-th pose hypothesis obtained by the voting scheme. The covariance is calculated as $\Sigma_{x_t^i} = (J_{x_t^i}^T \Sigma_z^{-1} J_{x_t^i})^{-1}$, where $J_{x_t^i}$ is the Jacobian of the perspective projection function $z_t = h(x_t, m_{c_t})$ with respect to the pose at $x_t^i$.
$\Sigma_z$ is the covariance of the feature noise.

If we have a motion sensor, we can reduce the number of possible hypotheses significantly. The motion model based on the velocity and acceleration estimated from the past trajectory is also useful to filter out the hypotheses. This is important from a practical point of view, but we do not discuss it in this paper.

B. Measurement Model

The measurement model $p(z_t \mid x_t, m_{c_t}, c_t)$ is the probability density that landmark $m_{c_t}$ is projected onto feature point $z_t$ when the pose is $x_t$. $c_t$ represents the correspondence between $m$ and $z_t$. We approximate this probability density with a Gaussian distribution. Based on the perspective projection model, the j-th feature point $z_{j,t}$ is a function of the pose $x_t$ and the corresponding landmark $m_j$, that is, $z_{j,t} = h(x_t, m_j)$. By linearizing this function using Taylor expansion with respect to $m_j$, we have the following equation.

$$z_{j,t} = \hat{z}_{j,t} + J_{m_{j,t-1}} (m_j - m_{j,t-1}) + v_j$$

Here, $\hat{z}_{j,t} = h(\hat{x}_t, m_{j,t-1})$. $\hat{x}_t$ is the prediction of $x_t$ by the motion model. $J_{m_{j,t-1}}$ is the Jacobian of $h(x_t, m_j)$ with respect to $m_j$ at $m_{j,t-1}$. $v_j$ is measurement noise in a 2-D feature point, which is represented by $N(0, R)$. Then, $z_{j,t}$ is represented as a Gaussian $N(\hat{z}_{j,t} + J_{m_{j,t-1}} (m_j - m_{j,t-1}), R)$.

C. Importance Weight

We calculate the importance weight of each particle according to FastSLAM 1.0 [17]. The proposal distribution is as follows.

$$p(x_{1:t} \mid z_{1:t-1}, u_{1:t}, c_{1:t-1}) = p(x_t \mid x_{t-1}, u_t) \, p(x_{1:t-1} \mid z_{1:t-1}, u_{1:t-1}, c_{1:t-1})$$

The importance weight $w_t^i$ is calculated as follows.

$$w_t^i = \frac{\text{target distribution}}{\text{proposal distribution}} = \frac{p(x_{1:t}^i \mid z_{1:t}, u_{1:t}, c_{1:t})}{p(x_{1:t}^i \mid z_{1:t-1}, u_{1:t}, c_{1:t-1})} = \eta \int p(z_t \mid m, x_t^i, c_t) \, p(m \mid x_{1:t-1}^i, z_{1:t-1}, c_{1:t-1}) \, dm$$

This is a convolution of $N(\hat{z}_{j,t} + J_{m_{j,t-1}} (m_j - m_{j,t-1}), R)$ and $N(m_{j,t-1}, \Sigma_{m_{j,t-1}})$. We have the importance weight as follows.

$$w_t^i \propto \prod_j N(\hat{z}_{j,t}^i, \; R + J_{m_{j,t-1}}^T \Sigma_{m_{j,t-1}} J_{m_{j,t-1}}) \tag{7}$$

D. Landmark Update

The probability density of landmark location is updated as follows. In the RBPF-SLAM, this is calculated using EKF.
$$p(m_{c_t} \mid x_{1:t}, z_{1:t}, c_{1:t}) = \eta \, p(z_t \mid x_t, m_{c_t}, c_t) \, p(m_{c_t} \mid x_{1:t-1}, z_{1:t-1}, c_{1:t-1})$$

In this paper, however, we estimate landmark locations simply using triangulation from feature points on two images. When the baseline distance is short, the errors in the location of a landmark reconstructed from images would be too large to represent by a Gaussian distribution because of the non-linearity of perspective projection. Thus, we estimate the landmark location using triangulation at every frame, and select the most accurate estimation based on the covariance matrix of the estimated landmark location. This is the landmark initialization problem, which is well known in monocular SLAM, and we will improve the process by employing EKF with the inverse depth scheme [12] in the future.

The covariance of a landmark location is calculated as follows [8]. $\Sigma_{m_{j,t}}$ is computed using SVD.

$$\Sigma_{m_{j,t}} = (J_{m_{j,t}}^T \Sigma_{z_{j,t}}^{-1} J_{m_{j,t}})^{-1}$$
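The covariance formula above, together with the rule of keeping the more accurate estimate, can be illustrated by the following Python sketch. It rests on several assumptions that the paper does not state: a pinhole camera with focal length `f` in pixels, a camera pose given as a rotation `R` and position `t`, isotropic feature noise, the Jacobians of the two views stacked into one matrix, and the covariance trace as the measure of accuracy. The SVD-based pseudo-inverse stands in for the SVD computation mentioned in the text.

```python
import numpy as np


def projection_jacobian(m, R, t, f):
    """2x3 Jacobian of a pinhole projection with respect to landmark m.

    Assumed model: p = R^T (m - t) in the camera frame,
    z = f * (p_x / p_z, p_y / p_z).
    """
    px, py, pz = R.T @ (m - t)
    d_proj = (f / pz) * np.array([[1.0, 0.0, -px / pz],
                                  [0.0, 1.0, -py / pz]])
    return d_proj @ R.T


def landmark_covariance(m, cameras, f, sigma_px):
    """Sigma_m = (J^T Sigma_z^{-1} J)^{-1} with Sigma_z = sigma_px^2 * I.

    cameras is a list of (R, t) pairs, one per image used in triangulation.
    """
    J = np.vstack([projection_jacobian(m, R, t, f) for R, t in cameras])
    information = J.T @ J / sigma_px ** 2
    return np.linalg.pinv(information)  # pseudo-inverse computed via SVD


def keep_more_accurate(old, new):
    """Return whichever (mean, covariance) pair has the smaller trace."""
    if old is None or np.trace(new[1]) < np.trace(old[1]):
        return new
    return old
```

For a landmark 10 units ahead of the first camera, widening the baseline from 0.1 to 1.0 shrinks the depth variance by a factor of 100 in this model, which is why a reconstruction from a short baseline is replaced once a better one is available.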

E. Procedure

Our method is performed in the following procedure.

(a) Initialization (t = 1 to k)
Since there are no landmarks at the initialization step, the system estimates the motion and landmarks simultaneously only from images without motion sensors. To ensure sufficient baseline distance, we use k images. Currently, k is given by a human operator.
(1) Camera pose estimation
We compute the score distribution from images $I_1$ and $I_k$ using the method in Section IV, and create particles for the hypotheses having a high score.
(2) Landmark initialization
For each particle, we eliminate outliers and reconstruct landmarks by triangulation using $I_1$ and $I_k$.

(b) Sequential reconstruction (t > k)
(1) Camera pose prediction
We compute the score distribution using the method in Section IV, and select the hypotheses having a high score. For each hypothesis, we eliminate outliers and estimate the pose. Then, we create new particles by pairing each hypothesis at time t and each particle at time t-1. The number of particles increases in this process.
(2) Importance weight and resampling
We calculate the importance weight of each particle based on Eq. (7), and resample particles according to the normalized importance weights. The number of particles is reduced to the original one.
(3) Landmark update
For each resampled particle, we eliminate outliers and reconstruct landmarks using triangulation. If the landmark is new, we just reconstruct it from the first two images in which the landmark appears. If the landmark is already registered, we update it when the covariance of the new reconstruction is smaller than the old one.

The real scale cannot be obtained only from images. The scale of the generated 3-D map is proportional to $\|\tau\|$ obtained at the initialization step. Note that we assume $\|\tau\| = 1$ as mentioned above. At the sequential reconstruction step, we estimate the scale factor using the 3-D map built so far. This is performed by minimizing the reprojection errors of the landmarks in the 3-D map onto the images using a nonlinear optimization method.

VI. EXPERIMENTS

A. Simulation

We carried out a simulation to evaluate the performance of our method by comparison with a RANSAC-based method.
Fig. 3 shows the success rates of pose estimation by the two methods. In this simulation, 50 landmarks are randomly generated in 3-D space, and are projected onto two images at different camera poses. Varying the feature noise level σ (Gaussian) and the outlier rate, the relative pose is reconstructed from the two images. The 8-point method [7] is used for reconstruction in the RANSAC-based method. The number of samples in RANSAC is 1000.

Fig. 3. Success rate of pose estimation: success rate [%] versus outlier rate [%] for our method, RANSAC (best), and RANSAC (all). (a) Feature noise σ = 0 [pixel]. (b) Feature noise σ = 0.5 [pixel].

In this simulation, we judged that a hypothesis passed if its error in each rotation angle was within 1.0 [deg]. For our method, success means that at least one of the hypotheses selected at step (5) in Section IV-C passed. The threshold $h_1$ was set to 70% of the maximal votes. For the RANSAC-based method, we employed two criteria. One is that it is successful when at least one of the 1000 samples passed. The other is that it is successful when the sample having the best score passed.

Theoretically, in the case of using the 8-point method, 1177 samples will provide a 99% success rate at a 50% outlier rate [8] when σ = 0. Fig. 3 (a) supports it. Fig. 3 (b) shows that the success rate of the RANSAC-based method is degraded more than that of our method when feature noise of σ = 0.5 [pixel] is added. From this result, we found that our method outperforms RANSAC in finding good hypotheses. We also found that the best hypothesis can be false. Multiple hypothesis tracking by RBPF addresses this problem.

Fig. 4 shows the simulations of monocular SLAM by our method. 50 landmarks are randomly generated in 3-D space, and the camera moves along the predefined trajectories: a circle with a radius of 700 [cm] and a straight line of 1600 [cm]. Feature noise of σ = 0.5 [pixel] is added to each feature on the images, and the outlier rate is 20% in each image. The number of particles in RBPF is 20.
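The 1177-sample figure quoted above for RANSAC with the 8-point method is consistent with the standard sample-count relation, in which the number of samples needed to draw at least one all-inlier sample with a given confidence is log(1 - confidence) / log(1 - w^s) for inlier ratio w and sample size s. The short Python sketch below evaluates it; the function names are hypothetical and chosen for this illustration.

```python
import math


def ransac_samples(outlier_rate, sample_size, confidence):
    """Samples needed so that, with the given confidence, at least one
    sample consists of inliers only."""
    inlier_ratio = 1.0 - outlier_rate
    p_good_sample = inlier_ratio ** sample_size
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - p_good_sample))


def success_probability(outlier_rate, sample_size, n_samples):
    """Probability that at least one of n_samples is outlier-free."""
    p_good_sample = (1.0 - outlier_rate) ** sample_size
    return 1.0 - (1.0 - p_good_sample) ** n_samples


print(ransac_samples(0.5, 8, 0.99))  # 1177
```

Under the same relation, the 1000 samples used in the simulation give a probability of about 0.98 of containing at least one outlier-free sample at a 50% outlier rate, assuming noise-free features.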
Fig. 4 shows the trajectory of the best of the 20 particles. In (a), the standard deviation of the camera poses in the best particle is $\sigma_x$ = 7.4 [cm], $\sigma_y$ = 55.0 [cm], $\sigma_z$ = 95.2 [cm], $\sigma_{roll}$ = 0.36 [deg], $\sigma_{pitch}$ = 0.40 [deg], $\sigma_{yaw}$ = 0.27 [deg]. In (b), the standard deviation is $\sigma_x$ = 12.0 [cm], $\sigma_y$ = 8.2 [cm], $\sigma_z$ = 21.9 [cm], $\sigma_{roll}$ = 0.37 [deg], $\sigma_{pitch}$ = 1.02 [deg], $\sigma_{yaw}$ = 0.08 [deg].

B. Experiments in Real Environments

We conducted experiments in indoor and outdoor environments. Images were captured by a human operator with a digital camera. The image size was 320 by 240 pixels. The number of particles was 20. The correspondences between feature points were obtained using the KLT tracker [15]. The number of feature points in an image was 50. The experiments were done off-line. The maps were reconstructed using key frames, each of which was extracted every n frames (n = 3 to 8). n was given by a human operator and was constant within one experiment.

Fig. 4. Simulation results: (a) circular trajectory; (b) straight trajectory (left: step 2, right: step 9).

Fig. 5. Result of an outdoor experiment: (a) snapshot of the environment; (b) result; (c) covariance of the landmarks.

Fig. 6. Result of another outdoor experiment.

Fig. 7. Result of an indoor experiment.

Fig. 5 shows the result of an outdoor experiment. The camera moved about 15 [m] and captured 30 images. The total number of landmarks is 173. Although outdoor environments have both near and far landmarks, they were reconstructed well as shown in (b). The covariance of each landmark is shown in (c). In this experiment, many hypotheses were generated at the initialization step. It is difficult to find which one is correct from a small number of measurements. The RBPF filtered out false hypotheses to find the correct one based on the measurements obtained over time.

Fig. 6 shows the result of another outdoor experiment. The camera moved about 80 [m] and captured 180 images. The total number of landmarks is 368. There are many landmarks at distant locations.

Fig. 7 shows the result of an indoor experiment. The camera moved in a 10 [m] × 10 [m] room and captured 180 images. The total number of landmarks is 773. This is a good example of 6-DOF motion in 3-D space.

In these experiments, the rotation space was discretized from -10 [deg] to 10 [deg] at 1 [deg] intervals for each angle. The computation time is currently 3 to 10 seconds per key frame. The computation time will be reduced by program customization and parallel processing.

VII. CONCLUSIONS

The paper has presented a monocular SLAM scheme using a Rao-Blackwellised particle filter. Our contribution is an exhaustive pose space search, in which all the plausible hypotheses are found efficiently using epipolar geometry and a voting scheme. By tracking and refining multiple hypotheses using the RBPF, 3-D SLAM is performed robustly. Future work includes error analysis and a more efficient implementation of the system.

REFERENCES

[1] A. J. Davison: Real-time simultaneous localization and mapping with a single camera, Proc. of CVPR 03, 2003.
[2] F. Dellaert, S. Seitz, C. Thorpe, and S. Thrun: Structure from motion without correspondences, Proc. of CVPR2000, 2000.
[3] E. Eade and T. Drummond: Scalable Monocular SLAM, Proc. of CVPR 06, 2006.
[4] P. Elinas, R. Sim, and J. J. Little: σSLAM: Stereo Vision SLAM Using the Rao-Blackwellised Particle Filter and a Novel Mixture Proposal Distribution, Proc. of ICRA2006, pp. 1564-1570, 2006.
[5] M. Fischler and R. Bolles: Random Sample Consensus: a Paradigm for Model Fitting with Application to Image Analysis and Automated Cartography, Communications ACM, 24:381-395, 1981.
[6] A. W. Fitzgibbon and A. Zisserman: Automatic Camera Recovery for Closed or Open Image Sequences, Proc. of ECCV 98, 1998.
[7] R. Hartley: In defense of the eight-point algorithm, IEEE Trans. PAMI, Vol. 19, No. 6, pp. 580-593, 1997.
[8] R. Hartley and A. Zisserman: Multiple View Geometry in Computer Vision, Cambridge University Press, 2000.
[9] N. M. Kwok and G. Dissanayake: An Efficient Multiple Hypothesis Filter for Bearing-Only SLAM, Proc. of IROS2004, 2004.
[10] A. Makadia, C. Geyer, and K. Daniilidis: Radon-based Structure from Motion Without Correspondences, Proc. of CVPR 05, 2005.
[11] M. Montemerlo, S. Thrun, D. Koller, and B. Wegbreit: FastSLAM: A Factored Solution to the Simultaneous Localization and Mapping Problem, Proc. of AAAI2002, 2002.
[12] J. M. M. Montiel, J. Civera, and A. J. Davison: Unified Inverse Depth Parametrization for Monocular SLAM, Proc. of RSS2006, 2006.
[13] K. Murphy and S. Russell: Rao-Blackwellised Particle Filtering for Dynamic Bayesian Networks, in A. Doucet ed.: Sequential Monte Carlo Methods in Practice, Springer, 2001.
[14] D. Nistér, O. Naroditsky, and J. Bergen: Visual Odometry, Proc. of CVPR 04, 2004.
[15] J. Shi and C. Tomasi: Good Features to Track, Proc. of CVPR 94, pp. 593-600, 1994.
[16] J. Sola, A. Monin, M. Devy, and T. Lemaire: Undelayed Initialization in Bearing Only SLAM, Proc. of IROS2005, pp. 2751-2756, 2005.
[17] S. Thrun, W. Burgard, and D. Fox: Probabilistic Robotics, The MIT Press, 2005.
[18] C. Tomasi and T. Kanade: Shape and Motion from Image Streams under Orthography: A Factorization Approach, Int. J. of Computer Vision, 9(2):137-154.