Monocular SLAM Using a Rao-Blackwellised Particle Filter with Exhaustive Pose Space Search

2007 IEEE International Conference on Robotics and Automation
Roma, Italy, 10-14 April 2007

Monocular SLAM Using a Rao-Blackwellised Particle Filter with Exhaustive Pose Space Search

Masahiro Tomono

Abstract: This paper presents a method of 3-D SLAM using a single camera. We utilize a Rao-Blackwellised particle filter (RBPF) to deal with a large number of landmarks. A difficulty in monocular SLAM is robustness to outliers and noise, which may cause false estimates, especially under short-baseline conditions. We propose an exhaustive pose-space search that finds all the plausible hypotheses efficiently using epipolar geometry. The obtained pose hypotheses are refined by the RBPF. Simulations and experiments show that the proposed method successfully performed 3-D SLAM with a small number of particles.

Index Terms: Object modeling, 3-D maps, 3-D reconstruction, Structure from motion, Dense reconstruction

I. INTRODUCTION

3-D Simultaneous Localization and Mapping (SLAM) is a challenge in mobile robotics. 6-DOF localization in a 3-D map is crucial for a robot to navigate in a complex environment and to perform a complicated task such as object carrying. Vision-based SLAM is a promising approach to this problem. Monocular SLAM is especially attractive because its hardware configuration is simple. Furthermore, monocular SLAM can reconstruct distant objects in large environments since its baseline distance is variable. We consider in this paper a system which utilizes a single camera only. Motion sensors such as a gyro are not necessary but can be used to enhance accuracy and efficiency.

Monocular SLAM estimates motions and landmark locations in 3-D space using features extracted from images captured by a moving camera. Since a single image has no depth information, the system must reconstruct the depth of each feature from two or more images while simultaneously reconstructing the motion. This is a well-known problem referred to as Structure-from-Motion (SFM) in the computer vision community. This problem is especially crucial at the initialization phase, where the system has no 3-D reference points (landmarks) yet.
In SFM, the stability of the system heavily depends on outliers and noise in the feature positions in the images. Even small noise will affect the estimates significantly when the motion is small. Thus, robustness against outliers and noise is crucial. This paper presents a monocular SLAM scheme focusing on this problem.

To increase robustness, the system searches all the motion hypotheses exhaustively. If the extracted features are noisy, many hypotheses can be generated. When the motion is small, it is difficult to determine which hypothesis is correct. Thus, we find all the plausible hypotheses. A key point is an efficient search obtained by reducing the search space dimension from 5-D to 3-D using epipolar geometry. Another key point is that we employ a multiple hypothesis tracking scheme, in which the system tracks all the plausible hypotheses using the Rao-Blackwellised particle filter (RBPF) [13], [11]. The RBPF filters out false hypotheses and finds the correct one based on successive measurements. The RBPF is also suitable for vision-based SLAM since it can handle a large number of landmarks.

M. Tomono is with the Department of System Robotics, Toyo University, Kawagoe, SA, Japan. tomono@eng.toyo.ac.jp

II. RELATED WORK

A monocular SLAM system was first developed by Davison [1]. His system employs the Extended Kalman Filter (EKF), with particle filters for landmark initialization. Eade et al. developed a monocular SLAM system using an RBPF for scalable SLAM [3]. Elinas et al. proposed σSLAM, which uses binocular stereo and an RBPF with SIFT features to build indoor maps robustly [4]. These systems do not use motion sensors.

Monocular SLAM is regarded as a kind of bearing-only SLAM. In bearing-only SLAM, the landmark location is estimated using an EKF with observations from two or more robot poses. When the distance between the robot poses is short, the Gaussianity of the obtained estimate is too poor to employ an EKF. Several approaches to this problem have been proposed, including a multiple hypothesis filter [9], federated information sharing [16], and inverse depth schemes [3], [12].
Monocular SLAM is also related to Structure-from-Motion (SFM), which has been studied in the computer vision community. SFM reconstructs camera motions and object shapes simultaneously based on epipolar geometry with an optimization scheme [8]. A number of methods have been developed, including the eight-point method [7], the factorization method [18], the trifocal tensor [6], and bundle adjustment. Nistér developed visual odometry based on the SFM scheme [14]. Most of these systems assume that feature correspondences are given by a feature tracker, and employ a robust estimation technique such as RANSAC [5] in order to eliminate outliers.

SFM has the same structure as bearing-only SLAM. A difference between them is that bearing-only SLAM is an estimation problem with a motion model, for which motion sensors such as odometry and a gyro are used in many cases. On the other hand, most SFM systems have no motion models. Our method is based on the SLAM scheme with a motion model, which predicts the motion using monocular images, not motion sensors.

1-4244-0602-1/07/$20.00 ©2007 IEEE.

Some SFM systems need no feature correspondences. Dellaert et al. proposed an SFM method without correspondences based on the Expectation-Maximization scheme [2]. Makadia et al. proposed an SFM method without correspondences using a Radon transform [10]. The latter is based on a kind of voting scheme, and our approach is conceptually similar. The difference is that our approach searches the pose space directly, reducing its dimension based on the fact that the translation is not independent of the rotation under epipolar geometry.

III. BASIC FRAMEWORK

A. SLAM using a Rao-Blackwellised Particle Filter

The SLAM considered here estimates the joint probability density $p(x_{1:t}, m \mid z_{1:t}, u_{1:t}, c_{1:t})$ of robot poses $x_{1:t}$ and map $m$ [17]. Here, $z_{1:t}$ is the features observed from time step 1 to $t$, and $u_{1:t}$ is a sequence of motion commands. The map $m$ is a set of landmarks $m_i$, and $c_{1:t}$ is the correspondences between landmarks and observed features. In this paper, as in other vision-based SLAM systems, a feature is a 2-D point extracted from a captured image, and a landmark is a 3-D point which corresponds to a feature. For simplicity, we equate robot pose with camera pose.

The RBPF-based SLAM factors $p(x_{1:t}, m \mid z_{1:t}, u_{1:t}, c_{1:t})$ as follows by exploiting the conditional independence between robot poses and landmark locations [13], [11].

$$p(x_{1:t}, m \mid z_{1:t}, u_{1:t}, c_{1:t}) = p(x_{1:t} \mid z_{1:t}, u_{1:t}, c_{1:t}) \prod_{i}^{n} p(m_i \mid x_{1:t}, z_{1:t}, u_{1:t}, c_{1:t}) \qquad (1)$$

The joint distribution is decomposed into low-dimensional probabilities, which are much more tractable than the original one. The probability density of robot poses $p(x_{1:t} \mid z_{1:t}, u_{1:t}, c_{1:t})$ is represented using a particle filter. The probability density of a landmark location $p(m_i \mid x_{1:t}, z_{1:t}, u_{1:t}, c_{1:t})$ is represented with a Gaussian distribution which can be computed using an EKF. In implementation, the $i$-th particle $\nu_t^i$ at time $t$ is represented in the following fashion.

$$\nu_t^i = \langle x_t^i, (\mu_{1,t}^i, \Sigma_{1,t}^i), \ldots, (\mu_{N,t}^i, \Sigma_{N,t}^i) \rangle$$

Here, $x_t^i$ is the robot pose estimate, and $\mu_{j,t}^i$ and $\Sigma_{j,t}^i$ are the $j$-th landmark location estimate and its covariance matrix.
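To make the particle representation concrete, the following is a minimal sketch of a container for one particle: a pose plus an independent Gaussian (mean and covariance) per landmark. The class and field names (`Particle`, `LandmarkEstimate`, `make_particles`), the 6-tuple pose layout, and the explicit `weight` field are illustrative assumptions and are not taken from the paper.

```python
from dataclasses import dataclass, field
from typing import Dict, Tuple

Vec3 = Tuple[float, float, float]
Mat3 = Tuple[Vec3, Vec3, Vec3]


@dataclass
class LandmarkEstimate:
    """Gaussian estimate (mu, Sigma) of one 3-D landmark."""
    mean: Vec3
    cov: Mat3


@dataclass
class Particle:
    """One particle: a pose hypothesis and its own landmark map."""
    # Assumed layout: (x, y, z, roll, pitch, yaw).
    pose: Tuple[float, ...]
    # Landmark id j -> (mu_j, Sigma_j); each particle owns its own dict.
    landmarks: Dict[int, LandmarkEstimate] = field(default_factory=dict)
    # Importance weight, stored here for convenience (an assumption).
    weight: float = 1.0


def make_particles(pose, n):
    """Create n particles at the same pose with uniform weights."""
    return [Particle(pose=pose, weight=1.0 / n) for _ in range(n)]
```

Because each particle carries its own landmark dictionary, updating a landmark in one particle leaves the maps of the other particles untouched, which mirrors the conditional independence used in the factorization.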
$p(x_{1:t} \mid z_{1:t}, u_{1:t}, c_{1:t})$ is estimated using a particle filter based on a motion model and a measurement model. Intuitively, the probability density of $x_{1:t+1}$ is predicted based on the motion model and the probability density of $x_{1:t}$, and then the importance weight of each particle is calculated using the likelihood of the observed features based on the measurement model. By resampling particles according to the importance weights, the probability density of $x_{1:t+1}$ is obtained. The details of this procedure in our system are presented in Section V.

B. Our Approach to Monocular SLAM

As mentioned in Section I, outliers in feature correspondences and/or noise in the feature positions cause false estimates, especially when the robot motion is small. This is crucial at the initialization phase, where the system has no 3-D landmarks yet. The essential point here is that, when there are outliers and/or noise, different subsets of features can generate different hypotheses of the motion. (Note that any subset of features would generate the same estimate in the absence of outliers and noise.) RANSAC is a useful scheme for finding a good hypothesis, but unfortunately the hypothesis having the best score is not necessarily the correct estimate. Fig. 1 shows examples of hypotheses generated by SFM with RANSAC. (b) is the correct estimate and (c) is the false one, but the score of (b) is smaller than that of (c).

To cope with this problem, our system searches all the plausible hypotheses over the pose space, and filters out false hypotheses using an RBPF in order to find the correct one. In this process, we utilize the fact that the translation is determined linearly based on epipolar geometry when a rotation angle is given. This enables us to search the pose (rotation and translation) space exhaustively only by traversing the rotation space. Furthermore, this implies that the robot pose has virtually 3 DOFs, for rotation only, and that the number of particles could be reduced. More concretely, we discretize the rotation space, and find the most plausible translation for each discretized rotation angle.
Given two point correspondences and a rotation angle, the translation is exactly determined up to scale based on epipolar geometry, as described later. To find the most plausible translation from a set of point correspondences, we employ a voting scheme. For each pair of feature correspondences, we calculate the translation and vote into the corresponding bin in the translation space. Then, the bin with the highest score is selected as the most plausible translation at the rotation angle. By repeating this process for all the discretized rotation angles, we have the score distribution over the rotation space. We then choose the rotation angles having a high score as good hypotheses. Outliers can be eliminated through the voting process.

Our approach can find all the feasible hypotheses over the pose space exhaustively. Since the RBPF can filter out false hypotheses efficiently, the key point is whether the obtained hypotheses include the true one or not. RANSAC can generate feasible hypotheses, but it is not realistic to examine all the hypotheses over the pose space exhaustively with it, since RANSAC searches hypotheses over the correspondence space. The exhaustiveness over the pose space is the major advantage of our approach.

Another advantage is that our approach is well suited to a motion sensor such as a gyro. Odometry is not applicable to robots that move with 6 DOFs in 3-D space. Although a gyro measures merely rotation angles, it will be sufficient for our scheme. The measurements from a gyro can narrow the search region in the rotation space significantly, which will increase the accuracy and efficiency of our approach.
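The voting step outlined in this subsection can be sketched as follows for a single candidate rotation. This is an illustrative sketch under stated assumptions, not the author's implementation: it relies on the relation detailed in Section IV-B (the translation is proportional to the cross product of two epipolar-plane normals $q_1 \times R q_2$), takes features as normalized homogeneous image points `(x, y, 1)`, assumes a roll-pitch-yaw order of `Rz(yaw) Ry(pitch) Rx(roll)`, uses a 5-degree bin size, and folds the translation onto the hemisphere `z >= 0` because its sign is not fixed by the epipolar constraint. All function names are hypothetical.

```python
import math
from collections import Counter
from itertools import combinations


def cross(a, b):
    """Cross product of two 3-vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def mat_vec(R, v):
    """Product of a 3x3 matrix (tuple of rows) and a 3-vector."""
    return tuple(sum(R[r][c] * v[c] for c in range(3)) for r in range(3))


def rotation_from_rpy(roll, pitch, yaw):
    """R = Rz(yaw) Ry(pitch) Rx(roll); this angle convention is an assumption."""
    cr, sr = math.cos(roll), math.sin(roll)
    cp, sp = math.cos(pitch), math.sin(pitch)
    cy, sy = math.cos(yaw), math.sin(yaw)
    return ((cy * cp, cy * sp * sr - sy * cr, cy * sp * cr + sy * sr),
            (sy * cp, sy * sp * sr + cy * cr, sy * sp * cr - cy * sr),
            (-sp, cp * sr, cp * cr))


def translation_bin(tau, bin_deg=5.0):
    """Map a translation direction to a (theta, phi) bin of the voting table."""
    norm = math.sqrt(sum(c * c for c in tau))
    x, y, z = (c / norm for c in tau)
    if z < 0:  # tau and -tau both satisfy the constraint: fold onto z >= 0
        x, y, z = -x, -y, -z
    theta = math.degrees(math.acos(max(-1.0, min(1.0, z))))
    phi = math.degrees(math.atan2(y, x)) + 180.0
    return (int(theta // bin_deg), int(phi // bin_deg))


def vote_translation(matches, R, bin_deg=5.0, min_norm=1e-9):
    """Return (best_bin, score) for rotation R.

    matches: list of (q1, q2) matched normalized image points (x, y, 1).
    """
    normals = [cross(q1, mat_vec(R, q2)) for q1, q2 in matches]
    votes = Counter()
    for n_i, n_j in combinations(normals, 2):
        tau = cross(n_i, n_j)
        if math.sqrt(sum(c * c for c in tau)) < min_norm:
            continue  # nearly parallel epipolar planes give no reliable vote
        votes[translation_bin(tau, bin_deg)] += 1
    return votes.most_common(1)[0] if votes else (None, 0)
```

Calling `vote_translation` once per discretized rotation and keeping the rotations whose score is high gives the score distribution over the rotation space; the cost per rotation is quadratic in the number of matches.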

Fig. 1. Examples of reconstruction hypotheses: (a) scene, (b) correct hypothesis (score = 40), (c) false hypothesis (score = 42).

Fig. 2. Epipolar geometry.

IV. MOTION ESTIMATION BY EXHAUSTIVE SEARCH

A. Scoring Function over Pose Space

Let $I_1$ and $I_2$ be images captured from a moving camera, and $Q_1$ and $Q_2$ be the feature sets extracted from $I_1$ and $I_2$ respectively. The problem considered here is to estimate the motion $r = \langle \psi, \tau \rangle$ from $I_1$ to $I_2$ given $Q_1$ and $Q_2$. Here, $\psi$ is the rotation angles (roll, pitch, yaw), and $\tau$ is a translation vector. Note that we assume the camera intrinsic parameters are known.

We propose a method that searches the pose space exhaustively. First, we define the scoring function $G(r)$ for motion $r$.

$$G(r) = \sum_{q_{1i} \in Q_1} \sum_{q_{2i} \in Q_2} g(q_{1i}, q_{2i}) \, D(r, q_{1i}, q_{2i}) \qquad (2)$$

$g(q_{1i}, q_{2i})$ is the matching score of image feature points $q_{1i}$ and $q_{2i}$. $D(r, q_{1i}, q_{2i})$ represents the score related to errors in the epipolar constraint, defined later. By calculating $G(r)$ for each $r$, we have a score distribution over the pose space. The poses having a high score in this distribution are regarded as good hypotheses. However, it is not realistic to search directly over all the points $r$ in the pose space, since the dimension of $r$ is essentially five (the scale cannot be obtained from images only). Makadia et al. proposed a method of reducing computational complexity using spherical harmonic analysis [10]. We propose a method of calculating Eq. (2) more directly in the next subsection.

In the general framework, $q_{1i}$ and $q_{2i}$ cover all the points in the images, and no explicit correspondences between them are necessary. In this paper, however, for simple implementation, we assume explicit one-to-one correspondences between $Q_1$ and $Q_2$ obtained using a feature tracker such as the KLT tracker [15]. Thus, $g(q_{1i}, q_{2i})$ is defined as follows. This restriction will be removed in the near future.

$$g(q_{1i}, q_{2i}) = \begin{cases} 1, & q_{1i} \text{ and } q_{2i} \text{ are matched} \\ 0, & \text{otherwise} \end{cases} \qquad (3)$$

B. Translation Estimation by Epipolar Geometry

Eq. (2) can be calculated efficiently by traversing the rotation space only.
The basic idea is to calculate the translation from two point correspondences using epipolar geometry, given a rotation angle. Let $q_{1i}$ and $q_{2i}$ be feature points in images $I_1$ and $I_2$ respectively, as shown in Fig. 2. It is assumed that $q_{1i}$ and $q_{2i}$ are matched by a feature tracker. Then, the well-known epipolar constraint holds as follows.

$$(q_{1i} \times R q_{2i})^T \tau = 0 \qquad (4)$$

Here, $R$ and $\tau$ are the rotation matrix and the translation vector of $r$ respectively. $q_{1i} \times R q_{2i}$ is the normal vector of the epipolar plane. We denote it by $n_i$. If the rotation matrix $R$ is constant, Eq. (4) is a linear equation with respect to $\tau$. Given two point correspondences, we can easily obtain $\tau$ by computing the cross product of the normal vectors $n_i$ and $n_j$ of the two epipolar planes which are determined by the point correspondences $(q_{1i}, q_{2i})$ and $(q_{1j}, q_{2j})$ $(i \neq j)$.

$$\tau = n_i \times n_j \qquad (5)$$

We assume $\|\tau\| = 1$ since the real scale cannot be obtained from images. We calculate $D(r, q_{1i}, q_{2i})$ in Eq. (2) as follows.

$$D(r, q_{1i}, q_{2i}) = \sum_{q_{1j} \in Q_1} \sum_{q_{2j} \in Q_2} g(q_{1j}, q_{2j}) \, D_0(r, q_{1i}, q_{2i}) \, D_0(r, q_{1j}, q_{2j})$$

$$D_0(r, q_1, q_2) = e^{-\alpha \left| (q_1 \times R q_2)^T \tau \right|^2} \qquad (6)$$

$D_0(r, q_1, q_2)$ represents the score related to errors in the epipolar constraint. $\alpha$ is a given constant.

C. Voting into Translation Space

We compute the scoring function $G(r)$ using a voting scheme.

(1) Discretization of the rotation space. We define a region which covers all the possible rotation angles between $I_1$ and $I_2$, and discretize the region. We denote a discretized angle by $\psi_n$. This region is expected to be small in the case of monocular SLAM, which is usually a sequential process.

(2) Discretization of the translation space. We create a voting table by discretizing the translation space. Since $\tau$ is a unit vector, $\tau$ is represented by two angles in a polar coordinate system.

(3) Estimation of translation for a rotation angle. Given a discretized rotation angle $\psi_n$, we calculate the translation vector $\tau$ using Eq. (5) for each pair of feature points in $Q_1 \times Q_2$.
In this paper, we approximate $D_0$ in Eq. (6) simply as a delta function, and vote into the bin corresponding to $\tau$ in the translation voting table. Then, we find the bin $\tau_m$ having the maximal score, and we define $G(\langle \psi_n, \tau_m \rangle)$ as that maximal score.

(4) Estimation of rotation angle. By repeating step (3) for all the discretized rotation angles, we have the score distribution over the rotation space. Note that this is an approximation of $G(r)$.

(5) Selection of pose hypotheses. We employ as a pose hypothesis each $r$ at which $G(r)$ exceeds a given threshold $th_1$.

The hypotheses obtained at step (5) have insufficient accuracy because of the discretization of the pose space. Thus, we refine each hypothesis using a non-linear optimization method that minimizes the reprojection errors, which is a well-known technique in computer vision.

The computational complexity of this procedure is $O(KN^2)$ when we assume the feature correspondence is one-to-one as in Eq. (3). $K$ is the number of discretized angles in the rotation space, and $N$ is the number of feature points.

D. Elimination of Outliers

The voting process eliminates outliers in the motion estimation. If feature correspondences include outliers, the votes calculated from the outliers will be distributed randomly over the translation space. Thus, outliers will not affect the score distribution as long as the outlier rate is not significantly large (see Section VI-A).

Once a pose $r$ is obtained, we can eliminate outliers with respect to $r$ using epipolar geometry. If $q_1$ and/or $q_2$ are outliers with respect to $r$, $\left| (q_1 \times R q_2)^T \tau \right|$ will be large. Thus, we eliminate the features which make this value larger than a given threshold.

V. SLAM FORMALIZATION

A. Motion Model

The motion model $p(x_t \mid x_{t-1}, u_t)$ is the probability density that the robot moves from $x_{t-1}$ to $x_t$ given motion command $u_t$. Without motion sensors, we define the motion model using a Gaussian mixture which consists of the pose hypotheses estimated by the above-mentioned method. Each pose hypothesis is represented by $N(x_t^i, \Sigma_{x_t^i})$. $x_t^i$ is calculated as $x_t^i = r_t^i + x_{t-1}$, where $r_t^i$ is the $i$-th pose hypothesis obtained by the voting scheme. The covariance is calculated as $\Sigma_{x_t^i} = (J_{x_t^i}^T \Sigma_z^{-1} J_{x_t^i})^{-1}$, where $J_{x_t^i}$ is the Jacobian of the perspective projection function $z_t = h(x_t, m_{c_t})$ with respect to the pose at $x_t^i$.
$\Sigma_z$ is the covariance of the feature noise.

If we have a motion sensor, we can reduce the number of possible hypotheses significantly. A motion model based on the velocity and acceleration estimated from the past trajectory is also useful for filtering out hypotheses. This is important from a practical point of view, but we do not discuss it in this paper.

B. Measurement Model

The measurement model $p(z_t \mid x_t, m_{c_t}, c_t)$ is the probability density that landmark $m_{c_t}$ is projected onto feature point $z_t$ when the camera pose is $x_t$. $c_t$ represents the correspondence between $m$ and $z_t$. We approximate this probability density with a Gaussian distribution. Based on the perspective projection model, the $j$-th feature point $z_{j,t}$ is a function of the camera pose $x_t$ and the corresponding landmark $m_j$, that is, $z_{j,t} = h(x_t, m_j)$. By linearizing this function using a Taylor expansion with respect to $m_j$, we have the following equation.

$$z_{j,t} = \hat{z}_{j,t} + J_{m_{j,t-1}} (m_j - \bar{m}_{j,t-1}) + v_j$$

Here, $\hat{z}_{j,t} = h(\hat{x}_t, \bar{m}_{j,t-1})$. $\hat{x}_t$ is the prediction of $x_t$ by the motion model. $J_{m_{j,t-1}}$ is the Jacobian of $h(x_t, m_j)$ with respect to $m_j$, evaluated at $\bar{m}_{j,t-1}$. $v_j$ is measurement noise in a 2-D feature point, which is represented by $N(0, R)$. Then, $z_{j,t}$ is represented as a Gaussian $N(\hat{z}_{j,t} + J_{m_{j,t-1}} (m_j - \bar{m}_{j,t-1}), R)$.

C. Importance Weight

We calculate the importance weight of each particle according to FastSLAM 1.0 [17]. The proposal distribution is as follows.

$$p(x_{1:t} \mid z_{1:t-1}, u_{1:t}, c_{1:t-1}) = p(x_t \mid x_{t-1}, u_t) \, p(x_{1:t-1} \mid z_{1:t-1}, u_{1:t-1}, c_{1:t-1})$$

The importance weight $w_t^i$ is calculated as follows.

$$w_t^i = \frac{\text{target distribution}}{\text{proposal distribution}} = \frac{p(x_{1:t}^i \mid z_{1:t}, u_{1:t}, c_{1:t})}{p(x_{1:t}^i \mid z_{1:t-1}, u_{1:t}, c_{1:t-1})} = \eta \int p(z_t \mid m, x_t^i, c_t) \, p(m \mid x_{1:t-1}^i, z_{1:t-1}, c_{1:t-1}) \, dm$$

This is a convolution of $N(\hat{z}_{j,t} + J_{m_{j,t-1}} (m_j - \bar{m}_{j,t-1}), R)$ and $N(\bar{m}_{j,t-1}, \Sigma_{m_{j,t-1}})$. We have the importance weight as follows.

$$w_t^i \propto \prod_j N(\hat{z}_{j,t}^i, \; R + J_{m_{j,t-1}}^T \Sigma_{m_{j,t-1}} J_{m_{j,t-1}}) \qquad (7)$$

D. Landmark Update

The probability density of a landmark location is updated as follows. In RBPF-SLAM, this is calculated using an EKF.
$$p(m_{c_t} \mid x_{1:t}, z_{1:t}, c_{1:t}) = \eta \, p(z_t \mid x_t, m_{c_t}, c_t) \, p(m_{c_t} \mid x_{1:t-1}, z_{1:t-1}, c_{1:t-1})$$

In this paper, however, we estimate landmark locations simply by triangulation from feature points on two images. When the baseline distance is short, the errors in the location of a landmark reconstructed from images would be too large to represent by a Gaussian distribution because of the non-linearity of perspective projection. Thus, we estimate the landmark location by triangulation at every frame, and select the most accurate estimate based on the covariance matrix of the estimated landmark location. This is the landmark initialization problem well known in monocular SLAM, and we will improve the process by employing an EKF with the inverse depth scheme [12] in the future.

The covariance of a landmark location is calculated as follows [8].

$$\Sigma_{m_{j,t}} = (J_{m_{j,t}}^T \Sigma_{z_{j,t}}^{-1} J_{m_{j,t}})^{-1}$$

$\Sigma_{m_{j,t}}$ is computed using SVD.
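As an illustration of the importance weight in Eq. (7) of Section V-C, the sketch below multiplies, over the observed features of one particle, a 2-D Gaussian density of the reprojection residual. It is a sketch under assumptions rather than the paper's implementation: `J` is taken to be the 2x3 Jacobian of the projection with respect to the landmark, so the innovation covariance is formed as `J Sigma J^T + R` (the placement of the transpose in Eq. (7) depends on how the Jacobian is laid out), the density is evaluated at the residual between the observed and the predicted feature, and the function names are hypothetical.

```python
import math


def innovation_cov(J, Sigma, R):
    """S = J Sigma J^T + R for a 2x3 Jacobian J, 3x3 Sigma and 2x2 R."""
    JS = [[sum(J[r][k] * Sigma[k][c] for k in range(3)) for c in range(3)]
          for r in range(2)]
    return [[sum(JS[r][k] * J[c][k] for k in range(3)) + R[r][c]
             for c in range(2)] for r in range(2)]


def log_gaussian_2d(residual, S):
    """Log density of a zero-mean 2-D Gaussian with covariance S."""
    det = S[0][0] * S[1][1] - S[0][1] * S[1][0]
    dx, dy = residual
    # Mahalanobis distance via the closed-form inverse of a 2x2 matrix.
    maha = (S[1][1] * dx * dx - (S[0][1] + S[1][0]) * dx * dy
            + S[0][0] * dy * dy) / det
    return -0.5 * maha - math.log(2.0 * math.pi) - 0.5 * math.log(det)


def importance_weight(observations):
    """Unnormalized weight of one particle.

    observations: iterable of (z, z_hat, J, Sigma, R), one per matched feature,
    where z is the observed 2-D feature and z_hat its prediction.
    """
    # Sum log densities to avoid underflow when many features are observed.
    log_w = sum(
        log_gaussian_2d((z[0] - z_hat[0], z[1] - z_hat[1]),
                        innovation_cov(J, Sigma, R))
        for z, z_hat, J, Sigma, R in observations)
    return math.exp(log_w)
```

The weights returned for all particles still have to be normalized to sum to one before resampling.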

E. Procedure

Our method is performed in the following procedure.

(a) Initialization ($t = 1$ to $k$)

Since there are no landmarks at the initialization step, the system estimates the motion and landmarks simultaneously from images only, without motion sensors. To ensure a sufficient baseline distance, we use $k$ images. Currently, $k$ is given manually.

(1) Camera pose estimation. We compute the score distribution from images $I_1$ and $I_k$ using the method in Section IV, and create particles for the hypotheses having a high score.

(2) Landmark initialization. For each particle, we eliminate outliers and reconstruct landmarks by triangulation using $I_1$ and $I_k$.

(b) Sequential reconstruction ($t > k$)

(1) Camera pose prediction. We compute the score distribution using the method in Section IV, and select the hypotheses having a high score. For each hypothesis, we eliminate outliers and estimate the camera pose. Then, we create new particles by pairing each hypothesis at time $t$ with each particle at time $t-1$. The number of particles increases in this process.

(2) Importance weight and resampling. We calculate the importance weight of each particle based on Eq. (7), and resample particles according to the normalized importance weights. The number of particles is reduced to the original one.

(3) Landmark update. For each resampled particle, we eliminate outliers and reconstruct landmarks by triangulation. If the landmark is new, we just reconstruct it from the first two images in which the landmark appears. If the landmark is already registered, we update it when the covariance of the new reconstruction is smaller than the old one.

The real scale cannot be obtained from images only. The scale of the generated 3-D map is proportional to $\|\tau\|$ obtained at the initialization step. Note that we assume $\|\tau\| = 1$ as mentioned above. At the sequential reconstruction step, we estimate the scale factor using the 3-D map built so far. This is performed by minimizing the reprojection errors of the landmarks in the 3-D map onto the images using a nonlinear optimization method.
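Steps (b)(1) and (b)(2) above, in which the particle set first grows by pairing particles with pose hypotheses and is then resampled back to its original size, can be sketched as follows. The sketch makes several assumptions that the paper does not specify: particles are plain dictionaries with a `"pose"` tuple, poses are composed additively following the notation $x_t = r_t + x_{t-1}$ of Section V-A, and resampling is plain multinomial sampling with replacement via `random.choices`. The function names are hypothetical.

```python
import copy
import random


def expand_particles(particles, hypotheses):
    """Pair every particle at time t-1 with every pose hypothesis at time t."""
    expanded = []
    for particle in particles:
        for r in hypotheses:
            child = copy.deepcopy(particle)
            # Additive composition x_t = r_t + x_{t-1} (assumed pose layout).
            child["pose"] = tuple(a + b for a, b in zip(r, particle["pose"]))
            expanded.append(child)
    return expanded


def resample(particles, weights, n_keep, rng=random):
    """Draw n_keep particles with probability proportional to their weights."""
    total = sum(weights)
    if total <= 0.0:
        raise ValueError("all importance weights are zero")
    normalized = [w / total for w in weights]
    picks = rng.choices(range(len(particles)), weights=normalized, k=n_keep)
    # Deep copies keep the landmark maps of duplicated particles independent.
    return [copy.deepcopy(particles[i]) for i in picks]
```

With `P` particles and `H` hypotheses the expanded set has `P * H` members, and calling `resample(..., n_keep=P)` brings the count back to `P`, as described in step (b)(2).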
VI. EXPERIMENTS

A. Simulation

We carried out a simulation to evaluate the performance of our method by comparison with a RANSAC-based method. Fig. 3 shows the success rates of pose estimation by the two methods. In this simulation, 50 landmarks are randomly generated in 3-D space and are projected onto two images at different camera poses. Varying the feature noise level σ (Gaussian) and the outlier rate, the relative pose is reconstructed from the two images. The 8-point method [7] is used for reconstruction in the RANSAC-based method. The number of samples in RANSAC is 1000.

Fig. 3. Success rate of pose estimation [%] versus outlier rate [%] for our method, RANSAC (best), and RANSAC (all): (a) feature noise σ = 0 [pixel], (b) feature noise σ = 0.5 [pixel].

In this simulation, we judged that a hypothesis passed if its error in each rotation angle was within 1.0 [deg]. For our method, success means that at least one of the hypotheses selected at step (5) in Section IV-C passed. The threshold $th_1$ was set to 70% of the maximal votes. For the RANSAC-based method, we employed two criteria. One is that it is successful when at least one of the 1000 samples passed. The other is that it is successful when the sample having the best score passed.

Theoretically, in the case of using the 8-point method, 1177 samples will provide a 99% success rate at a 50% outlier rate [8] when σ = 0. Fig. 3 (a) supports this. Fig. 3 (b) shows that the success rate of the RANSAC-based method is degraded more than that of our method when feature noise of σ = 0.5 [pixel] is added. From this result, we found that our method outperforms RANSAC in finding good hypotheses. We also found that the best hypothesis can be false. Multiple hypothesis tracking by the RBPF addresses this problem.

Fig. 4 shows the simulations of monocular SLAM by our method. 50 landmarks are randomly generated in 3-D space, and the camera moves along predefined trajectories: a circle with a radius of 700 [cm] and a straight line of 1600 [cm]. Feature noise of σ = 0.5 [pixel] is added to each feature on the images, and the outlier rate is 20% in each image. The number of particles in the RBPF is 20.
Fig. 4 shows the trajectory of the best of the 20 particles. In (a), the standard deviation of the poses in the best particle is $\sigma_x = 7.4$ [cm], $\sigma_y = 55.0$ [cm], $\sigma_z = 95.2$ [cm], $\sigma_{roll} = 0.36$ [deg], $\sigma_{pitch} = 0.40$ [deg], $\sigma_{yaw} = 0.27$ [deg]. In (b), the standard deviation is $\sigma_x = 12.0$ [cm], $\sigma_y = 8.2$ [cm], $\sigma_z = 21.9$ [cm], $\sigma_{roll} = 0.37$ [deg], $\sigma_{pitch} = 1.02$ [deg], $\sigma_{yaw} = 0.08$ [deg].

B. Experiments in Real Environments

We conducted experiments in indoor and outdoor environments. Images were captured by a person holding a digital camera. The image size was 320 by 240 pixels. The number of particles was 20. The correspondences between feature points were obtained using the KLT tracker [15]. The number of feature points in an image was 50. The experiments were done off-line. The maps were reconstructed using key frames, each of which was extracted every $n$ frames ($n$ = 3 to 8). $n$ was set manually and was constant within one experiment.
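The RANSAC sample count quoted in Section VI-A (1177 samples for a 99% success rate at a 50% outlier rate with the 8-point method) is consistent with the standard formula $N = \log(1 - p) / \log(1 - (1 - \epsilon)^s)$, where $p$ is the desired success probability, $\epsilon$ the outlier rate, and $s$ the number of correspondences per sample. The short check below evaluates it; the function name and the rounding up to the next integer are illustrative choices.

```python
import math


def ransac_samples(success_prob, outlier_rate, sample_size):
    """Number of random samples needed so that, with probability success_prob,
    at least one sample consists only of inliers."""
    # Probability that a single sample of sample_size points is outlier-free.
    all_inliers = (1.0 - outlier_rate) ** sample_size
    return math.ceil(math.log(1.0 - success_prob) / math.log(1.0 - all_inliers))


# 8-point method, 50% outliers, 99% success: evaluates to 1177.
print(ransac_samples(0.99, 0.5, 8))
```

The count grows quickly with the sample size because the chance of drawing eight inliers at once is only $0.5^8 \approx 0.4\%$ at a 50% outlier rate.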

Fig. 4. Simulation results: (a) circular trajectory, (b) straight trajectory (left: step 2, right: step 9).

Fig. 5. Result of an outdoor experiment: (a) snapshot of the environment, (b) resulting trajectory, (c) covariance of the landmarks.

Fig. 6. Result of another outdoor experiment.

Fig. 7. Result of an indoor experiment.

Fig. 5 shows the result of an outdoor experiment. The camera moved about 15 [m] and captured 30 images. The total number of landmarks is 173. Although outdoor environments have both near and far landmarks, they were reconstructed well, as shown in (b). The covariance of each landmark is shown in (c). In this experiment, many hypotheses were generated at the initialization step. It is difficult to find which one is correct from a small number of measurements. The RBPF filtered out false hypotheses to find the correct one based on the measurements obtained over time.

Fig. 6 shows the result of another outdoor experiment. The camera moved about 80 [m] and captured 180 images. The total number of landmarks is 368. There are many landmarks at distant locations.

Fig. 7 shows the result of an indoor experiment. The camera moved in a 10 [m] × 10 [m] room and captured 180 images. The total number of landmarks is 773. This is a good example of 6-DOF motion in 3-D space.

In these experiments, the rotation space was discretized from -10 [deg] to 10 [deg] at 1 [deg] intervals for each angle. The computation time is currently 3 to 10 seconds per key frame. The computation time will be reduced by program customization and parallel processing.

VII. CONCLUSIONS

This paper has presented a monocular SLAM scheme using a Rao-Blackwellised particle filter. Our contribution is an exhaustive pose space search, in which all the plausible hypotheses are found efficiently using epipolar geometry and a voting scheme. By tracking and refining multiple hypotheses using the RBPF, 3-D SLAM is performed robustly. Future work includes error analysis and a more efficient implementation of the system.

REFERENCES

[1] A. J. Davison: Real-time simultaneous localization and mapping with a single camera, Proc. of CVPR 03, 2003.
[2] F. Dellaert, S. Seitz, C. Thorpe, and S. Thrun: Structure from motion without correspondences, Proc. of CVPR2000, 2000.
[3] E. Eade and T. Drummond: Scalable Monocular SLAM, Proc. of CVPR 06, 2006.
[4] P. Elinas, R. Sim, and J. J. Little: σSLAM: Stereo Vision SLAM Using the Rao-Blackwellised Particle Filter and a Novel Mixture Proposal Distribution, Proc. of ICRA2006, pp. 1564-1570, 2006.
[5] M. Fischler and R. Bolles: Random Sample Consensus: a Paradigm for Model Fitting with Application to Image Analysis and Automated Cartography, Communications ACM, 24:381-395, 1981.
[6] A. W. Fitzgibbon and A. Zisserman: Automatic Camera Recovery for Closed or Open Image Sequences, Proc. of ECCV 98, 1998.
[7] R. Hartley: In defense of the eight-point algorithm, IEEE Trans. PAMI, Vol. 19, No. 6, pp. 580-593, 1997.
[8] R. Hartley and A. Zisserman: Multiple View Geometry in Computer Vision, Cambridge University Press, 2000.
[9] N. M. Kwok and G. Dissanayake: An Efficient Multiple Hypothesis Filter for Bearing-Only SLAM, Proc. of IROS2004, 2004.
[10] A. Makadia, C. Geyer, and K. Daniilidis: Radon-based Structure from Motion Without Correspondences, Proc. of CVPR 05, 2005.
[11] M. Montemerlo, S. Thrun, D. Koller, and B. Wegbreit: FastSLAM: A Factored Solution to the Simultaneous Localization and Mapping Problem, Proc. of AAAI2002, 2002.
[12] J. M. M. Montiel, J. Civera, and A. J. Davison: Unified Inverse Depth Parametrization for Monocular SLAM, Proc. of RSS2006, 2006.
[13] K. Murphy and S. Russell: Rao-Blackwellised Particle Filtering for Dynamic Bayesian Networks, in A. Doucet ed.: Sequential Monte Carlo Methods in Practice, Springer, 2001.
[14] D. Nistér, O. Naroditsky, and J. Bergen: Visual Odometry, Proc. of CVPR 04, 2004.
[15] J. Shi and C. Tomasi: Good Features to Track, Proc. of CVPR 94, pp. 593-600, 1994.
[16] J. Sola, A. Monin, M. Devy, and T. Lemaire: Undelayed Initialization in Bearing Only SLAM, Proc. of IROS2005, pp. 2751-2756, 2005.
[17] S. Thrun, W. Burgard, and D. Fox: Probabilistic Robotics, the MIT Press, 2005.
[18] C. Tomasi and T. Kanade: Shape and Motion from Image Streams under Orthography: A Factorization Approach, Int. J. of Computer Vision, 9(2):137-154.