Efficient Optimization of Information-Theoretic Exploration in SLAM


Proceedings of the Twenty-Third AAAI Conference on Artificial Intelligence (2008)

Thomas Kollar and Nicholas Roy
Computer Science and Artificial Intelligence Laboratory
Massachusetts Institute of Technology
The Stata Center, 32 Vassar St., Cambridge, MA
{kollar

Abstract

We present a novel method for information-theoretic exploration, leveraging recent work on mapping and localization. We describe exploration as the constrained optimization problem of computing a trajectory to minimize posterior map error, subject to the constraints of traveling through a set of sensing locations to ensure map coverage. This trajectory is found by reducing the map to a skeleton graph and searching for a minimum entropy tour through the graph. We describe how a specific factorization of the map covariance allows the reuse of EKF updates during the optimization, giving an efficient gradient ascent search for the maximum information gain tour through sensing locations, where each tour naturally incorporates revisiting well-known map regions. By generating incrementally larger tours as the exploration finds new regions of the environment, we demonstrate that our approach can perform autonomous exploration with improved accuracy.

Introduction

Computing the best trajectories for a mobile robot exploring an unknown environment typically requires satisfying the two competing objectives of completeness and accuracy. Firstly, the exploration trajectory must choose measurements that cover the environment, ensuring that the robot has collected measurements of the entirety of the environment in order to build a complete map. Secondly, the exploration trajectory must choose measurements that allow the SLAM algorithm to correctly infer relationships between different parts of the environment. Generating the most accurate map requires maximizing the mutual information between different environmental features and minimizing the loss of information due to the robot motion.
Different pieces of this problem have been examined in previous work (Yamauchi 1997; Feder, Leonard, & Smith 1999; Makarenko et al. 2002; Stachniss & Burgard 2003; Sim & Roy 2005; Stachniss 2006), but a consistent optimization process for addressing these two problems remains an open problem. In particular, there has been little work on performing multistep destination selection, primarily due to the fact that computing even a standard minimum-distance trajectory is NP-hard. The problem is further complicated by the fact that the objective functions of information gain and expected error do not obey the triangle inequality, preventing the use of many common optimization techniques.

In this paper, we describe an algorithm for computing an exploration trajectory of an unknown environment by combining a global optimization of a tour of environment waypoints with a local controller that optimizes the trajectory through the selected waypoints. The primary contribution of our paper is a local-global optimization algorithm that uses hill climbing to optimize a coverage tour through a partially mapped environment and a learned controller that predicts the trajectory between waypoints that maximizes the information gain. Our second contribution is to show that the hill climbing can be performed efficiently using an alternate form of standard SLAM process-measurement updates.

SLAM with the EKF

It is beyond the scope of this paper to summarize the state of the art in robotic mapping; our overall approach to exploration is not specific to the map representation or inference algorithm, although we are able to achieve substantial gains in efficiency by assuming Gaussian map and robot posteriors. Without loss of generality, we will restrict our discussion to Extended Kalman filter SLAM. Let us denote the robot position at time $t$ as $x_t = (x_t, y_t, \theta_t)$, and each map feature or landmark as $m_i = (x, y)$. The complete map of $n$ landmarks is described as $m = \{m_1, \ldots, m_n\}$.

Copyright © 2008, Association for the Advancement of Artificial Intelligence. All rights reserved.
When the robot executes a control action $u_t$, its next position is given by a process model $x_{t+1} = g(x_t, u_t, w_t)$, where $w_t$ is some unknown random noise that is added to the effect of the control. After each motion, the robot receives observations according to its current position and the landmarks. The observation of landmark $i$ is given by a measurement model $z_t^i = h(x_t, m_i, v_t)$, where $v_t$ is some unknown random noise that is added to the received measurement. Typically a set of observations $z_t$ is received at time $t$. The goal of a SLAM algorithm is to infer the robot pose and the map based on knowledge of the control actions and observations. By filtering the actions and observations, a joint distribution over the robot pose and the map can be maintained, specifically the distribution $p(x_t, m \mid u_{1:t}, z_{1:t})$. If we assume this distribution is Gaussian, then the posterior is parameterized by the mean $\mu_t$ and covariance $\Sigma_t$ statistics. If the process and measurement models are linear functions and the unobservable noise terms are Gaussian, then the mean of the posterior distribution computed

by the Kalman filter can be shown to minimize the least-squares error. When the models are non-linear, the extended Kalman filter can be used to repeatedly linearize the process and measurement models. The Jacobians of the process and measurement models $g$ and $h$ used to linearize these models are $G_t$ and $H_t$ respectively, and $R_t$ and $Q_t$ are the covariances of the Gaussian noise distributions for the process and measurement updates. The complete Extended Kalman Filter algorithm is shown in Algorithm 1.

Algorithm 1 The Extended Kalman Filter
Require: Prior mean and covariance $(\mu_{t-1}, \Sigma_{t-1})$, control $u_t$ and observation $z_t$.
1: Compute process mean, $\bar{\mu}_t = g(\mu_{t-1}, u_t, 0)$.
2: Compute process Jacobian $G_t$ at $\mu_{t-1}$.
3: Compute process covariance, $\bar{\Sigma}_t = G_t \Sigma_{t-1} G_t^T + R_t$.
4: Compute measurement Jacobian $H_t$ (linearized about the predicted mean $\bar{\mu}_t$).
5: Compute Kalman gain, $K_t = \bar{\Sigma}_t H_t^T \left( H_t \bar{\Sigma}_t H_t^T + Q_t \right)^{-1}$.
6: Compute new mean, $\mu_t = \bar{\mu}_t + K_t \left( z_t - h(\bar{\mu}_t, 0) \right)$.
7: Compute new covariance, $\Sigma_t = (I - K_t H_t) \bar{\Sigma}_t$.

Exploration

Given sufficient data, the mean estimate of the map provided by the EKF will usually converge to the true map¹. Unfortunately, few if any guarantees are given for the rate at which the map converges; the quality of the map is highly dependent on the robot trajectory as well as a variety of other features such as sensor model, feature distinctiveness for data association, etc. The goal in this paper is to choose trajectories that lead to sensor data that results in the best map, maximizing both the map coverage and the map accuracy. There are different ways to reconcile these objectives, such as optimizing the robot trajectory with respect to an overall objective that is a weighted sum of individual terms (Makarenko et al. 2002), but such approaches raise additional questions such as how to choose the individual weights. Our approach is to define the coverage problem as choosing the best sensing locations that maximize the likelihood the complete environment will be observed.
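The seven steps of Algorithm 1 can be sketched for a scalar state, where the Jacobians reduce to ordinary derivatives. This is a minimal illustration rather than the authors' implementation: the motion model, the squared-position measurement, and all numeric values are assumptions chosen only to make the example runnable, and the mean update uses the standard innovation (observation minus predicted observation).

```python
def ekf_step(mu_prev, sigma_prev, u, z, g, dg_dx, h, dh_dx, R, Q):
    """One scalar EKF update, numbered to match the steps of Algorithm 1."""
    mu_bar = g(mu_prev, u)                         # 1: process mean
    G = dg_dx(mu_prev, u)                          # 2: process Jacobian
    sigma_bar = G * sigma_prev * G + R             # 3: process covariance
    H = dh_dx(mu_bar)                              # 4: measurement Jacobian
    K = sigma_bar * H / (H * sigma_bar * H + Q)    # 5: Kalman gain
    mu = mu_bar + K * (z - h(mu_bar))              # 6: new mean
    sigma = (1.0 - K * H) * sigma_bar              # 7: new covariance
    return mu, sigma


# Hypothetical models (assumptions): additive motion, squared-position sensor.
def g(x, u):
    return x + u


def dg_dx(x, u):
    return 1.0


def h(x):
    return x * x


def dh_dx(x):
    return 2.0 * x


mu, sigma = ekf_step(mu_prev=1.0, sigma_prev=0.5, u=1.0, z=4.2,
                     g=g, dg_dx=dg_dx, h=h, dh_dx=dh_dx, R=0.1, Q=0.2)
```

In this toy case the predicted variance is 0.6 and the informative measurement shrinks it by well over an order of magnitude, which is the effect the exploration planner tries to exploit when it routes the robot past known landmarks.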
If these sensing locations serve as constraints on a trajectory planner, and we compute a trajectory that maximizes the map accuracy subject to these constraints, then we have a principled way to optimize both coverage and accuracy. The trajectory constraints based on coverage can be calculated first, and then the constrained trajectory can be calculated to maximize map accuracy.

¹ Although convergence guarantees only hold for the standard Kalman filter, not the extended form.

Map Coverage

Let us first consider the problem of coverage. We define a visibility function $\delta(m_i, z_{0:T})$,

$$\delta(m_i, z_{0:T}) = \begin{cases} 1 & \text{if map feature } m_i \text{ appears at least once in the observations } z_{0:T}, \\ 0 & \text{otherwise.} \end{cases} \tag{1}$$

The goal of a coverage planner is to search for the trajectory that maximizes the number of sensed features,

$$\arg\max_{z_{0:T}} \sum_{i=1}^{|m|} \delta(m_i, z_{0:T}). \tag{2}$$

We cannot control or predict a priori what observations $z$ will be received. We can control the robot poses $x_{0:T}$, but the observations are generated stochastically according to the map and the robot pose, $p(z_{0:T} \mid x_{0:T}, m)$. We do not know the map $m$, so equation (2) must be computed in expectation over the map prior distribution $p(m)$ and observation likelihoods, such that

$$\arg\max_{x_{0:T}} E_{p(m)} E_{p(z_{0:T} \mid x_{0:T}, m)} \left[ \sum_{i=1}^{|m|} \delta(m_i, z_{0:T}) \right]. \tag{3}$$

The distribution $p(m)$ is actually conditioned on the measurements up to the current time, which have been left out for brevity. Computing the double expectation exactly over both map prior and observations is computationally infeasible, but the distribution over maps has only limited impact on the optimization. At the beginning of the mapping process, the map prior will be very uncertain, which will cause all trajectories to have approximately the same expected performance, as all maps are equally likely. As more observations are made, the map distribution will begin to converge to the true map, and the mean map estimate $\hat{m}$ will become an increasingly good approximation to the complete distribution. When the map prior is informative, Monte Carlo methods may be used to estimate the outer expectation directly.
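The visibility function of equation (1) and a Monte Carlo estimate of the double expectation in equation (3) can be sketched as follows. The range-limited sensor with a fixed detection probability, the parameter values, and the function names are all assumptions made for illustration; the paper does not specify a sensor model at this point.

```python
import math
import random


def delta(feature_id, observations):
    """Equation (1): 1 if the feature appears in any observation set, else 0."""
    return 1 if any(feature_id in z for z in observations) else 0


def sample_observations(poses, landmarks, sensor_range, p_detect, rng):
    """Hypothetical sensor: an in-range landmark is detected with prob. p_detect."""
    observations = []
    for (px, py) in poses:
        seen = set()
        for i, (lx, ly) in enumerate(landmarks):
            in_range = math.hypot(lx - px, ly - py) <= sensor_range
            if in_range and rng.random() < p_detect:
                seen.add(i)
        observations.append(seen)
    return observations


def expected_coverage(poses, map_samples, sensor_range=2.0, p_detect=0.8,
                      n_rollouts=200, seed=0):
    """Monte Carlo estimate of the objective inside equation (3)."""
    rng = random.Random(seed)
    total = 0.0
    for landmarks in map_samples:          # outer expectation over p(m)
        for _ in range(n_rollouts):        # inner expectation over p(z | x, m)
            z = sample_observations(poses, landmarks, sensor_range, p_detect, rng)
            total += sum(delta(i, z) for i in range(len(landmarks)))
    return total / (len(map_samples) * n_rollouts)


map_samples = [[(0.0, 0.0), (5.0, 0.0)]]                          # one sampled map
near = expected_coverage([(0.0, 1.0), (5.0, 1.0)], map_samples)   # visits both landmarks
far = expected_coverage([(0.0, 1.0), (0.0, 1.5)], map_samples)    # stays near one
```

Passing a single map sample, as in the usage lines, corresponds to replacing the map prior by one point estimate; passing several samples averages over the prior at proportionally higher cost.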
Thus, we can approximate equation (3) as:

$$\arg\max_{x_{0:T}} E_{p(z_{0:T} \mid x_{0:T}, \hat{m})} \left[ \sum_{i=1}^{|m|} \delta(m_i, z_{0:T}) \right]. \tag{4}$$

While equation (4) is an approximation to our true coverage objective, in practice a robot coverage algorithm that chooses a set of destination points to maximize the likelihood of completing the map based on the current estimate, and then follows a tour through these points, will have performance comparable to any other sequential decision making algorithm.

We can gain additional computational tractability in solving equation (4) by considering only a subset of possible locations of the robot, that is, a retraction of the map free space. Map features are constrained (by definition) to lie on the boundary of free space, so our retraction must be a subset of map locations that guarantees that every boundary point can be observed from some point in this subset. This property of boundary visibility is known as reliability (Cornea, Silver, & Min 2005); using the medial-axis


transform (Blum 1967) to reduce the free-space to a minimal skeleton allows us to approximate the map with a minimal set of points while guaranteeing reliability necessary for equation (4). Efficient algorithms for computing a medial axis representation that guarantee reliability generally lead to near-optimal reductions in terms of minimizing the resulting skeleton; computing the true minimum medial axis representation is NP-hard (Coeurjolly, Hulin, & Sivignon 2007).

In order to select a set of sensing locations that provide a guarantee of coverage, we retract the map to its medial axis. We then further reduce the skeleton to a graph by converting the resulting map skeleton to a piece-wise linear representation, retaining only the end-points of the resulting lines. We can then use graph search algorithms to identify a tour of the skeleton that maximizes the predicted map accuracy. By generating a graph with guaranteed boundary visibility and constraining our exploration trajectory to traverse the graph nodes, the exploration trajectory is constrained to have full map coverage. The complete skeletonization algorithm is given by Algorithm 2.

Figure 1: The map skeletonization process. Panels: (a) Original, (b) Median Filtered, (c) Skeleton, (d) Sensing Waypoints. (a) The binarized map. (b) The smoothed map using median filtering to eliminate high-frequency noise. (c) The piece-wise linear graph of the medial-axis skeleton. (d) The graph nodes that constitute sensing location constraints for the trajectory optimization.

Algorithm 2 Map Skeletonization
Require: Map m.
1: Median filter the binarized map.
2: Skeletonize the map using the medial-axis transform.
3: Reduce the skeleton to a piece-wise linear graph.
4: Return the end-points of lines (junctions and graph leaves).

Although the map free space is given by an occupancy grid, this grid is binarized before skeletonization. Note also that high frequency objects in the map such as discretization artifacts or small mapping errors can create an overabundance of end-points and junction points.
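Step 1 of Algorithm 2 can be illustrated with a small median filter over a binarized grid. This is only a sketch: the encoding (1 for free space, 0 for occupied), the 3x3 window, and the clamped border handling are assumptions, and the medial-axis transform of step 2 is not shown.

```python
def median_filter_binary(grid, radius=1):
    """Median filter a binary grid (assumed 1 = free, 0 = occupied).

    Cells outside the grid are replaced by the nearest border cell.
    """
    rows, cols = len(grid), len(grid[0])
    out = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            window = []
            for dr in range(-radius, radius + 1):
                for dc in range(-radius, radius + 1):
                    rr = min(max(r + dr, 0), rows - 1)
                    cc = min(max(c + dc, 0), cols - 1)
                    window.append(grid[rr][cc])
            window.sort()
            out[r][c] = window[len(window) // 2]   # middle element of an odd window
    return out


# A free room with one spurious occupied cell, e.g. a discretization artifact.
noisy = [[1] * 5 for _ in range(5)]
noisy[2][2] = 0
smoothed = median_filter_binary(noisy)
```

The isolated cell disappears after filtering, so it can no longer spawn an extra branch, end-point, or junction in the skeleton.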
To reduce the influence of high-frequency areas on the end-points and junction points, we first median filter the image, as shown in figure 1(b).

Map Accuracy

Given a graph through the environment that maximizes the expected coverage of map features, we wish to compute a tour through these points, that is, an ordering on the graph points, to maximize the resulting map accuracy. We cannot measure the map accuracy without ground truth, but the covariance of the posterior distribution provides an estimate of the expected squared error. Minimizing the determinant of the covariance of the posterior Gaussian is equivalent to minimizing the differential entropy described by Darbellay & Vajda (2000) as

$$H(p(x)) = \frac{1}{2} \ln\left[ (2\pi e)^n |\Sigma| \right].$$

As a result, computing a tour that minimizes the posterior map entropy is equivalent to an information-theoretic traveling salesman problem where the objective function is information maximization rather than distance minimization. Unfortunately, as with most traveling salesman problems, this optimization is again NP-hard (Garey & Johnson 1979). Furthermore, the marginal gains for computing the globally optimal trajectory are likely to be low. On the other hand, local optimization will allow the planner to take advantage of some information gathering actions, such as revisiting known locations, that a greedy strategy will not always use.

As a result, we use local optimization similar to the gradient-ascent heuristics used for solving shortest-path TSP problems, such as the edge-swapping heuristic. A basic tour is generated from a simple greedy nearest-neighbor heuristic. At each iteration two edges are swapped and the information gain is calculated. If the posterior map entropy is improved, then this edge move is retained. Edge-swaps are continued until no more edges are available to reduce the tour cost, and the best (lowest predicted posterior entropy) tour is used. Note that for the conventional shortest-path TSP, this approach is referred to as the 2-opt heuristic and is known to be better than an O(log(n)) approximation to the true distance-optimal solution. This performance guarantee does not hold

for the exploration problem because information gain violates the triangle inequality property (Rosenkrantz, Stearns, & Lewis 1977), but performs well in practice.

Figure 2: Optimized tours. Panels: (a) Shortest Path Optimization, (b) Information-Theoretic Optimization. (a) For the graph in figure 1, the ordering of points that minimizes the total path length (H(p(m)) = 6.93). (b) For the same graph, the ordering of points found by minimizing the posterior map entropy (H(p(m)) = 7.23). (Note that the red lines only show connectivity for clarity of presentation. The path lengths are calculated through free space, but the paths tend to overlap and obscure the resulting connectivity.)

Figure 3: Comparison of a path executed by a conventional shortest-path controller that uses point turns (a) with a path executed by the learned controller (b). The tour of the graph in both cases was fixed.

Figure 2 shows the results of using the 2-opt heuristic with the shortest path objective (figure 2a) and the information-theoretic objective (figure 2b). The principal difference between the planned tours is the order of points at the top of the free space; by first taking a straight trajectory into the room, the information-theoretic optimization improves the map of the room and then revisits the remaining locations, using the improved map to stay well-localized. Additionally, the information-theoretic optimization uses waypoints in previously-visited locations to relocalize. In contrast, the shortest-path trajectory visits the early waypoints first and cannot deliberately revisit a location to re-localize, resulting in a lower-quality map. The information-theoretic optimization significantly reduces the map entropy, and as expected also reduces the squared error of the resulting map (table 1).

One-step update

In order to evaluate the posterior map entropy of a particular exploration trajectory, we can use the EKF algorithm to predict the posterior map estimate after the trajectory.
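The edge-swapping search used to order the sensing locations (a greedy nearest-neighbor tour refined by 2-opt moves that are kept only when the objective improves) can be sketched with a pluggable cost function. The sketch assumes an open tour with a fixed start node and uses path length as a stand-in cost; in the paper's setting the cost would be the predicted posterior map entropy of the tour, which is not modeled here.

```python
import math


def tour_length(tour, points):
    """Path length of an open tour; a stand-in for the entropy objective."""
    return sum(math.dist(points[a], points[b]) for a, b in zip(tour, tour[1:]))


def nearest_neighbor_tour(points, start=0):
    """Greedy initial tour: always move to the closest unvisited node."""
    unvisited = set(range(len(points))) - {start}
    tour = [start]
    while unvisited:
        last = points[tour[-1]]
        nxt = min(unvisited, key=lambda j: math.dist(last, points[j]))
        unvisited.remove(nxt)
        tour.append(nxt)
    return tour


def two_opt(tour, cost):
    """Hill climbing: reverse a segment (swap two edges), keep it if cost drops."""
    best, best_cost = list(tour), cost(tour)
    improved = True
    while improved:
        improved = False
        for i in range(1, len(best) - 1):          # node 0 stays the start
            for j in range(i + 1, len(best)):
                candidate = best[:i] + best[i:j + 1][::-1] + best[j + 1:]
                c = cost(candidate)
                if c < best_cost - 1e-12:
                    best, best_cost, improved = candidate, c, True
    return best, best_cost


points = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0)]
greedy = nearest_neighbor_tour(points)
best, best_cost = two_opt([0, 2, 1, 3], lambda t: tour_length(t, points))
```

Because the cost is an arbitrary callable, nothing in `two_opt` relies on the triangle inequality; only the approximation guarantee, not the procedure, is lost when the objective is entropy rather than distance.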
This prediction requires computing the expected motion updates, simulating measurements from the current map estimate and updating the EKF at each time step. As a result, the prediction process can be computationally demanding and the cost is intensified by the fact that every modification of the exploration trajectory requires the EKF to recalculate the map for the entire trajectory following the modification. Even if the trajectory modification is relatively minor, the non-linearity of the covariance update in Algorithm 1 means that the EKF updates from one trajectory cannot easily be re-used in the overlapping section of another trajectory. Note that the simulated motions and measurements can be re-used if the same graph edge appears in two potential trajectories; it is only the EKF updates along that edge that must be recalculated.

In previous work (Prentice & Roy 2007) we have shown that multiple updates to a Gaussian filter can be combined into a single transfer function by re-factoring the covariance, allowing multiple process and measurement updates to occur with a constant number of operations. If the covariance is factored as $\Sigma_{t-1} = B_{t-1} C_{t-1}^{-1}$, then the individual factors can be updated as

$$\begin{bmatrix} B \\ C \end{bmatrix}_t = \begin{bmatrix} W & X \\ Y & Z \end{bmatrix}_t \begin{bmatrix} B \\ C \end{bmatrix}_{t-1} \tag{5}$$

$$= \begin{bmatrix} 0 & I \\ I & M \end{bmatrix}_t \begin{bmatrix} 0 & G^{-T} \\ G & R G^{-T} \end{bmatrix}_t \begin{bmatrix} B \\ C \end{bmatrix}_{t-1}, \tag{6}$$

and $\Sigma_t$ recovered as $\Sigma_t = B_t C_t^{-1}$. (Note that $M_t = H_t^T Q_t^{-1} H_t$, the measurement information.) Assuming we have known measurements and motions, we recursively apply the measurement Jacobian $H$, the measurement noise $Q$, the motion Jacobian $G$, and the motion noise $R$ in the form of the transfer function $\zeta_t = \begin{bmatrix} W & X \\ Y & Z \end{bmatrix}_t$ to the original $\begin{bmatrix} B \\ C \end{bmatrix}_0$. We can initialize the process as $B_0 = \Sigma_0$ and $C_0 = I$, where $\Sigma_0$ is the initial covariance matrix and $I$ the identity matrix. Note that each of these terms can be seen in the original formulation of the EKF (Algorithm 1).

Using this one-step EKF update, we can collect simulated motions and measurements along each edge in the graph, compute the relevant Jacobians for each time step and collect terms into a single transfer function for that edge that can be reused for the map posterior prediction of future exploration trajectories.
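The factored update of equations (5) and (6) can be checked numerically in the scalar case, where each transfer function is a 2x2 matrix and the covariance is recovered as the ratio B / C. The sketch below is an illustration under that scalar assumption, with made-up values for G, R, H and Q; it composes three steps into one transfer function and compares the result with the ordinary predict-then-update recursion.

```python
def mat2_mul(a, b):
    """Multiply two 2x2 matrices stored as nested tuples."""
    return tuple(
        tuple(sum(a[r][k] * b[k][c] for k in range(2)) for c in range(2))
        for r in range(2)
    )


def step_transfer(G, R, H, Q):
    """Scalar equation (6): [[0, 1], [1, M]] times [[0, 1/G], [G, R/G]]."""
    M = H * H / Q                              # measurement information
    return mat2_mul(((0.0, 1.0), (1.0, M)), ((0.0, 1.0 / G), (G, R / G)))


def compose(steps):
    """Collapse per-step transfer functions into one (later steps on the left)."""
    zeta = ((1.0, 0.0), (0.0, 1.0))
    for s in steps:
        zeta = mat2_mul(s, zeta)
    return zeta


def apply_transfer(zeta, sigma0):
    """Start from B_0 = sigma0, C_0 = 1 and recover sigma = B / C."""
    B = zeta[0][0] * sigma0 + zeta[0][1]
    C = zeta[1][0] * sigma0 + zeta[1][1]
    return B / C


def kalman_covariance(sigma, G, R, H, Q):
    """Reference: the usual predict-then-update covariance recursion."""
    sigma_bar = G * sigma * G + R
    K = sigma_bar * H / (H * sigma_bar * H + Q)
    return (1.0 - K * H) * sigma_bar


params = [(1.0, 0.1, 1.0, 0.5), (0.9, 0.2, 2.0, 0.3), (1.1, 0.05, 0.5, 1.0)]
zeta_edge = compose([step_transfer(*p) for p in params])

direct = 0.7
for p in params:
    direct = kalman_covariance(direct, *p)
one_step = apply_transfer(zeta_edge, 0.7)
```

The same `zeta_edge` can be applied to any other starting covariance without revisiting the individual steps, which is what makes per-edge caching worthwhile.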
In particular, we compute a tour by creating one-step transfer functions between the interest points. If we have already created a transfer function to within some threshold $\epsilon$ of the original orientation, we will use the pregenerated transfer function to obtain the one-step update. Thus, we will cache the transfer functions for any already-visited edges and use this to drastically speed up the search process.

Motion Control Optimization

The coverage exploration optimization assumes a specific model of robot motion along each edge in the graph but

does not have direct control of the robot motion between sensing locations. However, the trajectory between sensor measurements can have a significant effect on the accuracy of the map. When the robot motion is noisy, substantial uncertainty is introduced into the robot pose; when few or no observations are expected (e.g., because of sensor range limits or environmental sparsity), this uncertainty is propagated throughout the SLAM filter into future measurements, increasing the expected error of the posterior.

Algorithm 3 Information-theoretic tour optimization
Require: Approximate shortest-path TSP tour best through sensing locations derived from skeletonized map
1: while not converged do
2: Use the 2-opt heuristic to create a new tour curr, initialize $\Sigma_0$.
3: for each edge $e_{ij}$ in curr do
4: if edge $e_{ij}$ was previously executed starting from orientation $\theta_k$ then
5: Use the cached transfer function $\zeta_{ijk}$ to update $\Sigma_t$ to $\Sigma_{t+n}$.
6: else if $e_{ij}$ has not been executed before then
7: Execute it using the learned controller and store $\zeta_{ijk}$ for that edge and initial orientation.
8: end if
9: end for
10: Compute the entropy $H_{curr} = \mathrm{entropy}(\Sigma_T)$ after executing tour curr.
11: if $H_{curr} < H_{best}$ then
12: best = curr, $H_{best} = H_{curr}$.
13: end if
14: end while
15: return best

To learn good control policies, we use Policy Search Dynamic Programming (Bagnell et al. 2003; Kollar & Roy 2008), a form of reinforcement learning that decomposes into a series of one-step learning algorithms, essentially turning the learning of sequential decision making into dynamic programming. The algorithm operates by first learning a one-step policy at time $T$ (e.g., the end time). Training data is generated by sampling a robot pose and map distribution $p(x, m)$, and using all possible control trajectories $a$ to propagate each sampled prior distribution forward to a posterior distribution according to the EKF using a physically realistic vehicle and sensor simulator. Each distribution-action pair is labeled with the resulting information gain, and the distribution-action pair that minimizes the map entropy is kept as a training instance for a supervised learning problem at time $t$.
The result is a classifier that provides an action $a$ for any distribution $p(s)$ that minimizes the map entropy. A one-step policy for the previous time step $T-1$ is then learned by sampling distributions and actions as before, but propagating each distribution according to each action $a$ followed by the time $T$ one-step policy previously obtained. A two-step entropy is obtained for each distribution and action at $T-1$ from the resulting distribution-action pair: the entropy associated with each distribution and action is accumulated from the motion and observations associated with the control at time $T-1$ and the motion and observations received by running the learned policy for time $T$. A new policy is learned for time $T-1$, and the learner then iterates at each time $t$ through to time $T$ using the policies $\pi_i$ for $i \in \{t \ldots T\}$, resulting in a motion controller that minimizes the entropy over the length of the trajectory $t = 0 \ldots T$.

Figure 3 compares a standard shortest-path motion controller with the learned controller for the same tour of sensing locations. As expected, the map entropy is significantly reduced for the learned controller (H(m) = 7.23) compared to the shortest-path controller (H(m) = 6.93).

Incremental Exploration

To summarize, our goal has been to optimally explore unknown environments by solving a local-global optimization problem, maximizing both map coverage and map accuracy. The coverage requirement has led us to take the partial map, skeletonize it, and extract sensing locations from it, as seen in Algorithm 2. The map accuracy requirement has led us to a hill-climbing algorithm that optimizes the visitation of sensing locations while at the same time minimizing the resulting entropy of the map. The initial tour is optimized with respect to distance, the 2-opt heuristic modifies the tour and the best tours are kept, as seen in Algorithm 3. Computing the EKF updates for previously visited tour edges is inefficient and can be optimized by storing a transfer function for each edge. Additional improvement can be achieved by using a local learned controller to plan local trajectories between sensing locations.
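The caching idea summarized above (store one transfer function per edge and reuse it whenever a candidate tour contains that edge) can be sketched as the inner loop of a tour evaluation. Everything about the "simulation" here is a hypothetical stand-in: the state is scalar, the per-edge motion and measurement parameters are invented, and the cache is keyed by edge only, whereas the paper also keys on the initial orientation.

```python
import math


def mat2_mul(a, b):
    """Multiply two 2x2 matrices stored as nested tuples."""
    return tuple(
        tuple(sum(a[r][k] * b[k][c] for k in range(2)) for c in range(2))
        for r in range(2)
    )


def simulate_edge(i, j):
    """Hypothetical rollout along edge (i, j); returns its composed transfer function."""
    G, R, H = 1.0, 0.05, 1.0
    Q = 0.5 + 0.1 * j                      # made-up: noisier sensing at some nodes
    M = H * H / Q
    step = mat2_mul(((0.0, 1.0), (1.0, M)), ((0.0, 1.0 / G), (G, R / G)))
    zeta = ((1.0, 0.0), (0.0, 1.0))
    for _ in range(abs(i - j) + 1):        # made-up: longer edges take more steps
        zeta = mat2_mul(step, zeta)
    return zeta


def tour_entropy(tour, sigma0, cache, stats):
    """Predicted posterior entropy of a tour, reusing cached edge transfer functions."""
    sigma = sigma0
    for i, j in zip(tour, tour[1:]):
        if (i, j) not in cache:
            cache[(i, j)] = simulate_edge(i, j)
            stats["simulated"] += 1
        z = cache[(i, j)]
        sigma = (z[0][0] * sigma + z[0][1]) / (z[1][0] * sigma + z[1][1])
    return 0.5 * math.log(2.0 * math.pi * math.e * sigma)   # 1-D Gaussian entropy


cache, stats = {}, {"simulated": 0}
h_a = tour_entropy([0, 1, 2, 3], 1.0, cache, stats)   # simulates three edges
h_b = tour_entropy([0, 1, 3, 2], 1.0, cache, stats)   # reuses edge (0, 1)
```

A 2-opt candidate typically shares most of its edges with the tour it was derived from, so the number of simulated edges grows much more slowly than the number of tours evaluated.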
Once the optimal tour is computed and then executed by the robot, the partial map of the environment will grow. When a complete tour is accomplished, a new tour will be computed for the new partial map and the next round of exploration will begin. In this section, we have presented a complete algorithm for information-theoretic exploration and have elaborated a number of ways that it can be made more efficient.

Results

To demonstrate the effectiveness of the approach, we tested the exploration algorithm on scenarios from the Radish dataset using a simulation with uncertain robot motion and uncertain landmark measurements. Treating both the WEAN and LONGWOOD maps as partial maps, we ran the optimization with the shortest path controller and with the learned controller. Overall, it is clear that the map entropy after exploration has been reduced, and in all but the one instance (the WEAN map with the information-theoretic objective and the learned controller), the squared error is also reduced for the optimized trajectory. In the larger LONGWOOD example (Figure 4), it appears clear that the tour stays in sight of as many map features (red squares) as possible.

Figure 4: Tours for the Longwood map. Panels: (a) Distance Optimal Tour, (b) Distance Optimal Path, (c) Information-theoretic Tour, (d) Information-theoretic Path. (a) The shortest path ordering of the skeleton map graph. (b) The resulting path using the shortest-path controller. (c) The minimum-entropy path ordering of the skeleton map graph. (d) The resulting path using the learned controller.

Table 1: Algorithm Performance for a Single Tour. MAP is the map that the experiment was performed in, W is the WEAN map and L is the LONGWOOD map. OPT. TYPE corresponds to graph tour optimization, either shortest path or information-theoretic optimization. CONT. corresponds to the motion controller, either shortest path or learned. ENT. and SE correspond to the metrics of map entropy and map squared error compared to the known ground truth respectively. DIST. corresponds to the distance the robot traveled over the entire tour.

MAP   OPT. TYPE    CONT.        ENT.   SE   DIST. (m)
W     Path Lgth.   Dist. Opt.
W     Info         Dist. Opt.
W     Info         Learned
L     Path Lgth.   Learned
L     Info         Learned
[numeric entries missing from this copy]

Having shown that we can successfully reduce the map entropy and improve map accuracy, we next demonstrate an incremental exploration of the environment in Figure 5. Here the robot has a partial map of the environment, skeletonizes this map to find sensing locations, computes an information-theoretic tour, and then executes that tour. After the execution of the tour in (d), the robot then computes a new tour, which it then executes in (e), exploring this partial map as well until the entire environment has been observed (f).

Table 2 gives a comparison of the entropy and map accuracies for the maps in Figure 5 corresponding to the distance optimal and information-theoretic exploration algorithms. Noting that each pair of maps (a-d, b-e, c-f) may be directly compared, we found that the squared error and the entropy were both significantly reduced by the optimized policy. Further, by using one-step transfer functions, we were able to reduce the total number of edge prediction steps by 50%, for a similar time savings of 50%. We expect this savings will grow for larger problem sizes.

Table 2: Algorithm performance for a sequence of tours during online exploration. OPT. corresponds to graph tour optimization, either shortest path or information-theoretic optimization. The controller CONT. in every case was the learned controller. ENT. and SE correspond to the metrics of map entropy and map squared error respectively. DIST. corresponds to the distance the robot traveled over the entire tour. The information-theoretic tour optimization results in both a reduced map entropy and reduced map error compared to the known ground truth.

      OPT.   CONT.     ENT.   SE   DIST. (m)
(a)   Dist   Learned
(b)   Dist   Learned
(c)   Dist   Learned
(d)   Info   Learned
(e)   Info   Learned
(f)   Info   Learned
[numeric entries missing from this copy]

Figure 5: Example exploration trajectories, comparing a shortest-path coverage trajectory with an information-theoretic coverage trajectory. Panels (a-c): map exploration using shortest path optimization of exploration trajectory. Panels (d-f): map exploration using information-theoretic optimization of exploration trajectory. In panels (a) and (d), the algorithm has only a limited view of the environment, and plans a tour through the visible free space. In doing so, the map is expanded (b and e), and the robot can plan a second tour through the larger space. The process iterates again through a yet larger map in (c and f).

Conclusion

We have presented a novel method that frames exploration as a constrained optimization problem. The algorithm reduces the current environmental map to a graph of the map skeleton, placing sensing constraints at the graph nodes which naturally include boundaries and frontiers. Information-theoretic tours are then generated through the sensing constraints using a gradient-ascent with a heuristic; we can perform the optimization efficiently by using one-step transfer functions on the covariance to re-use previously computed EKF updates. To our knowledge this is one of few information-theoretic approaches to exploration and the only one to consider multistep tours of an environment. Finally, a learner is employed to optimize the local trajectory of the robot between sensing constraints, to obtain global information-theoretic exploration that can quickly find information-theoretic trajectories through an environment, resulting in a high quality map.

References

Bagnell, J.; Kakade, S.; Ng, A.; and Schneider, J. 2003. Policy search by dynamic programming. In Neural Information Processing Systems, volume 16. MIT Press.
Blum, H. 1967. A transformation for extracting new descriptors of shape. In Models for the Perception of Speech and Visual Form. Cambridge: MIT Press.
Coeurjolly, D.; Hulin, J.; and Sivignon, I. 2007. Finding a Minimum Medial Axis of a Discrete Shape is NP-hard. Technical Report RR-LIRIS, LIRIS UMR 5205 CNRS/INSA de Lyon/Université Claude Bernard Lyon 1/Université Lumière Lyon 2/Ecole Centrale de Lyon.
Cornea, N. D.; Silver, D.; and Min, P. 2005. Curve-skeleton applications. In Proceedings IEEE Visualization.
Darbellay, G. A., and Vajda, I. 2000. Entropy expressions for multivariate continuous distributions. IEEE Transactions on Information Theory 46(2).
Feder, H. J. S.; Leonard, J. J.; and Smith, C. M. 1999. Adaptive mobile robot navigation and mapping. International Journal of Robotics Research, Special Issue on Field and Service Robotics 18(7).
Garey, M. R., and Johnson, D. S. 1979. Computers and Intractability: A Guide to the Theory of NP-Completeness. Series of Books in the Mathematical Sciences. W. H. Freeman.
Kollar, T., and Roy, N. 2008. Trajectory optimization using reinforcement learning for map exploration. The International Journal of Robotics Research 27(2).
Makarenko, A.; Williams, S.; Bourgault, F.; and Durrant-Whyte, H. 2002. An experiment in integrated exploration. In Proc. of the IEEE/RSJ Intl. Conf. on Intelligent Robots and Systems (IROS).
Prentice, S., and Roy, N. 2007. The belief roadmap: Efficient planning in linear POMDPs by factoring the covariance. In Proceedings of the 13th International Symposium of Robotics Research (ISRR).
Rosenkrantz, D.; Stearns, R.; and Lewis, P. 1977. An analysis of several heuristics for the traveling salesman problem. SIAM Journal on Computing 6:563.
Sim, R., and Roy, N. 2005. Global A-optimal robot exploration in SLAM. In Proceedings of the IEEE International Conference on Robotics and Automation (ICRA).
Stachniss, C., and Burgard, W. 2003. Exploring unknown environments with mobile robots using coverage maps. In Proc. of the International Conference on Artificial Intelligence (IJCAI).
Stachniss, C. 2006. Exploration and Mapping with Mobile Robots. Ph.D. Dissertation, University of Freiburg, Department of Computer Science.
Yamauchi, B. 1997. A frontier-based approach for autonomous exploration. In Proceedings of the IEEE International Symposium on Computational Intelligence in Robotics and Automation.
