Visual Robot Homing using Sarsa(λ), Whole Image Measure, and Radial Basis Function.
Abdulrahman Altahhan, Kevin Burn, Stefan Wermter
Hybrid Intelligent Systems Research Group, School of Computing and Technology, University of Sunderland, SR6 0DD, UK

Abstract
This paper describes a model for visual homing. It uses Sarsa(λ) as its learning algorithm, combined with the Jeffrey Divergence Measure (JDM) as a way of terminating the task and augmenting the reward signal. The visual features are taken to be the histogram differences of the current view and the stored views of the goal location, taken for all RGB channels. A radial basis function layer acts on those histograms to provide input for the linear function approximator. An on-policy on-line Sarsa(λ) method was used to train three linear neural networks, one for each action, to approximate the action-value function with the aid of eligibility traces. The resultant networks are trained to perform visual robot homing, where they achieved good results in finding a goal location. This work demonstrates that visual homing based on reinforcement learning and radial basis functions has a high potential for learning local navigation tasks.

I. INTRODUCTION
A skill which plays an integral role in achieving robot autonomy is the ability to learn to operate in a priori unknown environments [1]. Visual homing is the act of finding a goal location by comparing the image currently viewed with stored snapshot images (normally taken while the animal or robot is heading off its home location). (Note: goal location and home location will be used interchangeably in this paper.) Visual navigation is the act of navigating from one location to another in the environment, as efficiently as possible. In this paper we present a model for visual homing, which can also be used in local navigation, using reinforcement learning (RL from now on) and an online snapshot comparison technique. This snapshot comparison facilitates online learning and execution in a priori unknown environments to reach a goal location.

Robotics borrows several concepts from animal homing and navigation strategies described in the biological literature [2, 3]. While visual homing and visual navigation are related, they have been kept fairly apart, because visual homing is more inspired by biology and because visual navigation is more general than visual homing. Nevertheless, navigation can be accomplished more directly by using local homing strategies to reach some location, without directly building a map or
using a model of environment dynamics. The limitation is that the learned strategy to navigate home is bound to that particular location. Therefore, if the robot needs to navigate to a different location, it should be trained to do so. We argue that our model can also be used for general navigation tasks because it can operate in any environment and requires no additional effort except showing the robot, online or offline, its goal location, then letting it train.

Algorithms based on the snapshot model [3] propose various strategies for finding features within images and establishing correspondence between them in order to determine the home direction. Block matching, for example, takes a block of pixels from one image and searches for the best matching block in another image within a fixed search radius [4]. The degree of match between blocks is usually judged by the Sum of Squared Differences (SSD) or some other local correlation measure [5]. In our model we take a more effective approach by comparing bins of histograms through a Radial Basis Function layer, and using only images taken around the home, nothing more.

Reinforcement learning has been used previously in robotics navigation and control problems. Several of the models used are inspired by biological findings, e.g. [6]. Although successful, some of those models lack generality and/or practicality, and some are restricted to their environment. The model proposed by [7], for example, depends heavily on object recognition of a landmark in the environment to achieve the task. We have addressed this issue in our model by avoiding object recognition and using a whole image measure technique instead, to measure the dissimilarity of the current and goal views and so identify whether the robot has reached the goal location (with the desired orientation). This was possible with no prior knowledge or constraints regarding those images. By adding the above advantage to the learning robustness and generality of RL, coupled with visual states and rewards, the model achieved a high level of robustness, generality, and applicability.

While environment dynamics or map building may be necessary for more complex or interactive forms of navigation or localization, visual homing based on model-free learning can offer an adaptive form of local homing. Although the immediate execution of a model-based
navigation system can be successful [8, 9], RL techniques have the advantages of model-free systems, i.e. no knowledge is needed prior to operating the robot. It learns the best policy for the environment dynamics. While the idea of using snapshots to do robot localization is not new [10], visual homing based on reinforcement learning, a radial basis input layer and a whole image measure is a novel contribution of this paper.

We begin by presenting an overview of our reinforcement learning context and the Markov Decision Process (MDP) framework, followed by the Temporal Difference (TD) learning algorithm for continuous state spaces. This is followed by a detailed description of our model, demonstrating generality and simplicity of execution. Then we present empirical results of a robot reaching a goal location visually in a simulation environment.

II. BACKGROUND OF REINFORCEMENT LEARNING
Reinforcement learning concerns the problem of learning to predict the sum of rewards an agent receives while interacting with its environment in order to optimally execute a task [11]. Instead of being given examples of the desired behavior, the learning agent must find out, using its environment's feedback and gradual explorative actions, how to act best to execute a task. Usually this feedback is a minimal signal of reward or punishment induced in some way in the environment. This signal is called the reinforcement signal.

In any environment there exists a set of states that represent the situations that the agent can face (or recognize). Those states define the state space, denoted by $S$, which can be finite or infinite and continuous. The actions are those simple activities the agent is able to do in a certain state. The set of those actions defines the action space $A$, which can also be finite or infinite. The environment normally reacts or responds to any action taken by the agent by returning a signal indicating or reinforcing how good or bad this action was for the task. It is called the reward signal or the reinforcement signal. The dynamics of an environment are the set of probability distributions that distinguish its internal properties. Those are mainly the state transition function and the reward function.
The state transition function is a probability distribution defined on the state space that specifies the probability of moving from state $s$ at time $t$ to another state $s'$ at time $t+1$ after applying action $a$:

$$P^a_{ss'} = \Pr\{ s_{t+1} = s' \mid s_t = s,\ a_t = a \}$$

The reward function is defined as the expected reward returned by the environment for each state after applying a certain action:

$$R^a_{ss'} = E\{ r_{t+1} \mid s_t = s,\ a_t = a,\ s_{t+1} = s' \}$$

where $r_{t+1}$ is the actual reward returned by the environment and fully observed by the agent. As with most reinforcement work, we will restrict ourselves to Markovian environments. A Markov decision process (MDP) is defined by a tuple $(S, A, P^a_{ss'}, R^a_{ss'}, \gamma)$, where $\gamma \in [0,1]$ is a discount rate parameter, and where the Markov property is satisfied [9, 11]. A trajectory of experience is a sequence $s_1, a_1, r_2, s_2, a_2, r_3, \ldots$ where the agent in $s_1$ takes action $a_1$, then receives reward $r_2$ and transitions to $s_2$ before taking $a_2$, etc.

A policy specifies (probabilistically or deterministically) the action that needs to be taken for each different state:

$$\pi : S \times A \to [0,1], \qquad \sum_a \pi(s,a) = 1$$

where $\pi(s,a)$ is the probability of selecting action $a$ when the agent is in state $s$. A deterministic policy is a mapping between states and actions, $\pi : S \to A$.

The ultimate goal of reinforcement learning methods (algorithms) is to learn an optimum policy that, when followed, maximizes the accumulated rewards expected to be gained by the agent during interaction with its environment. This is normally reached through estimating the expected sum in some form, since a model of the environment is normally not available and is undesirable as a requirement. Even in methods that assume a model of the environment dynamics to be known, such as Dynamic Programming methods, the expectation still needs to be estimated due to the bootstrapping characteristic of such methods. By bootstrapping we mean building on one's own initial estimation to reach a better estimation closer to the real value [11]. The discounted sum of rewards at time step $t$ is called the return $R_t$, where:

$$R_t = r_{t+1} + \gamma r_{t+2} + \gamma^2 r_{t+3} + \cdots = \sum_{k=0}^{\infty} \gamma^k r_{t+k+1} \qquad (1)$$

Expected accumulated rewards for a certain policy $\pi$ can be expressed in two forms: the value function $V^\pi(s)$ and the action-value function $Q^\pi(s,a)$. A value function for a policy $\pi$ is defined as $V^\pi(s) : S \to \mathbb{R}$. $V^\pi$ specifies the expected return (sum of rewards $r$) from the starting state $s$ and onwards. Obviously each policy has a different value function, hence the superscript.
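To make the discounted return of (1) concrete, the following minimal Python sketch accumulates a finite reward sequence backwards. The function name `discounted_return` and the truncation to a finite list of rewards are illustrative assumptions, not part of the paper's controller.

```python
def discounted_return(rewards, gamma):
    """Sum of gamma**k * rewards[k], where rewards[0] plays the role of r_{t+1}.

    Assumption: the episode is finite, so the infinite sum in (1) is
    truncated at the last observed reward.
    """
    total = 0.0
    # Work backwards so each step applies R_t = r_{t+1} + gamma * R_{t+1}.
    for reward in reversed(rewards):
        total = reward + gamma * total
    return total


# Hypothetical example: three steps, each with reward -1, and no discounting.
three_step_cost = discounted_return([-1.0, -1.0, -1.0], gamma=1.0)
```

With a step cost as the only reward and no discounting, the return is simply minus the number of steps taken, which is why shorter routes to the goal yield higher returns.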
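Formally,

$$V^\pi(s) = E_\pi\left[ R_t \mid s_t = s \right] = E_\pi\left[ \sum_{k=0}^{\infty} \gamma^k r_{t+k+1} \,\middle|\, s_t = s \right] \qquad (2)$$

The central idea of RL is to try to learn an estimate of the value function of the adopted policy, depending on the interaction between the agent and its environment. In other words, to predict the value function of the agent's MDP policy. An essential property of the value function can be deduced from the intrinsic recursion it possesses:

$$V^\pi(s) = E_\pi\left[ r_{t+1} + \gamma \sum_{k=0}^{\infty} \gamma^k r_{t+k+2} \,\middle|\, s_t = s \right] = \sum_a \pi(s,a) \sum_{s'} P^a_{ss'} \left[ R^a_{ss'} + \gamma V^\pi(s') \right] \qquad (3)$$

The action-value function is defined as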
$Q^\pi(s,a) : S \times A \to \mathbb{R}$, where:

$$Q^\pi(s,a) = E_\pi\left[ \sum_{k=0}^{\infty} \gamma^k r_{t+k+1} \,\middle|\, s_t = s,\ a_t = a \right] \qquad (4)$$

For clarity, we will present below the main results for the value function, then we will shift to the action-value function when presenting our model.

III. TOWARDS OUR MODEL
Our work uses techniques developed for the problem of online on-policy evaluation, where an approximate action-value function is maintained and improved after each time step of following the policy. In particular we are interested in a linear Q-function approximator that uses Temporal Difference (TD) learning [12], since TD learning can be guaranteed to converge with any linear function approximator and a suitable step size [13]. For the continuous case and non-linear function approximation, convergence is not guaranteed [14], although some models have been presented with good results [15].

In this work we focus on presenting a model that learns an approximation of a policy's action-value function from sample trajectories of experience following that policy. A method for solving this problem is a core component of our visual robot homing model. In particular, maintaining an online estimate of the Q-function can be combined with generalized policy improvement (GPI) to learn a controller [11].

For a particular value function $V$, let the TD error at time $t$ be defined as:

$$\delta_t(V) = r_{t+1} + \gamma V(s_{t+1}) - V(s_t) \qquad (5)$$

that is, the difference between the prediction at $t+1$ and the prediction at $t$. Then $E_\pi[\delta_t(V^\pi)] = 0$, that is, the mean TD error for the policy's true value function must be zero.

We are interested in approximating $V^\pi$ using a linear function approximator. In particular, suppose we have a function which gives a feature representation of the state space, $\phi : S \to \mathbb{R}^n$. We are interested in an approximated value function of the form $V_\theta(s) = \phi(s)^T \theta$, where $\theta \in \mathbb{R}^n$ are the parameters of the value function. Because the policy's true value function may not be in our space of linear functions, we want to find a set of parameters that approximates the true function. One possible approach is to use the observed TD error on sample trajectories of experience to guide the approximation.

The standard one-step TD method for value function approximation is TD(0). The basic idea of TD(0) is to adjust the predicted value of a state to reduce the TD error. Given some new experience tuple $(s_t, a_t, r_{t+1}, s_{t+1})$, the update with linear function approximation is:

$$\theta_{t+1} = \theta_t + \alpha_t\, u_t(\theta_t) \qquad (6)$$

$$u_t(\theta) = \delta_t(V_\theta)\, \phi(s_t) \qquad (7)$$

where $V_\theta(s_t)$ is the estimated value with respect to $\theta$ and $\alpha_t$ is the learning rate.
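The TD(0) update of (5) to (7) can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions: feature vectors are ordinary lists of floats, and the names `dot` and `td0_update` are hypothetical helpers rather than anything from the paper's implementation.

```python
def dot(u, v):
    """Inner product of two equal-length lists."""
    return sum(a * b for a, b in zip(u, v))


def td0_update(theta, phi_s, phi_next, reward, alpha, gamma):
    """One TD(0) step for a linear value function V_theta(s) = phi(s) . theta.

    Returns the new parameter list and the TD error delta.
    """
    # TD error (5): r_{t+1} + gamma * V(s_{t+1}) - V(s_t)
    delta = reward + gamma * dot(theta, phi_next) - dot(theta, phi_s)
    # Update (6)-(7): move theta along phi(s_t), scaled by alpha * delta
    new_theta = [w + alpha * delta * f for w, f in zip(theta, phi_s)]
    return new_theta, delta


# Hypothetical two-feature example with a step cost of -1.
theta, delta = td0_update([0.0, 0.0], [1.0, 0.0], [0.0, 1.0],
                          reward=-1.0, alpha=0.5, gamma=1.0)
```

Only the components of `theta` whose features are active in the current state change, which is what lets a small parameter vector share experience across similar states.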
The vector $u_t(\theta)$ is like a gradient estimate that specifies how to change the predicted value of $s_t$ to reduce the observed TD error. We will call $u_t(\theta)$ the TD update at time $t$. After updating the parameter vector, the experience tuple is removed from memory.

IV. THE PROPOSED VISUAL HOMING SARSA MODEL
In this section we describe the proposed model. In the simplest perspective, any reinforcement learning model (or any MDP model in general) consists of elements and experience gained about those elements. The environment dynamics encoded in the tuple $(S, A, P^a_{ss'}, R^a_{ss'}, \gamma)$ describe the basic elements of the model, while the interaction between the robot and the environment constitutes the gained experience. This experience is normally encoded in the learning parameters using some learning method that mainly learns a value function. For control, reaching an optimal policy $\pi^*$ can be done through policy improvement. We first begin by describing the main elements, then we describe the learning rules and algorithm, and conclude this section with the overall model structure.

A. Basic Elements of the System, the State Space:
Since we are considering visual homing, it is natural to choose vision as the main medium to distinguish between different situations. Hence, we assume $s_t$ is the image at each time step $t$ that represents the current state, and the state space $S$ is the set of all the images that can possibly be taken at any location (with a specific orientation) in the environment. This complex state space has two problems. First, each state is of high dimensionality, i.e. each state is represented by a large number of pixel components. Second, this state space is huge and a policy cannot be learned directly for each state. Instead, a feature representation of the states is used to reduce the high dimensionality of the image state space and to gain the advantages of coding [16]. This feature representation is assumed to preserve the distinctiveness of states; hence it can reduce the high-dimensionality problem, but we are still faced with the intractability problem. Therefore, a generalization technique is needed in order to accommodate the intractability of the state space. More precisely, generalization is needed in order to approximate the value for a state that has never been visited before, through previous visits to similar states. A natural way to do so is to use a function approximation technique such as a neural network.
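As a rough illustration of how function approximation generalizes to unseen states, the sketch below evaluates one linear unit per action on a feature vector, in the spirit of the three linear networks mentioned in the abstract. The name `q_values` and the example weights are hypothetical; the point is only that two states with similar features receive similar action values even if one of them was never visited.

```python
def q_values(phi, thetas):
    """Action values Q(s, a) = phi(s) . theta_a for every action a.

    phi    : list of feature values for the current state
    thetas : one weight list per action (assumed layout for illustration)
    """
    return [sum(f * w for f, w in zip(phi, theta)) for theta in thetas]


# Hypothetical weights for three actions over two features.
weights = [[0.5, 0.0], [0.0, 1.0], [-1.0, 2.0]]
visited = q_values([1.0, 0.0], weights)   # a state seen during training
nearby = q_values([0.9, 0.1], weights)    # a similar, unseen state
```

Because the mapping is linear in the features, a small change in the feature vector produces a proportionally small change in every action value.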
We would like to encode in those features, implicitly, how different the current image view is from those of the goal. This visual clue should guide the process of finding the goal location. The problem is that this approach does not give a direct distance indication. We will not assume that the goal location is always in the robot's field of view, but by comparing the current view with the goal view we combine
the properties of distinctiveness, distance and orientation in one representation.

B. Defining the goal location:
Since the home location can be approached from different directions, the way it is represented should accommodate this fact. Therefore, a home (or goal) location is defined by $m$ snapshots called the stored views. The few snapshots (normally $m \geq 3$) of the home location are taken at the very start, each from a fixed distance but from a different angle. The distance should be compatible with the scale of the environment and the characteristics of the home location. This allows for the highest distinctiveness of the location without losing information or involving unneeded information. These snapshots are the only requirement for the system to learn to reach its home location starting from any position in the environment (including those from which it cannot see the home), i.e. the robot should be able to reach a hidden goal location.

C. The Feature Vectors:
We take a histogram of each channel of the current view and compare it with those of the stored views through a radial basis function (RBF) layer. This gives us the feature space representation $\Phi : S \to \mathbb{R}^n$ in (8), which is used with the Sarsa(λ) algorithm, as we shall see later.

$$\phi_i(s_t(c,j)) = \exp\left( -\frac{\left( h_i(s_t(c)) - h_i(v(c,j)) \right)^2}{2\sigma_i^2} \right) \qquad (8)$$

The index $t$ is for the time step, $j$ stands for the $j$-th stored view, and $c$ is the index of the channel, where we used the RGB representation of images. So $v(c,j)$ is the image of channel $c$ of the $j$-th stored view, $h_i(v(c,j))$ is bin $i$ of the histogram of the image $v(c,j)$, and $h_i(s_t(c))$ is bin $i$ of channel $c$ of the current view. Of course the number of bins has an effect on the performance of this measure and hence on the model, and will be studied in the experimental section.

D. The Action Space:
The set of actions is A = [Left_Forward, Right_Forward, Go Forward], where the two differential wheel speeds were set to fixed values so that we have a countable set of actions.

E. Dissimilarity Measure and the Termination Condition:
We need a way to determine how close the current position is to the goal location. This is done by measuring how dissimilar the current view is to each stored view of the goal location. One can use any of the dissimilarity measures discussed extensively in the information retrieval field [10, 17]. In particular we are interested in the Jeffrey Divergence Measure, given by (9).
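The two histogram comparisons used by the model, the RBF feature of (8) and the Jeffrey divergence referred to as (9), can be sketched as follows. This is a minimal, hedged illustration: the function names, the flat list of 0 to 255 pixel values per channel, the floor-based binning, and the convention that empty bins contribute zero to the divergence are all assumptions made for the example. The Jeffrey divergence is written in its standard form, in which each histogram is compared against the bin-wise mean of the two.

```python
import math


def channel_histogram(pixels, bin_size):
    """Histogram of one 8-bit channel, with round(256 / bin_size) + 1 bins."""
    n_bins = round(256 / bin_size) + 1
    hist = [0] * n_bins
    for value in pixels:
        hist[value // bin_size] += 1   # assumed binning rule
    return hist


def rbf_features(current_hist, stored_hist, sigma):
    """Per-bin RBF response exp(-(h_i - v_i)^2 / (2 * sigma_i^2)), as in (8)."""
    return [math.exp(-((h - v) ** 2) / (2.0 * s ** 2))
            for h, v, s in zip(current_hist, stored_hist, sigma)]


def jeffrey_divergence(h, k):
    """Sum over bins of h*log(2h/(h+k)) + k*log(2k/(h+k)); empty bins add 0."""
    total = 0.0
    for hi, ki in zip(h, k):
        if hi > 0:
            total += hi * math.log(2.0 * hi / (hi + ki))
        if ki > 0:
            total += ki * math.log(2.0 * ki / (hi + ki))
    return total
```

Identical histograms give RBF responses of 1 in every bin and a divergence of 0, so both quantities move monotonically as the current view drifts away from a stored view.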
$$JDM(H,K) = \sum_i \left( h_i \log\frac{2h_i}{h_i + k_i} + k_i \log\frac{2k_i}{h_i + k_i} \right) \qquad (9)$$

where $H$ and $K$ are the two images to be compared, and $h_i$ and $k_i$ are the numbers of elements belonging to bin $i$ of the histograms of $H$ and $K$, respectively. Fig. 1(a) shows a sample view from the robot's camera; part (b) shows the changes that took place in the JDM measurements when turning away from this location.

Fig. 1. Example of the JDM behaviour relative to the robot's rotational motion.

JDM has been successfully used with an omni-directional camera to perform robot localization [10]. We used a normal camera, however, to be able to distinguish the robot's orientation, which is crucial to our navigational task. This avoids the orientation insensitivity of an omni-directional camera, which is desirable for localization but undesirable for navigation. We denote by $JDM(s_t(c,j))$ the Jeffrey Divergence Measure between the current view and the stored view $j$ according to channel $c$, and we denote by $JDM_j(s_t)$ the average dissimilarity between the current view and the stored view $j$ over all of the channels:

$$JDM_j(s_t) = \frac{1}{C} \sum_{c=1}^{C} JDM(s_t(c,j)) \qquad (10)$$

We set our termination state to be the current view for which one of its $JDM(s_t(c,j))$ with the $m$ stored views is less than a certain threshold $\psi_0$, i.e. the view that matches well with one of the goal views:

If $\min_{c,j} JDM(s_t(c,j)) < \psi_0$, terminate the episode.

The way to set this environment-scale-specific threshold is discussed in the experimental section.

F. The Reward Function:
The reward function $R^a_{ss'}$ consists of three parts:
- The main part is the cost, which is set to $-1$ for each step taken by the robot without reaching the home location (reaching the termination state). The reward signal can be augmented by another two signals to ensure higher performance, although the model works regardless of their involvement. Those are:
- The approaching-the-goal reward: the maximum reduction in dissimilarity between the current step and the previous step. If this difference is decreasing, the robot is actually moving in the right direction towards the home location; if it is increasing, the opposite holds. We call this signal the differential dissimilarity signal and it is defined as:
$$\Delta JDM(s_t) = \max_j \left( JDM_j(s_{t-1}) - JDM_j(s_t) \right) \qquad (11)$$

- The position signal is the inverse of the current dissimilarity, $1 / JDM(s_t)$. Thus, as the current location differs less from the home location, this reward will increase.

$$r_t = cost + \Delta JDM(s_t) + \frac{1}{JDM(s_t)} \qquad (12)$$

Of course the previous two reward components will only be considered if the dissimilarities of both steps fall under a certain threshold $\psi_1$, to ensure that the robot is approaching the home location. This threshold is environment-scale-specific, and is introduced merely to enhance the performance. The overall structure of the model is shown in Fig. 2.

Fig. 2. The various components of the proposed model: the current image and the stored view images, the histogram of each channel, the RBF of each feature with the reference view (feature vector $\phi(s_t)$ of dimension $n = C \times B \times m$), Sarsa(λ) with parameters $\theta$, the Q-function approximation, and the robot control policy.

G. The Eligibility Trace:
An eligibility trace constitutes a mechanism for temporal credit assignment. It marks the memory parameters associated with the action as eligible for undergoing learning changes [11]. For our application, the eligibility trace for action $a$ is the discounted sum of the feature vectors for the images that the robot has seen so far after applying this action. The eligibility trace for the other actions, which have not been taken while in the current state, is simply the previous trace but discounted, i.e. those actions are now less responsible for the credit:

$$e_t^{(a_i)} = \begin{cases} \gamma\lambda\, e_{t-1}^{(a_i)} + \phi(s_t) & \text{if } a_i = a_t \\ \gamma\lambda\, e_{t-1}^{(a_i)} & \text{otherwise} \end{cases} \qquad (13)$$

where $\lambda$ is the discount rate for the eligibility traces $e_t$.

H. The Learning Method:
What remains is the learning algorithm. Our algorithm is an on-policy bootstrapping Sarsa(λ) [11] with linear approximation of the action-value function $Q$. Sarsa(λ) is an algorithm that uses TD(λ) for control. It learns on-line through interaction with simulation software that feeds it with the robot's visual sensor data. The algorithm, coded as a controller, returns the chosen action to be taken by the robot, and updates its policy through updating the set of parameters used to approximate the action-value function $Q$. Three linear networks are used to approximate the action-value function for the three actions:

$$\theta^{(a_i)} = \left( \theta_1^{(a_i)}, \ldots, \theta_n^{(a_i)} \right), \qquad i = 1, \ldots, |A|$$

The current image was passed through an RBF layer, which gives the feature vector $\phi(s_t) = (\phi_1, \ldots, \phi_n)$. The robot was left to run through several episodes.
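A single learning step that combines the per-action eligibility traces of (13) with a linear Sarsa(λ) parameter update can be sketched as below. It is a hedged illustration only: the function name `sarsa_lambda_step`, the list-of-lists layout for the weights and traces, and the `terminal` flag (which treats the value of a terminal state as zero) are assumptions for the example and are not taken from the paper's controller.

```python
def sarsa_lambda_step(theta, traces, phi_s, action, reward,
                      phi_next, next_action, alpha, gamma, lam,
                      terminal=False):
    """Update theta and traces in place for one transition; return the TD error.

    theta, traces : one list of n floats per action
    phi_s, phi_next : feature lists for the current and next state
    """
    q_s = sum(f * w for f, w in zip(phi_s, theta[action]))
    # Assumption: a terminal next state contributes no future value.
    q_next = 0.0 if terminal else sum(
        f * w for f, w in zip(phi_next, theta[next_action]))
    delta = reward + gamma * q_next - q_s

    for a in range(len(theta)):
        for i in range(len(phi_s)):
            # Decay every trace; add the features only for the action taken (13).
            traces[a][i] *= gamma * lam
            if a == action:
                traces[a][i] += phi_s[i]
            theta[a][i] += alpha * delta * traces[a][i]
    return delta


# Hypothetical run: three actions, two features, two consecutive steps.
theta = [[0.0, 0.0] for _ in range(3)]
traces = [[0.0, 0.0] for _ in range(3)]
d1 = sarsa_lambda_step(theta, traces, [1.0, 0.5], 1, -1.0,
                       [0.0, 1.0], 0, alpha=0.1, gamma=1.0, lam=0.8)
d2 = sarsa_lambda_step(theta, traces, [0.0, 1.0], 0, -1.0,
                       [0.0, 1.0], 0, alpha=0.1, gamma=1.0, lam=0.8)
```

After the second step the weights of action 1 change again even though it was not selected, because its decayed trace still assigns it part of the credit for the new TD error.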
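After each episode the learning rate was decreased, and the policy was improved further through GPI. The overall algorithm is that of the Sarsa(λ) control algorithm [11] and is summarized in Fig. 3.

Initialize $\theta^{(a_i)}$, $i = 1 : |A|$
Repeat (for each episode):
    $e^{(a_i)} \leftarrow 0$, $i = 1 : |A|$
    $s_0 \leftarrow$ initial robot view, $t \leftarrow 0$
    Generate $a_0$ by sampling from the probability $\pi(s_0, a)$
    Repeat (for each step of the episode):
        Take action $a_t$, observe $r_{t+1}$, $s_{t+1}$
        Generate $a_{t+1}$ by sampling from the probability $\pi(s_{t+1}, a)$
        $\delta_t \leftarrow r_{t+1} + \gamma\, \phi(s_{t+1})^T \theta^{(a_{t+1})} - \phi(s_t)^T \theta^{(a_t)}$
        For $i = 1 : |A|$:
            $e^{(a_i)} \leftarrow \gamma\lambda\, e^{(a_i)} + \phi(s_t)$ if $a_i = a_t$, and $\gamma\lambda\, e^{(a_i)}$ otherwise
            $\theta^{(a_i)} \leftarrow \theta^{(a_i)} + \alpha_{ep}\, e^{(a_i)}\, \delta_t$
        $s_t \leftarrow s_{t+1}$, $a_t \leftarrow a_{t+1}$
    until $\min_j JDM_j(s_{t+1}) < \psi_0$
until episode == final_episode

Fig. 3. Linear on-policy gradient-descent Sarsa(λ) control with RBF features: algorithm for linear action-value function approximation and policy improvement. The approximate $Q$ is implicitly a function of $\theta$.

The learning rate was the same as that used by Boyan [18]:

$$\alpha_{ep} = \alpha_0\, \frac{n_0 \cdot final\_episode + 1}{n_0 \cdot final\_episode + episode} \qquad (14)$$

I. The Policy Used to Generate Actions:
A combination of the ε-greedy policy and the Gibbs soft-max policy [11] is used to pick an action and to strike the balance between exploration and exploitation. Using the ε-greedy probability allows exploration to be increased as needed by initially setting ε to a high value and then decreasing it through the episodes. The Gibbs soft-max probability,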
$$Gibbs(a_i, \phi(s_t)) = \frac{\exp\left[ \phi(s_t)^T \theta^{(a_i)} \right]}{\sum_{j=1}^{|A|} \exp\left[ \phi(s_t)^T \theta^{(a_j)} \right]} \qquad (15)$$

helped in increasing the chances of picking the action with the highest value when the differences between its value and the values of the remaining actions are large, i.e. it helped in increasing the chances of picking the action with the highest Q-value when the robot is sure that it is the right one.

$$\Pr(a_i, s_t) = \begin{cases} 1 - \varepsilon + \dfrac{\varepsilon}{|A|} & \text{if } a_i = \arg\max_a \left[ \phi(s_t)^T \theta^{(a)} \right] \\ \dfrac{\varepsilon}{|A|} & \text{otherwise} \end{cases} \qquad (16)$$

$$\pi_{\varepsilon + Gibbs}(a_i, s_t) = Gibbs(a_i, s_t) + \Pr(a_i, s_t) \qquad (17)$$

J. The Neural Network Layers:
From a neural network point of view, when considering the RBF layer together with the competitive layer, one can realize that this architecture is similar to a Probabilistic Neural Network (PNN) that calculates the probability of picking a certain action conditional on the given goal. We will call the neural network used in our model the RBF-Q-TD Network (and algorithm), because we used the RBF layer for feature extraction and then a linear layer with the Sarsa(λ) algorithm and the dissimilarity measure. Fig. 3 shows a simplification of our model with its layers.

K. The Linear Networks and Feature Dimensions:
The parameters have the same dimension as the feature space, which is $n = C \times B \times m$, where $C = 3$ is the number of channels, $B$ is the number of bins per image and $m$ is the number of stored views for the goal location. Since we use RGB images with values in the range [0, 255] for each pixel, the dimension of the feature space is given by:

$$n \approx C \times \frac{256}{b} \times m \qquad (18)$$

where $b$ is the bin size. Different bin sizes give different dimensions, which in turn give different numbers of approximation parameters $\theta$. The equality is not exact because the precise number of bins is going to be $B = \mathrm{round}(256 / b) + 1$.

Note that $\sigma$ of the features has been chosen through continuous update of the sum of the feature vectors collected in all the time steps so far; the update, equation (19), is expressed in terms of the histogram bins $h_i(s_t(c,j))$ and $h_i(v(c,j))$. This allowed for maintaining a better incremental estimation of each feature's variance and hence better performance. After enough exploration of the environment this value is almost stable and changes to it are minimized. It has been observed that the variance of this internal parameter dropped after episode […] to a negligible value, which means results are reliable for episode > […] and that the neural networks are learning the value function for almost the same states that are going to be encountered in the future.

L.
Important Enhancements and Limitations:
One problem of unnecessary wandering remains. It is mainly caused by consecutive conflicting positive and negative rewards given by the environment due to approaching the goal and wandering around without reaching it. Simply, a severe punishment was applied for the particular case when the robot goes from positive rewards to negative punishments in two successive steps.

V. EXPERIMENTAL RESULTS
The model was applied using a simulated Khepera [19] robot in the Webots [20] simulation software. The Khepera is a miniature real robot, 70 mm in diameter and 30 mm in height, and is provided with 8 infra-red sensors for reactive behaviour, as well as a colour camera extension. A […].5 × […] m² simulated environment has been used as a test bed for our model. The task is to learn to navigate from any location in the environment to a home location (without using any specific object or landmark). For training, the robot always starts from the same location, where it cannot see the target location, and the end state is the target location.

Fig. 4. Snapshots of the realistic simulated environment, showing the target locations and the Khepera robot in its starting location.

Fig. 4 shows the environment used. A cone, a ball and a TV are included to add more texture to the goal location, i.e. to enrich it and make it different from the other locations in the environment. We re-emphasize that no object recognition technique was used, only the JDM. The controller was written as a combination of C++ code and Matlab Engine code. The robot starts by taking (m = 3) snapshots of the goal location. It then goes through a specific number (5) of episodes. The robot starts with a random policy, and finishes an episode when it reaches the desired location.

A. The Practical Settings of the Model Parameters:
For our application we have chosen the feature space parameters to be b = 3, m = 3, hence
$n = 3 \times (\mathrm{round}(256 / 3) + 1) \times 3 = 774$. $\lambda$ was set to 0.8, based on the studies [11, 21] that referred to the range [0.7, 0.8] as the peak of the performance of TD(λ) learning. The discount constant was set to $\gamma = 1$, i.e. there is no discount through time. $\psi_0$ and $\psi_1$ are purely empirical and were set to 0.7 and 2 respectively.

B. Setting the Exploitation vs. Exploration:
Since the action space is finite, and to avoid fluctuation and overshooting in the robot behaviour, low wheel speeds were adopted for these actions. This in turn required setting the exploration to a relatively high rate (almost 5%) during the early episodes. It was then dropped gradually through the episodes, in order to make sure that most of the potential paths are sufficiently visited. Setting exploration high also helps in decreasing the number of episodes needed before reaching an acceptable performance. This explains the exponential appearance of the different learning curves discussed below.

The model performance has been studied for a small number of stored views (m = 3) to show the robustness of the model. One can enhance accuracy by increasing the dimension of the space, but would have to trade off speed of convergence and execution. The most natural way to increase the state space dimension is by increasing the number of histogram bins considered. However, to concentrate on the pure effect of changing $m$ and eliminate the increase in state dimension due to the increase in $m$ (18), one can set $m = b$ and then change both $m$ and $b$ together. This fixes the dimension of the feature space, and consequently the size of the approximator, and shows the actual effect of changing the number of views $m$.

C. Studying the Model Performance:
Fig. 5 shows the effect of learning averaged over 8 trials, each with 5 episodes. All of the trials successfully converged. Divergence occurred only when setting the learning rate to a high value, or when exploration was quickly decreased. The reason that we needed a low learning rate is that we use the Gibbs probability distribution for the exploration/exploitation balance. This exponentially formed probability can quickly go to infinity if care is not taken when assigning its exponents. The fact that we have a relatively large state space dimension was the major factor in this situation. Part (a) shows the most important aspect of any reinforcement learning model: the return values of each episode, converging optimally.
After all, the main purpose of the RL-based model is to optimally increase the sum of the received rewards. The return values (mostly negative) have increased naturally through the episodes due to the improvement taking place from episode to episode. This is done via improving the adopted policy implicitly, by moving to better estimates and decreasing exploration from one episode to the next. The accuracy of the action-value function estimates is gradually and iteratively increasing using the learning parameter $\theta$. Fig. 5(b) shows the decrease that took place in the number of steps needed to achieve the task. This normal decrease is in accordance with part (a) and is due to the cost part of the reward function. In fact, we decreased the difference between $\psi_0$ and $\psi_1$, so that the other two parts of the reward formula have a minimal effect on the model convergence. Fig. 5(c) depicts the changes in the learning parameters themselves, where $i$ stands for the component index of the learning vector. Most important is that the three parts have an exponential-like shape, showing the high speed of convergence this model has reached. This is highly desirable in a reinforcement learning model due to its dominant convergence slowness problem [11]. In fact, one major contribution of this work is the high performance reached with little experience using complex visual input.

Fig. 5. Learning curves averaged over 8 trials.

Fig. 6 depicts performance and internal parameter illustrations. Fig. 6(a) shows the learning rate decrease through the episodes, which was used throughout the trials. Part (b) shows the decrease enforced on the exploration rate ε, while part (c) shows the overall percentage of explorative actions and exploitative actions. Routes taken by the robot in three episodes (early, middle, and final) for one of the trials are shown in parts (d)-(f).

VI. DISCUSSION AND CONCLUSION
We have tackled the policy improvement for Sarsa(λ) systems combined with JDM and RBF. This is novel relative to models introduced in the literature due to the way we applied reinforcement learning using neuro-dynamic programming methods like Sarsa(λ). Below we state some of the advantages of this model:
1) Simplicity of learning: the robot can learn to perform its visual navigation task in a very simple way, without a long process of map building.
2) Limited storage of information is required, in the form of m stored views.
3) No pre-processing or manual processing is required. No a priori knowledge about the environment is needed, i.e. no landmarks are needed in those views.
4) An important advantage of our model over MDP explicit model-based approaches is that abduction of the robot is solved directly, i.e. the robot can find its way and recover after it has been displaced from its current position and put in a totally different position.

To raise the differentiability of the views, however, they should be rich with colours etc. (i.e. contain a good amount of information). Through the learning robustness and generality of RL, coupled with visual states and rewards, the system achieved a high level of robustness, generality, and applicability. This combination tentatively proved to work very well for our navigation problem. Future work includes carrying out more extensive experiments on our model by trying different configurations using (18), both in terms of more views to be considered as well as different bin sizes and different environments. Future work can also include using an off-policy method instead of the on-policy method to accommodate two behaviour layers used by the agent.

Fig. 6. Learning performance and sample route for a sample trial.

[2] A. M. Anderson, "A model for landmark learning in the honey-bee," Journal of Comparative Physiology A, vol. 114, 1977.
[3] B. A. Cartwright and T. S. Collett, "Landmark maps for honeybees," Biological Cybernetics, vol. 57, 1987.
[4] A. Vardy and F. Oppacher, "A scale invariant local image descriptor for visual homing," in Biomimetic Neural Learning for Intelligent Robots, G. Palm and S. Wermter, Eds.: Springer, 2005.
[5] M. Szenher, "Visual Homing with Learned Goal Distance Information," presented at Proceedings of the 3rd International Symposium on Autonomous Minirobots for Research and Edutainment (AMiRE 2005), Awara-Spa, Fukui, Japan, 2005.
[6] D. Sheynikhovich, R. Chavarriaga, T. Strösslin, and W. Gerstner, "Spatial Representation and Navigation in a Bio-inspired Robot," in Biomimetic Neural Learning for Intelligent Robots, S. Wermter, M. Elshaw, and G. Palm, Eds.: Springer, 2005.
[7] C. Weber, D. Muse, M. Elshaw, and S.
Wermter, "A camera-direction dependent visual-motor coordinate transformation for a visually guided neural robot," Knowledge-Based Systems, Science Direct, Elsevier, vol. 19, 2006.
[8] S. Thrun, Y. Liu, D. Koller, A. Ng, Z. Ghahramani, and H. Durrant-Whyte, "Simultaneous localization and mapping with sparse extended information filters," International Journal of Robotics Research, vol. 23, 2004.
[9] S. Thrun, W. Burgard, and D. Fox, Probabilistic Robotics. Cambridge, Massachusetts; London, England: The MIT Press, 2005.
[10] I. Ulrich and I. Nourbakhsh, "Appearance-Based Place Recognition for Topological Localization," presented at IEEE International Conference on Robotics and Automation, San Francisco, CA, 2000.
[11] R. S. Sutton and A. Barto, Reinforcement Learning, an Introduction. Cambridge, Massachusetts: MIT Press, 1998.
[12] R. S. Sutton, "Learning to predict by the methods of temporal differences," Machine Learning, vol. 3, pp. 9-44, 1988.
[13] J. N. Tsitsiklis and B. Van Roy, "An analysis of temporal-difference learning with function approximation," IEEE Transactions on Automatic Control, vol. 42, 1997.
[14] J. A. Boyan, "Technical update: Least-squares temporal difference learning," Machine Learning, vol. 49, 2002.
[15] C. Gaskett, D. Wettergreen, and A. Zelinsky, "Q-Learning in Continuous State and Action Spaces," presented at Australian Joint Conference on Artificial Intelligence, Australia, 1999.
[16] P. Stone, R. S. Sutton, and G. Kuhlmann, "Reinforcement learning for robocup soccer keepaway," International Society for Adaptive Behavior, vol. 13, 2005.
[17] Y. Rubner et al., "The Earth Mover's Distance as a Metric for Image Retrieval," International Journal of Computer Vision, vol. 40, pp. 99-121, 2000.
[18] J. A. Boyan, "Least-squares temporal difference learning," presented at Proceedings of the Sixteenth International Conference on Machine Learning, San Francisco, CA, 1999.
[19] D. Floreano and F. Mondada, "Hardware solutions for evolutionary robotics," presented at First European Workshop on Evolutionary Robotics, Berlin, 1998.
[20] O. Michel, "Webots: Professional Mobile Robot Simulation," International Journal of Advanced Robotic Systems, vol. 1, 2004.
[21] L. C.
Baird, "Residual Algorithms: Reinforcement Learning with Function Approximation," presented at International Conference on Machine Learning, Proceedings of the Twelfth International Conference, San Francisco, CA, 1995.

REFERENCES
[1] U. Nehmzow, Mobile Robotics: A Practical Introduction: Springer-Verlag, 2000.
Inroducon Varans of Pegasos SooWoong Ryu bshboy@sanford.edu December, 009 Youngsoo Cho yc344@sanford.edu Developng a new SVM algorhm s ongong research opc. Among many exng SVM algorhms, we wll focus on
More informationMotion. Part 2: Constant Acceleration. Acceleration. October Lab Physics. Ms. Levine 1. Acceleration. Acceleration. Units for Acceleration.
Moion Accelerion Pr : Consn Accelerion Accelerion Accelerion Accelerion is he re of chnge of velociy. = v - vo = Δv Δ ccelerion = = v - vo chnge of velociy elpsed ime Accelerion is vecor, lhough in one-dimensionl
More informationOrigin Destination Transportation Models: Methods
In Jr. of Mhemcl Scences & Applcons Vol. 2, No. 2, My 2012 Copyrgh Mnd Reder Publcons ISSN No: 2230-9888 www.journlshub.com Orgn Desnon rnsporon Models: Mehods Jyo Gup nd 1 N H. Shh Deprmen of Mhemcs,
More informationMinimum Squared Error
Minimum Squred Error LDF: Minimum Squred-Error Procedures Ide: conver o esier nd eer undersood prolem Percepron y i > for ll smples y i solve sysem of liner inequliies MSE procedure y i = i for ll smples
More informationANOTHER CATEGORY OF THE STOCHASTIC DEPENDENCE FOR ECONOMETRIC MODELING OF TIME SERIES DATA
Tn Corn DOSESCU Ph D Dre Cner Chrsn Unversy Buchres Consnn RAISCHI PhD Depren of Mhecs The Buchres Acdey of Econoc Sudes ANOTHER CATEGORY OF THE STOCHASTIC DEPENDENCE FOR ECONOMETRIC MODELING OF TIME SERIES
More informationPrivacy-Preserving Bayesian Network Parameter Learning
4h WSEAS In. Conf. on COMUTATIONAL INTELLIGENCE, MAN-MACHINE SYSTEMS nd CYBERNETICS Mm, Flord, USA, November 7-9, 005 pp46-5) rvcy-reservng Byesn Nework rmeer Lernng JIANJIE MA. SIVAUMAR School of EECS,
More informationMinimum Squared Error
Minimum Squred Error LDF: Minimum Squred-Error Procedures Ide: conver o esier nd eer undersood prolem Percepron y i > 0 for ll smples y i solve sysem of liner inequliies MSE procedure y i i for ll smples
More informationContraction Mapping Principle Approach to Differential Equations
epl Journl of Science echnology 0 (009) 49-53 Conrcion pping Principle pproch o Differenil Equions Bishnu P. Dhungn Deprmen of hemics, hendr Rn Cmpus ribhuvn Universiy, Khmu epl bsrc Using n eension of
More informationA Kalman filtering simulation
A Klmn filering simulion The performnce of Klmn filering hs been esed on he bsis of wo differen dynmicl models, ssuming eiher moion wih consn elociy or wih consn ccelerion. The former is epeced o beer
More informationElectromagnetic Transient Simulation of Large Power Transformer Internal Fault
Inernonl Conference on Advnces n Energy nd Envronmenl Scence (ICAEES 5) Elecromgnec Trnsen Smulon of rge Power Trnsformer Inernl Ful Jun u,, Shwu Xo,, Qngsen Sun,c, Huxng Wng,d nd e Yng,e School of Elecrcl
More informationRL for Large State Spaces: Policy Gradient. Alan Fern
RL for Lrge Se Spce: Polcy Grden Aln Fern Movon for Polcy Serch So fr ll of our RL echnque hve red o lern n ec or pprome vlue funcon or Q-funcon Lern opml vlue of beng n e or kng n con from e. Vlue funcon
More informationLecture 4: Trunking Theory and Grade of Service (GOS)
Lecure 4: Trunkng Theory nd Grde of Servce GOS 4.. Mn Problems nd Defnons n Trunkng nd GOS Mn Problems n Subscrber Servce: lmed rdo specrum # of chnnels; mny users. Prncple of Servce: Defnon: Serve user
More informationRank One Update And the Google Matrix by Al Bernstein Signal Science, LLC
Introducton Rnk One Updte And the Google Mtrx y Al Bernsten Sgnl Scence, LLC www.sgnlscence.net here re two dfferent wys to perform mtrx multplctons. he frst uses dot product formulton nd the second uses
More informationMacroscopic quantum effects generated by the acoustic wave in a molecular magnet
Cudnovsky-Fes-09034 Mcroscopc qunum effecs genered by e cousc wve n moleculr mgne Gwng-Hee Km ejong Unv., Kore Eugene M. Cudnovksy Lemn College, CUNY Acknowledgemens D. A. Grnn Lemn College, CUNY Oulne
More informationII The Z Transform. Topics to be covered. 1. Introduction. 2. The Z transform. 3. Z transforms of elementary functions
II The Z Trnsfor Tocs o e covered. Inroducon. The Z rnsfor 3. Z rnsfors of eleenry funcons 4. Proeres nd Theory of rnsfor 5. The nverse rnsfor 6. Z rnsfor for solvng dfference equons II. Inroducon The
More informationOPERATOR-VALUED KERNEL RECURSIVE LEAST SQUARES ALGORITHM
3rd Europen Sgnl Processng Conference EUSIPCO OPERATOR-VALUED KERNEL RECURSIVE LEAST SQUARES ALGORITM P. O. Amblrd GIPSAlb/CNRS UMR 583 Unversé de Grenoble Grenoble, Frnce. Kdr LIF/CNRS UMR 779 Ax-Mrselle
More informationEEM 486: Computer Architecture
EEM 486: Compuer Archecure Lecure 4 ALU EEM 486 MIPS Arhmec Insrucons R-ype I-ype Insrucon Exmpe Menng Commen dd dd $,$2,$3 $ = $2 + $3 sub sub $,$2,$3 $ = $2 - $3 3 opernds; overfow deeced 3 opernds;
More informationMODELLING AND EXPERIMENTAL ANALYSIS OF MOTORCYCLE DYNAMICS USING MATLAB
MODELLING AND EXPERIMENTAL ANALYSIS OF MOTORCYCLE DYNAMICS USING MATLAB P. Florn, P. Vrání, R. Čermá Fculy of Mechncl Engneerng, Unversy of Wes Bohem Asrc The frs pr of hs pper s devoed o mhemcl modellng
More informationTighter Bounds for Multi-Armed Bandits with Expert Advice
Tgher Bounds for Mul-Armed Bnds wh Exper Advce H. Brendn McMhn nd Mhew Sreeer Google, Inc. Psburgh, PA 523, USA Absrc Bnd problems re clssc wy of formulng exploron versus exploon rdeoffs. Auer e l. [ACBFS02]
More informationReview: Transformations. Transformations - Viewing. Transformations - Modeling. world CAMERA OBJECT WORLD CSE 681 CSE 681 CSE 681 CSE 681
Revew: Trnsforons Trnsforons Modelng rnsforons buld cople odels b posonng (rnsforng sple coponens relve o ech oher ewng rnsforons plcng vrul cer n he world rnsforon fro world coordnes o cer coordnes Perspecve
More information4.8 Improper Integrals
4.8 Improper Inegrls Well you ve mde i hrough ll he inegrion echniques. Congrs! Unforunely for us, we sill need o cover one more inegrl. They re clled Improper Inegrls. A his poin, we ve only del wih inegrls
More informationQuery Data With Fuzzy Information In Object- Oriented Databases An Approach The Semantic Neighborhood Of Hedge Algebras
(IJCSIS) Inernonl Journl of Compuer Scence nd Informon Secury, Vol 9, No 5, My 20 Query D Wh Fuzzy Informon In Obec- Orened Dbses An Approch The Semnc Neghborhood Of edge Algebrs Don Vn Thng Kore-VeNm
More informationApplied Statistics Qualifier Examination
Appled Sttstcs Qulfer Exmnton Qul_june_8 Fll 8 Instructons: () The exmnton contns 4 Questons. You re to nswer 3 out of 4 of them. () You my use ny books nd clss notes tht you mght fnd helpful n solvng
More informationDynamic Team Decision Theory. EECS 558 Project Shrutivandana Sharma and David Shuman December 10, 2005
Dynamc Team Decson Theory EECS 558 Proec Shruvandana Sharma and Davd Shuman December 0, 005 Oulne Inroducon o Team Decson Theory Decomposon of he Dynamc Team Decson Problem Equvalence of Sac and Dynamc
More informationResearch Article Oscillatory Criteria for Higher Order Functional Differential Equations with Damping
Journl of Funcon Spces nd Applcons Volume 2013, Arcle ID 968356, 5 pges hp://dx.do.org/10.1155/2013/968356 Reserch Arcle Oscllory Crer for Hgher Order Funconl Dfferenl Equons wh Dmpng Pegung Wng 1 nd H
More informationTHE PREDICTION OF COMPETITIVE ENVIRONMENT IN BUSINESS
THE PREICTION OF COMPETITIVE ENVIRONMENT IN BUSINESS INTROUCTION The wo dmensonal paral dfferenal equaons of second order can be used for he smulaon of compeve envronmen n busness The arcle presens he
More informationCubic Bezier Homotopy Function for Solving Exponential Equations
Penerb Journal of Advanced Research n Compung and Applcaons ISSN (onlne: 46-97 Vol. 4, No.. Pages -8, 6 omoopy Funcon for Solvng Eponenal Equaons S. S. Raml *,,. Mohamad Nor,a, N. S. Saharzan,b and M.
More informationSeptember 20 Homework Solutions
College of Engineering nd Compuer Science Mechnicl Engineering Deprmen Mechnicl Engineering A Seminr in Engineering Anlysis Fll 7 Number 66 Insrucor: Lrry Creo Sepember Homework Soluions Find he specrum
More information0 for t < 0 1 for t > 0
8.0 Sep nd del funcions Auhor: Jeremy Orloff The uni Sep Funcion We define he uni sep funcion by u() = 0 for < 0 for > 0 I is clled he uni sep funcion becuse i kes uni sep = 0. I is someimes clled he Heviside
More informationHEAT CONDUCTION PROBLEM IN A TWO-LAYERED HOLLOW CYLINDER BY USING THE GREEN S FUNCTION METHOD
Journal of Appled Mahemacs and Compuaonal Mechancs 3, (), 45-5 HEAT CONDUCTION PROBLEM IN A TWO-LAYERED HOLLOW CYLINDER BY USING THE GREEN S FUNCTION METHOD Sansław Kukla, Urszula Sedlecka Insue of Mahemacs,
More informationTHE EXISTENCE OF SOLUTIONS FOR A CLASS OF IMPULSIVE FRACTIONAL Q-DIFFERENCE EQUATIONS
Europen Journl of Mhemcs nd Compuer Scence Vol 4 No, 7 SSN 59-995 THE EXSTENCE OF SOLUTONS FOR A CLASS OF MPULSVE FRACTONAL Q-DFFERENCE EQUATONS Shuyun Wn, Yu Tng, Q GE Deprmen of Mhemcs, Ynbn Unversy,
More informationLinear Response Theory: The connection between QFT and experiments
Phys540.nb 39 3 Lnear Response Theory: The connecon beween QFT and expermens 3.1. Basc conceps and deas Q: ow do we measure he conducvy of a meal? A: we frs nroduce a weak elecrc feld E, and hen measure
More informationThe Characterization of Jones Polynomial. for Some Knots
Inernon Mhemc Forum,, 8, no, 9 - The Chrceron of Jones Poynom for Some Knos Mur Cncn Yuuncu Y Ünversy, Fcuy of rs nd Scences Mhemcs Deprmen, 8, n, Turkey m_cencen@yhoocom İsm Yr Non Educon Mnsry, 8, n,
More informationThe solution is often represented as a vector: 2xI + 4X2 + 2X3 + 4X4 + 2X5 = 4 2xI + 4X2 + 3X3 + 3X4 + 3X5 = 4. 3xI + 6X2 + 6X3 + 3X4 + 6X5 = 6.
[~ o o :- o o ill] i 1. Mrices, Vecors, nd Guss-Jordn Eliminion 1 x y = = - z= The soluion is ofen represened s vecor: n his exmple, he process of eliminion works very smoohly. We cn elimine ll enries
More informationV.Abramov - FURTHER ANALYSIS OF CONFIDENCE INTERVALS FOR LARGE CLIENT/SERVER COMPUTER NETWORKS
R&RATA # Vol.) 8, March FURTHER AALYSIS OF COFIDECE ITERVALS FOR LARGE CLIET/SERVER COMPUTER ETWORKS Vyacheslav Abramov School of Mahemacal Scences, Monash Unversy, Buldng 8, Level 4, Clayon Campus, Wellngon
More informationISSN 075-7 : (7) 0 007 C ( ), E-l: ssolos@glco FPGA LUT FPGA EM : FPGA, LUT, EM,,, () FPGA (feldprogrble ge rrs) [, ] () [], () [] () [5] [6] FPGA LUT (Look-Up-Tbles) EM (Ebedded Meor locks) [7, 8] LUT
More informationTo Possibilities of Solution of Differential Equation of Logistic Function
Arnold Dávd, Frnše Peller, Rená Vooroosová To Possbles of Soluon of Dfferenl Equon of Logsc Funcon Arcle Info: Receved 6 My Acceped June UDC 7 Recommended con: Dávd, A., Peller, F., Vooroosová, R. ().
More informationA new model for limit order book dynamics
Anewmodelforlimiorderbookdynmics JeffreyR.Russell UniversiyofChicgo,GrdueSchoolofBusiness TejinKim UniversiyofChicgo,DeprmenofSisics Absrc:Thispperproposesnewmodelforlimiorderbookdynmics.Thelimiorderbookconsiss
More informationFINANCIAL ECONOMETRICS
FINANCIAL ECONOMETRICS SPRING 07 WEEK IV NONLINEAR MODELS Prof. Dr. Burç ÜLENGİN Nonlner NONLINEARITY EXISTS IN FINANCIAL TIME SERIES ESPECIALLY IN VOLATILITY AND HIGH FREQUENCY DATA LINEAR MODEL IS DEFINED
More informationOn One Analytic Method of. Constructing Program Controls
Appled Mahemacal Scences, Vol. 9, 05, no. 8, 409-407 HIKARI Ld, www.m-hkar.com hp://dx.do.org/0.988/ams.05.54349 On One Analyc Mehod of Consrucng Program Conrols A. N. Kvko, S. V. Chsyakov and Yu. E. Balyna
More informationPhysics 2A HW #3 Solutions
Chper 3 Focus on Conceps: 3, 4, 6, 9 Problems: 9, 9, 3, 41, 66, 7, 75, 77 Phsics A HW #3 Soluions Focus On Conceps 3-3 (c) The ccelerion due o grvi is he sme for boh blls, despie he fc h he hve differen
More informationCHAPTER 10: LINEAR DISCRIMINATION
CHAPER : LINEAR DISCRIMINAION Dscrmnan-based Classfcaon 3 In classfcaon h K classes (C,C,, C k ) We defned dscrmnan funcon g j (), j=,,,k hen gven an es eample, e chose (predced) s class label as C f g
More informationIn the complete model, these slopes are ANALYSIS OF VARIANCE FOR THE COMPLETE TWO-WAY MODEL. (! i+1 -! i ) + [(!") i+1,q - [(!
ANALYSIS OF VARIANCE FOR THE COMPLETE TWO-WAY MODEL The frs hng o es n wo-way ANOVA: Is here neracon? "No neracon" means: The man effecs model would f. Ths n urn means: In he neracon plo (wh A on he horzonal
More informationEpistemic Game Theory: Online Appendix
Epsemc Game Theory: Onlne Appendx Edde Dekel Lucano Pomao Marcano Snscalch July 18, 2014 Prelmnares Fx a fne ype srucure T I, S, T, β I and a probably µ S T. Le T µ I, S, T µ, βµ I be a ype srucure ha
More informationCH.3. COMPATIBILITY EQUATIONS. Continuum Mechanics Course (MMC) - ETSECCPB - UPC
CH.3. COMPATIBILITY EQUATIONS Connuum Mechancs Course (MMC) - ETSECCPB - UPC Overvew Compably Condons Compably Equaons of a Poenal Vecor Feld Compably Condons for Infnesmal Srans Inegraon of he Infnesmal
More informationMODEL SOLUTIONS TO IIT JEE ADVANCED 2014
MODEL SOLUTIONS TO IIT JEE ADVANCED Pper II Code PART I 6 7 8 9 B A A C D B D C C B 6 C B D D C A 7 8 9 C A B D. Rhc(Z ). Cu M. ZM Secon I K Z 8 Cu hc W mu hc 8 W + KE hc W + KE W + KE W + KE W + KE (KE
More informationPHYS 1443 Section 001 Lecture #4
PHYS 1443 Secon 001 Lecure #4 Monda, June 5, 006 Moon n Two Dmensons Moon under consan acceleraon Projecle Moon Mamum ranges and heghs Reerence Frames and relae moon Newon s Laws o Moon Force Newon s Law
More informationPHYSICS 1210 Exam 1 University of Wyoming 14 February points
PHYSICS 1210 Em 1 Uniersiy of Wyoming 14 Februry 2013 150 poins This es is open-noe nd closed-book. Clculors re permied bu compuers re no. No collborion, consulion, or communicion wih oher people (oher
More informationChapter Newton-Raphson Method of Solving a Nonlinear Equation
Chpter.4 Newton-Rphson Method of Solvng Nonlner Equton After redng ths chpter, you should be ble to:. derve the Newton-Rphson method formul,. develop the lgorthm of the Newton-Rphson method,. use the Newton-Rphson
More informationAn improved statistical disclosure attack
In J Grnulr Compung, Rough Ses nd Inellgen Sysems, Vol X, No Y, xxxx An mproved sscl dsclosure c Bn Tng* Deprmen of Compuer Scence, Clforn Se Unversy Domnguez Hlls, Crson, CA, USA Eml: bng@csudhedu *Correspondng
More informationExistence and Uniqueness Results for Random Impulsive Integro-Differential Equation
Global Journal of Pure and Appled Mahemacs. ISSN 973-768 Volume 4, Number 6 (8), pp. 89-87 Research Inda Publcaons hp://www.rpublcaon.com Exsence and Unqueness Resuls for Random Impulsve Inegro-Dfferenal
More informationON THE DYNAMICS AND THERMODYNAMICS OF SMALL MARKOW-TYPE MATERIAL SYSTEMS
ON THE DYNAMICS AND THERMODYNAMICS OF SMALL MARKOW-TYPE MATERIAL SYSTEMS Andrzej Trzęsows Deprmen of Theory of Connuous Med, Insue of Fundmenl Technologcl Reserch, Polsh Acdemy of Scences, Pwńsego 5B,
More informationPartially Observable Systems. 1 Partially Observable Markov Decision Process (POMDP) Formalism
CS294-40 Lernng for Rootcs nd Control Lecture 10-9/30/2008 Lecturer: Peter Aeel Prtlly Oservle Systems Scre: Dvd Nchum Lecture outlne POMDP formlsm Pont-sed vlue terton Glol methods: polytree, enumerton,
More informationDecompression diagram sampler_src (source files and makefiles) bin (binary files) --- sh (sample shells) --- input (sample input files)
. Iroduco Probblsc oe-moh forecs gudce s mde b 50 esemble members mproved b Model Oupu scs (MO). scl equo s mde b usg hdcs d d observo d. We selec some prmeers for modfg forecs o use mulple regresso formul.
More informationConcise Derivation of Complex Bayesian Approximate Message Passing via Expectation Propagation
Concse Dervon of Complex Byesn Approxme Messge Pssng v Expecon Propgon Xngmng Meng, Sheng Wu, Lnlng Kung, Jnhu Lu Deprmen of Elecronc Engneerng, Tsnghu Unversy, Bejng, Chn Tsnghu Spce Cener, Tsnghu Unversy,
More informationClustering (Bishop ch 9)
Cluserng (Bshop ch 9) Reference: Daa Mnng by Margare Dunham (a slde source) 1 Cluserng Cluserng s unsupervsed learnng, here are no class labels Wan o fnd groups of smlar nsances Ofen use a dsance measure
More informationMulti-load Optimal Design of Burner-inner-liner Under Performance Index Constraint by Second-Order Polynomial Taylor Series Method
, 0005 (06) DOI: 0.05/ mecconf/06700005 ICMI 06 Mul-lod Opml Desgn of Burner-nner-lner Under Performnce Index Consrn by Second-Order Polynoml ylor Seres Mehod U Goqo, Wong Chun Nm, Zheng Mn nd ng Kongzheng
More informationNotes on the stability of dynamic systems and the use of Eigen Values.
Noes on he sabl of dnamc ssems and he use of Egen Values. Source: Macro II course noes, Dr. Davd Bessler s Tme Seres course noes, zarads (999) Ineremporal Macroeconomcs chaper 4 & Techncal ppend, and Hamlon
More informationPerson Movement Prediction Using Hidden Markov Models
Person Movemen Predcon Usng dden Mrkov Models Arpd Geller Lucn Vnn Compuer Scence Deprmen Lucn Blg Unversy of Su E Corn Sr o 4 Su-5525 omn {rpdgeller lucnvnn}@ulsuro Asrc: Uquous sysems use conex nformon
More informationOrdinary Differential Equations in Neuroscience with Matlab examples. Aim 1- Gain understanding of how to set up and solve ODE s
Ordnary Dfferenal Equaons n Neuroscence wh Malab eamples. Am - Gan undersandng of how o se up and solve ODE s Am Undersand how o se up an solve a smple eample of he Hebb rule n D Our goal a end of class
More informationFTCS Solution to the Heat Equation
FTCS Soluon o he Hea Equaon ME 448/548 Noes Gerald Reckenwald Porland Sae Unversy Deparmen of Mechancal Engneerng gerry@pdxedu ME 448/548: FTCS Soluon o he Hea Equaon Overvew Use he forward fne d erence
More informationGENERATING CERTAIN QUINTIC IRREDUCIBLE POLYNOMIALS OVER FINITE FIELDS. Youngwoo Ahn and Kitae Kim
Korean J. Mah. 19 (2011), No. 3, pp. 263 272 GENERATING CERTAIN QUINTIC IRREDUCIBLE POLYNOMIALS OVER FINITE FIELDS Youngwoo Ahn and Kae Km Absrac. In he paper [1], an explc correspondence beween ceran
More informationTesting a new idea to solve the P = NP problem with mathematical induction
Tesng a new dea o solve he P = NP problem wh mahemacal nducon Bacground P and NP are wo classes (ses) of languages n Compuer Scence An open problem s wheher P = NP Ths paper ess a new dea o compare he
More information2.1 Constitutive Theory
Secon.. Consuve Theory.. Consuve Equaons Governng Equaons The equaons governng he behavour of maerals are (n he spaal form) dρ v & ρ + ρdv v = + ρ = Conservaon of Mass (..a) d x σ j dv dvσ + b = ρ v& +
More informationNumerical Simulations of Femtosecond Pulse. Propagation in Photonic Crystal Fibers. Comparative Study of the S-SSFM and RK4IP
Appled Mhemcl Scences Vol. 6 1 no. 117 5841 585 Numercl Smulons of Femosecond Pulse Propgon n Phoonc Crysl Fbers Comprve Sudy of he S-SSFM nd RK4IP Mourd Mhboub Scences Fculy Unversy of Tlemcen BP.119
More informationModeling and Predicting Sequences: HMM and (may be) CRF. Amr Ahmed Feb 25
Modelg d redcg Sequeces: HMM d m be CRF Amr Ahmed 070 Feb 25 Bg cure redcg Sgle Lbel Ipu : A se of feures: - Bg of words docume - Oupu : Clss lbel - Topc of he docume - redcg Sequece of Lbels Noo Noe:
More informationAverage & instantaneous velocity and acceleration Motion with constant acceleration
Physics 7: Lecure Reminders Discussion nd Lb secions sr meeing ne week Fill ou Pink dd/drop form if you need o swich o differen secion h is FULL. Do i TODAY. Homework Ch. : 5, 7,, 3,, nd 6 Ch.: 6,, 3 Submission
More information[ ] 2. [ ]3 + (Δx i + Δx i 1 ) / 2. Δx i-1 Δx i Δx i+1. TPG4160 Reservoir Simulation 2018 Lecture note 3. page 1 of 5
TPG460 Reservor Smulaon 08 page of 5 DISCRETIZATIO OF THE FOW EQUATIOS As we already have seen, fne dfference appromaons of he paral dervaves appearng n he flow equaons may be obaned from Taylor seres
More informationOnline Supplement for Dynamic Multi-Technology. Production-Inventory Problem with Emissions Trading
Onlne Supplemen for Dynamc Mul-Technology Producon-Invenory Problem wh Emssons Tradng by We Zhang Zhongsheng Hua Yu Xa and Baofeng Huo Proof of Lemma For any ( qr ) Θ s easy o verfy ha he lnear programmng
More informationNeural assembly binding in linguistic representation
Neurl ssembly binding in linguisic represenion Frnk vn der Velde & Mrc de Kmps Cogniive Psychology Uni, Universiy of Leiden, Wssenrseweg 52, 2333 AK Leiden, The Neherlnds, vdvelde@fsw.leidenuniv.nl Absrc.
More informationAdaptive and Coordinated Traffic Signal Control Based on Q-Learning and MULTIBAND Model
Adpve nd Coordned Trffc Sgnl Conrol Bsed on Q-Lernng nd MULTIBAND Model Shoufeng Lu,. Trffc nd Trnsporon College Chngsh Unversy of Scence nd Technology Chngsh, Chn slusf@gml.com Xmn Lu, nd Shqng D. Shngh
More informationRobustness Experiments with Two Variance Components
Naonal Insue of Sandards and Technology (NIST) Informaon Technology Laboraory (ITL) Sascal Engneerng Dvson (SED) Robusness Expermens wh Two Varance Componens by Ana Ivelsse Avlés avles@ns.gov Conference
More informationTight results for Next Fit and Worst Fit with resource augmentation
Tgh resuls for Nex F and Wors F wh resource augmenaon Joan Boyar Leah Epsen Asaf Levn Asrac I s well known ha he wo smple algorhms for he classc n packng prolem, NF and WF oh have an approxmaon rao of
More informationIntroduction. Voice Coil Motors. Introduction - Voice Coil Velocimeter Electromechanical Systems. F = Bli
UNIVERSITY O TECHNOLOGY, SYDNEY ACULTY O ENGINEERING 4853 Elecroechncl Syses Voce Col Moors Topcs o cover:.. Mnec Crcus 3. EM n Voce Col 4. orce n Torque 5. Mhecl Moel 6. Perornce Voce cols re wely use
More informationRemember: Project Proposals are due April 11.
Bonformtcs ecture Notes Announcements Remember: Project Proposls re due Aprl. Clss 22 Aprl 4, 2002 A. Hdden Mrov Models. Defntons Emple - Consder the emple we tled bout n clss lst tme wth the cons. However,
More informationCS434a/541a: Pattern Recognition Prof. Olga Veksler. Lecture 4
CS434a/54a: Paern Recognon Prof. Olga Veksler Lecure 4 Oulne Normal Random Varable Properes Dscrmnan funcons Why Normal Random Varables? Analycally racable Works well when observaon comes form a corruped
More informationJ i-1 i. J i i+1. Numerical integration of the diffusion equation (I) Finite difference method. Spatial Discretization. Internal nodes.
umercal negraon of he dffuson equaon (I) Fne dfference mehod. Spaal screaon. Inernal nodes. R L V For hermal conducon le s dscree he spaal doman no small fne spans, =,,: Balance of parcles for an nernal
More informationVolatility Interpolation
Volaly Inerpolaon Prelmnary Verson March 00 Jesper Andreasen and Bran Huge Danse Mares, Copenhagen wan.daddy@danseban.com brno@danseban.com Elecronc copy avalable a: hp://ssrn.com/absrac=69497 Inro Local
More informationChapter 2 Linear Mo on
Chper Lner M n .1 Aerge Velcy The erge elcy prcle s dened s The erge elcy depends nly n he nl nd he nl psns he prcle. Ths mens h prcle srs rm pn nd reurn bck he sme pn, s dsplcemen, nd s s erge elcy s
More informationActive Model Based Predictive Control for Unmanned Helicopter in Full Flight Envelope
he 2 IEEE/RSJ Inernonl Conference on Inellgen Robos nd Sysems Ocober 8-22, 2, pe, wn Acve Model Bsed Predcve Conrol for Unmnned Helcoper n Full Flgh Envelope Dle Song, Junong Q, Jnd Hn, nd Gungjun Lu Absrc-
More informationChapter Newton-Raphson Method of Solving a Nonlinear Equation
Chpter 0.04 Newton-Rphson Method o Solvng Nonlner Equton Ater redng ths chpter, you should be ble to:. derve the Newton-Rphson method ormul,. develop the lgorthm o the Newton-Rphson method,. use the Newton-Rphson
More informationChapter Lagrangian Interpolation
Chaper 5.4 agrangan Inerpolaon Afer readng hs chaper you should be able o:. dere agrangan mehod of nerpolaon. sole problems usng agrangan mehod of nerpolaon and. use agrangan nerpolans o fnd deraes and
More informationSoftware Reliability Growth Models Incorporating Fault Dependency with Various Debugging Time Lags
Sofwre Relbly Growh Models Incorporng Ful Dependency wh Vrous Debuggng Tme Lgs Chn-Yu Hung 1 Chu-T Ln 1 Sy-Yen Kuo Mchel R. Lyu 3 nd Chun-Chng Sue 4 1 Deprmen of Compuer Scence Nonl Tsng Hu Unversy Hsnchu
More informationJordan Journal of Physics
Volume, Number, 00. pp. 47-54 RTICLE Jordn Journl of Physcs Frconl Cnoncl Qunzon of he Free Elecromgnec Lgrngn ensy E. K. Jrd, R. S. w b nd J. M. Khlfeh eprmen of Physcs, Unversy of Jordn, 94 mmn, Jordn.
More informationLi An-Ping. Beijing , P.R.China
A New Type of Cpher: DICING_csb L An-Png Bejng 100085, P.R.Chna apl0001@sna.com Absrac: In hs paper, we wll propose a new ype of cpher named DICING_csb, whch s derved from our prevous sream cpher DICING.
More information2D Motion WS. A horizontally launched projectile s initial vertical velocity is zero. Solve the following problems with this information.
Nme D Moion WS The equions of moion h rele o projeciles were discussed in he Projecile Moion Anlsis Acii. ou found h projecile moes wih consn eloci in he horizonl direcion nd consn ccelerion in he ericl
More information