RL for Large State Spaces: Policy Gradient. Alan Fern
|
|
- Charla Smith
- 6 years ago
- Views:
Transcription
1 RL for Lrge Se Spce: Polcy Grden Aln Fern
2 Movon for Polcy Serch So fr ll of our RL echnque hve red o lern n ec or pprome vlue funcon or Q-funcon Lern opml vlue of beng n e or kng n con from e. Vlue funcon cn ofen be much more comple o repreen hn he correpondng polcy Do we relly cre bou knowng Qlef = Qrgh = Or ju h rgh beer hn lef n e Move erchng drecly n prmeerzed polcy pce Byp lernng vlue funcon nd drecly opmze he vlue of polcy 2
3 Movon for Polcy Serch Conder pplyng RL o problem wh huge con pce Connou: Acon re vecor n ome ubpce of R d Combnorl: Acon re vecor of dcree vrble wh fne domn # con eponenl n # of vrble. Acon = Connuou Moor Torque Acon = Jon Dcree Trge ow would we repreen he Q-funcon Q? 3
4 Movon for Polcy Serch Conder pplyng RL o problem wh huge con pce Connou: Acon re vecor n ome ubpce of R d Combnorl: Acon re vecor of dcree vrble wh fne domn # con eponenl n # of vrble. Could ry o defne feure over e-con pr Qˆ 0 f 2 f2... n fn or ue non-lner ppromor In eher ce cn be chllengng o compue greedy con rg m Q θ Serch over n nfne or eponenlly lrge e of con 4
5 Movon for Polcy Serch Conder pplyng RL o problem wh huge con pce Connou: Acon re vecor n ome ubpce of R d Combnorl: Acon re vecor of dcree vrble wh fne domn # con eponenl n # of vrble. Ined drecly lern polcy h ke e npu nd oupu con e.g. neurl nework Robo Senor ory Torque moor... Torque moor n 5
6 6 Remnder: Grden Acen Gven funcon f n of n rel vlue = n uppoe we wn o mmze f wh repec o A common pproch o dong h grden cen The grden of f pon denoed by f n n-dmenonl vecor h pon n he drecon where f ncree mo eeply pon Vecor clculu ell u h f ju vecor of prl dervve where lm 0 f f f n n f f f
7 Ade: Grden Acen Grden cen ervely follow he grden drecon rng ome nl pon Inlze o rndom vlue Repe unl oppng condon f θ Locl opm of f Wh proper decy of lernng re grden decen gurneed o converge o locl opm. θ 2 7
8 RL v Polcy Grden Acen The polcy grden pproch h he followng chem:. Selec pce of prmeerzed polce 2. Compue he grden of he vlue of curren polcy wr prmeer 3. Move prmeer n he drecon of he grden 4. Repe hee ep unl we rech locl mm 5. Pobly lo dd n rck for delng wh bd locl mm e.g. rndom rer So we mu nwer he followng queon: ow hould we repreen prmeerzed polce? ow cn we compue he grden? 8
9 Prmeerzed Polce One emple of pce of prmerc polce : rg m Qˆ where ˆ Q my be lner funcon e.g. Qˆ 0 f 2 f2... n fn The gol o lern prmeer h gve good polcy Noe h no mporn h cul Q-funcon Rher we only requre goodne ˆ Q ˆ Q be cloe o he good rnkng con n order of 9
10 Polcy Grden Acen For mplcy we wll mke he followng umpon: Ech run/rjecory of polcy r from fed nl e Ech run/rjecory lwy reche ermnl e n fne number of ep lernvely we f he horzon o Le be epeced vlue of polcy nl e ju he epeced dcouned ol rewrd of rjecory of Our objecve o fnd h mmze 0
11 Polcy Grden Acen Polcy grden cen ell u o ervely upde prmeer v: Problem: generlly very comple nd rre h we cn compue cloed form for he grden of even f we hve n ec model of he yem. Key de: eme he grden bed on eperence
12 Grden Emon Concern: Compung or emng he grden of dconnuou funcon cn be problemc. For our emple prmerc polcy rg m Qˆ connuou? 2
13 Emple: Dconnou Sr Se 0.0 R=0 A0 R = 00 A.0 S2 R = -00 ˆ 0 rg mq 0 rg m f 0 2 f2 0 Conder fng θ 2 nd vryng θ Q θ 0 A0 = θ f 0 A0 + θ 2 f 2 0 A0 π θ 0 = A0 dconnuou 00 π θ 0 = A Q θ 0 A = θ f 0 A + θ 2 f 2 0 A -00 3
14 Grden Emon Concern: Compung or emng he grden of dconnuou funcon cn be problemc. For our emple prmerc polcy rg m Qˆ connuou? No. There re vlue of where rbrrly mll chnge cue he polcy o chnge. Snce dfferen polce cn hve dfferen vlue h men h chngng cn cue dconnuou jump of. 4
15 Probblc Polce We would lke o vod polce h drclly chnge wh mll prmeer chnge ledng o dconnue A probblc polcy ke e npu nd reurn drbuon over con Gven e reurn he probbly h elec con n Noe h ll well defned for probblc polce Now uncerny of rjecore come from envronmen nd polcy Impornly f connuou relve o chngng hen lo connuou relve o chngng A common form for probblc polce he ofm funcon or Bolzmnn eploron funcon ep Qˆ Pr ep Qˆ ' ' A 5
16 Emple: Dconnou Sr Se 0 Pr.0 R=0 A0 R = 00 A ' A.0 Q ˆ S2 ep ep Qˆ ' R = -00 ρ θ = 00 π θ A0 00 π θ = 00 π θ A0 π θ A = 00 2π θ A0 A 00 π θ 0 A0 π θ 0 A = 0.5 π F θ 2 nd vryng θ θ 0 A -00 6
17 7 Emprcl Grden Emon Our fr nïve pproch o emng o mply compue emprcl grden eme Recll h θ = θ θ n nd o we cn compue he grden by emprclly emng ech prl dervve So for mll we cn eme he prl dervve by Th requre emng n+ vlue: lm 0 n n n n... n
18 Emprcl Grden Emon ow do we eme he qune n n... For ech e of prmeer mply eecue he polcy for N rl/epode nd verge he vlue cheved cro he rl Trl of ρθ θ 2 θ 3 Trl of ρθ + ε θ 2 θ 3 Trl of ρθ θ 2 θ 3 + ε
19 Emprcl Grden Emon ow do we eme he qune n n... For ech e of prmeer mply eecue he polcy for N rl/epode nd verge he vlue cheved cro he rl Th requre ol of Nn+ epode o ge grden eme For ochc envronmen nd polce he vlue of N mu be relvely lrge o ge good eme of he rue vlue Ofen we wn o ue relvely lrge number of prmeer e.g. neurl nework Ofen epenve o run epode of he polcy So whle h cn work well n mny uon ofen no prccl pproch compuonlly 9
20 Lkelhood Ro Grden Emon The emprcl grden mehod cn be ppled even when he funconl form of he polcy blck bo.e. don know mppng from o con drbuon owever h men we re no ble o eplo knowledge bou he polcy form when we know If we know he funconl form of he polcy nd cn compue grden wh repec o we cn do beer. Eme drecly from rjecore of ju he curren polcy Trl of ρθ θ 2 θ 3 Trl of ρθ + ε θ 2 θ 3 Trl of ρθ θ 2 θ 3 + ε
21 Lkelhood Ro Grden Emon Our pproch bed on he generl de of lkelhood ro emon. We ll fr decrbe he generl de nd hen pply o polcy grden emon Trl of ρθ θ 2 θ 3 Trl of ρθ + ε θ 2 θ 3 Trl of ρθ θ 2 θ 3 + ε
22 Ade: Generl Lkelhood Ro Grden Eme Le F be rel-vlued funcon over fne domn D Everyhng generlze o connuou domn Le X be rndom vrble over D drbued ccordng o P he prmeer vecor of h drbuon Conder he epecon of FX condoned on F X P D E F We wh o eme gven by: P F D D P F Ofen no cloed form for um D cn be huge 22
23 23 Rewrng So ju he epeced vlue of z XFX Ge unbed eme of by vergng over N mple of X j he j h mple of X Only requre bly o mple X nd o compue z Doe no depend on how bg D! log X F X z E F P P F P P P F P D D D z j N j j F z N X P θ Ade: Generl Lkelhood Ro Grden Eme
24 Defne equence of e nd - con genered for ngle epode of In generl wll dffer cro epode. X rndom due o polcy nd envronmen drbued : Defne he ol rewrd of X epeced ol rewrd of We wn o eme he grden of ρθ Apply lkelhood ro mehod! 24 Applcon o Polcy Grden 2 2 X T P R X F R E X F E
25 Recll for rndom vrble X we hve unbed eme We cn genere mple of by runnng polcy from he r e unl ermnl e h mpled epode um of oberved rewrd durng j h epode 25 Applcon o Polcy Grden j N j j F z N 2 2 X 2 2 j j R F log log log log log T T P z Doe no depend on knowng model! Allow model-free mplemenon.
26 Recll for rndom vrble X we hve unbed eme Conder ngle erm Snce con me doe no nfluence rewrd before me + we cn derve he followng reul: h non-rvl o derve Th jufe ung modfed compuon for ech erm: 26 Applcon o Polcy Grden j N j j F z N Tol rewrd fer me. k k R F z log k k R E F z E log k k R log Eme ll unbed bu generlly h mller vrnce.
27 Applcon o Polcy Grden Pung everyhng ogeher we ge: lengh of rjecory j Oberved rewrd fer kng j n e j N j j log j j R j k N j k # of mpled rjecore of curren polcy Drecon o move prmeer n order o ncree he probbly h polcy elec j n e j Inerpreon: ech epode conrbue weghed um of grden drecon Grden drecon for ncreng probbly of j n j weghed by um of rewrd oberved fer kng j n j Inuvely h ncree/decree probbly of kng con h re ypclly followed by good/bd rewrd equence 27
28 Bc Polcy Grden Algorhm Repe unl oppng condon. Eecue for N epode o ge e of e con rewrd equence 2. N j j log j j R j k N j k 3. Unnecery o ore N epode ue onlne men upde n ep 2 Ddvnge: mll # of upde per # epode Alo no well defned for non-epodc nfne horzon problem Onlne polcy grden lgorhm perform upde fer ech ep n envronmen ofen lern fer 28
29 29 Towrd Onlne Algorhm Conder he compuon for ngle epode Noce h we cn compue z n n onlne wy We cn now ncremenlly compue Δ T for ech epode Sorge requremen depend only on # of polcy prmeer k k k z R R R 2 2 log log log ; 0 z z z 0 ; 0 z R Ju reorgnze erm
30 Towrd Onlne Algorhm So he overll grden eme cn be done by ncremenlly compung Δ for ech of N epode nd compung her men The men of he Δ cro epode cn be compued onlne Tol memory requremen ll depend only on # prmeer Independen of lengh nd number of epode! Bu wh f epode go on forever? We could connully mnn Δ T ung conn moun of memory bu we would never cully do prmeer upde Alo Δ T cn hve nfne vrnce n h eng we wll no how h Soluon: Upde polcy prmeer fer ech rewrd oberved rher hn mply upde grden eme Δ T Inroduce dcounng Reul n OLPOMDP lgorhm 30
31 Onlne Polcy Grden OLPOMDP Repe forever. Oberve e 2. Drw con ccordng o drbuon 3. Eecue nd oberve rewrd r 4. ;; dcouned um of ;; grden drecon 5. z z log r z Perform polcy upde ech me ep nd eecue ndefnely Th he OLPOMDP lgorhm [Ber & Brle 2000]
32 Inerpreon Repe forever. Oberve e 2. Drw con ccordng o drbuon 3. Eecue nd oberve rewrd r 4. ;; dcouned um of ;; grden drecon 5. z z log r z Sep 4 compue n elgbly rce z Dcouned um of grden over prevou e-con pr Pon n drecon of prmeer pce h ncree probbly of kng more recen con n more recen e For pove rewrd ep 5 wll ncree probbly of recen con nd decree for negve rewrd.
33 33 Compung he Grden of Polcy Boh lgorhm requre compuon of For he Bolzmnn drbuon wh lner ppromon we hve: where ere he prl dervve compong he grden re: log A Q Q ' ' ˆ ep ˆ ep... ˆ f f f Q n n ' ' ' log f f
34 Connou Acon Spce ow do we repreen ochc polcy over connuou con pce? ech con vecor R d Common Choce: Lern mppng φ from e/npu o d-dmenonl con vecor e.g. neurl nework Polcy d-dm Gun wh men φ nd covrnce Σ π = Gunφ Σ φ S Gunφ Σ... φ d nd Seleced or Lerned 34
35 Conrollng elcoper Polcy grden echnque hve been ued o cree conroller for dffcul helcoper mneuver For emple nvered helcoper flgh. 35
36 Qudruped Locomoon Opmze g of 4-legged robo over rough errn 36
37 Procve Secury Inellgen Bone Conroller Ued OLPOMDP o procvely dcover mmlly dmgng bone ck n peer-o-peer nework
38 Polcy Grden Recp When polce hve much mpler repreenon hn he correpondng vlue funcon drec erch n polcy pce cn be good de Or f we lredy hve comple prmerc conroller polcy grden llow u o focu on opmzng prmeer eng For belne lgorhm he grden eme re unbed.e. hey wll converge o he rgh vlue bu hve hgh vrnce for lrge T Cn requre lrge N o ge relble eme OLPOMDP cn rde-off b nd vrnce v he dcoun prmeer nd doe no requre noon of epode Cn be prone o fndng locl mm Mny wy of delng wh h e.g. rndom rer or nellgen nlzon. 38
39 Oher Trck Much recen work n deep lernng ue polcy grden ofen wh couple of oher rck Addng belne o reduce vrnce Only updng polcy ech ep whn ru regon o vod chngng he polcy oo drclly Tru Regon Polcy Opmzon very nce combnon of hee de. Tru Regon Polcy Opmzon John Schulmn Sergey Levne Phlpp Morz Mchel I. Jordn Peer Abbeel. Inernonl Conference on Mchne Lernng ICML 205. pdf rxv preprn Implemenon vlble. 39
RL for Large State Spaces: Policy Gradient. Alan Fern
RL for Lrge Se Spce: Polcy Grden Aln Fern RL v Polcy Grden Serch So fr ll of our RL echnque hve red o lern n ec or pprome uly funcon or Q-funcon Lern opml vlue of beng n e or kng n con from e. Vlue funcon
More information1.B Appendix to Chapter 1
Secon.B.B Append o Chper.B. The Ordnr Clcl Here re led ome mporn concep rom he ordnr clcl. The Dervve Conder ncon o one ndependen vrble. The dervve o dened b d d lm lm.b. where he ncremen n de o n ncremen
More informationMultivariate Time Series Analysis
Mulvre me Sere Anl Le { : } be Mulvre me ere. Denon: () = men vlue uncon o { : } = E[ ] or. (,) = Lgged covrnce mr o { : } = E{[ - ()][ - ()]'} or, Denon: e me ere { : } onr e jon drbuon o,,, e me e jon
More informationChapter 2: Evaluative Feedback
Chper 2: Evluive Feedbck Evluing cions vs. insrucing by giving correc cions Pure evluive feedbck depends olly on he cion ken. Pure insrucive feedbck depends no ll on he cion ken. Supervised lerning is
More informationCalculus 241, section 12.2 Limits/Continuity & 12.3 Derivatives/Integrals notes by Tim Pilachowski r r r =, with a domain of real ( )
Clculu 4, econ Lm/Connuy & Devve/Inel noe y Tm Plchow, wh domn o el Wh we hve o : veco-vlued uncon, ( ) ( ) ( ) j ( ) nume nd ne o veco The uncon, nd A w done wh eul uncon ( x) nd connuy e he componen
More information10/20/2009. Announcements. Last Time: Q-Learning. Example: Pacman. Function Approximation. Feature-Based Representations
Introducton to Artfcl Intellgence V.047-00 Fll 009 Lecture : Renforcement Lernng Announcement Agnment due next Mondy t mdnght Plee end eml to me bout fnl exm Rob Fergu Dept of Computer Scence, Cournt Inttute,
More information(,,, ) (,,, ). In addition, there are three other consumers, -2, -1, and 0. Consumer -2 has the utility function
MACROECONOMIC THEORY T J KEHOE ECON 87 SPRING 5 PROBLEM SET # Conder an overlappng generaon economy le ha n queon 5 on problem e n whch conumer lve for perod The uly funcon of he conumer born n perod,
More information8. INVERSE Z-TRANSFORM
8. INVERSE Z-TRANSFORM The proce by whch Z-trnform of tme ere, nmely X(), returned to the tme domn clled the nvere Z-trnform. The nvere Z-trnform defned by: Computer tudy Z X M-fle trn.m ued to fnd nvere
More informationModeling and Predicting Sequences: HMM and (may be) CRF. Amr Ahmed Feb 25
Modelg d redcg Sequeces: HMM d m be CRF Amr Ahmed 070 Feb 25 Bg cure redcg Sgle Lbel Ipu : A se of feures: - Bg of words docume - Oupu : Clss lbel - Topc of he docume - redcg Sequece of Lbels Noo Noe:
More informationInterval Estimation. Consider a random variable X with a mean of X. Let X be distributed as X X
ECON 37: Ecoomercs Hypohess Tesg Iervl Esmo Wh we hve doe so fr s o udersd how we c ob esmors of ecoomcs reloshp we wsh o sudy. The queso s how comforble re we wh our esmors? We frs exme how o produce
More informationHidden Markov Model. a ij. Observation : O1,O2,... States in time : q1, q2,... All states : s1, s2,..., sn
Hdden Mrkov Model S S servon : 2... Ses n me : 2... All ses : s s2... s 2 3 2 3 2 Hdden Mrkov Model Con d Dscree Mrkov Model 2 z k s s s s s s Degree Mrkov Model Hdden Mrkov Model Con d : rnson roly from
More informatione t dt e t dt = lim e t dt T (1 e T ) = 1
Improper Inegrls There re wo ypes of improper inegrls - hose wih infinie limis of inegrion, nd hose wih inegrnds h pproch some poin wihin he limis of inegrion. Firs we will consider inegrls wih infinie
More informationMinimum Squared Error
Minimum Squred Error LDF: Minimum Squred-Error Procedures Ide: conver o esier nd eer undersood prolem Percepron y i > 0 for ll smples y i solve sysem of liner inequliies MSE procedure y i i for ll smples
More informationMinimum Squared Error
Minimum Squred Error LDF: Minimum Squred-Error Procedures Ide: conver o esier nd eer undersood prolem Percepron y i > for ll smples y i solve sysem of liner inequliies MSE procedure y i = i for ll smples
More informationA Kalman filtering simulation
A Klmn filering simulion The performnce of Klmn filering hs been esed on he bsis of wo differen dynmicl models, ssuming eiher moion wih consn elociy or wih consn ccelerion. The former is epeced o beer
More informationSupporting information How to concatenate the local attractors of subnetworks in the HPFP
n Effcen lgorh for Idenfyng Prry Phenoype rcors of Lrge-Scle Boolen Newor Sng-Mo Choo nd Kwng-Hyun Cho Depren of Mhecs Unversy of Ulsn Ulsn 446 Republc of Kore Depren of Bo nd Brn Engneerng Kore dvnced
More information4.8 Improper Integrals
4.8 Improper Inegrls Well you ve mde i hrough ll he inegrion echniques. Congrs! Unforunely for us, we sill need o cover one more inegrl. They re clled Improper Inegrls. A his poin, we ve only del wih inegrls
More informationA. Inventory model. Why are we interested in it? What do we really study in such cases.
Some general yem model.. Inenory model. Why are we nereed n? Wha do we really udy n uch cae. General raegy of machng wo dmlar procee, ay, machng a fa proce wh a low one. We need an nenory or a buffer or
More informationBipartite Matching. Matching. Bipartite Matching. Maxflow Formulation
Mching Inpu: undireced grph G = (V, E). Biprie Mching Inpu: undireced, biprie grph G = (, E).. Mching Ern Myr, Hrld äcke Biprie Mching Inpu: undireced, biprie grph G = (, E). Mflow Formulion Inpu: undireced,
More informationCHAPTER 10: LINEAR DISCRIMINATION
CHAPER : LINEAR DISCRIMINAION Dscrmnan-based Classfcaon 3 In classfcaon h K classes (C,C,, C k ) We defned dscrmnan funcon g j (), j=,,,k hen gven an es eample, e chose (predced) s class label as C f g
More informationReinforcement learning
Reinforcement lerning Regulr MDP Given: Trnition model P Rewrd function R Find: Policy π Reinforcement lerning Trnition model nd rewrd function initilly unknown Still need to find the right policy Lern
More informationMotion. Part 2: Constant Acceleration. Acceleration. October Lab Physics. Ms. Levine 1. Acceleration. Acceleration. Units for Acceleration.
Moion Accelerion Pr : Consn Accelerion Accelerion Accelerion Accelerion is he re of chnge of velociy. = v - vo = Δv Δ ccelerion = = v - vo chnge of velociy elpsed ime Accelerion is vecor, lhough in one-dimensionl
More informationLecture 6: Learning for Control (Generalised Linear Regression)
Lecure 6: Learnng for Conrol (Generalsed Lnear Regresson) Conens: Lnear Mehods for Regresson Leas Squares, Gauss Markov heorem Recursve Leas Squares Lecure 6: RLSC - Prof. Sehu Vjayakumar Lnear Regresson
More informationChapter 6. Isoparametric Formulation
ME 78 FIIE ELEME MEHOD Chper. Ioprerc Forlon Se fncon h ed o defne he eleen geoer ed o defne he dplceen whn he eleen ode r Eleen Lner geoer Lner dplceen ode Be Eleen Qdrc geoer Qdrc dplceen We gn he e
More informationLAPLACE TRANSFORMS. 1. Basic transforms
LAPLACE TRANSFORMS. Bic rnform In hi coure, Lplce Trnform will be inroduced nd heir properie exmined; ble of common rnform will be buil up; nd rnform will be ued o olve ome dierenil equion by rnforming
More informationLecture 11: Stereo and Surface Estimation
Lecure : Sereo and Surface Emaon When camera poon have been deermned, ung rucure from moon, we would lke o compue a dene urface model of he cene. In h lecure we wll udy he o called Sereo Problem, where
More informationHidden Markov Models. A Specific Form of Process.. Doubly Stochastic Processes. What a sensible agent must do. A Common Trait
-755 Mchne Lernng for Sgnl roceng Hdden Mrkov Model 04 Oc 0 redcon : holy grl hycl recore Auooble rocke hevenly bode Nurl phenoen Weher Fnncl d Sock rke World ffr Who gong o hve he ne prng? Sgnl Audo vdeo..
More informationLecture VI Regression
Lecure VI Regresson (Lnear Mehods for Regresson) Conens: Lnear Mehods for Regresson Leas Squares, Gauss Markov heorem Recursve Leas Squares Lecure VI: MLSC - Dr. Sehu Vjayakumar Lnear Regresson Model M
More informationAdvanced Electromechanical Systems (ELE 847)
(ELE 847) Dr. Smr ouro-rener Topc 1.4: DC moor speed conrol Torono, 2009 Moor Speed Conrol (open loop conrol) Consder he followng crcu dgrm n V n V bn T1 T 5 T3 V dc r L AA e r f L FF f o V f V cn T 4
More informationSolution in semi infinite diffusion couples (error function analysis)
Soluon n sem nfne dffuson couples (error funcon analyss) Le us consder now he sem nfne dffuson couple of wo blocks wh concenraon of and I means ha, n a A- bnary sysem, s bondng beween wo blocks made of
More informationLecture 11 SVM cont
Lecure SVM con. 0 008 Wha we have done so far We have esalshed ha we wan o fnd a lnear decson oundary whose margn s he larges We know how o measure he margn of a lnear decson oundary Tha s: he mnmum geomerc
More informationAdvanced Machine Learning & Perception
Advanced Machne Learnng & Percepon Insrucor: Tony Jebara SVM Feaure & Kernel Selecon SVM Eensons Feaure Selecon (Flerng and Wrappng) SVM Feaure Selecon SVM Kernel Selecon SVM Eensons Classfcaon Feaure/Kernel
More informationVisual Robot Homing using Sarsa(λ), Whole Image Measure, and Radial Basis Function.
Vsul Robo Homng usng Srs(λ, Whole Imge Mesure, nd Rdl Bss Funcon. Abdulrhmn Alhhn, Kevn Burn, Sefn Wermer Hybrd Inellgen Sysems Reserch Group, School of Compung nd echnology, Unversy of Sunderlnd, SR6
More informationControl Systems. Mathematical Modeling of Control Systems.
Conrol Syem Mahemacal Modelng of Conrol Syem chbum@eoulech.ac.kr Oulne Mahemacal model and model ype. Tranfer funcon model Syem pole and zero Chbum Lee -Seoulech Conrol Syem Mahemacal Model Model are key
More informationThe solution is often represented as a vector: 2xI + 4X2 + 2X3 + 4X4 + 2X5 = 4 2xI + 4X2 + 3X3 + 3X4 + 3X5 = 4. 3xI + 6X2 + 6X3 + 3X4 + 6X5 = 6.
[~ o o :- o o ill] i 1. Mrices, Vecors, nd Guss-Jordn Eliminion 1 x y = = - z= The soluion is ofen represened s vecor: n his exmple, he process of eliminion works very smoohly. We cn elimine ll enries
More informationFlow Networks Alon Efrat Slides courtesy of Charles Leiserson with small changes by Carola Wenk. Flow networks. Flow networks CS 445
CS 445 Flow Nework lon Efr Slide corey of Chrle Leieron wih mll chnge by Crol Wenk Flow nework Definiion. flow nework i direced grph G = (V, E) wih wo diingihed erice: orce nd ink. Ech edge (, ) E h nonnegie
More information( ) () we define the interaction representation by the unitary transformation () = ()
Hgher Order Perurbaon Theory Mchael Fowler 3/7/6 The neracon Represenaon Recall ha n he frs par of hs course sequence, we dscussed he chrödnger and Hesenberg represenaons of quanum mechancs here n he chrödnger
More informationVariants of Pegasos. December 11, 2009
Inroducon Varans of Pegasos SooWoong Ryu bshboy@sanford.edu December, 009 Youngsoo Cho yc344@sanford.edu Developng a new SVM algorhm s ongong research opc. Among many exng SVM algorhms, we wll focus on
More informationA Demand System for Input Factors when there are Technological Changes in Production
A Demand Syem for Inpu Facor when here are Technologcal Change n Producon Movaon Due o (e.g.) echnologcal change here mgh no be a aonary relaonhp for he co hare of each npu facor. When emang demand yem
More informationII The Z Transform. Topics to be covered. 1. Introduction. 2. The Z transform. 3. Z transforms of elementary functions
II The Z Trnsfor Tocs o e covered. Inroducon. The Z rnsfor 3. Z rnsfors of eleenry funcons 4. Proeres nd Theory of rnsfor 5. The nverse rnsfor 6. Z rnsfor for solvng dfference equons II. Inroducon The
More informationgraph of unit step function t
.5 Piecewie coninuou forcing funcion...e.g. urning he forcing on nd off. The following Lplce rnform meril i ueful in yem where we urn forcing funcion on nd off, nd when we hve righ hnd ide "forcing funcion"
More informationDecompression diagram sampler_src (source files and makefiles) bin (binary files) --- sh (sample shells) --- input (sample input files)
. Iroduco Probblsc oe-moh forecs gudce s mde b 50 esemble members mproved b Model Oupu scs (MO). scl equo s mde b usg hdcs d d observo d. We selec some prmeers for modfg forecs o use mulple regresso formul.
More informationA Group Key Management Scheme Based on Random Transmission for VANET Liang PANG1, b, Da WEI1, 2, Qi ZHAO3, Jianqi ZHU1, 2, a
6h Inernonl Conference on Mchnery, Merl, Envronmen, Boechnology nd Compuer (MMEBC 06) A Group ey Mngemen cheme Bed on Rndom rnmon for ANE ng PANG, b, D WEI,, Q ZHAO3, Jn ZHU,, College of Compuer cence
More informationISSN 075-7 : (7) 0 007 C ( ), E-l: ssolos@glco FPGA LUT FPGA EM : FPGA, LUT, EM,,, () FPGA (feldprogrble ge rrs) [, ] () [], () [] () [5] [6] FPGA LUT (Look-Up-Tbles) EM (Ebedded Meor locks) [7, 8] LUT
More informationPolicy optimization. Stochastic approach
Polcy opmzo Sochc pproch Dcree-me Mrkov Proce Sory Mrkov ch Sochc proce over fe e e S S {.. 2 S} Oe ep ro probbly: Prob j - p j Se ro me: geomerc drbuo Prob j T p j p - 2 Dcree-me Mrkov Proce Sory corollble
More informationVolatility Interpolation
Volaly Inerpolaon Prelmnary Verson March 00 Jesper Andreasen and Bran Huge Danse Mares, Copenhagen wan.daddy@danseban.com brno@danseban.com Elecronc copy avalable a: hp://ssrn.com/absrac=69497 Inro Local
More informationTransformations. Ordered set of numbers: (1,2,3,4) Example: (x,y,z) coordinates of pt in space. Vectors
Trnformion Ordered e of number:,,,4 Emple:,,z coordine of p in pce. Vecor If, n i i, K, n, i uni ecor Vecor ddiion +w, +, +, + V+w w Sclr roduc,, Inner do roduc α w. w +,.,. The inner produc i SCLR!. w,.,
More informationReview: Transformations. Transformations - Viewing. Transformations - Modeling. world CAMERA OBJECT WORLD CSE 681 CSE 681 CSE 681 CSE 681
Revew: Trnsforons Trnsforons Modelng rnsforons buld cople odels b posonng (rnsforng sple coponens relve o ech oher ewng rnsforons plcng vrul cer n he world rnsforon fro world coordnes o cer coordnes Perspecve
More informationHow about the more general "linear" scalar functions of scalars (i.e., a 1st degree polynomial of the following form with a constant term )?
lmcd Lnear ransformaon of a vecor he deas presened here are que general hey go beyond he radonal mar-vecor ype seen n lnear algebra Furhermore, hey do no deal wh bass and are equally vald for any se of
More informationAverage & instantaneous velocity and acceleration Motion with constant acceleration
Physics 7: Lecure Reminders Discussion nd Lb secions sr meeing ne week Fill ou Pink dd/drop form if you need o swich o differen secion h is FULL. Do i TODAY. Homework Ch. : 5, 7,, 3,, nd 6 Ch.: 6,, 3 Submission
More informationNotes on the stability of dynamic systems and the use of Eigen Values.
Noes on he sabl of dnamc ssems and he use of Egen Values. Source: Macro II course noes, Dr. Davd Bessler s Tme Seres course noes, zarads (999) Ineremporal Macroeconomcs chaper 4 & Techncal ppend, and Hamlon
More informationDepartment of Economics University of Toronto
Deparmen of Economcs Unversy of Torono ECO408F M.A. Economercs Lecure Noes on Heeroskedascy Heeroskedascy o Ths lecure nvolves lookng a modfcaons we need o make o deal wh he regresson model when some of
More information4. Runge-Kutta Formula For Differential Equations
NCTU Deprme o Elecrcl d Compuer Egeerg 5 Sprg Course by Pro. Yo-Pg Ce. Ruge-Ku Formul For Derel Equos To solve e derel equos umerclly e mos useul ormul s clled Ruge-Ku ormul
More informationTHE EXISTENCE OF SOLUTIONS FOR A CLASS OF IMPULSIVE FRACTIONAL Q-DIFFERENCE EQUATIONS
Europen Journl of Mhemcs nd Compuer Scence Vol 4 No, 7 SSN 59-995 THE EXSTENCE OF SOLUTONS FOR A CLASS OF MPULSVE FRACTONAL Q-DFFERENCE EQUATONS Shuyun Wn, Yu Tng, Q GE Deprmen of Mhemcs, Ynbn Unversy,
More informationOPERATOR-VALUED KERNEL RECURSIVE LEAST SQUARES ALGORITHM
3rd Europen Sgnl Processng Conference EUSIPCO OPERATOR-VALUED KERNEL RECURSIVE LEAST SQUARES ALGORITM P. O. Amblrd GIPSAlb/CNRS UMR 583 Unversé de Grenoble Grenoble, Frnce. Kdr LIF/CNRS UMR 779 Ax-Mrselle
More informationOutline. Probabilistic Model Learning. Probabilistic Model Learning. Probabilistic Model for Time-series Data: Hidden Markov Model
Probablsc Model for Tme-seres Daa: Hdden Markov Model Hrosh Mamsuka Bonformacs Cener Kyoo Unversy Oulne Three Problems for probablsc models n machne learnng. Compung lkelhood 2. Learnng 3. Parsng (predcon
More informationArtificial Intelligence Markov Decision Problems
rtificil Intelligence Mrkov eciion Problem ilon - briefly mentioned in hpter Ruell nd orvig - hpter 7 Mrkov eciion Problem; pge of Mrkov eciion Problem; pge of exmple: probbilitic blockworld ction outcome
More informationReinforcement learning
CS 75 Mchine Lening Lecue b einfocemen lening Milos Huskech milos@cs.pi.edu 539 Senno Sque einfocemen lening We wn o len conol policy: : X A We see emples of bu oupus e no given Insed of we ge feedbck
More informationSection P.1 Notes Page 1 Section P.1 Precalculus and Trigonometry Review
Secion P Noe Pge Secion P Preclculu nd Trigonomer Review ALGEBRA AND PRECALCULUS Eponen Lw: Emple: 8 Emple: Emple: Emple: b b Emple: 9 EXAMPLE: Simplif: nd wrie wi poiive eponen Fir I will flip e frcion
More informationGo over vector and vector algebra Displacement and position in 2-D Average and instantaneous velocity in 2-D Average and instantaneous acceleration
Mh Csquee Go oe eco nd eco lgeb Dsplcemen nd poson n -D Aege nd nsnneous eloc n -D Aege nd nsnneous cceleon n -D Poecle moon Unfom ccle moon Rele eloc* The componens e he legs of he gh ngle whose hpoenuse
More informationLinear Response Theory: The connection between QFT and experiments
Phys540.nb 39 3 Lnear Response Theory: The connecon beween QFT and expermens 3.1. Basc conceps and deas Q: ow do we measure he conducvy of a meal? A: we frs nroduce a weak elecrc feld E, and hen measure
More informationForms of Energy. Mass = Energy. Page 1. SPH4U: Introduction to Work. Work & Energy. Particle Physics:
SPH4U: Inroducion o ork ork & Energy ork & Energy Discussion Definiion Do Produc ork of consn force ork/kineic energy heore ork of uliple consn forces Coens One of he os iporn conceps in physics Alernive
More informationOrdinary Differential Equations in Neuroscience with Matlab examples. Aim 1- Gain understanding of how to set up and solve ODE s
Ordnary Dfferenal Equaons n Neuroscence wh Malab eamples. Am - Gan undersandng of how o se up and solve ODE s Am Undersand how o se up an solve a smple eample of he Hebb rule n D Our goal a end of class
More informationIntroduction. Section 9: HIGHER ORDER TWO DIMENSIONAL SHAPE FUNCTIONS
Secon 9: HIGHER ORDER TWO DIMESIO SHPE FUCTIOS Inroducon We ne conder hpe funcon for hgher order eleen. To do h n n orderl fhon we nroduce he concep of re coordne. Conder ere of rngulr eleen depced n he
More informationReinforcement Learning and Policy Reuse
Reinforcement Lerning nd Policy Reue Mnuel M. Veloo PEL Fll 206 Reding: Reinforcement Lerning: An Introduction R. Sutton nd A. Brto Probbilitic policy reue in reinforcement lerning gent Fernndo Fernndez
More informationCS434a/541a: Pattern Recognition Prof. Olga Veksler. Lecture 4
CS434a/54a: Paern Recognon Prof. Olga Veksler Lecure 4 Oulne Normal Random Varable Properes Dscrmnan funcons Why Normal Random Varables? Analycally racable Works well when observaon comes form a corruped
More informationMathematics 805 Final Examination Answers
. 5 poins Se he Weiersrss M-es. Mhemics 85 Finl Eminion Answers Answer: Suppose h A R, nd f n : A R. Suppose furher h f n M n for ll A, nd h Mn converges. Then f n converges uniformly on A.. 5 poins Se
More information4. Runge-Kutta Formula For Differential Equations. A. Euler Formula B. Runge-Kutta Formula C. An Example for Fourth-Order Runge-Kutta Formula
NCTU Deprme o Elecrcl d Compuer Egeerg Seor Course By Pro. Yo-Pg Ce. Ruge-Ku Formul For Derel Equos A. Euler Formul B. Ruge-Ku Formul C. A Emple or Four-Order Ruge-Ku Formul
More informationUnscented Transformation Unscented Kalman Filter
Usceed rsformo Usceed Klm Fler Usceed rcle Fler Flerg roblem Geerl roblem Seme where s he se d s he observo Flerg s he problem of sequell esmg he ses (prmeers or hdde vrbles) of ssem s se of observos become
More informationFTCS Solution to the Heat Equation
FTCS Soluon o he Hea Equaon ME 448/548 Noes Gerald Reckenwald Porland Sae Unversy Deparmen of Mechancal Engneerng gerry@pdxedu ME 448/548: FTCS Soluon o he Hea Equaon Overvew Use he forward fne d erence
More informationBellman Optimality Equation for V*
Bellmn Optimlity Eqution for V* The vlue of stte under n optiml policy must equl the expected return for the best ction from tht stte: V (s) mx Q (s,) A(s) mx A(s) mx A(s) Er t 1 V (s t 1 ) s t s, t s
More information0 for t < 0 1 for t > 0
8.0 Sep nd del funcions Auhor: Jeremy Orloff The uni Sep Funcion We define he uni sep funcion by u() = 0 for < 0 for > 0 I is clled he uni sep funcion becuse i kes uni sep = 0. I is someimes clled he Heviside
More informationChapters 2 Kinematics. Position, Distance, Displacement
Chapers Knemacs Poson, Dsance, Dsplacemen Mechancs: Knemacs and Dynamcs. Knemacs deals wh moon, bu s no concerned wh he cause o moon. Dynamcs deals wh he relaonshp beween orce and moon. The word dsplacemen
More informationPHYSICS 1210 Exam 1 University of Wyoming 14 February points
PHYSICS 1210 Em 1 Uniersiy of Wyoming 14 Februry 2013 150 poins This es is open-noe nd closed-book. Clculors re permied bu compuers re no. No collborion, consulion, or communicion wih oher people (oher
More informationReinforcement Learning for a New Piano Mover s Problem
Renforcemen Lernng for New Pno Mover s Problem Yuko ISHIWAKA Hkode Nonl College of Technology, Hkode, Hokkdo, Jpn Tomohro YOSHIDA Murorn Insue of Technology, Murorn, Hokkdo, Jpn nd Yuknor KAKAZU Reserch
More informationt s (half of the total time in the air) d?
.. In Cl or Homework Eercie. An Olmpic long jumper i cpble of jumping 8.0 m. Auming hi horizonl peed i 9.0 m/ he lee he ground, how long w he in he ir nd how high did he go? horizonl? 8.0m 9.0 m / 8.0
More informationRotations.
oons j.lbb@phscs.o.c.uk To s summ Fmes of efeence Invnce une nsfomons oon of wve funcon: -funcons Eule s ngles Emple: e e - - Angul momenum s oon geneo Genec nslons n Noehe s heoem Fmes of efeence Conse
More informationScattering at an Interface: Oblique Incidence
Course Insrucor Dr. Raymond C. Rumpf Offce: A 337 Phone: (915) 747 6958 E Mal: rcrumpf@uep.edu EE 4347 Appled Elecromagnecs Topc 3g Scaerng a an Inerface: Oblque Incdence Scaerng These Oblque noes may
More informationBEST PATTERN OF MULTIPLE LINEAR REGRESSION
ERI COADA GERMAY GEERAL M.R. SEFAIK AIR FORCE ACADEMY ARMED FORCES ACADEMY ROMAIA SLOVAK REPUBLIC IERAIOAL COFERECE of SCIEIFIC PAPER AFASES Brov 6-8 M BES PAER OF MULIPLE LIEAR REGRESSIO Corel GABER PEROLEUM-GAS
More informationPhysics 201 Lecture 2
Physcs 1 Lecure Lecure Chper.1-. Dene Poson, Dsplcemen & Dsnce Dsngush Tme nd Tme Inerl Dene Velocy (Aerge nd Insnneous), Speed Dene Acceleron Undersnd lgebrclly, hrough ecors, nd grphclly he relonshps
More informationChapter 2 Linear Mo on
Chper Lner M n .1 Aerge Velcy The erge elcy prcle s dened s The erge elcy depends nly n he nl nd he nl psns he prcle. Ths mens h prcle srs rm pn nd reurn bck he sme pn, s dsplcemen, nd s s erge elcy s
More informationA NEW INTERPRETATION OF INTERVAL-VALUED FUZZY INTERIOR IDEALS OF ORDERED SEMIGROUPS
ScInLhore),7),9-37,4 ISSN 3-536; CODEN: SINTE 8 9 A NEW INTERPRETATION O INTERVAL-VALUED UZZY INTERIOR IDEALS O ORDERED SEMIGROUPS Hdy Ullh Khn, b, Nor Hnz Srmn, Asghr Khn c nd z Muhmmd Khn d Deprmen of
More informationIX.1.1 The Laplace Transform Definition 700. IX.1.2 Properties 701. IX.1.3 Examples 702. IX.1.4 Solution of IVP for ODEs 704
Chper IX The Inegrl Trnform Mehod IX. The plce Trnform November 4, 7 699 IX. THE APACE TRANSFORM IX.. The plce Trnform Definiion 7 IX.. Properie 7 IX..3 Emple 7 IX..4 Soluion of IVP for ODE 74 IX..5 Soluion
More informationKinematics Quantities. Linear Motion. Coordinate System. Kinematics Quantities. Velocity. Position. Don t Forget Units!
Knemtc Quntte Lner Phyc 11 Eyre Tme Intnt t Fundmentl Tme Interl t Dened Poton Fundmentl Dplcement Dened Aerge g Dened Aerge Accelerton g Dened Knemtc Quntte Scler: Mgntude Tme Intnt, Tme Interl nd Speed
More information2010 Sectional Physics Solution Set
. Crrec nwer: D WYSE CDEMIC CHLLENGE Secnl hyc E 00 Slun Se y 0 y 4.0 / 9.8 /.45 y. Crrec nwer: y 8 0 / 8 /. Crrec nwer: E y y 0 ( 4 / ) ( 4.9 / ) 5.6 y y 4. Crrec nwer: E 5. Crrec nwer: The e rce c n
More informationIX.1.1 The Laplace Transform Definition 700. IX.1.2 Properties 701. IX.1.3 Examples 702. IX.1.4 Solution of IVP for ODEs 704
Chper IX The Inegrl Trnform Mehod IX. The plce Trnform November 6, 8 699 IX. THE APACE TRANSFORM IX.. The plce Trnform Definiion 7 IX.. Properie 7 IX..3 Emple 7 IX..4 Soluion of IVP for ODE 74 IX..5 Soluion
More informationPen Tip Position Estimation Using Least Square Sphere Fitting for Customized Attachments of Haptic Device
for Cuomed Ahmen of Hp Deve Mno KOEDA nd Mhko KAO Deprmen of Compuer Sene Ful of Informon Sene nd Ar Ok Elero-Communon Unver Kok 30-70, Shjonwe, Ok, 575-0063, JAPA {koed, 0809@oeu.jp} Ar In h pper, mehod
More informationChapter Simpson s 1/3 Rule of Integration. ( x)
Cper 7. Smpso s / Rule o Iegro Aer redg s per, you sould e le o. derve e ormul or Smpso s / rule o egro,. use Smpso s / rule o solve egrls,. develop e ormul or mulple-segme Smpso s / rule o egro,. use
More informationFINANCIAL ECONOMETRICS
FINANCIAL ECONOMETRICS SPRING 07 WEEK IV NONLINEAR MODELS Prof. Dr. Burç ÜLENGİN Nonlner NONLINEARITY EXISTS IN FINANCIAL TIME SERIES ESPECIALLY IN VOLATILITY AND HIGH FREQUENCY DATA LINEAR MODEL IS DEFINED
More informationSeptember 20 Homework Solutions
College of Engineering nd Compuer Science Mechnicl Engineering Deprmen Mechnicl Engineering A Seminr in Engineering Anlysis Fll 7 Number 66 Insrucor: Lrry Creo Sepember Homework Soluions Find he specrum
More informationCSC 373: Algorithm Design and Analysis Lecture 9
CSC 373: Algorihm Deign n Anlyi Leure 9 Alln Boroin Jnury 28, 2013 1 / 16 Leure 9: Announemen n Ouline Announemen Prolem e 1 ue hi Friy. Term Te 1 will e hel nex Mony, Fe in he uoril. Two nnounemen o follow
More informationSSRG International Journal of Thermal Engineering (SSRG-IJTE) Volume 4 Issue 1 January to April 2018
SSRG Inernaonal Journal of Thermal Engneerng (SSRG-IJTE) Volume 4 Iue 1 January o Aprl 18 Opmal Conrol for a Drbued Parameer Syem wh Tme-Delay, Non-Lnear Ung he Numercal Mehod. Applcaon o One- Sded Hea
More informationLecture 18: The Laplace Transform (See Sections and 14.7 in Boas)
Lecure 8: The Lalace Transform (See Secons 88- and 47 n Boas) Recall ha our bg-cure goal s he analyss of he dfferenal equaon, ax bx cx F, where we emloy varous exansons for he drvng funcon F deendng on
More informationGMM parameter estimation. Xiaoye Lu CMPS290c Final Project
GMM paraeer esaon Xaoye Lu M290c Fnal rojec GMM nroducon Gaussan ure Model obnaon of several gaussan coponens Noaon: For each Gaussan dsrbuon:, s he ean and covarance ar. A GMM h ures(coponens): p ( 2π
More informationf t f a f x dx By Lin McMullin f x dx= f b f a. 2
Accumulion: Thoughs On () By Lin McMullin f f f d = + The gols of he AP* Clculus progrm include he semen, Sudens should undersnd he definie inegrl s he ne ccumulion of chnge. 1 The Topicl Ouline includes
More informationTesting a new idea to solve the P = NP problem with mathematical induction
Tesng a new dea o solve he P = NP problem wh mahemacal nducon Bacground P and NP are wo classes (ses) of languages n Compuer Scence An open problem s wheher P = NP Ths paper ess a new dea o compare he
More informationHidden Markov Models Following a lecture by Andrew W. Moore Carnegie Mellon University
Hdden Markov Models Followng a lecure by Andrew W. Moore Carnege Mellon Unversy www.cs.cmu.edu/~awm/uorals A Markov Sysem Has N saes, called s, s 2.. s N s 2 There are dscree meseps, 0,, s s 3 N 3 0 Hdden
More informationPartially Observable Systems. 1 Partially Observable Markov Decision Process (POMDP) Formalism
CS294-40 Lernng for Rootcs nd Control Lecture 10-9/30/2008 Lecturer: Peter Aeel Prtlly Oservle Systems Scre: Dvd Nchum Lecture outlne POMDP formlsm Pont-sed vlue terton Glol methods: polytree, enumerton,
More informationEssential Prime Implicants
CS/EE 4/4: Comuer Aded Desgn of Dgl Crcus Chrs J. yers Lecure 8: rme Selecon nd UC Redng: Cher 4.-9 Essenl rme Imlcns Cube s n essenl rme mlcn of f conns mnerm no conned by ny oher rme mlcn of. f f All
More informationDeveloping Communication Strategy for Multi-Agent Systems with Incremental Fuzzy Model
(IJACSA) Inernonl Journl of Advnced Compuer Scence nd Applcons, Developng Communcon Sregy for Mul-Agen Sysems wh Incremenl Fuzzy Model Sm Hmzeloo, Mnsoor Zolghdr Jhrom Deprmen of Compuer Scence nd Engneerng
More information