Chapter 2: Evaluative Feedback
- Emerald Fletcher
- 5 years ago
1 Chapter 2: Evaluative Feedback
- Evaluating actions vs. instructing by giving correct actions
- Pure evaluative feedback depends totally on the action taken. Pure instructive feedback depends not at all on the action taken.
- Supervised learning is instructive; optimization is evaluative
- Associative vs. nonassociative:
  - Associative: inputs mapped to outputs; learn the best output for each input
  - Nonassociative: learn (find) one best output
- The n-armed bandit (at least how we treat it) is:
  - Nonassociative
  - Evaluative feedback
R. S. Sutton and A. G. Barto: Reinforcement Learning: An Introduction

2 The n-Armed Bandit Problem
- Choose repeatedly from one of n actions; each choice is called a play
- After each play $a_t$, you get a reward $r_t$, where $E\{r_t \mid a_t\} = Q^*(a_t)$
- These are unknown action values
- The distribution of $r_t$ depends only on $a_t$
- The objective is to maximize the reward in the long term, e.g., over 1000 plays
- To solve the n-armed bandit problem, you must explore a variety of actions and then exploit the best of them

3 The Exploration/Exploitation Dilemma
- Suppose you form action-value estimates $Q_t(a) \approx Q^*(a)$
- The greedy action at $t$ is $a_t^* = \arg\max_a Q_t(a)$
- $a_t = a_t^*$ is exploitation; $a_t \neq a_t^*$ is exploration
- You can't exploit all the time; you can't explore all the time
- You can never stop exploring; but you should always reduce exploring

4 Action-Value Methods
- Methods that adapt action-value estimates and nothing else, e.g.: suppose that by the $t$-th play, action $a$ had been chosen $k_a$ times, producing rewards $r_1, r_2, \ldots, r_{k_a}$. Then the sample average is
  $Q_t(a) = \dfrac{r_1 + r_2 + \cdots + r_{k_a}}{k_a}$
- $\lim_{k_a \to \infty} Q_t(a) = Q^*(a)$
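
The sample-average estimate can be sketched in a few lines. This is only an illustration: the function name `sample_average_estimates` and the `(action, reward)` history format are assumptions made here, not part of the slides.

```python
from collections import defaultdict


def sample_average_estimates(history):
    """Return {action: mean reward} from a list of (action, reward) pairs.

    Hypothetical helper: actions that were never played get no entry.
    """
    totals = defaultdict(float)  # sum of rewards per action
    counts = defaultdict(int)    # k_a, number of times each action was chosen
    for action, reward in history:
        totals[action] += reward
        counts[action] += 1
    return {a: totals[a] / counts[a] for a in counts}


if __name__ == "__main__":
    print(sample_average_estimates([(0, 1.0), (0, 3.0), (1, 5.0)]))
```

Storing every reward is not required; only the running total and count per action are kept here.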

5 ε-Greedy Action Selection
- Greedy action selection: $a_t = a_t^* = \arg\max_a Q_t(a)$
- ε-greedy: $a_t = a_t^*$ with probability $1 - \varepsilon$, or a random action with probability $\varepsilon$
- ... the simplest way to try to balance exploration and exploitation
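
A minimal sketch of ε-greedy selection over a list of current estimates follows. The name `epsilon_greedy`, the `rng` parameter, and the random tie-breaking among equal estimates are assumptions for illustration; the slides do not specify how ties are broken.

```python
import random


def epsilon_greedy(q_values, epsilon, rng=random):
    """Return an action index: random with probability epsilon, else greedy."""
    if rng.random() < epsilon:
        # Explore: any action, uniformly at random.
        return rng.randrange(len(q_values))
    # Exploit: pick among the actions with the highest estimate.
    best = max(q_values)
    return rng.choice([a for a, q in enumerate(q_values) if q == best])


if __name__ == "__main__":
    print(epsilon_greedy([0.1, 0.9, 0.3], epsilon=0.1))
```

Setting `epsilon=0` recovers purely greedy selection.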

6 10-Armed Testbed
- $n = 10$ possible actions
- Each $Q^*(a)$ is chosen randomly from a normal distribution: $\eta(0, 1)$
- Each $r_t$ is also normal: $\eta(Q^*(a_t), 1)$
- 1000 plays
- Repeat the whole thing 2000 times and average the results
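
One task of the testbed can be sketched as below, assuming an ε-greedy learner with sample-average estimates. The function names, the `rng` parameter, and the choice to report the fraction of optimal plays are illustrative assumptions; averaging over 2000 independent tasks would simply call `run_epsilon_greedy` repeatedly.

```python
import random


def make_testbed(n=10, rng=random):
    """Draw the true action values Q*(a) from a standard normal."""
    return [rng.gauss(0.0, 1.0) for _ in range(n)]


def pull(q_star, action, rng=random):
    """Sample a reward from a unit-variance normal centered on Q*(action)."""
    return rng.gauss(q_star[action], 1.0)


def run_epsilon_greedy(epsilon, plays=1000, n=10, rng=random):
    """Run one task; return the fraction of plays that chose the optimal action."""
    q_star = make_testbed(n, rng)
    optimal = max(range(n), key=lambda a: q_star[a])
    q, counts, hits = [0.0] * n, [0] * n, 0
    for _ in range(plays):
        if rng.random() < epsilon:
            a = rng.randrange(n)                      # explore
        else:
            a = max(range(n), key=lambda i: q[i])     # exploit
        r = pull(q_star, a, rng)
        counts[a] += 1
        q[a] += (r - q[a]) / counts[a]                # sample-average update
        hits += (a == optimal)
    return hits / plays


if __name__ == "__main__":
    runs = [run_epsilon_greedy(0.1) for _ in range(50)]
    print(sum(runs) / len(runs))
```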

7 ε-Greedy Methods on the 10-Armed Testbed
[Figure: two plots against plays, average reward and % optimal action, comparing ε-greedy settings including ε = 0 (greedy) and ε = 0.1.]

8 Softmax Action Selection
- Softmax action selection methods grade action probabilities by estimated values.
- The most common softmax uses a Gibbs, or Boltzmann, distribution: choose action $a$ on play $t$ with probability
  $\dfrac{e^{Q_t(a)/\tau}}{\sum_{b=1}^{n} e^{Q_t(b)/\tau}}$,
  where $\tau$ is the computational temperature
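
The Gibbs distribution above can be sketched as follows. The function names are assumptions, and subtracting the maximum estimate before exponentiating is a standard numerical-stability step added here; it does not change the probabilities.

```python
import math
import random


def softmax_probabilities(q_values, tau):
    """Gibbs/Boltzmann action probabilities at temperature tau > 0."""
    m = max(q_values)  # shift by the max to avoid overflow in exp
    weights = [math.exp((q - m) / tau) for q in q_values]
    total = sum(weights)
    return [w / total for w in weights]


def softmax_action(q_values, tau, rng=random):
    """Sample one action index according to the softmax probabilities."""
    probs = softmax_probabilities(q_values, tau)
    return rng.choices(range(len(q_values)), weights=probs, k=1)[0]


if __name__ == "__main__":
    print(softmax_probabilities([0.0, 1.0, 2.0], tau=1.0))
    print(softmax_action([0.0, 1.0, 2.0], tau=0.1))
```

A high temperature makes the actions nearly equiprobable; a low temperature concentrates probability on the greedy action.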

9 Binary Bandit Tasks
- Suppose you have just two actions, $a_t = 1$ or $a_t = 2$, and just two rewards, $r_t = \text{success}$ or $r_t = \text{failure}$
- Then you might infer a target or desired action: $d_t = a_t$ if success, the other action if failure
- And then always play the action that was most often the target
- Call this the supervised algorithm
- It works fine on deterministic tasks
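
A small sketch of the supervised algorithm for two actions labeled 1 and 2 follows. The helper names and the tie-breaking toward action 1 are assumptions; the slide does not say what to do when both actions have been the target equally often.

```python
def inferred_target(action, success):
    """Target is the action taken on success, the other action on failure."""
    return action if success else 3 - action  # 3 - a swaps 1 <-> 2


def supervised_choice(target_counts):
    """Play the action most often inferred as the target.

    target_counts maps actions 1 and 2 to how often each was the target.
    Ties go to action 1 (an arbitrary choice made for this sketch).
    """
    return 1 if target_counts[1] >= target_counts[2] else 2


if __name__ == "__main__":
    counts = {1: 0, 2: 0}
    for action, success in [(1, False), (2, True), (2, True)]:
        counts[inferred_target(action, success)] += 1
    print(supervised_choice(counts))
```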

10 Contingency Space
- The space of all possible binary bandit tasks:
[Figure: unit square with "Success probability for action 1" and "Success probability for action 2" as axes (0 to 1), regions labeled EASY PROBLEMS and DIFFICULT PROBLEMS, and two example tasks marked A and B.]

11 Linear Learning Automata
- Let $\pi_t(a) = \Pr\{a_t = a\}$ be the only adapted parameter
- $L_{R-I}$ (linear, reward-inaction):
  - On success: $\pi_{t+1}(a_t) = \pi_t(a_t) + \alpha(1 - \pi_t(a_t))$, $0 < \alpha < 1$ (the other action probabilities are adjusted to still sum to 1)
  - On failure: no change
- $L_{R-P}$ (linear, reward-penalty):
  - On success: $\pi_{t+1}(a_t) = \pi_t(a_t) + \alpha(1 - \pi_t(a_t))$, $0 < \alpha < 1$ (the other action probabilities are adjusted to still sum to 1)
  - On failure: $\pi_{t+1}(a_t) = \pi_t(a_t) + \alpha(0 - \pi_t(a_t))$, $0 < \alpha < 1$
- For two actions, a stochastic, incremental version of the supervised algorithm
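
Both automata can be sketched with one update function for the two-action case. The name `linear_automaton_update`, the `penalty` flag, and the use of indices 0 and 1 for the two actions are assumptions made for this illustration.

```python
def linear_automaton_update(pi, action, success, alpha, penalty=False):
    """Return updated probabilities [pi(0), pi(1)] for a two-action automaton.

    penalty=False gives reward-inaction (L_R-I);
    penalty=True gives reward-penalty (L_R-P).
    """
    new = list(pi)
    if success:
        # Move the chosen action's probability toward 1.
        new[action] = pi[action] + alpha * (1.0 - pi[action])
    elif penalty:
        # Reward-penalty only: move the chosen action's probability toward 0.
        new[action] = pi[action] + alpha * (0.0 - pi[action])
    # Reward-inaction leaves the probabilities unchanged on failure.
    new[1 - action] = 1.0 - new[action]  # keep the two probabilities summing to 1
    return new


if __name__ == "__main__":
    print(linear_automaton_update([0.5, 0.5], action=0, success=True, alpha=0.1))
    print(linear_automaton_update([0.5, 0.5], action=0, success=False, alpha=0.1, penalty=True))
```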

12 Performance on Binary Bandit Tasks A and B
[Figure: two panels, BANDIT A and BANDIT B, each plotting % optimal action (50% to 100%) against plays for four methods: action values, $L_{R-I}$, $L_{R-P}$, and supervised.]

13 Incremental Implementation
- Recall the sample-average estimation method. The average of the first $k$ rewards is (dropping the dependence on $a$):
  $Q_k = \dfrac{r_1 + r_2 + \cdots + r_k}{k}$
- Can we do this incrementally (without storing all the rewards)?
- We could keep a running sum and count, or, equivalently:
  $Q_{k+1} = Q_k + \dfrac{1}{k+1}\left[r_{k+1} - Q_k\right]$
- This is a common form for update rules:
  NewEstimate = OldEstimate + StepSize [Target - OldEstimate]
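
The incremental rule can be checked against the plain average with a short sketch. The function name `incremental_mean` is an assumption; the loop applies the update with step size $1/(k+1)$ and never stores past rewards.

```python
def incremental_mean(rewards):
    """Average a stream of rewards without storing them."""
    q, k = 0.0, 0
    for r in rewards:
        k += 1
        # NewEstimate = OldEstimate + StepSize * (Target - OldEstimate)
        q += (r - q) / k
    return q


if __name__ == "__main__":
    data = [1.0, 2.0, 3.0, 4.0]
    print(incremental_mean(data), sum(data) / len(data))
```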

14 Tracking a Nonstationary Problem
- Choosing $Q_k$ to be a sample average is appropriate in a stationary problem, i.e., when none of the $Q^*(a)$ change over time, but not in a nonstationary problem.
- Better in the nonstationary case is:
  $Q_{k+1} = Q_k + \alpha\left[r_{k+1} - Q_k\right]$ for constant $\alpha$, $0 < \alpha \le 1$
  $\phantom{Q_{k+1}} = (1 - \alpha)^k Q_0 + \sum_{i=1}^{k} \alpha (1 - \alpha)^{k-i} r_i$
- This is an exponential, recency-weighted average
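
The equivalence between the constant step-size update and the recency-weighted sum can be verified numerically. Both function names are assumptions introduced for this sketch.

```python
def constant_step_estimate(rewards, alpha, q0=0.0):
    """Apply Q <- Q + alpha * (r - Q) once per reward."""
    q = q0
    for r in rewards:
        q += alpha * (r - q)
    return q


def recency_weighted_average(rewards, alpha, q0=0.0):
    """Closed form: (1-alpha)^k * Q0 + sum_i alpha * (1-alpha)^(k-i) * r_i."""
    k = len(rewards)
    total = (1 - alpha) ** k * q0
    for i, r in enumerate(rewards, start=1):
        total += alpha * (1 - alpha) ** (k - i) * r
    return total


if __name__ == "__main__":
    rewards = [1.0, 0.0, 2.0, 5.0]
    print(constant_step_estimate(rewards, 0.1, q0=3.0))
    print(recency_weighted_average(rewards, 0.1, q0=3.0))
```

The weight on a reward decays geometrically with its age, so recent rewards dominate the estimate.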

15 Optimistic Initial Values
- All methods so far depend on $Q_0(a)$, i.e., they are biased.
- Suppose instead we initialize the action values optimistically, i.e., on the 10-armed testbed, use $Q_0(a) = 5$ for all $a$
[Figure: % optimal action against plays, comparing optimistic greedy ($Q_0 = 5$, ε = 0) with realistic ε-greedy ($Q_0 = 0$).]
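
A toy illustration of why optimistic initial values encourage exploration, under assumptions that are not in the slides: three arms with fixed deterministic rewards, a purely greedy learner, sample-average updates, and ties broken toward the lowest index. The function name `greedy_run` is hypothetical.

```python
def greedy_run(reward_fn, n, q0, plays):
    """Run a purely greedy learner; return the list of chosen actions."""
    q = [q0] * n
    counts = [0] * n
    chosen = []
    for _ in range(plays):
        a = max(range(n), key=lambda i: q[i])  # first index wins ties
        r = reward_fn(a)
        counts[a] += 1
        q[a] += (r - q[a]) / counts[a]         # sample-average update
        chosen.append(a)
    return chosen


if __name__ == "__main__":
    rewards = [1.0, 2.0, 3.0]
    print(greedy_run(lambda a: rewards[a], 3, q0=5.0, plays=6))
    print(greedy_run(lambda a: rewards[a], 3, q0=0.0, plays=6))
```

With `q0=5.0` every arm looks disappointing after its first pull, so the learner tries all three before settling on the best one. With `q0=0.0` the first arm immediately looks better than the untried ones and the learner never leaves it.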

16 Reinforcement Comparison
- Compare rewards to a reference reward, $\bar{r}_t$, e.g., an average of observed rewards
- Strengthen or weaken the action taken depending on $r_t - \bar{r}_t$
- Let $p_t(a)$ denote the preference for action $a$
- Preferences determine action probabilities, e.g., by a Gibbs distribution:
  $\pi_t(a) = \Pr\{a_t = a\} = \dfrac{e^{p_t(a)}}{\sum_{b=1}^{n} e^{p_t(b)}}$
- Then:
  $p_{t+1}(a_t) = p_t(a_t) + \left[r_t - \bar{r}_t\right]$ and $\bar{r}_{t+1} = \bar{r}_t + \alpha\left[r_t - \bar{r}_t\right]$
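
The preference and reference-reward updates can be sketched as a small class. The class name, the `beta` step size on the preference update (with `beta=1.0` reproducing the update as written on this slide), and the zero initial values are assumptions made for illustration.

```python
import math
import random


class ReinforcementComparison:
    """Hypothetical minimal agent: preferences plus a reference reward."""

    def __init__(self, n, alpha=0.1, beta=1.0, initial_reference=0.0):
        self.p = [0.0] * n             # action preferences p_t(a)
        self.ref = initial_reference   # reference reward
        self.alpha = alpha             # step size for the reference reward
        self.beta = beta               # assumed step size for preferences

    def probabilities(self):
        """Gibbs distribution over preferences (shifted by max for stability)."""
        m = max(self.p)
        weights = [math.exp(x - m) for x in self.p]
        total = sum(weights)
        return [w / total for w in weights]

    def act(self, rng=random):
        return rng.choices(range(len(self.p)), weights=self.probabilities(), k=1)[0]

    def update(self, action, reward):
        delta = reward - self.ref      # r_t minus the reference reward
        self.p[action] += self.beta * delta
        self.ref += self.alpha * delta


if __name__ == "__main__":
    agent = ReinforcementComparison(2, alpha=0.5)
    agent.update(0, 2.0)
    print(agent.p, agent.ref, agent.probabilities())
```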

17 Performance of a Reinforcement Comparison Method
[Figure: % optimal action against plays for reinforcement comparison, ε-greedy with ε = 0.1 and α = 1/k, and ε-greedy with ε = 0.1 and α = 0.1.]

18 Pursuit Methods
- Maintain both action-value estimates and action preferences
- Always pursue the greedy action, i.e., make the greedy action more likely to be selected
- After the $t$-th play, update the action values to get $Q_{t+1}$
- The new greedy action is $a_{t+1}^* = \arg\max_a Q_{t+1}(a)$
- Then: $\pi_{t+1}(a_{t+1}^*) = \pi_t(a_{t+1}^*) + \beta\left[1 - \pi_t(a_{t+1}^*)\right]$, and the probabilities of the other actions are decremented to maintain the sum of 1
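
One way to realize the pursuit update is sketched below. The function name `pursuit_update` is an assumption, and so is the specific way the other actions are decremented: each non-greedy probability moves toward 0 by the same factor $\beta$, which keeps the total at 1.

```python
def pursuit_update(pi, q_values, beta):
    """Move the action probabilities toward the current greedy action."""
    greedy = max(range(len(q_values)), key=lambda a: q_values[a])
    new = []
    for a, p in enumerate(pi):
        target = 1.0 if a == greedy else 0.0
        # Greedy action moves toward 1; every other action moves toward 0.
        new.append(p + beta * (target - p))
    return new


if __name__ == "__main__":
    print(pursuit_update([0.25, 0.25, 0.25, 0.25], [0.0, 1.0, 3.0, 2.0], beta=0.1))
```

Because the old probabilities sum to 1, the new ones sum to $1 + \beta(1 - 1) = 1$ as well.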

19 Performance of a Pursuit Method
[Figure: % optimal action against plays for pursuit, reinforcement comparison, and ε-greedy with ε = 0.1 and α = 1/k.]

20 Associative Search
- Imagine switching bandits at each play
[Figure: a row of bandits, one labeled "Bandit 3", each with its own actions.]

21 Conclusions
- These are all very simple methods, but they are complicated enough; we will build on them
- Ideas for improvements:
  - estimating uncertainties... interval estimation
  - approximating Bayes optimal solutions
  - Gittins indices
- The full RL problem offers some ideas for solution...