Outline. Reinforcement Learning. What is RL? Reinforcement learning is learning what to do so as to maximize a numerical reward signal
- Leslie Ball
Reinforcement Learning
June 2005
CS 486/686, University of Waterloo

Outline
- Russell & Norvig (section numbers lost in transcription)
- What is reinforcement learning
- Temporal-difference learning
- Q-learning

Machine Learning
- Supervised learning: the teacher tells the learner what to remember
- Reinforcement learning: the environment provides hints to the learner
- Unsupervised learning: the learner discovers on its own

What is RL?
- Reinforcement learning is learning what to do so as to maximize a numerical reward signal
- The learner is not told what actions to take, but must discover them by trying them out and seeing what the reward is

What is RL
- Reinforcement learning differs from supervised learning
- Supervised learning: "Don't touch. You will get burnt."
- Reinforcement learning: "Ouch!"

Animal Psychology
- Negative reinforcements: pain and hunger
- Positive reinforcements: pleasure and food
- Reinforcements are used to train animals
- Let's do the same with computers!
RL Examples
- Game playing (backgammon, solitaire)
- Operations research (pricing, vehicle routing)
- Elevator scheduling
- Helicopter control

Reinforcement Learning
- Definition: a Markov decision process with unknown transition and reward models
- Set of states S
- Set of actions A (actions may be stochastic)
- Set of reinforcement signals (rewards); rewards may be delayed

Policy optimization
- Markov decision process: find an optimal policy given the transition and reward model, then execute the policy found
- Reinforcement learning: learn an optimal policy while interacting with the environment

Reinforcement Learning Problem
- [Figure: agent-environment loop. The environment sends the agent a state and a reward, the agent sends back an action, producing the sequence $s_0, a_0, r_0, s_1, a_1, r_1, s_2, \dots$]
- Goal: learn to choose actions that maximize $r_0 + \gamma r_1 + \gamma^2 r_2 + \dots$, where $0 \le \gamma < 1$

Example: Inverted Pendulum
- State: $x(t), x'(t), \theta(t), \theta'(t)$
- Action: force F
- Reward: 1 for any step where the pole is balanced
- Problem: find $\delta : S \to A$ that maximizes rewards

RL Characteristics
- Reinforcements: rewards
- Temporal credit assignment: when a reward is received, which action should be credited?
- Exploration/exploitation tradeoff: as the agent learns, should it exploit its current knowledge to maximize rewards or explore to refine its knowledge?
- Lifelong learning: reinforcement learning
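The goal stated above, maximizing a discounted sum of rewards, can be illustrated with a short sketch. The function name `discounted_return` and the three-step reward sequence are assumptions made for this example; they do not come from the slides.

```python
def discounted_return(rewards, gamma):
    """Return r_0 + gamma*r_1 + gamma^2*r_2 + ... for a finite reward sequence."""
    if not 0.0 <= gamma < 1.0:
        raise ValueError("gamma must satisfy 0 <= gamma < 1")
    total = 0.0
    # Work backwards through the sequence: G_t = r_t + gamma * G_{t+1}.
    for r in reversed(rewards):
        total = r + gamma * total
    return total


# Hypothetical episode: a reward of 1 on each of three steps.
example = discounted_return([1.0, 1.0, 1.0], gamma=0.5)
print(example)  # 1 + 0.5 + 0.25 = 1.75
```

Because gamma is strictly below 1, later rewards count for less, which keeps the sum finite even for very long episodes.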
Types of RL
- Passive vs active learning
  - Passive learning: the agent executes a fixed policy and tries to evaluate it
  - Active learning: the agent updates its policy as it learns
- Model based vs model free
  - Model-based: learn the transition and reward model and use it to determine the optimal policy
  - Model free: derive the optimal policy without learning the model

Passive Learning
- Transition and reward model known: evaluate $\delta$:
  $V^\delta(s) = R(s) + \gamma \sum_{s'} P(s' \mid s, \delta(s)) \, V^\delta(s')$
- Transition and reward model unknown: estimate the policy value as the agent executes the policy:
  $V^\delta(s) = E_\delta\left[\sum_t \gamma^t R(s_t)\right]$
- Model based vs model free

Passive learning
- The discount $\gamma$ and the reward $r_i$ for non-terminal states are fixed (values lost in transcription)
- We do not know the transition probabilities
- [Example: three observed trials through a grid world, two ending in a terminal state with positive reward and one in a terminal state with negative reward; state coordinates lost in transcription]
- What is the value V(s) of being in state s?

Passive ADP
- Adaptive dynamic programming (ADP)
- Model-based
- Learn transition probabilities and rewards from observations
- Then update the values of the states

ADP Example
- [Example: the same three trials, with transition probabilities estimated from observed frequencies; values lost in transcription]
- Use this information in
  $V^\delta(s) = R(s) + \gamma \sum_{s'} P(s' \mid s, \delta(s)) \, V^\delta(s')$
- We need to learn all the transition probabilities!

Passive TD
- Temporal difference (TD)
- Model free
- At each time step:
  - Observe s, a, s', r
  - Update $V^\delta(s)$ after each move:
    $V^\delta(s) = V^\delta(s) + \alpha \left(R(s) + \gamma V^\delta(s') - V^\delta(s)\right)$
- Here $\alpha$ is the learning rate and $R(s) + \gamma V^\delta(s') - V^\delta(s)$ is the temporal difference
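The passive TD update can be sketched in a few lines. This is a minimal illustration, assuming a dictionary-backed value table; the function name `td_update`, the state labels "A" and "goal", and the numeric values are hypothetical.

```python
from collections import defaultdict


def td_update(V, s, reward, s_next, alpha, gamma):
    """Apply one passive TD update to V in place and return the temporal difference.

    V      : mapping from state to the current estimate of V(s)
    reward : R(s), the reward observed for state s
    alpha  : learning rate
    gamma  : discount factor
    """
    td_error = reward + gamma * V[s_next] - V[s]
    V[s] += alpha * td_error
    return td_error


V = defaultdict(float)  # states never seen before start at 0
V["goal"] = 1.0         # assumed value of a hypothetical successor state
error = td_update(V, "A", reward=0.0, s_next="goal", alpha=0.5, gamma=1.0)
print(V["A"], error)    # the estimate for "A" moves halfway towards the target
```

No transition probabilities appear anywhere in the update, which is what makes the method model free: each observed move nudges the estimate towards the one-step target.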
TD Convergence
- Theorem: if $\alpha$ is appropriately decreased with the number of times a state is visited, then $V^\delta(s)$ converges to the correct value
- $\alpha$ must satisfy:
  $\sum_t \alpha_t \to \infty$ and $\sum_t (\alpha_t)^2 < \infty$
- Often $\alpha(s) = 1/n(s)$, where $n(s)$ is the number of times s is visited

Active Learning
- Ultimately, we are interested in improving $\delta$
- Transition and reward model known:
  $V^*(s) = \max_a \left[ R(s) + \gamma \sum_{s'} P(s' \mid s, a) \, V^*(s') \right]$
- Transition and reward model unknown: improve the policy as the agent executes it
- Model based vs model free

Q-learning (aka active temporal difference)
- Q-function: $Q : S \times A \to \mathbb{R}$, the value of a state-action pair
- Policy $\delta(s) = \arg\max_a Q(s,a)$ is the optimal policy (for the optimal Q-function $Q^*$)
- Bellman's equation:
  $Q^*(s,a) = R(s) + \gamma \sum_{s'} P(s' \mid s, a) \max_{a'} Q^*(s', a')$

Q-learning
- For each state s and action a, initialize Q(s,a) (0 or random)
- Observe the current state s
- Loop:
  - Select action a and execute it
  - Receive the immediate reward r
  - Observe the new state s'
  - Update Q(s,a):
    $Q(s,a) = Q(s,a) + \alpha \left(r(s) + \gamma \max_{a'} Q(s', a') - Q(s,a)\right)$
  - Set s = s'

Q-learning example
- [Figure: the agent moves right from state $s_1$ to state $s_2$; the current estimate is $Q(s_1, \text{right}) = 73$ and the action values in $s_2$ are 66, 81 and 100]
- r = 0 for non-terminal states, $\gamma = 0.9$, $\alpha = 0.5$
- $Q(s_1, \text{right}) = Q(s_1, \text{right}) + \alpha \left(r(s_1) + \gamma \max_{a'} Q(s_2, a') - Q(s_1, \text{right})\right)$
  $= 73 + 0.5 \, (0 + 0.9 \max[66, 81, 100] - 73)$
  $= 73 + 0.5 \, (17)$
  $= 81.5$
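The Q-learning loop above can be sketched as a small tabular implementation. Everything named here is an assumption for illustration: the helper `q_update`, the four-state chain environment in `step`, the epsilon-greedy action choice with random tie-breaking, and the numeric values used in the single-update check. Note also that the toy environment attaches the reward to the transition rather than to the state.

```python
import random
from collections import defaultdict


def q_update(Q, s, a, reward, s_next, actions, alpha, gamma):
    """Q(s,a) <- Q(s,a) + alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))."""
    best_next = max(Q[(s_next, a2)] for a2 in actions)
    Q[(s, a)] += alpha * (reward + gamma * best_next - Q[(s, a)])
    return Q[(s, a)]


# Single update with assumed illustrative values.
Q_demo = defaultdict(float)
Q_demo[("s1", "right")] = 73.0
Q_demo[("s2", "left")] = 66.0
Q_demo[("s2", "up")] = 81.0
Q_demo[("s2", "right")] = 100.0
new_value = q_update(Q_demo, "s1", "right", reward=0.0, s_next="s2",
                     actions=["left", "up", "right"], alpha=0.5, gamma=0.9)
print(new_value)  # 73 + 0.5 * (0.9 * 100 - 73), i.e. about 81.5


def step(state, action):
    """Hypothetical chain with states 0..3; entering terminal state 3 pays reward 1."""
    nxt = min(state + 1, 3) if action == "right" else max(state - 1, 0)
    return nxt, (1.0 if nxt == 3 else 0.0)


def q_learning(episodes=200, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Tabular Q-learning on the toy chain, exploring with epsilon-greedy."""
    rng = random.Random(seed)
    actions = ["left", "right"]
    Q = defaultdict(float)  # initialize every Q(s,a) to 0
    for _ in range(episodes):
        s = 0  # observe the start state
        while s != 3:
            if rng.random() < epsilon:
                a = rng.choice(actions)  # explore
            else:
                # Exploit, breaking ties at random so early episodes still move.
                a = max(actions, key=lambda act: (Q[(s, act)], rng.random()))
            s_next, r = step(s, a)
            q_update(Q, s, a, r, s_next, actions, alpha, gamma)
            s = s_next
    return Q


Q = q_learning()
greedy = {s: max(["left", "right"], key=lambda act: Q[(s, act)]) for s in range(3)}
print(greedy)
```

On this chain the greedy policy read off the learned table should choose "right" in every non-terminal state, since that is the shortest route to the only reward.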
Exploration vs Exploitation
- If an agent always chooses the action with the highest value then it is exploiting
  - The learned model is not the real model
  - This leads to suboptimal results
- By taking random actions (pure exploration) an agent may learn the model
  - But what is the use of learning a complete model if parts of it are never used?
- We need a balance between exploitation and exploration

Common exploration methods
- $\epsilon$-greedy:
  - With probability $\epsilon$ execute a random action
  - Otherwise execute the best action $a^* = \arg\max_a Q(s,a)$
- Boltzmann exploration:
  $P(a) = \dfrac{e^{Q(s,a)/T}}{\sum_{a'} e^{Q(s,a')/T}}$

Exploration and Q-learning
- Q-learning converges to the optimal Q-values if:
  - Every state is visited infinitely often (due to exploration)
  - The action selection becomes greedy as time approaches infinity
  - The learning rate $\alpha$ is decreased fast enough, but not too fast

A Triumph for Reinforcement Learning: TD-Gammon
- Backgammon player: TD learning with a neural network representation of the value function

Next Class
- Machine learning
- Decision trees
- Russell and Norvig: chapter 18
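The two exploration methods can be sketched as follows. The function names, the dictionary representation of the action values, and the example numbers are assumptions for illustration; subtracting the maximum before exponentiating is a standard numerical-stability step that leaves the Boltzmann probabilities unchanged.

```python
import math
import random


def epsilon_greedy(q_values, epsilon, rng=random):
    """Pick a random action with probability epsilon, otherwise the best one.

    q_values maps each action to its current estimate Q(s, a).
    """
    actions = list(q_values)
    if rng.random() < epsilon:
        return rng.choice(actions)
    return max(actions, key=q_values.get)


def boltzmann_probs(q_values, temperature):
    """Return P(a) = exp(Q(s,a)/T) / sum over a' of exp(Q(s,a')/T)."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    top = max(q_values.values())
    # Shifting every value by the maximum avoids overflow in exp().
    weights = {a: math.exp((q - top) / temperature) for a, q in q_values.items()}
    total = sum(weights.values())
    return {a: w / total for a, w in weights.items()}


def boltzmann_action(q_values, temperature, rng=random):
    """Sample an action according to the Boltzmann distribution."""
    probs = boltzmann_probs(q_values, temperature)
    actions = list(probs)
    return rng.choices(actions, weights=[probs[a] for a in actions], k=1)[0]


q = {"left": 1.0, "right": 2.0}         # hypothetical action values
print(epsilon_greedy(q, epsilon=0.0))   # always greedy when epsilon is 0
print(boltzmann_probs(q, temperature=1.0))
print(boltzmann_action(q, temperature=1.0, rng=random.Random(0)))
```

A high temperature T makes the Boltzmann choice close to uniform (more exploration), while a low T concentrates the probability on the highest-valued action (more exploitation).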