Artificial Intelligence Markov Decision Problems
- Tyler Simpson
- 6 years ago
Transcription
Reading:
- Nilsson: briefly mentioned in Chapter [number lost]
- Russell and Norvig: Chapter 17

Example: probabilistic blocksworld

action | outcome | probability | time (= cost)
move   | success | 0.1         | [lost] minute(s)
move   | failure | 0.9         | [lost] minute(s)
paint  | success | 1.0         | [lost] minute(s)

How to determine the average plan-execution time of a given plan

$t_i$ = average plan-execution time until a goal state is reached if the agent starts in state $i$ and follows the plan

[Figure: a plan drawn as a chain of states connected by move and color actions on blocks F and G.]

Each state $i$ of the plan contributes one linear equation with one term per action outcome:

$t_i = (\text{action time}) + 0.1\, t_j + 0.9\, t_k$ for a move step
$t_i = (\text{action time}) + 1.0\, t_j$ for a paint step

Solving the system gives every $t_i$; the value for the start state is the average plan-execution time. The subscripts, action times, and solved values of the slide's equations were lost in extraction.

[Figure: four alternative plans (sequences of move and color actions on lettered blocks) with their average plan-execution times. Two values survive in part: 6.[digit lost] minutes and 9.[digit lost] minutes.]
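The linear equations for the $t_i$ can be solved mechanically. The sketch below assumes a hypothetical plan of two consecutive moves, each taking 1 minute, succeeding with probability 0.1 and otherwise leaving the state unchanged. These numbers and the helper `solve_linear` are illustrative assumptions, not values recovered from the slides.

```python
# Hypothetical plan: state 0 --move--> state 1 --move--> state 2 (goal).
# Assumed: each move costs 1 minute, succeeds with probability 0.1,
# and otherwise leaves the state unchanged.

def solve_linear(A, b):
    """Solve A x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b_i] for row, b_i in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(n):
            if r != col:
                factor = M[r][col] / M[col][col]
                M[r] = [x - factor * y for x, y in zip(M[r], M[col])]
    return [M[i][n] / M[i][i] for i in range(n)]

# t0 = 1 + 0.1 t1 + 0.9 t0   ->   0.1 t0 - 0.1 t1 = 1
# t1 = 1 + 0.1 t2 + 0.9 t1   ->   0.1 t1 - 0.1 t2 = 1
# t2 = 0 (goal state)
A = [[0.1, -0.1, 0.0],
     [0.0, 0.1, -0.1],
     [0.0, 0.0, 1.0]]
b = [1.0, 1.0, 0.0]
t = solve_linear(A, b)
print(t)  # average plan-execution times for states 0, 1, 2
```

Under these assumptions each move needs 10 attempts on average, so the plan takes about 20 minutes from state 0.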
How to (almost) determine a plan with minimal average plan-execution time

[Figure: decision tree alternating choice nodes (move ..., move ... to ground, color ...) and chance nodes.]

- The decision tree is infinite!
- The values associated with these chance nodes should be the same
  ==> the actions associated with these choice nodes should be the same
  ==> whenever the configuration of the blocks is the same, one wants to execute the same action (= policy).

Deterministic planning and search (actions are deterministic)
- actions have deterministic effects
- state and action determine uniquely the successor state
- states are completely observable
- plan is a sequence of actions (= path)
- minimize total cost
- optimal plan is acyclic

Probabilistic planning and search = Markov Decision Problems (MDPs) (actions are probabilistic: the robot can drift)
- Markov property
- actions have probabilistic effects
- state and action uniquely determine a probability distribution over successor states
- states are completely observable
- plan is a mapping from states to actions (= policy)
- minimize expected total cost
- optimal plan can be cyclic

How to find the plan

Deterministic planning and search
- plan is a sequence of actions (= path)
- can be found using (forward or backward) search

Markov Decision Problems
- plan is a mapping from states to actions (= policy)
- how to find it?
  - determine the expected goal distances of all states
  - greedily assign to each state the action that decreases the expected goal distance the most
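To make the difference between a path and a policy concrete, the following sketch executes a policy (a mapping from states to actions) in a probabilistic domain and averages the observed cost over many runs. The one-state domain, its probabilities, and all names are hypothetical and chosen only for illustration.

```python
import random

# Hypothetical domain: from "start", action "move" costs 1 and reaches
# "goal" with probability 0.1; otherwise the agent stays in "start".
POLICY = {"start": "move"}  # policy = mapping from states to actions
OUTCOMES = {("start", "move"): [(0.1, "goal"), (0.9, "start")]}
COST = {("start", "move"): 1.0}

def execute(policy, state, rng):
    """Follow the policy until the goal is reached; return the total cost."""
    total = 0.0
    while state != "goal":
        action = policy[state]
        total += COST[(state, action)]
        r, acc = rng.random(), 0.0
        for prob, successor in OUTCOMES[(state, action)]:
            acc += prob
            if r < acc:
                state = successor
                break
    return total

rng = random.Random(0)  # fixed seed so the run is repeatable
runs = [execute(POLICY, "start", rng) for _ in range(20000)]
average = sum(runs) / len(runs)
print(average)  # estimate of the expected total cost (exact value: 10)
```

The execution revisits "start" an unpredictable number of times, which is why a fixed action sequence cannot describe the plan but a policy can.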
Notation: deterministic planning and search

$s$ = state
$a$ = action
$A(s)$ = set of actions that can be executed in state $s$
$succ(s,a)$ = the state that results from the execution of action $a$ in state $s$
$c(s,a)$ = the cost that results from the execution of action $a$ in state $s$
$gd(s)$ = goal distance of state $s$

$gd(s) = 0$ if $s$ is a goal state
$gd(s) = \min_{a \in A(s)} \big( c(s,a) + gd(succ(s,a)) \big)$ if $s$ is not a goal state

$\pi(s)$ = the optimal action to execute in state $s$
$\pi(s) = \arg\min_{a \in A(s)} \big( c(s,a) + gd(succ(s,a)) \big)$ if $s$ is not a goal state

Notation: Markov Decision Problems

$s$ = state
$a$ = action
$A(s)$ = set of actions that can be executed in state $s$
$succ(s,a)$ = the set of states that can result from the execution of action $a$ in state $s$
$c(s,a)$ = the cost that results from the execution of action $a$ in state $s$
$p(s'|s,a)$ = the probability that state $s'$ results from the execution of action $a$ in state $s$
$gd(s)$ = expected goal distance of state $s$

Bellman equations:

$gd(s) = 0$ if $s$ is a goal state
$gd(s) = \min_{a \in A(s)} \big( c(s,a) + \sum_{s' \in succ(s,a)} p(s'|s,a)\, gd(s') \big)$ if $s$ is not a goal state

$\pi(s)$ = the optimal action to execute in state $s$
$\pi(s) = \arg\min_{a \in A(s)} \big( c(s,a) + \sum_{s' \in succ(s,a)} p(s'|s,a)\, gd(s') \big)$ if $s$ is not a goal state

[Figures: one example for deterministic planning and search and one for Markov Decision Problems (transition probabilities 0.5); the state labels and values were lost in extraction.]

Given the expected goal distances, we can use the definition to check them, but calculating them is a chicken-and-egg problem.

Iterative computation: deterministic planning and search

$gd(s)$ = goal distance of state $s$ (= minimal cost until a goal state is reached if execution starts in $s$)
$gd_i(s)$ = minimal cost until a goal state is reached or $i$ actions have been executed if execution starts in state $s$

For $i$ larger than a constant: $gd(s) = gd_i(s)$ (= once $gd_i(s) = gd_{i-1}(s)$ for all states $s$)

$gd_0(s) = 0$ for all $s$
$gd_i(s) = 0$ if $s$ is a goal state
$gd_i(s) = \min_{a \in A(s)} \big( c(s,a) + gd_{i-1}(succ(s,a)) \big)$ if $s$ is not a goal state
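The remark that given goal distances can be checked against the definition can be sketched in a few lines. The tiny MDP below (one non-goal state "A" with a "safe" and a "risky" action) is a hypothetical example, as are the helper names `backup`, `satisfies_bellman`, and `greedy_action`.

```python
# Hypothetical MDP: in state "A", action "safe" costs 3 and always reaches
# the goal "G"; action "risky" costs 1 and reaches "G" with probability 0.5,
# otherwise the agent stays in "A".
GOALS = {"G"}
ACTIONS = {"A": ["safe", "risky"]}
COST = {("A", "safe"): 3.0, ("A", "risky"): 1.0}
OUTCOMES = {("A", "safe"): [(1.0, "G")],
            ("A", "risky"): [(0.5, "G"), (0.5, "A")]}

def backup(s, a, gd):
    """c(s,a) + sum over s' of p(s'|s,a) * gd(s')."""
    return COST[(s, a)] + sum(p * gd[s2] for p, s2 in OUTCOMES[(s, a)])

def satisfies_bellman(gd, tol=1e-9):
    """Check candidate expected goal distances against the Bellman equations."""
    for s in gd:
        target = 0.0 if s in GOALS else min(backup(s, a, gd) for a in ACTIONS[s])
        if abs(gd[s] - target) > tol:
            return False
    return True

def greedy_action(s, gd):
    """The action that minimizes the right-hand side of the Bellman equation."""
    return min(ACTIONS[s], key=lambda a: backup(s, a, gd))

gd = {"A": 2.0, "G": 0.0}
print(satisfies_bellman(gd), greedy_action("A", gd))
```

Checking is easy; the difficulty is that the value for "A" appears on both sides of its own equation, which is the chicken-and-egg problem noted above.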
Iterative computation: Markov Decision Problems

Notation as for Markov Decision Problems ($s$, $a$, $A(s)$, $succ(s,a)$, $c(s,a)$, $p(s'|s,a)$, $gd(s)$ = expected goal distance of state $s$), plus:

$gd_i(s)$ = minimal expected cost until a goal state is reached or $i$ actions have been executed if execution starts in state $s$

$gd_0(s) = 0$ for all $s$
$gd_i(s) = 0$ if $s$ is a goal state
$gd_i(s) = \min_{a \in A(s)} \big( c(s,a) + \sum_{s' \in succ(s,a)} p(s'|s,a)\, gd_{i-1}(s') \big)$ if $s$ is not a goal state

$gd(s) = \lim_{i \to \infty} gd_i(s)$ (not necessarily after a finite amount of time)

Value Iteration: maintains approximations of the goal distances (= values)

1. $i := 0$.
2. Set (for all $s \in S$) $gd_i(s) = 0$.
3. $i := i + 1$.
4. Set (for all $s \in S$)
   $gd_i(s) = 0$ if $s$ is a goal state
   $gd_i(s) = \min_{a \in A(s)} \big( c(s,a) + \sum_{s' \in succ(s,a)} p(s'|s,a)\, gd_{i-1}(s') \big)$ if $s$ is not a goal state
5. If (for some $s \in S$) $gd_i(s) - gd_{i-1}(s) >$ small constant, go to 3.
6. Set (for all $s \in S$ that are not goal states)
   $\pi(s) = \arg\min_{a \in A(s)} \big( c(s,a) + \sum_{s' \in succ(s,a)} p(s'|s,a)\, gd_i(s') \big)$

Example of Value Iteration (transition probabilities 0.5, no discounting)

[Table: the values $gd_i(s)$ for each state and iterations $i$ up to 6; the entries were lost in extraction.]

Which action to execute in the start state? The slide compares the backed-up value of each action and executes the one with the smaller value; of the two values only the fragment ".75" survives.
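The six steps of Value Iteration translate almost line by line into code. This is a minimal sketch on a hypothetical one-state MDP (actions "safe" and "risky"); the MDP, the threshold `eps`, and the function names are assumptions made for illustration.

```python
# Hypothetical MDP: in "A", "safe" costs 3 and always reaches goal "G";
# "risky" costs 1 and reaches "G" with probability 0.5, else stays in "A".
STATES = ["A", "G"]
GOALS = {"G"}
ACTIONS = {"A": ["safe", "risky"]}
COST = {("A", "safe"): 3.0, ("A", "risky"): 1.0}
OUTCOMES = {("A", "safe"): [(1.0, "G")],
            ("A", "risky"): [(0.5, "G"), (0.5, "A")]}

def backup(s, a, gd):
    """c(s,a) + sum over s' of p(s'|s,a) * gd(s')."""
    return COST[(s, a)] + sum(p * gd[s2] for p, s2 in OUTCOMES[(s, a)])

def value_iteration(eps=1e-9):
    gd = {s: 0.0 for s in STATES}                      # steps 1-2: gd_0 = 0
    while True:
        new = {s: 0.0 if s in GOALS                    # steps 3-4: one sweep
               else min(backup(s, a, gd) for a in ACTIONS[s])
               for s in STATES}
        change = max(new[s] - gd[s] for s in STATES)
        gd = new
        if change <= eps:                              # step 5: converged?
            break
    policy = {s: min(ACTIONS[s], key=lambda a: backup(s, a, gd))
              for s in STATES if s not in GOALS}       # step 6: greedy policy
    return gd, policy

gd, policy = value_iteration()
print(gd, policy)
```

On this example the approximations 1, 1.5, 1.75, ... approach the expected goal distance 2 but never reach it exactly, while the greedy policy is already correct after the first sweep.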
Policy Iteration: maintains a policy

1. $i := 0$.
2. Set (for all $s \in S$ that are not goal states) $\pi_i(s)$ to an arbitrary action in $A(s)$.
3. Set $gd_i(s)$ to the average plan-execution time until a goal state is reached if the agent starts in state $s$ and follows policy $\pi_i$.
4. $i := i + 1$.
5. Set (for all $s \in S$ that are not goal states)
   $\pi_i(s) = \arg\min_{a \in A(s)} \big( c(s,a) + \sum_{s' \in succ(s,a)} p(s'|s,a)\, gd_{i-1}(s') \big)$
6. If (for some $s \in S$ that is not a goal state) $\pi_i(s)$ does not equal $\pi_{i-1}(s)$, go to 3.
7. Set (for all $s \in S$ that are not goal states) $\pi(s) = \pi_i(s)$.

Note: The initial policy $\pi_0$ has to guarantee that the agent reaches a goal state with probability one no matter which state it is started in.

Example of Policy Iteration (transition probabilities 0.5, no discounting)

[Worked example: for the policy at each iteration $i$, the slide lists the action assigned to each state, writes one linear equation per state of the form $gd_i(s) = (\text{cost}) + 0.5\, gd_i(\cdot) + 0.5\, gd_i(\cdot)$, solves the system, and then improves the policy. One solved value, 6, survives; the state names, action names, and remaining numbers were lost in extraction. The example ends with the action to execute in the start state.]

Extension: no goal states

What if there is no goal? (Example: living in the world.)

- can no longer minimize the expected cost until the goal is reached
- cannot minimize the expected total cost: expected total cost = infinite for every policy

Here:
- can minimize the expected cost per action execution
- can minimize the expected total discounted cost
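Policy Iteration alternates between evaluating the current policy and improving it greedily. The sketch below uses a hypothetical one-state MDP and, as a simplification, evaluates a policy by repeated sweeps instead of solving the linear equations exactly; the MDP, the sweep count, and all names are assumptions.

```python
# Hypothetical MDP: in "A", "safe" costs 3 and always reaches goal "G";
# "risky" costs 1 and reaches "G" with probability 0.5, else stays in "A".
STATES = ["A", "G"]
ACTIONS = {"A": ["safe", "risky"]}
COST = {("A", "safe"): 3.0, ("A", "risky"): 1.0}
OUTCOMES = {("A", "safe"): [(1.0, "G")],
            ("A", "risky"): [(0.5, "G"), (0.5, "A")]}

def backup(s, a, gd):
    """c(s,a) + sum over s' of p(s'|s,a) * gd(s')."""
    return COST[(s, a)] + sum(p * gd[s2] for p, s2 in OUTCOMES[(s, a)])

def evaluate(policy, sweeps=200):
    """Approximate the expected goal distances under a fixed policy."""
    gd = {s: 0.0 for s in STATES}          # goal states keep the value 0
    for _ in range(sweeps):
        for s, a in policy.items():
            gd[s] = backup(s, a, gd)
    return gd

def policy_iteration(initial_policy):
    policy = dict(initial_policy)
    while True:
        gd = evaluate(policy)                              # step 3
        improved = {s: min(ACTIONS[s], key=lambda a: backup(s, a, gd))
                    for s in policy}                       # step 5
        if improved == policy:                             # step 6
            return policy, gd
        policy = improved

# The initial policy must reach the goal with probability one; "safe" does.
policy, gd = policy_iteration({"A": "safe"})
print(policy, gd)
```

Starting from "safe" (value 3), one improvement step switches to "risky" (backed-up value 2.5), whose evaluated goal distance is 2; the next improvement step changes nothing, so the loop stops.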
Extension: no goal states (continued)

Total discounted cost, with $\gamma$ = discount factor

[Figure: two example action sequences with their expected total discounted costs for $\gamma = 0.9$; only the fragment ".9" of the values survives.]

Can minimize the expected total discounted cost (assume $\gamma = 0.9$).

If the interest rate is $(1-\gamma)/\gamma$ (for $0 < \gamma < 1$), how much money do I need to pay someone right now so that there is no difference to paying a given sequence of yearly installments?

- $x$ dollars right now are worth $(1 + (1-\gamma)/\gamma)\,x = x/\gamma$ dollars in a year
- so, $y$ dollars in a year are worth $\gamma y$ dollars right now
- answer: $c_0 + \gamma c_1 + \gamma^2 c_2 + \gamma^3 c_3 + \dots$, where $c_k$ is the installment due in $k$ years (the concrete amounts on the slide were lost in extraction)

Properties of discounting:
- discounting makes the total cost finite: with cost $c$ at every step, expected total discounted cost $= c + \gamma c + \gamma^2 c + \dots = c/(1-\gamma)$
- discounting smoothes out the horizon
- discounting can be interpreted as the probability of dying

Discounting: if the interest rate is $(1-\gamma)/\gamma$, then $y$ dollars in a year are worth $\gamma y$ dollars right now.
Dying: if I die later this year with probability $1-\gamma$, then the expected value of $y$ dollars in a year is $\gamma y$ right now.

Notation with discounting

$\gamma$ = discount factor ($0 < \gamma < 1$); if there is a goal, can set $\gamma = 1$ (no discounting)
$s$ = state
$a$ = action
$A(s)$ = set of actions that can be executed in state $s$
$succ(s,a)$ = the set of states that can result from the execution of action $a$ in state $s$
$c(s,a)$ = the cost that results from the execution of action $a$ in state $s$
$p(s'|s,a)$ = the probability that state $s'$ results from the execution of action $a$ in state $s$
$gd(s)$ = minimal expected discounted total cost if execution starts in state $s$

$gd(s) = 0$ if $s$ is a goal state
$gd(s) = \min_{a \in A(s)} \big( c(s,a) + \gamma \sum_{s' \in succ(s,a)} p(s'|s,a)\, gd(s') \big)$ if $s$ is not a goal state

$\pi(s)$ = the optimal action to execute in state $s$
$\pi(s) = \arg\min_{a \in A(s)} \big( c(s,a) + \gamma \sum_{s' \in succ(s,a)} p(s'|s,a)\, gd(s') \big)$ if $s$ is not a goal state

$gd_i(s)$ = minimal expected discounted total cost until a goal state is reached or $i$ actions have been executed if execution starts in state $s$

$gd_0(s) = 0$ for all $s$
$gd_i(s) = 0$ if $s$ is a goal state
$gd_i(s) = \min_{a \in A(s)} \big( c(s,a) + \gamma \sum_{s' \in succ(s,a)} p(s'|s,a)\, gd_{i-1}(s') \big)$ if $s$ is not a goal state

$gd(s) = \lim_{i \to \infty} gd_i(s)$

Value Iteration with or without discounting:
- $gd(s)$ does not necessarily converge after a finite amount of time
- $\pi(s)$ converges after a finite amount of time if $gd(s)$ is approximated with $gd_i(s)$ for all $s$
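Two of the claims about discounting are easy to check numerically: a constant cost per step has total discounted cost c/(1-γ), and x dollars now grow to x/γ dollars in a year at interest rate (1-γ)/γ. The values γ = 0.9, c = 1, and x = 100 below are illustrative assumptions.

```python
gamma = 0.9          # discount factor (assumed value for illustration)
cost_per_step = 1.0  # the same cost c is paid at every step

# Truncated sum c + gamma*c + gamma^2*c + ... versus the closed form c/(1-gamma).
partial_sum = sum(cost_per_step * gamma**k for k in range(1000))
closed_form = cost_per_step / (1 - gamma)

# Interest-rate reading: x dollars now are worth x/gamma dollars in a year.
x = 100.0
interest_rate = (1 - gamma) / gamma
value_in_a_year = x * (1 + interest_rate)

print(partial_sum, closed_form)
print(value_in_a_year, x / gamma)
```

Without discounting the same cost stream would sum to infinity, which is why discounting is needed when there is no goal state.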
Example of Value Iteration (transition probabilities 0.5, discount factor = 0.9)

[Table: the values $gd_i(s)$ for each state and iteration $i$; the entries were lost in extraction.]

Which action to execute in the start state? The slide again compares the backed-up value of each action and executes the one with the smaller value; of the two values only the fragment ".75" survives.

(In general, the optimal action depends on the discount factor!)

Learning for optimization: reinforcement learning with Markov Decision Process models

Exam example:
- find a policy (behavior) that maximizes the expected total discounted reward even in the presence of delayed reward
- if you don't know the action outcomes (rewards and probabilities): reinforcement learning

Approach 1
- estimate the probabilities and rewards
- use value iteration

[Figure: an action executed repeatedly in one state leads to one successor [count lost] times and to another successor 8 times; what is $p(s'|s,a)$?]

- exploration/exploitation tradeoff
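For the first approach, the transition probabilities can be estimated as relative frequencies of the observed successor states, and the costs as sample averages. The experience below (2 transitions to one successor, 8 to another) is hypothetical, as is the helper `estimate_model`.

```python
from collections import Counter, defaultdict

# Hypothetical experience tuples: (state, action, observed cost, successor).
experience = ([("A", "risky", 1.0, "G")] * 2) + ([("A", "risky", 1.0, "A")] * 8)

def estimate_model(samples):
    """Estimate p(s'|s,a) by relative frequency and c(s,a) by the sample mean."""
    counts = defaultdict(Counter)
    cost_sum = defaultdict(float)
    for s, a, c, s2 in samples:
        counts[(s, a)][s2] += 1
        cost_sum[(s, a)] += c
    p_hat, c_hat = {}, {}
    for sa, successors in counts.items():
        n = sum(successors.values())
        p_hat[sa] = {s2: k / n for s2, k in successors.items()}
        c_hat[sa] = cost_sum[sa] / n
    return p_hat, c_hat

p_hat, c_hat = estimate_model(experience)
print(p_hat, c_hat)
```

The estimated model can then be handed to value iteration. Actions that were never tried have no estimate at all, which is one face of the exploration/exploitation tradeoff.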
Learning for optimization: reinforcement learning with Markov Decision Process models

Approach 2: use Q-learning

If you execute action $a$ in state $s$, receive cost $c$, and make a transition to state $s'$, then update

$Q(s,a) = Q(s,a) + \alpha \big( c + \gamma V(s') - Q(s,a) \big)$

$\alpha$ = learning rate
$\gamma$ = discount factor, $0 < \gamma < 1$
$V(s') = \min_{a' \in A(s')} Q(s',a')$
$Q(s,a)$ = minimal expected discounted total cost until a goal state is reached if execution starts in state $s$ and the first action executed is $a$
$V(s)$ = minimal expected discounted total cost until a goal state is reached if execution starts in state $s$ (= $gd(s)$)

Q-learning algorithm

1. Initialize $Q(s,a)$ = [value lost] for all states $s$ and actions $a$.
2. $s$ := the current state.
3. If $s$ is a goal state then stop.
4. Choose an action $a$ to execute in the current state $s$. (The action believed to be best is $a := \arg\min_{a \in A(s)} Q(s,a)$.)
5. Execute action $a$. Observe the cost $c$ and successor state $s'$.
6. Update $Q(s,a) = Q(s,a) + \alpha \big( c + \gamma V(s') - Q(s,a) \big)$.
7. Go to 2.

[Figure: a single Q-learning update on a small example with two outcomes of probability 0.5 each; the Q-values shown survive only as fragments (5., .5, .9).]
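The Q-learning loop can be sketched as follows. The one-state MDP, the initial Q-values of 0, the constant learning rate, and the epsilon-greedy action choice are all assumptions made for this illustration; the slides only say that some action is chosen in step 4.

```python
import random

GAMMA, ALPHA, EPSILON = 0.9, 0.05, 0.2   # assumed parameter values

# Hypothetical MDP: in "A", "safe" costs 3 and always reaches goal "G";
# "risky" costs 1 and reaches "G" with probability 0.5, else stays in "A".
ACTIONS = {"A": ["safe", "risky"]}
COST = {("A", "safe"): 3.0, ("A", "risky"): 1.0}
OUTCOMES = {("A", "safe"): [(1.0, "G")],
            ("A", "risky"): [(0.5, "G"), (0.5, "A")]}

def step(state, action, rng):
    """Simulate one action execution; return (cost, successor state)."""
    r, acc = rng.random(), 0.0
    for prob, successor in OUTCOMES[(state, action)]:
        acc += prob
        if r < acc:
            break
    return COST[(state, action)], successor

def q_learning(episodes, seed=0):
    rng = random.Random(seed)
    Q = {s: {a: 0.0 for a in acts} for s, acts in ACTIONS.items()}
    for _ in range(episodes):
        s = "A"
        while s != "G":                              # stop in the goal state
            if rng.random() < EPSILON:               # explore
                a = rng.choice(ACTIONS[s])
            else:                                    # exploit current belief
                a = min(Q[s], key=Q[s].get)
            c, s2 = step(s, a, rng)
            v_next = 0.0 if s2 == "G" else min(Q[s2].values())
            Q[s][a] += ALPHA * (c + GAMMA * v_next - Q[s][a])
            s = s2
    return Q

Q = q_learning(20000)
print(Q)
```

With these numbers the exact values are Q(A, safe) = 3 and Q(A, risky) = 1/0.55, about 1.82. With a constant learning rate the estimate for "risky" keeps fluctuating around its exact value rather than settling on it.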
1 Ares Between Curves c 22 Donld Kreider nd Dwight Lhr We know tht if f is continuous nonnegtive function on the intervl [, b], then f(x) dx is the re under the grph of f nd bove the intervl. We re going
More informationMORE FUNCTION GRAPHING; OPTIMIZATION. (Last edited October 28, 2013 at 11:09pm.)
MORE FUNCTION GRAPHING; OPTIMIZATION FRI, OCT 25, 203 (Lst edited October 28, 203 t :09pm.) Exercise. Let n be n rbitrry positive integer. Give n exmple of function with exctly n verticl symptotes. Give
More informationNew data structures to reduce data size and search time
New dt structures to reduce dt size nd serch time Tsuneo Kuwbr Deprtment of Informtion Sciences, Fculty of Science, Kngw University, Hirtsuk-shi, Jpn FIT2018 1D-1, No2, pp1-4 Copyright (c)2018 by The Institute
More informationMath 116 Calculus II
Mth 6 Clculus II Contents 5 Exponentil nd Logrithmic functions 5. Review........................................... 5.. Exponentil functions............................... 5.. Logrithmic functions...............................
More informationPh2b Quiz - 1. Instructions
Ph2b Winter 217-18 Quiz - 1 Due Dte: Mondy, Jn 29, 218 t 4pm Ph2b Quiz - 1 Instructions 1. Your solutions re due by Mondy, Jnury 29th, 218 t 4pm in the quiz box outside 21 E. Bridge. 2. Lte quizzes will
More informationThe ifs Package. December 28, 2005
The if Pckge December 28, 2005 Verion 0.1-1 Title Iterted Function Sytem Author S. M. Icu Mintiner S. M. Icu Iterted Function Sytem Licene GPL Verion 2 or lter. R topic documented:
More information1 The Riemann Integral
The Riemnn Integrl. An exmple leding to the notion of integrl (res) We know how to find (i.e. define) the re of rectngle (bse height), tringle ( (sum of res of tringles). But how do we find/define n re
More informationPrecalculus Spring 2017
Preclculus Spring 2017 Exm 3 Summry (Section 4.1 through 5.2, nd 9.4) Section P.5 Find domins of lgebric expressions Simplify rtionl expressions Add, subtrct, multiply, & divide rtionl expressions Simplify
More informationMA FINAL EXAM INSTRUCTIONS
MA 33 FINAL EXAM INSTRUCTIONS NAME INSTRUCTOR. Intructor nme: Chen, Dong, Howrd, or Lundberg 2. Coure number: MA33. 3. SECTION NUMBERS: 6 for MWF :3AM-:2AM REC 33 cl by Erik Lundberg 7 for MWF :3AM-:2AM
More informationTHE KENNESAW STATE UNIVERSITY HIGH SCHOOL MATHEMATICS COMPETITION PART I MULTIPLE CHOICE NO CALCULATORS 90 MINUTES
THE 08 09 KENNESW STTE UNIVERSITY HIGH SHOOL MTHEMTIS OMPETITION PRT I MULTIPLE HOIE For ech of the following questions, crefully blcken the pproprite box on the nswer sheet with # pencil. o not fold,
More informationWhat's Your Body Composition?
Wht' Your Body Compoition? DETERMINING YOUR BODY FAT The firt tep determ your compoition i clculte your body ft percente of your tl weiht. Refer now the workheet for comput your percente of body ft. (The
More informationDo the one-dimensional kinetic energy and momentum operators commute? If not, what operator does their commutator represent?
1 Problem 1 Do the one-dimensionl kinetic energy nd momentum opertors commute? If not, wht opertor does their commuttor represent? KE ˆ h m d ˆP i h d 1.1 Solution This question requires clculting the
More informationDuality # Second iteration for HW problem. Recall our LP example problem we have been working on, in equality form, is given below.
Dulity #. Second itertion for HW problem Recll our LP emple problem we hve been working on, in equlity form, is given below.,,,, 8 m F which, when written in slightly different form, is 8 F Recll tht we
More informationHidden Markov Models
Hidden Mrkov Models Huptseminr Mchine Lerning 18.11.2003 Referent: Nikols Dörfler 1 Overview Mrkov Models Hidden Mrkov Models Types of Hidden Mrkov Models Applictions using HMMs Three centrl problems:
More informationCS 188: Artificial Intelligence Fall Announcements
CS 188: Artificil Intelligence Fll 2009 Lecture 20: Prticle Filtering 11/5/2009 Dn Klein UC Berkeley Announcements Written 3 out: due 10/12 Project 4 out: due 10/19 Written 4 proly xed, Project 5 moving
More information