- Benjamin Knight
- 6 years ago
Transcription
Chapter 4: Dynamic Programming

Objectives of this chapter:
- Overview of a collection of classical solution methods for MDPs known as dynamic programming (DP)
- Show how DP can be used to compute value functions, and hence, optimal policies
- Discuss efficiency and utility of DP

R. S. Sutton and A. G. Barto: Reinforcement Learning: An Introduction

Policy Evaluation

Policy evaluation: for a given policy $\pi$, compute the state-value function $V^\pi$.

Recall the state-value function for policy $\pi$:

$$V^\pi(s) = E_\pi\{ R_t \mid s_t = s \} = E_\pi\Big\{ \sum_{k=0}^{\infty} \gamma^k r_{t+k+1} \,\Big|\, s_t = s \Big\}$$

Bellman equation for $V^\pi$:

$$V^\pi(s) = \sum_a \pi(s,a) \sum_{s'} P^a_{ss'} \left[ R^a_{ss'} + \gamma V^\pi(s') \right]$$

This is a system of $|S|$ simultaneous linear equations.
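Because the Bellman equation for a fixed policy is linear in the unknown values, it can in principle be solved directly. The sketch below is one way to do that, not the slides' own method. It assumes the policy has already been folded into a state-to-state matrix `P_pi[s][s2]` (the sum over actions of $\pi(s,a) P^a_{ss'}$) and an expected one-step reward vector `r_pi[s]`. Both names, the helper `solve_linear`, and the two-state example are hypothetical and chosen only for illustration.

```python
def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Build an augmented matrix so the caller's inputs are not modified.
    M = [[float(v) for v in A[i]] + [float(b[i])] for i in range(n)]
    for col in range(n):
        # Swap in the row with the largest pivot for numerical stability.
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    # Back substitution.
    x = [0.0] * n
    for i in reversed(range(n)):
        known = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - known) / M[i][i]
    return x


def evaluate_policy_exactly(P_pi, r_pi, gamma):
    """Solve (I - gamma * P_pi) V = r_pi, the Bellman equation in matrix form."""
    n = len(r_pi)
    A = [[(1.0 if i == j else 0.0) - gamma * P_pi[i][j] for j in range(n)]
         for i in range(n)]
    return solve_linear(A, r_pi)


# Hypothetical two-state example: state 1 is absorbing with zero reward.
P_pi = [[0.5, 0.5],
        [0.0, 1.0]]
r_pi = [1.0, 0.0]
V = evaluate_policy_exactly(P_pi, r_pi, gamma=0.9)
```

For this example the equations reduce to V(1) = 0 and V(0) = 1 + 0.45 V(0). A direct solve costs roughly cubic time in the number of states, which is one reason iterative methods are usually preferred for large state sets.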
Iterative Methods

$$V_0 \to V_1 \to \cdots \to V_k \to V_{k+1} \to \cdots \to V^\pi$$

Each arrow is a "sweep". A sweep consists of applying a backup operation to each state.

A full policy-evaluation backup:

$$V_{k+1}(s) \leftarrow \sum_a \pi(s,a) \sum_{s'} P^a_{ss'} \left[ R^a_{ss'} + \gamma V_k(s') \right]$$

Iterative Policy Evaluation
[Figure not transcribed]
A Small Gridworld

- An undiscounted episodic task
- Nonterminal states: 1, 2, ..., 14
- One terminal state (shown twice as shaded squares)
- Actions that would take the agent off the grid leave the state unchanged
- Reward is -1 until the terminal state is reached

Iterative Policy Eval for the Small Gridworld

$\pi$ = equiprobable random action choices
[Figure not transcribed]
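The gridworld evaluation can be sketched in a few lines. The code below is a minimal illustration, assuming a 4x4 grid with cells numbered 0 to 15 row by row, the two shaded terminal squares at cells 0 and 15, and a two-array (synchronous) sweep. The names `step`, `policy_evaluation`, and the threshold `theta` are choices made here, not part of the slides.

```python
N = 4                                          # the grid is N x N
TERMINAL = {0, N * N - 1}                      # the two shaded corner squares
ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]   # up, down, left, right


def step(state, action):
    """Deterministic move; an off-grid move leaves the state unchanged."""
    row, col = divmod(state, N)
    r, c = row + action[0], col + action[1]
    if not (0 <= r < N and 0 <= c < N):
        r, c = row, col
    return r * N + c, -1.0                     # reward is -1 on every transition


def policy_evaluation(theta=1e-10, gamma=1.0):
    """Evaluate the equiprobable random policy by repeated full sweeps."""
    V = [0.0] * (N * N)
    while True:
        delta = 0.0
        new_V = V[:]                           # back up from the old values only
        for s in range(N * N):
            if s in TERMINAL:
                continue                       # terminal value stays at 0
            total = 0.0
            for a in ACTIONS:
                s2, reward = step(s, a)
                total += 0.25 * (reward + gamma * V[s2])
            new_V[s] = total
            delta = max(delta, abs(total - V[s]))
        V = new_V
        if delta < theta:                      # stop when a sweep changes little
            return V


V = policy_evaluation()
```

With these assumptions the top row converges to approximately 0, -14, -20, -22, matching the values usually shown for this example.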
Policy Improvement

Suppose we have computed $V^\pi$ for a deterministic policy $\pi$. For a given state $s$, would it be better to do an action $a \neq \pi(s)$?

The value of doing $a$ in state $s$ is:

$$Q^\pi(s,a) = E_\pi\{ r_{t+1} + \gamma V^\pi(s_{t+1}) \mid s_t = s, a_t = a \} = \sum_{s'} P^a_{ss'} \left[ R^a_{ss'} + \gamma V^\pi(s') \right]$$

It is better to switch to action $a$ for state $s$ if and only if $Q^\pi(s,a) > V^\pi(s)$.

Policy Improvement Cont.

Do this for all states to get a new policy $\pi'$ that is greedy with respect to $V^\pi$:

$$\pi'(s) = \arg\max_a Q^\pi(s,a) = \arg\max_a \sum_{s'} P^a_{ss'} \left[ R^a_{ss'} + \gamma V^\pi(s') \right]$$

Then $V^{\pi'} \geq V^\pi$.
Policy Improvement Cont.

What if $V^{\pi'} = V^\pi$? That is, for all $s \in S$,

$$V^{\pi'}(s) = \max_a \sum_{s'} P^a_{ss'} \left[ R^a_{ss'} + \gamma V^\pi(s') \right]?$$

But this is the Bellman Optimality Equation. So $V^{\pi'} = V^*$ and both $\pi$ and $\pi'$ are optimal policies.

Policy Iteration

$$\pi_0 \to V^{\pi_0} \to \pi_1 \to V^{\pi_1} \to \cdots \to \pi^* \to V^* \to \pi^*$$

The arrows alternate between policy evaluation and policy improvement ("greedification").
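The alternation of evaluation and greedy improvement can be sketched as follows. This is an illustrative version under stated assumptions, not the algorithm box from the slides: the model is a nested list `P[s][a]` of `(prob, next_state, reward)` triples, evaluation is done in place, and the three-state chain at the end is a made-up example with discounting so that every policy has finite values.

```python
def policy_iteration(P, gamma=0.9, theta=1e-10):
    """P[s][a] is a list of (prob, next_state, reward) triples."""
    n_states = len(P)
    policy = [0] * n_states            # start from an arbitrary policy
    V = [0.0] * n_states

    def q(s, a):
        # One-step lookahead value of taking action a in state s.
        return sum(p * (r + gamma * V[s2]) for p, s2, r in P[s][a])

    while True:
        # Policy evaluation: sweep until the values stop changing.
        while True:
            delta = 0.0
            for s in range(n_states):
                v = q(s, policy[s])
                delta = max(delta, abs(v - V[s]))
                V[s] = v
            if delta < theta:
                break
        # Policy improvement: switch only on a strict gain in Q.
        stable = True
        for s in range(n_states):
            best = max(range(len(P[s])), key=lambda a: q(s, a))
            if q(s, best) > q(s, policy[s]) + 1e-9:
                policy[s] = best
                stable = False
        if stable:
            return policy, V


# Hypothetical chain: action 0 stays put, action 1 moves right.
# Entering state 2 pays reward 1; state 2 is absorbing with zero reward.
P = [
    [[(1.0, 0, 0.0)], [(1.0, 1, 0.0)]],
    [[(1.0, 1, 0.0)], [(1.0, 2, 1.0)]],
    [[(1.0, 2, 0.0)], [(1.0, 2, 0.0)]],
]
policy, V = policy_iteration(P)
```

Requiring a strict improvement before switching actions keeps the loop from cycling between equally good actions, which mirrors the "if and only if $Q^\pi(s,a) > V^\pi(s)$" condition.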
Policy Iteration
[Figure not transcribed]

Jack's Car Rental

- $10 for each car rented (must be available when the request is received)
- Two locations, maximum of 20 cars at each
- Cars are returned and requested randomly
  - Poisson distribution: $n$ returns/requests with probability $\frac{\lambda^n}{n!} e^{-\lambda}$
  - 1st location: average requests = 3, average returns = 3
  - 2nd location: average requests = 4, average returns = 2
- Can move up to 5 cars between locations overnight
- States, Actions, Rewards?
- Transition probabilities?
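The Poisson probabilities are the building block for the transition model of this problem. The sketch below computes them and, as one possible use, the expected rental income at a single location. It assumes that requests beyond the cars on hand are simply lost and ignores cars returned during the same day. The function names and the `price` parameter are illustrative.

```python
import math


def poisson(n, lam):
    """Probability of exactly n events when the mean is lam."""
    return lam ** n / math.factorial(n) * math.exp(-lam)


def expected_rental_income(cars_available, mean_requests, price=10.0):
    """Expected income when rentals = min(requests, cars_available)."""
    # Probabilities of fewer requests than cars: every request is served.
    p_below = [poisson(n, mean_requests) for n in range(cars_available)]
    expected_rentals = sum(n * p for n, p in enumerate(p_below))
    # All remaining probability mass rents out every available car.
    expected_rentals += cars_available * (1.0 - sum(p_below))
    return price * expected_rentals


p_three_requests = poisson(3, 3.0)             # first location, mean 3
income_full_lot = expected_rental_income(20, 3.0)
```

With 20 cars on hand and a mean of 3 requests, almost no request is lost, so the expected income is very close to $30.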
Jack's Car Rental
[Figure not transcribed]

Jack's CR Exercise

- Suppose the first car moved is free
  - From 1st to 2nd location
  - Because an employee travels that way anyway (by bus)
- Suppose only 10 cars can be parked for free at each location
  - More than 10 cost $4 for using an extra parking lot
- Such arbitrary nonlinearities are common in real problems
Value Iteration

Recall the full policy-evaluation backup:

$$V_{k+1}(s) \leftarrow \sum_a \pi(s,a) \sum_{s'} P^a_{ss'} \left[ R^a_{ss'} + \gamma V_k(s') \right]$$

Here is the full value-iteration backup:

$$V_{k+1}(s) \leftarrow \max_a \sum_{s'} P^a_{ss'} \left[ R^a_{ss'} + \gamma V_k(s') \right]$$

Value Iteration Cont.
[Figure not transcribed]
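The only change from policy evaluation is that the sum over the policy's action probabilities becomes a max over actions. A minimal sketch, assuming the same hypothetical model format as a nested list `P[s][a]` of `(prob, next_state, reward)` triples, with a made-up chain in which moving right succeeds with probability 0.8:

```python
def value_iteration(P, gamma=0.9, theta=1e-12):
    """P[s][a] is a list of (prob, next_state, reward) triples."""
    V = [0.0] * len(P)

    def backup(s, a):
        return sum(p * (r + gamma * V[s2]) for p, s2, r in P[s][a])

    while True:
        delta = 0.0
        for s in range(len(P)):
            # Full value-iteration backup: max over actions, not an average.
            best = max(backup(s, a) for a in range(len(P[s])))
            delta = max(delta, abs(best - V[s]))
            V[s] = best
        if delta < theta:
            break
    # Read off a greedy policy from the converged values.
    policy = [max(range(len(P[s])), key=lambda a: backup(s, a))
              for s in range(len(P))]
    return V, policy


# Hypothetical chain: action 0 stays, action 1 moves right with prob. 0.8
# and otherwise stays. Entering state 2 pays 1; state 2 is absorbing.
P = [
    [[(1.0, 0, 0.0)], [(0.8, 1, 0.0), (0.2, 0, 0.0)]],
    [[(1.0, 1, 0.0)], [(0.8, 2, 1.0), (0.2, 1, 0.0)]],
    [[(1.0, 2, 0.0)], [(1.0, 2, 0.0)]],
]
V, policy = value_iteration(P)
```

In this example the fixed point can be checked by hand: V(1) = 0.8 + 0.18 V(1) and V(0) = 0.72 V(1) + 0.18 V(0).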
Gambler's Problem

- Gambler can repeatedly bet $ on a coin flip
- Heads he wins his stake, tails he loses it
- Initial capital $\in \{\$1, \$2, \ldots, \$99\}$
- Gambler wins if his capital becomes $100, loses if it becomes $0
- Coin is unfair: heads (gambler wins) with probability $p = 0.4$
- States, Actions, Rewards?

Gambler's Problem Solution
[Figure not transcribed]
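One common way to set this up, offered here as an assumption rather than the slides' stated answer, is to take the capital as the state, the stake as the action, and give reward +1 only on reaching the goal, with no discounting. The sketch below runs in-place value iteration under that formulation. The +1 reward is encoded as a fixed value of 1 at the goal state.

```python
def gambler_value_iteration(p_heads=0.4, goal=100, theta=1e-9):
    """Value of each capital level = probability of reaching the goal."""
    V = [0.0] * (goal + 1)
    V[goal] = 1.0                      # stands in for reward +1 on winning
    while True:
        delta = 0.0
        for s in range(1, goal):
            # Stakes range from 1 up to what is held or what is still needed.
            best = max(
                p_heads * V[s + stake] + (1.0 - p_heads) * V[s - stake]
                for stake in range(1, min(s, goal - s) + 1)
            )
            delta = max(delta, abs(best - V[s]))
            V[s] = best
        if delta < theta:
            return V


V = gambler_value_iteration()
```

Under this formulation the value at a capital of $50 comes out as 0.4, which is what betting everything on one flip achieves.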
Herd Management

- You are a consultant to a farmer managing a herd of cows
- Herd consists of 5 kinds of cows:
  - Young
  - Milking
  - Breeding
  - Old
  - Sick
- Number of each kind is the State
- Number sold of each kind is the Action
- Cows transition from one kind to another
- Young cows can be born

Asynchronous DP

- All the DP methods described so far require exhaustive sweeps of the entire state set.
- Asynchronous DP does not use sweeps. Instead it works like this:
  - Repeat until a convergence criterion is met: pick a state at random and apply the appropriate backup
- Still need lots of computation, but does not get locked into hopelessly long sweeps
- Can you select states to back up intelligently? YES: an agent's experience can act as a guide.
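The random-state scheme can be sketched as below. This is an illustration only: it uses the value-iteration backup, a fixed budget of backups instead of a real convergence test, and a made-up five-state corridor. The helper `build_corridor`, the model format, and the `seed` parameter are all assumptions introduced here.

```python
import random


def build_corridor(n=5):
    """Hypothetical corridor: action 0 moves left, action 1 moves right.

    Entering the last state pays reward 1; the last state is absorbing.
    """
    P = []
    for s in range(n):
        if s == n - 1:
            P.append([[(1.0, s, 0.0)], [(1.0, s, 0.0)]])
            continue
        left = max(s - 1, 0)
        right = s + 1
        reward_right = 1.0 if right == n - 1 else 0.0
        P.append([[(1.0, left, 0.0)], [(1.0, right, reward_right)]])
    return P


def async_value_iteration(P, gamma=0.9, n_backups=2000, seed=0):
    """Back up one randomly chosen state at a time, with no sweeps."""
    rng = random.Random(seed)          # seeded so runs are repeatable
    V = [0.0] * len(P)
    for _ in range(n_backups):
        s = rng.randrange(len(P))
        V[s] = max(
            sum(p * (r + gamma * V[s2]) for p, s2, r in outcomes)
            for outcomes in P[s]
        )
    return V


V = async_value_iteration(build_corridor())
```

Each backup immediately uses whatever values are currently stored, so information about the reward propagates backwards along the corridor as soon as the relevant states happen to be picked.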
Generalized Policy Iteration

Generalized Policy Iteration (GPI): any interaction of policy evaluation and policy improvement, independent of their granularity.

A geometric metaphor for convergence of GPI:
[Figure not transcribed]

Efficiency of DP

- Finding an optimal policy is polynomial in the number of states
- BUT, the number of states is often astronomical, e.g., often growing exponentially with the number of state variables (what Bellman called "the curse of dimensionality").
- In practice, classical DP can be applied to problems with a few million states.
- Asynchronous DP can be applied to larger problems, and is appropriate for parallel computation.
- It is surprisingly easy to come up with MDPs for which DP methods are not practical.
Summary

- Policy evaluation: backups without a max
- Policy improvement: form a greedy policy, if only locally
- Policy iteration: alternate the above two processes
- Value iteration: backups with a max
- Full backups (to be contrasted later with sample backups)
- Generalized Policy Iteration (GPI)
- Asynchronous DP: a way to avoid exhaustive sweeps
- Bootstrapping: updating estimates based on other estimates