Administrivia CSE 190: Reinforcement Learning: An Introduction
- Rudolf Hunter
- 5 years ago
Transcription
Administrivia
CSE 190: Reinforcement Learning: An Introduction
Any email sent to me about the course should have "CSE 190" in the subject line!

Chapter 4: Dynamic Programming
Acknowledgment: A good number of these slides are cribbed from Rich Sutton.

Goals for this chapter
- Overview of a collection of classical solution methods for MDPs known as dynamic programming (DP)
- Show how DP can be used to compute value functions, and hence, optimal policies
- Discuss efficiency and utility of DP

Last Time: Value Functions
The value of a state is the expected return starting from that state; it depends on the agent's policy.
State-value function for policy $\pi$:
$$V^{\pi}(s) = E_{\pi}\{R_t \mid s_t = s\} = E_{\pi}\Big\{\sum_{k=0}^{\infty} \gamma^{k} r_{t+k+1} \,\Big|\, s_t = s\Big\}$$
The value of taking an action in a state under policy $\pi$ is the expected return starting from that state, taking that action, and thereafter following $\pi$.
Action-value function for policy $\pi$:
$$Q^{\pi}(s,a) = E_{\pi}\{R_t \mid s_t = s, a_t = a\} = E_{\pi}\Big\{\sum_{k=0}^{\infty} \gamma^{k} r_{t+k+1} \,\Big|\, s_t = s, a_t = a\Big\}$$

CSE 190: Reinforcement Learning, Lecture on Chapter 4
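Last Time: Bellman Equation for Policy $\pi$
The basic idea:
$$R_t = r_{t+1} + \gamma r_{t+2} + \gamma^{2} r_{t+3} + \gamma^{3} r_{t+4} + \cdots$$
$$\phantom{R_t} = r_{t+1} + \gamma\,(r_{t+2} + \gamma r_{t+3} + \gamma^{2} r_{t+4} + \cdots)$$
$$\phantom{R_t} = r_{t+1} + \gamma R_{t+1}$$
So:
$$V^{\pi}(s) = E_{\pi}\{R_t \mid s_t = s\} = E_{\pi}\{r_{t+1} + \gamma V^{\pi}(s_{t+1}) \mid s_t = s\}$$
Or, without the expectation operator:
$$V^{\pi}(s) = \sum_{a} \pi(s,a) \sum_{s'} P^{a}_{ss'} \big[R^{a}_{ss'} + \gamma V^{\pi}(s')\big]$$

Last Time: More on the Bellman Equation
$$V^{\pi}(s) = \sum_{a} \pi(s,a) \sum_{s'} P^{a}_{ss'} \big[R^{a}_{ss'} + \gamma V^{\pi}(s')\big]$$
This is a set of equations (in fact, linear), one for each state. The value function for $\pi$ is its unique solution.
Backup diagrams:
[Figure: backup diagrams for $V^{\pi}$ and for $Q^{\pi}$]

Last Time: Bellman Optimality Equation for V*
The value of a state under an optimal policy must equal the expected return for the best action from that state:
$$V^{*}(s) = \max_{a \in A(s)} Q^{\pi^{*}}(s,a) = \max_{a \in A(s)} E\{r_{t+1} + \gamma V^{*}(s_{t+1}) \mid s_t = s, a_t = a\} = \max_{a \in A(s)} \sum_{s'} P^{a}_{ss'} \big[R^{a}_{ss'} + \gamma V^{*}(s')\big]$$
The relevant backup diagram:
[Figure: backup diagram for V*]
V* is the unique solution of this system of nonlinear equations.

Last Time: Bellman Optimality Equation for Q*
$$Q^{*}(s,a) = E\{r_{t+1} + \gamma \max_{a'} Q^{*}(s_{t+1}, a') \mid s_t = s, a_t = a\} = \sum_{s'} P^{a}_{ss'} \big[R^{a}_{ss'} + \gamma \max_{a'} Q^{*}(s', a')\big]$$
The relevant backup diagram:
[Figure: backup diagram for Q*]
Q* is the unique solution of this system of nonlinear equations.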
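The recursion $R_t = r_{t+1} + \gamma R_{t+1}$ behind the Bellman equation can be checked numerically. The following is a minimal sketch, not part of the original slides; the reward sequence and the value of `gamma` are made-up illustration values, and the function names are hypothetical.

```python
def discounted_return(rewards, gamma):
    """Direct sum: R_t = sum over k of gamma^k * r_{t+k+1}."""
    return sum((gamma ** k) * r for k, r in enumerate(rewards))


def returns_backward(rewards, gamma):
    """All returns R_0 .. R_{T-1}, using R_t = r_{t+1} + gamma * R_{t+1}."""
    returns = [0.0] * len(rewards)
    next_return = 0.0  # the return after the final reward is zero
    for t in reversed(range(len(rewards))):
        next_return = rewards[t] + gamma * next_return
        returns[t] = next_return
    return returns


# Hypothetical episode: rewards[t] plays the role of r_{t+1}.
rewards = [1.0, 0.0, 2.0]
gamma = 0.5

direct = discounted_return(rewards, gamma)
recursive = returns_backward(rewards, gamma)
print(direct, recursive)
```

Both routes give the same return for the first time step, which is the identity the Bellman equation takes expectations of.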
This Time
- How to solve these equations using iteration
- Can solve for optimal V*
- But often it is faster to evaluate and improve the policy
- Alternating figuring out $V^{\pi}$ and improving $\pi$

Policy Evaluation
Policy evaluation: for a given policy $\pi$, compute the state-value function $V^{\pi}$.
Recall: state-value function for policy $\pi$:
$$V^{\pi}(s) = E_{\pi}\{R_t \mid s_t = s\} = E_{\pi}\Big\{\sum_{k=0}^{\infty} \gamma^{k} r_{t+k+1} \,\Big|\, s_t = s\Big\}$$
Bellman equation for $V^{\pi}$:
$$V^{\pi}(s) = \sum_{a} \pi(s,a) \sum_{s'} P^{a}_{ss'} \big[R^{a}_{ss'} + \gamma V^{\pi}(s')\big]$$
This is a system of $|S|$ simultaneous linear equations.

Iterative Methods
$$V_0 \rightarrow V_1 \rightarrow \cdots \rightarrow V_k \rightarrow V_{k+1} \rightarrow \cdots \rightarrow V^{\pi}$$
Each arrow is one "sweep". A sweep consists of applying a backup operation to each state.
A full policy-evaluation backup:
$$V_{k+1}(s) \leftarrow \sum_{a} \pi(s,a) \sum_{s'} P^{a}_{ss'} \big[R^{a}_{ss'} + \gamma V_k(s')\big]$$
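The full policy-evaluation backup can be sketched in a few lines of Python. This is an illustrative sketch only: the dictionary layout `P[state][action] = [(probability, next_state, reward), ...]`, the two-state MDP, and the stopping threshold `theta` are assumptions made for the example, not something given on the slides.

```python
def policy_evaluation(P, policy, gamma, theta=1e-10):
    """Repeat full sweeps of the policy-evaluation backup until values settle."""
    V = {s: 0.0 for s in P}  # V_0 is arbitrary; zeros are a convenient start
    while True:
        new_V = {}
        for s in P:  # one sweep: back up every state from the old values V_k
            new_V[s] = sum(
                policy[s][a] * sum(p * (r + gamma * V[s2]) for p, s2, r in P[s][a])
                for a in P[s]
            )
        delta = max(abs(new_V[s] - V[s]) for s in P)
        V = new_V
        if delta < theta:  # largest change in the sweep is negligible
            return V


# Hypothetical MDP: state 1 is absorbing with zero reward.
P = {
    0: {"stay": [(1.0, 0, 0.0)], "go": [(1.0, 1, 1.0)]},
    1: {"stay": [(1.0, 1, 0.0)], "go": [(1.0, 1, 0.0)]},
}
policy = {s: {"stay": 0.5, "go": 0.5} for s in P}  # equiprobable random policy

V = policy_evaluation(P, policy, gamma=0.9)
print(V)
```

For this toy problem the Bellman equation can also be solved by hand: $V(0) = 0.5 \cdot 0.9\, V(0) + 0.5$, so $V(0) = 0.5 / 0.55$, which the iteration approaches.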
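Iterative Policy Evaluation
[Figure: Iterative Policy Evaluation]

A Small Gridworld
[Figure: A Small Gridworld]
- An undiscounted episodic task
- Nonterminal states: 1, 2, ..., 14
- One terminal state (shown twice as shaded squares)
- Actions that would take the agent off the grid leave the state unchanged
- Reward is -1 until the terminal state is reached

A Small Gridworld
Note here that the actions are deterministic, so this equation:
$$V_{k+1}(s) \leftarrow \sum_{a} \pi(s,a) \sum_{s'} P^{a}_{ss'} \big[R^{a}_{ss'} + \gamma V_k(s')\big]$$
becomes:
$$V_{k+1}(s) \leftarrow \sum_{a} \pi(s,a) \big[R^{a}_{ss'} + \gamma V_k(s')\big]$$
And it is undiscounted, so this becomes:
$$V_{k+1}(s) \leftarrow \sum_{a} \pi(s,a) \big[R^{a}_{ss'} + V_k(s')\big]$$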
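A Small Gridworld
$$V_{k+1}(s) \leftarrow \sum_{a} \pi(s,a) \big[R^{a}_{ss'} + V_k(s')\big]$$
Writing out the four actions, with $s'$, $s''$, $s'''$, $s''''$ the successor states under UP, RIGHT, DOWN and LEFT:
$$V_{k+1}(s) \leftarrow \pi(s,\mathrm{UP})\big[R^{\mathrm{UP}}_{ss'} + V_k(s')\big] + \pi(s,\mathrm{RIGHT})\big[R^{\mathrm{RIGHT}}_{ss''} + V_k(s'')\big] + \pi(s,\mathrm{DOWN})\big[R^{\mathrm{DOWN}}_{ss'''} + V_k(s''')\big] + \pi(s,\mathrm{LEFT})\big[R^{\mathrm{LEFT}}_{ss''''} + V_k(s'''')\big]$$
With $\pi(s,a) = 0.25$ for every action and a reward of $-1$ on every step:
$$V_{k+1}(s) \leftarrow 0.25\,(-1 + V_k(s')) + 0.25\,(-1 + V_k(s'')) + 0.25\,(-1 + V_k(s''')) + 0.25\,(-1 + V_k(s''''))$$
For state 4, for example, we have (UP, RIGHT, DOWN, LEFT in that order; LEFT would leave the grid, so the state stays 4):
$$V_{k+1}(4) \leftarrow 0.25\,(-1 + V_k(\mathrm{terminal})) + 0.25\,(-1 + V_k(5)) + 0.25\,(-1 + V_k(8)) + 0.25\,(-1 + V_k(4))$$
With $V_1(\mathrm{terminal}) = 0$ and $V_1(s) = -1$ for every nonterminal state:
$$V_2(4) \leftarrow 0.25\,(-1 + 0) + 0.25\,(-1 + -1) + 0.25\,(-1 + -1) + 0.25\,(-1 + -1) = -1.75$$

Iterative Policy Evaluation for the Small Gridworld
$\pi$ = equiprobable random action choices
[Figure: Iterative Policy Evaluation for the Small Gridworld]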
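The arithmetic for state 4 can be reproduced with a short script. This is a sketch under assumptions: the cells are numbered row by row from 0 to 15 with the terminal state occupying cells 0 and 15, which matches the slide's statement that UP from state 4 reaches the terminal state and LEFT leaves it unchanged; the function names are hypothetical.

```python
N = 4                          # 4x4 grid, cells numbered row by row
TERMINAL = {0, N * N - 1}      # the single terminal state, drawn in two corners
MOVES = {"UP": (-1, 0), "RIGHT": (0, 1), "DOWN": (1, 0), "LEFT": (0, -1)}


def step(s, move):
    """Deterministic successor; moves off the grid leave the state unchanged."""
    row, col = divmod(s, N)
    d_row, d_col = MOVES[move]
    new_row, new_col = row + d_row, col + d_col
    if 0 <= new_row < N and 0 <= new_col < N:
        return new_row * N + new_col
    return s


def sweep(V):
    """One full sweep for the equiprobable random policy, reward -1, no discount."""
    new_V = list(V)
    for s in range(N * N):
        if s not in TERMINAL:
            new_V[s] = sum(0.25 * (-1.0 + V[step(s, m)]) for m in MOVES)
    return new_V


V0 = [0.0] * (N * N)
V1 = sweep(V0)
V2 = sweep(V1)
print(V2[4])  # the hand computation above gives -1.75

# Keep sweeping until the values stop changing to approximate V^pi.
V_pi = V2
while True:
    next_V = sweep(V_pi)
    delta = max(abs(a - b) for a, b in zip(next_V, V_pi))
    V_pi = next_V
    if delta < 1e-10:
        break
print([round(v, 1) for v in V_pi])
```

The second printout lets the converged values be compared against the figure for the random policy.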
Iterative Policy Evaluation for the Small Gridworld
$\pi$ = equiprobable random action choices
[Figure: Iterative Policy Evaluation for the Small Gridworld]
But look what happens if these values are used to make a new policy! (Note: this won't always happen!)
Exercise for the reader: What are the values of the states under the optimal policy?

Policy Improvement
Suppose we have computed $V^{\pi}$ for a deterministic policy $\pi$.
For a given state $s$, would it be better to do an action $a \neq \pi(s)$?
The value of doing $a$ in state $s$ is:
$$Q^{\pi}(s,a) = E_{\pi}\{r_{t+1} + \gamma V^{\pi}(s_{t+1}) \mid s_t = s, a_t = a\} = \sum_{s'} P^{a}_{ss'} \big[R^{a}_{ss'} + \gamma V^{\pi}(s')\big]$$
It is better to switch to action $a$ for state $s$ if and only if
$$Q^{\pi}(s,a) > V^{\pi}(s)$$

Policy Improvement Cont.
Do this for all states to get a new policy $\pi'$ that is greedy with respect to $V^{\pi}$:
$$\pi'(s) = \arg\max_{a} Q^{\pi}(s,a) = \arg\max_{a} \sum_{s'} P^{a}_{ss'} \big[R^{a}_{ss'} + \gamma V^{\pi}(s')\big]$$
Then $V^{\pi'} \geq V^{\pi}$.

Policy Improvement Cont.
What if $V^{\pi'} = V^{\pi}$? That is, for all $s \in S$,
$$V^{\pi'}(s) = \max_{a} \sum_{s'} P^{a}_{ss'} \big[R^{a}_{ss'} + \gamma V^{\pi}(s')\big]\,?$$
But this is the Bellman Optimality Equation.
So $V^{\pi'} = V^{*}$ and both $\pi$ and $\pi'$ are optimal policies.
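The step "Then $V^{\pi'} \geq V^{\pi}$" is stated without its reasoning. A sketch of the standard argument (the policy improvement theorem, as given in Sutton and Barto's textbook, not on these slides) repeatedly applies the greedy condition $Q^{\pi}(s, \pi'(s)) \geq V^{\pi}(s)$ and unrolls one step at a time:

```latex
\begin{align*}
V^{\pi}(s) &\le Q^{\pi}(s, \pi'(s)) \\
  &= E_{\pi'}\{ r_{t+1} + \gamma V^{\pi}(s_{t+1}) \mid s_t = s \} \\
  &\le E_{\pi'}\{ r_{t+1} + \gamma Q^{\pi}(s_{t+1}, \pi'(s_{t+1})) \mid s_t = s \} \\
  &= E_{\pi'}\{ r_{t+1} + \gamma r_{t+2} + \gamma^{2} V^{\pi}(s_{t+2}) \mid s_t = s \} \\
  &\;\;\vdots \\
  &\le E_{\pi'}\{ r_{t+1} + \gamma r_{t+2} + \gamma^{2} r_{t+3} + \cdots \mid s_t = s \} \\
  &= V^{\pi'}(s)
\end{align*}
```

Each inequality uses the greedy condition at the next state visited, so the bound holds for every state $s$.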
Policy Iteration
$$\pi_0 \rightarrow V^{\pi_0} \rightarrow \pi_1 \rightarrow V^{\pi_1} \rightarrow \cdots \rightarrow \pi^{*} \rightarrow V^{*} \rightarrow \pi^{*}$$
The arrows alternate between policy evaluation and policy improvement ("greedification").

Policy Iteration
[Figure: Policy Iteration]

Jack's Car Rental
- $10 for each car rented (must be available when request rec'd)
- Two locations, maximum of 20 cars each
- Cars returned and requested randomly
- Poisson distribution: $n$ returns/requests with probability $\frac{\lambda^{n}}{n!} e^{-\lambda}$, where $\lambda$ is the expected number
- 1st location: average requests = 3, average returns = 3
- 2nd location: average requests = 4, average returns = 2
- Can move up to 5 cars between locations overnight at $2/car
- States, Actions, Rewards?
- Transition probabilities?

Jack's Car Rental
[Figure: Jack's Car Rental]
Note this makes sense: location 2 on average loses 2 cars per day.
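The alternation of evaluation and improvement can be sketched on a problem far smaller than Jack's Car Rental. Everything specific here is an assumption for illustration: the three-state chain, the reward of -1 per step, `gamma = 0.9`, the tolerances, and the rule of switching actions only when another action is better by more than `tol` (which avoids flip-flopping on numerical ties).

```python
def q_value(P, V, s, a, gamma):
    """Expected one-step return of action a in state s, then following V."""
    return sum(p * (r + gamma * V[s2]) for p, s2, r in P[s][a])


def evaluate(P, policy, gamma, theta=1e-12):
    """Iterative policy evaluation for a deterministic policy."""
    V = {s: 0.0 for s in P}
    while True:
        # States with no actions are terminal and keep value 0.
        new_V = {s: q_value(P, V, s, policy[s], gamma) if P[s] else 0.0 for s in P}
        delta = max(abs(new_V[s] - V[s]) for s in P)
        V = new_V
        if delta < theta:
            return V


def improve(P, V, policy, gamma, tol=1e-9):
    """Greedy improvement: switch only when another action is clearly better."""
    new_policy = dict(policy)
    for s in P:
        for a in P[s]:
            if q_value(P, V, s, a, gamma) > q_value(P, V, s, new_policy[s], gamma) + tol:
                new_policy[s] = a
    return new_policy


def policy_iteration(P, policy, gamma):
    while True:
        V = evaluate(P, policy, gamma)
        new_policy = improve(P, V, policy, gamma)
        if new_policy == policy:  # policy is stable, so it is greedy w.r.t. its own V
            return policy, V
        policy = new_policy


# Hypothetical chain 0 <-> 1 -> 2, where 2 is terminal and every step costs -1.
P = {
    0: {"left": [(1.0, 0, -1.0)], "right": [(1.0, 1, -1.0)]},
    1: {"left": [(1.0, 0, -1.0)], "right": [(1.0, 2, -1.0)]},
    2: {},
}
policy, V = policy_iteration(P, {0: "left", 1: "left"}, gamma=0.9)
print(policy, V)
```

Starting from "always left", the first improvement changes only state 1 and the second changes state 0, after which the policy is stable.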
Jack's CR Exercise
- Suppose the first car moved is free
  - From 1st to 2nd location
  - Because an employee travels that way anyway (by bus)
- Suppose only 10 cars can be parked for free at each location
  - More than 10 cost $4 for using an extra parking lot
- Such arbitrary nonlinearities are common in real problems

Policy iteration: Can we do better?
- Each iteration involves policy evaluation, which is itself an iterative process.
- It looks like from the previous example that policy evaluation may converge long after the greedy policy based on the values has converged.
- Can we skip steps somehow?
- Yes: policy evaluation can be stopped early and, in most cases, convergence is still guaranteed!
- A very special case: stopping after one sweep of policy evaluation. This is called value iteration.

Value Iteration
Recall the full policy-evaluation backup:
$$V_{k+1}(s) \leftarrow \sum_{a} \pi(s,a) \sum_{s'} P^{a}_{ss'} \big[R^{a}_{ss'} + \gamma V_k(s')\big]$$
Here is the full value-iteration backup:
$$V_{k+1}(s) \leftarrow \max_{a} \sum_{s'} P^{a}_{ss'} \big[R^{a}_{ss'} + \gamma V_k(s')\big]$$

Value Iteration Cont.
[Figure: Value Iteration]
Note how this combines policy improvement and evaluation. It is simply the Bellman optimality equation turned into an update equation!
In practice, often the policy evaluation (sum) is performed several times between policy improvement (max) sweeps.
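The value-iteration backup can be tried on the small gridworld from earlier in the lecture. This is a sketch under the same assumed layout as before (cells numbered row by row from 0 to 15, terminal state in cells 0 and 15, reward -1 per step, no discounting); the names are hypothetical.

```python
N = 4
TERMINAL = {0, N * N - 1}
MOVES = [(-1, 0), (0, 1), (1, 0), (0, -1)]  # up, right, down, left


def step(s, move):
    """Deterministic successor; moves off the grid leave the state unchanged."""
    row, col = divmod(s, N)
    new_row, new_col = row + move[0], col + move[1]
    if 0 <= new_row < N and 0 <= new_col < N:
        return new_row * N + new_col
    return s


def value_iteration(theta=1e-12):
    """Sweep the value-iteration backup (max over actions) until values settle."""
    V = [0.0] * (N * N)
    while True:
        new_V = list(V)
        for s in range(N * N):
            if s not in TERMINAL:
                # Deterministic moves: the sum over s' has a single term.
                new_V[s] = max(-1.0 + V[step(s, m)] for m in MOVES)
        delta = max(abs(a - b) for a, b in zip(new_V, V))
        V = new_V
        if delta < theta:
            return V


V_star = value_iteration()
for row in range(N):
    print(V_star[row * N:(row + 1) * N])
```

Because every step costs 1 and moves are deterministic, each value here is minus the number of steps to the nearest terminal corner, so the sweeps settle after only a few iterations.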
Gambler's Problem
- Gambler can repeatedly bet $ on a coin flip
- Heads he wins his stake, tails he loses it
- Initial capital $\in$ {$1, $2, ..., $99}
- Gambler wins if his capital becomes $100, loses if it becomes $0
- Coin is unfair
  - Heads (gambler wins) with probability p = 0.4
- States, Actions, Rewards?

Gambler's Problem Solution
[Figure: Gambler's Problem Solution]

Herd Management
- You are a consultant to a farmer managing a herd of cows
- Herd consists of 5 kinds of cows:
  - Young
  - Milking
  - Breeding
  - Old
  - Sick
- Number of each kind is the State
- Number sold of each kind is the Action
- Cows transition from one kind to another
- Young cows can be born

Asynchronous DP
- All the DP methods described so far require exhaustive sweeps of the entire state set.
- Asynchronous DP does not use sweeps. Instead it works like this:
  - Repeat until convergence criterion is met: pick a state at random and apply the appropriate backup
- Still need lots of computation, but does not get locked into hopelessly long sweeps
- Can you select states to backup intelligently? YES: an agent's experience can act as a guide.
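Asynchronous DP can be sketched on the gambler's problem by backing up one randomly chosen capital level at a time. Several things here are assumptions rather than slide content: the reward is taken to be +1 for reaching $100 and 0 otherwise (the usual textbook formulation, folded into a fixed value of 1 at the goal), no discounting is used, and a fixed budget of backups with a fixed random seed stands in for the convergence criterion.

```python
import random

P_HEADS = 0.4   # probability the gambler wins a flip (from the slide)
GOAL = 100


def backup(V, s):
    """Value-iteration backup for capital s: best expected value over stakes."""
    best = 0.0
    for stake in range(1, min(s, GOAL - s) + 1):
        value = P_HEADS * V[s + stake] + (1 - P_HEADS) * V[s - stake]
        best = max(best, value)
    return best


def async_value_iteration(num_backups=50_000, seed=0):
    """Back up one randomly chosen state at a time, updating V in place."""
    rng = random.Random(seed)
    V = [0.0] * (GOAL + 1)
    V[GOAL] = 1.0  # assumption: winning is worth 1, losing (capital 0) is worth 0
    for _ in range(num_backups):
        s = rng.randint(1, GOAL - 1)  # pick a nonterminal state at random
        V[s] = backup(V, s)
    return V


V = async_value_iteration()
print(round(V[25], 4), round(V[50], 4), round(V[75], 4))
```

With this reward choice, V[s] is the probability of reaching $100 from capital s. At capital 50, staking everything wins with probability 0.4 in a single flip, which gives a quick check on the result.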
Generalized Policy Iteration
Generalized Policy Iteration (GPI): any interaction of policy evaluation and policy improvement, independent of their granularity.
A geometric metaphor for convergence of GPI:
[Figure: geometric metaphor for convergence of GPI]

Efficiency of DP
- To find an optimal policy is polynomial in the number of states...
- BUT, the number of states is often astronomical, e.g., often growing exponentially with the number of state variables (what Bellman called "the curse of dimensionality").
- In practice, classical DP can be applied to problems with a few million states.
- Asynchronous DP can be applied to larger problems, and is appropriate for parallel computation.
- It is surprisingly easy to come up with MDPs for which DP methods are not practical.

Summary
- Policy evaluation: backups without a max
- Policy improvement: form a greedy policy, if only locally
- Policy iteration: alternate the above two processes
- Value iteration: backups with a max
- Full backups (to be contrasted later with sample backups)
- Asynchronous DP: a way to avoid exhaustive sweeps
- Generalized Policy Iteration (GPI)
- Bootstrapping: updating estimates based on other estimates

END
More informationVectors , (0,0). 5. A vector is commonly denoted by putting an arrow above its symbol, as in the picture above. Here are some 3-dimensional vectors:
Vectors 1-23-2018 I ll look t vectors from n lgeric point of view nd geometric point of view. Algericlly, vector is n ordered list of (usully) rel numers. Here re some 2-dimensionl vectors: (2, 3), ( )
More informationODE: Existence and Uniqueness of a Solution
Mth 22 Fll 213 Jerry Kzdn ODE: Existence nd Uniqueness of Solution The Fundmentl Theorem of Clculus tells us how to solve the ordinry dierentil eqution (ODE) du f(t) dt with initil condition u() : Just