Actor-Critic. Hung-yi Lee
Transcription
1 Actor-Critic Hung-yi Lee
2 Asynchronous Advantage Actor-Critic (A3C) Volodymyr Mnih, Adrià Puigdomènech Badia, Mehdi Mirza, Alex Graves, Timothy P. Lillicrap, Tim Harley, David Silver, Koray Kavukcuoglu, "Asynchronous Methods for Deep Reinforcement Learning", ICML, 2016
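3 Review: Policy Gradient
$$\nabla \bar{R}_\theta \approx \frac{1}{N} \sum_{n=1}^{N} \sum_{t=1}^{T_n} \Big( \sum_{t'=t}^{T_n} \gamma^{t'-t} r_{t'}^n - b \Big) \nabla \log p_\theta(a_t^n \mid s_t^n)$$
Here $G_t^n = \sum_{t'=t}^{T_n} \gamma^{t'-t} r_{t'}^n$ is obtained via interaction and $b$ is the baseline. $G$ is very unstable: from the same state $s$ and action $a$, different episodes can give, for example, G = 100, G = 3, G = 1, G = 2, G = 10. With sufficient samples, we approximate the expectation of G. Can we estimate the expected value of G?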
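The weight that multiplies each log-probability gradient on this slide can be computed with a backward pass over one episode's rewards. The following is a minimal sketch in plain Python; the function names and the toy reward list are illustrative assumptions, not part of the lecture.

```python
def discounted_returns(rewards, gamma):
    """Return G_t = sum over t' >= t of gamma^(t'-t) * r_t' for every step t."""
    returns = [0.0] * len(rewards)
    running = 0.0
    # Walk backwards so each G_t reuses G_{t+1}: G_t = r_t + gamma * G_{t+1}.
    for t in reversed(range(len(rewards))):
        running = rewards[t] + gamma * running
        returns[t] = running
    return returns


def policy_gradient_weights(rewards, gamma, baseline):
    """Weight (G_t - b) applied to grad log p_theta(a_t | s_t) at each step."""
    return [g - baseline for g in discounted_returns(rewards, gamma)]


# Hypothetical three-step episode.
weights = policy_gradient_weights([1.0, 0.0, 2.0], gamma=0.5, baseline=1.0)
print(weights)
```

Because each weight comes from a single sampled episode, it inherits the variance of G that the slide points out.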
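4 Review: Q-Learning
State value function $V^\pi(s)$: when using actor $\pi$, the cumulated reward expected to be obtained after visiting state $s$. The network maps $s$ to a scalar $V^\pi(s)$.
State-action value function $Q^\pi(s, a)$: when using actor $\pi$, the cumulated reward expected to be obtained after taking $a$ at state $s$. The network maps $s$ to one output per action, $Q^\pi(s, a = \text{left})$, $Q^\pi(s, a = \text{right})$, $Q^\pi(s, a = \text{fire})$ (for discrete actions only).
Both are estimated by TD or MC.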
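To make "estimated by TD or MC" concrete, here is a small tabular sketch of the two update rules for a state value table. The dictionary representation, the learning rate `lr`, and the discount `gamma` are assumptions for illustration; the lecture itself uses networks rather than tables.

```python
def mc_update(values, state, observed_return, lr):
    """Monte Carlo: move V(s) toward the return G observed after visiting s."""
    values[state] += lr * (observed_return - values[state])


def td_update(values, state, reward, next_state, lr, gamma=1.0):
    """Temporal difference: move V(s) toward r + gamma * V(s')."""
    target = reward + gamma * values[next_state]
    values[state] += lr * (target - values[state])


# Hypothetical two-state value table.
values = {"s1": 0.0, "s2": 1.0}
td_update(values, "s1", reward=1.0, next_state="s2", lr=0.5)
mc_update(values, "s2", observed_return=3.0, lr=0.5)
print(values)
```

MC needs the full return of an episode, while TD only needs one transition and the current estimate of the next state's value.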
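5 Actor-Critic
$$\nabla \bar{R}_\theta \approx \frac{1}{N} \sum_{n=1}^{N} \sum_{t=1}^{T_n} \Big( \sum_{t'=t}^{T_n} \gamma^{t'-t} r_{t'}^n - b \Big) \nabla \log p_\theta(a_t^n \mid s_t^n)$$
$G_t^n$: obtained via interaction, with $E[G_t^n] = Q^{\pi_\theta}(s_t^n, a_t^n)$. Replace $G_t^n$ by $Q^{\pi_\theta}(s_t^n, a_t^n)$ and the baseline $b$ by $V^{\pi_\theta}(s_t^n)$, so the weight becomes $Q^{\pi_\theta}(s_t^n, a_t^n) - V^{\pi_\theta}(s_t^n)$.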
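6 Advantage Actor-Critic
The weight $Q^\pi(s_t^n, a_t^n) - V^\pi(s_t^n)$ seems to require estimating two networks, but we can estimate only one:
$$Q^\pi(s_t^n, a_t^n) = E\big[ r_t^n + V^\pi(s_{t+1}^n) \big] \approx r_t^n + V^\pi(s_{t+1}^n)$$
where the last step drops the expectation. The weight becomes $r_t^n + V^\pi(s_{t+1}^n) - V^\pi(s_t^n)$: only the state value is estimated, at the cost of a little bit of variance.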
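The one-network advantage on this slide can be sketched as follows, assuming the critic's outputs for one episode are already available as a list. The `gamma` parameter is an added assumption (the slide's formula has no discount, which corresponds to `gamma=1.0`), and the numbers are made up.

```python
def td_advantages(rewards, values, gamma=1.0):
    """Advantage r_t + gamma * V(s_{t+1}) - V(s_t) for every step of one episode.

    `values` holds V(s_1) ... V(s_{T+1}), one more entry than `rewards`;
    use 0.0 as the last entry when the episode ends in a terminal state.
    """
    if len(values) != len(rewards) + 1:
        raise ValueError("values must have exactly one more entry than rewards")
    return [r + gamma * values[t + 1] - values[t] for t, r in enumerate(rewards)]


# Hypothetical critic outputs for a three-step episode ending in a terminal state.
advantages = td_advantages([1.0, 0.0, 2.0], [2.0, 1.5, 2.0, 0.0])
print(advantages)
```

Only a single sampled reward enters each term, which is where the "little bit of variance" comes from.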
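7 Advantage Actor-Critic
Loop: $\pi$ interacts with the environment; learn $V^\pi(s)$ by TD or MC; update the actor from $\pi$ to $\pi'$ based on $V^\pi(s)$; set $\pi = \pi'$ and repeat.
$$\nabla \bar{R}_\theta \approx \frac{1}{N} \sum_{n=1}^{N} \sum_{t=1}^{T_n} \big( r_t^n + V^\pi(s_{t+1}^n) - V^\pi(s_t^n) \big) \nabla \log p_\theta(a_t^n \mid s_t^n)$$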
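8 Advantage Actor-Critic: Tips
The parameters of the actor $\pi(s)$ and the critic $V^\pi(s)$ can be shared: a common network processes $s$, then one network outputs the actions (left, right, fire) and another outputs $V^\pi(s)$.
Use output entropy as regularization for $\pi(s)$: larger entropy is preferred, which encourages exploration.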
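The entropy tip can be illustrated with a short sketch: compute the entropy of the actor's output distribution and add it, scaled by a coefficient, to the objective being maximized. The coefficient name `beta` and the example distributions are assumptions for illustration.

```python
import math


def entropy(probs):
    """Shannon entropy -sum p log p of a discrete action distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def regularized_objective(policy_objective, probs, beta):
    """Objective to maximize: policy term plus beta times the output entropy."""
    return policy_objective + beta * entropy(probs)


# Hypothetical distributions over (left, right, fire).
peaked = [0.98, 0.01, 0.01]
uniform = [1.0 / 3, 1.0 / 3, 1.0 / 3]
print(entropy(peaked), entropy(uniform))
```

With the same policy term, the more uniform distribution scores higher, so the actor is nudged away from committing to a single action too early.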
9 Asynchronous Advantage Actor-Critic (A3C) The idea is from 李思叡
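10 Asynchronous
Source of image:
1. Copy global parameters
2. Sampling some data
3. Compute gradients
4. Update global models
A worker copies $\theta^1$ and computes $\Delta\theta$; because other workers also update the model, the global parameters may already be $\theta^2$ by then, and the update is $\theta^2 + \eta \Delta\theta$.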
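The four steps on this slide can be sketched as a sequential simulation of two workers. Real A3C runs the workers in parallel; the `GlobalModel` class, the toy gradient, and the step size here are hypothetical and serve only to show that a worker's gradient, computed from its stale copy, is applied to whatever the global parameters are at update time.

```python
class GlobalModel:
    """Holds the shared parameters that every worker reads and updates."""

    def __init__(self, theta):
        self.theta = list(theta)

    def copy_parameters(self):
        # Step 1: a worker takes a private copy of the global parameters.
        return list(self.theta)

    def apply_gradient(self, grad, lr):
        # Step 4: theta <- theta + lr * grad, applied to the current global value.
        self.theta = [p + lr * g for p, g in zip(self.theta, grad)]


def toy_gradient(local_theta):
    # Stand-in for steps 2 and 3 (sampling data, computing gradients):
    # the ascent direction of -(theta - 1)^2 / 2 is (1 - theta).
    return [1.0 - p for p in local_theta]


global_model = GlobalModel([0.0])
local_a = global_model.copy_parameters()      # worker A copies theta^1
local_b = global_model.copy_parameters()      # worker B copies theta^1
global_model.apply_gradient(toy_gradient(local_b), lr=0.5)   # global becomes theta^2
grad_a = toy_gradient(local_a)                # still computed from theta^1
global_model.apply_gradient(grad_a, lr=0.5)   # applied to theta^2
print(global_model.theta)
```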
11 Pathwise Derivative Policy Gradient David Silver, Guy Lever, Nicolas Heess, Thomas Degris, Daan Wierstra, Martin Riedmiller, "Deterministic Policy Gradient Algorithms", ICML, 2014. Timothy P. Lillicrap, Jonathan J. Hunt, Alexander Pritzel, Nicolas Heess, Tom Erez, Yuval Tassa, David Silver, Daan Wierstra, "Continuous Control with Deep Reinforcement Learning", ICLR, 2016
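12 Another Way to use Critic
Original actor-critic versus pathwise derivative policy gradient. From the Q function $Q^\pi(s, a)$ we know that taking $a'$ at state $s$ is better than $a$ (figure: $Q^\pi(s, a)$ plotted over actions $a_1$, $a_2$, with decrease and increase directions). We know the parameters of the Q function.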
13 Actor and critic: pathwise derivative policy gradient versus original actor-critic. The action $a$ is a continuous vector, $a = \arg\max_a Q(s, a)$. The actor $\pi$, mapping $s$ to $a$, acts as the solver of this optimization problem.
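14 Pathwise Derivative Policy Gradient
$\pi'(s) = \arg\max_a Q^\pi(s, a)$, where $a$ is the output of an actor.
Gradient ascent: $\theta^{\pi'} \leftarrow \theta^\pi + \eta \nabla_{\theta^\pi} Q^\pi(s, a)$, which updates $\pi \to \pi'$ while $Q^\pi$ is fixed.
$s$ → Actor $\pi$ → $a$ → $Q^\pi$ → $Q^\pi(s, a)$: this is a large network.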
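The gradient-ascent step through a fixed critic can be sketched with a deliberately tiny example: a one-parameter linear actor and a hand-written quadratic Q whose derivative is known in closed form. Both functions are hypothetical stand-ins for the networks in the lecture; a real implementation would obtain the gradient by backpropagating through the critic.

```python
def actor(theta, state):
    """Toy deterministic actor: a = theta * s."""
    return theta * state


def q_value(state, action):
    """Toy fixed critic, maximal at a = 2 * s."""
    return -(action - 2.0 * state) ** 2


def dq_daction(state, action):
    """Derivative of q_value with respect to the action."""
    return -2.0 * (action - 2.0 * state)


def actor_step(theta, state, lr):
    """One gradient-ascent step on Q(s, actor(s)) with the critic held fixed."""
    action = actor(theta, state)
    # Chain rule: dQ/dtheta = dQ/da * da/dtheta, and da/dtheta = s here.
    grad_theta = dq_daction(state, action) * state
    return theta + lr * grad_theta


theta = 0.0
for _ in range(50):
    theta = actor_step(theta, state=1.0, lr=0.25)
print(theta, q_value(1.0, actor(theta, 1.0)))
```

In this toy case the actor's parameter converges toward 2, the value at which its output matches the arg max of the fixed Q.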
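15 Loop: $\pi$ interacts with the environment (with exploration) and the experience goes into a replay buffer; learn $Q^\pi(s, a)$ by TD or MC; find a new actor $\pi'$ "better" than $\pi$; set $\pi = \pi'$.
Actor update: $\theta^{\pi'} \leftarrow \theta^\pi + \eta \nabla_{\theta^\pi} Q^\pi(s, a)$, updating $\pi \to \pi'$ through $s$ → Actor $\pi$ → $a$ → $Q^\pi$ → $Q^\pi(s, a)$.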
16 Q-Learning Algorithm
Initialize Q-function $Q$, target Q-function $\hat{Q} = Q$, actor $\pi$, target actor $\hat{\pi} = \pi$
In each episode
  For each time step t
    Given state $s_t$, take action $a_t$ based on $Q$ (exploration)
    Obtain reward $r_t$, and reach new state $s_{t+1}$
    Store $(s_t, a_t, r_t, s_{t+1})$ into buffer
    Sample $(s_i, a_i, r_i, s_{i+1})$ from buffer (usually a batch)
    Target $y = r_i + \max_a \hat{Q}(s_{i+1}, a)$
    Update the parameters of $Q$ to make $Q(s_i, a_i)$ close to $y$ (regression)
    Update the parameters of $\pi$ to maximize $Q(s_i, \pi(s_i))$
    Every C steps reset $\hat{Q} = Q$
    Every C steps reset $\hat{\pi} = \pi$
17 Q-Learning Algorithm → Pathwise Derivative Policy Gradient
Initialize Q-function $Q$, target Q-function $\hat{Q} = Q$, actor $\pi$, target actor $\hat{\pi} = \pi$
In each episode
  For each time step t
    (1) Given state $s_t$, take action $a_t$ based on $\pi$ instead of $Q$ (exploration)
    Obtain reward $r_t$, and reach new state $s_{t+1}$
    Store $(s_t, a_t, r_t, s_{t+1})$ into buffer
    Sample $(s_i, a_i, r_i, s_{i+1})$ from buffer (usually a batch)
    (2) Target $y = r_i + \hat{Q}(s_{i+1}, \hat{\pi}(s_{i+1}))$ instead of $y = r_i + \max_a \hat{Q}(s_{i+1}, a)$
    Update the parameters of $Q$ to make $Q(s_i, a_i)$ close to $y$ (regression)
    (3) Update the parameters of $\pi$ to maximize $Q(s_i, \pi(s_i))$
    Every C steps reset $\hat{Q} = Q$
    (4) Every C steps reset $\hat{\pi} = \pi$
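Change (2) in the algorithm above, replacing the max over actions with the target actor's output, can be sketched by writing both regression targets side by side. The toy target critic, target actor, and candidate action list are hypothetical; like the slide, the sketch uses no discount factor.

```python
def q_learning_target(reward, next_state, target_q, actions):
    """y = r + max over a of Q_hat(s', a); needs an enumerable action set."""
    return reward + max(target_q(next_state, a) for a in actions)


def pathwise_target(reward, next_state, target_q, target_actor):
    """y = r + Q_hat(s', pi_hat(s')); the target actor supplies the action."""
    return reward + target_q(next_state, target_actor(next_state))


def toy_target_q(state, action):
    # Hypothetical target critic, maximal when action == state.
    return -(action - state) ** 2


def toy_target_actor(state):
    # Hypothetical target actor that already outputs the maximizing action.
    return state


y_q = q_learning_target(0.5, 1.0, toy_target_q, actions=[0.0, 1.0, 2.0])
y_pathwise = pathwise_target(0.5, 1.0, toy_target_q, toy_target_actor)
print(y_q, y_pathwise)
```

The two targets coincide here because the toy actor returns the arg max exactly; with continuous actions the max cannot be enumerated, so the actor's output takes its place.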
18 Connection with GAN David Pfau, Oriol Vinyals, "Connecting Generative Adversarial Networks and Actor-Critic Methods", arXiv preprint, 2016