Multi-Armed Bandits: Non-adaptive and Adaptive Sampling

CSE 547/Stat 548: Machine Learning for Big Data
Instructor: Sham Kakade

1 The (stochastic) multi-armed bandit problem

The basic paradigm is as follows:

- K independent arms: $a \in \{1, \ldots, K\}$.
- Each arm $a$ returns a random reward $R_a$ if pulled. (Simpler case: assume the distribution of $R_a$ is not time varying.)
- Game: you choose arm $a_t$ at time $t$. You then observe $X_t = R_{a_t}$, where $R_{a_t}$ is sampled from the underlying distribution of that arm.

Critically, the distribution over $R_a$ is not known.

1.1 Regret: an online performance measure

Our objective is to maximize our long-term reward. We have a (possibly randomized) sequential strategy/algorithm $\mathcal{A}$, which is of the form:

$$a_t = \mathcal{A}(a_1, X_1, a_2, X_2, \ldots, a_{t-1}, X_{t-1}).$$

In $T$ rounds, our reward is:

$$\mathbb{E}\left[\sum_{t=1}^{T} X_t \,\middle|\, \mathcal{A}\right],$$

where the expectation is with respect to the reward process and our algorithm. Suppose $\mu_a = \mathbb{E}[R_a]$, and let us assume $0 \le \mu_a \le 1$. Also, define:

$$\mu^* = \max_a \mu_a.$$

In $T$ rounds and in expectation, the best we can do is obtain $\mu^* T$. We will measure our performance by our expected regret, defined as follows. In $T$ rounds, our (observed) regret is:

$$\mu^* T - \sum_{t=1}^{T} X_t$$

and our expected regret is:

$$\mu^* T - \mathbb{E}\left[\sum_{t=1}^{T} X_t \,\middle|\, \mathcal{A}\right],$$

where the expectation is with respect to the randomness in our outcomes (and possibly our algorithm, if it is randomized).

1.2 Caveat

Our presentation in these notes will be loose in terms of $\log(\cdot)$ factors, in both $K$ and $T$. There are multiple good treatments that provide improvements in terms of these factors.

2 Review: Hoeffding's bound

With $N$ samples, denote the sample mean as:

$$\hat\mu = \frac{1}{N} \sum_{t=1}^{N} X_t.$$

Lemma 2.1. Suppose that the $X_t$'s are i.i.d. and bounded between 0 and 1. Then, with probability greater than $1 - \delta$, we have that

$$|\hat\mu - \mu| \le \sqrt{\frac{\log(2/\delta)}{2N}}.$$

3 Warmup: A non-adaptive strategy

Suppose we first pull each arm $\tau$ times, in an exploration phase. Then, for the remainder of the $T$ steps, we pull the arm which had the best observed reward during the exploration phase.

By the union bound, with probability greater than $1 - \delta$, for all actions $a$,

$$|\hat\mu_a - \mu_a| \le O\left(\sqrt{\frac{\log(K/\delta)}{\tau}}\right).$$

To see this, we simply set the error probability of each arm's confidence interval to $\delta/K$, so that the total error probability is $\delta$. Thus all the confidence intervals will hold simultaneously.

During the exploration rounds, our cumulative regret is at most $K\tau$, a trivial upper bound.

During the exploitation rounds, let us bound our cumulative regret for the remaining $T - K\tau$ steps. Note that for the arm $i$ that we pull, we must have that

$$\hat\mu_i \ge \hat\mu_{i^*},$$

where $i^*$ is an optimal arm. This implies that

$$\mu_i \ge \mu^* - c\sqrt{\frac{\log(K/\delta)}{\tau}},$$

where $c$ is a universal constant. To see this, note that by construction of the algorithm $\hat\mu_i \ge \hat\mu_{i^*}$, which implies

$$\mu^* - \mu_i = (\mu_{i^*} - \hat\mu_{i^*}) + (\hat\mu_{i^*} - \hat\mu_i) + (\hat\mu_i - \mu_i) \le |\mu_{i^*} - \hat\mu_{i^*}| + |\hat\mu_i - \mu_i|,$$

and the claim follows using the confidence interval bounds.
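The explore-then-commit strategy described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: rewards are taken to be Bernoulli with the given means (the notes only require rewards in [0, 1]), and the names `hoeffding_width` and `explore_then_commit` are hypothetical helpers, not part of the original notes.

```python
import math
import random


def hoeffding_width(n, delta):
    """Half-width sqrt(log(2/delta) / (2n)) of the Hoeffding interval
    for the mean of n i.i.d. samples bounded in [0, 1]."""
    return math.sqrt(math.log(2.0 / delta) / (2.0 * n))


def explore_then_commit(means, tau, horizon, rng):
    """Pull every arm tau times, then commit to the empirically best arm.

    Assumption for illustration: arm a pays Bernoulli(means[a]) rewards.
    Returns (committed_arm, total_reward).
    """
    k = len(means)
    if k * tau > horizon:
        raise ValueError("exploration phase must fit inside the horizon")
    sums = [0.0] * k
    total = 0.0
    # Exploration phase: tau pulls of each arm.
    for a in range(k):
        for _ in range(tau):
            x = 1.0 if rng.random() < means[a] else 0.0
            sums[a] += x
            total += x
    # Exploitation phase: play the arm with the best sample mean.
    best = max(range(k), key=lambda a: sums[a] / tau)
    for _ in range(horizon - k * tau):
        total += 1.0 if rng.random() < means[best] else 0.0
    return best, total


if __name__ == "__main__":
    best, total = explore_then_commit(
        [0.2, 0.5, 0.8], tau=200, horizon=5000, rng=random.Random(0)
    )
    print("committed arm:", best, "total reward:", total)
    print("per-arm interval half-width:", hoeffding_width(200, 0.05 / 3))
```

The exploration length `tau` is left as a parameter here, since the regret depends on how it trades off exploration cost against the width of the confidence intervals.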

Hence, our total regret is:

$$\mu^* T - \sum_{t=1}^{T} X_t \le K\tau + O\left(\sqrt{\frac{\log(K/\delta)}{\tau}}\right)(T - K\tau).$$

Now let us optimize for $\tau$.

Lemma 3.1. (Regret of the non-adaptive strategy) The total expected regret of the non-adaptive strategy is:

$$\mu^* T - \mathbb{E}\left[\sum_{t=1}^{T} X_t\right] \le c K^{1/3} T^{2/3} (\log T)^{1/3},$$

where $c$ is a universal constant.

Proof. Choose $\tau = K^{-2/3} T^{2/3} (\log(KT))^{1/3}$ and $\delta = 1/T^2$. Note that with probability greater than $1 - 1/T^2$, our regret is bounded by $O(K^{1/3} T^{2/3} (\log(KT))^{1/3})$. Also, if we fail, the largest regret we can pay is $T$, and this occurs with probability less than $1/T^2$, so the expected regret satisfies:

$$\text{exp. regret} \le \Pr(\text{no failure event}) \cdot c K^{1/3} T^{2/3} (\log(KT))^{1/3} + \Pr(\text{failure event}) \cdot T \le c K^{1/3} T^{2/3} (\log(KT))^{1/3} + \frac{1}{T}.$$

This shows that the regret is bounded as $O(K^{1/3} T^{2/3} (\log(KT))^{1/3})$. For $T > K$, $\log(KT) \le 2\log T$ (and for $T \le K$, the claimed regret bound is trivially true, since it is then at least $T$). This completes the proof (for a different universal constant).

3.1 A (minimax) optimal adaptive algorithm

We will now provide an algorithm that is optimal up to log factors (optimal under the assumption that the rewards are i.i.d. and upper bounded by 1).

Let $N_{a,t}$ be the number of times we pulled arm $a$ up to time $t$. The question is: which arm should we pull at time $t + 1$?

3.2 Confidence bounds

If we don't care about log factors, then the following is a straightforward argument to see that our confidence bounds will simultaneously hold for all times $t$ (from 0 to $\infty$) and all $K$ arms.

Lemma 3.2. With probability greater than $1 - \delta$, we will have that for all times $t \ge K$ and all $a \in [K]$,

$$|\hat\mu_{a,t} - \mu_a| \le c\sqrt{\frac{\log(t/\delta)}{N_{a,t}}},$$

where $c$ is a universal constant.

Proof. We will actually prove a stronger statement. Suppose that we observe the outcome of every arm; we will first provide a probabilistic statement for the confidence intervals of all the arms (and for all sample sizes). Let $\hat\mu_{a,N}$ denote the sample mean of arm $a$ computed from $N$ samples. Let us apply Hoeffding's bound with an error probability of $\delta/(KN^2)$. Specifically, for arm $a$ with $N$ samples, we have that with probability greater than $1 - \delta/(KN^2)$:

$$|\hat\mu_{a,N} - \mu_a| \le c\sqrt{\frac{\log(NK/\delta)}{N}}.$$

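This follows by a straightforward application of Hoeffding's bound. Now the total error probability over all arms $a$ and over all sample sizes $N$ is:

$$\sum_{a=1}^{K} \sum_{N=1}^{\infty} \frac{\delta}{KN^2} = \delta\,\frac{\pi^2}{6}$$

(the $\pi^2/6$ is from the Basel problem). Note that the sum is finite, which means the total error probability for all of these confidence intervals is less than a constant times $\delta$. We have thus shown the following (note the quantifiers): with probability greater than $1 - \delta$, for all arms $a$ and all sample sizes $N \ge 1$,

$$|\hat\mu_{a,N} - \mu_a| \le c\sqrt{\frac{\log(NK/\delta)}{N}}$$

(for a possibly different constant $c$).

Observe that the confidence bound that any algorithm uses at time $t$ is due to having $N_{a,t}$ samples, so we can now apply the above bound in this case, where:

$$c\sqrt{\frac{\log(N_{a,t}K/\delta)}{N_{a,t}}} \le c\sqrt{\frac{\log(tK/\delta)}{N_{a,t}}},$$

since $N_{a,t} \le t$. This shows that these confidence bounds are valid for all times $t$ and all arms. The proof is completed by noting that for $t \ge K$, $\log(Kt) \le 2\log t$.

3.3 The Upper Confidence Bound (UCB) Algorithm

At each time $t$:

- Pull arm:

$$a_t = \arg\max_a \left(\hat\mu_{a,t} + c\sqrt{\frac{\log(t/\delta)}{N_{a,t}}}\right) := \arg\max_a \left(\hat\mu_{a,t} + \mathrm{ConfBound}_{a,t}\right)$$

  (where $c \le 10$ is a constant).
- Observe reward $X_t$.
- Update $\hat\mu_{a,t}$, $N_{a,t}$, and $\mathrm{ConfBound}_{a,t}$.

With probability greater than $1 - \delta$, all the confidence bounds will hold for all arms and all times $t$.

3.4 Analysis of UCB

If we pull arm $a$ at time $t$, what is our instantaneous regret, i.e. what is $\mu^* - \mu_{a_t}$?

Let $i^*$ be an optimal arm. Note that by construction of the algorithm, if we pull arm $a$ at time $t$, then:

$$\hat\mu_{a,t} + \mathrm{ConfBound}_{a,t} \ge \hat\mu_{i^*,t} + \mathrm{ConfBound}_{i^*,t} \ge \mu_{i^*},$$

where the last step follows because $\mu_{i^*}$ is contained within the confidence interval for $i^*$. Using this, we have that:

$$\mu_{a_t} \ge \hat\mu_{a,t} - \mathrm{ConfBound}_{a,t} \ge \mu_{i^*} - 2\,\mathrm{ConfBound}_{a,t}.$$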
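The UCB rule of Section 3.3 can be sketched as follows. This is a minimal illustration under assumptions not made in the notes: rewards are Bernoulli with the given means, each arm is pulled once at the start so that every count is at least 1, and the default `c=1.0` is an arbitrary choice (the notes only say $c \le 10$). The function name `ucb` is hypothetical.

```python
import math
import random


def ucb(means, horizon, delta, c=1.0, rng=None):
    """Run UCB with bonus c * sqrt(log(t / delta) / N_a).

    Assumptions for illustration: arm a pays Bernoulli(means[a]) rewards,
    0 < delta < 1, and horizon >= number of arms.
    Returns (pull_counts, total_reward).
    """
    rng = rng or random.Random()
    k = len(means)
    counts = [0] * k      # N_a: number of pulls of each arm so far
    sums = [0.0] * k      # running reward sum of each arm
    total = 0.0
    for t in range(1, horizon + 1):
        if t <= k:
            # Initial round-robin so every sample mean is defined.
            arm = t - 1
        else:
            # Pick the arm with the largest upper confidence bound.
            arm = max(
                range(k),
                key=lambda a: sums[a] / counts[a]
                + c * math.sqrt(math.log(t / delta) / counts[a]),
            )
        x = 1.0 if rng.random() < means[arm] else 0.0
        counts[arm] += 1
        sums[arm] += x
        total += x
    return counts, total


if __name__ == "__main__":
    counts, total = ucb([0.2, 0.5, 0.8], horizon=3000, delta=0.01,
                        rng=random.Random(0))
    print("pull counts:", counts, "total reward:", total)
```

In a run like this, arms with larger gaps to the best mean should end up with fewer pulls, because their confidence bonus only needs to shrink to roughly the size of the gap before the rule stops selecting them.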

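Theorem 3.3. (UCB regret) The total expected regret of UCB is:

$$\mu^* T - \mathbb{E}\left[\sum_{t=1}^{T} X_t\right] \le c\sqrt{KT\log T}$$

for an appropriately chosen universal constant $c$.

Proof. When all the confidence bounds hold, summing the instantaneous regret $\mu^* - \mu_{a_t}$ over the rounds gives:

$$\mu^* T - \sum_{t=1}^{T} \mu_{a_t} \le 2\sum_{t} \mathrm{ConfBound}_{a_t,t} \le 2c\sqrt{\log(T/\delta)} \sum_{t} \frac{1}{\sqrt{N_{a_t,t}}} \le 4c\sqrt{\log(T/\delta)} \sum_{a} \sqrt{N_{a,T}}. \quad (1)$$

The last step groups the rounds by arm and uses $\sum_{n=1}^{N} 1/\sqrt{n} \le 2\sqrt{N}$. Note that the following constraint on the $N_{a,T}$'s must hold:

$$\sum_{a} N_{a,T} = T.$$

One can now show that the worst-case setting of the $N_{a,T}$'s, the one that makes Equation 1 as large as possible subject to this constraint, is $N_{a,T} = T/K$. Finally, to obtain the expected regret bound, the argument is identical to the one used in the non-adaptive case, where we choose $\delta = 1/T^2$.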
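The worst-case step for the sum $\sum_a \sqrt{N_{a,T}}$ in Equation 1 is only asserted in the proof. One way to fill it in, offered here as a sketch rather than as the argument the notes intend, is the Cauchy-Schwarz inequality; the constant $C$ below is a placeholder that absorbs all universal constants.

```latex
\begin{align*}
\sum_{a=1}^{K} \sqrt{N_{a,T}}
  = \sum_{a=1}^{K} 1 \cdot \sqrt{N_{a,T}}
  \le \sqrt{\sum_{a=1}^{K} 1^2}\;\sqrt{\sum_{a=1}^{K} N_{a,T}}
  = \sqrt{K T},
\end{align*}
with equality exactly when $N_{a,T} = T/K$ for every arm $a$.
Taking $\delta = 1/T^2$ gives $\log(T/\delta) = 3 \log T$, so when all
confidence bounds hold the regret is at most $C \sqrt{K T \log T}$.
The failure event has probability at most $1/T^2$ and costs at most $T$,
so it adds at most $1/T$ to the expected regret.
```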

1 Online Learning and Regret Minimization

1 Online Learning and Regret Minimization 2.997 Decision-Mking in Lrge-Scle Systems My 10 MIT, Spring 2004 Hndout #29 Lecture Note 24 1 Online Lerning nd Regret Minimiztion In this lecture, we consider the problem of sequentil decision mking in

More information

Reinforcement learning II

Reinforcement learning II CS 1675 Introduction to Mchine Lerning Lecture 26 Reinforcement lerning II Milos Huskrecht milos@cs.pitt.edu 5329 Sennott Squre Reinforcement lerning Bsics: Input x Lerner Output Reinforcement r Critic

More information

A recursive construction of efficiently decodable list-disjunct matrices

A recursive construction of efficiently decodable list-disjunct matrices CSE 709: Compressed Sensing nd Group Testing. Prt I Lecturers: Hung Q. Ngo nd Atri Rudr SUNY t Bufflo, Fll 2011 Lst updte: October 13, 2011 A recursive construction of efficiently decodble list-disjunct

More information

Reinforcement Learning

Reinforcement Learning Reinforcement Lerning Tom Mitchell, Mchine Lerning, chpter 13 Outline Introduction Comprison with inductive lerning Mrkov Decision Processes: the model Optiml policy: The tsk Q Lerning: Q function Algorithm

More information

CS 188 Introduction to Artificial Intelligence Fall 2018 Note 7

CS 188 Introduction to Artificial Intelligence Fall 2018 Note 7 CS 188 Introduction to Artificil Intelligence Fll 2018 Note 7 These lecture notes re hevily bsed on notes originlly written by Nikhil Shrm. Decision Networks In the third note, we lerned bout gme trees

More information

MATH362 Fundamentals of Mathematical Finance

MATH362 Fundamentals of Mathematical Finance MATH362 Fundmentls of Mthemticl Finnce Solution to Homework Three Fll, 2007 Course Instructor: Prof. Y.K. Kwok. If outcome j occurs, then the gin is given by G j = g ij α i, + d where α i = i + d i We

More information

38.2. The Uniform Distribution. Introduction. Prerequisites. Learning Outcomes

38.2. The Uniform Distribution. Introduction. Prerequisites. Learning Outcomes The Uniform Distribution 8. Introduction This Section introduces the simplest type of continuous probbility distribution which fetures continuous rndom vrible X with probbility density function f(x) which

More information

Review of Calculus, cont d

Review of Calculus, cont d Jim Lmbers MAT 460 Fll Semester 2009-10 Lecture 3 Notes These notes correspond to Section 1.1 in the text. Review of Clculus, cont d Riemnn Sums nd the Definite Integrl There re mny cses in which some

More information

CMDA 4604: Intermediate Topics in Mathematical Modeling Lecture 19: Interpolation and Quadrature

CMDA 4604: Intermediate Topics in Mathematical Modeling Lecture 19: Interpolation and Quadrature CMDA 4604: Intermedite Topics in Mthemticl Modeling Lecture 19: Interpoltion nd Qudrture In this lecture we mke brief diversion into the res of interpoltion nd qudrture. Given function f C[, b], we sy

More information

Solution for Assignment 1 : Intro to Probability and Statistics, PAC learning

Solution for Assignment 1 : Intro to Probability and Statistics, PAC learning Solution for Assignment 1 : Intro to Probbility nd Sttistics, PAC lerning 10-701/15-781: Mchine Lerning (Fll 004) Due: Sept. 30th 004, Thursdy, Strt of clss Question 1. Bsic Probbility ( 18 pts) 1.1 (

More information

1 Probability Density Functions

1 Probability Density Functions Lis Yn CS 9 Continuous Distributions Lecture Notes #9 July 6, 28 Bsed on chpter by Chris Piech So fr, ll rndom vribles we hve seen hve been discrete. In ll the cses we hve seen in CS 9, this ment tht our

More information

Continuous Random Variables

Continuous Random Variables STAT/MATH 395 A - PROBABILITY II UW Winter Qurter 217 Néhémy Lim Continuous Rndom Vribles Nottion. The indictor function of set S is rel-vlued function defined by : { 1 if x S 1 S (x) if x S Suppose tht

More information

Physics 202H - Introductory Quantum Physics I Homework #08 - Solutions Fall 2004 Due 5:01 PM, Monday 2004/11/15

Physics 202H - Introductory Quantum Physics I Homework #08 - Solutions Fall 2004 Due 5:01 PM, Monday 2004/11/15 Physics H - Introductory Quntum Physics I Homework #8 - Solutions Fll 4 Due 5:1 PM, Mondy 4/11/15 [55 points totl] Journl questions. Briefly shre your thoughts on the following questions: Of the mteril

More information

Math 426: Probability Final Exam Practice

Math 426: Probability Final Exam Practice Mth 46: Probbility Finl Exm Prctice. Computtionl problems 4. Let T k (n) denote the number of prtitions of the set {,..., n} into k nonempty subsets, where k n. Argue tht T k (n) kt k (n ) + T k (n ) by

More information

The First Fundamental Theorem of Calculus. If f(x) is continuous on [a, b] and F (x) is any antiderivative. f(x) dx = F (b) F (a).

The First Fundamental Theorem of Calculus. If f(x) is continuous on [a, b] and F (x) is any antiderivative. f(x) dx = F (b) F (a). The Fundmentl Theorems of Clculus Mth 4, Section 0, Spring 009 We now know enough bout definite integrls to give precise formultions of the Fundmentl Theorems of Clculus. We will lso look t some bsic emples

More information

5.7 Improper Integrals

5.7 Improper Integrals 458 pplictions of definite integrls 5.7 Improper Integrls In Section 5.4, we computed the work required to lift pylod of mss m from the surfce of moon of mss nd rdius R to height H bove the surfce of the

More information

Heat flux and total heat

Heat flux and total heat Het flux nd totl het John McCun Mrch 14, 2017 1 Introduction Yesterdy (if I remember correctly) Ms. Prsd sked me question bout the condition of insulted boundry for the 1D het eqution, nd (bsed on glnce

More information

Chapter 5 : Continuous Random Variables

Chapter 5 : Continuous Random Variables STAT/MATH 395 A - PROBABILITY II UW Winter Qurter 216 Néhémy Lim Chpter 5 : Continuous Rndom Vribles Nottions. N {, 1, 2,...}, set of nturl numbers (i.e. ll nonnegtive integers); N {1, 2,...}, set of ll

More information

The Regulated and Riemann Integrals

The Regulated and Riemann Integrals Chpter 1 The Regulted nd Riemnn Integrls 1.1 Introduction We will consider severl different pproches to defining the definite integrl f(x) dx of function f(x). These definitions will ll ssign the sme vlue

More information

UNIFORM CONVERGENCE. Contents 1. Uniform Convergence 1 2. Properties of uniform convergence 3

UNIFORM CONVERGENCE. Contents 1. Uniform Convergence 1 2. Properties of uniform convergence 3 UNIFORM CONVERGENCE Contents 1. Uniform Convergence 1 2. Properties of uniform convergence 3 Suppose f n : Ω R or f n : Ω C is sequence of rel or complex functions, nd f n f s n in some sense. Furthermore,

More information

19 Optimal behavior: Game theory

19 Optimal behavior: Game theory Intro. to Artificil Intelligence: Dle Schuurmns, Relu Ptrscu 1 19 Optiml behvior: Gme theory Adversril stte dynmics hve to ccount for worst cse Compute policy π : S A tht mximizes minimum rewrd Let S (,

More information

W. We shall do so one by one, starting with I 1, and we shall do it greedily, trying

W. We shall do so one by one, starting with I 1, and we shall do it greedily, trying Vitli covers 1 Definition. A Vitli cover of set E R is set V of closed intervls with positive length so tht, for every δ > 0 nd every x E, there is some I V with λ(i ) < δ nd x I. 2 Lemm (Vitli covering)

More information

Planning to Be Surprised: Optimal Bayesian Exploration in Dynamic Environments

Planning to Be Surprised: Optimal Bayesian Exploration in Dynamic Environments Plnning to Be Surprised: Optiml Byesin Explortion in Dynmic Environments Yi Sun, Fustino Gomez, nd Jürgen Schmidhuber IDSIA, Glleri 2, Mnno, CH-6928, Switzerlnd {yi,tino,juergen}@idsi.ch Abstrct. To mximize

More information

Lecture 3 ( ) (translated and slightly adapted from lecture notes by Martin Klazar)

Lecture 3 ( ) (translated and slightly adapted from lecture notes by Martin Klazar) Lecture 3 (5.3.2018) (trnslted nd slightly dpted from lecture notes by Mrtin Klzr) Riemnn integrl Now we define precisely the concept of the re, in prticulr, the re of figure U(, b, f) under the grph of

More information

arxiv: v1 [stat.ml] 9 Aug 2016

arxiv: v1 [stat.ml] 9 Aug 2016 On Lower Bounds for Regret in Reinforcement Lerning In Osbnd Stnford University, Google DeepMind iosbnd@stnford.edu Benjmin Vn Roy Stnford University bvr@stnford.edu rxiv:1608.02732v1 [stt.ml 9 Aug 2016

More information

AP Calculus Multiple Choice: BC Edition Solutions

AP Calculus Multiple Choice: BC Edition Solutions AP Clculus Multiple Choice: BC Edition Solutions J. Slon Mrch 8, 04 ) 0 dx ( x) is A) B) C) D) E) Divergent This function inside the integrl hs verticl symptotes t x =, nd the integrl bounds contin this

More information

Student Activity 3: Single Factor ANOVA

Student Activity 3: Single Factor ANOVA MATH 40 Student Activity 3: Single Fctor ANOVA Some Bsic Concepts In designed experiment, two or more tretments, or combintions of tretments, is pplied to experimentl units The number of tretments, whether

More information

A BRIEF INTRODUCTION TO UNIFORM CONVERGENCE. In the study of Fourier series, several questions arise naturally, such as: c n e int

A BRIEF INTRODUCTION TO UNIFORM CONVERGENCE. In the study of Fourier series, several questions arise naturally, such as: c n e int A BRIEF INTRODUCTION TO UNIFORM CONVERGENCE HANS RINGSTRÖM. Questions nd exmples In the study of Fourier series, severl questions rise nturlly, such s: () (2) re there conditions on c n, n Z, which ensure

More information

LECTURE NOTE #12 PROF. ALAN YUILLE

LECTURE NOTE #12 PROF. ALAN YUILLE LECTURE NOTE #12 PROF. ALAN YUILLE 1. Clustering, K-mens, nd EM Tsk: set of unlbeled dt D = {x 1,..., x n } Decompose into clsses w 1,..., w M where M is unknown. Lern clss models p(x w)) Discovery of

More information

Properties of the Riemann Integral

Properties of the Riemann Integral Properties of the Riemnn Integrl Jmes K. Peterson Deprtment of Biologicl Sciences nd Deprtment of Mthemticl Sciences Clemson University Februry 15, 2018 Outline 1 Some Infimum nd Supremum Properties 2

More information

1 Structural induction, finite automata, regular expressions

1 Structural induction, finite automata, regular expressions Discrete Structures Prelim 2 smple uestions s CS2800 Questions selected for spring 2017 1 Structurl induction, finite utomt, regulr expressions 1. We define set S of functions from Z to Z inductively s

More information

Math 1B, lecture 4: Error bounds for numerical methods

Math 1B, lecture 4: Error bounds for numerical methods Mth B, lecture 4: Error bounds for numericl methods Nthn Pflueger 4 September 0 Introduction The five numericl methods descried in the previous lecture ll operte by the sme principle: they pproximte the

More information

p-adic Egyptian Fractions

p-adic Egyptian Fractions p-adic Egyptin Frctions Contents 1 Introduction 1 2 Trditionl Egyptin Frctions nd Greedy Algorithm 2 3 Set-up 3 4 p-greedy Algorithm 5 5 p-egyptin Trditionl 10 6 Conclusion 1 Introduction An Egyptin frction

More information

THE EXISTENCE-UNIQUENESS THEOREM FOR FIRST-ORDER DIFFERENTIAL EQUATIONS.

THE EXISTENCE-UNIQUENESS THEOREM FOR FIRST-ORDER DIFFERENTIAL EQUATIONS. THE EXISTENCE-UNIQUENESS THEOREM FOR FIRST-ORDER DIFFERENTIAL EQUATIONS RADON ROSBOROUGH https://intuitiveexplntionscom/picrd-lindelof-theorem/ This document is proof of the existence-uniqueness theorem

More information

Lecture 1: Introduction to integration theory and bounded variation

Lecture 1: Introduction to integration theory and bounded variation Lecture 1: Introduction to integrtion theory nd bounded vrition Wht is this course bout? Integrtion theory. The first question you might hve is why there is nything you need to lern bout integrtion. You

More information

Numerical Integration. 1 Introduction. 2 Midpoint Rule, Trapezoid Rule, Simpson Rule. AMSC/CMSC 460/466 T. von Petersdorff 1

Numerical Integration. 1 Introduction. 2 Midpoint Rule, Trapezoid Rule, Simpson Rule. AMSC/CMSC 460/466 T. von Petersdorff 1 AMSC/CMSC 46/466 T. von Petersdorff 1 umericl Integrtion 1 Introduction We wnt to pproximte the integrl I := f xdx where we re given, b nd the function f s subroutine. We evlute f t points x 1,...,x n

More information

Is there an easy way to find examples of such triples? Why yes! Just look at an ordinary multiplication table to find them!

Is there an easy way to find examples of such triples? Why yes! Just look at an ordinary multiplication table to find them! PUSHING PYTHAGORAS 009 Jmes Tnton A triple of integers ( bc,, ) is clled Pythgoren triple if exmple, some clssic triples re ( 3,4,5 ), ( 5,1,13 ), ( ) fond of ( 0,1,9 ) nd ( 119,10,169 ). + b = c. For

More information

Appendix to Notes 8 (a)

Appendix to Notes 8 (a) Appendix to Notes 8 () 13 Comprison of the Riemnn nd Lebesgue integrls. Recll Let f : [, b] R be bounded. Let D be prtition of [, b] such tht Let D = { = x 0 < x 1

More information

14.3 comparing two populations: based on independent samples

14.3 comparing two populations: based on independent samples Chpter4 Nonprmetric Sttistics Introduction: : methods for mking inferences bout popultion prmeters (confidence intervl nd hypothesis testing) rely on the ssumptions bout probbility distribution of smpled

More information

Chapter 9: Inferences based on Two samples: Confidence intervals and tests of hypotheses

Chapter 9: Inferences based on Two samples: Confidence intervals and tests of hypotheses Chpter 9: Inferences bsed on Two smples: Confidence intervls nd tests of hypotheses 9.1 The trget prmeter : difference between two popultion mens : difference between two popultion proportions : rtio of

More information

Reinforcement Learning and Policy Reuse

Reinforcement Learning and Policy Reuse Reinforcement Lerning nd Policy Reue Mnuel M. Veloo PEL Fll 206 Reding: Reinforcement Lerning: An Introduction R. Sutton nd A. Brto Probbilitic policy reue in reinforcement lerning gent Fernndo Fernndez

More information

Bellman Optimality Equation for V*

Bellman Optimality Equation for V* Bellmn Optimlity Eqution for V* The vlue of stte under n optiml policy must equl the expected return for the best ction from tht stte: V (s) mx Q (s,) A(s) mx A(s) mx A(s) Er t 1 V (s t 1 ) s t s, t s

More information

Reversals of Signal-Posterior Monotonicity for Any Bounded Prior

Reversals of Signal-Posterior Monotonicity for Any Bounded Prior Reversls of Signl-Posterior Monotonicity for Any Bounded Prior Christopher P. Chmbers Pul J. Hely Abstrct Pul Milgrom (The Bell Journl of Economics, 12(2): 380 391) showed tht if the strict monotone likelihood

More information

Near-Bayesian Exploration in Polynomial Time

Near-Bayesian Exploration in Polynomial Time J. Zico Kolter kolter@cs.stnford.edu Andrew Y. Ng ng@cs.stnford.edu Computer Science Deprtment, Stnford University, CA 94305 Abstrct We consider the explortion/exploittion problem in reinforcement lerning

More information

Integral equations, eigenvalue, function interpolation

Integral equations, eigenvalue, function interpolation Integrl equtions, eigenvlue, function interpoltion Mrcin Chrząszcz mchrzsz@cernch Monte Crlo methods, 26 My, 2016 1 / Mrcin Chrząszcz (Universität Zürich) Integrl equtions, eigenvlue, function interpoltion

More information

New Integral Inequalities for n-time Differentiable Functions with Applications for pdfs

New Integral Inequalities for n-time Differentiable Functions with Applications for pdfs Applied Mthemticl Sciences, Vol. 2, 2008, no. 8, 353-362 New Integrl Inequlities for n-time Differentible Functions with Applictions for pdfs Aristides I. Kechriniotis Technologicl Eductionl Institute

More information

Lecture 21: Order statistics

Lecture 21: Order statistics Lecture : Order sttistics Suppose we hve N mesurements of sclr, x i =, N Tke ll mesurements nd sort them into scending order x x x 3 x N Define the mesured running integrl S N (x) = 0 for x < x = i/n for

More information

Problem Set 9. Figure 1: Diagram. This picture is a rough sketch of the 4 parabolas that give us the area that we need to find. The equations are:

Problem Set 9. Figure 1: Diagram. This picture is a rough sketch of the 4 parabolas that give us the area that we need to find. The equations are: (x + y ) = y + (x + y ) = x + Problem Set 9 Discussion: Nov., Nov. 8, Nov. (on probbility nd binomil coefficients) The nme fter the problem is the designted writer of the solution of tht problem. (No one

More information

Online Supplements to Performance-Based Contracts for Outpatient Medical Services

Online Supplements to Performance-Based Contracts for Outpatient Medical Services Jing, Png nd Svin: Performnce-bsed Contrcts Article submitted to Mnufcturing & Service Opertions Mngement; mnuscript no. MSOM-11-270.R2 1 Online Supplements to Performnce-Bsed Contrcts for Outptient Medicl

More information

13: Diffusion in 2 Energy Groups

13: Diffusion in 2 Energy Groups 3: Diffusion in Energy Groups B. Rouben McMster University Course EP 4D3/6D3 Nucler Rector Anlysis (Rector Physics) 5 Sept.-Dec. 5 September Contents We study the diffusion eqution in two energy groups

More information

Credibility Hypothesis Testing of Fuzzy Triangular Distributions

Credibility Hypothesis Testing of Fuzzy Triangular Distributions 666663 Journl of Uncertin Systems Vol.9, No., pp.6-74, 5 Online t: www.jus.org.uk Credibility Hypothesis Testing of Fuzzy Tringulr Distributions S. Smpth, B. Rmy Received April 3; Revised 4 April 4 Abstrct

More information

Lecture 3 Gaussian Probability Distribution

Lecture 3 Gaussian Probability Distribution Introduction Lecture 3 Gussin Probbility Distribution Gussin probbility distribution is perhps the most used distribution in ll of science. lso clled bell shped curve or norml distribution Unlike the binomil

More information

Name Solutions to Test 3 November 8, 2017

Name Solutions to Test 3 November 8, 2017 Nme Solutions to Test 3 November 8, 07 This test consists of three prts. Plese note tht in prts II nd III, you cn skip one question of those offered. Some possibly useful formuls cn be found below. Brrier

More information

Math 8 Winter 2015 Applications of Integration

Math 8 Winter 2015 Applications of Integration Mth 8 Winter 205 Applictions of Integrtion Here re few importnt pplictions of integrtion. The pplictions you my see on n exm in this course include only the Net Chnge Theorem (which is relly just the Fundmentl

More information

PARTIAL FRACTION DECOMPOSITION

PARTIAL FRACTION DECOMPOSITION PARTIAL FRACTION DECOMPOSITION LARRY SUSANKA 1. Fcts bout Polynomils nd Nottion We must ssemble some tools nd nottion to prove the existence of the stndrd prtil frction decomposition, used s n integrtion

More information

Lecture 20: Numerical Integration III

Lecture 20: Numerical Integration III cs4: introduction to numericl nlysis /8/0 Lecture 0: Numericl Integrtion III Instructor: Professor Amos Ron Scribes: Mrk Cowlishw, Yunpeng Li, Nthnel Fillmore For the lst few lectures we hve discussed

More information

S. S. Dragomir. 2, we have the inequality. b a

S. S. Dragomir. 2, we have the inequality. b a Bull Koren Mth Soc 005 No pp 3 30 SOME COMPANIONS OF OSTROWSKI S INEQUALITY FOR ABSOLUTELY CONTINUOUS FUNCTIONS AND APPLICATIONS S S Drgomir Abstrct Compnions of Ostrowski s integrl ineulity for bsolutely

More information

MIXED MODELS (Sections ) I) In the unrestricted model, interactions are treated as in the random effects model:

MIXED MODELS (Sections ) I) In the unrestricted model, interactions are treated as in the random effects model: 1 2 MIXED MODELS (Sections 17.7 17.8) Exmple: Suppose tht in the fiber breking strength exmple, the four mchines used were the only ones of interest, but the interest ws over wide rnge of opertors, nd

More information

Riemann Sums and Riemann Integrals

Riemann Sums and Riemann Integrals Riemnn Sums nd Riemnn Integrls Jmes K. Peterson Deprtment of Biologicl Sciences nd Deprtment of Mthemticl Sciences Clemson University August 26, 2013 Outline 1 Riemnn Sums 2 Riemnn Integrls 3 Properties

More information

Continuous Random Variables Class 5, Jeremy Orloff and Jonathan Bloom

Continuous Random Variables Class 5, Jeremy Orloff and Jonathan Bloom Lerning Gols Continuous Rndom Vriles Clss 5, 8.05 Jeremy Orloff nd Jonthn Bloom. Know the definition of continuous rndom vrile. 2. Know the definition of the proility density function (pdf) nd cumultive

More information

Administrivia CSE 190: Reinforcement Learning: An Introduction

Administrivia CSE 190: Reinforcement Learning: An Introduction Administrivi CSE 190: Reinforcement Lerning: An Introduction Any emil sent to me bout the course should hve CSE 190 in the subject line! Chpter 4: Dynmic Progrmming Acknowledgment: A good number of these

More information

Automated Recommendation Systems

Automated Recommendation Systems Automted Recommendtion Systems Collbortive Filtering Through Reinorcement Lerning Most Akhmizdeh Deprtment o MS&E, Stnord University Emil: mkhmi@stnord.edu Alexei Avkov Deprtment o Electricl Engineering,

More information

Online Markov Decision Processes under Bandit Feedback

Online Markov Decision Processes under Bandit Feedback Online Mrkov Decision Processes under Bndit Feedbck Gergely Neu, András György, Csb Szepesvári, András Antos Abstrct We consider online lerning in finite stochstic Mrkovin environments where in ech time

More information

Riemann Sums and Riemann Integrals

Riemann Sums and Riemann Integrals Riemnn Sums nd Riemnn Integrls Jmes K. Peterson Deprtment of Biologicl Sciences nd Deprtment of Mthemticl Sciences Clemson University August 26, 203 Outline Riemnn Sums Riemnn Integrls Properties Abstrct

More information

Homework 11. Andrew Ma November 30, sin x (1+x) (1+x)

Homework 11. Andrew Ma November 30, sin x (1+x) (1+x) Homewor Andrew M November 3, 4 Problem 9 Clim: Pf: + + d = d = sin b +b + sin (+) d sin (+) d using integrtion by prts. By pplying + d = lim b sin b +b + sin (+) d. Since limits to both sides, lim b sin

More information

A Fast and Reliable Policy Improvement Algorithm

A Fast and Reliable Policy Improvement Algorithm A Fst nd Relible Policy Improvement Algorithm Ysin Abbsi-Ydkori Peter L. Brtlett Stephen J. Wright Queenslnd University of Technology UC Berkeley nd QUT University of Wisconsin-Mdison Abstrct We introduce

More information

Chapter 6 Continuous Random Variables and Distributions

Chapter 6 Continuous Random Variables and Distributions Chpter 6 Continuous Rndom Vriles nd Distriutions Mny economic nd usiness mesures such s sles investment consumption nd cost cn hve the continuous numericl vlues so tht they cn not e represented y discrete

More information

Bandits and Exploration: How do we (optimally) gather information? Sham M. Kakade

Bandits and Exploration: How do we (optimally) gather information? Sham M. Kakade Bandits and Exploration: How do we (optimally) gather information? Sham M. Kakade Machine Learning for Big Data CSE547/STAT548 University of Washington S. M. Kakade (UW) Optimization for Big data 1 / 22

More information

f(x) dx, If one of these two conditions is not met, we call the integral improper. Our usual definition for the value for the definite integral
