Online Learning: Bandit Setting
Daniel Khashabi
Summer 2014. Last update: October 2016.

1 Introduction

[TODO]

2 Bandits

2.1 Stochastic setting

Suppose there exist unknown distributions $\nu_1, \dots, \nu_K$ such that the loss of arm $i$ at iteration $t$ is drawn as $l_{i,t} \sim \nu_i$. The mean of each of these distributions is $\mu_k = E[l_{k,t}]$. Denote the least expected loss by $\mu^* = \min_{k \in \{1,\dots,K\}} \mu_k$. Let $I_t$ be the arm played at time $t$, and define $\tau_i(t)$ to be the number of times arm $i$ has been pulled up to time $t$. More formally:

$$\tau_i(t) = \sum_{s=1}^{t} 1\{I_s = i\}.$$

Lemma 1. The (pseudo-)regret can be written as

$$R(T) = \sum_{k=1}^{K} \Delta_k \, E[\tau_k(T)],$$

where $\Delta_k$ is the difference between the mean loss of action $k$ and the minimum mean loss: $\Delta_k = \mu_k - \mu^*$.

The stationarity assumption is implicit here; the distributions do not change over the time horizon.
Proof.

$$R(T) = E\Big[\sum_{t=1}^{T} l_{I_t,t}\Big] - \min_{k \in \{1,\dots,K\}} E\Big[\sum_{t=1}^{T} l_{k,t}\Big] = \sum_{t=1}^{T} E\big[E[l_{I_t,t} \mid I_t]\big] - T\mu^* = \sum_{t=1}^{T} E[\mu_{I_t} - \mu^*]$$

$$= \sum_{t=1}^{T} \sum_{k=1}^{K} (\mu_k - \mu^*) \, P(I_t = k) = \sum_{k=1}^{K} \Delta_k \sum_{t=1}^{T} P(I_t = k) = \sum_{k=1}^{K} \Delta_k \, E[\tau_k(T)].$$

Therefore the only thing we need to worry about is $\tau_k(T)$, the number of times each arm is pulled.

Exploration first: Consider a simple strategy: sample each arm $C$ times (in any order), then start making decisions. Denote the empirical mean of each action by $\hat\mu_k$. Here is the suggested algorithm:

1. For a fixed probability $\delta \in (0,1)$, sample each arm $C$ times and compute the empirical means $\hat\mu_k$.
2. For the remaining $T - CK$ iterations, play the action with the minimum empirical loss: $\hat k = \arg\min_{k \in \{1,\dots,K\}} \hat\mu_k$.

Using the Hoeffding bound (see http://web.engr.illinois.edu/~khashab/learn/concentration.pdf), for losses in $[0,1]$ we know that, for each arm $k \in \{1,\dots,K\}$, with probability at least $1-\delta$:

$$|\hat\mu_k - \mu_k| < \sqrt{\frac{\ln(2/\delta)}{2C}},$$

which means that the larger the sample size $C$, the better our estimates of the means, as expected. Define $\Delta = \min\{\Delta_i : \Delta_i > 0\}$, the minimum gap between the best mean and any other mean. To guarantee that we choose the correct action with mean $\mu^*$, it suffices that every estimate satisfies $|\hat\mu_k - \mu_k| < \Delta/2$:

$$\sqrt{\frac{\ln(2/\delta)}{2C}} < \frac{\Delta}{2} \iff C > \frac{2\ln(2/\delta)}{\Delta^2}.$$

By Lemma 1, the regret incurred during the first $CK$ iterations of sampling actions is $C \sum_{k=1}^{K} \Delta_k$.
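The exploration-first strategy can be sketched in a few lines of Python. This is a minimal illustration, not the notes' own implementation: the helper names `bernoulli_arm`, `exploration_length` and `explore_first` are hypothetical, the losses are assumed to be Bernoulli, and the exploration length assumes the two-sided Hoeffding constants $C > 2\ln(2/\delta)/\Delta^2$ for losses in $[0,1]$.

```python
import math
import random


def bernoulli_arm(mean):
    """Return a sampler that draws a 0/1 loss with the given mean (assumed loss model)."""
    return lambda rng: 1.0 if rng.random() < mean else 0.0


def exploration_length(gap, delta):
    """Smallest integer C with C > 2 ln(2/delta) / gap^2 (assumed Hoeffding constants)."""
    return math.floor(2.0 * math.log(2.0 / delta) / gap ** 2) + 1


def explore_first(arms, horizon, pulls_per_arm, rng):
    """Sample every arm pulls_per_arm (>= 1) times, then commit to the lowest empirical loss."""
    num_arms = len(arms)
    counts = [0] * num_arms
    loss_sums = [0.0] * num_arms
    committed = None
    for t in range(horizon):
        if t < pulls_per_arm * num_arms:
            k = t % num_arms  # round-robin exploration phase
        else:
            if committed is None:
                # Commit once, using the empirical means from the exploration phase.
                committed = min(range(num_arms), key=lambda i: loss_sums[i] / counts[i])
            k = committed
        loss = arms[k](rng)
        counts[k] += 1
        loss_sums[k] += loss
    return counts, sum(loss_sums)


if __name__ == "__main__":
    rng = random.Random(0)
    C = exploration_length(gap=0.6, delta=0.05)
    counts, total_loss = explore_first(
        [bernoulli_arm(0.2), bernoulli_arm(0.8)], horizon=2000, pulls_per_arm=C, rng=rng
    )
    print(C, counts, total_loss)
```

The weakness of this design is visible in the signature: `pulls_per_arm` has to be fixed in advance from the gap, which is unknown in practice.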
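Optimism in the face of uncertainty ($\alpha$-UCB)

In this part the observations are treated as rewards rather than losses: $\mu^* = \max_k \mu_k$, $i^*$ denotes an arm attaining it, and $\Delta_k = \mu^* - \mu_k$. Define our estimate of each mean up to time $t$ to be

$$\hat\mu_{k,t} = \frac{1}{\tau_k(t)} \sum_{s=1}^{t} l_{k,s} \, 1\{I_s = k\}.$$

Define the upper confidence bound on action (arm) $k$ at time $t$ to be

$$U_{k,t} = \hat\mu_{k,t} + \sqrt{\frac{\alpha \ln t}{2\tau_k(t)}},$$

where $\alpha > 0$ is a parameter which controls the upper-bound estimates and which we will set later. At each iteration choose the action $I_t = \arg\max_{k \in \{1,\dots,K\}} U_{k,t}$. With this choice we know that, with high probability,

$$\Delta_{I_t} = \mu^* - \mu_{I_t} \le U_{i^*,t} - \mu_{I_t} \le \max_{k \in \{1,\dots,K\}} U_{k,t} - \mu_{I_t} = U_{I_t,t} - \mu_{I_t}.$$

Lemma 2. If $I_t = i \ne i^*$ (an incorrect action), then $U_{i,t} \ge U_{i^*,t}$, and the mistake must be due to at least one of the following events:

1. $A_1(t) = \{U_{i^*,t} \le \mu^*\}$: the upper-bound estimate on the true best action is too small.
2. $A_2(t) = \{\hat\mu_{i,t} \ge \mu_i + \sqrt{\alpha \ln t / (2\tau_i(t))}\}$: the estimate (and hence the upper bound) on action $i$ is too big.
3. $A_3(t) = \{\tau_i(t) \le 2\alpha \ln T / \Delta_i^2\}$: the number of samples from action $i$ is too small.

Proof. We use proof by contradiction. Suppose all of the above events are false; we will show that then $I_t \ne i$:

$$U_{i^*,t} > \mu^* = \mu_i + \Delta_i \quad (A_1 \text{ is false})$$
$$> \mu_i + 2\sqrt{\frac{\alpha \ln T}{2\tau_i(t)}} \ge \mu_i + 2\sqrt{\frac{\alpha \ln t}{2\tau_i(t)}} \quad (A_3 \text{ is false, and } t \le T)$$
$$> \hat\mu_{i,t} + \sqrt{\frac{\alpha \ln t}{2\tau_i(t)}} = U_{i,t} \quad (A_2 \text{ is false}).$$

Hence $U_{i^*,t} > U_{i,t}$, so arm $i$ cannot be the maximizer.

Lemma 3. $P(A_1(t)) \le t^{1-\alpha}$ and $P(A_2(t)) \le t^{1-\alpha}$.

Proof. Again we use the Hoeffding bound. Write $\hat\mu_k^{(s)}$ for the empirical mean of the first $s$ samples of arm $k$. For rewards in $[0,1]$,

$$P\Big(\hat\mu_k^{(s)} \le \mu_k - \sqrt{\frac{\alpha \ln t}{2s}}\Big) \le \exp\Big(-2s \cdot \frac{\alpha \ln t}{2s}\Big) = t^{-\alpha},$$

where $s$ is the number of times the action has been sampled. Since $\tau_{i^*}(t)$ is random, we take a union bound over its possible values:

$$P(A_1(t)) = P\Big(\hat\mu_{i^*,t} + \sqrt{\frac{\alpha \ln t}{2\tau_{i^*}(t)}} \le \mu^*\Big) \le P\Big(\exists s \in \{1,\dots,t\} : \hat\mu_{i^*}^{(s)} \le \mu^* - \sqrt{\frac{\alpha \ln t}{2s}}\Big)$$
$$\le \sum_{s=1}^{t} P\Big(\hat\mu_{i^*}^{(s)} \le \mu^* - \sqrt{\frac{\alpha \ln t}{2s}}\Big) \le \sum_{s=1}^{t} t^{-\alpha} = t^{1-\alpha}.$$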
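As a concrete companion to the index rule, here is a small Python sketch of the $\alpha$-UCB loop. It assumes rewards in $[0,1]$ and the index $\hat\mu_{k,t} + \sqrt{\alpha \ln t / (2\tau_k(t))}$; the function names `bernoulli_reward` and `alpha_ucb`, and the choice to pull every arm once at the start so that each index is defined, are illustrative assumptions rather than part of the notes.

```python
import math
import random


def bernoulli_reward(mean):
    """Return a sampler that draws a 0/1 reward with the given mean (assumed reward model)."""
    return lambda rng: 1.0 if rng.random() < mean else 0.0


def alpha_ucb(arms, horizon, alpha, rng):
    """Play the arm with the largest index mean + sqrt(alpha * ln t / (2 * pulls))."""
    num_arms = len(arms)
    counts = [0] * num_arms
    reward_sums = [0.0] * num_arms

    def index(i, t):
        return reward_sums[i] / counts[i] + math.sqrt(alpha * math.log(t) / (2 * counts[i]))

    for t in range(1, horizon + 1):
        if t <= num_arms:
            k = t - 1  # assumption: one initial pull per arm so every index is defined
        else:
            k = max(range(num_arms), key=lambda i: index(i, t))
        reward = arms[k](rng)
        counts[k] += 1
        reward_sums[k] += reward
    return counts


if __name__ == "__main__":
    rng = random.Random(1)
    pulls = alpha_ucb([bernoulli_reward(0.3), bernoulli_reward(0.7)], 3000, 2.5, rng)
    print(pulls)  # the arm with mean 0.7 should receive most of the pulls
```

Unlike exploration first, no gap needs to be known in advance: the confidence width shrinks as an arm is pulled, so exploration of a poor arm tapers off on its own.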
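The claim $P(A_2(t)) \le t^{1-\alpha}$ can be proved in a similar way.

Lemma 4. For $\alpha > 2$ and any arm $k$ with $\Delta_k > 0$,

$$E[\tau_k(T)] \le \frac{2\alpha \ln T}{\Delta_k^2} + \frac{\alpha}{\alpha - 2}.$$

Proof. For any integer $t_0 \ge 1$,

$$E[\tau_k(T)] = E\Big[\sum_{t=1}^{T} 1\{I_t = k\}\Big] = E\Big[\sum_{t=1}^{T} 1\{I_t = k, \tau_k(t) \le t_0\} + \sum_{t=1}^{T} 1\{I_t = k, \tau_k(t) > t_0\}\Big] \le t_0 + \sum_{t=t_0+1}^{T} P\{I_t = k, \tau_k(t) > t_0\}.$$

Now define $t_0 = \lceil 2\alpha \ln T / \Delta_k^2 \rceil$. On the event $\{\tau_k(t) > t_0\}$ the event $A_3(t)$ of Lemma 2 is false, so a mistake $I_t = k$ must be caused by $A_1(t)$ or $A_2(t)$:

$$P\{I_t = k, \tau_k(t) > t_0\} \le P\{A_1(t) \cup A_2(t)\} \le P\{A_1(t)\} + P\{A_2(t)\} \le 2t^{1-\alpha}.$$

We can therefore simplify the previous bound:

$$E[\tau_k(T)] \le t_0 + \sum_{t=t_0+1}^{T} 2t^{1-\alpha} \le t_0 + 2\int_{1}^{\infty} t^{1-\alpha} \, dt = t_0 + \frac{2}{\alpha - 2} \le \frac{2\alpha \ln T}{\Delta_k^2} + 1 + \frac{2}{\alpha - 2} = \frac{2\alpha \ln T}{\Delta_k^2} + \frac{\alpha}{\alpha - 2}.$$

Plugging the result of Lemma 4 into Lemma 1 we get:

$$R(T) \le \sum_{k : \Delta_k > 0} \Delta_k \Big(\frac{2\alpha \ln T}{\Delta_k^2} + \frac{\alpha}{\alpha - 2}\Big) = \sum_{k : \Delta_k > 0} \Big(\frac{2\alpha \ln T}{\Delta_k} + \frac{\alpha \Delta_k}{\alpha - 2}\Big).$$

2.2 Adversarial setting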
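Input: decay parameters $\{\eta_t\}_{t=1}^{T}$.
Initialize: uniform distribution $p_1 = [p_{1,1}, \dots, p_{K,1}]$ over the set $\{1,\dots,K\}$, and $\hat L_{i,0} = 0$ for each $i$.
For $t = 1, \dots, T$:
1. Draw an arm $I_t$ according to the probability distribution $p_t$.
2. Create the loss estimate $\hat l_{i,t}$ for each $i$, based on $l_{i,t}$ and $p_t$.
3. Update the cumulative loss estimate $\hat L_{i,t} = \sum_{s=1}^{t} \hat l_{i,s}$.
4. Update the probability distribution over actions, for each $i$:

$$p_{i,t+1} = \frac{\exp(-\eta_{t+1} \hat L_{i,t})}{\sum_{k=1}^{K} \exp(-\eta_{t+1} \hat L_{k,t})}.$$

Lemma 5. For any sequence of actions in the algorithm above, with a non-increasing positive sequence $\eta_1, \eta_2, \dots$ and nonnegative loss estimates, we have:

$$\sum_{t=1}^{T} \sum_{k=1}^{K} p_{k,t} \hat l_{k,t} - \min_{j \in \{1,\dots,K\}} \sum_{t=1}^{T} \hat l_{j,t} \le \sum_{t=1}^{T} \frac{\eta_t}{2} \sum_{k=1}^{K} p_{k,t} \hat l_{k,t}^2 + \frac{\ln K}{\eta_T}.$$

Proof. We prove the inequality for an arbitrary fixed action $j$ and ignore the minimum. The left side is then

$$\sum_{t=1}^{T} \Big(\sum_{k=1}^{K} p_{k,t} \hat l_{k,t} - \hat l_{j,t}\Big).$$

We rewrite the first term through a log-moment generating function. Since $\sum_k p_{k,t} \hat l_{k,t} = E_{k \sim p_t} \hat l_{k,t}$ and

$$\ln E_{k \sim p_t} \exp\big(-\eta_t (\hat l_{k,t} - E_{j \sim p_t} \hat l_{j,t})\big) = \eta_t E_{j \sim p_t} \hat l_{j,t} + \ln E_{k \sim p_t} \exp(-\eta_t \hat l_{k,t}),$$

we obtain

$$E_{k \sim p_t} \hat l_{k,t} = \frac{1}{\eta_t} \ln E_{k \sim p_t} \exp\big(-\eta_t (\hat l_{k,t} - E_{j \sim p_t} \hat l_{j,t})\big) - \frac{1}{\eta_t} \ln E_{k \sim p_t} \exp(-\eta_t \hat l_{k,t}). \quad (1)$$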
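Now we simplify the two terms on the right of Equation (1). For the first term we use the two inequalities $\ln x \le x - 1$ and $\exp(-x) \le 1 - x + x^2/2$ (for $x \ge 0$), each once:

$$\ln E_{k \sim p_t} \exp\big(-\eta_t (\hat l_{k,t} - E_{j \sim p_t} \hat l_{j,t})\big) = \ln E_{k \sim p_t} \exp(-\eta_t \hat l_{k,t}) + \eta_t E_{k \sim p_t} \hat l_{k,t}$$
$$\le E_{k \sim p_t} \exp(-\eta_t \hat l_{k,t}) - 1 + \eta_t E_{k \sim p_t} \hat l_{k,t} \le E_{k \sim p_t}\Big(1 - \eta_t \hat l_{k,t} + \frac{\eta_t^2}{2} \hat l_{k,t}^2\Big) - 1 + \eta_t E_{k \sim p_t} \hat l_{k,t} = \frac{\eta_t^2}{2} E_{k \sim p_t} \hat l_{k,t}^2.$$

For the second term, define the shorthand notation

$$\Phi_t(\eta) = \frac{1}{\eta} \ln\Big(\frac{1}{K} \sum_{k=1}^{K} \exp(-\eta \hat L_{k,t})\Big).$$

Since $p_{k,t}$ is proportional to $\exp(-\eta_t \hat L_{k,t-1})$,

$$-\frac{1}{\eta_t} \ln E_{k \sim p_t} \exp(-\eta_t \hat l_{k,t}) = -\frac{1}{\eta_t} \ln \frac{\sum_{k} \exp(-\eta_t \hat L_{k,t-1}) \exp(-\eta_t \hat l_{k,t})}{\sum_{k} \exp(-\eta_t \hat L_{k,t-1})} = -\frac{1}{\eta_t} \ln \frac{\sum_{k} \exp(-\eta_t \hat L_{k,t})}{\sum_{k} \exp(-\eta_t \hat L_{k,t-1})} = \Phi_{t-1}(\eta_t) - \Phi_t(\eta_t).$$

Summing over the time index we have:

$$\sum_{t=1}^{T} \sum_{k=1}^{K} p_{k,t} \hat l_{k,t} \le \sum_{t=1}^{T} \frac{\eta_t}{2} E_{k \sim p_t} \hat l_{k,t}^2 + \sum_{t=1}^{T} \big(\Phi_{t-1}(\eta_t) - \Phi_t(\eta_t)\big),$$

where

$$\sum_{t=1}^{T} \big(\Phi_{t-1}(\eta_t) - \Phi_t(\eta_t)\big) = \Phi_0(\eta_1) + \sum_{t=1}^{T-1} \big(\Phi_t(\eta_{t+1}) - \Phi_t(\eta_t)\big) - \Phi_T(\eta_T).$$

Note that $\Phi_0(\eta) = 0$. Also,

$$-\Phi_T(\eta_T) = \frac{\ln K}{\eta_T} - \frac{1}{\eta_T} \ln \sum_{k=1}^{K} \exp(-\eta_T \hat L_{k,T}) \le \frac{\ln K}{\eta_T} - \frac{1}{\eta_T} \ln \exp(-\eta_T \hat L_{j,T}) = \frac{\ln K}{\eta_T} + \sum_{t=1}^{T} \hat l_{j,t}.$$

Then the summation becomes:

$$\sum_{t=1}^{T} \sum_{k=1}^{K} p_{k,t} \hat l_{k,t} - \sum_{t=1}^{T} \hat l_{j,t} \le \sum_{t=1}^{T} \frac{\eta_t}{2} E_{k \sim p_t} \hat l_{k,t}^2 + \frac{\ln K}{\eta_T} + \sum_{t=1}^{T-1} \big(\Phi_t(\eta_{t+1}) - \Phi_t(\eta_t)\big).$$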
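We can show that $\Phi_t'(\eta) \ge 0$. Since we assumed that $\eta_{t+1} \le \eta_t$ for any $t$, it follows that $\Phi_t(\eta_{t+1}) - \Phi_t(\eta_t) \le 0$. Therefore

$$\sum_{t=1}^{T} \sum_{k=1}^{K} p_{k,t} \hat l_{k,t} - \sum_{t=1}^{T} \hat l_{j,t} \le \sum_{t=1}^{T} \frac{\eta_t}{2} E_{k \sim p_t} \hat l_{k,t}^2 + \frac{\ln K}{\eta_T},$$

which is the desired result.

Corollary 1. For any sequence of actions in the algorithm above, with a non-increasing positive sequence $\eta_1, \eta_2, \dots$, we have:

$$E \sum_{t=1}^{T} \sum_{k=1}^{K} p_{k,t} \hat l_{k,t} - \min_{j \in \{1,\dots,K\}} E \sum_{t=1}^{T} \hat l_{j,t} \le E\Big[\sum_{t=1}^{T} \frac{\eta_t}{2} \sum_{k=1}^{K} p_{k,t} \hat l_{k,t}^2\Big] + \frac{\ln K}{\eta_T}.$$

Proof. Take the expectation of both sides and use the fact that $E[\min[\cdot]] \le \min[E[\cdot]]$.

2.2.1 The EXP3 algorithm

If the second step of the algorithm above uses the importance-weighted estimate $\hat l_{i,t} = l_{i,t} \, 1\{I_t = i\} / p_{i,t}$, the resulting algorithm is EXP3. This estimate is unbiased, $E_{I_t \sim p_t}[\hat l_{i,t}] = l_{i,t}$, and $\sum_k p_{k,t} \hat l_{k,t} = l_{I_t,t}$, so the left side of Corollary 1 is the expected regret.

Lemma 6. For the EXP3 algorithm with losses in $[0,1]$, the expected regret is bounded by

$$\frac{K}{2} \sum_{t=1}^{T} \eta_t + \frac{\ln K}{\eta_T}.$$

For a proper choice of $\{\eta_t\}_{t=1}^{T}$ the overall bound is $\sqrt{2TK \ln K}$.

Proof.

$$\sum_{k=1}^{K} p_{k,t} \hat l_{k,t}^2 = \sum_{k=1}^{K} p_{k,t} \Big(\frac{l_{k,t}}{p_{k,t}}\Big)^2 1\{I_t = k\} = p_{I_t,t} \Big(\frac{l_{I_t,t}}{p_{I_t,t}}\Big)^2 = \frac{l_{I_t,t}^2}{p_{I_t,t}}.$$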
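Since the decisions $I_t$ are made in a stochastic fashion, we need to take the expectation with respect to $I_t$:

$$E\Big[\sum_{t=1}^{T} \frac{\eta_t}{2} \sum_{k=1}^{K} p_{k,t} \hat l_{k,t}^2\Big] = E\Big[\sum_{t=1}^{T} \frac{\eta_t}{2} \frac{l_{I_t,t}^2}{p_{I_t,t}}\Big] \le E\Big[\sum_{t=1}^{T} \frac{\eta_t}{2} \frac{1}{p_{I_t,t}}\Big] = \frac{K}{2} \sum_{t=1}^{T} \eta_t,$$

where the last step uses $E_{I_t \sim p_t}[1/p_{I_t,t}] = \sum_{k} p_{k,t} / p_{k,t} = K$. This gives the general form of the bound for EXP3. If we set $\eta_t = \sqrt{2 \ln K / (TK)}$ we get the result:

$$\frac{K}{2} \sum_{t=1}^{T} \sqrt{\frac{2 \ln K}{TK}} + \ln K \sqrt{\frac{TK}{2 \ln K}} = \sqrt{\frac{TK \ln K}{2}} + \sqrt{\frac{TK \ln K}{2}} = \sqrt{2TK \ln K}.$$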
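The EXP3 procedure analyzed above can be sketched in Python as follows. This is an illustrative sketch under stated assumptions: losses lie in $[0,1]$, the learning rate is the constant $\eta = \sqrt{2 \ln K / (TK)}$, and the name `exp3` and the callback `loss_fn(t, arm)` (which reveals only the loss of the arm that was played) are hypothetical. Subtracting the minimum cumulative estimate before exponentiating is a numerical-stability choice that leaves the probabilities unchanged.

```python
import math
import random


def exp3(loss_fn, num_arms, horizon, rng, eta=None):
    """Exponential weights with importance-weighted loss estimates (bandit feedback)."""
    if eta is None:
        # Constant rate assumed from the tuning eta = sqrt(2 ln K / (T K)).
        eta = math.sqrt(2.0 * math.log(num_arms) / (horizon * num_arms))
    cum_est = [0.0] * num_arms  # cumulative estimated losses, one per arm
    total_loss = 0.0
    probs = [1.0 / num_arms] * num_arms
    for t in range(horizon):
        # p_{i,t} is proportional to exp(-eta * cumulative estimate of arm i).
        shift = min(cum_est)
        weights = [math.exp(-eta * (est - shift)) for est in cum_est]
        norm = sum(weights)
        probs = [w / norm for w in weights]
        arm = rng.choices(range(num_arms), weights=probs)[0]
        loss = loss_fn(t, arm)  # only the played arm's loss is observed
        total_loss += loss
        # Unbiased estimate: observed loss divided by the probability of playing the arm.
        cum_est[arm] += loss / probs[arm]
    return total_loss, probs


if __name__ == "__main__":
    rng = random.Random(0)
    total, final_probs = exp3(lambda t, k: 0.1 if k == 0 else 0.9, 3, 5000, rng)
    print(total, final_probs)
```

The division by `probs[arm]` is what makes the estimate unbiased, and it is also the source of the $1/p_{I_t,t}$ term that the proof of Lemma 6 has to control.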
2.3 Lower bounds

2.3.1 Preliminaries

The KL divergence has the chain-rule property:

$$KL(p(x,y) \,\|\, q(x,y)) = KL(p(x) \,\|\, q(x)) + KL(p(y \mid x) \,\|\, q(y \mid x)) = KL(p(x) \,\|\, q(x)) + \sum_{x} p(x) \, KL(p(y \mid x) \,\|\, q(y \mid x)).$$

Pinsker's inequality creates a connection between the KL divergence and the total variation distance:

$$\sup_{x} |p(x) - q(x)| \le \sqrt{\frac{1}{2} KL(p \,\|\, q)}.$$

2.3.2 Lower bounding...

Theorem 1. Suppose $Y_{i,1}, Y_{i,2}, \dots$ is the i.i.d. sequence of costs of arm $i$. We want to find a lower bound on the regret. The lower bound must hold for the best forecaster one can design (thus the inf with respect to forecasters), when that forecaster faces the worst case of the cost distributions (thus the sup with respect to the distributions). For $T \ge K$:

$$\inf \sup \Big(E \sum_{t=1}^{T} Y_{I_t,t} - \min_{i \in \{1,\dots,K\}} E \sum_{t=1}^{T} Y_{i,t}\Big) \ge \frac{1}{20} \sqrt{TK}.$$

Proof. The idea of the proof is to analyze the behavior of any forecaster against two distributions that differ slightly: (1) in one, all of the arm distributions are Bernoulli with mean $1/2$; (2) in the other, all of the arm distributions have mean $1/2$ except one, which has mean $1/2 + \epsilon$.

Lemma 7.

$$\inf \sup \Big(E \sum_{t=1}^{T} Y_{I_t,t} - \min_{i \in \{1,\dots,K\}} E \sum_{t=1}^{T} Y_{i,t}\Big) \ge T\epsilon \Big(1 - \frac{1}{K} - \sqrt{\epsilon \ln\frac{1+\epsilon}{1-\epsilon}} \sqrt{\frac{T}{2K}}\Big).$$

Proof. Let $l_{j,t}$ denote the loss at time $t$ of action $j \in \{1,\dots,K\}$. Define $K+1$ different games; in each of the games the distribution of the losses is different. In the $j$-th game, $j \in \{1,\dots,K\}$, all of the loss values are i.i.d. Bernoulli random variables with bias $\frac{1+\epsilon}{2}$, except those of the $j$-th arm, which are Bernoulli with bias $\frac{1-\epsilon}{2}$. Also define an additional game (game $0$) in which all of the losses are Bernoulli with bias $\frac{1+\epsilon}{2}$. Suppose $I_t$ is the arm played by the algorithm at time $t$. Denote the empirical distribution over actions up to time $T$ by $q_T = (q_{1,T}, \dots, q_{K,T})$:

$$q_{k,T} = \frac{1}{T} \sum_{t=1}^{T} 1\{I_t = k\}.$$
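Let $J$ be a random variable distributed according to $q_T$. Define $P_j$ to be the law of $J$ when the forecaster plays the $j$-th game, so that

$$P_j(J = k) = E_j\Big[\frac{1}{T} \sum_{t=1}^{T} 1\{I_t = k\}\Big],$$

where $E_j[\cdot]$ means expectation with respect to the distribution of the $j$-th game. The regret for the $j$-th game is:

$$R_j(T) = E_j \sum_{t=1}^{T} (l_{I_t,t} - l_{j,t}).$$

In game $j$, every pull of an arm $k \ne j$ has expected loss $\frac{1+\epsilon}{2}$ while arm $j$ has expected loss $\frac{1-\epsilon}{2}$, so each such pull costs $\epsilon$ in expectation. The regret therefore simplifies to

$$E_j \sum_{t=1}^{T} (l_{I_t,t} - l_{j,t}) = \epsilon T \sum_{k \ne j} P_j(J = k) = \epsilon T \big(1 - P_j(J = j)\big).$$

Averaging over all of the games we have:

$$\frac{1}{K} \sum_{j=1}^{K} E_j \sum_{t=1}^{T} (l_{I_t,t} - l_{j,t}) = \epsilon T \Big(1 - \frac{1}{K} \sum_{j=1}^{K} P_j(J = j)\Big).$$

Note that we want a lower bound on the max (not the average). But since the average is at most the max, a good lower bound on the average also works for us. By Pinsker's inequality we have

$$P_j(J = j) \le P_0(J = j) + \sqrt{\frac{1}{2} KL(P_0 \,\|\, P_j)},$$

and hence

$$\frac{1}{K} \sum_{j=1}^{K} P_j(J = j) \le \frac{1}{K} + \sqrt{\frac{1}{2K} \sum_{j=1}^{K} KL(P_0 \,\|\, P_j)}.$$

Note that in the last step we used the fact that the square-root function is concave and that $\sum_{j} P_0(J = j) = 1$.

The next step is to establish the distance between the probability distributions of the losses in the different games. Let $P_j^{t}$ denote the law of the observed loss sequence $y^{t} = (y_1, \dots, y_t)$ in game $j$; for a deterministic forecaster, $I_t$ is a function of $y^{t-1}$ and $KL(P_0 \,\|\, P_j) \le KL(P_0^{T} \,\|\, P_j^{T})$. By the chain rule,

$$KL(P_0^{T} \,\|\, P_j^{T}) = KL(P_0^{1} \,\|\, P_j^{1}) + \sum_{t=2}^{T} \sum_{y^{t-1}} P_0^{t-1}(y^{t-1}) \, KL\big(P_0^{t}(\cdot \mid y^{t-1}) \,\|\, P_j^{t}(\cdot \mid y^{t-1})\big)$$
$$= KL(P_0^{1} \,\|\, P_j^{1}) + \sum_{t=2}^{T} \Big[\sum_{y^{t-1} : I_t = j} P_0^{t-1}(y^{t-1}) \, KL\Big(\frac{1+\epsilon}{2} \,\Big\|\, \frac{1-\epsilon}{2}\Big) + \sum_{y^{t-1} : I_t \ne j} P_0^{t-1}(y^{t-1}) \, KL\Big(\frac{1+\epsilon}{2} \,\Big\|\, \frac{1+\epsilon}{2}\Big)\Big]$$
$$= KL\Big(\frac{1+\epsilon}{2} \,\Big\|\, \frac{1-\epsilon}{2}\Big) \, E_0\Big[\sum_{t=1}^{T} 1\{I_t = j\}\Big],$$

since the second inner sum vanishes.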
Summing over the games,

$$\sum_{j=1}^{K} KL(P_0 \,\|\, P_j) \le \sum_{j=1}^{K} KL(P_0^{T} \,\|\, P_j^{T}) = KL\Big(\frac{1+\epsilon}{2} \,\Big\|\, \frac{1-\epsilon}{2}\Big) \sum_{j=1}^{K} E_0\Big[\sum_{t=1}^{T} 1\{I_t = j\}\Big] = T \cdot KL\Big(\frac{1+\epsilon}{2} \,\Big\|\, \frac{1-\epsilon}{2}\Big).$$

The last step holds because exactly one arm is played in each round, so $\sum_{j} \sum_{t} 1\{I_t = j\} = T$. We know that

$$KL\Big(\frac{1+\epsilon}{2} \,\Big\|\, \frac{1-\epsilon}{2}\Big) = \epsilon \ln\frac{1+\epsilon}{1-\epsilon},$$

and then

$$\sum_{j=1}^{K} KL(P_0 \,\|\, P_j) \le T \epsilon \ln\frac{1+\epsilon}{1-\epsilon}.$$

So far the lower bound is the following:

$$\sup_{j} R_j(T) \ge \epsilon T \Big(1 - \frac{1}{K} - \sqrt{\frac{T}{2K} \, \epsilon \ln\frac{1+\epsilon}{1-\epsilon}}\Big).$$

The final step is the $\epsilon$-tuning of the bound. Since the lower bound holds for any $\epsilon \le 1/2$, we choose $\epsilon$ so that the bound attains its biggest value. If we set $\epsilon = \alpha \sqrt{K/T}$, where $\alpha$ is a real number to be tuned, this gives us the desired result.

http://cseweb.ucsd.edu/~kamalika/teaching/cse9w/lecture5.pdf
http://courses.cs.washington.edu/courses/cse599s/sp/scribes.html
Lower bounds: http:// berkeley.edu/~bartlett/courses/04fall-cs94stat60/lectures/bandit-lower-bound-notes.pdf
http:// 4_Scribe_Notes.pdf
http://
This contains a very nice comparison between UCB and EXP3: http:// presentations/cmu_bandits.pdf

3 Contextual Bandits

In contextual bandits, unlike in standard bandits, the value of an action depends on the context in which it is taken. In other words, whether a single action is optimal or not depends on its context. A simple example is to consider two contexts, weekday and weekend: an action which is optimal during the weekend is not necessarily the best action on a weekday.

Just as in standard bandits, in the contextual bandit problem a learner is presented on each of $T$ rounds with the choice of taking one of $K$ actions. Before choosing an action, the learner observes a feature vector (context) associated with each of its possible choices. In this setting the learner has access to a hypothesis class, in which the hypotheses receive the action features (context) and predict which action will give the best reward. If the learner can guarantee to do nearly as well as the prediction of the best hypothesis in hindsight (that is, to have low regret), the learner is said to successfully compete with that class.
Algorithm      Regret               High-probability bound   Contextual   Efficient
Exp3.P         O(T^{1/2})           Y                        N            Y
UCB            O(T^{1/2})           Y                        N            Y
Exp4           O(\sqrt{KT \ln N})   N                        Y            N
Epoch-greedy   O(T^{2/3})           Y                        Y            Y
LinUCB [4]
Exp4.P         O(T^{1/2})           Y                        Y            N

Table 1: Properties of popular bandit algorithms; $N$ is the number of experts, $T$ the number of rounds, and $K$ the number of possible actions.

If we ignore the contextual information we can just use the existing vanilla bandit algorithms. Having the contextual information, one should therefore be able to get better guarantees. One way of looking at contextual bandits is as a means of connecting to supervised learning, which requires input features supplied by users for making predictions. An important point here is that bandit problems are not supervised learning problems; for example, a click (or no click) on one ad does not generally tell you whether a different ad would have been clicked on. Instead, this is inherently an exploration-exploitation problem. That said, the solution to the contextual bandit problem should be intuitive and reasonable from the supervised learning point of view; in fact, some well-established supervised learning techniques come in handy in the analysis of contextual bandits.

Here is an approach which tries to adapt the existing bandit algorithms as a black box. Suppose the size of the context space is bounded and small. Run a different $K$-armed bandit for every value of the context vector. The regret, and the amount of information required to do well, scale linearly in the number of contexts. This approach is a little counter-intuitive: good supervised learning algorithms often require an amount of information which is (essentially) independent of the number of contexts; instead it depends on the complexity of the concept class defined on top of the features/contexts.

One can get inspiration from supervised learning. Define a policy space $H$ from which policies are chosen, and treat every policy $h(x) \in H$ as a different arm. This removes the explicit dependence on the number of contexts, but it creates a linear dependence on the number of policies.
Via Occam's razor/VC dimension/margin bounds, we already know that supervised learning requires an amount of experience much smaller than the number of policies.

The name contextual bandit is borrowed from Langford and Zhang [3], but the problem has been known under other names as well, e.g. bandit problems with expert advice [1] and associative reinforcement learning [2].

Bibliographical notes

References

[1] Peter Auer. Using confidence bounds for exploitation-exploration trade-offs. J. Mach. Learn. Research, 3:397-422, 2003.
[2] Andrew G. Barto and P. Anandan. Pattern-recognizing stochastic learning automata. Systems, Man and Cybernetics, IEEE Transactions on, (3), 1985.
[3] John Langford and Tong Zhang. The epoch-greedy algorithm for multi-armed bandits with side information. In Adv. Neural Info. Proc. Sys. (NIPS), pages 817-824, 2008.
[4] Lihong Li, Wei Chu, John Langford, and Robert E. Schapire. A contextual-bandit approach to personalized news article recommendation. ACM, 2010.
Robotic manipulation project Bin Nguyen December 5, 2006 Abstract Tis is te draft report for Robotic Manipulation s class project. Te cosen project aims to understand and implement Kevin Egan s non-convex
More information3. THE EXCHANGE ECONOMY
Essential Microeconomics -1-3. THE EXCHNGE ECONOMY Pareto efficient allocations 2 Edgewort box analysis 5 Market clearing prices 13 Walrasian Equilibrium 16 Equilibrium and Efficiency 22 First welfare
More informationConvexity and Smoothness
Capter 4 Convexity and Smootness 4.1 Strict Convexity, Smootness, and Gateaux Differentiablity Definition 4.1.1. Let X be a Banac space wit a norm denoted by. A map f : X \{0} X \{0}, f f x is called a
More information. If lim. x 2 x 1. f(x+h) f(x)
Review of Differential Calculus Wen te value of one variable y is uniquely determined by te value of anoter variable x, ten te relationsip between x and y is described by a function f tat assigns a value
More information3.4 Worksheet: Proof of the Chain Rule NAME
Mat 1170 3.4 Workseet: Proof of te Cain Rule NAME Te Cain Rule So far we are able to differentiate all types of functions. For example: polynomials, rational, root, and trigonometric functions. We are
More information0.1 Differentiation Rules
0.1 Differentiation Rules From our previous work we ve seen tat it can be quite a task to calculate te erivative of an arbitrary function. Just working wit a secon-orer polynomial tings get pretty complicate
More informationThe total error in numerical differentiation
AMS 147 Computational Metods and Applications Lecture 08 Copyrigt by Hongyun Wang, UCSC Recap: Loss of accuracy due to numerical cancellation A B 3, 3 ~10 16 In calculating te difference between A and
More informationarxiv: v3 [cs.ds] 4 Aug 2017
Non-preemptive Sceduling in a Smart Grid Model and its Implications on Macine Minimization Fu-Hong Liu 1, Hsiang-Hsuan Liu 1,2, and Prudence W.H. Wong 2 1 Department of Computer Science, National Tsing
More informationMath 312 Lecture Notes Modeling
Mat 3 Lecture Notes Modeling Warren Weckesser Department of Matematics Colgate University 5 7 January 006 Classifying Matematical Models An Example We consider te following scenario. During a storm, a
More informationStationary Gaussian Markov Processes As Limits of Stationary Autoregressive Time Series
Stationary Gaussian Markov Processes As Limits of Stationary Autoregressive Time Series Lawrence D. Brown, Pilip A. Ernst, Larry Sepp, and Robert Wolpert August 27, 2015 Abstract We consider te class,
More informationTHE IDEA OF DIFFERENTIABILITY FOR FUNCTIONS OF SEVERAL VARIABLES Math 225
THE IDEA OF DIFFERENTIABILITY FOR FUNCTIONS OF SEVERAL VARIABLES Mat 225 As we ave seen, te definition of derivative for a Mat 111 function g : R R and for acurveγ : R E n are te same, except for interpretation:
More informationDifferential Calculus (The basics) Prepared by Mr. C. Hull
Differential Calculus Te basics) A : Limits In tis work on limits, we will deal only wit functions i.e. tose relationsips in wic an input variable ) defines a unique output variable y). Wen we work wit
More informationSection 2.7 Derivatives and Rates of Change Part II Section 2.8 The Derivative as a Function. at the point a, to be. = at time t = a is
Mat 180 www.timetodare.com Section.7 Derivatives and Rates of Cange Part II Section.8 Te Derivative as a Function Derivatives ( ) In te previous section we defined te slope of te tangent to a curve wit
More informationUniversity Mathematics 2
University Matematics 2 1 Differentiability In tis section, we discuss te differentiability of functions. Definition 1.1 Differentiable function). Let f) be a function. We say tat f is differentiable at
More informationConsider a function f we ll specify which assumptions we need to make about it in a minute. Let us reformulate the integral. 1 f(x) dx.
Capter 2 Integrals as sums and derivatives as differences We now switc to te simplest metods for integrating or differentiating a function from its function samples. A careful study of Taylor expansions
More informationEfficient algorithms for for clone items detection
Efficient algoritms for for clone items detection Raoul Medina, Caroline Noyer, and Olivier Raynaud Raoul Medina, Caroline Noyer and Olivier Raynaud LIMOS - Université Blaise Pascal, Campus universitaire
More informationINTRODUCTION TO CALCULUS LIMITS
Calculus can be divided into two ke areas: INTRODUCTION TO CALCULUS Differential Calculus dealing wit its, rates of cange, tangents and normals to curves, curve sketcing, and applications to maima and
More information2.3 Algebraic approach to limits
CHAPTER 2. LIMITS 32 2.3 Algebraic approac to its Now we start to learn ow to find its algebraically. Tis starts wit te simplest possible its, and ten builds tese up to more complicated examples. Fact.
More informationBootstrap confidence intervals in nonparametric regression without an additive model
Bootstrap confidence intervals in nonparametric regression witout an additive model Dimitris N. Politis Abstract Te problem of confidence interval construction in nonparametric regression via te bootstrap
More informationOnline Learning under Full and Bandit Information
Online Learning under Full and Bandit Information Artem Sokolov Computerlinguistik Universität Heidelberg 1 Motivation 2 Adversarial Online Learning Hedge EXP3 3 Stochastic Bandits ε-greedy UCB Real world
More informationChapter 2 Limits and Continuity
4 Section. Capter Limits and Continuity Section. Rates of Cange and Limits (pp. 6) Quick Review.. f () ( ) () 4 0. f () 4( ) 4. f () sin sin 0 4. f (). 4 4 4 6. c c c 7. 8. c d d c d d c d c 9. 8 ( )(
More informationPrecalculus Test 2 Practice Questions Page 1. Note: You can expect other types of questions on the test than the ones presented here!
Precalculus Test 2 Practice Questions Page Note: You can expect oter types of questions on te test tan te ones presented ere! Questions Example. Find te vertex of te quadratic f(x) = 4x 2 x. Example 2.
More informationNew Distribution Theory for the Estimation of Structural Break Point in Mean
New Distribution Teory for te Estimation of Structural Break Point in Mean Liang Jiang Singapore Management University Xiaou Wang Te Cinese University of Hong Kong Jun Yu Singapore Management University
More informationIntegral Calculus, dealing with areas and volumes, and approximate areas under and between curves.
Calculus can be divided into two ke areas: Differential Calculus dealing wit its, rates of cange, tangents and normals to curves, curve sketcing, and applications to maima and minima problems Integral
More informationCSCE 478/878 Lecture 2: Concept Learning and the General-to-Specific Ordering
Outline Learning from eamples CSCE 78/878 Lecture : Concept Learning and te General-to-Specific Ordering Stepen D. Scott (Adapted from Tom Mitcell s slides) General-to-specific ordering over ypoteses Version
More informationBasic Nonparametric Estimation Spring 2002
Basic Nonparametric Estimation Spring 2002 Te following topics are covered today: Basic Nonparametric Regression. Tere are four books tat you can find reference: Silverman986, Wand and Jones995, Hardle990,
More informationLecture 10: Carnot theorem
ecture 0: Carnot teorem Feb 7, 005 Equivalence of Kelvin and Clausius formulations ast time we learned tat te Second aw can be formulated in two ways. e Kelvin formulation: No process is possible wose
More informationCS522 - Partial Di erential Equations
CS5 - Partial Di erential Equations Tibor Jánosi April 5, 5 Numerical Di erentiation In principle, di erentiation is a simple operation. Indeed, given a function speci ed as a closed-form formula, its
More informationChapter 5 FINITE DIFFERENCE METHOD (FDM)
MEE7 Computer Modeling Tecniques in Engineering Capter 5 FINITE DIFFERENCE METHOD (FDM) 5. Introduction to FDM Te finite difference tecniques are based upon approximations wic permit replacing differential
More information